Job role : Data Engineer
Location : Hyderabad / Pune
Experience Required : 3-5 years
Joining : Immediate or within 15 days
Are you a Python & PySpark expert with hands-on experience in GCP? Passionate about building scalable, high-performance data pipelines? Join our fast-paced team and be part of impactful projects in the cloud.
Data Engineer Responsibilities :
- Design, build, and maintain scalable and efficient data pipelines using Python and PySpark
- Develop and optimize ETL / ELT workflows for large-scale datasets
- Work extensively with Google Cloud Platform (GCP) services including BigQuery, Dataflow, and Cloud Functions
- Implement containerized solutions using Docker, and manage code through Git and CI / CD pipelines
- Collaborate with data scientists, analysts, and other engineers to deliver high-quality data solutions
- Monitor, troubleshoot, and improve the performance of data pipelines
Skills & Qualifications :
- Proficiency in Python, PySpark, and Big Data technologies
- Strong experience in ETL / ELT, data modeling, distributed computing, and performance tuning
- Hands-on expertise in GCP and its services
- Working knowledge of workflow orchestration tools such as Apache Airflow or Cloud Composer
- GCP certification is a plus
- Experience with Docker, CI / CD practices, and version control tools like Git
(ref : hirist.tech)