About the Role :
We are seeking an experienced Python + ETL Developer with strong expertise in designing, building, and optimizing large-scale data pipelines.
This role requires a deep understanding of Python (Pandas / PySpark), advanced SQL, and ETL workflows, with a proven track record of delivering scalable, high-performance data solutions.
The ideal candidate will also have experience with cloud platforms and modern data warehouses.

Responsibilities :
ETL Development & Optimization :
- Design, develop, and optimize complex ETL workflows and data pipelines using Python and PySpark.
- Implement efficient data transformations, cleansing, and enrichment processes to ensure high data quality and integrity.
- Optimize ETL jobs for scalability, performance, and cost-efficiency in distributed environments.

Orchestration & Automation :
- Develop, schedule, and monitor ETL workflows using Airflow or equivalent orchestration tools.
- Ensure job reliability with automated monitoring, alerting, and error-handling mechanisms.
- Build reusable components and frameworks for ETL workflow automation.

Data Engineering & Cloud Integration :
- Integrate and manage data across AWS, GCP, or Azure environments.
- Implement best practices for cloud data pipelines, including storage optimization, security, and access management.
- Work with data warehouses (Snowflake, Redshift, BigQuery) to design efficient data models and optimize query performance.

Collaboration & Stakeholder Management :
- Partner with data analysts, data scientists, and business stakeholders to translate business needs into robust technical solutions.
- Provide technical guidance on data architecture, pipeline best practices, and optimization strategies.
- Collaborate with cross-functional teams to ensure alignment on data requirements and delivery timelines.

Required Skills & Experience :

Core Technical Skills :
- Expert-level proficiency in Python (Pandas, PySpark).
- Strong knowledge of SQL (query optimization, stored procedures, analytical queries).
- Hands-on experience with ETL design, development, and performance tuning.

Tools & Platforms :
- Workflow orchestration : Airflow (or equivalent tools such as Luigi, Prefect).
- Cloud platforms : AWS / GCP / Azure.
- Data warehouses : Snowflake, Redshift, BigQuery.

Additional Competencies :
- Experience with CI/CD pipelines and version control (Git).
- Familiarity with data governance, security, and compliance standards.
- Strong problem-solving, debugging, and performance-tuning skills.
- Ability to work independently and in a collaborative, agile environment.

Preferred Qualifications :
- Experience in real-time data pipelines (Kafka, Spark Streaming).
- Knowledge of data modeling techniques and best practices.
- Exposure to DevOps practices for data engineering.

(ref : hirist.tech)