We are seeking a Data Engineer with 1-3 years of experience in building and managing data pipelines. The ideal candidate has strong SQL and Python skills, hands-on experience with Airflow, and exposure to cloud platforms (Google Cloud Platform preferred; AWS/Azure acceptable). Experience with Git is a must.
Responsibilities:
- Design, develop, and maintain ETL pipelines using Apache Airflow.
- Write optimized SQL queries for data extraction and transformation.
- Build and automate workflows using Python (pandas, PySpark, or similar libraries).
- Deploy, manage, and scale pipelines on cloud platforms (GCP/AWS/Azure).
- Collaborate with cross-functional teams to ensure data quality and reliability.
- Troubleshoot and optimize pipeline performance, identifying and resolving pipeline failures.
- Apply version control practices (Git) for collaborative development.
Requirements:
- 1-3 years of experience in Data Engineering or a related role.
- Strong knowledge of SQL (query optimization, joins, window functions).
- Proficiency in Python for data engineering tasks.
- Hands-on experience with Apache Airflow or another orchestration tool.
- Cloud experience: Google Cloud Platform (preferred) or AWS/Azure.
- Mandatory experience with Git (branching, pull requests, code reviews).
- Basic understanding of Kafka and Kubernetes.
- Experience with data warehousing concepts and modeling.
Nice to Have:
- Familiarity with monitoring/alerting tools (e.g., Prometheus, Grafana, Cloud Monitoring).
- Exposure to CI/CD pipelines.
- Experience with streaming pipelines.
- Experience with a data quality tool.