About the Role
Position: Data Engineer-I
Experience: 2-4 years
Notice Period: Immediate to 60 days
Hiring Locations: Bengaluru, Kolkata, Hyderabad, Chennai, Gurgaon
- Design, build, and optimize scalable data pipelines and ETL/ELT workflows using Spark (Scala/Python), SQL, and orchestration tools (e.g., Apache Airflow, Prefect, Luigi).
- Implement efficient solutions for high-volume, batch, real-time streaming, and event-driven data processing, leveraging best-in-class patterns and frameworks.
- Build and maintain data warehouse and lakehouse architectures (e.g., Snowflake, Databricks, Delta Lake, BigQuery, Redshift) to support analytics, data science, and BI workloads.
- Develop, automate, and monitor Airflow DAGs and jobs on cloud or Kubernetes, following robust deployment and operational practices (CI/CD, containerization, infrastructure-as-code).
- Write performant, production-grade SQL for complex data aggregation, transformation, and analytics tasks.
- Ensure data quality, consistency, and governance across the stack, implementing processes for validation, cleansing, anomaly detection, and reconciliation.
- Collaborate with Data Scientists, Analysts, and DevOps engineers to ingest, organize, and expose structured, semi-structured, and unstructured data for diverse use cases.
- Contribute to data modeling, schema design, and data partitioning strategies, and ensure adherence to best practices for performance and cost optimization.
- Implement, document, and extend data lineage, cataloging, and observability through tools such as AWS Glue, Azure Purview, Amundsen, or other open-source technologies.
- Apply and enforce data security, privacy, and compliance requirements (e.g., access control, data masking, retention policies, GDPR/CCPA).
- Take ownership of the end-to-end data pipeline lifecycle: design, development, code reviews, testing, deployment, operational monitoring, and maintenance/troubleshooting.
- Contribute to frameworks, reusable modules, and automation to improve development efficiency and maintainability of the codebase.
- Stay abreast of industry trends and emerging technologies, participating in code reviews, technical discussions, and peer mentoring as needed.
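
To give a concrete flavor of the validation and reconciliation work listed above, here is a minimal Python sketch. It is illustrative only and not a description of our actual pipelines: the row shape (`id`, `amount`), the rule in `validate_row`, and the names `reconcile` and `ReconciliationReport` are all hypothetical and assumed for the example.

```python
from dataclasses import dataclass


@dataclass
class ReconciliationReport:
    """Summary of a source-vs-target comparison (hypothetical structure)."""
    source_count: int
    target_count: int
    missing_in_target: list  # ids present in source but absent from target
    invalid_rows: list       # target rows that fail validation

    @property
    def ok(self) -> bool:
        return not self.missing_in_target and not self.invalid_rows


def validate_row(row: dict) -> bool:
    """Assumed rule: 'id' must be present and 'amount' a non-negative number."""
    amount = row.get("amount")
    return (
        row.get("id") is not None
        and isinstance(amount, (int, float))
        and amount >= 0
    )


def reconcile(source_rows: list, target_rows: list) -> ReconciliationReport:
    """Compare loaded rows against the source and validate what was loaded."""
    target_ids = {r.get("id") for r in target_rows}
    missing = sorted(
        r["id"]
        for r in source_rows
        if r.get("id") is not None and r["id"] not in target_ids
    )
    invalid = [r for r in target_rows if not validate_row(r)]
    return ReconciliationReport(
        source_count=len(source_rows),
        target_count=len(target_rows),
        missing_in_target=missing,
        invalid_rows=invalid,
    )


# Example: one row never arrived in the target, and one arrived with bad data.
source = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 5.5},
    {"id": 3, "amount": 7.0},
]
target = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": -5.5},
]
report = reconcile(source, target)
print(report.missing_in_target, report.invalid_rows, report.ok)
```

In practice a check like this would typically run as a pipeline step after each load and fail the run or raise an alert when `report.ok` is false; at production volumes the same logic would usually be expressed in SQL or Spark rather than in-memory Python.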