Job Summary
We're looking for a Senior Data Engineer with 5-8 years of experience to build and maintain scalable, production-grade data pipelines. The ideal candidate is a strong software engineer with hands-on experience in Apache Spark 3.x, Scala, SQL, and Python. You'll be responsible for designing and implementing ETL/ELT solutions, collaborating with other teams to deliver data products, and mentoring junior engineers.
Key Responsibilities
- Design and build data pipelines using Apache Spark 3.x, Scala, Python, and SQL.
- Tune Spark jobs for performance and cost.
- Ensure code quality with unit tests, CI/CD, and code reviews.
- Collaborate with platform and DevOps teams to deploy and monitor pipelines.
- Troubleshoot production issues and perform root-cause analysis.
Qualifications
- 5-8 years of experience as a Data Engineer.
- Strong hands-on experience with Apache Spark 3.x.
- Proficiency in Scala, Python, and advanced SQL.
- Experience with streaming technologies (e.g., Spark Structured Streaming, Kafka).
- Knowledge of orchestration tools (e.g., Apache Airflow).
- Experience with AWS services such as Glue, EMR, Athena, Redshift, and S3 is a plus.
- Solid understanding of software engineering fundamentals (Git, CI/CD).