Position : Senior Data Engineer
Location : Pune
Experience : 5+ years
Job Summary :
We are seeking an experienced Senior Data Engineer with a strong background in PySpark, Spark, and Big Data technologies. The ideal candidate will have over five years of hands-on experience designing, building, and optimizing large-scale data processing systems and pipelines.
Key Responsibilities :
- Design, develop, and maintain scalable data pipelines and ETL processes using PySpark and Spark.
- Perform data wrangling, transformation, and integration across diverse data sources.
- Optimize the performance and reliability of data processing workflows.
- Implement data quality frameworks to ensure data accuracy and integrity.
- Develop and maintain data architectures and data models to support business needs.
- Collaborate with cross-functional teams including data analysts, software engineers, and business stakeholders to understand data requirements.
- Stay updated with the latest advancements in Big Data tools, frameworks, and best practices.
- Mentor and guide junior data engineers in their technical and professional growth.
Required Skills and Qualifications :
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in data engineering roles.
- Expertise in PySpark and Spark for large-scale data processing.
- Strong understanding of Big Data ecosystems and technologies (e.g., Hadoop, Hive).
- Hands-on experience with ETL development, data modeling, and data warehousing.
- Proficiency in programming languages such as Python, Scala, or Java.
- Experience with cloud platforms like AWS, Azure, or Google Cloud.
- Strong problem-solving skills and a detail-oriented mindset.
- Excellent communication and collaboration skills.
Preferred Qualifications :
- Experience with containerization and orchestration tools like Docker and Kubernetes.
- Knowledge of SQL and relational database management systems.
- Familiarity with data orchestration tools like Airflow, Prefect, or similar.