Job Summary:
We are seeking a highly skilled and experienced Senior ETL Developer with strong expertise in PySpark and Python and a solid background in ETL (Extract, Transform, Load) processes across relational databases and cloud platforms. The ideal candidate will have at least 5 years of hands-on experience building scalable data pipelines and ETL workflows using modern tools and frameworks.
Responsibilities:
- Design, develop, and maintain ETL pipelines using PySpark and Python for large-scale data processing (a hedged sketch follows this list).
- Implement validation logic as part of ETL workflows.
- Work with relational databases (e.g., Oracle, PostgreSQL, SQL Server) to extract, transform, and load data efficiently.
- Develop and optimize data pipelines on cloud platforms (AWS) using Spark DataFrames.
- Collaborate with data architects, analysts, and other stakeholders to understand requirements and deliver reliable data solutions.
- Ensure data integrity, performance, and quality in all ETL processes.
- Participate in code reviews, testing, and deployment of data solutions.
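To make the responsibilities above concrete, here is a minimal sketch of the kind of pipeline this role involves: extract from PostgreSQL over JDBC, validate, aggregate with DataFrames, and load Parquet to S3. Every name in it (host, table, columns, bucket, credentials) is an invented placeholder, not a detail from this posting.

```python
# Hypothetical ETL sketch; all connection details and names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: pull a source table from a relational database via JDBC.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")  # placeholder URL
    .option("dbtable", "public.orders")                     # placeholder table
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Validate: keep rows that pass basic integrity checks; quarantine the rest.
valid = orders.filter(F.col("order_id").isNotNull() & (F.col("amount") >= 0))
rejected = orders.subtract(valid)
rejected.write.mode("overwrite").parquet("s3://example-bucket/rejects/orders/")

# Transform: derive a partition column and aggregate per day and region.
daily = (
    valid
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("order_date", "region")
    .agg(F.sum("amount").alias("total_amount"),
         F.count("*").alias("order_count"))
)

# Load: write partitioned Parquet to S3 (bucket name is a placeholder).
daily.write.mode("overwrite").partitionBy("order_date") \
    .parquet("s3://example-bucket/curated/daily_orders/")
```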
Required Qualifications:
- 5+ years of experience in ETL development with a strong foundation in data integration, transformation, and loading.
- Proven experience writing efficient PySpark and Python code, especially for large-scale data sets.
- Hands-on experience creating ETL workflows.
- Strong experience working with relational databases and writing complex SQL queries (an illustrative snippet follows this list).
- Solid understanding of data modeling, performance tuning, and ETL design best practices.
- Experience with cloud-based data processing using Spark on AWS.
- Strong communication and collaboration skills.
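As one hedged illustration of the SQL depth expected, the snippet below runs a window-function query through Spark SQL against a tiny in-memory table; the table and column names are invented, and the `explain()` call shows the physical plan that performance tuning typically starts from.

```python
# Hypothetical example of a "complex SQL query" run through Spark SQL;
# all table and column names are invented for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-example").getOrCreate()

rows = [("c1", "o1", 120.0), ("c1", "o2", 80.0), ("c2", "o3", 55.0)]
spark.createDataFrame(rows, ["customer_id", "order_id", "amount"]) \
    .createOrReplaceTempView("orders")

# Rank each customer's orders by amount and keep the single largest.
top_orders = spark.sql("""
    SELECT customer_id, order_id, amount
    FROM (
        SELECT *, ROW_NUMBER() OVER (
                      PARTITION BY customer_id
                      ORDER BY amount DESC) AS rn
        FROM orders)
    WHERE rn = 1
""")
top_orders.explain()  # inspecting the plan is a routine tuning step
top_orders.show()
```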
Preferred Qualifications:
- Experience with data orchestration tools such as Apache Airflow or similar (a hedged sketch follows this list).
- Familiarity with data governance and security best practices.
- Experience in Agile / Scrum environments.
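For the orchestration item above, a minimal Airflow DAG that schedules a PySpark job might look like the following; the DAG id, schedule, and script path are assumptions (Airflow 2.4+ syntax), not details from this posting.

```python
# Hypothetical Airflow sketch: schedules a spark-submit run once per day.
# DAG id, schedule, and script path are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_orders_etl",       # placeholder id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # assumes Airflow 2.4+ `schedule` arg
    catchup=False,
) as dag:
    run_etl = BashOperator(
        task_id="run_pyspark_etl",
        bash_command="spark-submit /opt/etl/daily_orders_etl.py",  # placeholder path
    )
```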
Skills Required
PostgreSQL, Performance Tuning, PySpark, SQL Server, Data Modeling, SQL Queries, Relational Databases, Oracle, Python, AWS