We are looking for a talented and motivated Data Engineer with strong experience in PySpark and Python to design, build, and maintain scalable data pipelines and infrastructure. The successful candidate will support the delivery of data-driven insights by transforming raw data into clean, curated datasets for analytics and machine learning applications. Java experience is a plus and will be useful in hybrid environments.
Key Responsibilities :
Develop and optimize robust, scalable data pipelines using PySpark and Python
Clean, transform, and enrich large-scale datasets from structured and unstructured sources
Implement data ingestion, ETL/ELT workflows, and integration strategies across cloud and on-prem platforms
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements
Ensure data quality, integrity, and lineage throughout the data lifecycle
Participate in performance tuning, troubleshooting, and production support
Contribute to best practices in data engineering, including code versioning, testing, and CI/CD
Qualifications :
Required Qualifications :
Bachelor's degree in Computer Science, Data Engineering, or a related field
3 years of experience in data engineering with a focus on PySpark and Python
Strong hands-on experience with distributed data processing frameworks (e.g., Apache Spark)
Solid understanding of SQL, data modeling, and relational databases
Experience working with cloud platforms (e.g., AWS, Azure, GCP)
Familiarity with workflow orchestration tools (e.g., Airflow, Azure Data Factory)
Preferred Qualifications :
Java experience for supporting hybrid data platforms and legacy integrations
Exposure to data lakes, delta lakes, and modern data architectures
Knowledge of containerization (Docker), Kubernetes, and CI/CD pipelines
Familiarity with data governance, security, and compliance frameworks
Additional Information :
We believe in supporting our team professionally and personally.
OUR COMMITMENT TO DIVERSITY
At Sia, we believe in fostering a diverse, equitable, and inclusive culture where our employees and partners are valued and thrive in a sense of belonging. We are committed to recruiting and developing a diverse network of employees and investing in their growth by providing unique opportunities for professional and cultural immersion. Our commitment to inclusion motivates dynamic collaboration with our clients, building trust by creating an inclusive environment of curiosity and learning that has a lasting impact.
Please visit our website for more information.
Sia is an equal opportunity employer. All aspects of employment, including hiring, promotion, remuneration, or discipline, are based solely on performance, competence, conduct, or business needs.
Employment Type : Full-time
Key Skills
Apache Hive, S3, Hadoop, Redshift, Spark, AWS, Apache Pig, NoSQL, Big Data, Data Warehouse, Kafka, Scala
Experience : years
Vacancy : 1
Data Engineer • Mumbai, India