Job Title: Data Engineer
Contract Period: 12 Months
Location: Offshore candidates accepted (Singapore-based company)
Work Timing: 6.30 AM to 3.30 PM or 7.00 AM to 4.00 PM (IST, India Standard Time)
Experience
Minimum 4 years of experience as a Data Engineer or in a similar role (please do not apply with less than 4 years of Data Engineer experience).
Proven experience in Python, Spark, and PySpark (mandatory, non-negotiable).
Hands-on experience building ETL pipelines, real-time streaming, and data transformations.
Experience with data warehouses, cloud platforms (AWS/Azure/GCP), and databases.
✅ Technical Skills
Spark Core API: RDDs, transformations/actions, DAG execution.
Spark SQL: DataFrames, schema optimization, UDFs (see the sketch after this list).
Streaming: Structured Streaming, Kafka integration.
Data Handling: S3, HDFS, JDBC, Parquet, Avro, ORC.
Orchestration: Airflow/Prefect.
Performance Tuning: Partitioning, caching, broadcast joins.
Cloud Deployment: Databricks, AWS EMR, Azure HDInsight, GCP Dataproc.
CI/CD: pytest/unittest for Spark jobs, Jenkins, GitHub Actions.
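To give a concrete flavour of the Spark SQL and performance-tuning items above, here is a minimal PySpark sketch; the bucket paths, column names, and the orders/countries datasets are hypothetical placeholders, not part of any actual project:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("skills-sketch").getOrCreate()

    # Hypothetical inputs: a large fact table and a small dimension table.
    orders = spark.read.parquet("s3://example-bucket/orders/")
    countries = spark.read.parquet("s3://example-bucket/countries/")

    # A simple UDF (prefer built-in functions where one exists).
    normalize = F.udf(lambda s: s.strip().upper() if s else None, StringType())

    # Broadcasting the small table avoids shuffling the large one in the join.
    enriched = (
        orders
        .withColumn("country_code", normalize(F.col("country_code")))
        .join(F.broadcast(countries), on="country_code", how="left")
    )

    # Partitioning the output by date enables downstream partition pruning.
    enriched.write.mode("overwrite").partitionBy("order_date") \
        .parquet("s3://example-bucket/orders_enriched/")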
✅ Education & Soft Skills
Bachelor’s/Master’s in Computer Science, Computer Engineering, or equivalent.
Strong analytical, problem-solving, and communication skills.
About the Role
We are seeking an experienced Data Engineer to join our team and support data-driven initiatives. This role involves building scalable pipelines, working with streaming data, and collaborating with data scientists and business stakeholders to deliver high-quality solutions.
Key Responsibilities
Design, build, and optimize data pipelines and ETL workflows.
Manage and process large datasets using Spark, PySpark, and SQL.
Build and maintain real-time streaming applications with Spark Streaming/Kafka (a streaming sketch follows this list).
Collaborate with data scientists and product teams to integrate AI/ML models into production.
Ensure data quality, scalability, and performance in all pipelines.
Deploy and manage Spark workloads on cloud platforms (AWS, Azure, GCP, Databricks).
Automate testing and deployment of Spark jobs via CI/CD pipelines.
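As a rough illustration of the streaming responsibility above, a minimal Structured Streaming job reading from Kafka might look like the following; the broker address, topic name, and output paths are placeholders:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

    # Subscribe to a Kafka topic (placeholder broker and topic).
    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "events")
        .load()
    )

    # Kafka delivers the payload as bytes; cast it to string before parsing.
    parsed = events.select(F.col("value").cast("string").alias("raw"))

    # Write to Parquet with a checkpoint so the query can recover on failure.
    query = (
        parsed.writeStream
        .format("parquet")
        .option("path", "s3://example-bucket/events/")
        .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
        .outputMode("append")
        .start()
    )
    query.awaitTermination()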
Requirements
Bachelor’s/Master’s degree in Computer Science, Computer Engineering, or a related field.
Minimum 4 years of professional experience as a Data Engineer.
Strong expertise in Python, Spark, and PySpark.
Hands-on experience with Spark SQL, DataFrames, UDFs, and DAG execution.
Knowledge of data ingestion tools (Kafka, Flume, Kinesis) and data formats (Parquet, Avro, ORC).
Proficiency in Airflow/Prefect for scheduling and orchestration.
Familiarity with performance tuning in Spark (partitioning, caching, broadcast joins).
Experience deploying on Databricks, AWS EMR, Azure HDInsight, or GCP Dataproc.
Exposure to testing and CI/CD for data pipelines (pytest, Jenkins, GitHub Actions); a test sketch follows this list.
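To illustrate the kind of pipeline testing mentioned above, here is a minimal pytest sketch for a PySpark transformation; the add_revenue function and its columns are invented for the example:

    import pytest
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    def add_revenue(df):
        # Invented transformation under test: revenue = quantity * unit_price.
        return df.withColumn("revenue", F.col("quantity") * F.col("unit_price"))

    @pytest.fixture(scope="session")
    def spark():
        # A small local session is enough for unit tests.
        return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()

    def test_add_revenue(spark):
        df = spark.createDataFrame([(2, 5.0), (3, 1.5)], ["quantity", "unit_price"])
        result = add_revenue(df).collect()
        assert [row["revenue"] for row in result] == [10.0, 4.5]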
Budget
Onshore: As per market standards.