Job Title: Data Engineer
Contract Period: 12 Months
Location: Offshore candidates accepted (Singapore-based company)
Work Timing: 6:30 AM to 3:30 PM or 7:00 AM to 4:00 PM (IST, India time)
Experience
- Minimum 4+ years of experience as a Data Engineer or in a similar role (please do not apply if you have less than 4 years of Data Engineering experience).
- Proven experience in Python, Spark, and PySpark (mandatory, non-negotiable).
- Hands-on experience building ETL pipelines, real-time streaming, and data transformations.
- Experience working with data warehouses, cloud platforms (AWS / Azure / GCP), and databases.
✅ Technical Skills
- Spark Core API: RDDs, transformations / actions, DAG execution.
- Spark SQL: DataFrames, schema optimization, UDFs (illustrated in the sketch after this list).
- Streaming: Structured Streaming, Kafka integration.
- Data Handling: S3, HDFS, JDBC, Parquet, Avro, ORC.
- Orchestration: Airflow / Prefect.
- Performance Tuning: Partitioning, caching, broadcast joins.
- Cloud Deployment: Databricks, AWS EMR, Azure HDInsight, GCP Dataproc.
- CI / CD: Pytest / unittest for Spark jobs, Jenkins, GitHub Actions.
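To give a sense of the level expected, here is a minimal PySpark sketch touching a few of the items above (DataFrame transformations, a UDF, a broadcast join, Parquet on S3). The dataset, paths, and column names are illustrative assumptions, not details of our actual pipelines.

```python
# Minimal PySpark sketch: DataFrame transformations, a UDF, and a broadcast join.
# All paths and column names are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("skills-sketch").getOrCreate()

orders = spark.read.parquet("s3a://example-bucket/orders/")          # assumed fact table
countries = spark.read.parquet("s3a://example-bucket/dim_country/")  # small dimension table

# Simple UDF; built-in functions are preferred where possible for performance.
normalize = F.udf(lambda s: s.strip().upper() if s else None, StringType())

result = (
    orders
    .withColumn("country_code", normalize(F.col("country_code")))
    .join(F.broadcast(countries), on="country_code", how="left")    # broadcast the small table
    .groupBy("country_name")
    .agg(F.sum("amount").alias("total_amount"))
)

result.write.mode("overwrite").partitionBy("country_name").parquet("s3a://example-bucket/out/")
```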
✅ Education & Soft Skills
- Bachelor’s / Master’s in Computer Science, Computer Engineering, or equivalent.
- Strong analytical, problem-solving, and communication skills.
About the Role
We are seeking an experienced Data Engineer to join our team and support data-driven initiatives. This role involves building scalable pipelines, working with streaming data, and collaborating with data scientists and business stakeholders to deliver high-quality solutions.
Key Responsibilities
- Design, build, and optimize data pipelines and ETL workflows.
- Manage and process large datasets using Spark, PySpark, and SQL.
- Build and maintain real-time streaming applications with Spark Streaming / Kafka (see the streaming sketch after this list).
- Collaborate with data scientists and product teams to integrate AI / ML models into production.
- Ensure data quality, scalability, and performance in all pipelines.
- Deploy and manage Spark workloads on cloud platforms (AWS, Azure, GCP, Databricks).
- Automate testing and deployment of Spark jobs via CI / CD pipelines.
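On the streaming side, a minimal Structured Streaming sketch that reads JSON events from Kafka and lands them as Parquet could look like the following. The broker address, topic name, schema, and paths are placeholders, not details of our environment.

```python
# Minimal Structured Streaming sketch: Kafka source -> JSON parsing -> Parquet sink.
# Broker, topic, schema, and paths are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "events")                      # placeholder topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3a://example-bucket/events/")
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```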
Requirements
- Bachelor’s / Master’s degree in Computer Science, Computer Engineering, or a related field.
- Minimum 4 years of professional experience as a Data Engineer.
- Strong expertise in Python, Spark, and PySpark.
- Hands-on experience with Spark SQL, DataFrames, UDFs, and DAG execution.
- Knowledge of data ingestion tools (Kafka, Flume, Kinesis) and data formats (Parquet, Avro, ORC).
- Proficiency in Airflow / Prefect for scheduling and orchestration.
- Familiarity with performance tuning in Spark (partitioning, caching, broadcast joins).
- Experience deploying on Databricks, AWS EMR, Azure HDInsight, or GCP Dataproc.
- Exposure to testing and CI / CD for data pipelines (pytest, Jenkins, GitHub Actions); a minimal test example follows this list.
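As an indication of the kind of test coverage expected for Spark jobs in CI (pytest under Jenkins or GitHub Actions), a minimal unit test for a small transformation might look like this. The transformation and fixture names are hypothetical.

```python
# Minimal pytest sketch for a Spark transformation, runnable in a CI pipeline.
import pytest
from pyspark.sql import SparkSession, functions as F


def add_total(df):
    """Example transformation under test: total = price * quantity."""
    return df.withColumn("total", F.col("price") * F.col("quantity"))


@pytest.fixture(scope="session")
def spark():
    # Local single-threaded session keeps the test fast and self-contained.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_add_total(spark):
    df = spark.createDataFrame([(2.0, 3), (1.5, 4)], ["price", "quantity"])
    totals = [row["total"] for row in add_total(df).collect()]
    assert totals == [6.0, 6.0]
```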
Budget
Onshore: As per market standards.