As part of the offshore development team, the AWS Developers will be responsible for implementing ingestion and transformation pipelines using PySpark, orchestrating jobs via MWAA, and converting legacy Cloudera jobs to AWS-native services.
Key Responsibilities:
- Write ingestion scripts (batch & stream) to migrate data from on-prem to S3.
- Translate existing HiveQL into SparkSQL / PySpark jobs.
- Configure MWAA DAGs to orchestrate job dependencies.
- Build Iceberg tables with appropriate partitioning and metadata handling.
- Validate job outputs and write unit tests.
Required Skills:
- 3-5 years in data engineering, with strong exposure to AWS.
- Experience in EMR (Spark), S3, PySpark, SQL.
- Working knowledge of Cloudera / HDFS and legacy Hadoop pipelines.
- Prior experience with data lake / lakehouse implementations is a plus.
Mandatory Skills:
AWS Developer
Skills Required:
S3, PySpark, AWS, SparkSQL, HiveQL