Role Overview
We are seeking a skilled and motivated Data Engineer with 5–8 years of experience in building scalable data pipelines using Python, PySpark, and AWS services. The ideal candidate will have hands-on expertise in big data processing, orchestration using AWS Step Functions, and serverless computing with AWS Lambda. Familiarity with DynamoDB and experience deploying ETL programs in AWS are essential.
Key Responsibilities
- Design, develop, and maintain robust data pipelines using Python and PySpark
- Handle large-scale data processing and transformation using AWS services
- Implement orchestration workflows using AWS Step Functions
- Develop and manage serverless components using AWS Lambda
- Deploy and monitor ETL programs in AWS environments
- Configure and optimize DynamoDB for data storage and retrieval
- Collaborate with cross-functional teams to understand data requirements and deliver scalable solutions
- Ensure data quality, integrity, and security across all stages of the pipeline
Required Skills & Qualifications
- 5–8 years of experience in data engineering or a related field
- Strong proficiency in Python and PySpark
- Solid understanding of AWS services including S3, Lambda, Step Functions, Glue, and DynamoDB
- Experience deploying and managing ETL workflows in AWS
- Familiarity with NoSQL databases, especially DynamoDB
- Knowledge of CI/CD practices and infrastructure-as-code tools (e.g., CloudFormation, Terraform) is a plus
- Excellent problem-solving and communication skills
- Ability to work independently in a remote setup
What We Offer
- Fully remote work environment
- Opportunity to work on cutting-edge data engineering projects
- Collaborative and inclusive team culture
- Competitive compensation and benefits