Role Description
This role leads the design and scaling of a modern data platform on AWS. You will own the end-to-end data architecture across services like S3, Glue, Redshift, Athena, Lake Formation, SNS / SQS, Airflow, and Postgres, enabling analytics and AI / ML teams with reliable, well-modelled data.
Please note that the working hours for this role are 5 PM to 2 AM IST.
- Own the architecture and roadmap of an AWS-based data platform.
- Design and build ETL / ELT pipelines using AWS Glue and related services.
- Manage and optimize a data lake on S3 and data warehouses in Redshift.
- Apply AWS Lake Formation for data security, governance, and fine-grained access control.
- Use Athena for ad-hoc and exploratory querying over S3-based data.
- Implement and maintain orchestrated data workflows using Apache Airflow.
- Design and support event-driven data flows using AWS SNS and SQS.
- Collaborate with analytics and ML teams to deliver clean, reliable, and well-modelled datasets.
- Mentor other data engineers and establish best practices, standards, and code review processes.
Qualifications
- 5+ years of experience as a Data Engineer, including 1–2 years in a lead or tech lead capacity.
- Strong experience with the AWS data stack, including S3, AWS Glue, Redshift, Athena, and Lake Formation.
- Hands-on experience with Apache Airflow (or a similar orchestration tool).
- Excellent SQL skills and a strong background in data warehousing and dimensional modelling.
- Solid experience with Postgres / PostgreSQL, including schema design and query optimization.
- Experience using SNS / SQS within data pipelines or event-driven architectures.
- Demonstrated ability to lead projects and mentor engineers in a remote setting.
Nice to have:
- Experience enabling ML / AI teams with curated data layers or feature stores.
- Familiarity with infrastructure-as-code for data infrastructure (CloudFormation, CDK, Terraform, etc.).