Design, develop, and maintain ETL/ELT data pipelines using Python and AWS native services (Glue, Lambda, EMR, Step Functions, etc.)
Build and manage data lakes and data warehouses using Amazon S3, Redshift, Athena, and Lake Formation
Implement data ingestion from diverse sources (RDBMS, APIs, streaming data, on-premises systems)
Optimize data workflows for performance, cost, and reliability using AWS tools like Glue Jobs, Athena, and Redshift Spectrum
Develop reusable, modular Python-based frameworks for data ingestion, transformation, and validation
Work with stakeholders to understand data requirements, model data structures, and ensure data consistency and governance
Deploy and manage data infrastructure using Infrastructure as Code (IaC) tools such as Terraform or AWS CloudFormation
Implement data quality checks, monitoring, and alerting using CloudWatch, Glue Data Catalog, or third-party tools
Support data security and compliance (IAM roles, encryption, data masking, GDPR policies, etc.)
Collaborate with DevOps and ML teams to integrate data pipelines into analytics and AI workflows
Required Skills & Qualifications
Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
5 to 8 years of experience as a Data Engineer or in a similar role.
Strong programming experience in Python (pandas, boto3, PySpark, SQLAlchemy, etc.)
Deep hands-on experience with AWS services, including:
AWS Glue, Lambda, EMR, Redshift, Athena, S3, Step Functions
IAM, CloudWatch, Kinesis (for streaming), and ECS/EKS (for containerized workloads)
Experience with SQL and NoSQL databases (e.g., PostgreSQL, DynamoDB, MongoDB)
Strong knowledge of data modeling, schema design, and ETL orchestration.
Familiarity with version control (Git) and CI/CD pipelines for data projects.
Understanding of data governance, lineage, and cataloging principles.
Excellent problem-solving, debugging, and performance-tuning skills.
Senior Data Engineer • Hyderabad, Telangana, India