Summary:
We are looking for a skilled Lead AWS Data Engineer to join our team in Mumbai. The ideal candidate will have experience in designing, building, and optimizing data pipelines and workflows using AWS cloud technologies. You should be proficient in SQL, PySpark, and Python, and have hands-on experience with AWS services such as Redshift, EMR, Airflow, CloudWatch, and S3.
Key Responsibilities:
- Design, develop, and maintain scalable ETL/ELT data pipelines using AWS cloud technologies.
- Work with PySpark and SQL to process large datasets efficiently.
- Manage AWS services such as Apache Airflow, Redshift, EMR, CloudWatch, and S3 for data processing and orchestration.
- Implement CI/CD pipelines using Azure DevOps for seamless deployment and automation.
- Monitor and optimize data workflows for performance, cost, and reliability.
- Utilize Jupyter Notebooks for data exploration and analysis.
- Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to ensure smooth data integration.
- Use Unix commands and Git for version control and automation.
- Ensure best practices for data governance, security, and compliance in cloud environments.
- Lead and manage a team of engineers.
Required Skills:
Programming & Scripting:
- SQL, PySpark, Python

AWS Tools & Services:
- Apache Airflow (Workflow Orchestration)
- Redshift (Data Warehousing)
- EMR (Big Data Processing)
- CloudWatch (Monitoring & Logging)
- S3 (Data Storage)
- Jupyter Notebooks (Data Exploration)

DevOps & CI/CD:
- Azure DevOps (CI/CD Pipelines)
- Git (Version Control)
- Unix Commands (Shell Scripting)

Preferred Qualifications:
- Experience in performance tuning data pipelines and SQL queries.
- Knowledge of data lake and data warehouse architecture.
- Strong problem-solving skills with the ability to troubleshoot data pipeline issues.
- Understanding of data security, encryption, and access control in cloud environments.

Skills Required:
AWS Lambda, AWS Redshift, Python, PySpark, ETL, SQL, Team Lead