We are looking for an AWS Data Engineer to design and build solutions for one of our Fortune 500 client programs, which aims to build an Enterprise Data Lake on the AWS Cloud platform and to build data pipelines by developing AWS data integration, engineering, and analytics resources.
Key Responsibilities
- Perform technical assessments of Glue ETL jobs and Lambda functions
- Build Python programs for Glue ETL jobs and Lambda functions
- Build PySpark-based data pipelines on AWS EMR clusters and Glue ETL, drawing on in-depth knowledge of Hadoop and NoSQL databases (see the pipeline sketch after this list)
- Optimize performance of Spark applications on Hadoop through configuration of the Spark context, Spark SQL, DataFrames, and pair RDDs (a tuning example follows this list)
- Create documentation for user adoption, deployments, and runbooks, and support client users for enablement or with any issues encountered
- Monitor, troubleshoot, and debug failures using AWS CloudWatch and Grafana
- Perform code reviews with the team and enable them to develop code for complex scenarios
- Liaise with all SDLC support teams to ensure seamless deployments to controlled environments and overall application robustness
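
For context on the pipeline work described above, here is a minimal sketch of a Glue PySpark job; the database, table, and S3 bucket names are hypothetical placeholders, not part of any client environment.

```python
import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions

# Resolve the job name passed in by the Glue runtime
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a source table registered in the Glue Data Catalog
# ("raw_db" / "orders" are placeholder names)
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
)

# Convert to a Spark DataFrame and apply a simple cleansing transform
df = source.toDF().dropna(subset=["order_id"])

# Write curated output to S3 as partitioned Parquet (placeholder bucket)
df.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/orders/"
)

job.commit()
```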
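Likewise, a sketch of the Spark tuning work, assuming a PySpark entry point; the configuration values and S3 paths are illustrative only, not recommendations for any particular workload.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

# Illustrative tuning knobs; real values depend on cluster size and workload
spark = (
    SparkSession.builder.appName("tuned-pipeline")
    .config("spark.sql.shuffle.partitions", "200")  # Spark SQL shuffle parallelism
    .config("spark.executor.memory", "8g")          # executor heap per JVM
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

# DataFrame-level optimization: broadcast a small dimension table
# so the join avoids a full shuffle (paths are placeholders)
facts = spark.read.parquet("s3://example-bucket/facts/")
dims = spark.read.parquet("s3://example-bucket/dims/")
joined = facts.join(broadcast(dims), "dim_id")

# Pair-RDD optimization: reduceByKey combines values map-side
# before the shuffle, unlike groupByKey
pair_rdd = facts.rdd.map(lambda row: (row["dim_id"], 1))
counts = pair_rdd.reduceByKey(lambda a, b: a + b)
```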
Qualifications:
- Bachelor’s Degree or equivalent in Computer Science or a related field, with a minimum of 7 years of experience
- Certified in one of: AWS Solutions Architect or AWS Data Analytics Specialty
- 2+ years of hands-on experience with AWS S3, Glue ETL & Catalog, EMR, Lambda functions, Athena, and Kafka
- 2+ years of hands-on experience in PySpark programming
- Experience in writing complex SQL queries
- Preferred expertise in MongoDB, Oracle, and Hadoop (Hive, HDFS, YARN)
- Technical coordination skills to drive requirements and technical design with multiple teams
- Financial services and healthcare industry experience preferred
- Aptitude to help build the skill set within the organization

Location: Pune / Hyderabad (even for Chicago / New York)