Description:
We are seeking an AWS Data Engineer with strong experience architecting and consulting across the cloud data stack. The ideal candidate has extensive experience with data pipelines (ELT/ETL), data warehousing and dimensional modeling, curation of data sets for Data Scientists and Business Intelligence users, and architecting data lakes. The candidate should also have excellent problem-solving ability when dealing with large volumes of data from diverse data sources, and should be able to work with client teams and advise them to ensure a scalable system.
Key Responsibilities:
- Excellent communication skills for working with clients and extended teams
- Able to architect, recommend, and work with various stakeholders to build the right data stack on AWS
- Design, build, and operate the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, cloud migration tools, and big data technologies.
- Optimize various RDBMS engines in the cloud and solve customers' security, performance, and operational problems.
- Design, build, and operate large, complex data lakes that meet functional and non-functional business requirements.
- Optimize ingestion, storage, processing, and retrieval across data types, from near-real-time events and IoT streams to unstructured data such as images, audio, video, and documents.
- Work with customers and internal stakeholders, including the Executive, Product, Data, Software Development, and Design teams, to assist with data-related technical issues and support their data infrastructure and business needs.
- Understand and develop AI/ML models using AWS AI/ML services (e.g., SageMaker, Rekognition, Comprehend) and collaborate with data scientists and engineers to implement machine learning solutions.
Requirements:
- 6+ years of experience in a Data Engineer role in a cloud-native ecosystem.
- Bachelor's degree (graduate degree preferred) in Computer Science, Mathematics, Informatics, Information Systems, or another quantitative field.
- Expert-level working experience with AWS Glue ETL, Redshift, Step Functions, and Athena.
- Experience implementing data pipelines for both streaming and batch integrations using tools/frameworks such as Glue ETL, Lambda, and Spark; experience with Lake Formation.
- Orchestrate, maintain, and enhance SDLC models using CI/CD pipelines on AWS.
- Experience with relational and NoSQL databases, such as MySQL or Postgres and DynamoDB.
- Functional and scripting languages: Python and advanced SQL.
- Experience building and optimizing big data pipelines, architectures, and data sets.
(ref: hirist.tech)