Background:
This position is responsible for the design, build, and maintenance of data pipelines running on Airflow and Spark on the AWS cloud platform at the Bank.
Roles and Responsibilities:
- Build and maintain all facets of data pipelines for the Data Engineering team.
- Build the pipelines required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Spark, Python, and Airflow (see the illustrative sketch after this list).
- Work with internal and external stakeholders to assist with data-related technical issues and data quality issues.
- Engage in proofs of concept, technical demos, and interactions with customers and other technical teams.
- Participate in agile ceremonies.
- Solve complex data-driven scenarios and triage defects and production issues.
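For illustration only, here is a minimal sketch of the kind of pipeline this role owns: an Airflow DAG that submits a PySpark ETL job. The DAG id, schedule, and script path are hypothetical placeholders, not details of the Bank's actual environment.

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    # Hypothetical daily ETL DAG; all names and paths are placeholders.
    with DAG(
        dag_id="daily_customer_etl",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        run_etl = SparkSubmitOperator(
            task_id="run_spark_etl",
            application="/opt/jobs/customer_etl.py",  # hypothetical PySpark script
            conn_id="spark_default",                  # Spark connection configured in Airflow
        )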
Technical Skills:
Must Have Skills:
- Proficient with Python, PySpark, and Airflow.
- Strong understanding of Object-Oriented Programming and Functional Programming paradigms.
- Hands-on experience with Spark and its architecture.
- Knowledge of software engineering best practices.
- Advanced SQL knowledge (preferably Oracle).
- Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources (illustrated in the sketch below).

Good to Have Skills:
- Knowledge of data-related AWS services.
- Knowledge of GitHub and Jenkins.
- Experience with automated testing.
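As a hedged illustration of the Spark and SQL work described above (not a requirement of the posting), the sketch below joins a structured Oracle table with semi-structured JSON events and writes curated Parquet; every connection string, path, table name, and column is hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("customer_etl").getOrCreate()

    # Extract: a structured Oracle table plus semi-structured JSON events.
    accounts = (
        spark.read.format("jdbc")
        .option("url", "jdbc:oracle:thin:@//db-host:1521/BANKDB")  # hypothetical DSN
        .option("dbtable", "ACCOUNTS")                             # hypothetical table
        .load()
    )
    events = spark.read.json("s3://example-bucket/events/")        # hypothetical S3 path

    # Transform: join the sources and aggregate daily event counts per account.
    daily = (
        events.join(accounts, on="account_id", how="inner")
        .groupBy("account_id", F.to_date("event_ts").alias("event_date"))
        .agg(F.count("*").alias("event_count"))
    )

    # Load: write partitioned Parquet back to S3.
    daily.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/curated/daily_events/"
    )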