Responsibilities :
- Work with business and technical leadership to understand requirements
- Design to the requirements and document the designs
- Write product-grade, performant code for data extraction, transformation, and loading using Spark and PySpark (see the sketch after the Experience Desired list)
- Do data modeling as needed for the requirements
- Write performant queries using Teradata SQL, Hive SQL, and Spark SQL against Teradata and Hive
- Implement DevOps pipelines to deploy code artifacts onto the designated platforms/servers, such as AWS or Hadoop edge nodes
- Implement Hadoop job orchestration using shell scripting, Apache Oozie, CA7 Enterprise Scheduler, and Airflow
- Troubleshoot issues, provide effective solutions, and monitor jobs in the production environment
- Participate in sprint planning, refinement/story-grooming sessions, daily scrums, demos, and retrospectives

Experience Desired :
- Experience with Jira and Confluence
- Health care domain knowledge is a plus
- Strong work experience using Hadoop as a data warehouse
- Experience with Agile and working knowledge of DevOps tools
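As a minimal, hypothetical sketch of the Spark/PySpark extraction, transformation, and loading work described in the responsibilities above (the table names, paths, and columns are illustrative assumptions, not part of the role):

```python
# Minimal PySpark ETL sketch; all table names, paths, and columns
# below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("etl-sketch")
    .enableHiveSupport()   # assumes a Hive metastore is available
    .getOrCreate()
)

# Extract: read a raw claims table from Hive.
claims = spark.table("staging.claims_raw")

# Transform: keep approved claims, then aggregate paid amounts
# per member per month.
monthly = (
    claims
    .filter(F.col("claim_status") == "APPROVED")
    .withColumn("claim_month", F.date_trunc("month", F.col("service_date")))
    .groupBy("member_id", "claim_month")
    .agg(F.sum("paid_amount").alias("total_paid"))
)

# Load: write partitioned Parquet to a warehouse location on S3.
(
    monthly.write
    .mode("overwrite")
    .partitionBy("claim_month")
    .parquet("s3a://example-bucket/warehouse/claims_monthly/")
)
```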
Education and Training Required :
Primary Skills :
- Spark, PySpark, shell scripting, Teradata, Hive, and Hadoop
- SQL (Teradata SQL, Hive SQL, and Spark SQL) and stored procedures (see the example after this list)
- Git, Jenkins, Artifactory
- Unix/Linux shell scripting (ksh) and basic administration of Unix servers
- CA7 Enterprise Scheduler
- AWS (S3, EC2, SNS, SQS, Lambda, ECS, Glue, IAM, and CloudWatch)
- Databricks (Delta Lake, notebooks, pipelines, cluster management, Azure/AWS integration)
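For the SQL skills listed above, a short Spark SQL example run through PySpark; the table and columns are again hypothetical, reusing the warehouse table from the earlier sketch:

```python
# Spark SQL example; assumes a Hive metastore and the hypothetical
# warehouse.claims_monthly table from the earlier ETL sketch.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("sql-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Top ten members by total paid amount in 2024.
top_members = spark.sql("""
    SELECT member_id,
           SUM(total_paid) AS total_paid
    FROM warehouse.claims_monthly
    WHERE claim_month >= DATE '2024-01-01'
    GROUP BY member_id
    ORDER BY total_paid DESC
    LIMIT 10
""")
top_members.show()
```

Equivalent queries could be written in Teradata SQL or Hive SQL against their respective engines; Spark SQL is shown here simply to stay within PySpark.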
Additional Skills :
- Exercises considerable creativity, foresight, and judgment in conceiving, planning, and delivering initiatives.
Skills Required :
Hive, Spark, Teradata, Databricks, AWS