- Experience working and communicating with business stakeholders and architects
- Industry experience developing big data / ETL data warehouse solutions and building cloud-native data pipelines
- Experience in Python, PySpark, Scala, Java and SQL
- Strong object-oriented and functional programming experience in Python
- Experience working with REST and SOAP-based APIs to extract data for data pipelines
- Extensive experience working with Hadoop and related processing frameworks such as Spark, Hive, Sqoop, etc.
- Experience working in a public cloud environment, particularly AWS, is mandatory
- Ability to implement solutions with AWS Virtual Private Cloud, EC2, AWS Data Pipeline, AWS CloudFormation, Auto Scaling, AWS Simple Storage Service (S3), EMR and other AWS products, Hive, Athena
- Experience working with real-time data streams and the Kafka platform
- Working knowledge of workflow orchestration tools such as Apache Airflow, including designing and deploying DAGs
- Hands-on experience with performance and scalability tuning
- Professional experience in Agile / Scrum application development using JIRA
Nice to Have
- Teradata and Snowflake experience.
- Professional experience with source control, merging strategies and coding standards, specifically Bitbucket / Git and deployment through Jenkins pipelines.
- Demonstrated experience developing in a continuous integration / continuous delivery (CI / CD) environment using tools such as Jenkins and CircleCI.
- Demonstrated ability to maintain the build and deployment process through the use of build integration tools.
- Experience designing instrumentation into code and using and integrating with software logging and analysis tools such as log4Python, New Relic, SignalFx and / or Splunk.
Skills Required
AWS, EMR, S3, Data Pipeline, Python, PySpark, Scala, Java