As a data engineer, you will be responsible for designing and developing data assets by collaborating with stakeholders.
Key Responsibilities:
- Work with stakeholders to design and develop complex ETL processes.
- Create data integrations and accompanying documentation.
- Lead validation, UAT, and regression tests for new data asset creation.
- Develop data pipelines that automate data flow, ensuring quality and consistency.
Qualifications and Skills:
- Strong knowledge of Python, PySpark, and SQL, with the ability to write scripts for developing data workflows.
- Expertise in Azure, Databricks, Hadoop, Hive, and Greenplum.
- Proficiency in writing queries for metadata and tables across various data management systems.
- Familiarity with big data technologies such as Spark and distributed computing frameworks.
- Experience with Hue, running Hive SQL queries, and scheduling Apache Oozie jobs to automate data workflows.
- Ability to establish comprehensive data quality test cases and procedures.
- Effective communication and collaboration skills with stakeholders and business teams.
- Strong problem-solving and analytical skills.
- Bachelor's degree in Data Science, Statistics, Computer Science, or a related field.
- 4-7 years of experience as a data engineer.