Job Summary:
We are seeking a highly skilled and motivated Azure Data Engineer to join our innovative data engineering team.
The ideal candidate will have deep expertise in designing, building, and optimizing end-to-end data pipelines and data platforms within the Microsoft Azure ecosystem.
This role involves ingesting, transforming, and modeling large volumes of data from diverse sources, including legacy on-premise systems such as mainframes, to enable accurate, timely, and actionable business insights.
Your contributions will be pivotal in building scalable, secure, and high-performance data solutions that support enterprise-wide analytics and reporting.

Responsibilities:
- Design, develop, deploy, and maintain robust, scalable, and efficient data pipelines using Azure Data Factory and Azure Databricks (leveraging Apache Spark with PySpark or Scala) to process both batch and streaming data workloads.
- Build and optimize semantic data models, ensuring high performance, maintainability, and seamless integration with platforms such as Rahona or equivalent BI/reporting tools.
- Extract, transform, and load data from a wide range of on-premise sources including mainframe systems, SQL Server, Oracle databases, and file-based systems, ensuring data integrity and completeness throughout the pipeline.
- Utilize big data frameworks and tools such as Sqoop for data transfer between relational databases and Hadoop, and leverage Hadoop ecosystem components for large-scale data processing where appropriate.
- Manage comprehensive metadata and documentation for data ingestion workflows and business requirements, utilizing tools like Microsoft Excel and centralized documentation repositories to ensure transparency and traceability.
- Write clean, modular, and reusable code using Python, Scala, or Java to automate data transformations, pipeline orchestration, and error handling processes.
- Use advanced SQL techniques (T-SQL, PL/SQL) to perform complex querying, data manipulation, and aggregation to support downstream data needs and analytics.
- Manage secure file transfers and ingestion processes via dedicated mailboxes or secure FTP, ensuring compliance with data security policies and governance.
- Develop and maintain automated CI/CD pipelines using tools like Git for version control and Jenkins for build, test, and deployment automation, enabling faster and more reliable delivery of data engineering solutions.
- Schedule, monitor, and troubleshoot data jobs and workflows using orchestration and scheduling tools such as Autosys or Oozie, ensuring timely and successful execution of data pipelines.
- Work closely with data scientists, analysts, business stakeholders, and IT teams to gather requirements, design scalable data solutions, and deliver high-quality outputs that align with business goals.
- Continuously monitor data pipeline health and performance, identify bottlenecks, troubleshoot issues proactively, and implement improvements to ensure data quality, reliability, and timeliness.

Skills & Experience:
- Demonstrated expertise with Azure Data Factory, Azure Databricks, and Apache Spark (PySpark or Scala) for building scalable data pipelines.
- Strong background in data modeling, schema design, and query performance tuning within cloud data platforms.
- In-depth knowledge of various data ingestion patterns and techniques, particularly for extracting data from complex on-premise legacy systems into cloud environments.
- Proficiency in programming with Python, Scala, or Java, including writing production-quality code for data transformation and automation.
- Advanced skills in SQL (T-SQL, PL/SQL) for data extraction, transformation, and complex querying.
- Hands-on experience with Git for source control and Jenkins or similar tools for CI/CD pipeline creation and maintenance.
- Familiarity with job orchestration and scheduling tools such as Autosys or Oozie to manage and automate workflows efficiently.
- Strong competency in metadata management and documentation practices using Microsoft Excel or equivalent tools.
- Excellent analytical, problem-solving, and debugging skills with meticulous attention to detail.
- Ability to work effectively both independently and as part of a collaborative team in an Agile development environment.
(ref: hirist.tech)