Job Description: Data Engineer, Azure Data Pipelines & Governance
Overview:
We are seeking a hands-on Data Engineer to develop, optimize, and maintain automated data pipelines supporting data governance and analytics initiatives.
This role will focus on building production-ready workflows for ingestion, transformation, quality checks, lineage capture, access auditing, cost usage analysis, retention tracking, and metadata integration, primarily using Azure Databricks, Azure Data Lake, and Microsoft Purview.
Location: Offshore.
Experience: 4+ years in data engineering, with strong Azure and Databricks experience.
Key Responsibilities:
- Pipeline Development: Design, build, and deploy robust ETL/ELT pipelines in Databricks (PySpark, SQL, Delta Lake) to ingest, transform, and curate governance and operational metadata from multiple sources landed in Databricks.
- Granular Data Quality Capture: Implement profiling logic to capture issue-level metadata (source table, column, timestamp, severity, rule type) to support drill-down from dashboards into specific records and enable targeted remediation.
- Governance Metrics Automation: Develop data pipelines to generate metrics for dashboards covering data quality, lineage, job monitoring, access & permissions, query cost, usage & consumption, retention & lifecycle, policy enforcement, sensitive data mapping, and governance KPIs.
- Microsoft Purview Integration: Automate asset onboarding, metadata enrichment, classification tagging, and lineage extraction for integration into governance reporting.
- Data Retention & Policy Enforcement: Implement logic for retention tracking and policy compliance monitoring (masking, RLS, exceptions).
- Job & Query Monitoring: Build pipelines to track job performance, SLA adherence, and query costs for cost and performance optimization.
- Metadata Storage & Optimization: Maintain curated Delta tables for governance metrics, structured for efficient dashboard consumption.
- Testing & Troubleshooting: Monitor pipeline execution, optimize performance, and resolve issues quickly.
- Collaboration: Work closely with the lead engineer, QA, and reporting teams to validate metrics and resolve data quality issues.
- Security & Compliance: Ensure all pipelines meet organizational governance, privacy, and security standards.
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field.
- 4+ years of hands-on data engineering experience with Azure Databricks and Azure Data Lake.
- Proficiency in PySpark, SQL, and ETL/ELT pipeline design.
- Demonstrated experience building granular data quality checks and integrating governance logic into pipelines.
- Working knowledge of Microsoft Purview for metadata management, lineage capture, and classification.
- Experience with Azure Data Factory or equivalent orchestration tools.
- Understanding of data modeling, metadata structures, and data cataloging concepts.
- Strong debugging, performance tuning, and problem-solving skills.
- Ability to document pipeline logic and collaborate with cross-functional teams.
Preferred Qualifications:
- Microsoft certification in Azure Data Engineering.
- Experience in governance-heavy or regulated environments (e.g., finance, healthcare, hospitality).
- Exposure to Power BI or other BI tools as a data source consumer.
- Familiarity with DevOps / CI-CD for data pipelines in Azure.
- Experience integrating both cloud and on-premises data sources into Azure.
(ref: hirist.tech)