Key responsibilities
1. Architecture and roadmap
Define reference architectures for lakehouse and medallion patterns using Delta Lake, OneLake, and Synapse / Fabric Lakehouse for scalable analytics and AI.
Create domain-driven data models, canonical schemas, and patterns for batch and streaming integration (bronze / silver / gold).
2. Platform design and build
Design ingestion frameworks for batch (ADF / Fabric Pipelines) and streaming (Event Hubs, Kafka, IoT Hub) into ADLS / OneLake with Delta and Change Data Capture.
Architect Databricks workloads (PySpark / Scala / SQL) for ETL / ELT, feature engineering, and ML data prep with robust job orchestration and scheduling.
3. Real-time streaming
Lead Structured Streaming architectures in Databricks with exactly-once semantics, watermarking, and stateful aggregations; apply Kappa or Lambda architectures where appropriate.
Implement low-latency serving layers and materialized views for near-real-time analytics and operational reporting.
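For illustration, the watermarking and stateful-aggregation semantics called out above can be sketched outside Spark. This is a minimal, hand-rolled Python simulation of the late-data policy that Structured Streaming's watermark enforces; the event shape, 60-second windows, and 10-minute watermark are assumptions, not Spark API:

```python
from collections import defaultdict

def windowed_counts(events, window_s=60, watermark_s=600):
    """Aggregate event counts per event-time window, dropping events that
    arrive later than the watermark allows -- the same late-data policy a
    Structured Streaming watermark enforces (hand-rolled illustration)."""
    counts = defaultdict(int)   # window_start -> count (the "state store")
    max_event_time = 0          # high-water mark of observed event time
    dropped = 0
    for event_time, _payload in events:  # events arrive in processing order
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - watermark_s
        if event_time < watermark:
            dropped += 1        # too late: its window's state is finalized
            continue
        window_start = (event_time // window_s) * window_s
        counts[window_start] += 1
    return dict(counts), dropped

# An out-of-order event within the watermark still updates its window;
# one beyond the watermark is discarded.
events = [(0, "a"), (30, "b"), (700, "c"), (650, "late-ok"), (50, "too-late")]
result, dropped = windowed_counts(events)
```

Here the event at time 650 lands after the event at 700 but is still inside the 600-second watermark, so its window is updated; the event at time 50 is beyond the watermark and is dropped.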
4. Microsoft Fabric implementation
Establish Fabric workspaces, Lakehouse, Pipelines, Dataflows Gen2, Shortcuts to ADLS / OneLake, and semantic model standards for governed self-service BI.
Define data product patterns integrating Fabric with Databricks and Power BI for governed, reusable datasets.
5. Data governance and security
Implement RBAC / ABAC, Unity Catalog, Purview (lineage, glossary, classifications), encryption, network isolation, and data masking / tokenization.
Define data quality SLAs, expectations, and contracts; embed quality checks, observability, and lineage in pipelines.
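The in-pipeline quality checks and data contracts above can be sketched as a tiny expectation runner. This is a hand-rolled illustration of the pattern, not the Great Expectations API; the column names and contract are hypothetical:

```python
def check_expectations(rows, expectations):
    """Run named expectations over a batch and return a report a pipeline
    can gate on (fail fast, quarantine, or alert)."""
    failures = []
    for name, predicate in expectations.items():
        bad = [r for r in rows if not predicate(r)]
        if bad:
            failures.append((name, len(bad)))  # expectation name, failing rows
    return {"passed": not failures, "failures": failures, "row_count": len(rows)}

# Hypothetical contract for an orders feed: non-null key, non-negative amount.
expectations = {
    "order_id_not_null": lambda r: r.get("order_id") is not None,
    "amount_non_negative": lambda r: r.get("amount", 0) >= 0,
}
rows = [{"order_id": 1, "amount": 10.0}, {"order_id": None, "amount": -5.0}]
report = check_expectations(rows, expectations)
```

A real deployment would emit the report to observability tooling (e.g. Log Analytics) and record lineage alongside it; the gating logic is the same.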
6. DevOps and FinOps
Standardize CI / CD (Azure DevOps / GitHub), environment strategy, IaC (Bicep / Terraform), cluster policies, and workspace baselines.
Optimize cost via right-sized clusters, autoscaling, Photon, Delta optimization / Z-Order, and job scheduling.
7. Delivery leadership
Lead design reviews, threat modeling, performance testing, and production readiness; mentor engineers and partner with product / enterprise architects.
Translate business requirements into technical designs, estimates, and roadmaps; drive stakeholder communication and risk management.
Required skills and experience
8–12 years in data engineering / architecture, including 4+ years on the Azure data stack; proven leadership of complex enterprise programs.
Deep expertise
Databricks: PySpark / SQL, Delta Lake, Structured Streaming, Jobs / Workflows, Unity Catalog, cluster policies, performance tuning.
Azure: ADLS Gen2, Event Hubs / Kafka, Azure Functions / Logic Apps, Key Vault, ADF, Synapse; VNets, Private Endpoints, Managed Identity.
Fabric: Lakehouse, OneLake, Pipelines, Dataflows Gen2, Shortcuts, semantic models, governance integration with Purview and Power BI.
Architecture patterns
Lakehouse, medallion, Data Mesh / data products, CDC with Debezium / Fivetran / ADF mapping data flows, SCD handling, schema evolution.
Batch and streaming design, watermarking, state store management, idempotency, backfills, and late / duplicate data handling.
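The idempotency and late / duplicate handling listed above can be sketched as a keyed merge that is safe to replay. This is a plain-Python stand-in for the property a Delta MERGE on (key, version) provides; the field names are assumptions:

```python
def merge_cdc(target, changes):
    """Apply CDC changes idempotently: keep the latest version per key, so
    replays, duplicates, and out-of-order arrivals all converge to the same
    state (hand-rolled illustration of a Delta MERGE on key + version)."""
    state = dict(target)  # key -> {"version": int, "value": ...}
    for change in changes:
        current = state.get(change["key"])
        if current is None or change["version"] > current["version"]:
            state[change["key"]] = {"version": change["version"],
                                    "value": change["value"]}
        # equal or older version: a duplicate or late event, safely ignored
    return state

changes = [
    {"key": "a", "version": 1, "value": "v1"},
    {"key": "a", "version": 2, "value": "v2"},
    {"key": "a", "version": 1, "value": "v1"},  # late duplicate, ignored
]
state = merge_cdc({}, changes)
```

Because the merge keys on version rather than arrival order, re-running a backfill over the same change batch leaves the target unchanged, which is exactly what makes backfills safe to retry.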
Data management
Dimensional and semantic modeling, Data Vault / Kimball, query performance, partitioning, Z-Order, OPTIMIZE / VACUUM, file sizing.
DQ frameworks (Great Expectations / Deequ), monitoring / observability (Log Analytics, Databricks metrics), SLA / SLO design.
Security and compliance
Purview lineage and classification, Unity Catalog governance, PII / PHI handling, encryption, tokenization; audit, SOC2 / ISO, GDPR / DPDP familiarity.
DevOps / IaC and automation
Git-based development, branch strategies, CI / CD for notebooks / SQL / artifacts, IaC for data resources, automated testing.
Communication and leadership
Strong stakeholder engagement, technical writing, solution estimation, and mentoring.
Nice to have
Experience with data products and mesh operating models; product lifecycle and contracts between producer / consumer domains.
ML / feature store integration (Databricks Feature Store), MLOps awareness for data readiness.
Knowledge of dbt, Terraform, Airflow, Confluent, and enterprise SSO / SCIM provisioning with Databricks / Fabric.
Qualifications
Bachelor’s / Master’s in Computer Science, Engineering, or related field.
Certifications: Azure Solutions Architect Expert, Azure Data Engineer Associate, Databricks Data Engineer Professional / Associate, Microsoft Fabric Data Engineer Associate.
Cloud Architect • Mount Abu, IN