Description:
We are building a next-generation Customer Data Platform (CDP) powered by the Databricks Lakehouse architecture and Lakehouse Engine framework.
We're looking for a skilled Data Engineer with 4-9 years of experience to help us build metadata-driven pipelines, enable real-time data processing, and support marketing campaign orchestration capabilities at scale.
The core responsibilities for the job include the following:
Lakehouse Engine Implementation:
- Configure and extend the Lakehouse Engine framework for batch and streaming pipelines.
- Implement the medallion architecture (Bronze -> Silver -> Gold) using Delta Lake.
- Develop metadata-driven ingestion patterns from various customer data sources (see the sketch after this list).
- Build reusable transformers for PII handling, data standardization, and data quality enforcement.
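For illustration, here is a minimal PySpark sketch of a metadata-driven Bronze-to-Silver step of the kind described above. The config dict, table names, paths, and column names are hypothetical placeholders; the open-source Lakehouse Engine framework drives equivalent steps from its own configuration format rather than this API.

```python
# Hypothetical sketch of a metadata-driven Bronze -> Silver step.
# The `config` dict, paths, and table/column names are illustrative,
# not the Lakehouse Engine's actual API.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

config = {
    "source_path": "s3://cdp-landing/crm/customers/",  # placeholder bucket
    "bronze_table": "cdp.bronze_customers",
    "silver_table": "cdp.silver_customers",
    "pii_columns": ["email", "phone"],                 # columns to mask
}

# Bronze: ingest raw files as-is, stamped with load metadata.
raw = (spark.read.format("json").load(config["source_path"])
       .withColumn("_ingested_at", F.current_timestamp()))
raw.write.format("delta").mode("append").saveAsTable(config["bronze_table"])

# Silver: standardize and hash PII, driven entirely by the config.
silver = spark.read.table(config["bronze_table"])
for col in config["pii_columns"]:
    silver = silver.withColumn(col, F.sha2(F.col(col).cast("string"), 256))
(silver.dropDuplicates(["customer_id"])  # "customer_id" is a placeholder key
       .write.format("delta").mode("overwrite")
       .saveAsTable(config["silver_table"]))
```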
Real-Time CDP Enablement:
- Build Spark Structured Streaming pipelines for customer behavior and event tracking.
- Set up Debezium + Kafka for Change Data Capture (CDC) from CRM systems (see the sketch after this list).
- Design and develop identity resolution logic across both streaming and batch datasets.
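As a rough sketch of the streaming side, the following consumes a Debezium change-data topic from Kafka with Structured Streaming and appends the events to a Delta table. The broker address, topic name, envelope schema, and checkpoint path are all assumptions for illustration, not a confirmed setup.

```python
# Hypothetical sketch: read a Debezium CDC topic with Structured Streaming
# and land the change events in a Delta Bronze table.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.getOrCreate()

# Debezium wraps every change event in an envelope with before/after images.
after_schema = StructType([
    StructField("customer_id", StringType()),
    StructField("email", StringType()),
    StructField("updated_at", LongType()),
])
envelope = StructType([
    StructField("op", StringType()),   # c = create, u = update, d = delete
    StructField("after", after_schema),
])

events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")   # placeholder
          .option("subscribe", "crm.public.customers")        # placeholder topic
          .load()
          .select(F.from_json(F.col("value").cast("string"), envelope)
                   .alias("evt")))

changes = (events.select("evt.op", "evt.after.*")
           .filter(F.col("op") != "d"))   # this sketch ignores deletes

query = (changes.writeStream.format("delta")
         .option("checkpointLocation", "/chk/crm_customers")  # placeholder path
         .outputMode("append")
         .toTable("cdp.bronze_crm_customers"))
```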
DataOps and Governance:
- Use Unity Catalog for managing RBAC, data lineage, and auditability.
- Integrate Great Expectations or similar tools for continuous data quality monitoring (see the sketch after this list).
- Set up CI/CD pipelines for deploying Databricks notebooks, jobs, and DLT pipelines.
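For a flavor of declarative quality enforcement, here is a short sketch using Delta Live Tables expectations rather than Great Expectations (which would express similar rules as an expectation suite). Table names and rules are illustrative only, and the snippet assumes it runs inside a DLT pipeline.

```python
# Illustrative Delta Live Tables snippet: declarative quality rules on a
# Silver table. Table names and rule conditions are placeholders.
import dlt
from pyspark.sql import functions as F

@dlt.table(name="silver_customers", comment="Standardized customer records")
@dlt.expect_or_drop("valid_id", "customer_id IS NOT NULL")  # drops bad rows
@dlt.expect("has_email", "email IS NOT NULL")               # logged, not enforced
def silver_customers():
    return (dlt.read_stream("bronze_customers")
            .withColumn("email", F.lower(F.col("email"))))
```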
Requirements:
- 4-9 years of hands-on experience in data engineering.
- Expertise in the Databricks Lakehouse platform, Delta Lake, and Unity Catalog.
- Advanced PySpark skills, including Structured Streaming.
- Experience implementing Kafka + Debezium CDC pipelines.
- Strong SQL transformation, data modeling, and analytical querying skills.
- Familiarity with metadata-driven architecture and parameterized pipelines.
- Understanding of data governance: PII masking, access controls, and lineage tracking.
- Proficiency in working with AWS, MongoDB, and PostgreSQL.
Nice to Have:
- Experience working on Customer 360 or Martech CDP platforms.
- Familiarity with Martech tools like Segment, Braze, or other CDPs.
- Exposure to ML pipelines for segmentation, scoring, or personalization.
- Knowledge of CI/CD for data workflows using GitHub Actions, Terraform, or the Databricks CLI.