MDM Engineer

Confidential • Hyderabad / Secunderabad, Telangana
30+ days ago
Job description

Roles & Responsibilities:

  • Develop distributed data pipelines using PySpark on Databricks for ingesting, transforming, and publishing master data
  • Write optimized SQL for large-scale data processing, including complex joins, window functions, and CTEs for MDM logic
  • Implement match/merge algorithms and survivorship rules using Informatica MDM or Reltio APIs
  • Build and maintain Delta Lake tables with schema evolution and versioning for master data domains
  • Use AWS services like S3, Glue, Lambda, and Step Functions for orchestrating MDM workflows
  • Automate data quality checks using IDQ or custom PySpark validators with rule-based profiling
  • Integrate external enrichment sources (e.g., D&B, LexisNexis) via REST APIs and batch pipelines
  • Design and deploy CI/CD pipelines using GitHub Actions or Jenkins for Databricks notebooks and jobs
  • Monitor pipeline health using Databricks Jobs API, CloudWatch, and custom logging frameworks
  • Implement fine-grained access control using Unity Catalog and attribute-based policies for MDM datasets
  • Use MLflow for tracking model-based entity resolution experiments if ML-based matching is applied
  • Collaborate with data stewards to expose curated MDM views via REST endpoints or Delta Sharing

Basic Qualifications and Experience:

  • 8 to 13 years of experience in Business, Engineering, IT or related field

Functional Skills:

Must-Have Skills:

  • Advanced proficiency in PySpark for distributed data processing and transformation
  • Strong SQL skills for complex data modeling, cleansing, and aggregation logic
  • Hands-on experience with Databricks including Delta Lake, notebooks, and job orchestration
  • Deep understanding of MDM concepts including match/merge, survivorship, and golden record creation
  • Experience with MDM platforms like Informatica MDM or Reltio, including REST API integration
  • Proficiency in AWS services such as S3, Glue, Lambda, Step Functions, and IAM
  • Familiarity with data quality frameworks and tools like Informatica IDQ or custom rule engines
  • Experience building CI/CD pipelines for data workflows using GitHub Actions, Jenkins, or similar
  • Knowledge of schema evolution, versioning, and metadata management in data lakes
  • Ability to implement lineage and observability using Unity Catalog or third-party tools
  • Comfort with Unix shell scripting or Python for orchestration and automation
  • Hands-on experience with RESTful APIs for ingesting external data sources and enrichment feeds

Good-to-Have Skills:

  • Experience with Tableau or Power BI for reporting MDM insights.
  • Exposure to Agile practices and tools (JIRA, Confluence).
  • Prior experience in Pharma / Life Sciences.
  • Understanding of compliance and regulatory considerations in master data.

Professional Certifications:

  • Any MDM certification (e.g., Informatica, Reltio)
  • Any data analysis certification (SQL, Python, PySpark, Databricks)
  • Any cloud certification (AWS or Azure)

Soft Skills:

  • Strong analytical abilities to assess and improve master data processes and solutions.
  • Excellent verbal and written communication skills, with the ability to convey complex data concepts clearly to technical and non-technical stakeholders.
  • Effective problem-solving skills to address data-related issues and implement scalable solutions.
  • Ability to work effectively with global, virtual teams.

Skills Required:

    SQL, Python, PySpark, Databricks, Informatica
