Job Description:
- Assess and understand current-state Informatica ETL jobs, mappings, workflows, and dependencies.
- Redesign and implement equivalent data pipelines using Databricks (PySpark/Scala/SQL).
- Optimize performance and scalability of the migrated workflows in the Databricks environment.
- Collaborate with data architects, DevOps, QA, and business stakeholders to validate functionality and ensure successful migration.
- Develop automated testing and validation scripts to verify data integrity post-migration.
- Establish CI/CD practices for deploying Databricks notebooks and jobs.
- Document design patterns, migration strategy, and data lineage for audit and compliance.
Responsibilities:
Assess and Analyze Informatica ETL Workflows:
Evaluate existing Informatica PowerCenter jobs, mappings, workflows, and interdependencies to understand business logic and data flow.
Redesign ETL Pipelines in Databricks:
Re-engineer Informatica-based ETL processes into scalable, maintainable data pipelines using Databricks (PySpark, Scala, SQL), aligning with modern data architecture principles.
Performance Tuning and Optimization:
Enhance performance, scalability, and resource utilization of Spark-based jobs through optimization techniques such as caching, partitioning, and efficient joins.
Cross-functional Collaboration:
Work with data architects, DevOps, QA, and business stakeholders to validate functionality and ensure the successful migration of ETL workloads.
Automated Testing and Data Validation:
Design and implement automated test scripts to validate data quality, transformation logic, and integrity post-migration.
CI/CD Implementation:
Establish robust CI/CD pipelines for Databricks notebooks and jobs using tools like Azure DevOps, GitHub Actions, or Jenkins, enabling version control and seamless deployments.
Documentation and Compliance:
Create comprehensive documentation covering design patterns, migration strategies, data lineage, and operational runbooks to support audit, compliance, and ongoing maintenance.
(ref: hirist.tech)