Your team responsibilities :
The Data Technology group at MSCI is responsible for building and maintaining a state-of-the-art data management platform that delivers Reference, Market, and other critical data points to various products of the firm.
The platform, hosted in the firm's data centers and on the Azure and GCP public clouds, processes 100 TB+ of data and is expected to run 24/7. With an increased focus on automation around systems development and operations, data-science-based quality control, and cloud migration, several tech stack modernization initiatives are currently in progress.
To accomplish these initiatives, we are seeking a highly motivated and innovative individual to join the Data Engineering team to support our next generation of developer tools and infrastructure.
The team is the hub around which the Engineering and Operations teams revolve for automation, and it is committed to providing self-serve tools to our internal
Key responsibilities :
- Implement & Maintain Data Catalogs : Deploy and manage the Collibra data catalog tool to improve data discoverability and governance.
- Metadata & Lineage Management : Automate metadata collection, establish data lineage, and maintain consistent data definitions across systems.
- Enable Data Governance : Collaborate with governance teams to apply data policies, classifications, and ownership structures in the catalog.
- Support Self-Service & Adoption : Promote catalog usage across teams through training, documentation, and continuous support.
- Cross-Team Collaboration : Work closely with data engineers, analysts, and stewards to align catalog content with business needs.
- Tooling & Automation : Build scripts and workflows for metadata ingestion, tagging, and monitoring of catalog health.
- Leverage AI tools to automate cataloging activities.
- Reporting & Documentation : Maintain documentation and generate usage metrics, ensuring transparency and operational
Skills and experience that will help you excel :
- Self-motivated, collaborative individual with a passion for excellence
- Degree in Computer Science or equivalent, with 5+ years of total experience, including at least 2 years working with Azure DevOps tools and technologies
- Good working knowledge of source control tools such as Git, with prior experience building deployment workflows with them
- Good working knowledge of Snowflake, YAML, and Python
- Tools : Experience with data catalog platforms (e.g., Collibra, Alation, DataHub).
- Metadata & Lineage : Understanding of metadata management and data lineage.
- Scripting : Proficiency in SQL and Python for automation and integration.
- APIs & Integration : Ability to connect catalog tools with data sources using APIs.
- Cloud Knowledge : Familiarity with cloud data services (Azure, GCP).
- Data Governance : Basic knowledge of data stewardship, classification, and compliance.
- Collaboration : Strong communication skills to work across data and business teams
(ref : hirist.tech)