Roles & Responsibilities :
- Work with the Reference Data Product Owner, external resources, and other engineers as part of the product team
- Develop and maintain semantically appropriate concepts
- Identify and address conceptual gaps in both content and taxonomy
- Maintain ontology source vocabularies for new or edited codes
- Support product teams to help them leverage taxonomic solutions
- Analyze data from public and internal datasets
- Develop a data model / schema for the taxonomy
- Create a taxonomy in Semaphore Ontology Editor
- Bulk-import data templates into Semaphore to add or update terms in taxonomies
- Prepare SPARQL queries to generate ad hoc reports
- Perform gap analysis on current and updated data
- Maintain taxonomies in Semaphore through the change management process
- Develop and optimize automated data ingestion pipelines using Python / PySpark when APIs are available
- Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs
- Identify and resolve complex data-related challenges
- Participate in sprint planning meetings and provide estimates for technical implementation
Basic Qualifications and Experience :
- Any degree with 5 to 9 years of experience in Business, Engineering, IT, or a related field
Functional Skills :
Must-Have Skills :
- Knowledge of controlled vocabularies, classification, ontology, and taxonomy
- Experience in ontology development using Progress Semaphore or a similar tool such as PoolParty
- Hands-on experience writing SPARQL queries on graph data
- Excellent problem-solving skills and the ability to work with large, complex datasets
- Strong understanding of data modeling, data warehousing, and data integration concepts
Good-to-Have Skills :
- Hands-on experience writing SQL using any RDBMS (Redshift, Postgres, MySQL, Teradata, Oracle, etc.)
- Experience using cloud services such as AWS, Azure, or GCP
- Experience working in a product team environment
- Knowledge of Python / R, Databricks, and cloud data platforms
- Knowledge of NLP (Natural Language Processing) and AI (Artificial Intelligence) for extracting and standardizing controlled vocabularies
- Strong understanding of data governance frameworks, tools, and best practices
Professional Certifications :
- Databricks Certificate preferred, Progress Semaphore
- SAFe Practitioner Certificate preferred
- Any Data Analysis certification (SQL, Python)
- Any cloud certification (AWS or Azure)
Soft Skills :
- Strong analytical abilities to assess and improve master data processes and solutions
- Excellent verbal and written communication skills, with the ability to convey complex data concepts clearly to technical and non-technical stakeholders
- Effective problem-solving skills to address data-related issues and implement scalable solutions
- Ability to work effectively with global, virtual teams
Skills Required :
- SQL, Python, Data Modeling, Data Governance, API Development