Job Purpose :
To lead the development and management of robust data infrastructure and pipelines across the CBG network, enabling the delivery of data as a product. This role ensures high-quality, accessible, and timely data is available for advanced analytics, reporting, and decision-making across all operational and business functions.
Key Responsibilities :
Data Infrastructure Development :
- Design, implement, and manage scalable data architecture for collecting, storing, and processing large volumes of data from CBG plant systems (SCADA, PLC, IoT devices, SAP, EMS, LIMS, etc.).
- Own the cloud / on-prem data lake, data warehouse, and structured databases supporting both real-time and batch processing.
Pipeline Engineering & Orchestration :
- Develop and maintain robust, automated data pipelines using modern ETL / ELT tools.
- Ensure reliability, efficiency, and monitoring of all data flows from source to destination systems.
Data Quality & Governance :
- Implement processes to ensure data accuracy, consistency, completeness, and freshness.
- Work with Data Governance and Compliance teams to define standards, validation rules, and audit trails.
Cross-Functional Collaboration :
- Collaborate with data scientists, business analysts, application teams, and plant operations to understand and prioritize data requirements.
- Enable self-service data access through APIs, secure dashboards, and curated datasets.
Metadata & Cataloguing :
- Maintain a data catalogue and lineage tracking system to improve data discoverability and reusability across the organization.
- Provide documentation and training on data schema, usage, and access policies.
Security & Compliance :
- Ensure data is stored and accessed securely, following best practices in encryption, role-based access, and regulatory compliance.
Key Skills :
- B.E. / B.Tech
- Expertise in SQL, Python, DevOps, and distributed data technologies (e.g., Spark, Kafka).
- Experience with cloud platforms such as AWS, Azure, or GCP, and associated data services.
- Strong understanding of CI / CD for data pipelines and MLOps integration.
- Familiarity with industrial data sources (OPC-UA, MQTT, SCADA systems) is highly desirable.
- Excellent leadership, documentation, and stakeholder communication skills.