Description :
Job Title : GCP AI Data Engineer
Location : Chennai - Onsite - Hybrid
Notice Period : Must be able to join within 30 days.
Job Summary :
We are seeking a skilled and motivated Software Engineer - AI / ML / GenAI with 3+ years of experience to join our team. You will be responsible for designing, building, and optimizing our data infrastructure on the Google Cloud Platform, with a special focus on enabling and operationalizing AI and machine learning workflows.
The ideal candidate will have a strong background in GCP data services, data pipeline development (ETL / ELT), and Infrastructure as Code (IaC). You will collaborate closely with data scientists and other engineers to build scalable, end-to-end data solutions.
Key Responsibilities :
Data Pipeline & Architecture :
- Design and build scalable data solutions, including enterprise data warehouses and data lakes on GCP.
- Develop and optimize robust ETL / ELT pipelines using Dataflow, Data Fusion, Dataproc, Python, and SQL.
- Architect and implement strategies for both historical and incremental data loads, refining the architecture as needed (see the illustrative sketch after this list).
- Manage and optimize data storage and processing using services like BigQuery, Cloud Storage, and PostgreSQL.
- Orchestrate complex data workflows using Cloud Composer (Airflow).
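For illustration of the orchestration and incremental-load responsibilities above, here is a minimal, hypothetical sketch of a Cloud Composer (Airflow) DAG that merges newly landed rows from a staging table into a BigQuery warehouse table. All project, dataset, and table names are placeholders, not details from this posting.

```python
# Minimal sketch, assuming Airflow 2.x with the Google provider package installed.
# All identifiers below (project, dataset, tables) are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

PROJECT = "my-gcp-project"   # placeholder GCP project ID
DATASET = "analytics"        # placeholder BigQuery dataset

# Incremental-load pattern: upsert staged rows into the warehouse table.
MERGE_SQL = f"""
MERGE `{PROJECT}.{DATASET}.orders` AS target
USING `{PROJECT}.{DATASET}.orders_staging` AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN
  UPDATE SET status = source.status, updated_at = source.updated_at
WHEN NOT MATCHED THEN
  INSERT (order_id, status, updated_at)
  VALUES (source.order_id, source.status, source.updated_at)
"""

with DAG(
    dag_id="incremental_orders_load",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@hourly",  # rerun the incremental merge every hour
    catchup=False,
) as dag:
    merge_staging = BigQueryInsertJobOperator(
        task_id="merge_staging_into_orders",
        configuration={"query": {"query": MERGE_SQL, "useLegacySql": False}},
    )
```

The same MERGE pattern covers a one-time historical backfill followed by scheduled incremental runs.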
AI / ML & MLOps :
- Utilize ML services such as Vertex AI to support the operationalization of machine learning models (a minimal sketch follows this list).
- Build and manage data infrastructure to support AI / ML / LLM workflows, including data labeling, classification, and document parsing for structured and unstructured data.
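As a hypothetical illustration of operationalizing a model with Vertex AI, the sketch below registers a trained model artifact and deploys it to a managed endpoint. The project, bucket path, display name, and serving image are assumptions, not details from this posting.

```python
# Minimal sketch, assuming the google-cloud-aiplatform client library and a
# model artifact already exported to Cloud Storage. All names are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-gcp-project", location="us-central1")

# Register the artifact in the Vertex AI Model Registry.
model = aiplatform.Model.upload(
    display_name="document-classifier",                         # placeholder
    artifact_uri="gs://my-bucket/models/document-classifier/",  # placeholder
    # One of Vertex AI's prebuilt serving images; verify the current image
    # list for the framework and version you actually use.
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest",
)

# Deploy to a managed endpoint for online predictions; Vertex AI provisions
# and scales the serving nodes.
endpoint = model.deploy(machine_type="n1-standard-2")
print(endpoint.resource_name)
```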
Infrastructure & Automation (DevOps) :
- Automate infrastructure provisioning and deployments using Terraform (IaC).
- Implement and manage CI / CD practices (e.g., using Tekton) to ensure reliable and scalable deployments.
- Monitor, troubleshoot, and optimize data pipelines for performance and reliability.
Collaboration & Governance :
- Collaborate with data scientists, data engineers, and business stakeholders to understand data needs and deliver solutions.
- Ensure solutions align with business objectives, security requirements, and data governance policies.
- Create and maintain clear documentation for data processes, pipeline designs, and system architecture.
Required Qualifications & Skills :
Experience : A minimum of 3 years of hands-on experience in a data engineering role.
Education : A Bachelor's Degree in Computer Science, Engineering, or a related field.
GCP Data Stack :
- Warehousing & Lakes : Strong proficiency with BigQuery and Cloud Storage.
- Processing : Hands-on experience with Dataflow and Dataproc.
- Integration : Experience with Data Fusion.
- Orchestration : Proven experience with Cloud Composer (Airflow).
- AI / ML : Familiarity with GCP's AI services, particularly Vertex AI.
DevOps & Automation :
- IaC : Proficiency with Terraform.
- CI / CD : Experience with CI / CD principles and tools (e.g., Tekton, Jenkins, GitLab CI).
Databases : Experience with relational databases, specifically PostgreSQL.
Programming : Strong programming skills in Python and SQL.
Preferred Qualifications :
- Google Cloud Certified Professional Data Engineer.
- Experience with real-time streaming services (e.g., Pub / Sub).
- Deeper experience in building and supporting LLM or Generative AI workflows.
(ref : hirist.tech)