Responsibilities:
- Build and manage end-to-end ML pipelines using Vertex AI Pipelines, Airflow, or similar orchestrators (see the pipeline sketch after this list).
- Implement ML models using TensorFlow, PyTorch, or other frameworks; monitor model performance in production (evaluation sketch below).
- Manage Python environments with venv, pip, and Poetry, and containerized deployments with Docker.
- Design and implement distributed data pipelines with Apache Spark, Beam, or Flink (Beam sketch below).
- Develop data ingestion workflows from databases and API-based sources using GCP tools (BigQuery sketch below).
- Optimize pipelines and clusters for performance, cost, and scalability.
- Perform feature engineering, data engineering, and ML model evaluation.
- Develop and maintain CI/CD pipelines, deployment strategies, and infrastructure-as-code using Terraform.
- Design data architecture considering storage and compute options for efficient BI and analytics use cases.
- Optional / good to have: Kubernetes, vector databases (e.g., Qdrant), and LLM experience (embeddings, RAG, agents).
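
A few hedged sketches of the responsibilities above follow; every project ID, path, table name, and threshold in them is a hypothetical placeholder, not a detail of this role. First, a minimal pipeline defined with the KFP v2 SDK, a common way to author pipelines that Vertex AI Pipelines runs:

    # Minimal KFP v2 pipeline sketch; component logic is stubbed out.
    from kfp import dsl, compiler

    @dsl.component(base_image="python:3.11")
    def preprocess(raw_path: str) -> str:
        # Stand-in for reading raw data and writing features.
        return raw_path + "/features"

    @dsl.component(base_image="python:3.11")
    def train(features_path: str) -> str:
        # Stand-in for model training; returns a model artifact URI.
        return features_path + "/model"

    @dsl.pipeline(name="example-training-pipeline")
    def training_pipeline(raw_path: str = "gs://example-bucket/raw"):
        features = preprocess(raw_path=raw_path)
        train(features_path=features.output)

    if __name__ == "__main__":
        # Compile to a spec that can be submitted to Vertex AI Pipelines.
        compiler.Compiler().compile(training_pipeline, "pipeline.json")

The same step-per-task structure ports to Airflow by expressing each component as a task in a DAG.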
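
Next, a PyTorch evaluation loop of the kind that backs production model monitoring; the model shape, the synthetic batches, and the alert threshold are illustrative assumptions:

    import torch
    import torch.nn as nn

    # Toy classifier standing in for a deployed model.
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

    def evaluate(model: nn.Module, batches) -> float:
        """Return accuracy over an iterable of (features, labels) batches."""
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for x, y in batches:
                preds = model(x).argmax(dim=1)
                correct += (preds == y).sum().item()
                total += y.numel()
        return correct / total

    # Synthetic stand-in for a labelled sample of production traffic.
    batches = [(torch.randn(64, 16), torch.randint(0, 2, (64,))) for _ in range(4)]
    accuracy = evaluate(model, batches)
    if accuracy < 0.90:  # hypothetical alert threshold
        print(f"accuracy degraded to {accuracy:.2%}; investigate drift")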
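
Next, a minimal Apache Beam pipeline; the bucket paths are hypothetical, and the same code runs locally or on Dataflow by changing the runner in the pipeline options:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        options = PipelineOptions()  # e.g. --runner=DataflowRunner on GCP
        with beam.Pipeline(options=options) as p:
            (
                p
                | "Read" >> beam.io.ReadFromText("gs://example-bucket/events/*.csv")
                | "Parse" >> beam.Map(lambda line: line.split(","))
                | "KeyByUser" >> beam.Map(lambda row: (row[0], 1))
                | "CountPerUser" >> beam.CombinePerKey(sum)
                | "Format" >> beam.MapTuple(lambda user, n: f"{user},{n}")
                | "Write" >> beam.io.WriteToText("gs://example-bucket/out/counts")
            )

    if __name__ == "__main__":
        run()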
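
Finally, API-based ingestion into BigQuery is one concrete reading of the "GCP tools" bullet; this sketch assumes the google-cloud-bigquery and requests packages, a hypothetical API endpoint, and an existing table whose schema matches the payload:

    import requests
    from google.cloud import bigquery

    client = bigquery.Client()  # uses Application Default Credentials
    table_id = "example-project.analytics.events"  # hypothetical table

    resp = requests.get("https://api.example.com/v1/events", timeout=30)
    resp.raise_for_status()
    rows = resp.json()  # assumed: a list of JSON objects matching the schema

    # Stream rows into BigQuery; insert_rows_json returns per-row errors.
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")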
Skills Required
Python, Java, Machine Learning, TensorFlow, PyTorch, Docker