About the job
Job Title : ML Engineer (LLM Deployment & GPU)
Location : Remote (Pune preferred)
Experience : 4–5 Years
Overview
We are looking for an ML Engineer with hands-on experience in deploying large language models (LLMs) on GPU infrastructure. This role combines ML engineering with DevOps, focusing on scalable deployments, API integration, and optimization of LLM performance.
Key Responsibilities
Deploy and optimize LLMs on GPU-based infrastructure.
Build and manage APIs for model serving (Python-based).
Implement CI/CD, monitoring, and scaling for ML models.
Collaborate on prompt engineering and model optimization.
Manage containerized workloads (Docker / Kubernetes).
Requirements
4–5 years of ML / DevOps engineering experience.
Strong skills in Python, API development, and LLM architecture.
Experience with GPU deployments and cloud platforms (AWS / GCP / Azure).
Familiarity with prompt engineering and inference optimization.