TCS is Hiring!!!
Job Title : Machine Learning Engineer
Experience : 6-8 Years
Location : Ahmedabad, Chennai, Gurgaon, Hyderabad, Kolkata
Job Description :
- Develops, deploys, and maintains scalable machine learning (ML) models, with a specialized focus on Large Language Models (LLMs) and other foundation models.
- Designs and implements robust APIs and microservices for real-time model serving and inference, with a particular focus on optimizing latency and throughput for large generative models.
- Implements MLOps practices (CI/CD) to automate the testing, deployment, and monitoring of ML systems, including dedicated pipelines for fine-tuning and customizing LLMs.
- Optimizes model performance, latency, and resource consumption through techniques like quantization, pruning, and low-rank adaptation (LoRA) for efficient GenAI deployment.
- Establishes comprehensive monitoring systems for tracking model health, detecting data and concept drift, and monitoring prompt quality and token usage in production GenAI environments.
- Designs and implements ethical AI guardrails, content filters, and safety mechanisms to mitigate risks (e.g., bias, toxicity, and hallucination) associated with deployed GenAI applications.
- Manages the underlying infrastructure (e.g., specialized compute clusters) necessary to support the intensive training and serving requirements for large-scale GenAI workloads.
- Contributes to the engineering wiki, defining model versioning strategies, prompt management standards, and documenting deployment architectures and runbooks for GenAI systems.
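To make the model-optimization responsibility above concrete, here is a minimal sketch of symmetric int8 post-training quantization, one of the techniques the role lists. It is a pure-Python toy on a handful of weights; a real GenAI deployment would rely on framework tooling (e.g. PyTorch quantization or bitsandbytes) rather than hand-rolled code like this.

```python
def quantize_int8(weights):
    """Map float weights to int8 using a single symmetric scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.82, -1.27, 0.003, 0.51]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

assert all(-128 <= v <= 127 for v in q)   # values fit in int8
assert max_err <= scale / 2 + 1e-12       # error bounded by half a step
```

The same scale-and-round idea, applied per tensor or per channel, is what shrinks a foundation model's memory footprint roughly 4x versus float32 at a small accuracy cost.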
Required Skills :
- Expert proficiency in Python and specialized ML frameworks (e.g., TensorFlow, PyTorch, Scikit-learn), with specific expertise in transformer-based architectures.
- Deep understanding of various GenAI applications (e.g., text generation, summarization, code synthesis, vector storage and retrieval) and the trade-offs between different foundation models.
- Proficiency in GenAI orchestration frameworks such as LangChain or LlamaIndex to build complex, multi-step generative applications.
- Expertise in managing and integrating vector databases (e.g., Pinecone, Weaviate, Milvus) and embedding generation pipelines for semantic search and RAG systems.
- Ability to manage and process large, unstructured datasets, often leveraging tools like Spark for data preparation pipelines, including cleaning data specifically for LLM fine-tuning.
- Strong command of prompt engineering methodologies and capacity to design reusable, efficient prompt templates and management systems.
- Familiarity with cloud platforms (AWS, Azure, or GCP) and specialized GenAI model hubs and services (e.g., Hugging Face, OpenAI APIs, Anthropic, or proprietary cloud offerings).
- Skill in diagnosing and resolving complex technical issues related to inference latency, memory management, and cost optimization for large foundation models.
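The reusable prompt-template requirement above can be illustrated with a small sketch: a registry that stores named, versioned templates and renders them with required fields. The registry class, template names, and fields are illustrative assumptions for this posting, not any specific product's API.

```python
from string import Template

class PromptRegistry:
    """Stores named, versioned prompt templates and renders them safely."""

    def __init__(self):
        self._templates = {}

    def register(self, name, version, text):
        self._templates[(name, version)] = Template(text)

    def render(self, name, version, **fields):
        # substitute() raises KeyError on any missing field, so a
        # template/caller mismatch is caught before the prompt ever
        # reaches the model
        return self._templates[(name, version)].substitute(**fields)

registry = PromptRegistry()
registry.register(
    "summarize", "v1",
    "Summarize the following $doc_type in at most $max_words words:\n$text",
)
prompt = registry.render(
    "summarize", "v1",
    doc_type="support ticket", max_words=50, text="Printer jams on page 2.",
)
assert "at most 50 words" in prompt
```

Versioning the templates alongside the models they target is what lets a team roll back a prompt change independently of a model change, which is the point of treating prompts as managed artifacts.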