Job Title : Generative AI Engineer | Python | LLM | LangChain | HuggingFace | AWS / GCP / Azure
Experience : 3 to 5 Years (with a minimum of 2 years in Generative AI / LLM development)
Location : Gurgaon (work from office)
Employment Type :
About Us :
We are a fast-growing AI-first company building Generative AI products that solve real-world problems across industries. Our work blends cutting-edge AI research with production-grade software engineering to create scalable, high-performance AI solutions.
Overview :
We are seeking a Generative AI Engineer with expertise in Large Language Models (LLMs), transformer architectures, and AI product development. This role demands hands-on skills in fine-tuning models, optimizing inference, and deploying AI solutions on cloud platforms. You will work with commercial and open-source models (OpenAI, Anthropic, Gemini, LLaMA, Mistral) and integrate them into production-grade applications.
Key Responsibilities :
- Build & Deploy : Design, develop, and deploy production-grade Generative AI applications (not just proof-of-concepts).
- LLM Integration : Integrate commercial APIs (OpenAI, Anthropic, Gemini) and open-source models (LLaMA, Mistral, Vicuna) using LangChain, LlamaIndex, HuggingFace Transformers.
- Fine-tuning : Train and fine-tune LLMs / SLMs using LoRA, QLoRA, SFT and optimize inference using NVIDIA Triton, CUDA, TensorRT.
- Vector Search & RAG : Implement semantic search and Retrieval-Augmented Generation (RAG) pipelines using Chroma, Pinecone, FAISS, Weaviate, Qdrant (an illustrative retrieval sketch follows this list).
- Observability & Evaluation : Use tools like LangFuse, PromptLayer, WandB to monitor and evaluate AI system performance.
- API Development : Build and integrate robust APIs with FastAPI, Flask and orchestrate agentic workflows.
- Cloud & DevOps : Deploy on AWS, GCP, Azure using Docker, Kubernetes, Terraform, and GitHub.
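To ground the Vector Search & RAG responsibility above, here is a minimal, illustrative sketch of the retrieval step: embed a few documents, index them, and pull back the passages that would then be injected into an LLM prompt. The embedding model, sample documents, and query are placeholder assumptions, not part of the role's mandated stack (the role lists Chroma, Pinecone, FAISS, Weaviate, and Qdrant; FAISS is used here for brevity).

```python
# Minimal retrieval sketch for a RAG pipeline (illustrative only).
# Assumes `sentence-transformers` and `faiss-cpu` are installed; the model name,
# documents, and query below are placeholders.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Invoices are processed within five business days.",
    "Refunds above $500 require manager approval.",
    "New employees complete onboarding during their first week.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")            # small embedding model
doc_vecs = embedder.encode(docs, normalize_embeddings=True)   # unit-length vectors

index = faiss.IndexFlatIP(doc_vecs.shape[1])                  # inner product == cosine on normalized vectors
index.add(np.asarray(doc_vecs, dtype="float32"))

query = "How long does invoice processing take?"
q_vec = embedder.encode([query], normalize_embeddings=True)
scores, ids = index.search(np.asarray(q_vec, dtype="float32"), k=2)

# In a full RAG pipeline, the retrieved passages below would be concatenated
# into the prompt sent to the LLM (OpenAI, Anthropic, LLaMA, etc.).
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {docs[i]}")
```

In production, the in-memory list and flat index would typically be replaced by a managed vector store plus a chunking and metadata strategy; the overall shape of the code stays the same.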
Skills & Qualifications :
Experience : 3 to 5 years in software engineering / machine learning, with 2+ years in Generative AI.
Project Delivery : Must have delivered at least one end-to-end working GenAI product (to be demonstrated during the interview).
Technical Skills :
- Python (Advanced), LangChain, HuggingFace, LlamaIndex
- Transformer architecture, embeddings, tokenization, attention mechanisms
- LLM fine-tuning (LoRA, QLoRA, SFT); see the sketch after this list
- Vector databases (Chroma, Pinecone, FAISS, Weaviate)
- NVIDIA stack for inference optimization
- Cloud & Deployment : AWS SageMaker, GCP Vertex AI, Azure OpenAI, Kubernetes, Docker
- Mindset : Research-driven, innovation-oriented, and strong problem-solving ability
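As a concrete illustration of the fine-tuning skill above, here is a minimal sketch of attaching LoRA adapters to a causal language model with HuggingFace PEFT. The checkpoint name, rank, and target modules are assumptions chosen for the example, not a prescribed recipe.

```python
# Illustrative LoRA setup with HuggingFace Transformers + PEFT (sketch, not a full training loop).
# The checkpoint name and hyperparameters below are placeholder assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "mistralai/Mistral-7B-v0.1"     # any causal LM checkpoint; gated models need access approval
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

lora_cfg = LoraConfig(
    r=16,                                    # adapter rank
    lora_alpha=32,                           # scaling applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],     # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()           # typically well under 1% of total parameters
# From here, the wrapped model is trained with a standard SFT loop (e.g. TRL's SFTTrainer)
# on instruction data; QLoRA follows the same pattern with a 4-bit quantized base model.
```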
Preferred / Good to Have :
- RLHF, DPO, multi-agent AI systems
- Multi-modal AI (CLIP, LLaVA, BLIP)
- Model quantization (GGUF, GPTQ, AWQ)
- Contributions to open-source AI
Stack Exposure :
- Languages & Frameworks : Python, LangChain, HuggingFace Transformers, FastAPI, Flask
- LLM APIs : OpenAI, Anthropic, Gemini, HuggingFace, LLaMA, Mistral
- Fine-tuning Tools : LoRA, QLoRA, SFT, DPO
- Vector Databases : Chroma, Pinecone, FAISS, Weaviate, Qdrant
- Cloud & Deployment : AWS, GCP, Azure, Docker, Kubernetes, Terraform
- Observability : LangFuse, PromptLayer, WandB
Why Join Us ?
- Opportunity to work on state-of-the-art Generative AI solutions, from research to deployment
- Collaborate with top AI researchers and engineers
- Work on high-impact projects used globally
- Flexible work environment and growth-oriented culture