Job Title: Gen AI Architect
Experience: 8–12 Years
Work Mode: Work from Office (WFO)
Locations: Chennai | Bangalore | Hyderabad
Position Overview
We are looking for an experienced GenAI Architect to lead the design and implementation of cutting-edge Generative AI solutions that drive innovation, scalability, and enterprise transformation.
This role demands deep expertise in LLM-based architectures, RAG (Retrieval-Augmented Generation) systems, agentic workflow orchestration, and MLOps deployment. The ideal candidate will combine strong hands-on technical skills with architectural vision, ensuring secure, compliant, and high-performance GenAI ecosystems.
Key Responsibilities
Generative AI (GenAI) Architecture
- Architect and design enterprise-grade LLM-based systems optimized for performance, scalability, and reliability.
- Demonstrate strong understanding of vector embeddings, similarity search (cosine / IP / L2), chunking strategies, and reranking.
- Build and optimize RAG pipelines — including indexing, metadata, hybrid search, and evaluator frameworks.
- Apply prompt engineering for tool use, function calling, and multi-agent planning.
- Develop and orchestrate agentic workflows using frameworks such as LangGraph or similar; integrate tools and services via MCP-compatible patterns.
- Implement NL2SQL approaches, ensuring SQL safety through schema constraints, sandboxes, and secure API integrations.
- Evaluate and balance trade-offs between generic LLMs and fine-tuned models, optimizing for accuracy, latency, and operational cost.
- Embed security, compliance, and data governance principles within all AI workflows, including RBAC / ABAC, auditability, and privacy controls.
MLOps & Deployment
- Manage end-to-end deployment using frameworks such as MLflow, Kubeflow, SageMaker, or Vertex AI.
- Implement CI/CD pipelines for AI models, ensuring consistent delivery and robust version control.
- Use Docker and Kubernetes for model containerization, orchestration, and scalability.
- Establish monitoring, tracing, and observability for model performance, latency, and cost metrics.
Preferred Experience
- Hands-on experience with cloud platforms such as AWS, Azure, GCP, or OCI.
- Strong understanding of large-scale distributed systems and big-data frameworks (Spark, Hadoop).
- Expertise in retrieval optimization (hybrid lexical + vector search, metadata filtering, rerankers).
- Knowledge of model fine-tuning and adapter techniques (LoRA, SFT, DPO) with evaluation best practices.
- Experience in Document AI, layout parsing, and schema design for unstructured data.
- Familiarity with observability stacks for LLM applications: tracing, evaluation dashboards, and performance SLOs.
- Understanding of caching, batching, and KV-cache strategies for improved throughput and cost optimization.
- Exposure to secure tool-use patterns, constrained decoding, and policy-based control mechanisms.
Assessment Framework
- Portfolio Review: Evaluation of production-ready RAG or agentic systems, including design, architecture, and measurable outcomes.
- Technical Exercise: Design an intent router, justify model selection (base vs. fine-tuned), propose chunking and metadata strategies, and define evaluation metrics.
- Scenario Discussion: Identify and mitigate failure modes (hallucinations, tool errors, SQL risk).
- Governance Evaluation: Demonstrate understanding of access controls, PII handling, audit logging, and red-teaming practices.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Data Engineering, or a related discipline.
- 8–12 years of total experience, including at least 3 years in AI/ML architecture or GenAI system design.
- Proficiency in Python, LLM/RAG frameworks, and modern containerization and CI/CD workflows.
- Strong communication, leadership, and problem-solving abilities with a focus on delivering business-impactful AI solutions.