Lead the design and deployment of enterprise-grade generative AI systems, driving innovation in LLM orchestration, multimodal architectures, and scalable AI / ML pipelines. Own the full lifecycle from research to production, ensuring alignment with business objectives and ethical AI standards. This is a hands-on individual contributor role that also involves providing technical guidance to junior developers.
Key Responsibilities
- Technical Leadership
  - Architect multi-LLM systems (e.g., Mixture-of-Experts, LLM routing) for cost-performance optimization.
  - Design GPU / TPU-optimized training pipelines (FSDP, DeepSpeed) for billion-parameter models.
- Cloud-Native AI Development
  - Build multi-cloud GenAI platforms (Azure OpenAI + GCP Vertex AI + AWS Bedrock) with unified MLOps.
  - Implement enterprise security: VPC peering, private model endpoints, and data residency compliance.
- Innovation & Strategy
  - Pioneer GenAI use cases: agentic workflows, AI-driven synthetic data generation, real-time fine-tuning.
  - Establish AI governance frameworks: model cards, drift monitoring, and red-teaming protocols.
- Cross-Functional Impact
  - Partner with leadership to define AI roadmaps and ROI metrics (e.g., dollars saved via AI-driven automation).
  - Mentor junior engineers and evangelize GenAI best practices across the organization.
Qualifications
- Education: Bachelor's or Master's degree in CS / AI, or equivalent industry experience (5+ years in ML, 2+ in GenAI).
- Technical Mastery
  - Languages: Python.
  - Frameworks: Expert-level PyTorch, TensorFlow Extended (TFX), ONNX Runtime.
  - Cloud: Certified as Microsoft Azure AI Engineer Associate and / or Google Cloud Professional Machine Learning Engineer.
- GenAI Expertise
  - Shipped production GenAI systems (e.g., 10k+ QPS chatbots, code autocomplete at GitHub Copilot scale).
  - Advanced prompt / response engineering: self-critique chains, LLM cascades, guardrail-driven generation.
Must-Have Experience
- Cloud AI experience
  - Azure: Designed solutions with Azure OpenAI, MLOps Pipelines, and Cognitive Search.
  - GCP: Scaled Vertex AI LLM Evaluation, Gemini Multimodal, and TPU v5 Pods.
- High-Impact Projects
  - Delivered automation projects that produced significant cost savings.
  - Built RAG systems with hybrid search (vector + lexical) and dynamic data hydration.
  - Led AI compliance for regulated industries (healthcare, finance).
Preferred Qualifications
- Certifications
  - Azure: Microsoft Certified: Azure AI Engineer Associate.
  - GCP: Google Cloud Professional Machine Learning Engineer.
- Experience with hybrid / multi-cloud GenAI deployments (e.g., training on GCP TPUs, serving via Azure endpoints).
Skills Required
PyTorch, MLOps, Python