Inference Optimization Engineer (LLM and Runtime)


Sustainability Economics.ai, Nagpur, IN
11 hours ago
Job description

Location : Bengaluru, Karnataka

About the Company :

Sustainability Economics.ai is a global organization, pioneering the convergence of clean energy and AI, enabling profitable energy transitions while powering end-to-end AI infrastructure. By integrating AI-driven cloud solutions with sustainable energy, we create scalable, intelligent ecosystems that drive efficiency, innovation, and long-term impact across industries. Guided by exceptional leaders and visionaries with decades of expertise in finance, policy, technology, and innovation, we are committed to making long-term efforts to fulfil this vision through our technical innovation, client services, expertise, and capability expansion.

Role Summary :

We are seeking a highly skilled and innovative Inference Optimization Engineer (LLM and Runtime) to design, develop, and optimize cutting-edge AI systems that power intelligent, scalable, and agent-driven workflows. This role blends the frontier of generative AI research with robust engineering, requiring expertise in machine learning, deep learning, and large language models (LLMs), along with awareness of the latest industry trends. The ideal candidate will collaborate with cross-functional teams to build production-ready AI solutions that address real-world business challenges while keeping our platforms at the forefront of AI innovation.

Key Tasks and Accountability :

  • Optimize and customize large-scale generative models (LLMs) for efficient inference and serving.
  • Apply and evaluate advanced model optimization techniques such as quantization, pruning, distillation, tensor parallelism, and caching strategies to enhance model efficiency, throughput, and inference performance.
  • Implement custom fine-tuning pipelines using parameter-efficient methods (LoRA, QLoRA, adapters, etc.) to achieve task-specific goals while minimizing compute overhead.
  • Optimize runtime performance of inference stacks using frameworks like vLLM, TensorRT-LLM, DeepSpeed-Inference, and Hugging Face Accelerate.
  • Design and implement scalable model-serving architectures on GPU clusters and cloud infrastructure (AWS, GCP, or Azure).
  • Work closely with platform and infrastructure teams to reduce latency, memory footprint, and cost-per-token during production inference.
  • Evaluate hardware–software co-optimization strategies across GPUs (NVIDIA A100 / H100), TPUs, or custom accelerators.
  • Monitor and profile performance using tools such as Nsight, PyTorch Profiler, and Triton Metrics to drive continuous improvement.

Key Requirements :

Education & Experience

  • Ph.D. in Computer Science or a related field, with a specialization in Deep Learning, Generative AI, or Artificial Intelligence and Machine Learning (AI / ML).
  • 2–3 years of hands-on experience in large language model (LLM) or deep learning optimization, gained through academic or industry work.

Skills

  • Strong analytical and mathematical reasoning ability with a focus on measurable performance gains.
  • Collaborative mindset, with ability to work across research, engineering, and product teams.
  • Pragmatic problem-solver who values efficiency, reproducibility, and maintainable code over theoretical exploration.
  • Curiosity-driven attitude, keeping up with emerging model compression and inference technologies.

What You’ll Do

  • Take ownership of the end-to-end optimization lifecycle, from profiling bottlenecks to delivering production-optimized LLMs.
  • Develop custom inference pipelines capable of high throughput and low latency under real-world traffic.
  • Build and maintain internal libraries, wrappers, and benchmarking suites for continuous performance evaluation.

What You Will Bring

  • Hands-on experience building and optimizing machine learning or agentic systems at scale.
  • A builder’s mindset: bias toward action, comfort with experimentation, and enthusiasm for solving complex, open-ended challenges.
  • Startup DNA: comfort with ambiguity, love of fast iteration, and a flexible, growth-oriented mindset.

Why Join Us

  • Shape a first-of-its-kind AI + clean energy platform.
  • Work with a small, mission-driven team obsessed with impact.
  • An aggressive growth path.
  • A chance to leave your mark at the intersection of AI and sustainability.