Talent.com
No longer accepting applications
Inference Optimization Engineer (LLM and Runtime)

Sustainability Economics.ai, Visakhapatnam, IN
12 hours ago
Job description

Location : Bengaluru, Karnataka

About the Company :

Sustainability Economics.ai is a global organization, pioneering the convergence of clean energy and AI, enabling profitable energy transitions while powering end-to-end AI infrastructure. By integrating AI-driven cloud solutions with sustainable energy, we create scalable, intelligent ecosystems that drive efficiency, innovation, and long-term impact across industries. Guided by exceptional leaders and visionaries with decades of expertise in finance, policy, technology, and innovation, we are committed to making long-term efforts to fulfil this vision through our technical innovation, client services, expertise, and capability expansion.

Role Summary :

We are seeking a highly skilled and innovative Inference Optimization Engineer (LLM and Runtime) to design, develop, and optimize cutting-edge AI systems that power intelligent, scalable, and agent-driven workflows. This role blends frontier generative AI research with robust engineering and requires expertise in machine learning, deep learning, and large language models (LLMs), along with awareness of current industry trends. The ideal candidate will collaborate with cross-functional teams to build production-ready AI solutions that address real-world business challenges while keeping our platforms at the forefront of AI innovation.

Key Tasks and Accountability :

  • Optimize and customize large-scale generative models (LLMs) for efficient inference and serving.
  • Apply and evaluate advanced model optimization techniques, such as quantization, pruning, distillation, tensor parallelism, and caching strategies, to improve model efficiency, throughput, and inference performance.
  • Implement custom fine-tuning pipelines using parameter-efficient methods (LoRA, QLoRA, adapters, etc.) to achieve task-specific goals while minimizing compute overhead.
  • Optimize the runtime performance of inference stacks using frameworks like vLLM, TensorRT-LLM, DeepSpeed-Inference, and Hugging Face Accelerate.
  • Design and implement scalable model-serving architectures on GPU clusters and cloud infrastructure (AWS, GCP, or Azure).
  • Work closely with platform and infrastructure teams to reduce latency, memory footprint, and cost-per-token during production inference.
  • Evaluate hardware–software co-optimization strategies across GPUs (NVIDIA A100 / H100), TPUs, or custom accelerators.
  • Monitor and profile performance using tools such as Nsight, PyTorch Profiler, and Triton Metrics to drive continuous improvement.

Key Requirements :

Education & Experience

  • Ph.D. in Computer Science or a related field, with a specialization in Deep Learning, Generative AI, or Artificial Intelligence and Machine Learning (AI / ML).
  • 2–3 years of hands-on experience in large language model (LLM) or deep learning optimization, gained through academic or industry work.

Skills

  • Strong analytical and mathematical reasoning ability with a focus on measurable performance gains.
  • Collaborative mindset, with ability to work across research, engineering, and product teams.
  • Pragmatic problem-solver who values efficiency, reproducibility, and maintainable code over theoretical exploration.
  • Curiosity-driven attitude that keeps pace with emerging model compression and inference technologies.

What You’ll Do

  • Take ownership of the end-to-end optimization lifecycle, from profiling bottlenecks to delivering production-optimized LLMs.
  • Develop custom inference pipelines capable of high throughput and low latency under real-world traffic.
  • Build and maintain internal libraries, wrappers, and benchmarking suites for continuous performance evaluation.

What You Will Bring

  • Hands-on experience in building and optimizing machine learning or agentic systems at scale.
  • A builder’s mindset — bias toward action, comfort with experimentation, and enthusiasm for solving complex, open-ended challenges.
  • Startup DNA: bias to action, comfort with ambiguity, love of fast iteration, and a flexible, growth-oriented mindset.

Why Join Us

  • Shape a first-of-its-kind AI + clean energy platform.
  • Work with a small, mission-driven team obsessed with impact.
  • An aggressive growth path.
  • A chance to leave your mark at the intersection of AI and sustainability.