LLM Performance Engineer

Sustainability Economics.ai, Bengaluru, India
10 days ago
Job description

Location: Bengaluru, Karnataka

About the Company:

Sustainability Economics.ai is a global organization pioneering the convergence of clean energy and AI, enabling profitable energy transitions while powering end-to-end AI infrastructure. By integrating AI-driven cloud solutions with sustainable energy, we create scalable, intelligent ecosystems that drive efficiency, innovation, and long-term impact across industries. Guided by exceptional leaders and visionaries with decades of expertise in finance, policy, technology, and innovation, we are committed to fulfilling this vision over the long term through technical innovation, client services, expertise, and capability expansion.

Role Summary:

We are seeking a highly skilled and innovative Inference Optimization Engineer (LLM and Runtime) to design, develop, and optimize cutting-edge AI systems that power intelligent, scalable, and agent-driven workflows. This role blends frontier generative AI research with robust engineering and requires expertise in machine learning, deep learning, and large language models (LLMs), along with awareness of current industry trends. The ideal candidate will collaborate with cross-functional teams to build production-ready AI solutions that address real-world business challenges while keeping our platforms at the forefront of AI innovation.

Key Tasks and Accountability:

  • Optimize and customize large-scale generative models (LLMs) for efficient inference and serving.
  • Apply and evaluate advanced model optimization techniques such as quantization, pruning, distillation, tensor parallelism, caching strategies, etc., to enhance model efficiency, throughput, and inference performance.
  • Implement custom fine-tuning pipelines using parameter-efficient methods (LoRA, QLoRA, adapters, etc.) to achieve task-specific goals while minimizing compute overhead.
  • Optimize runtime performance of inference stacks using frameworks like vLLM, TensorRT-LLM, DeepSpeed-Inference, and Hugging Face Accelerate.
  • Design and implement scalable model-serving architectures on GPU clusters and cloud infrastructure (AWS, GCP, or Azure).
  • Work closely with platform and infrastructure teams to reduce latency, memory footprint, and cost-per-token during production inference.
  • Evaluate hardware–software co-optimization strategies across GPUs (NVIDIA A100/H100), TPUs, or custom accelerators.
  • Monitor and profile performance using tools such as Nsight, PyTorch Profiler, and Triton Metrics to drive continuous improvement.

Key Requirements:

Education & Experience

  • Ph.D. in Computer Science or a related field, with a specialization in Deep Learning, Generative AI, or Artificial Intelligence and Machine Learning (AI/ML).
  • 2–3 years of hands-on experience in large language model (LLM) or deep learning optimization, gained through academic or industry work.

Skills

  • Strong analytical and mathematical reasoning ability with a focus on measurable performance gains.
  • Collaborative mindset, with ability to work across research, engineering, and product teams.
  • Pragmatic problem-solver who values efficiency, reproducibility, and maintainable code over theoretical exploration.
  • Curiosity-driven attitude: keeps up with emerging model compression and inference technologies.

What You’ll Do

  • Take ownership of end-to-end optimization lifecycle — from profiling bottlenecks to delivering production-optimized LLMs.
  • Develop custom inference pipelines capable of high throughput and low latency under real-world traffic.
  • Build and maintain internal libraries, wrappers, and benchmarking suites for continuous performance evaluation.

What You Will Bring

  • Hands-on experience in building and optimizing machine learning or agentic systems at scale.
  • A builder’s mindset — bias toward action, comfort with experimentation, and enthusiasm for solving complex, open-ended challenges.
  • Startup DNA: bias to action, comfort with ambiguity, love of fast iteration, and a flexible growth mindset.

Why Join Us

  • Shape a first-of-its-kind AI + clean energy platform.
  • Work with a small, mission-driven team obsessed with impact.
  • An aggressive growth path.
  • A chance to leave your mark at the intersection of AI and sustainability.