No longer accepting applications

Inference Optimization Engineer (LLM and Runtime)

Sustainability Economics.ai, Nadiad, IN

7 hours ago

Job description

Location : Bengaluru, Karnataka

About the Company :

Sustainability Economics.ai is a global organization, pioneering the convergence of clean energy and AI, enabling profitable energy transitions while powering end-to-end AI infrastructure. By integrating AI-driven cloud solutions with sustainable energy, we create scalable, intelligent ecosystems that drive efficiency, innovation, and long-term impact across industries. Guided by exceptional leaders and visionaries with decades of expertise in finance, policy, technology, and innovation, we are committed to making long-term efforts to fulfil this vision through our technical innovation, client services, expertise, and capability expansion.

Role Summary :

We are seeking a highly skilled and innovative Inference Optimization Engineer (LLM and Runtime) to design, develop, and optimize cutting-edge AI systems that power intelligent, scalable, and agent-driven workflows. This role blends the frontier of generative AI research with robust engineering, requiring expertise in machine learning, deep learning, and large language models (LLMs), along with awareness of the latest industry trends. The ideal candidate will collaborate with cross-functional teams to build production-ready AI solutions that address real-world business challenges while keeping our platforms at the forefront of AI innovation.

Key Tasks and Accountability :

Optimize and customize large-scale generative models (LLMs) for efficient inference and serving.
Apply and evaluate advanced model optimization techniques, such as quantization, pruning, distillation, tensor parallelism, and caching strategies, to enhance model efficiency, throughput, and inference performance.
Implement custom fine-tuning pipelines using parameter-efficient methods (LoRA, QLoRA, adapters, etc.) to achieve task-specific goals while minimizing compute overhead.
Optimize runtime performance of inference stacks using frameworks like vLLM, TensorRT-LLM, DeepSpeed-Inference, and Hugging Face Accelerate.
Design and implement scalable model-serving architectures on GPU clusters and cloud infrastructure (AWS, GCP, or Azure).
Work closely with platform and infrastructure teams to reduce latency, memory footprint, and cost-per-token during production inference.
Evaluate hardware–software co-optimization strategies across GPUs (NVIDIA A100 / H100), TPUs, or custom accelerators.
Monitor and profile performance using tools such as Nsight, PyTorch Profiler, and Triton Metrics to drive continuous improvement.

Key Requirements :

Education & Experience

Ph.D. in Computer Science or a related field, with a specialization in Deep Learning, Generative AI, or Artificial Intelligence and Machine Learning (AI / ML).

2–3 years of hands-on experience in large language model (LLM) or deep learning optimization, gained through academic or industry work.

Skills

Strong analytical and mathematical reasoning ability with a focus on measurable performance gains.

Collaborative mindset, with the ability to work across research, engineering, and product teams.

Pragmatic problem-solver who values efficiency, reproducibility, and maintainable code over theoretical exploration.

Curiosity-driven attitude: keeps up with emerging model compression and inference technologies.

What You’ll Do

Take ownership of the end-to-end optimization lifecycle, from profiling bottlenecks to delivering production-optimized LLMs.

Develop custom inference pipelines capable of high throughput and low latency under real-world traffic.

Build and maintain internal libraries, wrappers, and benchmarking suites for continuous performance evaluation.

What you will bring

Hands-on experience building and optimizing machine learning or agentic systems at scale.

A builder’s mindset — bias toward action, comfort with experimentation, and enthusiasm for solving complex, open-ended challenges.

Startup DNA: bias to action, comfort with ambiguity, love of fast iteration, and a flexible growth mindset.

Why Join Us

Shape a first-of-its-kind AI + clean energy platform.

Work with a small, mission-driven team obsessed with impact.

An aggressive growth path.

A chance to leave your mark at the intersection of AI and sustainability.
