About the Role:
We are looking for a highly skilled AI Backend Developer to design, build, and scale backend systems that power cutting-edge Generative AI applications. You'll play a pivotal role in integrating Large Language Models (LLMs) into intelligent, agentic workflows and in building robust, production-grade infrastructure for AI-powered services.
This is an exciting opportunity to work with state-of-the-art AI models (such as LLaMA, Falcon, Mistral, GPT, and Claude), experiment with LLM orchestration frameworks, and contribute to real-world AI product development. You'll be part of a cross-functional team that includes ML engineers, researchers, DevOps, and frontend developers.
Key Responsibilities:
- Design and develop scalable, high-performance backend services and APIs for AI-driven applications.
- Implement microservices and event-driven architectures to support agentic workflows and LLM interactions.
- Ensure low-latency and high-availability systems capable of handling real-time AI inference and orchestration.
- Integrate LLMs into backend systems using APIs (e.g., OpenAI, HuggingFace) or self-hosted models (e.g., LLaMA, Mistral, Falcon).
- Optimize prompt workflows, manage context windows, and handle token limitations.
- Work on caching strategies, input/output preprocessing, and inference acceleration for efficient LLM use.
- Build intelligent agents using frameworks like LangChain, Haystack, AutoGen, LlamaIndex, or custom-built orchestration.
- Implement multi-agent coordination, tool usage, memory systems, and agent autonomy.
- Connect agents to external tools, APIs, databases, and user interfaces to enable decision-making and task automation.
- Serve models using tools such as FastAPI, Triton Inference Server, Ray Serve, or TorchServe.
- Work with vector databases (Pinecone, FAISS, Weaviate, Qdrant) to implement RAG (retrieval-augmented generation) pipelines.
- Collaborate with ML engineers to productionize fine-tuned or custom models.
- Write unit, integration, and load tests to validate the robustness of AI services.
- Monitor system health, latency, and usage metrics; set up alerts and dashboards.
- Ensure code quality, versioning, and CI/CD pipelines are maintained for backend components.
- Work closely with product managers, ML engineers, and design teams to scope and deliver features.
- Contribute to architectural decisions and backend standards for the AI engineering team.
- Maintain detailed documentation for APIs, internal tooling, and service architecture.
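To make the RAG responsibility above concrete, here is a minimal, self-contained sketch of the retrieval step. The in-memory `VectorStore` is a toy stand-in for a real vector database such as Pinecone, FAISS, Weaviate, or Qdrant, and `embed` is a hypothetical bag-of-words placeholder for a real embedding model — both names are illustrative, not part of any library.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words token counts. A production pipeline
    # would call a real embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.docs: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.docs.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank stored documents by similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_prompt(query: str, store: VectorStore) -> str:
    # The RAG step: retrieve relevant context, then prepend it to the
    # prompt that would be sent to the LLM.
    context = "\n".join(store.search(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In production the store would be backed by one of the databases listed above and the retrieved context passed to a hosted or self-hosted model; the retrieve-then-prompt shape stays the same.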
Required Experience & Skills:
- 4 to 8 years of backend development experience, preferably with Python, but open to Node.js, Go, or Java-based stacks.
- Hands-on experience integrating and deploying LLMs in production (e.g., GPT-4, Claude, LLaMA, Mistral).
- Proficiency with agentic AI frameworks (e.g., LangChain, Haystack, AutoGen, LlamaIndex).
- Experience fine-tuning open-source LLMs, including dataset preparation, training, and evaluation.
- Strong understanding of:
  - Model serving and API design
  - Vector databases and RAG patterns
  - Prompt engineering fundamentals
- Experience building scalable microservices and RESTful or gRPC APIs.
Nice-to-Have Skills:
- Experience with cloud platforms (AWS, GCP, Azure), including services like S3, Lambda, ECS, or Vertex AI.
- Familiarity with infrastructure-as-code (Terraform, Pulumi) or container orchestration (Docker, Kubernetes).
- Previous experience contributing to AI/LLM-powered product development.
- Exposure to MLOps tools and practices: model versioning, logging, monitoring, and retraining triggers.
(ref: hirist.tech)
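The context-window and token-limit handling mentioned among the responsibilities can be sketched as follows. This is a minimal illustration, assuming a crude whitespace-token count; a real service would use the target model's actual tokenizer, and both function names here are hypothetical.

```python
def count_tokens(text: str) -> int:
    # Crude approximation: whitespace-delimited tokens. Production code
    # would use the model's real tokenizer to count tokens exactly.
    return len(text.split())

def fit_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages that fit within the token budget,
    dropping the oldest first - a common context-window strategy."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order
```

Dropping oldest-first is only one policy; alternatives include summarizing evicted turns or pinning a system prompt outside the budget.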