Role Overview :
We are seeking a highly skilled LLM Engineer with strong expertise in building, deploying, and optimizing Large Language Model (LLM)-driven applications.
The ideal candidate will have a solid software engineering background, proven experience in AI / ML systems, and hands-on exposure to modern cloud-native, containerized, and vector database technologies.
This role demands a professional who can integrate LLM capabilities into scalable backend systems while also collaborating on frontend integration.
Responsibilities :
LLM Development & Integration :
- Design, fine-tune, and deploy LLM-based solutions using platforms and APIs such as AWS Bedrock, Gemini, Hugging Face, or OpenAI.
- Build custom pipelines for prompt engineering, retrieval-augmented generation (RAG), and model orchestration.
- Optimize model inference performance and reduce latency in real-world applications.
Backend Engineering :
- Develop and maintain backend services using Python (FastAPI, Flask, Django) with a strong focus on scalability, modularity, and performance.
- Implement APIs and microservices to integrate AI-driven features with enterprise-grade applications.
- Work with EKS, Docker, and Kubernetes for containerized deployments.
Vector Database & Retrieval Systems :
- Build and optimize semantic search systems using vector databases such as Pinecone, FAISS, or Weaviate.
- Design and implement embedding pipelines to support RAG use cases.
- Ensure efficient storage, indexing, and retrieval of unstructured data for LLM applications.
Cloud & Infrastructure :
- Leverage AWS (Bedrock, Lambda, S3, SageMaker) and GCP AI / ML services for scalable model deployment.
- Implement cloud-native best practices for security, cost optimization, and monitoring.
- Automate deployment pipelines (CI / CD) for ML-powered applications.
Frontend Integration :
- Collaborate with frontend engineers to integrate AI-powered features into applications using React, Vue.js, or Lovable.dev.
- Ensure seamless user experiences for AI-driven interfaces, including conversational UIs and other intelligent features.
Collaboration :
- Partner with data scientists, ML engineers, and product managers to design end-to-end LLM-powered solutions.
- Contribute to system architecture, scalability discussions, and performance reviews.
- Stay updated with the latest advancements in LLMs, vector search, and AI.
Skills & Qualifications :
- Software Engineering Experience : 6+ years in backend or full-stack development.
- LLM Expertise : Proven hands-on work with AWS Bedrock, Gemini, Hugging Face, or similar LLM ecosystems.
- Programming Skills : Strong proficiency in Python for backend development, API design, and AI integration.
- Frameworks & Tools : Experience with FastAPI, Flask, and containerized deployments (EKS, Docker, Kubernetes).
- Vector Databases : Practical experience with Pinecone, FAISS, Weaviate, or similar technologies.
- Cloud Proficiency : Hands-on exposure to AWS and GCP services for deploying and scaling LLM applications.
- Frontend Knowledge : Working experience with React, Vue.js, or Lovable.dev for AI feature integration.
- System Design : Ability to design scalable, distributed, and fault-tolerant AI-driven architectures.
- Problem-Solving : Strong debugging, optimization, and performance tuning skills.
(ref : hirist.tech)