Key Responsibilities
Develop, fine-tune, and integrate LLMs (GPT, LLaMA, Claude, etc.) into enterprise applications.
Implement prompt engineering techniques, embeddings, and Retrieval-Augmented Generation (RAG) pipelines.
Design and build agentic AI applications using orchestration frameworks such as LangGraph.
Build and maintain APIs, microservices, and front-end / back-end integrations for GenAI applications.
Work with vector databases (Pinecone, FAISS, Weaviate, Milvus) to enable semantic search.
Deploy solutions on cloud AI platforms (Azure OpenAI, AWS Bedrock, GCP Vertex AI).
Optimize model performance, latency, and scalability for real-world usage.
Collaborate with data scientists and solution architects to deliver PoCs and production-ready applications.
Ensure AI solutions follow responsible AI, data privacy, and security guidelines.
Required Skills
Strong programming skills in Python (preferred), with experience building APIs and microservices using Flask, FastAPI, or Django.
Hands-on experience with Azure Databricks and Apache Spark.
Knowledge of vector embeddings and vector databases.
Experience with LangChain, LlamaIndex, or similar orchestration frameworks.
Familiarity with Docker, Kubernetes, and MLOps practices for deploying AI models.
Experience working with cloud-based AI services (Azure, AWS, or GCP).
Strong problem-solving and debugging skills.
Experience with PyTorch, TensorFlow, or Hugging Face Transformers is a plus.
Generative AI Engineer • Kottayam, Kerala, India