AI / ML LLM developer

Codem Inc., Ananthapur, Andhra Pradesh, India

9 hours ago

Job description

About Codem

Codem is a technology services company specializing in eCommerce, SAP, custom applications, cloud infrastructure, DevOps, and systems integration. We work with global enterprises to build and modernize scalable platforms. We are now seeking a skilled AI / ML LLM Developer.

ONLY APPLY IF YOU CAN JOIN IMMEDIATELY. You must also have graduated from college in 2023 or earlier.

About the role

We’re looking for an AI Developer to build and ship LLM-powered features (chat / search / agents, RAG pipelines, automations). You’ll work closely with product and data teams to turn messy real-world data into reliable, low-latency experiences.

Responsibilities

RAG pipelines: Ingest, chunk, embed, and index documents; design retrieval strategies (hybrid BM25 + embedding search, metadata filtering, reranking).
App logic: Build APIs / serving layers around LLMs (prompt templates, tool / function calling, agents, streaming).
Vector DB ops: Create / maintain indexes, upserts, namespace / tenant design, TTLs, migrations, and recall / latency tuning.
Evaluation & quality: Set up offline / online evals (accuracy, grounding, toxicity, hallucination rate), A/B tests, and feedback loops.
Safety & reliability: Implement guardrails, prompt-injection defenses, PII redaction, rate limiting, retries, and fallbacks.
Cost & perf: Token budgeting, caching (prompt / result / embedding), batching, and observability for latency & spend.
Data pipelines: Build ETL for PDFs / HTML / docs, enrichment, and scheduled syncs from SaaS / data lakes.
DevOps / MLOps: CI/CD, environment config, secrets management, dataset / version control, and monitoring.

Must-have qualifications

3+ years of software experience (ideally Python) delivering production code.

Hands-on with LLM APIs (OpenAI / Azure OpenAI, Anthropic, or local LLMs like Llama), including prompting, tools / function calling, and streaming.

Practical RAG experience using vector databases (e.g., Pinecone, Weaviate, FAISS, pgvector) and embedding models.

Experience with LangChain or LlamaIndex (or equivalent in-house orchestration).

Strong with web APIs (FastAPI / Flask / Node), Git, testing, and debugging.

Solid understanding of security & privacy basics (PII handling, secrets, auth).

Nice to have

Reranking (Cohere / TEI), hybrid search (BM25 + embeddings), or Elasticsearch / OpenSearch.

Eval frameworks (Ragas, TruLens) and telemetry (Langfuse, OpenTelemetry).

Workflow / orchestration (Celery / Temporal / Airflow) and message queues (SQS / Kafka).

Cloud: AWS (Bedrock, Lambda), GCP (Vertex AI), Azure (AOAI), Docker; basic Terraform.

Frontend collaboration (React) for chat UIs, streaming tokens, and citations.

Fine-tuning / LoRA, prompt caching, distillation, or model hosting experience.

Tools you might use here

Python (FastAPI), TypeScript / Node (optional), LangChain / LlamaIndex

Vector DBs: Pinecone, Weaviate, pgvector / FAISS

LLMs / Embeddings: GPT-4 / 4o / mini, Claude, Llama, instructor / sentence-transformers

Infra: AWS / GCP / Azure, Docker, GitHub Actions, Terraform (basic)

Obs & Eval: Langfuse, Ragas / TruLens, Prometheus / Grafana

Success in 3–6 months

Ship a production RAG feature with measurable uplift in answer quality.

Reduce latency / cost via caching / batching and better retrieval configs.

Establish an evaluation and feedback loop with clear QA dashboards and guardrails.
