About the Role
You will help scale our Voice AI Agent (telephony + STT + LLM + TTS) into an enterprise-grade product. This includes building real-time speech / LLM pipelines, barge-in and human takeover flows, data loops for model improvement, and natural interruption handling.
What You’ll Do
- Implement and tune real-time STT / ASR and TTS pipelines.
- Orchestrate LLM-driven conversations: prompting, function calling, dialog state, and context handling.
- Build barge-in logic and reliable human handoff signals.
- Create datasets for training: recordings, redaction, labeling, metadata; enable search and retrieval.
- Optimize latency, accuracy, turn-taking, and cost.
- Add guardrails, fallbacks, and basic A/B experimentation.
Required Skills
- Programming: Python or Node.js; modular code, async handling.
- APIs & Events: REST, webhooks, retries, idempotency.
- Speech / LLM: Using STT / TTS SDKs or streaming APIs; prompt and function-calling basics.
- Data: Basic SQL; schema design for transcripts; understanding of embeddings / vector search concepts.
- Cloud (AWS preferred): S3, Lambda or equivalent services, CloudWatch.
- Git & CI: Branching, PRs, basic pipelines.
Nice to Have
- Telephony / WebRTC / Twilio / Kixie experience.
- Vector DBs (pgvector / FAISS) or Redis for caching / queues.
- Observability tools (OpenTelemetry / Grafana).
- Security fundamentals (encryption, IAM best practices).
Tech Stack
OpenAI Realtime, Deepgram, Twilio / Kixie, Node.js / Python, Postgres, Redis, Docker, AWS (S3, Lambda, API Gateway, CloudWatch), WebSockets / SSE.
Compensation & Benefits
- Competitive salary and performance bonus
- Remote / hybrid options within India
- Learning budget and mentorship
Education
B.E. / B.Tech / M.E. / M.Tech / MCA or equivalent practical experience.
Notice Period
Immediate to 30 days preferred.
Job Type: Contractual / Temporary
Contract length: 6 months
Work Location: Remote
Timings: 3pm to 12am IST