About the Role
You will help scale our Voice AI Agent (telephony + STT + LLM + TTS) into an enterprise-grade product. This includes building real-time speech/LLM pipelines, barge-in and human takeover flows, data loops for model improvement, and natural interruption handling.
What You’ll Do
Implement and tune real-time STT/ASR and TTS pipelines.
Orchestrate LLM-driven conversations: prompting, function calling, dialog state, and context handling.
Build barge-in logic and reliable human handoff signals.
Create datasets for training: recordings, redaction, labeling, and metadata; enable search and retrieval.
Optimize latency, accuracy, turn-taking, and cost.
Add guardrails, fallbacks, and basic A/B experimentation.
Required Skills
Programming: Python or Node.js; modular code, async handling.
APIs & Events: REST, webhooks, retries, idempotency.
Speech/LLM: STT/TTS SDKs or streaming APIs; prompting and function-calling basics.
Data: Basic SQL; schema design for transcripts; familiarity with embeddings and vector search concepts.
Cloud (AWS preferred): S3, Lambda or equivalent services, CloudWatch.
Git & CI: Branching, PRs, basic pipelines.
Nice to Have
Telephony/WebRTC/Twilio/Kixie experience.
Vector DBs (pgvector/FAISS) or Redis for caching/queues.
Observability tools (OpenTelemetry/Grafana).
Security fundamentals (encryption, IAM best practices).
Tech Stack
OpenAI Realtime, Deepgram, Twilio/Kixie, Node.js/Python, Postgres, Redis, Docker, AWS (S3, Lambda, API Gateway, CloudWatch), WebSockets/SSE.
Compensation & Benefits
Competitive salary and performance bonus
Remote/hybrid options within India
Learning budget and mentorship
Education
B.E./B.Tech/M.E./M.Tech/MCA or equivalent practical experience.
Notice Period
Immediate to 30 days preferred.
Job Type: Contractual/Temporary
Contract length: 6 months
Work Location: Remote
Timings: 3 PM to 12 AM (midnight) IST
AI Engineer • Ahmedabad, Gujarat, India