We are building next-generation AI Voice Agents for recruiters. You’ll design and implement real-time, low-latency conversational systems that combine LLMs, STT, TTS, RAG, and streaming pipelines. The backend runs in Python, with a React / Node.js frontend. If you have built real-time voice systems that can handle human interruptions and dynamic context, this role fits you.
Key Responsibilities
Build and optimize real-time AI Voice Agents in Python.
Integrate TTS (e.g., ElevenLabs) and STT (e.g., Deepgram) systems for natural, fast voice interaction.
Implement LLM integration with a focus on latency reduction and interruption handling.
Develop caching layers for prompts and audio streams to reduce cost and delay.
Design and maintain SIP, WebRTC, and WebSocket pipelines for call handling.
Handle automatic answering machine detection (AMD) and dynamic call routing.
Work with React / Node.js frontend teams for end-to-end real-time communication.
Required Skills
Python: Strong backend development experience.
Real-Time AI Voice Systems: Proven working experience is mandatory.
LLM Integration: Deep understanding of latency, interruption handling, and RAG pipelines.
Networking Protocols: Hands-on experience with SIP, WebRTC, and WebSockets.
Frontend Collaboration: Working knowledge of React and Node.js.
Preferred Skills
Experience with ElevenLabs, Deepgram, or other leading TTS / STT APIs.
Knowledge of audio stream caching, buffering, and GPU optimization.
Familiarity with FreeSWITCH, Asterisk, or similar VoIP platforms.
Background in answering machine detection (AMD) and call classification.
Why Join Us
You’ll work on cutting-edge voice-first AI infrastructure used by staffing companies worldwide. Early-stage team, fast decisions, deep tech stack, and clear impact.
Salary is in INR
AI Agent Engineer • India