We are building next-generation AI Voice Agents for recruiters. You’ll design and implement real-time, low-latency conversational systems that combine LLMs, STT, TTS, RAG, and streaming pipelines. The backend runs in Python, with a React / Node.js frontend. If you have built real-time voice systems that can handle human interruptions and dynamic context, this role fits you.
Key Responsibilities
- Build and optimize real-time AI Voice Agents in Python.
- Integrate TTS (e.g., ElevenLabs) and STT (e.g., Deepgram) systems for natural, fast voice interaction.
- Implement LLM integration with a focus on latency reduction and interruption handling.
- Develop caching layers for prompts and audio streams to reduce cost and delay.
- Design and maintain SIP, WebRTC, and WebSocket pipelines for call handling.
- Handle automatic answering machine detection (AMD) and dynamic call routing.
- Work with React / Node.js frontend teams for end-to-end real-time communication.
Required Skills
- Python: Strong backend development experience.
- Real-Time AI Voice Systems: Proven working experience is mandatory.
- LLM Integration: Deep understanding of latency, interruption handling, and RAG pipelines.
- Networking Protocols: Hands-on experience with SIP, WebRTC, and WebSockets.
- Frontend Collaboration: Working knowledge of React and Node.js.
Preferred Skills
- Experience with ElevenLabs, Deepgram, or other leading TTS / STT APIs.
- Knowledge of audio stream caching, buffering, and GPU optimization.
- Familiarity with FreeSWITCH, Asterisk, or similar VoIP platforms.
- Background in automatic answering machine detection (AMD) and call classification.
Why Join Us
You’ll work on cutting-edge voice-first AI infrastructure used by staffing companies worldwide. Early-stage team, fast decisions, deep tech stack, and clear impact.
Salary is in INR