AI Engineer & Researcher

BaseThesis Labs, Bengaluru, India
Job description

about

we believe the next decade of computing isn't about better agents or faster models.

it's about machines that remember you, respond naturally, and feel like they understand context across time and modality.

right now, every ai assistant forgets you exist when you close the session. every voice agent sounds scripted and breaks on interrupts. every "breakthrough" is just better prompts on the same broken and painfully complex architecture.

we're building the infrastructure layer that makes ai interactions actually feel natural - adaptive memory systems, real-time voice orchestration, self-evolving agents that learn from experience by rewriting their own code, and multimodal context that doesn't fall apart.

think : the default stack that every multimodal agent will need, but doesn't exist yet.

our team is :

a 25-year-old ai-first builder (that's me) from IIT-M and BITS Pilani who's been building self-evolving, multi-modal agents for enterprises

Ashok, who cofounded DriveU and scaled it to ₹125 crore ARR and profitability

Raveen, cofounder of Myntra, Baby Oye and Multiply Ventures

we also run a private community of 140+ curated ai builders / researchers from labs like anthropic, openai, cartesia, ultravox, microsoft research, as well as veterans from aws & am

our research thesis

we're betting on three-layer adaptive memory with reinforcement learning :

layer 1 (working memory) : recent context, lasts minutes, conversation-specific

layer 2 (episodic memory) : important past interactions, lasts weeks, user-specific

layer 3 (semantic memory) : learned patterns / preferences, lasts forever, continuously refined
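a minimal sketch of what those three layers could look like as plain data structures (names like `MemoryStore` and the 10-minute ttl are illustrative assumptions, not an actual api) :

```python
import time
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    content: str
    created_at: float = field(default_factory=time.time)
    score: float = 0.0  # importance, nudged up/down by feedback signals

@dataclass
class MemoryStore:
    # layer 1: recent turns, conversation-scoped, evicted after minutes
    working: list[MemoryItem] = field(default_factory=list)
    # layer 2: important past interactions, user-scoped, kept for weeks
    episodic: list[MemoryItem] = field(default_factory=list)
    # layer 3: distilled patterns / preferences, kept indefinitely
    semantic: dict[str, str] = field(default_factory=dict)

    def expire_working(self, ttl_seconds: float = 600.0) -> None:
        """drop working-memory items older than the ttl."""
        now = time.time()
        self.working = [m for m in self.working if now - m.created_at < ttl_seconds]
```

the interesting design question is what moves between layers and when - which is exactly what the rl component below decides.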

the rl component : the agent gets feedback signals (explicit corrections, conversation success, user retention) and uses policy gradients to learn :

what's worth moving from working to episodic memory?

what patterns in episodic memory should become semantic?

when to surface which memory layer during conversations?

this isn't novel ml research - we're applying existing rl techniques (you know them) to a new problem : continuous memory improvement for conversational agents.
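as a rough illustration of that loop, here's a REINFORCE-style sketch : a bernoulli policy scores each working-memory item (the feature vector is a hypothetical embedding of the item) and decides promote vs. drop, then shifts its weights toward decisions that preceded positive feedback. this is a toy, not our training code.

```python
import math
import random

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class PromotionPolicy:
    """decides whether a working-memory item gets promoted to episodic memory."""

    def __init__(self, n_features: int, lr: float = 0.1):
        self.w = [0.0] * n_features  # linear policy weights
        self.lr = lr

    def decide(self, features: list[float]) -> tuple[int, float]:
        # probability of promotion under the current policy
        p = sigmoid(sum(w * f for w, f in zip(self.w, features)))
        action = 1 if random.random() < p else 0  # 1 = promote, 0 = drop
        return action, p

    def update(self, features: list[float], action: int, reward: float) -> None:
        # policy gradient for a bernoulli policy: grad log pi = (action - p) * features
        p = sigmoid(sum(w * f for w, f in zip(self.w, features)))
        for i, f in enumerate(features):
            self.w[i] += self.lr * reward * (action - p) * f
```

reward here would come from the feedback signals above (corrections, conversation success, retention); in practice you'd batch updates and use something like ppo rather than raw REINFORCE.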

we're also building real-time voice orchestration with :

predictive vad (anticipate interrupts before they happen)

multi-agent coordination (handle 3+ speakers / agents smoothly)

latency-adaptive quality (degrade gracefully under network constraints)
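the last bullet can be as simple as a ladder of quality tiers keyed off measured round-trip latency; the thresholds, bitrates, and sample rates below are placeholder assumptions, not tuned values :

```python
def pick_quality_tier(rtt_ms: float) -> dict:
    """map measured round-trip latency to audio pipeline settings.

    thresholds and settings are illustrative placeholders.
    """
    tiers = [
        # (max rtt in ms, pipeline settings)
        (150, {"codec": "opus", "bitrate_kbps": 48, "sample_rate": 24000}),
        (300, {"codec": "opus", "bitrate_kbps": 24, "sample_rate": 16000}),
        (float("inf"), {"codec": "opus", "bitrate_kbps": 12, "sample_rate": 8000}),
    ]
    for threshold, settings in tiers:
        if rtt_ms < threshold:
            return settings
    return tiers[-1][1]  # fallback (unreachable given the inf threshold)
```

in a real pipeline you'd smooth the rtt estimate and add hysteresis so quality doesn't flap between tiers on every measurement.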

what you'll actually do (with me)

audit ultravox, sesame, elevenlabs, openai realtime api with me

design and prototype the adaptive memory system (three layers + rl)

ship first voice agent demo with working memory

build multi-agent orchestration (handling 2+ speakers)

implement predictive vad (not just silence detection)

create test harnesses for latency, context retention, ux flow

package everything as usable sdk

write documentation that doesn't suck or confuse

iterate based on what breaks in production

either scale the infra or spin out a product

mentor teams at our hackathons (we're running quarterly builder events)

collaborate with fellow lab researchers to turn research papers, frameworks, and technical ideas into tangible demos or proof-of-concepts.

contribute to reusable internal libraries, apis, and infrastructure that power all basethesis projects.

no off-the-shelf solutions (again) : evaluating sesame, ultravox, elevenlabs, openai realtime api, and building what they lack.

work across voice, vision, and text-based interfaces - integrating models into cohesive user experiences.

document learnings as “lab notes” or public prototypes. clarity and storytelling are part of the craft.

who this is for

required qualifications

you've shipped something that had real users (product, side project, open source tool)

you understand system design at an architectural level, not just api integration

you can write production-quality code fast (we're judging on speed + quality, not just one)

you think from first principles : "why does rag fail for real-time agents?" not "everyone uses rag so we should too"

you're comfortable navigating ambiguity, especially early on

you can explain complex technical decisions simply (this matters for docs + future hiring)

preferred qualifications (not required, just helpful)

experience with websockets, webrtc, or real-time audio / video pipelines

familiarity with reinforcement learning (grpo, ppo, dqn, policy gradients)

you've read papers on memory systems, hci, or voice ai (or you will after reading this)

you've built ml models beyond just fine-tuning llms

you've debugged production systems under load and lived to tell the stories

you've contributed to open source or have a github people actually look at

what you'll get

you're not maintaining legacy code or optimizing conversion rates by 0.3%.

you're solving genuinely hard problems that don't have stack overflow or chatgpt answers.

space to work on cutting edge research, no pressure to serve legacy customers

you define the entire memory + multimodal stack with me from scratch

market rate salary and founding engineer equity in something that will matter

workspace buzzing with the most hardcore builders & the best research minds in blr

when people reference "how basethesis handles context" in 2027, that's your architecture

direct access to ai builders / researchers at openai, deepmind, anthropic, xAI, amd

a community of hundreds of hardcore builders and top researchers at frontier labs

the downside : for the first couple of months you're researching, tinkering, and coding 10-12 hours a day

working on cutting edge research means we rapidly learn and iterate every day

we might - and probably will - fail multiple times before we finally land on mars

why you might want to do this

you're probably 2-5 years out of college. you've worked at a startup or built side projects with users.

you're frustrated by abstraction-layer / wrapper culture.

you want to build infrastructure that's technically hard and requires deep understanding of the entire stack.

you're okay with :

figuring it out together (we don't have all the answers)

intense first 6 months proving this matters

rapidly learning and iterating with an obsession with the mission

you're excited by :

defining a new technical approach to ai memory

building in public, open sourcing some parts

being in the room when we decide together on what to build next

ownership of hard technical problems

if you're reading this thinking "f*** it, i want to try" - that's the sign to apply.

you're probably more ready than you think. capability over credentials.

let's build
