About Zumlo

Zumlo is an always-on well-being companion: one place for immediate help, gentle structure, and progress you can see. We unify mind, body, emotions, and relationships through timely support, a caring community, and personalized guidance that fits real life.

What we stand for:
- Human first
- Simple > complex
- Privacy & trust
- Evidence over hype
- Inclusive by default

The role (why it’s rare)

AI isn’t a bolt-on here; it’s the nervous system of the product. We’re hiring a senior engineer who has already shipped LLM + retrieval to production and wants end-to-end ownership: problem framing, modeling / orchestration, evaluation, privacy / safety, and the Python services that make it real.

What you will own

Product AI (end-to-end)
Build AI across product surfaces: conversational help, guided steps, tailored activities, “what to do next,” summaries / explanations, and safety checks, all grounded with retrieval and citations. Turn fuzzy needs into robust flows: prompt design, tool / function calling, JSON-schema outputs, fallbacks, streaming, controllable tone and safety boundaries.

Retrieval & knowledge
Do RAG right: chunking / segmenters, embeddings, vector DBs (FAISS / Qdrant / Pinecone / Milvus), hybrid semantic search + re-rankers (BGE / ColBERT), dedupe, freshness policies, provenance.

Rapid test & evaluation loop
Make eval routine: golden sets, adversarial suites, shadow evals, canaries, online metrics tied to user outcomes. Capture in-app feedback and close the loop weekly.

Safety, privacy, governance
PHI / PII redaction, prompt-injection defenses, output guardrails, rate limits, audit trails, safe logging. Clear data-handling notes for the team.

Backend & reliability
Own Python services: FastAPI, Postgres (schemas / migrations), Redis, task queues, retries / idempotency, auth / RBAC, feature flags. Observability first: logs / metrics / traces, alerting that matters, and simple SLOs, so systems stay predictable and calm.

Data & experimentation
Trustworthy event tracking, simple SQL cohorting, per-feature cost / latency dashboards, A / B hooks so product / growth can run honest tests.

Exploration & de-risking
Evaluate models / embeddings, inference servers (vLLM / TensorRT-LLM), compression / quantization, token-efficiency.
Prove value with small, cheap spikes before big changes.

Collaboration & leadership
Partner with product, mobile (React Native / TS), and platform. Review PRs, write concise docs / runbooks, mentor juniors, and help hire the next 1–2 great engineers.

Must-have experience
- 7+ years software engineering with 4+ years Python shipping production APIs / services.
- Production LLM + RAG you can discuss end-to-end: retrieval, orchestration, evaluation, and user impact.
- RAG depth: embeddings, vector DBs (FAISS / Qdrant / Pinecone / Milvus), hybrid search, re-rankers, citation strategies.
- Backend foundation: FastAPI / Django / Flask, Postgres / SQL, Redis, queues (Celery / RQ), testing (pytest), CI / CD, containers.
- Eval mindset: offline metrics + online behavior; sample-size sense; knows when to ship, iterate, or kill.
- Security / privacy: least-privilege, secrets, encryption, safe logging; comfortable with sensitive data.

Strong plus
- Well-being / health context; HIPAA-aware practices.
- Azure & Azure DevOps pipelines; GPU inference; streaming responses.
- Telemetry for AI: prompt / version tracking, per-feature cost / latency, drift monitors.
- Worked across US + India user bases / time zones.

How you work (what we value)
- Builder energy: ship → measure → iterate.
- Creative + logical: playful with ideas, strict with evidence.
- Product-curious: start with the user problem and “definition of good.”
- Kind, direct, low-ego: crisp commits / PRs, generous feedback.
- Owner’s mindset: reliable, documented, observable.

Work setup
Remote-first in India, collaborating closely with a small core team. Preference for Ahmedabad for an eventual in-person cadence; open across India for the right fit.