We are CirrusLabs. Our vision is to become the world's most sought-after niche digital transformation company that helps customers realize value through innovation. Our mission is to co-create success with our customers, partners and community. Our goal is to enable employees to dream, grow and make things happen. We are committed to excellence. We are a dependable partner organization that delivers on commitments. We strive to maintain integrity with our employees and customers. Every action we take is driven by value. Our well-knit teams and employees are the core of who we are. You are the core of a values-driven organization.
You have an entrepreneurial spirit. You enjoy working as part of well-knit teams. You value the team over the individual. You welcome diversity at work and within the greater community. You aren't afraid to take risks. You appreciate a growth path, mapped out with your leadership team, that charts how you can grow inside and outside the organization. You thrive on company-sponsored continuing education programs that strengthen your skills and help you become a thought leader ahead of the industry curve.
You are excited about creating change because your skills can help the greater good of every customer, industry and community. We are hiring a talented Data Scientist & Agentic AI Developer to join our team. If you're excited to be part of a winning team, CirrusLabs (http://www.cirruslabs.io) is a great place to grow your career.
Experience - 5-8 years
Location - Bengaluru
Work Timings - 2pm - 11pm IST
Tech Stack: Python, LangChain, LangSmith, Phoenix, TensorFlow, PyTorch, OpenAI API, Anthropic Claude, Azure OpenAI, AWS/Azure/GCP. We value innovation, collaboration, and a commitment to ethical AI development.
Position Overview
We are seeking an experienced Data Scientist and Agentic AI Developer to design, develop, and evaluate intelligent AI agents that deliver high-quality, reliable, and compliant solutions. This role requires a blend of data science expertise, AI/ML engineering capabilities, and hands-on experience building agentic systems.
Experience Required
5-6 years of professional experience in data science, machine learning, and AI development
Key Responsibilities
AI Agent Development & Deployment
- Design and develop agentic AI systems powered by Large Language Models (LLMs) with tool-calling capabilities
- Build and optimize multi-step reasoning workflows and agent orchestration frameworks
- Implement retrieval-augmented generation (RAG) pipelines for knowledge-intensive applications
- Integrate external tools, APIs, and databases into agent workflows
- Deploy and monitor production-grade AI agents at scale
Evaluation & Quality Assurance
- Develop comprehensive evaluation frameworks using LLM-as-a-judge methodologies
- Implement automated scoring systems for output quality metrics (correctness, helpfulness, coherence, relevance)
- Design and execute robustness testing, including adversarial attack scenarios
- Monitor and reduce hallucination rates and ensure factual accuracy
- Track performance metrics including latency, throughput, and cost-per-interaction
Data Science & Analytics
- Analyze agent performance data to identify improvement opportunities
- Build custom evaluation pipelines and scoring rubrics
- Conduct A/B testing and statistical analysis for model optimization
- Create dashboards and visualization tools for stakeholder reporting
- Implement RAGAs (Retrieval Augmented Generation Assessment) frameworks
Safety, Ethics & Compliance
- Ensure AI systems meet ethical standards, including bias detection and fairness
- Implement safety guardrails to prevent harmful content generation
- Develop compliance monitoring systems for regulatory frameworks (EU AI Act, GDPR, HIPAA, DPDP)
- Document transparency and explainability measures
- Establish human oversight protocols
Required Skills & Qualifications
Technical Expertise
- Programming: Strong proficiency in Python; experience with AI/ML frameworks (LangChain, LangSmith, Phoenix, or similar)
- LLM Expertise: Hands-on experience with GPT, Claude, or other frontier models; prompt engineering and fine-tuning
- Machine Learning: Deep understanding of NLP, deep learning architectures, and model evaluation
- Tools & Platforms: Experience with MLOps tools, vector databases, and observability platforms
- Data Engineering: Proficiency in SQL, data pipelines, and ETL processes
Domain Knowledge
- Understanding of agentic AI architectures and autonomous systems
- Knowledge of RAG systems and information retrieval techniques
- Familiarity with LLM evaluation methodologies and benchmarks
- Experience with conversational AI and dialogue systems
- Understanding of AI safety, alignment, and interpretability
Evaluation & Metrics
- Experience designing evaluation rubrics and scoring systems
- Proficiency with automated evaluation frameworks (RAGAs, custom evaluators)
- Understanding of quality metrics: coherence, fluency, factual accuracy, hallucination detection
- Knowledge of performance metrics: latency optimization, token usage, throughput analysis
- Experience with user experience metrics (CSAT, NPS, turn count analysis)
Soft Skills
- Strong analytical and problem-solving abilities
- Excellent communication skills for cross-functional collaboration
- Ability to balance innovation with practical constraints
- Detail-oriented with a focus on quality and reliability
- Self-driven with the ability to work in fast-paced environments
Preferred Qualifications
- Experience with compliance frameworks and regulatory requirements
- Background in conversational AI or chatbot development
- Knowledge of reinforcement learning from human feedback (RLHF)
- Experience with multi-modal AI systems
Tools & Technologies
- Frameworks: LangChain, LangSmith, Phoenix, TensorFlow, PyTorch
- LLM Platforms: OpenAI API, Anthropic Claude, Azure OpenAI
- Databases: Vector databases (Pinecone, Weaviate, Chroma), PostgreSQL, MongoDB
- Monitoring: LangSmith, Phoenix, custom observability tools
- Cloud: AWS/Azure/GCP experience preferred
- Version Control: Git, CI/CD pipelines
What You'll Deliver
- Production-ready agentic AI systems with measurable quality improvements
- Comprehensive evaluation frameworks with automated scoring
- Performance dashboards and reporting systems
- Documentation for technical specifications and compliance standards
- Continuous improvement strategies based on data-driven insights