Job Description
This is a remote position.
- Conduct experiments and report results reliably, with guidance
- Experiments include (but are not limited to):
- Benchmark open-source LLMs (gpt-oss-120b / 20b, Llama models, etc.) against proprietary LLMs (e.g., OpenAI's GPT-4o / GPT-4o-mini, Gemini, etc.)
- Experiment with LLMs to build scalable, search-efficient knowledge bases (KBs), generate synthetic QA pairs from digital documents, and optimize prompts to build scalable chatbots
- Evaluate language translation models / services for Indian languages (e.g., Bhashini, Sarvam, Google Translate, etc.)
- Assess speech models for Indian languages on ASR (STT) and TTS tasks (e.g., Amazon Polly, AI4Bharat's Conformers, Sarvam, etc.)
- Improve the existing RAG-QA pipeline by identifying performance gaps and benchmarking different retrieval (embedding) and chunking techniques
- Gather, clean, analyze, and process text and speech data for building the knowledge base (KB) for conversational chatbots
- Learn to derive insights and next steps from experiments
- Monitor incoming data regularly and perform quality checks
- Collaborate with cross-functional teams to complete tasks on time
- Proactively seek help and required information from peers
- Communicate research findings in a clear and concise manner
- Support the development of clean, well-documented codebases and work consistently to high standards
- Communicate and present results effectively to peers
- Stay updated with recent advancements in GenAI / LLMs, ASR (STT and TTS), RAG-QA, LLM evaluation, etc., that can be applied to our product
- Develop expertise with typical ML tooling such as Pandas, ML frameworks (PyTorch, Scikit-Learn), Excel (pivot tables), visualization libraries, experiment monitoring (Weights & Biases), and GitHub
- Learn to work efficiently with tooling: Unix, VS Code, Google office suite, Calendar, Slack
- Ability to work in a fast-paced startup environment
- Eagerness to learn and apply the latest research and work happening in the domain to the solution
Requirements
- Good at AI / ML fundamentals
- Good at LLM fundamentals like RAG-QA, prompt engineering, evaluations, vector stores, retrieval and chunking, fine-tuning, synthetic data generation, model deployment, etc.
- Experience with GenAI tools (including but not limited to) LangChain, LlamaIndex, LlamaParse, Langfuse, FAISS, Chroma, vLLM, OpenAI toolkits and SDK, etc.
- Familiarity with OpenAI models and tools, and open-source models like gpt-oss, Llama, Gemma, Mistral, Wav2Vec2, Bhashini's or AI4Bharat's language translation models, etc.
- Familiarity with Docker, AWS, GCP is a plus
- Strong Python coding and debugging skills; hands-on experience with some of the Data Science toolkits like Pandas, NumPy, Matplotlib / Seaborn, etc., and preferably at least one deep learning framework among PyTorch (preferred), Keras, and TensorFlow
- Completed coursework in Probability, Linear Algebra, and Calculus, preferably with some exposure to AI / Machine Learning
- Demonstrated experience in the field via an internship or project is highly preferred; do provide links to some of your open-source projects
- Prior exposure to Linux / Unix is expected before joining the internship
M.Tech. / M.E. / M.S. / M.Sc. or equivalent in Computer Science, Electrical Engineering, Statistics, Applied Mathematics, Physics, Economics, or a relevant quantitative field