Description :
We are seeking a high-calibre Applied Scientist to drive innovation in our core information retrieval capabilities.
This critical role demands deep expertise in building high-performance Machine Learning systems that enhance search relevance, retrieval efficiency, and user experience.
You will be responsible for pioneering the implementation of Vector Search, Hybrid Search, and LLM-powered RAG systems at scale.
Key Responsibilities & Strategic Deliverables :
1. Search & Ranking System Development :
- ML Pipeline Ownership : Design, build, and deploy robust, production-ready ML pipelines specifically for large-scale search and ranking applications.
- Advanced Retrieval : Lead the implementation and optimization of vector search, hybrid search, and advanced Learning-to-Rank (LTR) systems to maximize relevance and precision.
- Embedding Management : Drive the entire embedding lifecycle, including embedding generation using state-of-the-art models (e.g., BERT, Sentence Transformers) and managing high-scale embedding indexes using efficient libraries like FAISS, ScaNN, or Annoy.
2. Generative AI & Cloud Deployment :
- LLM Integration (RAG) : Design and implement systems leveraging Large Language Models (LLMs) for advanced Retrieval-Augmented Generation (RAG), enabling more nuanced and conversational search results.
- Cloud Deployment : Deploy and manage scalable, low-latency solutions in a production environment using modern cloud services such as Vertex AI, Google Cloud Run, or Cloud Functions.
3. Evaluation, Testing & Optimization :
- Metrics & Evaluation : Define evaluation criteria and rigorously evaluate models using industry-standard search relevance metrics such as Precision@K, Recall, nDCG, and Mean Average Precision (MAP), alongside related A/B testing frameworks.
- MLOps : Practice robust MLOps principles, including the management of CI/CD pipelines, model versioning, and continuous LLM optimization for cost and performance.
Required Skills & Technical Expertise :
- Programming & Big Data : Strong proficiency in Python, SQL, BigQuery, and PySpark for data manipulation and pipeline construction.
- Cloud & MLOps : Hands-on experience with Google Cloud Platform services, including Vertex AI, Matching Engine, and Dataproc.
- Search Infrastructure : Deep practical experience with enterprise search engines like Elasticsearch/OpenSearch and with managing vector databases/stores.
Core Fundamentals (Mandatory) :
- Strong understanding of Vector Databases, Approximate Nearest Neighbor (ANN) algorithms, and core Search Relevance Metrics.
- Practical knowledge of transformer-based models (BERT, Sentence Transformers) and fine-tuning techniques.
(ref : hirist.tech)