JOB DESCRIPTION
Must have:
- Strong programming skills in languages such as Python and Java
- Hands-on experience with at least one cloud platform (GCP preferred)
- Experience working with Docker
- Environment management (e.g. venv, pip, poetry)
- Experience with orchestrators such as Vertex AI Pipelines, Airflow, etc.
- Understanding of the full end-to-end ML lifecycle
- Data engineering and feature engineering techniques
- Experience with ML modelling and evaluation metrics
- Experience with TensorFlow, PyTorch, or another ML framework
- Experience with model monitoring
- Advanced SQL knowledge
- Awareness of streaming concepts such as windowing, late arrival, and triggers
Good to have:
- Hyperparameter tuning experience
- Proficiency in Apache Spark, Apache Beam, or Apache Flink
- Hands-on experience with distributed computing
- Working experience in data architecture design
- Awareness of storage and compute options, and when to choose each
- Good understanding of cluster optimisation and pipeline optimisation strategies
- Exposure to GCP tools for developing end-to-end data pipelines for various scenarios (including ingesting data from traditional databases as well as integrating API-based data sources)
- Business mindset to understand the data and how it will be used for BI and analytics purposes
- Working experience with CI/CD pipelines, deployment methodologies, and Infrastructure as Code (e.g. Terraform)
- Hands-on experience with Kubernetes
- Experience with a vector database such as Qdrant
- LLM experience (embedding generation, embedding indexing, RAG, agents, etc.)
Machine Learning Engineer • India