Machine Learning Engineer
About Us :
We are a fast-growing company that applies AI and Machine Learning to transform our industry.
We're building intelligent systems that drive significant impact for our customers.
We are looking for a highly skilled and experienced Machine Learning Engineer to join our dynamic team and play a pivotal role in bringing our advanced ML and LLM services to life in production environments.
The Role :
As a Machine Learning Engineer, you will be instrumental in designing, developing, and deploying robust and scalable ML and LLM-powered solutions.
You'll work across the entire ML lifecycle, from ideation and prototyping to production deployment, monitoring, and optimization.
This role requires a deep understanding of modern deep learning techniques, LLM toolchains, and MLOps best practices, with a strong emphasis on productionizing AI services on cloud platforms.
Responsibilities :
- Own the ML / LLM Lifecycle : Own Machine Learning and Large Language Model services end to end, from initial design and development through deployment and operation in production environments.
- Production Deployment : Deploy and manage ML / LLM services on cloud platforms, particularly Azure (AKS, Azure OpenAI, Azure ML), ensuring high availability, performance, and scalability.
- Deep Learning Development : Apply strong Python programming skills and hands-on experience with modern deep-learning stacks such as PyTorch, TensorFlow, or Hugging Face Transformers to build and refine models.
- LLM Toolchain Implementation : Develop features leveraging advanced LLM toolchains, including prompt engineering, function calling / tools, and integration with vector stores (e.g., FAISS, Pinecone).
- Agentic AI Patterns : Implement and refine agentic AI patterns using frameworks like LangChain or LangGraph, including setting up evaluation harnesses and robust guardrails to ensure reliable and controlled LLM behavior.
- Taming Non-determinism : Develop and apply strategies to mitigate LLM non-determinism, ensuring more predictable and consistent outputs from models.
- MLOps & DevOps : Implement and maintain robust MLOps practices, including containerization (Docker), orchestration (Kubernetes), and CI / CD pipelines using Git-based workflows.
- Monitoring & Troubleshooting : Monitor the performance and health of live ML / LLM services, proactively identifying and resolving issues, scaling resources as needed, and troubleshooting complex production problems.
- Collaboration : Work closely with data scientists, product managers, and other engineering teams to translate research and prototypes into production-ready solutions.
Required Qualifications :
- 3+ years of experience owning and successfully deploying Machine Learning and Large Language Model services in production environments on Azure (specifically AKS, Azure OpenAI, or Azure ML) or another major cloud provider (AWS, GCP).
- Strong proficiency in Python with a proven track record of hands-on work with a modern deep-learning stack, including PyTorch, TensorFlow, or Hugging Face Transformers.
- Demonstrated experience building features with LLM toolchains, including expertise in prompt engineering, function calling / tools, and utilizing vector stores such as FAISS, Pinecone, or similar.
- Familiarity with agentic AI patterns, including experience with frameworks like LangChain or LangGraph, as well as understanding of evaluation harnesses and guardrails for LLMs, and strategies to manage LLM non-determinism.
- Comfortable with containerization (Docker) and CI / CD practices using Git-based workflows.
- Proficiency in deploying and managing applications on Kubernetes and experience with monitoring, scaling, and troubleshooting live services.
Nice-to-Haves :
- Experience within the domains of billing, collections, fintech, or professional-services SaaS.
- Knowledge of email deliverability, templating engines, or CRM systems.
- Exposure to compliance frameworks such as SOC 2, ISO 27001, or experience with the secure handling of financial data.
(ref : hirist.tech)