We are seeking a highly skilled AI/ML Engineer with strong expertise in Python programming, API development, and real-time deployment of ML models. The ideal candidate will have experience designing, building, and optimizing machine learning pipelines and integrating models into production environments.
Key Responsibilities:
- Design, develop, and optimize machine learning models using Python and popular ML frameworks (TensorFlow, PyTorch, Scikit-learn, etc.).
- Implement end-to-end ML pipelines including data preprocessing, feature engineering, training, evaluation, and deployment.
- Build and manage RESTful APIs / gRPC services to expose ML models for real-time inference (a minimal illustrative sketch follows this list).
- Deploy and scale models in production environments (Docker, Kubernetes, cloud platforms such as AWS, GCP, or Azure).
- Ensure real-time ML systems are highly available, low-latency, and fault-tolerant.
- Collaborate with data engineers and software teams to integrate ML solutions into existing applications.
- Conduct performance monitoring, optimization, and retraining of models as needed.
- Apply MLOps best practices for CI/CD pipelines, model versioning, and automated deployment workflows.
- Write clean, efficient, and production-grade Python code following software engineering best practices.
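For illustration, the sketch below shows the kind of real-time inference endpoint described above, using FastAPI (one of the frameworks listed in this posting). It is a minimal sketch only: the model file "model.joblib", the request schema, and the endpoint name are hypothetical placeholders, not part of the role description.

```python
# Minimal real-time inference endpoint sketch (assumptions: a joblib-serialized
# scikit-learn-style model at "model.joblib", a flat numeric feature vector as
# input, and FastAPI/uvicorn/joblib installed).
from typing import List

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="realtime-inference")

# Load the serialized model once at startup rather than per request
# (the path "model.joblib" is a hypothetical placeholder).
model = joblib.load("model.joblib")

class PredictRequest(BaseModel):
    features: List[float]

class PredictResponse(BaseModel):
    prediction: float

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    # Single-row, synchronous inference; batching, input validation,
    # and monitoring hooks are deliberately left out of this sketch.
    pred = model.predict([req.features])[0]
    return PredictResponse(prediction=float(pred))

# Run locally with: uvicorn serve:app --host 0.0.0.0 --port 8000
```

In production, a service like this would typically be containerized with Docker and deployed behind Kubernetes or a managed cloud endpoint, as reflected in the deployment and scaling responsibilities above.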
Required Skills & Experience:
- 3-8 years of hands-on experience in Python programming (advanced knowledge of data structures, OOP, multiprocessing, async programming).
- Strong expertise in machine learning algorithms, model training, and evaluation techniques.
- Experience with API development (FastAPI, Flask, Django, or similar).
- Proven experience in real-time model deployment and serving (TensorFlow Serving, TorchServe, MLflow, or custom solutions); a brief MLflow sketch follows this list.
- Solid understanding of cloud-native deployments (AWS SageMaker, GCP Vertex AI, Azure ML) and containerization (Docker, Kubernetes).
- Knowledge of streaming data frameworks (Kafka, Spark Streaming, Flink) is a plus.
- Familiarity with CI/CD pipelines for ML (GitHub Actions, Jenkins, or similar).
- Strong grasp of data engineering concepts (data ingestion, transformation, and storage).
- Experience in monitoring and logging (Prometheus, Grafana, ELK, or equivalent).
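As a further illustration, here is a minimal sketch of experiment tracking and model versioning with MLflow, one of the tools named above. It assumes MLflow and scikit-learn are installed; the dataset, run name, parameter, and metric are purely illustrative and not requirements of the role.

```python
# Minimal MLflow tracking/versioning sketch (illustrative values only).
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run(run_name="iris-logreg"):
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Log the fitted model as a versioned artifact that can later be
    # loaded or served (e.g. via the mlflow models serve CLI).
    mlflow.sklearn.log_model(model, artifact_path="model")
```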
Nice to Have:
- Experience with Generative AI (LLMs, diffusion models, transformers).
- Exposure to GPU optimization and distributed training.
- Familiarity with feature stores and advanced MLOps frameworks (Kubeflow, TFX, MLflow).