SRE DevOps Engineer

Brillio, India
1 day ago
Job description

SRE DevOps (ML Ops Role)

Required Skills :

  • Demonstrated ability in designing, building, refactoring and releasing software written in Python.
  • Hands-on experience with ML frameworks such as PyTorch, TensorFlow, Triton.
  • Ability to handle framework-related issues, version upgrades, and compatibility with data processing / model training environments.
  • Experience with AI / ML model training and inferencing platforms is a big plus.
  • Experience with LLM fine-tuning systems is a big plus.
  • Debugging and triaging skills.
  • Cloud technologies such as Kubernetes and Docker, plus Linux fundamentals.
  • Familiarity with DevOps practices and continuous testing.
  • DevOps pipelines and automation: app deployment, configuration, and performance monitoring.
  • Test automation and Jenkins CI/CD.
  • Excellent communication, presentation, and leadership skills for collaborating with partners, customers, and engineering teams.
  • Well organized and able to manage multiple projects in a fast-paced, demanding environment.
  • Good spoken and written English, including reading comprehension.

SRE DevOps (Big Data Role)

Required Skills :

  • Demonstrated ability in designing, building, refactoring and releasing software written in Python.
  • Hands-on experience with ML frameworks such as PyTorch, TensorFlow, Triton.
  • Ability to handle framework-related issues, version upgrades, and compatibility with data processing / model training environments.
  • Experience with AI / ML model training and inferencing platforms is a big plus.
  • Experience with LLM fine-tuning systems is a big plus.
  • Debugging and triaging skills.
  • Cloud technologies such as Kubernetes and Docker, plus Linux fundamentals.
  • Familiarity with DevOps practices and continuous testing.
  • DevOps pipelines and automation: app deployment, configuration, and performance monitoring.
  • Test automation and Jenkins CI/CD.
  • Excellent communication, presentation, and leadership skills for collaborating with partners, customers, and engineering teams.
  • Well organized and able to manage multiple projects in a fast-paced, demanding environment.
  • Good spoken and written English, including reading comprehension.

SRE DevOps (ML Flow)

Required Skills :

  • Demonstrated ability in designing, building, refactoring and releasing software written in Python and C++.
  • Hands-on experience with Ray.io, including workload management, cluster deployment, distributed task scheduling, and troubleshooting.
  • Ability to use Ray Dashboard and CLI tools for monitoring, resource tracking, debugging distributed jobs, and resolving production issues.
  • Knowledge of Ray ecosystem libraries such as Ray Train, Ray Tune, Ray Serve, and Ray Data is a big plus.
  • Experience integrating Ray with tools such as Airflow, MLflow, Dask, DeepSpeed is a big plus.
  • Debugging and triaging skills.
  • Cloud technologies such as Kubernetes and Docker, plus Linux fundamentals.
  • Familiarity with DevOps practices and continuous testing.
  • DevOps pipelines and automation: app deployment, configuration, and performance monitoring.
  • Test automation and Jenkins CI/CD.
  • Excellent communication, presentation, and leadership skills for collaborating with partners, customers, and engineering teams.
  • Well organized and able to manage multiple projects in a fast-paced, demanding environment.
  • Good spoken and written English, including reading comprehension.