✅ Must-Have Requirements
Experience: 5–8 years in SRE and/or DevOps roles
Programming Skills: Proficiency in at least one coding language, preferably Python or C++
Platform Support: Experience supporting and enhancing AI Platform services
Automation: Proven ability to implement automation solutions
Troubleshooting: Strong skills in diagnosing and resolving issues in distributed systems
Agile Methodologies: Comfortable working in Scrum or Kanban environments
Soft Skills:
Team-oriented mindset
Strong interpersonal and communication skills
High ownership and accountability
Work Ethic:
Independent and proactive
Comfortable in fast-paced, ambiguous environments
Energetic, self-directed, and self-motivated
Ray.io Expertise: Hands-on experience with Ray.io for:
Workload management
Cluster deployment
Distributed task scheduling
Technical Stack Familiarity:
Kubernetes
Docker
Linux fundamentals
CI/CD pipelines
Good-to-Have Skills
Exposure to AI/ML infrastructure or platform engineering
Experience in cross-functional collaboration across service lines
Familiarity with cloud-native tools and observability frameworks
Senior Site Reliability Engineer • Bengaluru, Karnataka, India