Talent.com
Site Reliability Engineer
No longer accepting applications
WhiteLotus Talent Partners • Bhopal, India
16 hours ago
Job description

We are looking for L0 and L1 Site Reliability Engineer (SRE) support staff to join our Krutrim Cloud Site Reliability operations team and keep our cloud infrastructure, powered by OpenStack and Kubernetes, running smoothly. In this role, you will focus on monitoring, basic troubleshooting, and incident response to help maintain high system availability, reliability, and performance. You will identify and resolve simple issues and escalate more complex problems to senior SREs when needed.

The ideal candidate should have a basic understanding of cloud infrastructure (especially OpenStack and Kubernetes), containerized environments, and system monitoring. This position offers an excellent opportunity for someone looking to grow into a more advanced SRE or DevOps role.

Key Responsibilities:

For L0 Support (Level 0):

  • Incident Monitoring & Triage:
      • Respond to system alerts and monitor infrastructure health for both OpenStack and Kubernetes using tools like Prometheus, Grafana, and other observability tooling.
      • Identify low-level issues and follow runbooks or predefined scripts to perform first-level triage.
      • Document unresolved incidents and escalate them to L1 or L2 according to established escalation protocols.
  • System Health Checks:
      • Perform daily health checks on Kubernetes pods and nodes and on OpenStack instances.
      • Verify basic functionality of VMs, containers, and network services within the environment.
  • Basic Troubleshooting:
      • Resolve simple issues such as VM reboots, pod failures, and network connectivity problems within OpenStack or Kubernetes environments.
      • Follow predefined steps for basic troubleshooting tasks such as restarting services or clearing logs.
  • Ticket Management:
      • Log incidents and issues in a ticketing system (e.g., JIRA, ServiceNow) for tracking and escalation.
      • Update incident tickets with relevant information for ongoing resolution efforts.

=========================================================================================================

For L1 Support (Level 1):

  • Incident Resolution:
      • Investigate and resolve issues more complex than those handled at L0, such as Kubernetes pod crashes, network misconfigurations in OpenStack, and minor service disruptions.
      • Use kubectl to troubleshoot Kubernetes pods and nodes, and the OpenStack CLI to diagnose problems with VMs, storage, and networks.
  • Automation & Scripting:
      • Automate routine tasks, such as VM provisioning, pod deployments, or status checks, using basic scripting languages (Python, Bash).
      • Improve automation workflows based on feedback and frequently encountered issues.
  • Log Aggregation & Monitoring:
      • Review logs and metrics collected from the ELK Stack, Prometheus, Grafana, or other logging tools to detect trends and potential issues.
      • Analyze logs and metrics from OpenStack and Kubernetes clusters to pinpoint underlying problems (e.g., high CPU usage, memory leaks).
  • Basic Network & Storage Management:
      • Investigate networking issues related to Neutron (for OpenStack) and CNI configurations (for Kubernetes).
      • Manage storage resources within OpenStack and Kubernetes (e.g., creating persistent volumes, debugging storage access issues).
  • Collaboration & Escalation:
      • Work closely with L2 and L3 engineers on complex troubleshooting or advanced system issues that require in-depth knowledge.
      • Share knowledge with the team and help create new documentation or update existing troubleshooting guides.
  • User and Permissions Management:
      • Perform basic user management tasks within OpenStack (e.g., creating and managing tenants, security groups).
      • Review and modify Kubernetes RBAC (Role-Based Access Control) settings based on user access needs.

Skills & Qualifications:

Required Skills:

  • Basic Cloud & Kubernetes Knowledge:
      • Familiarity with OpenStack architecture (e.g., Nova, Neutron, Cinder).
      • Basic understanding of Kubernetes components, including pods, services, deployments, and namespaces.
  • Systems & Networking:
      • Knowledge of Linux/Unix-based operating systems (e.g., Ubuntu, CentOS, Red Hat).
      • Understanding of networking concepts like DNS, IP routing, and VLANs in cloud environments.
  • Monitoring & Alerting Tools:
      • Familiarity with monitoring tools like Prometheus, Grafana, Zabbix, or CloudWatch for alert management and system health monitoring.
  • Troubleshooting & Incident Response:
      • Experience in using log aggregation tools (ELK Stack, Splunk) and interpreting logs for incident detection.
      • Ability to perform basic troubleshooting steps (e.g., restarting services, running basic shell commands) to resolve issues.
  • Communication Skills:
      • Strong communication skills to collaborate effectively with senior SREs, developers, and other teams.
      • Ability to document incidents, solutions, and troubleshooting steps clearly.

Preferred Skills:

  • Basic Scripting & Automation:
      • Exposure to scripting languages such as Bash, Python, or Go to automate basic administrative tasks.
  • Cloud Platform Experience:
      • Familiarity with other cloud technologies such as AWS, Azure, or Google Cloud Platform.
  • Certifications:
      • Basic certifications such as CompTIA Linux+, AWS Certified Solutions Architect, Certified Kubernetes Administrator (CKA), or Certified OpenStack Administrator (COA) are a plus.