Talent.com
Site Reliability Engineer - Docker / Kubernetes

Vakconsulting, Hyderabad
17 days ago
Job description

Description :

We urgently need a strong Python + SRE engineer for an offshore role. Please share profiles with hands-on experience in AWS, Kubernetes, Python, Splunk, Prometheus, and Grafana. Share only immediate joiners, and state each candidate's availability for this opportunity when sharing profiles.

Key Responsibilities :

  • Design, implement, and manage scalable and highly available cloud infrastructure on AWS or GCP.
  • Containerize applications using Docker, and manage orchestration with Kubernetes.
  • Collaborate with developers and QA teams to integrate CI / CD pipelines and automate deployment processes.
  • Ensure system reliability, uptime, and performance by leveraging industry-leading monitoring tools such as Grafana, Dynatrace, etc.
  • Troubleshoot system failures, conduct root cause analysis, and provide long-term solutions to prevent recurrence.
  • Script and automate operational tasks using Python or Java to improve system efficiency.
  • Maintain documentation of system architecture, procedures, and configurations.
  • Participate in incident response and on-call support rotation if required.

Skills & Qualifications :

  • Minimum 5 years of hands-on experience in a DevOps / SRE role.
  • Strong expertise in AWS or Google Cloud Platform (GCP).
  • Deep understanding and practical experience with Docker and Kubernetes in production environments.
  • Proficient in Java or Python for scripting, automation, and integrations.
  • Experience with monitoring tools such as Grafana, Dynatrace, Prometheus, etc.
  • Strong problem-solving skills and ability to work in a fast-paced environment.
  • Excellent communication and documentation skills.

Must Have Skills :

  • AWS, DevOps, Prometheus, Grafana, Splunk, Python Scripting.
  • Experience configuring and setting up monitoring dashboards using Splunk, Grafana, etc.

Preferred Attributes :

  • Prior experience in large-scale enterprise systems.
  • Ability to work independently and take ownership of DevOps processes.
  • Exposure to Agile / Scrum methodologies.

(ref : hirist.tech)
