Job Description
We are seeking a talented Site Reliability Engineer (SRE) to join our team in Pune, India. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our large-scale distributed systems. You will work closely with development teams to implement and maintain robust infrastructure solutions that support our growing business needs.
Role Overview
The Site Reliability Engineer (SRE) is responsible for designing, implementing, and maintaining scalable, reliable, and secure infrastructure and applications. This role blends software engineering with systems engineering to ensure high availability, performance, and observability across cloud-native environments.
Key Responsibilities
- Architect for Resilience: Design systems with redundancy, fault tolerance, and graceful degradation.
- Observability & Monitoring: Implement full-stack observability including monitoring, logging, tracing, and alerting.
- Automation First: Build workflows to automate deployments, incident response, and routine tasks.
- Incident Management: Enable blameless postmortems and continuous improvement.
- Release Planning: Collaborate with DevOps and engineering teams to manage lifecycle work items and release cycles.
- Global Collaboration: Work in a shared responsibility model with 50–60% overlap with onshore teams for effective communication.
Required Skills & Experience
- Cloud Platforms: Azure (preferred), AWS (acceptable with upskilling plan)
- Infrastructure as Code: Terraform, Helm, GitHub Actions
- Containerization & Orchestration: Docker, Kubernetes, Argo CD, Flux
- DevOps Tools: CI/CD pipelines, GitOps, REST APIs
- Programming: Bash, Python (moderate proficiency)
- Data Ecosystems: Azure Data Factory, Databricks, Fabric (optional but preferred)
Team Integration & Expectations
- Work closely with technical leads on support tasks and playbook development.
- Participate in onboarding and training programs outlined in internal documentation.
- Contribute to offshore delivery excellence and maintain high standards of reliability and performance.
Required Technical Skills
- Strong Azure infrastructure and networking skills.
- Strong Terraform IaC experience and skills.
- Bicep knowledge a plus.
- Strong previous experience in troubleshooting complex issues on an unfamiliar tech stack.
- Strong GitHub Actions / Azure DevOps Pipelines skills.
- Moderate knowledge of the operations and infrastructure patterns for working in data ecosystems.
- Data Factory, Databricks, Fabric knowledge a plus.
- Moderate knowledge of SRE, observability, and other maintenance practices.
- Moderate Bash skills.
- Moderate Python skills.
- Moderate experience in CI/CD and Git operations for software releases.
- Moderate AKS / Helm / Kustomize skills.
- Flux / Argo / GitOps experience a plus.
- Moderate Docker operations knowledge.
- Moderate REST API knowledge.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience)
- 3+ years of experience in SRE, DevOps, or similar roles
- Strong programming skills in languages such as Python, Go, or Java
- Extensive experience with cloud platforms (e.g., AWS, GCP, Azure) and Infrastructure as Code tools (e.g., Terraform, Ansible)
- Proficiency in containerization technologies, particularly Docker and Kubernetes
- Experience with monitoring and logging tools such as Prometheus, Grafana, and ELK stack
- Strong knowledge of Linux/Unix systems administration and networking protocols
- Familiarity with CI/CD pipelines and version control systems (e.g., Git)
- Experience with large-scale distributed systems and microservices architectures
- Strong understanding of system reliability, scalability, and performance optimization techniques
- Excellent problem-solving skills and ability to troubleshoot complex issues
- Strong communication skills and ability to work effectively in a collaborative team environment
- Experience with incident management and on-call rotations
- Knowledge of SRE principles and best practices
Additional Information
Beware of scams
Our recruiting team may communicate with candidates via our @hitachisolutions.com domain email address and/or via our SmartRecruiters (Applicant Tracking System) domain email address regarding your application and interview requests.
All offers will originate from our @hitachisolutions.com domain email address. If you receive an offer or information from someone purporting to be an employee of Hitachi Solutions from any other domain, it may not be legitimate.