Site Reliability Engineer

JRD Systems, Delhi, India
22 hours ago
Job description

Position: Site Reliability Engineer (SRE)

Role Overview:

We are seeking an experienced Site Reliability Engineer (SRE) with a strong background in Windows infrastructure to manage and optimize our cloud and on-premises environments. The ideal candidate will partner with development teams to improve service reliability, implement automation, and ensure high-performance systems across VMC, AWS, and Azure platforms.

Key Responsibilities:

Manage day-to-day operations of VMC, AWS, and Azure infrastructure.

Gather and analyze metrics from operating systems and applications to support performance tuning and fault finding.

Collaborate with development teams to improve services through rigorous testing and release procedures.

Participate in system design consulting, platform management, and capacity planning.

Create sustainable systems and services through automation and Infrastructure as Code (IaC).

Balance feature development speed and reliability by defining and maintaining service-level objectives (SLOs).

Build and document automation processes for Infrastructure as a Service (IaaS).

Oversee backup, patch management, and system maintenance.

Configure, deploy, maintain, troubleshoot, and monitor container orchestration on AWS.

Communicate complex technical ideas effectively to both technical and non-technical stakeholders.

Qualifications:

Bachelor’s degree (or equivalent) in Computer Science or a related discipline.

7+ years of experience in IT operations, Windows infrastructure, or SRE roles.

Hands-on experience with Windows Server, Active Directory (AD), LDAP, DNS, network storage, and Azure compute services.

Strong scripting and programming skills using PowerShell, Python, Ansible, Terraform, and Bash.

Solid understanding of container orchestration (Kubernetes, Docker, etc.).

Familiarity with ITSM processes (Incident, Problem, and Change Management), preferably using tools such as ServiceNow.

Proactive approach to identifying performance bottlenecks and areas for improvement.

Strong analytical, problem-solving, and communication skills.

Ability to work both independently and collaboratively with a sense of ownership.

Skills & Attributes:

Strong interpersonal and teamwork skills.

Self-driven with a proactive mindset.

Ability to balance reliability and speed while maintaining high-quality services.
