Sr. Site Reliability Engineer - Azure

Confidential, Mohali

13 days ago

Job description

Gathering project requirements from stakeholders, together with Business Analysts and Project Managers

Break down complex problems and projects into manageable goals

Handle high-severity incidents and situations.

Designing high-level schematics of the infrastructure, tools, and processes needed

Performing an in-depth analysis of possible risks and their countermeasures

Create a bridge between development and operations by applying a software engineering mindset to system administration topics

Configuration management platform understanding and experience (Chef / Puppet / Ansible)

Release engineering, which involves defining best practices to ensure software releases are consistent and repeatable.

Alerting, being on-call, and troubleshooting, along with emergency and incident response and postmortems.

Know how best to monitor systems and react when things go wrong, constantly writing and rewriting response playbooks to reduce the time needed to fix any breakdown that may occur

Conducting postmortems, which involves documenting an incident, understanding all contributing root causes, and implementing future preventive actions.

Highly developed skills in managing 24x7 production support comprising Incident, Problem, and Change management

Troubleshooting Support Escalation

On-Call Process Optimization

Documenting Knowledge

Optimizing SDLC (Software Development Life Cycle)

Technical Requirements -

Strong understanding of cloud-based architecture and cloud operations. Hands-on experience with Azure
Experience in administration / build / management of Linux systems
Foundational understanding of Infrastructure and Platform Technology stacks
Strong understanding of networking concepts and theories, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing
Working knowledge of Infrastructure and Application monitoring platforms
Understanding of core DevOps practices (CI/CD pipelines, release management, etc.)
Ability to write code in at least one modern programming language (Python, JavaScript, Ruby, etc.). Additional scripting skills are preferred
Prior experience with cloud management automation tools (Terraform, CloudFormation, etc.) is preferred
Experience with source code management software and API automation is preferred.
Deep understanding of the architecture and operations of container orchestration tools, e.g. Kubernetes
Deep understanding of known application stacks, e.g. Java, Node.js, Golang
Deep understanding of Databases and SQL
Strong understanding of Big Data infrastructure.
Understanding of Incident management and Event Register Management
Knowledge of SDLC methodologies and best practices including Waterfall Process, Agile methodologies, deployment automation, code reviews, and test-driven development

Professional Attributes -

Excellent communication skills
Attention to detail
Analytical mind and problem-solving aptitude
Strong Organizational skills
Visual Thinking

Skills Required

JavaScript, Linux, Networking, Agile, Production Support, Automation, Troubleshooting
