Sr. Site Reliability Engineer- Azure

Confidential, Mohali
13 days ago
Job description
  • Gathering Project Requirements from Stakeholders along with Business Analysts and Project Managers
  • Break down complex problems and projects into manageable goals
  • Handle high-severity incidents and situations
  • Designing high-level schematics of the infrastructure, tools, and processes needed
  • Performing an in-depth analysis of possible risks and their countermeasures
  • Create a bridge between development and operations by applying a software engineering mindset to system administration topics
  • Configuration management platform understanding and experience (Chef / Puppet / Ansible)
  • Release engineering, which involves defining best practices to ensure software releases are consistent and repeatable.
  • Alerting, being on-call, and troubleshooting, along with emergency and incident response and postmortems.
  • Know how best to monitor systems and react when things go wrong, constantly writing and rewriting response playbooks to reduce the time to fix any breakdown which may occur
  • Documenting incidents, understanding all contributing root causes, and implementing preventive actions for the future
  • Highly developed skills in managing 24x7 production support comprising Incident, Problem, and Change management
  • Troubleshooting Support Escalation
  • On-Call Process Optimization
  • Documenting Knowledge
  • Optimizing SDLC (Software Development Life Cycle)
  • Technical Requirements:

    • Strong understanding of cloud-based architecture and cloud operations. Hands-on experience with Azure
    • Experience in administration / build / management of Linux systems
    • Foundational understanding of Infrastructure and Platform Technology stacks
    • Strong understanding of networking concepts and theory, such as protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing
    • Working knowledge of Infrastructure and Application monitoring platforms
    • Understanding of core DevOps practices (CI/CD pipelines, release management, etc.)
    • Ability to write code in at least one modern programming language (Python, JavaScript, Ruby, etc.). Additional scripting skills are preferred
    • Prior experience with cloud management automation tools (Terraform, CloudFormation, etc.) is preferred
    • Experience with source code management software and API automation is preferred
    • Deep understanding of the architecture and operations of container orchestration tools, e.g. Kubernetes
    • Deep understanding of common application stacks, e.g. Java, Node.js, Golang
    • Deep understanding of Databases and SQL
    • Strong understanding of BigData Infrastructure.
    • Understanding of Incident management and Event Register Management
    • Knowledge of SDLC methodologies and best practices including Waterfall Process, Agile methodologies, deployment automation, code reviews, and test-driven development
    • Professional Attributes:

    • Excellent communication skills
    • Attention to detail
    • Analytical mind and Problem Solving Aptitude
    • Strong Organizational skills
    • Visual Thinking
    • Skills Required

      JavaScript, Linux, Networking, Agile, Production Support, Automation, Troubleshooting
