- Gather project requirements from stakeholders along with Business Analysts and Project Managers
- Break down complex problems and projects into manageable goals
- Handle high-severity incidents and situations
- Design high-level schematics of the infrastructure, tools, and processes needed
- Perform an in-depth analysis of possible risks and the countermeasures for them
- Create a bridge between development and operations by applying a software engineering mindset to system administration topics
- Configuration management platform understanding and experience (Chef / Puppet / Ansible)
- Release engineering, which involves defining best practices to ensure software releases are consistent and repeatable
- Alerting, being on-call, and troubleshooting, along with emergency and incident response and postmortems
- Know how best to monitor systems and react when things go wrong, constantly writing and rewriting response playbooks to reduce the time to fix any breakdown that may occur
- Postmortems, which involve documenting an incident, understanding all contributing root causes, and implementing future preventive actions
- Highly developed skills in managing 24x7 production support comprising Incident, Problem, and Change management
- Troubleshooting Support Escalation
- On-Call Process Optimization
- Documenting Knowledge
- Optimizing SDLC (Software Development Life Cycle)
Technical Requirement -
- Strong understanding of cloud-based architecture and cloud operations. Hands-on experience with Azure
- Experience in administration / build / management of Linux systems
- Foundational understanding of Infrastructure and Platform Technology stacks
- Strong understanding of networking concepts and theory, such as protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing
- Working knowledge of Infrastructure and Application monitoring platforms
- Understanding of core DevOps practices (CI/CD pipelines, release management, etc.)
- Ability to write code in at least one modern programming language (Python, JavaScript, Ruby, etc.). Additional scripting skills are preferred
- Prior experience with cloud management automation tools (Terraform / CloudFormation, etc.) is preferred
- Experience with source code management software and API automation is preferred.
- Deep understanding of the architecture and operations of container orchestration tools, e.g. Kubernetes
- Deep understanding of known application stacks, i.e. Java, Node.js, Golang
- Deep understanding of Databases and SQL
- Strong understanding of BigData Infrastructure.
- Understanding of Incident management and Event Register Management
- Knowledge of SDLC methodologies and best practices including Waterfall Process, Agile methodologies, deployment automation, code reviews, and test-driven development
Professional Attributes -
- Excellent communication skills
- Attention to detail
- Analytical mind and Problem Solving Aptitude
- Strong Organizational skills
- Visual Thinking
Skills Required
JavaScript, Linux, Networking, Agile, Production Support, Automation, Troubleshooting