1. Manage and maintain day-to-day BAU operations, including monitoring system
performance, troubleshooting issues, and ensuring high availability.
2. Build infrastructure-as-code (IaC) patterns that meet security and engineering
standards.
3. Build CI/CD pipelines using Octopus, GitLab CI, and cloud-native toolchains like
ArgoCD.
4. Build and maintain automation scripts and tools to streamline operational processes.
5. Ensure system uptime is observable, and triage issues with the respective
service teams and stakeholders.
6. Manage the observability setup, including metrics and logging, and extend its
capabilities using PromQL queries.
7. Build comprehensive, detailed runbooks for managing services, detecting and
remediating issues, and restoring services.
8. Collaborate with engineering teams to resolve incidents faster during
firefighting and help improve the overall process.
9. Support the operations team in managing BAU by monitoring and analyzing system
logs and performance metrics to identify areas for improvement and take proactive
measures.
10. Stay up to date with industry trends and best practices in SRE, observability,
alerting, and infrastructure automation.
11. Actively participate in rotational shifts and on-call duties to ensure continuous
operational support.
12. Communicate effectively with technical peers and team members in both written
and verbal formats.
What we are looking for in a new hire
1. Fresher with strong knowledge of cloud computing platforms, preferably Azure.
2. Cross-functional knowledge in Linux systems, storage, networking, security, and
databases.
3. Knowledge of orchestration tools such as Kubernetes.
4. Proficiency in programming languages such as Python or Go.
5. Ability to develop and maintain software written in any programming
language.
6. Knowledge of continuous integration and continuous delivery tooling and practices
(e.g., GitLab, ArgoCD, Octopus).
7. Awareness of how to monitor infrastructure and application uptime and availability
so that functional and performance objectives are met.
8. Excellent communication and collaboration skills.
SRE • Delhi, India