Talent.com
Site Reliability Engineer-II

Confidential
Noida, Delhi NCR
7 days ago
Job description
  • Build the CI/CD stack in collaboration with the Dev and QA / Automation teams, and drive the organization to a new level of (daily / hourly) continuous delivery and deployment.
  • Security is paramount to everything we do. You will work closely with the CISO and Dev team(s) to make security a first-class citizen. Develop S-CICD (Secure CI/CD), enable various security tool chains, and deliver vulnerability reports to developers via automation.
  • Observability is critical at the scale of our systems, both for gaining insight into system behavior and for detecting problems and failures. We are looking for leads to drive this charter across logs, metrics, service mesh, tracing, etc.
  • Collaborate closely with the Dev and QA teams to bring each initiative to closure and increase adoption of DevOps practices and tool chains.
  • Apply strong analytical skills to understand production system metrics, drive change, optimize system utilization and drive cost efficiency.
  • Auto-scale the platform up and down during peak-season scenarios.
  • Ensure that the platform is secured per the guidelines established by the CISO, e.g., protect against DDoS attacks by implementing a WAF, manage vulnerabilities and patches, and install required security agents.
  • Lead least-privilege-based RBAC for various production services and tool chains.
  • Build and execute the Disaster Recovery plan.
  • Participate as a key stakeholder in Incident Response (IR).
What You Need

    • 3+ years of experience as a DevOps / SRE Engineer.
    • Solid experience with at least one cloud platform (AWS, Azure, or GCP), with a focus on automation. Certification is an advantage.
    • Hands-on experience with Kubernetes and Linux.
    • Programming experience with scripting languages, e.g., Python.
    • Experience building scalable CI/CD architectures and solutions for build and deployment is preferred.
    • Experience building an observability stack covering logs, metrics, traces, service mesh, and data observability is preferred.
    • Good at writing and structuring documentation for consumption by various dev teams.
    • Cloud security is a major advantage and a highly preferred skill.
    • Hands-on experience with a few of these is preferred: Kafka, PostgreSQL, Snowflake, etc.
Preferred Skills

    • Multi-Cloud: AWS, Azure, GCP
    • Distributed Compute: Kubernetes (EKS / AKS), Containerization
    • Persistence Stores: Postgres, MongoDB
    • Data Warehousing: Snowflake, Databricks
    • Messaging: Kafka
    • CI/CD: Jenkins, ArgoCD, GitOps
    • Observability: Elasticsearch, Prometheus, Jaeger, New Relic, etc.
Skills Required

    CI/CD, Site Reliability Engineering, Kafka, Kubernetes, AWS
