Talent.com
Senior Site Reliability Engineer

iVoyant • Lucknow, IN
21 hours ago
Job description

One of our clients is looking for an experienced Senior Site Reliability Engineer (SRE) to join their team and work on mission-critical SaaS cloud products.

Key Responsibilities :

Reliability and Performance Management :

  • Design, implement, and maintain highly available, scalable, and resilient cloud-native architectures for mission-critical SaaS products.
  • Develop and implement SLOs, SLIs, and SLAs to measure and improve service reliability.
  • Continuously optimize system performance and resource utilization across multiple cloud platforms.
  • Fine-tune and optimize application performance by analyzing code, traces, and database queries.

Incident Management and Troubleshooting :

  • Lead incident response efforts, effectively troubleshooting complex issues to minimize downtime and impact.
  • Reduce Mean Time to Recover (MTTR) through proactive monitoring, automated alerting, and efficient problem-solving techniques.
  • Conduct thorough Root Cause Analysis (RCA) for all major incidents and implement preventive measures.

Observability and Monitoring :

  • Design and implement end-to-end observability solutions across our distributed systems.
  • Develop and maintain comprehensive monitoring strategies using tools such as the ELK Stack, Prometheus, and Grafana.
  • Create and optimize product status dashboards to provide real-time visibility into system health and performance.

Automation and Infrastructure as Code (IaC) :

  • Implement Infrastructure as Code practices using tools like Terraform.
  • Develop and maintain automated deployment pipelines and CI/CD workflows.
  • Create self-healing systems and automate routine operational tasks to reduce manual intervention.

Cloud-Agnostic Architecture :

  • Design and implement cloud-agnostic solutions that can operate efficiently across multiple cloud providers.
  • Develop expertise in event-driven architecture and related technologies (e.g., Apache Kafka / Event Hubs, Redis, MongoDB Atlas, IoT Hub).
  • Implement and manage containerized applications using Kubernetes across different cloud environments.

Continuous Improvement :

  • Regularly review and refine operational practices to enhance efficiency and reliability.
  • Stay updated with the latest industry trends and technologies in SRE, cloud computing, and DevOps.
  • Contribute to the development of internal tools and frameworks to support SRE practices.

Requirements :

  • Strong knowledge of cloud platforms, particularly Azure and its associated services.
  • Expertise in observability tools (ELK Stack, Dynatrace, Prometheus).
  • Expertise in containerization technologies such as Docker and Kubernetes.
  • Understanding of event-driven architecture and database technologies (MongoDB Atlas, Azure SQL, PostgreSQL).
  • Proficiency in IaC and automation tools such as Terraform and GitHub Actions.
  • Proficiency in one or more programming languages: Python, .NET, or Java.
  • Strong understanding of networking concepts, load balancing, and security practices.