Lead Site Reliability Engineer - Java

Landmark Group
Bengaluru, Karnataka, India
29 days ago
Job description

Company : Landmark Group

Job Title : SRE Lead (Engineering & Reliability)

Experience : 8-12 years

Job Summary :

We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As an SRE Lead, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies. This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems.

Key Responsibilities :

Reliability & Performance :

  • Lead efforts to maintain high availability and reliability of critical services.
  • Define and monitor SLIs, SLOs, and SLAs to ensure business requirements are met.
  • Proactively identify and resolve performance bottlenecks and system inefficiencies.

Incident Management & Response :

  • Establish and improve incident management processes and on-call rotations.
  • Lead incident response and root cause analysis for high-priority outages.
  • Drive post-incident reviews and ensure actionable insights are implemented.

Automation & Tooling :

  • Develop and implement automated solutions to reduce manual operational tasks.
  • Enhance system observability through metrics, logging, and distributed tracing tools (e.g., Prometheus, Grafana, Elastic APM).
  • Optimize CI/CD pipelines for seamless deployments.

Collaboration :

  • Partner with software engineering teams to improve the reliability of applications and infrastructure.
  • Work closely with product/engineering teams to design scalable and robust systems.
  • Ensure seamless integration of monitoring and alerting systems across teams.

Leadership & Team Building :

  • Manage, mentor, and grow a team of SREs.
  • Promote SRE best practices and foster a culture of reliability and performance across the organization.
  • Drive performance reviews, skills development, and career progression for team members.

Capacity Planning & Cost Optimization :

  • Perform capacity planning and implement autoscaling solutions to handle traffic spikes.
  • Optimize infrastructure and cloud costs while maintaining reliability and performance.

Skills & Qualifications :

  • Technical Expertise :
    o Experience with cloud platforms (AWS/Azure/GCP) and Kubernetes.
    o Hands-on knowledge of infrastructure-as-code tools like Terraform/Helm/Ansible.
    o Proficiency in Java.
    o Expertise in distributed systems, databases, and load balancing.
  • Monitoring & Observability :
    o Proficient with tools like Prometheus, Grafana, Elastic APM, or New Relic.
    o Understanding of metrics-driven approaches for system monitoring and alerting.
  • Automation & CI/CD :
    o Hands-on experience with CI/CD pipelines (e.g., Jenkins, Azure Pipelines).
    o Skilled in automation frameworks and tools for infrastructure and application deployments.
  • Incident Management :
    o Proven track record in handling incidents, post-mortems, and implementing solutions to prevent recurrence.

Leadership & Communication Skills :

  • Strong people management and leadership skills with the ability to inspire and motivate teams.
  • Excellent problem-solving and decision-making skills.
  • Clear and concise communication, with the ability to translate technical concepts for non-technical stakeholders.

Preferred Qualifications :

  • Experience with database optimization, Kafka, or other messaging systems.
  • Knowledge of autoscaling techniques.
  • Previous experience in an SRE, DevOps, or infrastructure engineering leadership role.
  • Understanding of compliance and security best practices in distributed systems.

Why Join Us?

  • Be a key driver in building and scaling reliable systems in a fast-paced environment.
  • Work with cutting-edge technologies and influence the evolution of the infrastructure.
  • Lead a high-impact team and foster a culture of reliability and innovation.