Site Reliability Engineer - Cloud Platforms

LanceSoft, Inc., Pune
19 days ago
Job description

Role and Responsibilities :

Reporting to Engineering, the Site Reliability Engineer will play a critical role in driving innovation and growth for the Banking Solutions, Payments, and Capital Markets business. In this role, the candidate will have the opportunity to make a lasting impact on the company's transformation journey, drive customer-centric innovation and automation, and position the organization as a leader in the competitive banking, payments, and investment landscape.

Specifically, the Site Reliability Engineer will be responsible for the following :

  • Design and maintain monitoring solutions and alerting mechanisms for infrastructure, application performance, and user experience metrics, enabling proactive issue detection and mitigation.
  • Implement automation tools and processes to automate routine tasks, scale infrastructure, and ensure seamless deployments, updates, and rollbacks with minimal user impact.
  • Ensure the reliability, availability, and performance of applications and services, focusing on minimizing downtime, optimizing response times, and maintaining high availability for users.
  • Lead incident response efforts, including identification, triage, resolution, and post-incident analysis, to prevent recurrence and improve system resilience.
  • Conduct capacity planning, performance tuning, and resource optimization for environments, collaborating with development and operations teams to meet scalability and performance goals.
  • Collaborate with security teams to implement security best practices, perform vulnerability assessments, and ensure compliance with security standards and regulatory requirements for applications.
  • Manage deployment pipelines, release processes, and configuration management for app deployments, ensuring consistency, reliability, and version control across environments.
  • Identify areas for improvement in reliability, performance, and efficiency through data analysis, root cause analysis, and trend analysis, and drive initiatives to enhance system reliability and operational efficiency.
  • Create and maintain documentation, runbooks, and knowledge base articles for operational procedures, troubleshooting guides, and best practices, and promote knowledge sharing within the team.
  • Develop and test disaster recovery plans, backup strategies, and failover mechanisms for app services, ensuring business continuity and data integrity in case of failures or disasters.
  • Collaborate with development, QA, DevOps, and product teams to ensure alignment on reliability goals, performance metrics, release schedules, and incident response processes.
  • Participate in on-call rotations and provide 24/7 support for critical incidents, troubleshoot issues, and coordinate with teams for resolution, escalation, and follow-up actions as per defined SLAs.

Professional Qualifications :

  • Proficiency in development technologies, architectures, and platforms (web, API) to understand system complexities and performance considerations.
  • Experience in cloud platforms (e.g., AWS, Azure, Google Cloud) and infrastructure as code (IaC) tools for managing app infrastructure and deployments.
  • Knowledge of monitoring tools (e.g., Prometheus, Grafana, DataDog, New Relic) and logging frameworks (e.g., Splunk, SumoLogic, ELK Stack) for real-time visibility into system health, performance metrics, and user experience.
  • Experience in incident management, including incident response, triage, root cause analysis (RCA), and post-mortem reviews to prevent recurring issues.
  • Strong troubleshooting skills to diagnose complex technical issues in app environments, infrastructure, networking, and performance bottlenecks.
  • Proficiency in scripting languages (e.g., Python, Bash) and automation tools (e.g., Terraform, Ansible) for automating routine tasks, deployments, and infrastructure management.
  • Experience in implementing continuous integration/continuous deployment (CI/CD) pipelines for apps using tools like Jenkins, GitLab CI/CD, or Azure DevOps.
  • Expertise in setting up monitoring solutions, configuring alerts, and creating dashboards to monitor system performance, application metrics, and user experience.
  • Familiarity with APM (Application Performance Monitoring) tools to analyze app performance, identify bottlenecks, and optimize resource utilization.
  • Familiarity with RUM (Real User Monitoring) for tracking and analyzing user interaction and system performance.
  • Commitment to continuous learning, staying updated with industry trends, new technologies, and best practices in app reliability, performance, and operations.
  • Adaptability to evolving requirements, technologies, and business needs, with a focus on driving continuous improvement and operational excellence.

Personal Characteristics :

  • Demonstrates judgment and flexibility; thinks about issues and develops solutions that thoughtfully take the broader context into account - positively deals with a shifting demand for time, priorities, and the rapid change of environments.
  • Takes an ownership approach to engineering and product outcomes.
  • Action-oriented self-starter who can set strategy and drive execution with a roll-up-the-sleeves approach.
  • Excellent interpersonal communication, negotiation and influencing skills to work effectively with all stakeholders (internal & external), making information-based decisions.
  • Penchant for excellence, both personally and professionally, demonstrated by intellectual curiosity, record of accomplishment, and reputation; shows strong attention to detail and implementation of best practices with an inclination for continuous improvement.
  • Ability to quickly establish strong credibility with employees, business partners and external resources.
  • Embodies and delivers the firm's values and culture towards colleagues, clients, and communities :
    o Win as one team
    o Lead with integrity
    o Be the change

    (ref : hirist.tech)
