Talent.com
Site Reliability Engineering Manager

Epsilon • Bengaluru, Karnataka, India
30+ days ago
Job description

About the Business Unit:

SaaSOps leads post-production support and the overall experience of Epsilon PeopleCloud products for our global clients. This function is responsible for product support, incident management, managed operations and the automation of processes. The team has successfully incubated and mainstreamed Site Reliability Engineering (SRE) as a practice to ensure reliable product operations on a global scale. The team is also actively leading the adoption of AI in operations (AIOps) and recently launched AI-driven self-service capabilities to enhance operational efficiency and improve client experiences.

About the Role

  • This is a senior individual contributor (IC) role responsible for driving strong operations engineering practices in SaaS product operations.
  • The role will drive incident triage practices, implement effective monitoring and observability tools, and help build SRE competence in the team.
  • The role will work closely with the product operations team to investigate and identify the root causes of production issues, and work with the relevant teams to deliver permanent fixes for recurring issues.
  • The role will identify automation opportunities to streamline repetitive tasks.
  • The role will contribute to the evolution of the AIOps strategy by identifying use cases and developing AI / agentic autonomous solutions.

What you’ll need

  • 15+ years of hands-on experience in SRE
  • The candidate will be a hands-on technology leader with proven experience working as an SRE leader in a SaaS product setup.
  • The candidate should have a deep understanding of monitoring tools (New Relic, Prometheus) and observability practices.
  • Prior experience working with ServiceNow, JIRA, Bitbucket and Confluence is required.
  • The candidate should be proficient at designing effective Ops dashboards, especially for peak traffic events in a SaaS environment.
  • The candidate should have prior experience handling communications with leadership across an organization for peak traffic events.
  • The ideal candidate should have a strong full-stack engineering background with Cloud Engineering, L1-L3 Operations and AI / Gen AI experience.
  • Must have strong development skills in at least two of Python, Java and C#; strong database skills (RDBMS, NoSQL, cloud databases); and experience with containers / orchestration and cloud infrastructure.
  • Highly proficient in at least one hyperscaler cloud (AWS, GCP, Azure)
  • Demonstrated real-world experience deploying traditional ML and Gen AI use cases in production
  • The candidate should have experience working closely with Engineering and Operations teams, and must have strong DevOps, incident management, release management and change management experience.
  • Prior experience with at least one AIOps solution preferred.
  • Must have proven skills in collaboration and getting things done
  • ITIL certification and experience working in an ITIL environment will be a plus.

Additional Information

Epsilon is a global data, technology and services company that powers the marketing and advertising ecosystem. For decades, we’ve provided marketers from the world’s leading brands the data, technology and services they need to engage consumers with 1 View, 1 Vision and 1 Voice. 1 View of their universe of potential buyers. 1 Vision for engaging each individual. And 1 Voice to harmonize engagement across paid, owned and earned channels.

Epsilon’s comprehensive portfolio of capabilities across our suite of digital media, messaging and loyalty solutions bridges the divide between marketing and advertising technology. We process 400+ billion consumer actions every single day using advanced AI and hold many patents of proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Epsilon is a global company with more than 9,000 employees around the world.

Epsilon has a core set of 5 values that define our culture and guide us to bring value for our clients, our people and consumers. We are seeking candidates who align with our values, demonstrate them and make them meaningful in their day-to-day work:

  • Act with integrity. We are transparent and have the courage to do the right thing.
  • Work together to win together. We believe collaboration is the catalyst that unlocks our full potential.
  • Innovate with purpose. We shape the market with big ideas that drive big outcomes.
  • Respect all voices. We embrace differences and foster a culture of connection and belonging.
  • Empower with accountability. We trust each other to own and deliver on common goals.

Because You Matter

YOUniverse. A work-world with you at the heart of it!

At Epsilon, we believe people make the place. And everything we do is designed with you in mind. That’s why our work-world, aptly named ‘YOUniverse’, is passionate about crafting a nurturing environment that elevates your growth, wellbeing and work-life harmony. So, come be part of a people-centric workspace where care for you is at the core of all we do.

Take a trip to YOUniverse and explore our outstanding benefits.

Epsilon is an Equal Opportunity Employer.

Epsilon is committed to promoting diversity, inclusion, and equal employment opportunities by using reasonable efforts to attract, recruit, engage and retain qualified individuals of all ethnicities and backgrounds, including, but not limited to, women, people of color, LGBTQ individuals, people with disabilities and any other underrepresented groups, traits or characteristics.
