Infrastructure Reliability Engineer

CitNOW Group, Republic of India, IN
1 day ago
Job description

About us

Founded in 2008, CitNOW is an innovative, enterprise-level software product suite that allows automotive dealerships globally to sell more vehicles and parts more profitably. CitNOW’s app-based platform provides a secure, brand-compliant solution for dealers to build trust, transparency and long-lasting relationships.

CitNOW Group was formed in 2021 to unite a portfolio of 12 global software companies leveraging innovation to aid retailers and manufacturers in delivering an outstanding customer experience. We have over 300 employees worldwide who all contribute to our vision to provide market-leading automotive solutions to drive efficiencies, seamlessly transforming every customer moment.

The CitNOW Group is no ordinary technology company. We live by a series of One Team values, and this guiding principle forms the foundation of CitNOW Group’s award-winning, collaborative and inclusive culture. Recognised recently among the Top 25 Best Mid Sized Companies to work for in the UK, we pride ourselves on being a great place to work.

About the role

We are looking for a proactive and experienced Site Reliability Engineer (SRE) to join our Engineering team remotely in India. The ideal candidate will have deep expertise in cloud operations, automation, monitoring, and reliability engineering, with hands-on experience managing a wide range of SaaS and infrastructure tools. The role focuses on ensuring system uptime, performance, and scalability across our global platform.

Key responsibilities:

Reliability & Infrastructure Management

  • Design, implement, and manage scalable cloud infrastructure on Google Cloud (GCP) and AWS
  • Manage integrations and operations across third-party platforms including MongoDB Atlas, Cloudflare, Stripe, Cledara, Datadog, Atlassian Statuspage, Semaphore, Postmark, SendGrid, Lokalise, Zendesk (Smooch & Smooch EU), Twilio, Mailgun, Facebook, Google Workspace, Asana, GitHub, Ngrok, npm, Readme, Loom, Deepgram, and OpenAI
  • Implement Infrastructure as Code (IaC) using tools like Terraform or Ansible to automate provisioning and scaling
  • Ensure systems adhere to security, compliance, and reliability best practices

Monitoring, Alerting & Incident Management

  • Build and maintain observability solutions using Datadog, GCP Logging, and related tools for monitoring system health, latency, and performance
  • Define and manage SLOs, SLIs, and SLAs to measure and maintain reliability
  • Implement proactive alerting, diagnostics, and runbooks for efficient incident response
  • Participate in on-call rotations and lead root cause analyses (RCA) for post-incident reviews

Automation & CI/CD

  • Design and optimize CI/CD pipelines using Semaphore CI/CD, GitHub Actions, or similar tools
  • Develop automation scripts and utilities in Python, Bash, or equivalent scripting languages to streamline operations and reduce manual interventions
  • Integrate and automate workflows between systems such as Asana, GitHub, and Google Workspace for operational efficiency

Security & Governance

  • Manage identity and access controls across cloud services and third-party SaaS platforms
  • Implement best practices for secrets management, data protection, and compliance with privacy standards

Collaboration & Continuous Improvement

  • Partner closely with developers to design resilient, high-performing services
  • Promote an SRE culture focused on continuous learning, blameless postmortems, and process improvement
  • Maintain up-to-date operational documentation, playbooks, and architectural diagrams

We are looking for:

  • Bachelor’s degree in Computer Science, Engineering, or a related field
  • 4+ years of experience in Site Reliability Engineering, DevOps, or Cloud Operations
  • Strong experience with Google Cloud Platform (GCP), Amazon Web Services (AWS), and MongoDB Atlas
  • Proven ability to manage and integrate multiple SaaS and developer tools (Datadog, Cloudflare, Atlassian Statuspage, Semaphore, SendGrid, etc.)
  • Hands-on experience with CI/CD pipelines, Terraform, GitHub Actions, and containerized environments (Docker, GCP Cloud Run, or Kubernetes)
  • Expertise in monitoring, incident response, and system optimization
  • Excellent troubleshooting, documentation, and communication skills
  • Strong collaboration mindset aligned with cross-functional development and operations teams

In addition to a competitive salary, our benefits package is second to none. Employee wellbeing is at the heart of our people strategy, with a number of innovative wellness initiatives such as flexi-time, where employees can vary their start and finish times within our core business hours and/or extend their lunch break by up to 2 hours per day. Employees also benefit from an additional two half days of paid leave per year to focus on their personal wellbeing.

We recognise that the development of our people is vital to the ongoing success of the business, and we proudly promote a culture of continuous learning and improvement, along with opportunities to develop and progress a successful career with us.

The CitNOW Group is an equal opportunities employer that celebrates diversity across our international teams. We are passionate about creating an inclusive workplace where everyone’s individuality is valued.

View our candidate privacy policy here - CitNOW-Group-Candidate-Privacy-Policy.pdf (citnowgroup.com)
