Talent.com
Principal Engineer, Site Reliability
TMUS Global Solutions • Hyderabad, India
11 hours ago
Job description

About the Role

The Principal Engineer, Site Reliability (SRE) will play a critical role in ensuring the stability, scalability, and operational excellence of Accounting and Finance platforms. This role is focused on leading the operational health of these platforms, ensuring the delivery of highly reliable financial applications and data services that meet the demanding requirements of accuracy, compliance, and availability to support business operations.

As a Principal SRE, you will build automation, implement monitoring, improve incident response, and champion DevOps practices that enable Finance and Accounting systems to operate with consistency and trustworthiness, while also coaching and mentoring junior SREs to ensure overall operational excellence.

What You'll Do

Operational Oversight: Own day-to-day operations for Accounting and Finance applications and data platforms, ensuring they run smoothly and meet business expectations.

Reliability & Availability: Ensure Accounting and Finance platforms meet defined SLAs, SLOs, and SLIs for performance, reliability, and uptime.

Automation & Efficiency: Build automation for deployments, monitoring, scaling, and self-healing capabilities to reduce manual effort and operational risk.

Observability & Monitoring: Implement and maintain comprehensive monitoring, alerting, and logging for accounting applications and data pipelines (e.g., Snowflake, dbt workflows, ERP integrations).

Incident Response: Lead and participate in on-call rotations, perform root cause analysis, and drive improvements to prevent recurrence of production issues.

Operational Excellence: Establish and enforce best practices for capacity planning, performance tuning, disaster recovery, and compliance controls in financial systems.

Collaboration with Engineering & Finance: Partner with software engineers, data engineers, and Finance/Accounting teams to ensure operational needs are met from development through production.

Team Coordination: Manage workload, priorities, and escalations for operations staff and partner teams, ensuring alignment with SLAs and compliance requirements.

Security & Compliance: Ensure financial applications and data pipelines meet audit, compliance, and security requirements.

Continuous Improvement: Drive post-incident reviews, implement lessons learned, and proactively identify opportunities to improve system resilience.

Audit & Compliance Support: Ensure operational practices meet internal controls, audit requirements, and financial compliance standards.

What You'll Bring

Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field (or equivalent experience).

7-12 years of experience in Site Reliability Engineering, DevOps, or Production Engineering, ideally supporting financial or mission-critical applications.

Strong experience with monitoring/observability tools (Datadog, Prometheus, Grafana, Splunk, or equivalent).

Hands-on expertise with CI/CD pipelines, automation frameworks, and IaC tools (Terraform, Ansible, GitHub Actions, Azure DevOps, etc.).

Familiarity with Snowflake, dbt, and financial system integrations from an operational support perspective.

Strong scripting/programming experience (Python, Bash, Go, or similar) for automation and tooling.

Proven ability to manage incident response and conduct blameless postmortems.

Experience ensuring compliance, security, and audit-readiness in enterprise applications.

Must Have Skills

SRE

SQL

Snowflake OR Databricks

DevOps OR CI/CD OR GitHub Actions

Monitoring/observability tools (Datadog, Prometheus, Grafana, Splunk, or equivalent)

Automation

Nice To Have

Experience supporting financial applications (ERP, revenue recognition systems, accounting platforms).

Exposure to FinOps practices for optimizing cloud spend in finance-related platforms.

Familiarity with containers and orchestration (Docker, Kubernetes).

Experience building resilience into data pipelines and ensuring auditability for accounting data.

Strong communication skills to articulate operational issues and risks to both technical and non-technical stakeholders.
