About GlobalFoundries
GlobalFoundries is a leading full-service semiconductor foundry providing a unique combination of design, development, and fabrication services to some of the world's most inspired technology companies. With a global manufacturing footprint spanning three continents, GlobalFoundries makes possible the technologies and systems that transform industries and give customers the power to shape their markets. For more information visit .
Overview
We are seeking a forward-thinking Cloud Ops Observability Lead to drive the strategy, implementation, and continuous improvement of observability across our cloud environments (AWS and Azure). This role will lead efforts to ensure our systems are measurable, reliable, and transparent, enabling proactive operations and rapid incident response.
Key Responsibilities
Lead the design and implementation of observability frameworks, including logging, metrics, tracing, and event correlation.
Own the strategy and tooling for cloud-native monitoring across AWS and Azure, integrating with operational workflows.
Collaborate with Cloud Engineering, Platform, and Security teams to ensure observability is embedded in infrastructure and applications.
Establish and maintain SLOs, SLIs, and error budgets to drive reliability and performance improvements.
Drive incident response readiness, including alerting strategies, runbooks, and post-incident analysis.
Champion a culture of proactive operations, using data to identify trends, prevent outages, and optimize performance.
Required Qualifications
5 years of experience in cloud operations, site reliability engineering, or observability roles.
Strong expertise in monitoring and observability tools (e.g., Datadog, Prometheus, Grafana, CloudWatch, Azure Monitor).
Deep understanding of AWS and Azure architectures, including networking, compute, and managed services.
Experience with SRE principles, incident management, and operational analytics.
Proficiency in scripting and automation (e.g., Python, PowerShell, Bash).
Strong communication and stakeholder engagement skills.
Preferred Qualifications
Experience implementing OpenTelemetry, distributed tracing, and log aggregation pipelines.
Familiarity with AIOps, anomaly detection, and predictive analytics.
Exposure to FinOps and cost-aware observability practices.
Experience with chaos engineering and resilience testing.
Why Join Us
Lead a critical function that directly impacts system reliability and customer experience.
Work with cutting-edge cloud technologies in a collaborative, high-impact environment.
Influence enterprise-wide observability strategy and tooling.
GlobalFoundries is an equal opportunity employer, cultivating a diverse and inclusive workforce. We believe having a multicultural workplace enhances productivity, efficiency, and innovation, whilst our employees feel truly respected, valued, and heard. As an affirmative employer, we consider all qualified applicants for employment regardless of age, ethnicity, marital status, citizenship, race, religion, political affiliation, gender, sexual orientation, and medical and/or physical abilities. All offers of employment with GlobalFoundries are conditioned upon the successful completion of background checks and medical screenings, as applicable, and are subject to the respective local laws and regulations.
You can find information about our benefits here:
Key Skills
Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting
Employment Type : Full-Time
Experience : years
Vacancy : 1
Site Reliability Engineer • Bengaluru, Karnataka, India