Engage with our product teams to understand requirements, then design and implement resilient and scalable infrastructure solutions
Operate, monitor, and triage all aspects of our production and non-production environments
Collaborate with other engineers on code, infrastructure, design reviews, and process enhancements.
Evaluate and integrate new technologies to improve system reliability, security, and performance
Develop and implement automation to provision, configure, deploy, and monitor Apple services
Participate in an on-call rotation providing hands-on technical expertise during service-impacting events
Design, build, and maintain highly available and scalable infrastructure
Implement and improve monitoring, alerting, and incident response systems
Automate operations tasks and develop efficient workflows
Conduct system performance analysis and optimization
Collaborate with development teams to ensure smooth deployment and release processes
Implement and maintain security best practices and compliance standards
Troubleshoot and resolve system and application issues
Participate in capacity planning and scaling efforts
Stay up-to-date with the latest trends, technologies, and advancements in SRE practices
Contribute to capacity planning, scale testing, and disaster recovery exercises.
Approach operational problems with a software engineering mindset
BS degree in computer science or equivalent field with 5+ years of experience
5+ years in an Infrastructure Ops, Site Reliability Engineering, or DevOps-focused role.
Knowledge of Linux operating system principles, networking fundamentals, and systems management.
Demonstrable fluency in at least one of the following languages: Java, Python, or Go
Experience managing and scaling distributed systems in a public, private, or hybrid cloud environment
Develop and implement automation tools and apply best practices for system reliability.
You will be responsible for the availability and scalability of our services and will manage disaster recovery and other operational tasks.
Collaborate with the development team to improve logging, metrics, and traces in the application codebase for better observability.
Collaborate with data science teams and other business units to design, build and maintain the infrastructure that runs machine learning and generative AI workloads.
Influence architectural decisions with focus on security, scalability and performance.
Find and fix problems in production, and work to prevent them from recurring
Preferred Qualifications:
Familiarity with microservices architecture and container orchestration with Kubernetes.
Awareness of key security principles, including encryption and keys (types and exchange protocols).
Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
Strong sense of ownership, with a desire to communicate and collaborate with other engineers and teams.
Ability to identify and communicate technical and architectural problems, while working with partners and their team to iteratively find solutions.
(ref: hirist.tech)