Talent.com
Site Reliability Engineer (SRE DevOps) Engineering Productivity
No longer accepting applications
Arista Networks • Bengaluru, Karnataka, India
30+ days ago
Job description

Who You'll Work With

Arista Networks is looking for a skilled professional for our Engineering Productivity (EngProd) team to help maintain and support our rapidly expanding infrastructure and internal user base. The ideal candidate can wear many hats, is versatile, and is enthusiastic about learning new technologies. As part of the software engineering team, you will work with other team members to design, build, and administer secure, scalable, and fault-tolerant tools and infrastructure in a hybrid cloud environment.

Working in the EngProd group, you will collaborate with other engineers to design, build, scale, and operate the systems used by Arista's product development teams. These systems are based on industry-standard tools, including Ansible, Artifactory, Gerrit, Jenkins, Kubernetes, Grafana, Spinnaker, MySQL, Elasticsearch, Google Cloud, Varnish, and Perforce, along with third-party storage appliances and internal systems developed from the ground up to automate CI/CD, testing, analysis, and visualization.

What You'll Do

  • Build, deploy (safely and incrementally), and operate critical production systems with a focus on scalability, reliability, observability, performance, and security.
  • Monitor, support, and enhance the developer experience across services.
  • Build automation to remove toil and efficiently operate production systems.
  • Proactively monitor, respond to, and enhance alerts, and set up automated alert handling.
  • Create and maintain the incident response runbooks.
  • Build and deploy new systems with scalability, reliability, and observability as primary requirements.
  • Triage platform and infrastructure issues and help Arista software engineers with their own triage. Engage with third-party vendor support.
  • Deploy new systems in a staged manner.
  • Write postmortem documents and build solutions to prevent incidents from recurring.
  • Plan and communicate maintenance windows on production systems.
  • Work with Arista's product development teams to identify infrastructure issues that cause bottlenecks and limitations in their workflows. Design and implement solutions to resolve them.
  • Survey and adopt best practices around infrastructure/platform to maintain secure, scalable, and fault-tolerant systems.
  • Implement solutions to scale the systems.
  • Implement fault-tolerance and performance improvements to increase the availability of the systems.
  • Study the design of OSS systems, and enough of their implementation details, to triage and resolve issues more effectively.


    Qualifications :

    Essential to have all of the following skills

    • At least a BSc in Computer Science or Engineering with 5 years of experience, an MS in Computer Science or Engineering with 5 years of experience, or equivalent work experience.
    • Knowledge of one or more of Go, Python, or shell scripting, sufficient to implement medium-complexity automation workflows.
    • Knowledge of Linux (or UNIX) from an administration and debugging perspective.
    • Hands-on experience operating software systems (infrastructure, complex applications, etc.) at scale.
    • Experience in server provisioning (especially from a storage and networking perspective).
    • Strong problem-solving and software troubleshooting skills.
    • Experience with infrastructure-as-code

    Desirable to have one or more of the following skills

    • Experience managing databases: MariaDB, Postgres, MongoDB, etc.
    • Experience with Docker and virtualization technologies: KVM, QEMU, Kata Containers, etc.
    • Experience managing a monitoring stack: Prometheus, Loki, Tempo, InfluxDB, Grafana, Thanos, etc.
    • Experience managing Elasticsearch clusters
    • Experience managing Artifactory, Docker registries, etc.
    • Experience managing CI/CD systems like ArgoCD, Spinnaker, etc.
    • Experience managing version control systems like Perforce, Gerrit, etc.
    • Experience with infrastructure-as-code frameworks like Ansible
    • Experience managing large Java applications
    • Experience in storage infrastructure management, e.g. NAS, SAN, Ceph, etc.


    Additional Information :

    Arista stands out as an engineering-centric company. Our leadership, including founders and engineering managers, are all engineers who understand sound software engineering principles and the importance of doing things right.

    We hire globally into our diverse team. At Arista, engineers have complete ownership of their projects. Our management structure is flat and streamlined, and software engineering is led by those who understand it best. We prioritize the development and utilization of test automation tools.

    Our engineers have access to every part of the company, providing opportunities to work across various domains. Arista is headquartered in Santa Clara, California, with development offices in Australia, Canada, India, Ireland, and the US. We consider all our R&D centers equal in stature.

    Join us to shape the future of networking and be part of a culture that values invention, quality, respect, and fun.


    Remote Work :

    Yes


    Employment Type :

    Full-time


    Key Skills
    Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting
    Experience: years
    Vacancy: 1