We seek sharp thinkers who design scalable systems while keeping a startup mindset. Our culture values fast, data-driven innovation. The Observability team needs experienced engineers who are skilled in cloud-native design, legacy maintenance, and SRE best practices, and who bring ideas for improvement. We collaborate across Tech to ensure platforms and services are production-ready, contributing to both platform and software codebases.
Responsibilities:
- As a DevOps Engineer, you will help grow our systems into best-in-class for efficiency, stability, observability, velocity, and scale in the e-commerce space. Engage with the product and engineering team from Day 1 to proactively design, build, and maintain systems and software.
- Influence the design and architecture of Wayfair systems as part of the Cloud Enablement journey while maintaining our critical legacy tools. Collaborate with development teams to design scalable, reliable systems, considering fault tolerance, availability, and performance.
- Work with both software engineers and platform engineers to develop and optimize repeatable systems that let the two groups leverage each other's work.
- Take on a wide range of opportunities to both guide the broad conversation and dive into the nuances of our code and architecture.
- Help service owners build realistic SLOs, set SLAs and error budgets, and ensure production services have reliability built into their design.
- Provide production support beyond the self-healing and automation you build, and creatively solve challenging engineering problems across our stack.
- Participate in a shared on-call schedule managed across Engineering.
- Automate repetitive tasks to increase efficiency and reduce human error.
- Mentor new hires and other engineers through example, tech talks, pair programming, and other avenues to increase technical efficiency across the organization.
Requirements:
- 6+ years of experience working in a DevOps or SRE role, or in software development, with an understanding of cloud infrastructure.
- Experience with cloud platforms (GCP, AWS, Azure) and containerization technologies (e.g., Docker, Kubernetes).
- Experience with server-side software engineering (Python, Go, Java, Bash, etc.).
- Design experience with distributed systems, microservices architecture, and related technologies.
- Strong understanding of monitoring and alerting, with a focus on performance monitoring, tracing instrumentation, and SLIs / SLOs / SLAs.
- Experience with decoupling monolith services is a plus.
- Knowledge of CI/CD pipelines and version control systems (e.g., Git).
- Excellent communication skills with engineers, product managers, and business stakeholders alike.
- Knowledge of configuration management tools (e.g., Puppet, Ansible, Chef, Terraform).
Nice To Have:
- Passion for leading a large, cross-cutting technical initiative to delivery, building cross-functional consensus, and influencing design decisions.
- Ample experience gathering and balancing requirements from technical and business stakeholders, and reaching consensus on prioritization.
- Experience mentoring engineers and leading code reviews.