Job description

We’re looking for a Senior DevOps Engineer to join our Site Reliability Engineering (SRE) team in Noida.

Working at Taazaa involves engaging with cutting-edge technology and innovative software solutions in a collaborative environment. We emphasize continuous professional growth, offering workshops and training. Our employees often interact with clients to tailor solutions to business needs, working on diverse projects across industries. We promote work-life balance with flexible hours and remote options, fostering a supportive and inclusive culture. Competitive salaries, health benefits, and various perks further enhance the work experience.

Looking ahead, we aim to expand our technological capabilities and market reach, investing in advanced technologies and expanding our service offerings. We plan to deepen our expertise in AI and machine learning, enhance our cloud services, and continue fostering a culture of innovation and excellence. Taazaa is committed to staying at the forefront of technology trends, ensuring it delivers impactful and transformative solutions for its clients.

As a Site Reliability Engineer (SRE), you will play a pivotal role in the design, implementation, and maintenance of the infrastructure that supports our software development lifecycle. You will work closely with software engineers, QA, and IT teams to ensure the availability, reliability, and performance of our systems.
Your primary focus will be on streamlining our deployment processes, improving system scalability, and ensuring a robust, secure, and cost-efficient infrastructure.

What you’ll do:
- Partner with product engineering squads to design, build, and operate highly reliable services
- Own and improve production reliability end-to-end:
  - Define and measure SLOs/SLIs, error budgets, and reliability goals
  - Lead incident response, postmortems, and follow-up action items
  - Participate in on-call rotation and drive rapid, effective resolution of production issues
- Build and maintain world-class observability:
  - Create comprehensive dashboards, alerts, metrics, structured logging, and distributed tracing
  - Enable squads to understand system behavior and debug effectively
- Develop automation, tooling, and infrastructure as code to reduce toil and increase developer velocity
- Collaborate closely with Staff Engineers / Team Leads to:
  - Embed reliability best practices into the development lifecycle
  - Review architectural decisions with a production lens
  - Mentor engineers on operational excellence, observability, and on-call mindset
- Champion modern engineering and DevOps practices:
  - CI/CD pipelines
  - Progressive delivery (feature flags, canaries, blue-green)
  - Infrastructure as code (Terraform, Pulumi, CDK)
  - Effective use of AI-assisted tools to accelerate scripting, debugging, and documentation
- Proactively identify and eliminate classes of failure through chaos engineering, capacity planning, and performance tuning
- Help evolve our technical strategy for reliability, scalability, and cost-efficiency

Your qualifications

Technical:
- 5+ years of professional experience in SRE, DevOps, or software engineering with a strong focus on production systems
- Deep hands-on experience operating distributed cloud systems (AWS, GCP, or Azure; at least one in depth, preferably AWS)
- Proficiency in at least one modern programming language used for tooling and automation (Go, Python, TypeScript/JavaScript, Rust)
- Strong observability expertise:
  - Building dashboards and alerts (Grafana, Groundcover, Datadog, New Relic, Prometheus, etc.)
  - Distributed tracing (OpenTelemetry, Jaeger, Zipkin)
  - Structured logging and metrics at scale
- Proven track record of incident management, postmortems, and driving reliability improvements
- Experience defining and working with SLOs, SLIs, and error budgets
- Comfort with infrastructure as code and modern DevOps practices (CI/CD, GitOps, containers/Kubernetes)
- Excellent collaboration skills: you enjoy partnering with product engineers and teaching reliability concepts
- Bias toward automation and reducing manual toil

Nice-to-haves:
- Previous on-call leadership or incident commander experience
- Background in performance engineering or capacity planning at scale
- Familiarity with service meshes, API gateways, or zero-trust networking
- Contributions to open-source reliability/observability tools
- Experience mentoring or embedding within product squads

Behavioral:
Four essential behavioral skills the candidate should possess:
- Effective Communication: Clearly and concisely convey ideas, requirements, and feedback to team members, stakeholders, and clients, fostering an environment of open dialogue and mutual understanding.
- Problem-Solving Attitude: Approach challenges with a proactive mindset, quickly identifying issues and developing innovative solutions to overcome obstacles.
- Collaboration and Teamwork: Work well within a team, encouraging collaboration and valuing diverse perspectives to achieve common goals and deliver high-quality results.
- Adaptability and Flexibility: Stay adaptable in a fast-paced, dynamic environment, effectively managing changing priorities and requirements while maintaining focus on project objectives.

What you’ll get in return:
Joining Taazaa Tech means thriving in a dynamic, innovative environment with competitive compensation and performance-based incentives.
You'll have ample opportunities for professional growth through workshops and certifications, while enjoying a flexible work-life balance with remote options. Our collaborative culture fosters creativity and exposes you to diverse projects across various industries. We offer clear career advancement pathways, comprehensive health benefits, and perks like team-building activities.

Who we are:
Taazaa Tech is a kaleidoscope of innovation, where every idea is a brushstroke on the canvas of tomorrow. It's a symphony of talent, where creativity dances with technology to orchestrate solutions beyond imagination. In this vibrant ecosystem, challenges are sparks igniting the flames of innovation, propelling us towards new horizons.

Welcome to Taazaa, where we sculpt the future with passion, purpose, and boundless creativity.