We are seeking a highly skilled Site Reliability Engineer (SRE) / DevOps Engineer with a strong background in cloud infrastructure, automation, and large-scale system operations. In this role, you will partner across engineering teams to enhance platform reliability, accelerate delivery, and ensure a world-class customer experience.

Key Responsibilities

• Drive initiatives that enhance operational efficiency, scalability, and overall platform reliability.
• Lead standardization efforts across services and disciplines in collaboration with embedded SRE teams.
• Identify and implement automation opportunities for deployment, infrastructure management, and observability.
• Apply modern security practices to ensure secure cloud-based infrastructure and software systems.
• Perform full-stack diagnostics to determine and resolve root cause issues.
• Analyze system performance and drive improvements to key operational metrics and KPIs.
• Proactively assess infrastructure and applications for enhancements rather than waiting for direction.
• Safeguard application data from unauthorized access, modification, or disclosure.
• Build and maintain high-availability, redundant systems and disaster recovery procedures.
• Develop integrated workflows to support internal support teams and cross-functional partners.
• Own the customer experience: ensure seamless digital interactions and promote user satisfaction.
• Respond to incidents promptly and support troubleshooting efforts across the stack.

Required Skills & Competencies

Cloud & Infrastructure

• 1+ years working with Infrastructure as Code (IaC) and DSC tools: Terraform, CDK, Chef.
• 1+ years deploying and managing containerized workloads with Docker and Kubernetes.
• 1+ years managing AWS infrastructure at scale: EC2, S3, ELB, Lambda, Route 53, ECS, SQS, CloudWatch.
• Prior experience in a DevOps or SRE environment.

Automation & Scripting

• Strong automation background with scripting languages including PowerShell, Ruby, Go, Python, Bash.

Monitoring & Troubleshooting

• Experience with large-scale monitoring and APM tools: ELK Stack, Dynatrace, New Relic, Nagios.
• Skilled in IIS management, troubleshooting, and performance monitoring.
• Experience supporting web farms in high-traffic SaaS environments.
• Strong analytical, diagnostic, and problem-solving skills with a focus on proactive improvement.

Application & CI/CD

• Extensive experience with .NET application architecture (caching, CDNs, load balancing, HA).
• Clear understanding of SDLC processes and hands-on experience with CI/CD tools such as TeamCity, Octopus Deploy, GitHub, Jenkins, Codefresh.

Additional Technologies (Preferred)

• Active Directory, SSL, FTP, Big-IP F5
• T-SQL, MongoDB, MySQL, SQL Server
• Git, Chef, Salt
• Kafka
• Linux / Windows Server Administration
• Apache, Bash

Engineer • Cannanore, Kerala, India