Lead Site Reliability Engineer

Confidential • Chennai, India
6 days ago
Job description

Join our software, system, and test engineering group as a Lead Site Reliability Engineer focused on designing and managing AWS infrastructure, automating CI/CD pipelines, and ensuring scalable, reliable deployments.

You will leverage your extensive experience to enhance automation, optimize resource usage, and collaborate with teams to support client needs. Apply now to contribute your expertise in site reliability engineering and cloud infrastructure management.

Responsibilities

  • Design and manage AWS infrastructure using CloudFormation templates, stacks, and stack sets
  • Build and maintain CI/CD pipelines with Jenkins integrated with Git or Bitbucket
  • Automate infrastructure provisioning and configuration using Python, Terraform, and Ansible
  • Implement and manage monitoring and logging solutions with Amazon CloudWatch and CloudTrail
  • Manage environment-specific IAM roles, policies, and permissions to ensure compliant access controls
  • Optimize AWS resource usage and cost through automation and tagging strategies
  • Troubleshoot and resolve issues related to AWS services, CI/CD pipelines, and automation scripts
  • Collaborate with development and operations teams to ensure reliable deployments and infrastructure scalability
  • Maintain documentation for infrastructure, automation workflows, and operational procedures

Requirements

  • Over 10 years of experience in DevOps, cloud engineering, or site reliability engineering roles
  • Strong hands-on experience with AWS services such as EC2, S3, IAM, VPC, RDS, CloudWatch, CloudTrail, ECS, and Fargate
  • Proficiency in infrastructure as code using CloudFormation, Terraform, and Ansible
  • Solid understanding of Git workflows, branching strategies, and code merge practices
  • Experience with Python scripting for automation and integration tasks
  • Expertise in Jenkins for pipeline creation, orchestration, and deployment automation
  • Familiarity with REST APIs and JSON/YAML for configuration and integration
  • Strong troubleshooting skills across AWS infrastructure and DevOps toolchains
  • Fluent English communication and documentation skills

Skills Required

YAML, CloudFormation, Amazon CloudWatch, JSON, Jenkins, Git, Bitbucket, Terraform, Ansible, REST APIs, Python, AWS
