Talent.com
Systems Reliability & Performance Engineer

Sails Software Inc
Visakhapatnam, Republic of India, IN
1 day ago
Job description

SRE - AWS

Job Summary

We are looking for an experienced and driven Senior Site Reliability Engineer (SRE) to architect, implement, and maintain robust cloud infrastructure. This role demands a deep understanding of AWS, Kubernetes, ECS, and the ability to build scalable, secure, and highly available infrastructure from scratch. The ideal candidate will be a strong advocate for DevOps principles, automation, and reliability, and will possess the skills to support and optimize complex microservices-based architectures.

Key Responsibilities

Infrastructure Design & Implementation

  • Design and build highly scalable, fault-tolerant, and secure cloud infrastructure using AWS, Kubernetes, and ECS.
  • Lead efforts in infrastructure as code (IaC) using tools like Terraform or CloudFormation.
  • Develop and enforce best practices for infrastructure provisioning, security, and cost optimization.

System Reliability & Performance

  • Ensure availability, performance, scalability, and security of production systems.
  • Implement observability strategies including monitoring, logging, and alerting using tools such as Prometheus, Grafana, ELK, or Datadog.
  • Analyse system performance metrics and proactively identify potential issues and bottlenecks.

DevOps & Automation

  • Build and maintain CI/CD pipelines to streamline code deployments across environments.
  • Drive automation in infrastructure provisioning, configuration management, and operational tasks.
  • Ensure repeatable and reliable deployments using containers and orchestration tools like Kubernetes and ECS.

Service Management

  • Own the SRE lifecycle, including incident management, postmortems, root cause analysis, and runbook creation.
  • Collaborate closely with development and QA teams to ensure seamless microservices integration, deployment, and lifecycle management.
  • Maintain service-level objectives (SLOs), service-level agreements (SLAs), and error budgets.

Security & Compliance

  • Implement and enforce cloud security best practices for networking, identity and access management, and data protection.
  • Support audits, compliance assessments, and vulnerability remediation.
  • Monitor for security anomalies and work with security teams to respond to threats.

Technical Skills

  • 6+ years of hands-on experience in Site Reliability Engineering, DevOps, or Cloud Engineering.
  • Expertise in AWS services such as EC2, S3, RDS, IAM, VPC, Lambda, and CloudWatch.
  • Strong knowledge of Kubernetes and container orchestration best practices.
  • Experience managing services on Amazon ECS (Fargate or EC2).
  • Proficient in infrastructure-as-code tools like Terraform, CloudFormation, or Pulumi.
  • Skilled in scripting languages such as Python, Bash, or Go.
  • Solid grasp of networking, load balancing, DNS, and firewall rules in cloud environments.
  • Deep understanding of microservices architectures, API gateways, and service meshes.

Soft Skills

  • Proven leadership and cross-functional collaboration skills.
  • Strong problem-solving and incident-resolution mindset.
  • Clear communication, documentation, and stakeholder reporting abilities.
  • Passion for continuous improvement and automation.

Preferred Qualifications

  • AWS certifications such as AWS Certified DevOps Engineer, Solutions Architect – Professional, or equivalent.
  • Familiarity with service meshes like Istio or Linkerd.
  • Experience with serverless architectures and event-driven systems.
  • Knowledge of regulatory compliance (SOC 2, ISO 27001, GDPR) in cloud environments.

Skills – AWS Cloud, CI/CD, EC2, Kubernetes, Grafana, Datadog, Python

Key Responsibilities

Cloud Platform: GCP

  • Infrastructure Automation: Design, implement, and manage infrastructure as code using Terraform to provision and manage GCP resources.
  • Container Orchestration: Deploy and manage Kubernetes clusters, ensuring efficient operation of containerized applications.
  • Continuous Integration / Continuous Deployment (CI/CD): Develop and maintain CI/CD pipelines using Jenkins to automate application build, test, and deployment processes.
  • Containerization: Collaborate with development teams to containerize applications using Docker and manage deployments with Helm Charts.
  • Code Quality Assurance: Integrate and manage SonarQube to ensure code quality and security standards are met.
  • Monitoring and Logging: Implement and manage monitoring solutions using Datadog to ensure system health, performance, and security.
  • Collaboration: Work closely with cross-functional teams, including developers, QA, and operations, to streamline processes and improve productivity.

Requirements

  • Experience: 5+ years in DevOps or cloud engineering roles, with at least 3 years of relevant experience in the specified technologies.
  • Technical Proficiency:
    o Hands-on experience with GCP services and architecture.
    o Proficiency in Terraform for infrastructure as code implementations.
    o Strong understanding and experience with Kubernetes and Docker.
    o Experience in setting up and managing CI/CD pipelines using Jenkins.
    o Familiarity with Helm Charts for application deployment.
    o Experience with SonarQube for code quality analysis.
    o Proficiency in monitoring and logging tools, particularly Datadog.
  • Scripting Skills: Proficiency in scripting languages such as Bash or Python is an added advantage.
  • Strong problem-solving abilities and analytical thinking.
  • Excellent communication skills, both verbal and written.
  • Ability to work collaboratively in a team environment.
  • Strong organizational and time management skills.

Skills – Terraform, Kubernetes, Cluster, Docker, GCP, Sonar
