Company Description
ThreatXIntel is a startup cyber security company focused on delivering customized, affordable solutions to protect businesses and organizations from cyber threats. Our experienced team specializes in cloud security, web and mobile security testing, cloud security assessment, DevSecOps, and other critical areas. By taking a proactive approach, we monitor and test clients' digital environments to identify vulnerabilities before they can be exploited. Dedicated to making high-quality cybersecurity solutions accessible to startups and small businesses, we enable clients to safeguard their digital assets while they grow their operations.
Role Description
We are seeking a highly skilled Site Reliability Engineer / DevOps Engineer with 6+ years of overall experience and 4+ years of hands-on expertise in cloud infrastructure, automation, and large-scale distributed systems. This is a demanding, hands-on role suited for engineers who thrive in fast-paced environments and are passionate about building reliable, scalable, and automated systems.
You will be responsible for infrastructure design, automation, CI/CD optimization, Kubernetes management, and ensuring high system availability across mission-critical environments.
Key Responsibilities
Design, deploy, and maintain scalable cloud infrastructure on Google Cloud Platform (GCP).
Manage and optimize containerized applications using Kubernetes (GKE preferred).
Automate provisioning, deployments, and configurations using Terraform, Python, and Shell scripting.
Monitor reliability, performance, and system uptime; respond to incidents, outages, and production issues.
Build and maintain CI/CD pipelines to support fast and reliable software delivery.
Ensure strong practices for backup, disaster recovery, and cloud security.
Collaborate closely with development teams to optimize performance, reduce downtime, and improve overall system stability.
Must-Have Skills
4+ years of hands-on experience in SRE, DevOps, or Infrastructure Engineering roles.
Expert-level experience with GCP (IAM, Compute Engine, GKE, Cloud Storage, VPC, Load Balancing, etc.).
Deep understanding of Kubernetes (deployments, scaling, troubleshooting, Helm charts, operators).
Strong scripting skills: Shell and Python.
Proficiency with CI/CD tools, monitoring, and logging systems such as:
GitLab CI / GitHub Actions
Prometheus
Grafana
Stackdriver / Cloud Monitoring
Site Reliability Engineer • Nellore, Andhra Pradesh, India