Company Description
ThreatXIntel is a cybersecurity startup focused on delivering customized, affordable solutions that protect businesses and organizations from cyber threats. Our experienced team specializes in cloud security (including cloud security assessments), web and mobile security testing, DevSecOps, and other critical areas. We take a proactive approach, monitoring and testing clients' digital environments to identify vulnerabilities before they can be exploited. We are dedicated to making high-quality cybersecurity accessible to startups and small businesses, so that clients can safeguard their digital assets while they grow their operations.
Role Description
We are seeking a highly skilled Site Reliability Engineer / DevOps Engineer with 6+ years of overall experience and 4+ years of hands-on expertise in cloud infrastructure, automation, and large-scale distributed systems. This is a demanding, hands-on role suited for engineers who thrive in fast-paced environments and are passionate about building reliable, scalable, and automated systems.
You will be responsible for infrastructure design, automation, CI/CD optimization, Kubernetes management, and ensuring high system availability across mission-critical environments.
Key Responsibilities
Design, deploy, and maintain scalable cloud infrastructure on Google Cloud Platform (GCP).
Manage and optimize containerized applications using Kubernetes (GKE preferred).
Automate provisioning, deployments, and configurations using Terraform, Python, and Shell scripting.
Monitor reliability, performance, and system uptime; respond to incidents, outages, and production issues.
Build and maintain CI/CD pipelines to support fast and reliable software delivery.
Ensure strong practices for backup, disaster recovery, and cloud security.
Collaborate closely with development teams to optimize performance, reduce downtime, and improve overall system stability.
Must-Have Skills
4+ years of hands-on experience in SRE, DevOps, or Infrastructure Engineering roles.
Expert-level experience with GCP (IAM, Compute Engine, GKE, Cloud Storage, VPC, Load Balancing, etc.).
Deep understanding of Kubernetes (deployments, scaling, troubleshooting, Helm charts, operators).
Strong scripting skills: Shell and Python.
Proficiency with CI/CD tools, monitoring, and logging systems such as:
GitLab CI / GitHub Actions
Prometheus
Grafana
Stackdriver / Cloud Monitoring
Site Reliability Engineer • Delhi, India