At Nebula Tech Solutions, we’re expanding our global reliability engineering team to support mission-critical systems for our US-based enterprise clients. This is a night-shift-only position.
We’re looking for experienced DevOps / SRE professionals (5+ years) with hands-on depth in Kubernetes, monitoring and metrics, and coding, not just infrastructure management.
This is a role for engineers who thrive on troubleshooting, automation, and continuous improvement in high-availability environments.
What You’ll Do
✅ Build, optimize, and maintain Kubernetes clusters (EKS / GKE / AKS) for scalability and resilience
✅ Design and improve CI / CD pipelines (Jenkins, ArgoCD, FluxCD, Harness, GitHub Actions)
✅ Implement and extend observability using Prometheus, Grafana, OpenTelemetry, and custom metrics
✅ Develop and maintain internal tools and automations using Python, Go, or similar programming languages
✅ Drive incident response, RCA, and reliability improvements across services
✅ Collaborate with global teams to ensure continuous uptime and performance
What We’re Looking For
5+ years of DevOps / SRE / Platform Engineering experience
Deep, hands-on knowledge of Kubernetes architecture, deployments, debugging, and scaling
Strong programming or scripting skills in Python, Go, Java, or Node.js (beyond shell scripting)
Proven experience with monitoring and telemetry systems (Prometheus, Grafana, ELK, OpenTelemetry)
Understanding of web services, REST APIs, and distributed systems troubleshooting
Familiarity with Terraform, Helm, and GitOps workflows (FluxCD / ArgoCD)
Bonus Points
Location: Remote (India)
Shift: US Night Shift (Continuous)
Client: US-based Enterprise (Global Scale)
If you love solving complex reliability challenges, enjoy scripting and building automation, and want to work with globally distributed systems, we’d love to hear from you.
Engineer • Mumbai, Maharashtra, India