At Nebula Tech Solutions, we’re expanding our global reliability engineering team to support mission-critical systems for our US-based enterprise clients during night shifts only.
We’re looking for experienced DevOps/SRE professionals (5+ years) who bring hands-on depth in Kubernetes, monitoring and metrics, and coding, not just infrastructure management.
This is a role for engineers who thrive on troubleshooting, automation, and continuous improvement in high-availability environments. 🌎🌙
🔧 What You’ll Do
✅ Build, optimize, and maintain Kubernetes clusters (EKS / GKE / AKS) for scalability and resilience
✅ Design and improve CI/CD pipelines (Jenkins, ArgoCD, FluxCD, Harness, GitHub Actions)
✅ Implement and extend observability using Prometheus, Grafana, OpenTelemetry, and custom metrics
✅ Develop and maintain internal tools and automations using Python, Go, or similar programming languages
✅ Drive incident response, RCA, and reliability improvements across services
✅ Collaborate with global teams to ensure continuous uptime and performance
🧩 What We’re Looking For
🔹 5+ years of DevOps / SRE / Platform Engineering experience
🔹 Deep, hands-on knowledge of Kubernetes architecture, deployments, debugging, and scaling
🔹 Strong programming or scripting skills in Python, Go, Java, or Node.js (beyond shell scripting)
🔹 Proven experience with monitoring and telemetry systems (Prometheus, Grafana, ELK, OpenTelemetry)
🔹 Understanding of web services, REST APIs, and distributed systems troubleshooting
🔹 Familiarity with Terraform, Helm, and GitOps workflows (FluxCD / ArgoCD)
📍 Location: Remote (India)
🕐 Shift: US Night Shift (Continuous)
🌍 Client: US-based Enterprise (Global Scale)
If you love solving complex reliability challenges, enjoy scripting and building automation, and want to work with globally distributed systems — we’d love to hear from you. 🚀
Sr Engineer • Thiruvananthapuram, Kerala, India