Job Description: System Administrator (with Strong Kubernetes Expertise)
Position Overview
We are seeking an experienced System Administrator with deep expertise in Kubernetes and modern infrastructure management. The ideal candidate will be responsible for ensuring the stability, scalability, and security of our infrastructure across on-premises and cloud environments. This role requires strong skills in Linux systems, container orchestration, and automation, along with the ability to troubleshoot complex issues in distributed systems.
Key Responsibilities
Kubernetes Operations
- Deploy, configure, upgrade, and maintain Kubernetes clusters (on-premises and/or cloud).
- Manage and optimize workloads, namespaces, RBAC, storage classes, ingress controllers, and monitoring within Kubernetes.
- Implement and maintain GitOps or CI/CD workflows for Kubernetes resource deployment.
Systems Administration
- Administer Linux servers, including package management, patching, and performance tuning.
- Manage storage (e.g., Ceph, NFS, cloud block/file stores) and networking for production systems.
- Troubleshoot and resolve system, networking, and application issues at both the OS and container orchestration layers.
Automation & Tooling
- Build and maintain automation scripts and tooling (e.g., Ansible, Terraform, Helm, ArgoCD).
- Monitor systems with Prometheus, Grafana, Zabbix, or similar tools; set up alerts and dashboards.
- Automate backup, disaster recovery, and capacity planning tasks.
Security & Compliance
- Implement Kubernetes best practices for cluster security, secrets management, and network policies.
- Ensure compliance with company policies, audits, and industry security standards.
Collaboration
- Work closely with DevOps, developers, and network/security teams to support reliable application delivery.
- Participate in on-call rotations and incident response as needed.
Qualifications
Required
- 3+ years of hands-on Linux systems administration experience.
- 2+ years managing Kubernetes clusters in production (K8s internals, kube-proxy, CNI plugins, etc.).
- Proficiency with container runtimes (Docker, containerd) and Helm charts.
- Experience with infrastructure automation (Terraform, Ansible, or similar).
- Strong troubleshooting and debugging skills across systems, networks, and applications.
Preferred
- Experience with cloud providers (AWS, GCP, Azure) and hybrid deployments.
- Familiarity with Ceph, RadosGW, or other distributed storage.
- Knowledge of service mesh (Istio/Linkerd), MetalLB, or Kubernetes GPU workloads.
- Experience in monitoring/alerting with Prometheus, Grafana, or Zabbix.
- Scripting/programming knowledge (Python, Go, or Bash).
Soft Skills
- Strong analytical and problem-solving abilities.
- Excellent written and verbal communication.
- Ability to prioritize tasks and manage time effectively in a fast-moving environment.
- Collaborative mindset with a focus on reliability and scalability.
What We Offer
- Competitive salary and benefits package.
- Opportunity to work with cutting-edge infrastructure technologies (Kubernetes, Ceph, GPU compute, etc.).
- Collaborative, growth-oriented engineering culture.
- Exposure to both enterprise-grade and cloud-native environments.