Kubernetes Expert

Microcoreware, Chennai
3 days ago
Job description

Role Overview :

We are looking for an experienced Kubernetes expert with strong expertise in Kubernetes clusters, cloud-native technologies, storage integration, and performance optimisation. The ideal candidate has hands-on experience designing, deploying, and managing large-scale Kubernetes environments across on-prem and cloud platforms, and troubleshooting complex containerised workloads.

Key Responsibilities :

Cluster Management & Deployment :

  • Provision and manage Kubernetes clusters using kubeadm, RKE2, and Cluster API across cloud platforms (AWS, Azure, GCP, OpenStack).
  • Deploy, scale, and upgrade applications using Kubernetes best practices (rolling updates, probes, HPA, VPA).
  • Configure node scheduling strategies using taints, tolerations, and affinity rules.

Application Deployment & Troubleshooting :

  • Debug CrashLoopBackOff and pod failures using kubectl logs, events, and resource monitoring.
  • Troubleshoot networking, persistent volumes, and service exposure issues (ClusterIP, NodePort, LoadBalancer, Ingress).
  • Debug application routing using APISIX, NGINX ingress, and multi-path routing.
  • Handle application scaling and high-traffic scenarios using autoscalers.

Storage & Data Management :

  • Integrate Ceph storage with Kubernetes via CSI drivers for block and filesystem provisioning.
  • Troubleshoot PersistentVolume (PV) and PersistentVolumeClaim (PVC) issues.

Observability & Performance :

  • Deploy and configure monitoring solutions such as Prometheus and Metrics Server.
  • Benchmark cluster and workload performance (CPU, memory, networking).
  • Enable log collection and analysis for multi-container pods.

Security & Networking :

  • Manage authentication and RBAC policies within Kubernetes.
  • Configure isolation for virtual Kubernetes clusters (vcluster).
  • Handle registry authentication (AWS ECR, private registries) using image pull secrets.

Specialized Workloads :

  • Deploy and manage GPU workloads using NVIDIA GPU Operator.
  • Enable GPU scheduling and resource allocation for AI / ML workloads.

Operations & Maintenance :

  • Troubleshoot faulty nodes (on-prem / cloud) including CPU, memory, disk, and kubelet health.
  • Work on service routing, ingress configurations, and debugging cloud load balancer / firewall issues.
  • Perform rolling upgrades and ensure zero-downtime deployments.

Required Skills :

  • Strong expertise in Kubernetes administration and cloud-native deployments.
  • Hands-on experience with kubeadm, RKE2, Cluster API, and Terraform for cluster provisioning.
  • Knowledge of storage integration with Ceph and CSI drivers.
  • Experience with monitoring and observability tools (Prometheus, Grafana, Metrics Server).
  • Strong debugging skills for pod crashes, networking issues, and persistent storage problems.
  • Knowledge of NGINX ingress, APISIX, and traffic routing.
  • Understanding of RBAC, security groups, and IAM policies in Kubernetes & cloud.
  • Experience with GPU workloads in Kubernetes.
  • Familiarity with CI / CD pipelines for Kubernetes deployments is a plus.

Preferred Qualifications :

  • 4+ years of hands-on experience in Kubernetes roles.
  • Experience in both managed (EKS, AKS, GKE) and on-prem Kubernetes clusters.
  • Strong scripting skills (Bash, Python, Go preferred).
  • Prior experience with infrastructure-as-code tools like Terraform, Helm, and Ansible.
  • Exposure to multi-cluster and multi-tenant environments.

(ref : hirist.tech)
