Key Responsibilities:
- OpenShift Cluster Management: Administer and manage multiple OpenShift clusters across different environments (on-premise and cloud), ensuring high availability, stability, and scalability.
- Troubleshooting & Incident Resolution: Provide L2 support for OpenShift-related incidents, resolve issues escalated from L1 support, and investigate root causes of cluster or container failures.
- Cluster Provisioning: Deploy and configure new OpenShift environments, ensuring they are aligned with organizational standards and optimized for performance and security.
- Container Management: Assist developers in deploying applications using Docker containers on OpenShift, ensuring proper environment configuration and resource allocation.
- Performance Tuning & Optimization: Monitor cluster performance, optimize resources, and ensure efficient utilization of computing, storage, and networking resources.
- Security & Compliance: Enforce security best practices, including role-based access control (RBAC) and security policies, and ensure compliance with organizational and industry security standards.
- Log Management: Manage and analyze logs and metrics using OpenShift tools (e.g., oc logs, Prometheus, Grafana) to identify performance bottlenecks and troubleshoot issues.
- Patch Management: Regularly update and patch OpenShift clusters so they stay current with the latest features and security updates.
- Backup & Disaster Recovery: Ensure robust backup and disaster recovery strategies for OpenShift deployments to minimize downtime and data loss.
- Automation & Scripting: Automate routine tasks using Ansible, Bash, or Python scripts to improve efficiency and reduce manual intervention.
- CI/CD Pipeline Support: Support and integrate CI/CD pipelines for continuous delivery and deployment, working closely with developers to ensure smooth code deployment using tools like Jenkins or GitLab.
- Documentation & Reporting: Maintain comprehensive documentation on cluster configurations, processes, troubleshooting procedures, and best practices. Provide regular reports on system performance, incidents, and upgrades.
- Collaboration with L3 Team: Work with the L3 support team on complex issues, providing feedback, contributing to root cause analysis, and suggesting improvements to the infrastructure.
- Training & Knowledge Sharing: Provide training to L1 and junior engineers on OpenShift administration, container orchestration, and best practices.
Required Qualifications & Skills:
- 3-5 years of hands-on experience with OpenShift administration and management.
- Strong knowledge of OpenShift 3.x/4.x architecture and components (e.g., Pods, Nodes, Deployments, Services, Ingress Controllers).
- Experience with Kubernetes (since OpenShift is built on Kubernetes) and a good understanding of Kubernetes clusters, containers, and container orchestration.
- Expertise in Linux-based systems (RHEL, CentOS, Ubuntu), as OpenShift primarily runs on Linux.
- Familiarity with containerization tools like Docker and container registries.
- Solid understanding of CI/CD tools (e.g., Jenkins, GitLab CI, OpenShift Pipelines) and deployment automation in a containerized environment.
- Experience with networking concepts like DNS, Load Balancing, Network Policies, and Ingress controllers.
- Familiarity with Ansible, Helm, or similar automation tools for managing OpenShift clusters.
- Experience with Prometheus for monitoring and Grafana for visualization in the OpenShift environment.
- Hands-on experience with role-based access control (RBAC) and security policies in OpenShift.
- Familiarity with cloud platforms like AWS, Azure, or GCP, and the deployment of OpenShift on these platforms.
- Strong problem-solving, troubleshooting, and debugging skills.
- Ability to work in a fast-paced environment, manage multiple tasks, and prioritize effectively.
- Strong written and verbal communication skills for interacting with developers, L1 support, and cross-functional teams.
Skills Required:
Azure, AWS, Jenkins