We are seeking an experienced Linux Administrator (L3) to join our infrastructure operations team. The ideal candidate will provide Level 3 support for mission-critical Linux systems, ensuring performance, reliability, scalability, and security across environments. You will act as the final escalation point for complex system issues, lead automation initiatives, optimize infrastructure performance, and collaborate with cross-functional teams to maintain robust enterprise systems.
Key Responsibilities:
- Provide L3 (Level 3) support for Linux environments, resolving complex technical issues escalated from L1 and L2 teams.
- Diagnose and troubleshoot advanced system-level issues related to servers, networking, storage, and kernel performance.
- Perform in-depth root cause analysis (RCA) for recurring issues and implement permanent fixes.
- Manage and maintain production and development Linux systems across multiple environments (physical, virtual, and cloud).
- Oversee system health, performance, and capacity using monitoring tools such as Nagios, Zabbix, or Prometheus.
- Perform routine system administration tasks including patching, configuration management, and backup validation.
- Proactively identify and resolve performance bottlenecks in CPU, memory, disk I/O, and network utilization.
- Manage kernel upgrades, patching, and system updates with minimal downtime.
- Design, deploy, and maintain high-availability (HA), disaster recovery (DR), and scalable Linux-based systems.
- Contribute to the architecture and implementation of resilient infrastructure solutions aligned with business goals.
- Work closely with network, security, and application teams to ensure seamless integration and uptime.
- Automate administrative and operational tasks using Bash, Python, or Perl scripting.
- Implement and manage configuration management and orchestration tools such as Ansible, Puppet, or Chef.
- Develop scripts and playbooks to improve deployment speed, consistency, and system efficiency.
- Implement Linux hardening procedures and ensure adherence to internal and external compliance standards.
- Manage access control, user permissions, and security patch deployment.
- Monitor and remediate vulnerabilities using industry best practices and tools.
- Participate in periodic security audits and ensure timely closure of findings.
- Maintain detailed documentation of system configurations, policies, and standard operating procedures.
- Create knowledge base articles and troubleshooting guides for L1/L2 teams.
- Ensure incident response and escalation procedures are followed efficiently to minimize downtime.
Required Skills & Qualifications:
- Operating Systems: Expert in Linux distributions such as Red Hat Enterprise Linux (RHEL), CentOS, Ubuntu, or SUSE.
- Scripting & Automation: Strong scripting skills in Bash, Python, or Perl for automation and task orchestration.
- Configuration Management: Hands-on experience with Ansible, Puppet, or Chef.
- Monitoring & Performance: Experience using monitoring tools such as Nagios, Zabbix, Prometheus, or Grafana.
- Networking: Good understanding of TCP/IP, DNS, DHCP, VPN, firewalls, and load balancing concepts.
- Storage & Backup: Familiarity with SAN/NAS, RAID configurations, LVM, and enterprise backup solutions.
- Cloud & Virtualization: Experience with VMware, KVM, or cloud platforms like AWS, Azure, or GCP.
- Red Hat Certified Engineer (RHCE) or Red Hat Certified System Administrator (RHCSA).
- Experience with containerization technologies (Docker, Kubernetes).
- Familiarity with CI/CD pipelines, Git, and infrastructure as code (IaC) practices.
- Exposure to disaster recovery, capacity planning, and load testing methodologies.
(ref: hirist.tech)