We are looking for engineers who are passionate about reliability, performance, and efficiency, and who have experience building tools, services, and automation to manage and improve production:
- Knowledge of systems internals and security, Linux, networking, and monitoring
- Work to improve the reliability and performance of the next generation of distributed systems and containerized deployments
- Diagnose and troubleshoot complex distributed systems handling millions of queries per second
- Knowledge of Linux cloud services using KVM / QEMU / LVM.
- Knowledge of containerization technologies like Docker, including deploying and troubleshooting containers
- Understanding of the Azure cloud platform, with the ability to set up, configure, monitor, and troubleshoot various PaaS components such as Firewalls, VPN gateways, Load Balancers, Storage accounts, Networks, and others
- In-depth knowledge of Perl / Go / Python to automate tasks with minimal manual intervention.
- Day-to-day work is heavily command-line driven, which requires a strong understanding of Linux.
- Troubleshoot issues across the entire stack: hardware, software, application, and network
- Knowledge of database technologies, specifically MySQL / NoSQL, is good to have.
- Participate in 24x7 on-call rotations.
- Design, build, and maintain core infrastructure that enables PhonePe to scale to support hundreds of thousands of concurrent users.
- Actively take part in the analysis and system improvement plan.
- Drive performance testing, capacity planning and high availability practices.
- Own implementations of new technologies while ensuring proper testing and documentation.
- Proactively monitor, identify, and solve issues that could impact our infrastructure.
- Be a natural team player with a resourceful attitude.
- Mentor new team members and get them production-ready.
(ref : hirist.tech)