About Saarthee :
Saarthee is a Global Strategy, Analytics, Technology and AI consulting company, where our passion for helping others fuels our approach and our products and solutions. Our diverse and global team works with one objective in mind: Our Customers' Success. At Saarthee, we are passionate about guiding organizations towards insights-fueled success. That's why we call ourselves Saarthee, inspired by the Sanskrit word Saarthi, which means charioteer, trusted guide, or companion. Co-founded in 2015 by Mrinal Prasad and Shikha Miglani, Saarthee already encompasses all the components of Data Analytics consulting. Saarthee is based out of Philadelphia, USA, with offices in the UK and India.
Position Summary :
We are seeking a skilled Site Reliability Engineer (SRE) with expertise in Python, Shell Scripting, and cloud infrastructure. This role involves ensuring the reliability, scalability, and performance of critical applications and infrastructure while collaborating with globally distributed engineering, product, and operations teams. The ideal candidate will be passionate about automation, operational excellence, and continuous process improvement.
Your Role Responsibilities and Duties :
- Ensure reliability, scalability, and efficiency of systems under management.
- Develop and maintain automation scripts using Python and Shell.
- Support and optimize AWS-based infrastructure and topologies.
- Implement and manage monitoring and alerting systems (Prometheus, Grafana, Datadog, NewRelic, Splunk).
- Collaborate on incident response, performance optimization, and root cause analysis.
- Work closely with development and QA teams to improve resiliency and maintainability of production systems.
- Engage in system and solution architecture discussions for complex technical problems.
- Drive CI/CD pipelines and version control using Git and Jenkins.
- Deploy and manage containerized environments using Docker and Kubernetes.
- Work in an Agile SCRUM environment with a strong focus on end-user scenarios.
Required Skills and Qualifications :
Mandatory :
- Bachelor's or Master's degree in Computer Science, IT, or a related field.
- 6-8 years of experience in Site Reliability Engineering.
- Strong skills in Python and Shell Scripting.
- Hands-on experience with AWS infrastructure and services.
- Strong knowledge of networking concepts and protocols (TCP/IP, DNS, load balancing).
- Proficiency in monitoring and alerting tools (Prometheus, Grafana, Datadog, Splunk, NewRelic).
- Experience with CI/CD tools (Jenkins) and version control (GitHub).
- Familiarity with containerization (Docker) and orchestration (Kubernetes).
- Strong analytical and troubleshooting skills.
- Good understanding of DevOps principles and Agile methodology.
Technologies We Use :
Python, Shell Scripting, AWS, Prometheus, Grafana, Datadog, Splunk, NewRelic, GitHub, Jenkins, Docker, Kubernetes, Agile SCRUM
What we Offer :
- Bootstrapped and financially stable with a high pre-money valuation.
- Above-industry remuneration.
- Additional compensation tied to Renewal and Pilot Project Execution.
- Additional lucrative business development compensation.
- Firm-building opportunities that offer a stage for holistic professional development, growth, and branding.
- An empathetic, excellence- and results-driven organization that believes in mentoring and growing a team with constant emphasis on learning.
(ref : hirist.tech)