Pearson is looking for a motivated and eager Associate Site Reliability Engineer (SRE) to join our team. This is a foundational role where you'll acquire and hone essential SRE skills under the close mentorship of experienced engineers. You'll gain hands-on experience across various facets of site reliability and cloud engineering, contributing to our mission of ensuring robust and reliable systems.
Key Responsibilities
- Cloud Fundamentals : Build a foundational understanding of cloud design, hosting, and delivery in AWS, GCP, and/or Azure. Contribute to CI/CD pipelines and develop Infrastructure as Code (IaC) for our products and services. Gain an understanding of the vast array of service offerings from our cloud provider partners.
- Tooling & Workflow : Build proficiency in the team's tech stack tooling to automate provisioning and manage infrastructure components efficiently using IaC. Use Git, applying best practices for version control, branching, and collaborative development. Use Jira effectively for issue tracking and streamlining workflows.
- Automation : Acquire scripting skills to automate routine tasks, data collection, and deployments, streamlining operations and enhancing efficiency.
- Peer Review : Participate in the code review process, scrutinizing contributions from peers and receiving valuable feedback to continually improve coding and troubleshooting skills.
- Security Protocols : Under guidance from experienced SREs, gain familiarity with security measures and assist in their implementation to safeguard systems.
- Monitoring & Alerting : Contribute to setting up, configuring, and maintaining monitoring and alerting systems, focusing on understanding and improving key performance indicators (KPIs) for system reliability.
- Incident Response : Collaborate with other engineers to diagnose and resolve incidents, involving data gathering, issue tracking, and problem-solving.
- Post-Incident Reviews : Actively engage in post-incident discussions to understand root causes and learn from insights shared by senior team members.
- Collaboration : Foster collaboration with team members across various roles, including developers, operations, and other SREs, sharing knowledge and working towards team objectives.
- Basic Troubleshooting : Develop skills in identifying and resolving straightforward issues using monitoring tools and logs.
- Cost Optimization : Assist in collecting, analyzing, and interpreting cloud cost data to identify trends, anomalies, and cost-saving opportunities.
- Agile & Scrum Practices : Learn and apply Agile methodologies and the Scrum framework, actively participating in sprint planning, daily stand-ups, and sprint reviews.
- Documentation : Contribute to the creation and updating of procedural guides, processes, and troubleshooting documentation.
- On-Call Support : Participate in the on-call rotation and learn how to respond to incidents and troubleshoot issues effectively in high-pressure scenarios.
Skills & Qualifications
- Ability to work effectively under pressure.
- Basic understanding of system monitoring tools.
- Eagerness to learn and adapt.
- Strong communication skills.
- Receptiveness to constructive feedback.
- Foundational knowledge of cloud services and essential networking principles.
- Good time management skills.
Skills Required
Cloud Platform, Automation, Security Protocols, Troubleshooting, Agile Methodology