Reveille Technologies - Site Reliability Engineer - DevOps

Reveille Technologies, Pune
30+ days ago
Job description

Job Summary :

We are seeking a skilled and proactive Site Reliability Engineer (SRE) with a strong DevOps mindset and hands-on experience in application troubleshooting. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our applications and infrastructure. This role requires a blend of software engineering, system administration, and operational expertise, with a focus on automating processes and proactively resolving issues.

Key Responsibilities :

Site Reliability & Automation :

  • Design and implement tools to automate infrastructure provisioning, application deployment, and operational tasks.
  • Build and manage CI/CD pipelines using Jenkins to ensure seamless and efficient software delivery.
  • Maintain and troubleshoot Linux server environments, including certificate renewals.

Monitoring & Troubleshooting :

  • Implement and manage monitoring solutions using tools like Splunk or Dynatrace to create dashboards, set up alerting, and execute log queries for proactive issue detection.
  • Perform application troubleshooting, debugging, and root cause analysis to resolve complex incidents promptly.
  • Leverage SQL (DML & SELECT queries) to analyze application data for performance and troubleshooting insights.

Process & Collaboration :

  • Apply ITIL/ITSM principles for effective incident, problem, and change management.
  • Collaborate closely with development, quality assurance, and product teams to improve system reliability.
  • Manage and track code changes using Git or Bitbucket.

Required Skills :

Core Technical Skills :

  • 5-8 years of experience in an SRE, DevOps, or similar role.
  • Strong proficiency in at least one scripting or configuration language : Shell, Groovy, or YAML.
  • Expertise in monitoring tools like Splunk or Dynatrace for alerting, dashboarding, and log analysis.
  • Hands-on experience with CI/CD tools, specifically Jenkins.

System & Infrastructure :

  • Strong understanding of Linux system administration.
  • Basic exposure to cloud environments, with AWS being preferred.

Process & Data :

  • Basic knowledge of ITIL/ITSM concepts (Incident, Problem, Change Management).
  • Proficiency in SQL (DML and SELECT queries).

Preferred Skills :

  • Experience with configuration management tools like Ansible or Chef.
  • Hands-on experience with Docker and Kubernetes for containerization and orchestration.
  • Knowledge of other monitoring tools such as Prometheus or Grafana.
  • Relevant certifications in Linux or cloud platforms.
  • Strong problem-solving and analytical skills, with a proactive attitude toward identifying and resolving issues.

(ref : hirist.tech)
