Site Reliability Engineer

Search Synergy Pvt Ltd, Mumbai
30+ days ago
Job description

Note: Location is Dadar / Kurla (Mumbai).

Skills, Knowledge & Training :

  • Own and manage the CI / CD pipelines for automated build, test, and deployment.
  • Design and implement robust deployment strategies for microservices and web applications.
  • Set up and maintain monitoring, alerting, and logging frameworks (e.g., Prometheus, Grafana, ELK).
  • Build automations which will help optimize software delivery.
  • Improve reliability, quality, and time-to-market of our suite of software solutions.
  • Be responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.
  • Create services that automatically provision test environments, automate the release management process, set up pre-emptive log monitoring, and build dashboards for metrics visualisation.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Run our infrastructure with GitLab CI / CD, Kubernetes, Kafka, NGINX, and the ELK stack.
  • Coordinate with infra teams and developers to improve the incident management process.
  • Responsible for L1 support as well.
  • Good communication and presentation skills.

Core Competencies (Must Have) :

  • Elasticsearch, Logstash, Kibana (ELK) or AppDynamics
  • CI / CD: GitLab / Jenkins

Other Key Skills :

  • SSO technologies
  • Ansible
  • Python
  • Linux Administration

Additional Competencies (Nice to Have) :

  • Kubernetes
  • Kafka, MQ
  • NGINX or APIGEE
  • Redis
  • Experience in working with outsourced vendor teams for application development
  • Appreciation of Enterprise Functional Architecture in Capital

Job Purpose :

    We are looking for a skilled and proactive Site Reliability Engineer (SRE) with strong expertise in deployment automation, monitoring, and infrastructure reliability. The ideal candidate will be responsible for managing the end-to-end deployment lifecycle, ensuring the availability, scalability, and performance of our production and non-production environments.

Area of Operations and Key Responsibilities :

  • Deployment & Release Management: Own and manage the CI / CD pipelines for automated build, test, and deployment. Design and implement robust deployment strategies for microservices and web applications. Monitor and troubleshoot deployment issues and rollbacks, ensuring zero-downtime deployment where possible.
  • System Reliability & Performance: Set up and maintain monitoring, alerting, and logging frameworks (e.g., Prometheus, Grafana, ELK).

Any Other Requirements :

  • Should be a good team player.
  • Will be required to work with multiple projects / teams concurrently.

(ref : hirist.tech)
