Experience : 14+ Yrs
Job Location : Bangalore / Hyderabad (Currently remote)
Notice Period : Immediate to 15 Days
SRE :
Observability, Reliability, Monitoring, Scalability, Change Management
Cloud Providers : AWS, GCP or Azure
Tools :
Datadog, plus any one of Dynatrace, Splunk, Prometheus, or Grafana
Automation : Terraform, Ansible
Programming Language :
Python, Golang
Docker / Kubernetes experience is a plus
Key Responsibilities :
CI/CD Pipeline Setup :
Configure and maintain CI/CD pipelines using Jenkins and Google Cloud Build. Integrate these pipelines with GitHub for source code management and with ServiceNow for change management orchestration.
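One common integration point here is Jenkins's standard remote-trigger endpoint. A minimal sketch of building such a trigger request from Python, using only the standard library; the base URL, job name, and token are placeholders for a real deployment's values:

```python
import urllib.request
from urllib.parse import quote

def jenkins_trigger_request(base_url: str, job: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that fires a Jenkins job
    via its remote-trigger endpoint. All three arguments are
    placeholders; real pipelines would also pass parameters and
    credentials appropriate to their setup."""
    url = f"{base_url}/job/{quote(job)}/build?token={quote(token)}"
    return urllib.request.Request(url, method="POST")

# The returned Request would be sent with urllib.request.urlopen(...)
# from a script or an upstream orchestration step.
```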
Core Logging & Monitoring :
Implement centralized logging and metrics collection using GCP Cloud Logging, OpenTelemetry, Prometheus, and Grafana. Ensure deep, real-time visibility into the health of the data platform.
Pipeline Development :
Develop and maintain data pipelines using Apache Flink and the Bootstrap ETL Library. Implement data lineage and audit logging to ensure compliance with SOX requirements.
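The lineage/audit idea can be sketched independently of Flink: wrap each transform step so it emits an audit record alongside its output. This is plain Python, not the Flink API, and the record's field names are illustrative rather than a prescribed schema:

```python
import hashlib
import json
import time

def audited_step(step_name, transform, records):
    """Apply `transform` to each record and return (output, audit_record).
    The audit record captures step name, timestamp, row counts, and a
    digest of the input -- a minimal sketch of SOX-style lineage and
    audit logging, not a complete compliance solution."""
    out = [transform(r) for r in records]
    audit = {
        "step": step_name,
        "ts": time.time(),
        "in_count": len(records),
        "out_count": len(out),
        "in_digest": hashlib.sha256(
            json.dumps(records, sort_keys=True).encode()
        ).hexdigest(),
    }
    return out, audit
```

In a real pipeline the audit records would be shipped to the centralized logging stack so lineage queries can trace any output back to its inputs.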
CI/CD Automated Testing :
Establish and maintain automated testing frameworks for data pipelines. Integrate these tests into the CI/CD pipeline to ensure high code quality and pipeline correctness.
Performance Monitoring :
Configure and maintain monitoring dashboards and alerting mechanisms using Prometheus and Grafana. Conduct baseline performance testing and ensure the platform is operationally ready for production use.
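Baseline performance testing can start as simple as timing a callable over repeated runs and summarizing the latency distribution. A rough stdlib-only sketch (not a load test, and the percentile choice is illustrative):

```python
import statistics
import time

def baseline_latency(fn, runs: int = 100) -> dict:
    """Invoke `fn` `runs` times and return p50/p95/max per-call latency
    in seconds. Numbers like these feed the Grafana dashboards and
    alert thresholds mentioned above."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    samples.sort()
    return {
        "p50": statistics.median(samples),
        "p95": samples[int(0.95 * (len(samples) - 1))],
        "max": samples[-1],
    }
```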
Required Skills :
DevOps Tools :
Jenkins, Google Cloud Build, Docker, Kubernetes, GCP, OpenTelemetry, Prometheus, Grafana
Programming Languages :
Python, with experience in developing ETL pipelines and using testing frameworks like pytest.
Monitoring and Observability :
Experience in setting up and maintaining monitoring and observability solutions using Prometheus, Grafana, and OpenTelemetry.