Cloud Associate - CloudOps
Experience : 2 to 3 years
B.Sc Computers, BE, B.Tech, or equivalent experience
Location : Mangalore (Onsite)
Skills Required
- Communication skills : Should be able to communicate effectively with both technical and non-technical audiences.
- Collaboration skills : Should be able to work closely and collaborate effectively with other teams, such as development, operations, and security.
- Strong hands-on experience with GCP services : BigQuery, Dataflow, Pub/Sub, Cloud Storage, Cloud Composer, Dataproc, Cloud Functions.
- Programming languages : Python, SQL (must-have).
- Data orchestration : Apache Airflow/Cloud Composer.
- Streaming & batch data processing : Pub/Sub, Dataflow, Dataproc.
- Experience with version control (Git), CI/CD pipelines, and Terraform (IaC).
- Data modeling and performance optimization for large datasets.
- Strong problem-solving and debugging skills.
Roles and Responsibilities
- Monitoring Infrastructure : Set up and maintain the monitoring infrastructure, which includes selecting and configuring monitoring tools, agents, and sensors. Ensure that monitoring systems are properly deployed and integrated with the infrastructure being monitored.
- System and Application Monitoring : Monitor the health, performance, and availability of systems, networks, applications, and services. Configure monitoring tools to collect relevant metrics, logs, and events. Continuously analyze monitoring data to identify trends, anomalies, and potential issues.
- Incident Detection and Response : Detect and analyze incidents based on monitoring data and alerts. Investigate and diagnose issues to determine the root cause. Respond promptly to incidents, follow incident management processes, and coordinate with relevant teams for timely resolution.
- Alerting and Escalation : Configure alerting rules and thresholds based on predefined criteria and service level objectives (SLOs). Ensure that alerts are actionable, timely, and routed to the appropriate teams or individuals. Escalate critical incidents as per the escalation procedures.
- Performance Analysis and Optimization : Analyze performance metrics to identify bottlenecks, resource utilization patterns, and areas for optimization. Collaborate with other teams, such as development or operations, to implement performance improvements and ensure optimal system performance.
- Capacity Planning : Monitor and analyze resource usage trends to forecast future capacity requirements. Collaborate with capacity planning teams to ensure that infrastructure and resources are scaled appropriately to meet growing demands.
- Documentation and Reporting : Maintain accurate and up-to-date documentation related to monitoring configuration, processes, and incident response. Generate regular reports on system performance, uptime, and incident metrics to provide insights to stakeholders and management.
- Continuous Improvement : Stay up-to-date with the latest monitoring technologies, trends, and best practices. Continuously evaluate and improve the monitoring infrastructure, tools, and processes to enhance the effectiveness and efficiency of monitoring activities.
- Collaboration and Communication : Collaborate with other teams, such as development, operations, and support, to understand monitoring requirements, share insights, and coordinate incident response. Communicate monitoring findings, trends, and recommendations to relevant stakeholders.
Additional Requirements
- Excellent communication skills
- Knowledge of cloud computing, preferably GCP
- Willingness to work in a 24/7 environment (mandatory)
Skills Required
BigQuery, Performance Optimization, Data Modeling, Dataproc, SQL, Apache Airflow, Pub/Sub, Cloud Storage, Git, Terraform, Dataflow, Python