AI Operations Engineers

GSPANN, Hyderabad, India
Job description

GSPANN is hiring AI Operations Engineers to build monitoring solutions, automate workflows, and apply AIOps practices that keep systems reliable and high-performing. Expertise in AppDynamics, Sumo Logic, automation, and Site Reliability Engineering (SRE) is essential.

Role and Responsibilities

  • Architect and deploy monitoring solutions across applications, infrastructure, and networks using tools such as AppDynamics, Sumo Logic, Grafana, and LogicMonitor.
  • Define and enforce observability best practices, including infrastructure monitoring, Application Performance Monitoring (APM), log analytics, and synthetic monitoring.
  • Build and maintain dashboards, alerts, and Key Performance Indicator (KPI) reports that provide actionable insights for technical and business stakeholders.
  • Write and optimize log queries (e.g., Sumo Logic queries, Dynatrace DQL, Grafana Loki LogQL, Splunk SPL) to extract meaningful data and support root cause analysis.
  • Monitor and improve performance in cloud environments such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), as well as containerized platforms like Kubernetes and Docker.
  • Develop automation scripts and workflows using Python, PowerShell, or Shell scripting to enable self-healing systems and streamline monitoring operations.
  • Integrate monitoring tools with Information Technology Service Management (ITSM) platforms, including ServiceNow, Ivanti, and Freshservice, to automate incident detection, ticketing, and resolution workflows.
  • Perform deep-dive troubleshooting and analysis by correlating data across multiple monitoring sources to uncover performance issues and anomalies.
  • Collaborate with DevOps, infrastructure, and application teams to ensure full monitoring coverage and continuous improvement of observability practices.

Skills and Experience

  • 5–8 years of experience in monitoring, observability, or AIOps engineering roles.
  • Experience designing, implementing, and configuring monitoring solutions across applications, infrastructure, and networks using AppDynamics, Sumo Logic, and Grafana.
  • Strong understanding of monitoring methodologies, including infrastructure monitoring, APM, log analytics, and synthetic monitoring.
  • Hands-on experience with LogicMonitor, particularly for infrastructure monitoring.
  • Expertise in dashboard creation, alerting, and KPI reporting for business and technical audiences.
  • Proficiency in log query languages such as Sumo Logic queries, Dynatrace DQL, Grafana Loki LogQL, or Splunk SPL.
  • Familiarity with AWS, Azure, GCP, and containerized workloads on Kubernetes and Docker.
  • Working knowledge of scripting and automation with Python, PowerShell, or Shell scripting for integrations and self-healing automation.
  • Strong analytical and troubleshooting abilities to correlate data across monitoring sources.
  • Experience integrating monitoring tools with ITSM platforms such as ServiceNow, Ivanti, or Freshservice to support automated workflows.