Lead AI Ops & Automation Engineer

TEAM GEEK SOLUTIONS PRIVATE LIMITED, Pune
30+ days ago
Job description

We are seeking a dynamic and experienced Lead AIOps and Automation Engineer to drive the design, implementation, and optimization of cutting-edge automation and AIOps solutions.

This role will involve leveraging Amelia AIOps, cloud technologies, and integration frameworks to enhance IT operations, scalability, and resilience. As a key member of our team, you will lead initiatives that modernize IT, streamline processes, and enable proactive decision-making through intelligent automation.

  • Design and implement a new, modern DR Orchestration & Automation platform using the Amelia AIOps SaaS product to transform IT Operations.

  • Configure and set up integrations with various infrastructure services.
  • Develop APIs and connectors for enabling automated orchestration across diverse platforms.
  • Migrate existing DR workflows from the legacy platform to the new Amelia-based platform.
  • Develop and implement new DR Automation workflows to meet emerging business demands.
  • Develop and integrate automation solutions using the Amelia AIOps platform to optimize IT and operational processes.
  • Build and maintain integrations between IT systems, cloud platforms, and third-party services to enable seamless data exchange and workflows.

  • Implement GitOps practices to streamline and automate deployment processes, ensuring consistency and reliability.
  • Ensure compliance with security policies and best practices across all automation and cloud initiatives.
  • Design, develop, and implement automation solutions to support DR processes for both infrastructure and business-critical applications.
  • Create modular, scalable, and reusable automation scripts and workflows using Amelia AIOps and integrations like Ansible, PowerShell, Python, Terraform, or Azure Automation.
  • Ensure automation scripts can handle failover, failback, environment validation, and recovery steps end-to-end.
  • Automate provisioning, backup, snapshotting, and recovery of virtual machines, containers, storage, and network configurations.
  • Integrate with existing CI/CD pipelines, monitoring systems (e.g., Dynatrace), and configuration management tools.
  • Automate health checks, post-recovery validations, and real-time status reporting of DR operations.
  • Build dashboards and alert mechanisms to monitor the success/failure of automated DR jobs.
  • Proven experience with the Amelia AIOps SaaS product, specializing in engineering DR Orchestration and Automation in large-scale environments.
  • 7-8 years of experience working in the AIOps Automation field.
  • Strong background in setting up DR Orchestration platforms for diverse organizations using Amelia and infra automation tools.

  • Expertise in configuring real-time integrations with multiple infrastructure services, both on-prem and in the cloud.
  • Experience in migrating existing DR workflows to new platforms.
  • Experience in automating disaster recovery processes for:
      • Infrastructure (VMs, storage, network, etc.)
      • Business applications (databases, services, containers, etc.)
  • Familiarity with DR orchestration tools (e.g., Azure Site Recovery, Commvault, EMC Data Domain, VMware SRM).
  • Proficiency in working across cloud environments (Azure, AWS) and hybrid infrastructure.
  • Knowledge of backup strategies, snapshot automation, and replication mechanisms.
  • Strong understanding of high availability, RPO/RTO, and BC/DR principles.
  • Understanding of network protocols, security, and cloud networking.
  • Experience with Disaster Recovery (DR) planning, including:
      • Backup and restore automation using Commvault, Data Domain, RMAN, etc.
      • DR orchestration tools such as Azure Site Recovery, VMware SRM, and CloudEndure
      • Simulating failover/failback and automating recovery playbooks
      • Automating DR for databases (SQL, NoSQL), middleware, and containerized applications
  • Proficiency with infrastructure as code and automation tools (e.g., Ansible, Puppet, Terraform, Chef).
  • Expertise in cloud platforms (AWS, Azure, GCP), with hands-on experience in automation and orchestration.
  • Solid understanding of APIs, web services, integration technologies (e.g., REST, GraphQL, Kafka), and microservices architecture.
  • Proficiency in scripting/programming languages (Python, Java, Bash, etc.).
  • Familiarity with observability tools (e.g., Splunk, Dynatrace, New Relic) and ITSM tools (e.g., ServiceNow).
  • Knowledge of monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or similar.
  • Experience working with DevOps, including but not limited to container technologies like Docker and Kubernetes, as well as cloud-native technologies such as Argo, Helm, etcd, and Envoy.
  • Extensive experience with infrastructure as code (IaC) tools such as Ansible, Terraform, or CloudFormation.
  • Strong communication and problem-solving skills for collaborating with cross-functional teams.
(ref: hirist.tech)
