We are seeking a dynamic and experienced Lead AIOps and Automation Engineer to drive the design, implementation, and optimization of cutting-edge automation and AIOps solutions.
This role will involve leveraging Amelia AIOps, cloud technologies, and integration frameworks to enhance IT operations, scalability, and resilience. As a key member of our team, you will lead initiatives that modernize IT, streamline processes, and enable proactive decision-making through intelligent automation.
- Design and implement a new, modern DR Orchestration & Automation platform using the Amelia AIOps SaaS product to transform IT Operations.
- Configure and set up integrations with various infrastructure services.
- Develop APIs and connectors to enable automated orchestration across diverse platforms.
- Migrate existing DR workflows from the legacy platform to the new Amelia-based platform.
- Develop and implement new DR Automation workflows to meet emerging business demands.
- Develop and integrate automation solutions using the Amelia AIOps platform to optimize IT and operational processes.
- Build and maintain integrations between IT systems, cloud platforms, and third-party services to enable seamless data exchange and workflows.
- Implement GitOps practices to streamline and automate deployment processes, ensuring consistency and reliability.
- Ensure compliance with security policies and best practices across all automation and cloud initiatives.
- Design, develop, and implement automation solutions to support DR processes for both infrastructure and business-critical applications.
- Create modular, scalable, and reusable automation scripts and workflows using Amelia AIOps and integrations like Ansible, PowerShell, Python, Terraform, or Azure Automation.
- Ensure automation scripts can handle failover, failback, environment validation, and recovery steps end-to-end.
- Automate provisioning, backup, snapshotting, and recovery of virtual machines, containers, storage, and network configurations.
- Integrate with existing CI/CD pipelines, monitoring systems (e.g., Dynatrace), and configuration management tools.
- Automate health checks, post-recovery validations, and real-time status reporting of DR operations.
- Build dashboards and alert mechanisms to monitor the success/failure of automated DR jobs.

- Proven experience with the Amelia AIOps SaaS product, specializing in engineering DR Orchestration and Automation in large-scale environments.
- 7-8 years of experience working in the AIOps Automation field.
- Strong background in setting up DR Orchestration platforms for diverse organizations using Amelia and infra automation tools.
- Expertise in configuring real-time integrations with multiple infrastructure services, on-prem and cloud.
- Experience in migrating existing DR workflows to new platforms.
- Experience in automating disaster recovery processes for:
  - Infrastructure (VMs, storage, network, etc.)
  - Business applications (databases, services, containers, etc.)
- Familiarity with DR orchestration tools (e.g., Azure Site Recovery, Commvault, EMC Data Domain, VMware SRM).
- Proficiency in working across cloud environments (Azure, AWS) and hybrid infrastructure.
- Knowledge of backup strategies, snapshot automation, and replication mechanisms.
- Strong understanding of high availability, RPO/RTO, and BC/DR principles.
- Understanding of network protocols, security, and cloud networking.
- Experience with Disaster Recovery (DR) planning, including:
  - Backup & restore automation using Commvault, Data Domain, RMAN, etc.
  - DR orchestration tools such as Azure Site Recovery, VMware SRM, CloudEndure
  - Simulating failover/failback and automating recovery playbooks
  - Automating DR for databases (SQL, NoSQL), middleware, and containerized applications
- Proficiency with infrastructure-as-code and automation tools (e.g., Ansible, Puppet, Terraform, Chef).
- Expertise in cloud platforms (AWS, Azure, GCP), with hands-on experience in automation and orchestration.
- Solid understanding of APIs, web services, integration technologies (e.g., REST, GraphQL, Kafka), and microservices architecture.
- Proficiency in scripting/programming languages (Python, Java, Bash, etc.).
- Familiarity with observability tools (e.g., Splunk, Dynatrace, New Relic) and ITSM tools (e.g., ServiceNow).
- Knowledge of monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or similar.
- Experience working with DevOps, including but not limited to container technologies like Docker & Kubernetes, as well as the Cloud Native technology stack such as Argo, Helm, etcd, and Envoy.
- Extensive experience with infrastructure as code (IaC) tools such as Ansible, Terraform, or CloudFormation.
- Strong communication and problem-solving skills to collaborate with cross-functional teams.

(ref: hirist.tech)