Agentic & AI Tech Ops Engineer
Location: AI Center of Excellence
Role Overview:
We seek a proactive Agentic & AI Tech Ops Engineer to ensure the reliability, scalability, and efficiency of AI and agentic AI systems in production. You will manage deployments, monitor performance, troubleshoot issues, and implement best practices for Tech Ops, MLOps, and LLMOps.
Key Responsibilities
Deployment & Infrastructure:
Deploy and manage AI models, agentic systems, and infrastructure across cloud (GCP, AWS, Azure) and on-premises environments.
Implement CI/CD pipelines for AI/ML and agentic applications.
Optimize cloud resources for cost and scalability.
Monitoring & Incident Management:
Build monitoring, logging, and alerting solutions for AI systems.
Handle incident response, perform root cause analysis, and maintain runbooks.
Automation & Operational Excellence:
Automate deployments and maintenance using Python/Bash and IaC tools (Terraform, Ansible).
Enforce security, compliance, and operational best practices.
Collaboration & Documentation:
Work with AI developers, data scientists, and architects to ensure smooth transitions to production.
Maintain clear documentation and provide feedback on system performance.
Required Qualifications
Bachelor’s degree in CS, IT, Engineering, or a related field.
2–4+ years of experience in Tech Ops, DevOps, SRE, or MLOps roles.
Hands-on experience with cloud platforms (GCP, including Vertex AI; AWS; Azure).
Proficiency in Python/Bash scripting and CI/CD tools (Jenkins, GitLab CI, GitHub Actions).
Experience with Docker, Kubernetes, and monitoring tools (Prometheus, Grafana, ELK, Datadog).
Knowledge of networking, security, and IaC principles.
Strong troubleshooting and communication skills.
Preferred Qualifications
Master’s degree and cloud certifications.
Experience in MLOps/LLMOps and with AI/ML frameworks (TensorFlow, PyTorch).
Familiarity with agentic AI concepts, vector databases, and data pipeline tools (Airflow, Kubeflow).
Experience working in an Agile environment.
Agentic AI Engineer • Panipat, Haryana, India