Talent.com
DevOps Manager

Scry AI
Amritsar, Punjab, IN
Job description

Position: DevOps Manager

Location: India (Remote)

Employment Type: Full-Time

Schedule: Monday to Friday, Day Shift

Company Description

Scry AI is a research-led enterprise AI company that builds intelligent platforms to drive efficiency, insight, and compliance. Our platforms, Collatio®, Auriga®, and Concentio®, streamline complex workflows by automating data extraction, validation, and reconciliation, and by delivering real-time intelligence.

We are seeking a DevOps Manager to lead our infrastructure, CI/CD, and reliability practices across cloud and on-prem deployments. You will own uptime, performance, security, and cost efficiency for the AI/ML workloads powering Collatio®, Auriga®, and Concentio®.

Role Overview

As DevOps Manager, you will lead a small team of DevOps/SRE engineers to design, automate, and operate secure, compliant, and highly available platforms across AWS/Azure/GCP and customer on-prem environments. You will standardize IaC, improve CI/CD velocity, build robust observability, and enable GPU-accelerated AI inference at scale for enterprise clients.

Key Responsibilities

Platform Reliability & Operations

  • Own SLOs/SLIs, availability, latency, and capacity planning across services.
  • Lead incident response, root-cause analysis, postmortems, and on-call processes.
  • Implement backup, disaster recovery, and business continuity for multi-region and on-prem deployments.

Cloud, On-Prem & Edge Deployments

  • Architect Kubernetes platforms (managed and self-hosted), including RBAC, network policies, and secrets management.
  • Standardize infrastructure with Terraform, Helm, and GitOps (Argo CD) for repeatable customer deployments.
  • Support Concentio® edge/IoT rollouts with secure remote updates and telemetry pipelines.

AI/ML & Data Infrastructure

  • Enable GPU scheduling and drivers (CUDA, NVIDIA), inference runtimes (Triton), and model packaging.
  • Build MLOps foundations (MLflow, feature stores) and artifact/version governance.
  • Operate data services (Kafka, PostgreSQL, Redis, MinIO/S3, Elasticsearch/OpenSearch) for high-throughput pipelines.

CI/CD & Developer Experience

  • Own CI/CD with GitHub Actions/GitLab CI/Jenkins; establish trunk-based development, automated testing, and canary/blue-green releases.
  • Maintain internal developer platforms, templates, and golden paths to improve delivery speed and quality.

Security, Compliance & Observability

  • Implement least-privilege access, SSO (Okta/AAD), Vault-based secrets, image scanning (Trivy), and policy as code.
  • Ensure SOC 2, ISO 27001, and HIPAA/GDPR alignment with audit trails and immutable logs.
  • Build end-to-end observability using Prometheus, Grafana, Loki/EFK, and OpenTelemetry.

FinOps & Stakeholder Management

  • Track cloud spend, rightsize resources, and negotiate quotas for GPU/compute.
  • Partner with Product, Data Science, and Customer Success to plan capacity for new features and enterprise go-lives.

Required Qualifications & Skills

  • Strong Kubernetes expertise (production operations, networking, security, Helm, GitOps).
  • Proven IaC experience with Terraform and configuration management (Ansible).
  • CI/CD at scale with GitHub Actions/GitLab CI/Jenkins; artifact registries and SBOMs.
  • Observability: Prometheus, Grafana, ELK/EFK or Loki, alerting and runbooks.
  • Cloud proficiency in at least one major provider (AWS/Azure/GCP) and Linux fundamentals.
  • Security fundamentals: network segmentation, TLS, secrets management, container hardening.
  • Experience running data/streaming systems (Kafka, Redis, PostgreSQL) in production.
  • Excellent communication, incident leadership, and stakeholder management.

Nice-to-Have

  • GPU orchestration, Triton Inference Server, Hugging Face model serving.
  • Service mesh (Istio/Linkerd), API gateways, and zero-trust patterns.
  • MLOps tooling (MLflow, Feast), Airflow, dbt.
  • Compliance implementations for regulated industries (BFSI, healthcare).
  • Certifications: CKA/CKAD, AWS/Azure/GCP Architect, Security+.

Our Ideal Candidate

  • Drives reliability with automation, not toil.
  • Balances speed and safety with measurable delivery improvements.
  • Thrives in customer-facing, hybrid cloud, and on-prem environments.
  • Coaches teams with clear standards, runbooks, and continuous improvement.

Tip for Candidates

If you want to build secure, high-performance platforms for real-world AI at enterprise scale, follow our page for similar job openings.