Infrastructure Architect - AWS

CareStack - Dental Practice Management
Pushkar, IN
20 hours ago
Job description

We are looking for a Principal Infrastructure Architect who can command, design, and operate a globally distributed production environment across the US, UK, and AU regions. This role is for an architect-level engineer who still loves to get their hands dirty: the kind of person who can both lead an SRE / DevOps team and SSH into a production node to trace issues.

About the Role

This role involves end-to-end control of our multi-region infrastructure (AWS + on-prem hybrid) spanning compute, VoIP, databases, monitoring, and automation.

Responsibilities

  • Reliability and uptime for mission-critical communication systems handling millions of SIP sessions and HTTP requests daily.
  • Management and mentorship of DevOps / SRE and Support Engineers, ensuring process maturity and on-call readiness.
  • Responsibility for performance, scalability, and security across all environments.
  • Decision authority on infrastructure architecture, cost optimization, and modernization initiatives.

Qualifications

  • 10+ years in Infrastructure / DevOps / SRE roles.
  • Proven expertise running production-grade, multi-region environments in AWS.
  • Deep understanding of networking, SIP signaling, NAT traversal, RTP, and media relay.
  • Proficiency in Debian / Linux internals, kernel tuning, and packet tracing.
  • Hands-on experience with WAF, ModSecurity, fail2ban, iptables, and system hardening.
  • Solid background in Ansible, AWX, Python (2.7 / 3.x), and Bash scripting.
  • Experience managing Grafana Loki, Prometheus, and alerting frameworks.
  • Familiarity with Docker / Docker Compose / Terraform for repeatable infra.
  • Prior experience leading cross-functional infrastructure teams in 24x7 environments.

Required Skills

  • Architect, operate, and continuously improve AWS environments (EC2, EKS, RDS, Route 53, VPCs, IAM, S3, Lambda, CloudFront).
  • Maintain multi-layer high-availability VoIP servers.
  • Design and enforce multi-region DR and failover policies with automated recovery.
  • Manage clusters for MariaDB, MongoDB, Redis, and RabbitMQ.
  • Own and optimize Kamailio, Asterisk, RTPProxy, and RTPEngine stacks, ensuring consistent SIP routing, NAT traversal, and call resilience across regions.
  • Build SIP-level observability: tracing dialogs, RTP flows, and registrations through Grafana, Loki, and sngrep pipelines.
  • Lead continuous performance and load testing with SIPp and Playwright-based automation frameworks.
  • Enforce multi-layer security using ModSecurity, fail2ban, iptables, and real-time log-based intrusion prevention.
  • Build proactive defenses against SIP scanning, brute-force attacks, and zero-day threats.
  • Maintain end-to-end TLS, cert rotation, and infrastructure audit trails.
  • Drive compliance with regional (US / UK / AU) data protection requirements.
  • Manage a large-scale Grafana + Loki + Prometheus monitoring stack.
  • Correlate VoIP, network, and application metrics for unified visibility.
  • Build actionable alerts and dashboards for call traffic, system health, and anomaly detection.
  • Implement distributed tracing and incident replay for debugging call failures.
  • Maintain full configuration automation via Ansible + AWX with disaster recovery playbooks.
  • Build self-healing routines for proxy or service recovery via scripts and cron jobs.
  • Design CI / CD pipelines for infrastructure code, ensuring rapid, controlled deployments.
  • Implement chaos testing and resilience benchmarking to proactively harden infrastructure.
  • Lead a DevOps and Support Engineering team, setting standards for incident response, documentation, and reliability.
  • Conduct performance reviews and mentor team members on debugging, SIP analysis, and cloud automation.
  • Establish a culture of observability and ownership, with no “restart and pray” mentality.
  • Work directly with development teams (Laravel, Node.js, React) to ensure application-level readiness for scale.

Preferred Skills

  • Experience with cloud-native technologies and microservices architecture.
  • Familiarity with Agile methodologies and DevOps practices.