Overview
As a Senior DevOps Engineer, you will own the AWS infrastructure and DevOps toolchain for a high-scale ad-serving system composed of asynchronous Java microservices (Akka framework).
Targets include
Responsibilities
Design & stand up AWS environments end-to-end (landing zone, VPCs, networking, security, automation).
Build immutable infrastructure and CI/CD for Java microservices (Maven/Gradle), including blue/green & canary releases and automated rollbacks.
Implement observability: metrics, logs, traces, SLOs/SLIs, alerting, on-call runbooks.
Engineer reliability & performance: autoscaling, caching layers, multi-AZ/region DR, capacity planning to support 5M+ concurrent users and p95/p99 latency goals.
Establish security-by-design: IAM least privilege, KMS/Secrets Manager, WAF/Shield, image/signing policies, CIS benchmarks.
Partner with EY developers & the Performance Test Engineer to tune JVM/Akka, thread pools, GC, and infra limits based on load-testing feedback.
Champion cost governance and tagging; produce dashboards and weekly reports.
Tech you’ll use (you don’t need every single one, but you know most)
AWS: EKS/ECS, EC2, ALB/NLB, API Gateway/Lambda, S3/CloudFront, DynamoDB/ElastiCache (Redis), Aurora/RDS, MSK/Kinesis, OpenSearch, Route 53, VPC, NAT/GW, WAF/Shield, CloudWatch/X-Ray, IAM, KMS, Secrets Manager.
IaC & CI/CD: Terraform/CloudFormation, Helm, Argo CD or Flux, GitHub Actions/Jenkins/GitLab CI, Docker.
Observability: CloudWatch, OpenTelemetry, Prometheus/Grafana, log pipelines.
Languages/Build: Bash/Python for automation; familiarity with Java build/release workflows.
What makes you a great fit
3–5+ years total experience; Senior/Manager-level depth in AWS platform engineering for high-throughput, low-latency services.
Proven ownership of production systems at 10k–1M+ concurrent users (or comparable high RPS) with 99.9x SLOs.
Hands-on with Akka/Java microservice delivery pipelines (nice if you’ve tuned JVM, GC, Akka dispatchers).
Strong grounding in scaling patterns (event-driven, async IO, caching, backpressure, rate limiting) and resilience (circuit breakers, retries, chaos).
Excellent collaboration, documentation, and stakeholder communication.
Logistics
Location: Remote (prefer India candidates)
Schedule: Must join US morning calls (Eastern Time) as needed.
Start: 1–3 weeks from offer.
Term: Through end of January (likely extension).