Your mission: build and maintain a secure, automated, and observable AWS foundation so engineers can ship faster, safer, and cheaper. You’ll be the owner of deployment velocity, system uptime, and cloud cost sanity across our ECS-based microservices.
What You’ll Own
1. Platform Reliability
Design and maintain ECS clusters (Fargate / EC2) for multi-service workloads.
Implement autoscaling, health checks, and blue / green rollouts for zero-downtime deployments.
Build observability into everything — logs, metrics, traces — to shorten MTTR.
2. Delivery Automation
Architect and maintain CI / CD pipelines using GitHub Actions + CodePipeline / CodeBuild.
Enforce testing, security scanning, and deployment gates as part of every release.
Move from semi-manual deploys to fully automated pipelines across environments.
3. Network & Security
Manage VPC architectures (subnets, routing, gateways, VPN, endpoints).
Handle Route 53 for internal / external DNS, SSL / TLS, health checks, and routing policies.
Maintain multi-account setup with IAM least privilege, KMS encryption, and security baselines.
4. Infrastructure as Code
Define all infra in Terraform / CDK; no console drift.
Use reviewed IaC changes and separate environments for repeatable, compliant infrastructure.
5. Data Layer Operations
Operate and optimize ClickHouse and PostgreSQL clusters — backups, replication, partitioning, and tuning.
Ensure RTO / RPO objectives are met and documented.
6. Monitoring & Debugging
Aggregate logs (CloudWatch, FireLens, OpenTelemetry).
Build dashboards and alerts that highlight anomalies, not noise.
Lead root-cause investigations across network, container, and app layers.
Core Tech Stack
AWS: ECS (Fargate / EC2), EC2, S3, VPC, Route 53, CloudWatch, CodePipeline, CodeBuild
CI / CD: GitHub Actions, Docker, Terraform / CDK
Databases: ClickHouse, PostgreSQL
Languages / frameworks (a plus): Python (FastAPI), Node.js
Networking: DNS, VPN, load balancers, PrivateLink, VPC peering, NAT, IGW
Security: Multi-account strategy, IAM roles / policies, KMS, AWS Config, GuardDuty
Requirements
5+ years running production workloads on AWS.
Deep knowledge of ECS, CodePipeline, EC2 / VPC, S3, and Docker.
Proven track record of shipping secure automated deployments.
Strong understanding of networking and DNS fundamentals.
Experience managing databases in production.
Strong debugging and observability mindset.
Clear written communication and operational discipline.
Nice to Have
Familiarity with FastAPI or Node.js applications to optimize deployment flows.
Hands-on experience with cost optimization and cross-account automation (AWS Organizations, Control Tower).
Experience setting up VPNs, bastion hosts, or SSO integration.
What Success Looks Like
✅ All ECS services deployed via automated pipelines.
✅ CloudWatch dashboards and alerts in place for core systems.
✅ Verified ClickHouse and PostgreSQL backups / restores.
✅ Documented multi-account / VPC network topology.
✅ No manual deploys, no console changes.
Why This Role Matters
This role defines the foundation for everything we build. The more you automate, the faster teams deliver.
You’ll directly impact uptime, developer productivity, and cloud spend — three metrics that define operational excellence.
Senior Engineer AWS • Kurnool, Andhra Pradesh, India