Role Overview
We are looking for a highly skilled AWS Solution Architect with deep experience in designing and optimizing scalable, compute-intensive, event-driven architectures.
Key Responsibilities
1. Architecture & System Modernization
- Own the redesign of the Truscan backend architecture:
  - Multi-stage compute pipeline (Stage 1–4)
  - High-performance Python workers (Blender, Open3D, CV, ML pipelines)
  - RabbitMQ or AWS-native event orchestration
- Build an event-driven, scalable system architecture leveraging:
  - AWS ECS Fargate / EKS for isolated compute
  - AWS Lambda where applicable
  - AWS Step Functions for orchestration
  - AWS SQS / SNS as queueing alternatives to RabbitMQ
- Introduce horizontal scaling patterns for heavy Blender / Python workloads.
2. Performance Engineering
- Analyze slow-running compute code and provide architectural solutions:
  - Worker isolation
  - Parallel multi-processing
  - Concurrency throttling
  - Memory and CPU right-sizing
- Recommend improvements in Python, Blender, and Open3D runtimes to reduce Stage 3 times (goal: 50% reduction).
3. DevOps & CI / CD Engineering
- Establish full CI / CD pipelines using:
  - AWS CodePipeline / GitHub Actions
  - Dockerized environments for Python compute
  - Automated deployments for Java Spring Boot + Python services
- Implement AWS-native observability:
  - CloudWatch Logs
  - Application Insights
  - AWS X-Ray for traceability
  - Prometheus / Grafana for performance dashboards (optional)
4. Security, Governance & Best Practices
- Harden the compute platform with AWS standards:
  - VPC design (private subnets, NAT, security groups)
  - Secrets Manager / Parameter Store
  - IAM roles with least-privilege access
- Migrate static configs to AWS-managed secure storage.
5. Reliability, Fault Tolerance & Scaling
- Design fault-tolerant workers for handling:
  - Failed scans
  - Retries
  - Idempotent processing
  - Large file handling (USDZ, PNG, GLB)
- Implement S3 tiered storage & lifecycle policies for compute data.
6. Collaboration & Leadership
- Work directly with Java, Python, ML, and DevOps teams.
- Translate business performance goals (SLA improvement, throughput increase) into concrete technical plans.
- Provide sprint planning guidance and architectural documentation.
Required Skills & Experience
1. Core AWS Expertise
Must-have production experience with:
- ECS (Fargate), EKS, EC2 Auto Scaling
- S3, Lambda, API Gateway
- Step Functions
- SQS / SNS
- CloudWatch, X-Ray, CloudTrail
- VPC, Subnets, NAT, SGs
- AWS Secrets Manager / Parameter Store
- IAM design
2. High-Performance Compute Experience
- Expert in running compute-heavy workloads (3D processing, ML workflows, or large image pipelines).
- Experience with containerized Blender / Open3D, GPU / CPU optimization, or similar pipelines.
3. Strong Python & Java Background
- Enough understanding to review and guide improvements in:
  - Python multi-processing
  - Asynchronous compute workers
  - Java Spring Boot schedulers, producers, queue consumers
  - File-locking, message-driven architecture
4. DevOps & Automation
- Docker, GitHub Actions, AWS CodeBuild, CodePipeline
- Infrastructure-as-Code: CloudFormation or Terraform
- Monitoring & alerting design
5. Architecture Skills
- Event-driven design patterns
- Microservices architecture
- Race-condition and concurrency safety
- Distributed locks, queues, idempotent consumers
- Circuit breaker, retry strategies