Job Title : AI Infrastructure & Performance Engineer
Experience : 4 to 6 Years
Location : Remote
Job Type : Full-time / Contract
Joiners : Immediate or short notice preferred
AI Infrastructure & Performance Engineer (1 Position)
Primary Responsibilities
- Set up and manage model deployment infrastructure
- Optimize inference speed and resource utilization
- Monitor and scale AI services
- Implement security for model deployments
- Manage costs and optimize compute resource usage
Detailed Skillset
- Model Serving : Ray Serve, vLLM, optimized inference engines
- Infrastructure : Kubernetes, Docker, cloud deployments (AWS / Azure / GCP)
- Performance Optimization : Quantization, caching, batch processing
- Monitoring : Prometheus, Grafana
- Security : Secure model serving, access control, input validation
- DevOps : CI / CD pipelines, automated scaling
- Cost Management : Resource tracking and optimization
Tools & Technologies
- Kubernetes, Docker, Helm
- Ray Serve, vLLM
- Prometheus, Grafana
- AWS / Azure / GCP AI services
- Security scanning and access management tools