Role Purpose :
You will work with a leading technology organization focused on building highly scalable, high-performance ML model-serving infrastructure. As a Software Development Engineer in the MLOps team, you will design and develop robust systems that enable efficient, reliable, and optimized model deployment at scale.
Role Value :
As a Software Engineer (SDE III), you will drive continuous improvements across the infrastructure and applications that support the lifecycle of ML assets (data and models). You will enhance developer experience, strengthen governance capabilities, and architect a strong foundation for model deployment, serving, and optimization. You will also build efficient CI/CD pipelines for ML assets and leverage advanced compilers or hardware optimization to maximize inference performance and cost efficiency. You are encouraged to challenge existing processes and contribute to better engineering tools and practices.
Example Responsibilities :
- Upgrade ML asset management systems (models, data) to improve developer experience and governance.
- Build and optimize model-serving infrastructure with emphasis on low-latency and cost-efficient inference.
- Architect high-performance inference pipelines balancing latency, throughput, and cost across accelerator options.
- Design and implement cost-optimized, enterprise-scale ML solutions.
- Collaborate with cross-functional teams in a distributed environment for continuous system enhancement.
- Work closely with MLEs, QA Engineers, DevOps Engineers, and other stakeholders.
- Evaluate and integrate new tools, technologies, and methodologies.
- Contribute to architectural decisions for distributed ML systems.
Experience and Qualifications :
- 5+ years of experience in software engineering with Python.
- Experience with model lifecycle management tools (MLflow, Weights & Biases, or similar).
- Experience with data management ecosystems (quality, transformation, cataloging).
- Hands-on experience with ML frameworks, especially PyTorch.
- Experience optimizing ML models using hardware acceleration (AWS Neuron, ONNX, TensorRT).
- Strong experience building and operating AWS serverless architectures.
- Deep knowledge of event-driven architectures, SQS/SNS, and serverless caching solutions.
- Experience with containerization (Docker) and orchestration tools.
- Strong understanding of RESTful API design and implementation.
- Proficiency in writing clean, secure, high-quality code; familiarity with static analysis tools.
- Strong analytical, problem-solving, and communication skills (written and verbal).
- Solid understanding of Computer Science fundamentals: algorithms, problem-solving, complexity analysis.
Great to Have :
- Experience with model compilation, quantization, performance profiling, and benchmarking ML inference systems.
- Experience working in industries requiring strict compliance for cloud-native solutions.