About TrueFoundry
- TrueFoundry is an enterprise-grade AI / ML platform that accelerates the development, deployment, and scaling of GenAI and ML applications with security, cost efficiency, and cross-cloud flexibility. It provides a unified gateway to manage and route workloads, enabling secure access, governance, and observability across models, APIs, and environments.
- We’re now scaling our Enterprise Outcomes motion, a strategic arm focused on delivering domain-specific solutions that drive business transformation and shape our product roadmap.
Role summary
You’ll design and own core components that enable enterprise customers to run production agentic AI safely and efficiently on TrueFoundry. This includes building robust orchestration for multi-step agents (graph / stateful workflows), model / routing logic, observability and policy enforcement (cost, data residency, rate limiting), and integrating upstream tooling like LangGraph, LangChain, vector stores, and specialized LLM runtimes.
What you’ll do
- Solve some of the most complex engineering problems and drive them alongside a team of engineers and ML researchers.
- Build a deep, holistic understanding of the TrueFoundry platform across all components and shape the product vision and implementation.
- Act as the technical face of engineering for customer-related discussions and escalations.
- Guide and unblock engineers across projects in the US region.
- Partner closely to drive system design, architecture, and implementation of complex products.
- Lead technical design, critical customer problem-solving, and platform scalability initiatives end-to-end.
This is a high-ownership, high-impact role designed for an engineer who loves combining world-class systems thinking with real-world execution.
Must-have
- 4+ years of strong backend / systems engineering experience at top technology companies or startups
- Deep expertise in distributed systems, cloud-native architectures, and scalable system design
- Strong working knowledge of Kubernetes, containerized workloads, and infrastructure engineering
- Practical experience building or deploying ML / GenAI applications (or closely working with ML / DS teams)
- Skilled in programming languages such as Python, Go, or TypeScript
- Solid understanding of system observability, resiliency design, and SRE practices
- Strong technical leadership and communication skills, with the ability to work with both customers and engineering teams
- Ability to think strategically while also executing hands-on when required