Talent.com
Architect

Milestone Technologies, Inc. • Thane, IN
6 hours ago
Job description

We’re hiring an AI Architect to design and govern the end-to-end architecture for AI solutions spanning multi-agent orchestration, knowledge-graph reasoning, retrieval-augmented generation (RAG), and evaluation at scale. You will define model strategy, reasoning patterns, data interfaces, and safety/guardrails; establish reusable blueprints; and partner with engineering, security, and data teams to deliver reliable, explainable, and cost-effective AI capabilities.

Key Responsibilities

  • AI Architecture & Patterns: Define reference architectures for LLM-powered systems (planner/reasoner, router/composer, tool/connector mesh), including typed I/O contracts and failure isolation.
  • Knowledge & Retrieval Strategy: Architect the coexistence of Knowledge Graph reasoning and RAG baselines; select embedding/vector stores, chunking, retrieval/reranking, and subgraph/query patterns.
  • Model Strategy & Routing: Establish model selection policies (small→large escalation), prompt/adapter patterns, caching, and cost/latency budgets; document routing outcomes.
  • Evaluation & Quality Gates: Design test sets and scoring rubrics (faithfulness/correctness, precision/recall, multi-hop coverage, latency); implement automated re-evaluation on any change (model/agent/prompt/data).
  • Safety & Guardrails: Specify policy-as-code, entitlement checks (role/row/column), PII/PHI handling, and content moderation; define red-team tests and jailbreak defenses.
  • Data & Interfaces: Define schemas for AI inputs/outputs; guide ontology/taxonomy alignment; ensure provenance/lineage for KG/RAG pipelines; minimize data movement.
  • Operability & Observability: Standardize tracing/logging/metrics for AI runs (call graphs, token/latency/cost); set SLOs and error budgets; partner with DevOps on CI/CD and environment promotion gates.
  • Technical Leadership: Review designs and PRs; mentor AI engineers; communicate decisions and trade-offs to stakeholders; maintain decision records and roadmaps.

Required Skills

  • Applied LLM Systems: 1+ years in ML/AI or platform architecture delivering production LLM solutions (planning/reasoning, tool use, function calling, agent ecosystems).
  • Knowledge & Retrieval: Hands-on with Knowledge Graphs (RDF/SPARQL or property graph/Gremlin) and RAG (chunking, embeddings, retrieval/reranking); practical understanding of the trade-offs between the two.
  • Model & Vector Ecosystem: Experience with at least one major model platform (Azure OpenAI, Vertex AI, Anthropic, open-weights models) and vector DBs (pgvector, Pinecone, Weaviate, Milvus), plus search (OpenSearch/Elasticsearch).
  • Evaluation Engineering: Ability to construct evaluation harnesses, design rubrics, and automate regression testing; familiarity with A/B testing and human-in-the-loop review.
  • Security-by-Design: SSO/OIDC, secrets management, least-privilege design, policy-as-code, data minimization, and auditability for AI systems.
  • Software & APIs: Strong API design (REST/gRPC), JSON Schema contracts, error taxonomies, retries/backoff/idempotency; proficiency in one or more of Python, TypeScript/Node.js, Go, or Java.
  • Observability & Reliability: OpenTelemetry or equivalent for traces/metrics/logs; resiliency patterns (circuit breakers, bulkheads, backpressure); performance tuning and cost governance.

Good to Have Skills

  • Ontology & Graph Practices: SHACL, OWL, ontology stewardship, and data quality checks; graph query optimization and subgraph extraction patterns.
  • Prompt & Tooling Ops: Prompt versioning, prompt-injection defenses, retrieval parameter tuning, and structured tool-use schemas (e.g., MCP-style adapters).
  • MLOps & Platforms: IaC (Terraform); CI/CD for models, prompts, and configs; feature flags; canary releases; experience with GPU/accelerator considerations.
  • External Tools: Designing safe, auditable action patterns and HITL approvals for external tool execution (simulation/models, document generation, ticketing).
  • Cost/Performance Analytics: Token accounting, cache strategies, and per-agent cost ceilings; dashboards for cost-per-answer and latency P50/P95 targets.
  • UX for Explainability: Collaborating on rationale/explanation UX so users understand sources, subgraphs, and model/tool decisions.