Talent.com
Architect

Milestone Technologies, Inc. • Thane, IN
6 hours ago
Job description

We’re hiring an AI Architect to design and govern the end-to-end architecture for AI solutions spanning multi-agent orchestration, knowledge-graph reasoning, retrieval-augmented generation (RAG), and evaluation at scale. You will define model strategy, reasoning patterns, data interfaces, and safety/guardrails; establish reusable blueprints; and partner with engineering, security, and data teams to deliver reliable, explainable, and cost-effective AI capabilities.

Key Responsibilities

  • AI Architecture & Patterns: Define reference architectures for LLM-powered systems (planner/reasoner, router/composer, tool/connector mesh), including typed I/O contracts and failure isolation.
  • Knowledge & Retrieval Strategy: Architect the coexistence of Knowledge Graph reasoning and RAG baselines; select embedding/vector stores, chunking, retrieval/reranking, and subgraph/query patterns.
  • Model Strategy & Routing: Establish model selection policies (small→large escalation), prompt/adapter patterns, caching, and cost/latency budgets; document routing outcomes.
  • Evaluation & Quality Gates: Design test sets and scoring rubrics (faithfulness/correctness, precision/recall, multi-hop coverage, latency); implement automated re-evaluation on any change (model/agent/prompt/data).
  • Safety & Guardrails: Specify policy-as-code, entitlement checks (role/row/column), PII/PHI handling, and content moderation; define red-team tests and jailbreak defenses.
  • Data & Interfaces: Define schemas for AI inputs/outputs; guide ontology/taxonomy alignment; ensure provenance/lineage for KG/RAG pipelines; minimize data movement.
  • Operability & Observability: Standardize tracing/logging/metrics for AI runs (call graphs, token/latency/cost); set SLOs and error budgets; partner with DevOps on CI/CD and environment promotion gates.
  • Technical Leadership: Review designs and PRs; mentor AI engineers; communicate decisions and trade-offs to stakeholders; maintain decision records and roadmaps.

Required Skills

  • Applied LLM Systems: 1+ years in ML/AI or platform architecture delivering production LLM solutions (planning/reasoning, tool use, function calling, agent ecosystems).
  • Knowledge & Retrieval: Hands-on with Knowledge Graphs (RDF/SPARQL or property graph/Gremlin) and RAG (chunking, embeddings, retrieval/reranking); practical understanding of the trade-offs between the two.
  • Model & Vector Ecosystem: Experience with at least one major model platform (Azure OpenAI, Vertex AI, Anthropic, open-weights models) and vector DBs (pgvector, Pinecone, Weaviate, Milvus), plus search (OpenSearch/Elasticsearch).
  • Evaluation Engineering: Ability to construct evaluation harnesses, design rubrics, and automate regression testing; familiarity with A/B testing and human-in-the-loop review.
  • Security-by-Design: SSO/OIDC, secrets management, least-privilege design, policy-as-code, data minimization, and auditability for AI systems.
  • Software & APIs: Strong API design (REST/gRPC), JSON Schema contracts, error taxonomies, retries/backoff/idempotency; proficiency in one or more of Python, TypeScript/Node.js, Go, or Java.
  • Observability & Reliability: OpenTelemetry or equivalent for traces/metrics/logs; resiliency patterns (circuit breakers, bulkheads, backpressure); performance tuning and cost governance.

Good to Have Skills

  • Ontology & Graph Practices: SHACL, OWL, ontology stewardship, and data quality checks; graph query optimization and subgraph extraction patterns.
  • Prompt & Tooling Ops: Prompt versioning, prompt-injection defenses, retrieval parameter tuning, and structured tool-use schemas (e.g., MCP-style adapters).
  • MLOps & Platforms: IaC (Terraform); CI/CD for models, prompts, and configs; feature flags; canary releases; experience with GPU/accelerator considerations.
  • External Tools: Designing safe, auditable action patterns and HITL approvals for external tool execution (simulation/models, document generation, ticketing).
  • Cost/Performance Analytics: Token accounting, cache strategies, and per-agent cost ceilings; dashboards for cost-per-answer and latency P50/P95 targets.
  • UX for Explainability: Collaborating on rationale/explanation UX so users understand sources, subgraphs, and model/tool decisions.