We’re hiring a Senior AI Engineer to build production-grade components for an AI-first, data-centric platform. You will implement agentic capabilities (intent, planner, router/composer), integrate knowledge-graph reasoning alongside a strong RAG baseline, and instrument robust evaluation and observability. The ideal candidate writes clean, reliable code, understands LLM systems and data-retrieval trade-offs, and can optimize for latency, quality, and cost.
Key Responsibilities
- Agent Implementation: Build and harden Intent, Planner, and Router/Composer agents with typed JSON I/O, retries/timeouts, and idempotency; emit call-graph traces and correlation IDs (see the agent sketch after this list).
- Knowledge-Graph Reasoning: Generate correct graph queries (SPARQL/Gremlin/PGQL) from planner outputs; perform subgraph extraction; encode rationale and references in responses.
- RAG Baseline & Retrieval: Implement document prep, chunking/embeddings, hybrid retrieval, and (where available) reranking; maintain a high-quality baseline path for side-by-side comparisons (see the retrieval sketch after this list).
- Prompt/Config Tuning: Version and tune prompts, routing policies (small→large model escalation), temperature/top-p settings, and caching; document routing outcomes and cost/latency budgets (see the routing sketch after this list).
- Evaluation Hooks: Integrate test sets and scoring (faithfulness/correctness, precision/recall, multi-hop coverage, latency); enable automated re-evaluation on any change (model/agent/prompt/data).
- Observability & Cost Controls: Instrument traces/metrics/logs (token usage, latency P50/P95, error codes); surface cost-per-answer dashboards; implement backpressure and graceful degradation (see the instrumentation sketch after this list).
- Security & Guardrails: Enforce policy-as-code and entitlement checks (role/row/column), PII/PHI handling, content moderation, and HITL approval prompts for state-changing actions.
- Quality & CI/CD: Write unit/integration/contract tests; participate in PR reviews; ship via CI/CD with feature flags and environment promotion; maintain API/connector schemas and docs.
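
To illustrate the agent plumbing described above, here is a minimal Python sketch of typed JSON I/O with bounded retries, a timeout, and an idempotency key. The names (`PlannerRequest`, `PlannerResult`, the injected `llm_call`) are hypothetical placeholders, not an existing codebase or a prescribed implementation.

```python
# Minimal sketch (hypothetical names): a planner call with typed JSON I/O,
# bounded retries with exponential backoff, a timeout, and an idempotency key.
import hashlib
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class PlannerRequest:
    intent: str
    entities: list[str]
    correlation_id: str

@dataclass
class PlannerResult:
    steps: list[str]
    rationale: str

def idempotency_key(req: PlannerRequest) -> str:
    # Same payload -> same key, so duplicate submissions can be de-duplicated downstream.
    return hashlib.sha256(json.dumps(asdict(req), sort_keys=True).encode()).hexdigest()

def call_planner(req: PlannerRequest, llm_call, max_attempts: int = 3, timeout_s: float = 30.0) -> PlannerResult:
    """llm_call is an injected function that returns a raw JSON string for the planner prompt."""
    last_err: Exception | None = None
    for attempt in range(max_attempts):
        try:
            raw = llm_call(asdict(req), idempotency_key=idempotency_key(req), timeout=timeout_s)
            data = json.loads(raw)  # reject non-JSON output early
            return PlannerResult(steps=list(data["steps"]), rationale=str(data.get("rationale", "")))
        except (json.JSONDecodeError, KeyError, TimeoutError) as err:
            last_err = err
            time.sleep(2 ** attempt)  # exponential backoff before retrying
    raise RuntimeError(f"planner failed after {max_attempts} attempts") from last_err
```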
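
For the hybrid-retrieval bullet, a small sketch of Reciprocal Rank Fusion over a keyword ranking and a vector ranking. The input id lists (`keyword_ranked`, `vector_ranked`) are assumed to come from the search index and vector store; reranking is only indicated in a comment.

```python
# Sketch of hybrid retrieval: fuse a keyword (BM25-style) ranking and a vector ranking
# with Reciprocal Rank Fusion; inputs are lists of document ids, best first.
from collections import defaultdict

def reciprocal_rank_fusion(keyword_ranked: list[str], vector_ranked: list[str], k: int = 60) -> list[str]:
    scores: dict[str, float] = defaultdict(float)
    for ranking in (keyword_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1.0 / (k + rank + 1)  # standard RRF term
    return sorted(scores, key=scores.get, reverse=True)

# Usage: candidates = reciprocal_rank_fusion(bm25_ids, ann_ids)[:20]
# then pass `candidates` to a cross-encoder reranker where one is available.
```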
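
For prompt/config tuning, one possible shape of a small→large escalation policy. The thresholds, budget figure, and confidence signal are illustrative assumptions, not a mandated policy.

```python
# Hypothetical routing policy: answer with the small model first, escalate only when
# its confidence is low and the per-request cost budget still has headroom.
from dataclasses import dataclass

@dataclass
class RouteDecision:
    model: str
    escalated: bool

def route(small_answer_confidence: float, spent_usd: float,
          budget_usd: float = 0.02, confidence_floor: float = 0.7) -> RouteDecision:
    if small_answer_confidence >= confidence_floor:
        return RouteDecision(model="small", escalated=False)
    if spent_usd >= budget_usd:
        # Budget exhausted: degrade gracefully instead of escalating.
        return RouteDecision(model="small", escalated=False)
    return RouteDecision(model="large", escalated=True)
```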
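
For observability, a sketch using the OpenTelemetry Python API to attach token and latency data to each answer. Attribute and metric names are placeholders, the `generate` callable is an assumed injected LLM call, and exporter/SDK configuration is omitted.

```python
# Sketch of per-call instrumentation with OpenTelemetry: one span per answer,
# with token usage and latency recorded as span attributes and metrics.
import time
from opentelemetry import trace, metrics

tracer = trace.get_tracer("answer-service")
meter = metrics.get_meter("answer-service")
token_counter = meter.create_counter("llm.tokens", unit="token")
latency_hist = meter.create_histogram("answer.latency", unit="ms")

def answer_with_telemetry(question: str, correlation_id: str, generate) -> str:
    with tracer.start_as_current_span("answer") as span:
        span.set_attribute("correlation_id", correlation_id)
        start = time.monotonic()
        text, prompt_tokens, completion_tokens = generate(question)  # injected LLM call
        elapsed_ms = (time.monotonic() - start) * 1000.0
        span.set_attribute("llm.prompt_tokens", prompt_tokens)
        span.set_attribute("llm.completion_tokens", completion_tokens)
        token_counter.add(prompt_tokens + completion_tokens, {"kind": "total"})
        latency_hist.record(elapsed_ms)
        return text
```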
Required Skills
- Applied LLM Engineering: 1-2+ years building production services; hands-on with LLM tool/function-calling, agent frameworks, and prompt/version management.
- Knowledge & Retrieval: Practical experience with Knowledge Graphs (RDF/SPARQL or property graph/Gremlin) and RAG pipelines (chunking, embeddings, retrieval/reranking).
- Data/Model Ecosystem: One or more vector DBs (pgvector, Pinecone, Weaviate, Milvus) and search (OpenSearch/Elasticsearch); familiarity with major model platforms (Azure OpenAI, Vertex, Anthropic, open-weights).
- Backend Skills: Proficiency in Python and/or TypeScript/Node.js; strong REST/gRPC API design, JSON Schema/OpenAPI, retries/backoff/idempotency, and error taxonomies.
- Observability & Reliability: OpenTelemetry (traces/metrics/logs), performance profiling, resiliency patterns (circuit breakers, bulkheads, DLQ/queues).
- Security by Design: OIDC/SSO, secrets management, least-privilege access, audit logging, and secure coding for AI/data services.
- CI/CD & Testing: Git-based workflows, automated pipelines, unit/integration/contract tests, and environment promotion practices.
Good to Have Skills
- Ontology & Data Quality: SHACL/OWL basics, ontology stewardship, lineage/provenance capture, and data quality checks for KG/RAG pipelines.
- Evaluation Engineering: Judge-model setups, A/B testing, rubric design, and regression dashboards.
- Performance & FinOps: Async I/O, caching strategies, connection pooling, and token/runtime budget enforcement.
- Runtime & Platform: Containers/Kubernetes, service mesh/API gateways, feature flags, blue/green or canary releases.
- UX for Explainability: Collaborating on rationale/explanations (source lists, subgraph summaries) and clear HITL approval prompts.
This role is ideal for a hands-on engineer who enjoys turning advanced reasoning patterns into robust, observable services, balancing quality, safety, and cost at enterprise scale.