Solution Architect - (Google Cloud, Agentic GenAI, MCP)
Role summary
People Tech Group is seeking a hands-on Solution Architect to design and lead an
industrial Integrity Operating Window (IOW) AI platform for a leading refinery. You will own
the end-to-end architecture across GCP data services, event / middleware, agentic GenAI
(MCP microservices per source), and the AI & knowledge base layer (RAG over standards
like API 571 / 584 and SOPs) - delivering a secure, real-time, expert-in-the-loop system.
Location : India (preferred IST timezone)
Function : Engineering / Architecture
Type : Full-time
Responsibilities
Own the reference architecture : Translate business & SME needs into a cloud-native,
real-time architecture spanning on-prem connectors
(PI / IOW / LIMS / APM / ERP / Logbook) → MCP adapter layer → event bus → agentic AI →
dashboard & workflows
Design MCP layer (per-source microservices) : Define APIs, schemas, and contracts
for MCP-PI, MCP-IOW / LIMS, MCP-APM, MCP-ERP, MCP-Logbook; standardize on-demand
aggregates (24h / 7d / 30d / 90d), normalized events, retries, and backpressure
Evented core : Architect and tune Pub / Sub (or Kafka) topics for
deviations / alarms / actions; ensure idempotency, ordered processing where
required, and replay strategy
AI & knowledge base : Lead agentic GenAI design—Data Analysis Agent (tool use
over MCPs), Knowledge Agent (RAG over standards like API 571 / 584, vendor
manuals, SOPs), Expert-Feedback Agent (thumbs / annotations loop), Orchestrator
(GPT-class LLM)
RAG & vector search : Choose and implement vector store (Cloud SQL + pgvector,
Vertex AI Matching Engine, or Milvus); define chunking, embeddings, refresh
cadence, grounding / citations
Security & governance by design : Enforce VPC / VPN / Interconnect, IAM (least
privilege / service accounts), Cloud KMS (CMEK), Cloud Audit Logs, Data Catalog &
Lineage, DLP & data quality rules, SSO / OIDC
Data services & middleware : Select and size Cloud Run / GKE for MCPs & BFF, Cloud
Storage for docs / dumps, Memorystore for feature cache, optional BigQuery
(analytical store) without coupling runtime
Integration workflows : Orchestrate Action Tracker ↔ ERP (SAP PM / Oracle EAM) via
MCP-ERP; define SLAs, ownership, and auditability for IOW→RBI hand-offs (GE APM)
Observability & SLOs : Establish metrics, traces, logs, and SLOs (latency, freshness,
accuracy), plus runbooks for incident response
Cost & scale : Model infra + LLM cost envelopes, implement autoscaling,
request / response token budgets, and caching strategies
Delivery leadership : Produce architecture docs, ADRs, sequence diagrams; guide
developers, data / ML engineers, and MLOps; engage SMEs (operations,
corrosion / RBI) for iterative validation
Required qualifications
10+ years in distributed systems / data / AI architecture, with at least 3 years on GCP
leading production solutions
Proven design of event-driven architectures with Google Pub / Sub (or Kafka), and
microservices on Cloud Run or GKE
Strong with GCP data stack : Cloud Storage, Cloud SQL / Postgres (pgvector),
Dataflow / Data Fusion (or equivalent ETL), Cloud Monitoring, Cloud Logging, Secret
Manager, IAM, KMS, VPC
Hands-on GenAI + RAG : embeddings, vector search, grounding / citations, prompt
tooling, and safety / guardrails; experience integrating OpenAI / Vertex AI (Gemini) or
Anthropic models
MCP-style adapter design : built source-specific microservices / APIs that normalize
schemas, perform windowed aggregates, and publish events
Integration exposure to OSIsoft PI / PI Web API, IOW / LIMS, GE APM (RBI), and
SAP / Oracle ERP (REST, OData, queues, or middleware)
Practical security & compliance : IAM design, network segmentation, CMEK,
auditability, lineage; understands safety-critical environments
Excellent stakeholder skills : can translate SME input (e.g. API 571 / 584, IOW
practices) into technical decisions
Nice-to-have (preferred)
LLMOps / MLOps : Vertex Pipelines, model registries, prompt versioning, evals, A/B
tests, telemetry
Action management integrations (SAP PM / Maximo), Apigee / API gateway exposure
Time-series modeling (feature engineering, basic LSTM / TFT / Prophet, or XGBoost
classifiers for risk scoring)
Graph / causal modeling for chain-link analysis (propagation across units)
Experience hardening BFF / API layers, RBAC in UI, and expert-feedback / annotation
UX
Oil and Gas Industrial domain : Refining, IOW (knowledge of standards like API RP
571 / 584), corrosion loops, RBI workflows
Solution Architect • Hyderabad, Telangana, India