Data Engineer – AI-Powered Marketing Personalization Platform
We’re seeking an experienced Data Engineer to help build and scale our next-generation AI-powered marketing personalization platform (V2.0). You’ll design and implement a robust multi-database infrastructure that enables real-time personalization, vector search, graph analytics, and large-scale data processing.
This is a greenfield opportunity to architect data pipelines from the ground up using vector and graph databases and LLM-based systems. You’ll play a key role in migrating our existing platform while creating a scalable foundation that powers AI agents across thousands of marketing campaigns.
Core Responsibilities
Data Architecture & Infrastructure (40%)
Design and implement multi-database systems (MongoDB, Redis, Milvus, Neo4j, BigQuery)
Build scalable real-time pipelines and ETL/ELT workflows
Implement data governance, quality, and high-throughput optimization
Vector & Graph Systems (25%)
Optimize Milvus collections for semantic search
Design Neo4j schemas for customer journeys and relationships
Develop hybrid search (vector + graph + text) with performance tuning
ML Data Infrastructure (20%)
Build data pipelines for LLM fine-tuning and GNN training
Manage data versioning, lineage, and A/B testing systems
Implement real-time feature computation for contextual models
Analytics & Monitoring (15%)
Create BigQuery schemas and real-time dashboards
Implement observability (Prometheus, Grafana) and anomaly detection
Optimize data costs across cloud environments
Tech Stack
Databases: MongoDB, Redis, Milvus, Neo4j, BigQuery
Processing: Airflow/Prefect, Pandas/Polars, dbt, Spark
ML Pipeline: vLLM, MLflow, Sentence Transformers, PyTorch, TensorFlow
Cloud & Infra: GCP, Docker, Kubernetes, Terraform, GitHub Actions
Languages: Python (3.10+), SQL, Bash
Requirements
Must-Have
5+ years of data engineering experience in production systems
Advanced Python and SQL expertise
Experience with 3+ database types (SQL, NoSQL, Vector, Graph)
Proven ability to build high-scale data pipelines (>1M records/day)
Strong data modeling, validation, and optimization skills
Experience with cloud data warehouses (BigQuery, Redshift, or Snowflake)
Preferred
Experience with Milvus, Pinecone, or Weaviate
Graph databases (Neo4j, Neptune) and embedding-based search
ML/feature store experience
Background in marketing technology or CDPs
Key Projects
Phase 1 – Foundation: Migrate 10M+ vectors; implement MongoDB schemas, Neo4j models, and the BigQuery warehouse
Phase 2 – Optimization: Build data quality monitoring and caching (Redis)
Phase 3 – ML Infrastructure: Create LLM fine-tuning pipelines, GNN feature stores, and A/B testing systems
Why Join Us
Collaborate with ML engineers and data scientists on cutting-edge AI systems
High ownership and real product impact
Modern tools and a flexible environment
Clear growth path to Senior, Lead, or Principal roles
Shape the future of AI-driven marketing personalization
Data Engineer • Pushkar, Rajasthan, India