Talent.com
Data Engineer
OWOW • India
No longer accepting applications
11 days ago
Job description

What You'll Build

Core Responsibilities

Data Architecture & Infrastructure (40%)

  • Design and implement a multi-database architecture (MongoDB, Redis, Milvus, Neo4j, BigQuery)
  • Build scalable data pipelines for real-time conversation processing and personalization
  • Architect ETL/ELT workflows for data migration from legacy systems
  • Implement data partitioning, sharding, and optimization strategies for high-throughput systems
  • Create data governance frameworks ensuring quality, security, and compliance

Vector & Graph Database Systems (25%)

  • Design and optimize Milvus vector collections for semantic search (1024-dim embeddings)
  • Build graph schemas in Neo4j for customer journey mapping and persona relationships
  • Implement HNSW indexing strategies and similarity search optimization
  • Create hybrid search systems combining vector, full-text, and graph queries
  • Monitor and tune database performance (query latency, throughput, resource utilization)

ML Data Infrastructure (20%)

  • Build data collection pipelines for LLM fine-tuning (conversation logs, tool executions)
  • Create feature stores for GNN training (customer interactions, engagement signals)
  • Implement data versioning and lineage tracking for ML experiments
  • Design A/B testing data infrastructure with CUPED variance reduction
  • Build real-time feature computation pipelines for contextual bandits

Analytics & Monitoring (15%)

  • Design BigQuery schemas for marketing analytics and performance tracking
  • Create materialized views and aggregation pipelines for real-time dashboards
  • Implement data quality monitoring and anomaly detection
  • Build observability infrastructure (Prometheus metrics, Grafana dashboards)
  • Develop cost optimization strategies for cloud data warehousing

Technical Stack You'll Work With

Databases & Storage

  • MongoDB (conversation state, active sessions)
  • Redis (caching, rate limiting, real-time data)
  • Milvus (vector embeddings, semantic search)
  • Neo4j (customer journey graphs, persona networks)
  • BigQuery (analytics warehouse, historical data)

Data Processing & Orchestration

  • Apache Airflow or Prefect (workflow orchestration)
  • Pandas, Polars (data transformation)
  • Apache Spark (optional - for large-scale processing)
  • dbt (data transformation and modeling)

ML/AI Data Pipeline

  • vLLM (LLM inference serving)
  • MLflow (model registry, experiment tracking)
  • Sentence Transformers (embedding generation)
  • PyTorch, TensorFlow (ML model training)

Cloud & Infrastructure

  • Google Cloud Platform (BigQuery, Cloud Storage, Compute)
  • Docker & Kubernetes (containerization, orchestration)
  • Terraform (infrastructure as code)
  • GitHub Actions or GitLab CI (CI/CD pipelines)

Programming & Tools

  • Python 3.10+ (primary language)
  • SQL (complex queries, query optimization)
  • Shell scripting (Bash/Zsh)
  • Git (version control)

Requirements

Must-Have Skills

  • 5+ years of data engineering experience with production systems
  • Expert-level SQL and database design skills
  • Strong Python programming (async/await, type hints, testing)
  • Experience with at least 3 different database technologies (SQL, NoSQL, Vector, Graph)
  • Proven track record building high-scale data pipelines (>1M records/day)
  • Deep understanding of data modeling (dimensional, normalized, denormalized)
  • Experience with cloud data warehouses (BigQuery, Redshift, or Snowflake)
  • Strong knowledge of data quality, validation, and governance
  • Excellent debugging and optimization skills

Highly Desirable

  • Experience with vector databases (Milvus, Pinecone, Weaviate, Qdrant)
  • Experience with graph databases (Neo4j, ArangoDB, Neptune)
  • Knowledge of embedding models and semantic search
  • Experience with ML data pipelines (feature stores, model training data)
  • Understanding of A/B testing and experimental design
  • Experience with real-time streaming (Kafka, Pub/Sub, Kinesis)
  • Knowledge of LLMs and conversational AI systems
  • Experience with data migration projects (especially large-scale)
  • Background in marketing technology or customer data platforms

Nice-to-Have

  • Experience with PyTorch Geometric or graph neural networks
  • Knowledge of marketing analytics (attribution, segmentation, personalization)
  • Familiarity with LangChain, LangGraph, or agent frameworks
  • Experience with cost optimization in cloud environments
  • Contributions to open-source data engineering projects
  • Experience with data compliance (GDPR, CCPA)

Key Projects You'll Own

Phase 1: Foundation

  • Migrate 10M+ conversation vectors from Pinecone to Milvus
  • Design and implement MongoDB schemas for real-time agent state
  • Set up Neo4j graph database with customer journey models
  • Create BigQuery data warehouse with partitioned tables

Phase 2: Optimization

  • Build automated data quality monitoring system
  • Implement caching strategies (Redis) for 10x latency reduction
  • Optimize vector search queries (target :
  • Create real-time analytics dashboards (Grafana)

Phase 3: ML Infrastructure

  • Build LLM fine-tuning data pipeline
  • Implement feature store for GNN training
  • Create A/B testing data infrastructure
  • Design multi-armed bandit state management

Work Environment

  • Collaborative team: Work with ML engineers, backend developers, and data scientists
  • Modern stack: Latest technologies and tools
  • Impact: Your work directly affects millions of marketing interactions
  • Autonomy: Own your projects end-to-end
  • Growth: Clear path to Senior/Lead/Principal roles