Talent.com
Data Engineer
No longer accepting applications
OWOW • Anand, IN
30+ days ago
Job description

What You'll Build

Core Responsibilities

Data Architecture & Infrastructure (40%)

  • Design and implement a multi-database architecture (MongoDB, Redis, Milvus, Neo4j, BigQuery)
  • Build scalable data pipelines for real-time conversation processing and personalization
  • Architect ETL/ELT workflows for data migration from legacy systems
  • Implement data partitioning, sharding, and optimization strategies for high-throughput systems
  • Create data governance frameworks ensuring quality, security, and compliance

Vector & Graph Database Systems (25%)

  • Design and optimize Milvus vector collections for semantic search (1024-dim embeddings)
  • Build graph schemas in Neo4j for customer journey mapping and persona relationships
  • Implement HNSW indexing strategies and similarity search optimization
  • Create hybrid search systems combining vector, full-text, and graph queries
  • Monitor and tune database performance (query latency, throughput, resource utilization)

ML Data Infrastructure (20%)

  • Build data collection pipelines for LLM fine-tuning (conversation logs, tool executions)
  • Create feature stores for GNN training (customer interactions, engagement signals)
  • Implement data versioning and lineage tracking for ML experiments
  • Design A/B testing data infrastructure with CUPED variance reduction
  • Build real-time feature computation pipelines for contextual bandits

Analytics & Monitoring (15%)

  • Design BigQuery schemas for marketing analytics and performance tracking
  • Create materialized views and aggregation pipelines for real-time dashboards
  • Implement data quality monitoring and anomaly detection
  • Build observability infrastructure (Prometheus metrics, Grafana dashboards)
  • Develop cost optimization strategies for cloud data warehousing

Technical Stack You'll Work With

Databases & Storage

  • MongoDB (conversation state, active sessions)
  • Redis (caching, rate limiting, real-time data)
  • Milvus (vector embeddings, semantic search)
  • Neo4j (customer journey graphs, persona networks)
  • BigQuery (analytics warehouse, historical data)

Data Processing & Orchestration

  • Apache Airflow or Prefect (workflow orchestration)
  • Pandas, Polars (data transformation)
  • Apache Spark (optional - for large-scale processing)
  • dbt (data transformation and modeling)

ML/AI Data Pipeline

  • vLLM (LLM inference serving)
  • MLflow (model registry, experiment tracking)
  • Sentence Transformers (embedding generation)
  • PyTorch, TensorFlow (ML model training)

Cloud & Infrastructure

  • Google Cloud Platform (BigQuery, Cloud Storage, Compute)
  • Docker & Kubernetes (containerization, orchestration)
  • Terraform (infrastructure as code)
  • GitHub Actions or GitLab CI (CI/CD pipelines)

Programming & Tools

  • Python 3.10+ (primary language)
  • SQL (complex queries, query optimization)
  • Shell scripting (Bash/Zsh)
  • Git (version control)

Requirements

Must-Have Skills

  • 5+ years of data engineering experience with production systems
  • Expert-level SQL and database design skills
  • Strong Python programming (async/await, type hints, testing)
  • Experience with at least 3 different database technologies (SQL, NoSQL, Vector, Graph)
  • Proven track record building high-scale data pipelines (>1M records/day)
  • Deep understanding of data modeling (dimensional, normalized, denormalized)
  • Experience with cloud data warehouses (BigQuery, Redshift, or Snowflake)
  • Strong knowledge of data quality, validation, and governance
  • Excellent debugging and optimization skills

Highly Desirable

  • Experience with vector databases (Milvus, Pinecone, Weaviate, Qdrant)
  • Experience with graph databases (Neo4j, ArangoDB, Neptune)
  • Knowledge of embedding models and semantic search
  • Experience with ML data pipelines (feature stores, model training data)
  • Understanding of A/B testing and experimental design
  • Experience with real-time streaming (Kafka, Pub/Sub, Kinesis)
  • Knowledge of LLMs and conversational AI systems
  • Experience with data migration projects (especially large-scale)
  • Background in marketing technology or customer data platforms

Nice-to-Have

  • Experience with PyTorch Geometric or graph neural networks
  • Knowledge of marketing analytics (attribution, segmentation, personalization)
  • Familiarity with LangChain, LangGraph, or agent frameworks
  • Experience with cost optimization in cloud environments
  • Contributions to open-source data engineering projects
  • Experience with data compliance (GDPR, CCPA)

Key Projects You'll Own

Phase 1: Foundation

  • Migrate 10M+ conversation vectors from Pinecone to Milvus
  • Design and implement MongoDB schemas for real-time agent state
  • Set up Neo4j graph database with customer journey models
  • Create BigQuery data warehouse with partitioned tables

Phase 2: Optimization

  • Build automated data quality monitoring system
  • Implement caching strategies (Redis) for 10x latency reduction
  • Optimize vector search queries (target :
  • Create real-time analytics dashboards (Grafana)

Phase 3: ML Infrastructure

  • Build LLM fine-tuning data pipeline
  • Implement feature store for GNN training
  • Create A/B testing data infrastructure
  • Design multi-armed bandit state management

Work Environment

  • Collaborative team: Work with ML engineers, backend developers, and data scientists
  • Modern stack: Latest technologies and tools
  • Impact: Your work directly affects millions of marketing interactions
  • Autonomy: Own your projects end-to-end
  • Growth: Clear path to Senior/Lead/Principal roles