Talent.com
Data Platform Engineer

BharatGen, Gandhinagar, IN
18 hours ago
Job description

Job Summary:

BharatGen is on a mission to create AI that truly represents the diversity, culture, and unique context of India. At the heart of this mission lies the need for robust, scalable infrastructure to build the multilingual and multimodal datasets that power foundational AI models. We’re seeking a skilled Data Platform Engineer to build the tools, platforms, and pipelines for processing these datasets at scale.

In this role, you will build scalable data pipelines to ingest, transform, and prepare data from diverse sources—text, speech, images, and video—making it ready for Generative AI model training. Your work will involve developing and managing the underlying platform while addressing challenges like governance, security, observability, lineage, and scalability. The outcomes of your work will include efficient tools for data processing, a reliable data platform, and high-quality datasets tailored to the evolving needs of large-scale AI and LLM training.

Collaborating closely with researchers and ML engineers, you will play a pivotal role in enabling BharatGen to deliver state-of-the-art AI models, contributing to the advancement of India’s AI ecosystem through innovative data engineering solutions.

Key Responsibilities:

  • Design and Build Scalable Platforms: Develop distributed infrastructure for ingesting, processing, and transforming diverse datasets (text, speech, images, video) at terabyte to petabyte scale.
  • Develop Robust Data Pipelines: Create reliable, scalable pipelines to prepare datasets for Generative AI and LLM training.
  • Implement Governance and Observability: Build frameworks for data lineage, monitoring, and access control to ensure data quality and operational reliability.
  • Optimize Performance and Cost: Enhance platform performance and resource utilization using cost-effective strategies, including GPU-accelerated preprocessing.
  • Collaborate and Innovate: Work closely with researchers and ML engineers to adapt platforms and data pipelines to evolving LLM requirements, addressing various data challenges.
  • Drive Innovation: Stay updated on emerging tools, frameworks, and best practices to implement cutting-edge solutions for large-scale dataset creation.

Minimum Qualifications and Experience:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field with 3+ years of industry experience.

Required Skills:

  • Proficiency in distributed systems and frameworks (e.g., Kafka, Ray, PySpark) for scalable data workflows.
  • Exposure to end-to-end data lifecycle management, including DataOps.
  • Strong programming skills in Python, Scala, or Go, with a focus on high-performance pipeline development.
  • Experience with building and optimizing data pipelines, including ETL processes, data modeling, and integration into scalable workflows.
  • Expertise in data scraping, crawling frameworks, and modern dataset development techniques such as synthetic data generation.
  • Experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Docker, Kubernetes).
  • Deep understanding of data platform design, including data architecture, metadata tracking, data lineage, observability, monitoring, and scalability best practices.
  • Familiarity with Infrastructure-as-Code tools (e.g., Terraform, CloudFormation), CI/CD pipelines, relational/NoSQL databases, and GPU-accelerated workflows.
  • Familiarity with visualization and monitoring tools for lifecycle management and pipeline performance tracking.
  • Expertise in managing unstructured data (text, speech, or multimodal datasets) for high-performance use cases, ideally in the context of LLM/AI datasets.
  • Understanding of challenges in scalable data engineering, including ingestion, transformation, and storage optimization for large-scale accelerated workflows.