Talent.com
Senior Data Pipeline Architect
Aceolution • Republic Of India, IN
13 days ago
Job description

Job Title : Data Engineer – Python Expert (Freelance Role)

Location : Remote / Hybrid

Employment Type : Contract / Freelance

Role Summary

We are looking for a seasoned Senior Data Engineer to architect, build, and own the data pipelines that power our large language model (LLM) development. As a senior Individual Contributor (IC), you will be the team's expert on data ingestion, processing, and quality for all AI training.

Your primary mission is to build scalable, automated systems that transform massive, raw datasets into pristine, model-ready formats. While your focus will be on data engineering, you will also collaborate on model training runs and experiments. You're the perfect fit if you are a Python expert who thrives on solving large-scale data challenges and enjoys working at the intersection of data engineering and machine learning.

Key Responsibilities

Architect & Build : Design, develop, and own robust, scalable, and automated ETL/ELT pipelines in Python for ingesting and processing terabyte-scale text datasets.

Data Quality : Implement rigorous data cleaning, deduplication, filtering, and normalization strategies. Define and enforce data quality standards to ensure the highest integrity for model training.

Data Transformation : Efficiently structure and format diverse datasets (JSON, Parquet, etc.) for consumption by LLM training frameworks.

Collaboration : Work closely with our team of AI researchers and ML engineers to understand data requirements, define metrics, and support the model training lifecycle.

Optimization : Continuously optimize data processing workflows for speed, cost, and reliability.

ML Support (Secondary) : Occasionally assist in launching, monitoring, and debugging data-related issues during model training runs.

Required Qualifications

8+ years of professional experience in data engineering, data processing, or backend software engineering.

Expert-level proficiency in Python and its data ecosystem (e.g., Pandas, NumPy, Dask, Polars).

Proven experience building and maintaining large-scale data pipelines.

Deep understanding of data structures, data modeling, and software engineering best practices (Git, CI/CD, testing).

Experience handling and parsing diverse data formats (JSON, CSV, XML, Parquet) at scale.

Excellent problem-solving skills and meticulous attention to detail.

Strong communication and collaboration skills, with experience working in a team environment.

Preferred Qualifications (Nice-to-Haves)

Hands-on experience with the data preprocessing pipeline for an LLM (e.g., LLaMA, BERT, GPT-family).

Strong experience with big data frameworks like Apache Spark or Ray.

Experience with Hugging Face libraries (Transformers, Datasets, Tokenizers).

Familiarity with ML frameworks like PyTorch or TensorFlow.

Proficiency with cloud platforms (AWS, GCP, Azure) and their data/storage services.

Why Join Us

  • Opportunity to lead cutting-edge AI and ML projects.
  • Collaborative and innovative team culture.
  • Competitive compensation with continuous learning opportunities.

📩 If you are interested, please send your updated CV to sharmila@aceolution.com along with your expected hourly rate.
