Talent.com
Data Engineering Manager – Web Crawling & Pipeline Architecture (7 to 12 yrs)

AIMLEAP • Panchkula, Haryana, India
6 hours ago
Job description

Data Engineering Manager – Web Crawling & Pipeline Architecture

Experience: 7 to 12 Years

Location: Remote / Bangalore

Engagement: Full-time

Positions: 2

Qualification: B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry: IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period: Immediate

What We Are Looking For

Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.

Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.

Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.

Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.

Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).

Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.

Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.

Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.

Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.

Define and enforce data quality, validation, and security measures across all data flows and pipelines.

Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.

Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.

Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.

Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

Bachelor's or Master's degree in Engineering, Computer Science, or a related field.

7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.

Strong expertise in Python, SQL, and modern data processing practices.

Experience working with Airflow, Celery, or similar workflow automation tools.

Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.

Hands-on experience with cloud data platforms (AWS / GCP / Azure).

Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).

Strong analytical, architectural, and leadership skills.
