Talent.com
Data Engineering Manager – Web Crawling & Pipeline Architecture (7 to 12 yrs)

AIMLEAP • faridabad, haryana, in
21 hours ago
Job description

Data Engineering Manager – Web Crawling & Pipeline Architecture

Experience : 7 to 12 Years

Location : Remote / Bangalore

Engagement : Full-time

Positions : 2

Qualification : B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry : IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period : Immediate

What We Are Looking For

  • Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
  • Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
  • Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
  • Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
  • Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).
  • Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

  • Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
  • Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
  • Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
  • Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
  • Define and enforce data quality, validation, and security measures across all data flows and pipelines.
  • Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
  • Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
  • Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.
  • Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

  • Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
  • 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
  • Strong expertise in Python, SQL, and modern data processing practices.
  • Experience working with Airflow, Celery, or similar workflow automation tools.
  • Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
  • Hands-on experience with cloud data platforms (AWS / GCP / Azure).
  • Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
  • Strong analytical, architectural, and leadership skills.