Technical Lead – Web Crawling Systems, Data Pipelines
AIMLEAP • gurgaon, haryana, in
3 hours ago
Job description

Experience : 7 to 12 Years

Location : Remote / Bangalore

Engagement : Full-time

Positions : 2

Qualification : B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry : IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period : Immediate

What We Are Looking For

  • Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
  • Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
  • Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
  • Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
  • Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).
  • Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

  • Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
  • Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
  • Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
  • Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
  • Define and enforce data quality, validation, and security measures across all data flows and pipelines.
  • Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
  • Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
  • Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.
  • Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

  • Bachelor's or master's degree in Engineering, Computer Science, or a related field.
  • 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
  • Strong expertise in Python, SQL, and modern data processing practices.
  • Experience working with Airflow, Celery, or similar workflow automation tools.
  • Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
  • Hands-on experience with cloud data platforms (AWS / GCP / Azure).
  • Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
  • Strong analytical, architectural, and leadership skills.