Talent.com
Data Engineering Manager – Web Crawling & Pipeline Architecture (7 to 12 yrs)
AIMLEAP • Thane, IN
19 hours ago
Job description

Data Engineering Manager – Web Crawling & Pipeline Architecture

Experience : 7 to 12 Years

Location : Remote / Bangalore

Engagement : Full-time

Positions : 2

Qualification : B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry : IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period : Immediate

What We Are Looking For

  • Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
  • Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
  • Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
  • Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
  • Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).
  • Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

  • Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
  • Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
  • Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
  • Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
  • Define and enforce data quality, validation, and security measures across all data flows and pipelines.
  • Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
  • Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
  • Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.
  • Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

  • Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
  • 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
  • Strong expertise in Python, SQL, and modern data processing practices.
  • Experience working with Airflow, Celery, or similar workflow automation tools.
  • Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
  • Hands-on experience with cloud data platforms (AWS / GCP / Azure).
  • Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
  • Strong analytical, architectural, and leadership skills.