Talent.com
Data Engineering Manager – Web Crawling & Pipeline Architecture (7 to 12 yrs)
No longer accepting applications
AIMLEAP • panchkula, haryana, in
3 days ago
Job description

Data Engineering Manager – Web Crawling & Pipeline Architecture

Experience : 7 to 12 Years

Location : Remote / Bangalore

Engagement : Full-time

Positions : 2

Qualification : B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry : IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period : Immediate

What We Are Looking For

  • Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
  • Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
  • Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
  • Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
  • Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).
  • Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

  • Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
  • Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
  • Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
  • Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
  • Define and enforce data quality, validation, and security measures across all data flows and pipelines.
  • Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
  • Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
  • Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.
  • Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

  • Bachelor's or master's degree in Engineering, Computer Science, or a related field.
  • 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
  • Strong expertise in Python, SQL, and modern data processing practices.
  • Experience working with Airflow, Celery, or similar workflow automation tools.
  • Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
  • Hands-on experience with cloud data platforms (AWS / GCP / Azure).
  • Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
  • Strong analytical, architectural, and leadership skills.