Talent.com
Data Engineering Manager – Web Crawling & Pipeline Architecture (7 to 12 yrs)
AIMLEAP • faridabad, haryana, in
15 hours ago
Job description

Data Engineering Manager – Web Crawling & Pipeline Architecture

Experience : 7 to 12 Years

Location : Remote / Bangalore

Engagement : Full-time

Positions : 2

Qualification : B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry : IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period : Immediate

What We Are Looking For

  • Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
  • Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
  • Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
  • Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
  • Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).
  • Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

  • Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
  • Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
  • Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
  • Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
  • Define and enforce data quality, validation, and security measures across all data flows and pipelines.
  • Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
  • Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
  • Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.
  • Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

  • Bachelor's or master's degree in Engineering, Computer Science, or a related field.
  • 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
  • Strong expertise in Python, SQL, and modern data processing practices.
  • Experience working with Airflow, Celery, or similar workflow automation tools.
  • Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
  • Hands-on experience with cloud data platforms (AWS / GCP / Azure).
  • Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
  • Strong analytical, architectural, and leadership skills.