Talent.com
Data Engineering Manager – Web Crawling & Pipeline Architecture (2 to 7yrs)
No longer accepting applications
AIMLEAP • guwahati, assam, in
4 days ago
Job description

Data Engineering Manager – Web Crawling & Pipeline Architecture

Experience : 7 to 12 Years

Location : Remote / Bangalore

Engagement : Full-time

Positions : 2

Qualification : B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry : IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period : Immediate

What We Are Looking For

  • Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
  • Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
  • Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
  • Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
  • Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).
  • Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

  • Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
  • Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
  • Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
  • Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
  • Define and enforce data quality, validation, and security measures across all data flows and pipelines.
  • Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
  • Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
  • Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.
  • Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

  • Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
  • 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
  • Strong expertise in Python, SQL, and modern data processing practices.
  • Experience working with Airflow, Celery, or similar workflow automation tools.
  • Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
  • Hands-on experience with cloud data platforms (AWS / GCP / Azure).
  • Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
  • Strong analytical, architectural, and leadership skills.