Talent.com
Technical Lead – Web Crawling Systems, Data Pipelines
AIMLEAP • Kolkata, IN
5 hours ago
Job description

Experience : 7 to 12 Years

Location : Remote / Bangalore

Engagement : Full-time

Positions : 2

Qualification : B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry : IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period : Immediate

What We Are Looking For

  • Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
  • Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
  • Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
  • Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
  • Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR / CCPA-safe crawling).
  • Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

  • Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
  • Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
  • Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
  • Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
  • Define and enforce data quality, validation, and security measures across all data flows and pipelines.
  • Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
  • Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
  • Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS / GCP / Azure.
  • Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

  • Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
  • 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
  • Strong expertise in Python, SQL, and modern data processing practices.
  • Experience working with Airflow, Celery, or similar workflow automation tools.
  • Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
  • Hands-on experience with cloud data platforms (AWS / GCP / Azure).
  • Experience with AI / LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
  • Strong analytical, architectural, and leadership skills.