Talent.com
Freelance Deep Web Crawler Engineer (AI-Integrated Data Pipeline)
No longer accepting applications
Sixteen Alpha AI • Sangli, Maharashtra, India
14 days ago
Job description

About the Project

We’re developing a next-generation intelligent web crawling system capable of exploring deep and dynamic web data sources, including sites behind authentication, pages with infinite scroll, and JavaScript-heavy pages.

The crawler will be integrated with an AI-driven pipeline for automated data understanding, classification, and transformation.

We’re looking for a highly experienced engineer who has previously built large-scale, distributed crawling frameworks and integrated AI or NLP / LLM-based components for contextual data extraction.

Key Responsibilities

  • Design, develop, and deploy scalable deep web crawlers capable of bypassing common anti-bot mechanisms.
  • Implement AI-integrated pipelines for data processing, entity extraction, and semantic categorization.
  • Develop dynamic scraping systems for sites that rely on JavaScript, infinite scrolling, or APIs.
  • Integrate with vector databases, LLM-based data labeling, or automated content enrichment modules.
  • Optimize crawling logic for speed, reliability, and stealth across distributed environments.
  • Collaborate on data pipeline orchestration using tools like Airflow, Prefect, or custom async architectures.

Required Expertise

  • Proven experience building deep or dark web crawlers (Playwright, Scrapy, Puppeteer, or custom async frameworks).
  • Strong understanding of browser automation, session management, and anti-detection mechanisms.
  • Experience integrating AI / ML / NLP pipelines — e.g., text classification, entity recognition, or embedding-based similarity.
  • Skilled in asynchronous Python (asyncio, aiohttp, Playwright async API).
  • Familiar with database and pipeline systems — PostgreSQL, MongoDB, Elasticsearch, or similar.
  • Ability to design robust data flows that connect crawling → AI inference → storage / visualization.

Nice to Have

  • Knowledge of LLMs (OpenAI, Hugging Face, LangChain, or custom fine-tuned models).
  • Experience with data cleaning, deduplication, and normalization pipelines.
  • Familiarity with distributed crawling frameworks (Ray, Celery, Kafka).
  • Prior experience integrating real-time analytics dashboards or monitoring tools.

What We Offer

  • Competitive freelance pay based on expertise and delivery.
  • Flexible, async-first remote collaboration.
  • Opportunity to shape an AI-first data platform from the ground up.
  • Potential for long-term partnership if the collaboration is successful.