Talent.com
Freelance Deep Web Crawler Engineer (AI-Integrated Data Pipeline)
Sixteen Alpha AI • ludhiana, punjab, in
11 days ago
Job description

About the Project

We’re developing a next-generation intelligent web crawling system capable of exploring deep and dynamic web data sources — including sites behind authentication, infinite scrolls, and JavaScript-heavy pages.

The crawler will be integrated with an AI-driven pipeline for automated data understanding, classification, and transformation.

We’re looking for a highly experienced engineer who has previously built large-scale, distributed crawling frameworks and integrated AI or NLP / LLM-based components for contextual data extraction.

Key Responsibilities

  • Design, develop, and deploy scalable deep web crawlers capable of bypassing common anti-bot mechanisms.
  • Implement AI-integrated pipelines for data processing, entity extraction, and semantic categorization.
  • Develop dynamic scraping systems for sites that rely on JavaScript, infinite scrolling, or APIs.
  • Integrate with vector databases, LLM-based data labeling, or automated content enrichment modules.
  • Optimize crawling logic for speed, reliability, and stealth across distributed environments.
  • Collaborate on data pipeline orchestration using tools like Airflow, Prefect, or custom async architectures.

Required Expertise

  • Proven experience building deep or dark web crawlers (Playwright, Scrapy, Puppeteer, or custom async frameworks).
  • Strong understanding of browser automation, session management, and anti-detection mechanisms.
  • Experience integrating AI / ML / NLP pipelines — e.g., text classification, entity recognition, or embedding-based similarity.
  • Skilled in asynchronous Python (asyncio, aiohttp, Playwright async API).
  • Familiar with database and pipeline systems — PostgreSQL, MongoDB, Elasticsearch, or similar.
  • Ability to design robust data flows that connect crawling → AI inference → storage / visualization.

Nice to Have

  • Knowledge of LLMs (OpenAI, Hugging Face, LangChain, or custom fine-tuned models).
  • Experience with data cleaning, deduplication, and normalization pipelines.
  • Familiarity with distributed crawling frameworks (Ray, Celery, Kafka).
  • Prior experience integrating real-time analytics dashboards or monitoring tools.

What We Offer

  • Competitive freelance pay based on expertise and delivery.
  • Flexible, async-first remote collaboration.
  • Opportunity to shape an AI-first data platform from the ground up.
  • Potential for long-term partnership if the collaboration is successful.