Talent.com
Freelance Deep Web Crawler Engineer (AI-Integrated Data Pipeline)
No longer accepting applications
Sixteen Alpha AI • Gurgaon, India
9 days ago
Job description

About the Project

We’re developing a next-generation intelligent web crawling system capable of exploring deep and dynamic web data sources, including sites behind authentication, infinite-scroll pages, and JavaScript-heavy pages.

The crawler will be integrated with an AI-driven pipeline for automated data understanding, classification, and transformation.

We’re looking for a highly experienced engineer who has previously built large-scale, distributed crawling frameworks and integrated AI or NLP / LLM-based components for contextual data extraction.

Key Responsibilities

Design, develop, and deploy scalable deep web crawlers capable of bypassing common anti-bot mechanisms.

Implement AI-integrated pipelines for data processing, entity extraction, and semantic categorization.

Develop dynamic scraping systems for sites that rely on JavaScript, infinite scrolling, or APIs.

Integrate with vector databases, LLM-based data labeling, or automated content enrichment modules.

Optimize crawling logic for speed, reliability, and stealth across distributed environments.

Collaborate on data pipeline orchestration using tools like Airflow, Prefect, or custom async architectures.

Required Expertise

Proven experience building deep or dark web crawlers (Playwright, Scrapy, Puppeteer, or custom async frameworks).

Strong understanding of browser automation, session management, and anti-detection mechanisms.

Experience integrating AI / ML / NLP pipelines — e.g., text classification, entity recognition, or embedding-based similarity.

Skilled in asynchronous Python (asyncio, aiohttp, Playwright async API).

Familiar with database and pipeline systems — PostgreSQL, MongoDB, Elasticsearch, or similar.

Ability to design robust data flows that connect crawling → AI inference → storage / visualization.

Nice to Have

Knowledge of LLMs (OpenAI, Hugging Face, LangChain, or custom fine-tuned models).

Experience with data cleaning, deduplication, and normalization pipelines.

Familiarity with distributed crawling frameworks (Ray, Celery, Kafka).

Prior experience integrating real-time analytics dashboards or monitoring tools.

What We Offer

Competitive freelance pay based on expertise and delivery.

Flexible, async-first remote collaboration.

Opportunity to shape an AI-first data platform from the ground up.

Potential for long-term partnership if the collaboration is successful.
