Talent.com
AI/ML Integrated Web Data Extraction Engineer (Remote)

Sixteen Alpha AI • New Delhi, Republic Of India, IN
12 days ago
Job type
  • Remote
Job description

About the Project

We’re developing a next-generation intelligent web crawling system capable of exploring deep and dynamic web data sources — including sites behind authentication, infinite scrolls, and JavaScript-heavy pages.

The crawler will be integrated with an AI-driven pipeline for automated data understanding, classification, and transformation.

We’re looking for a highly experienced engineer who has previously built large-scale, distributed crawling frameworks and integrated AI or NLP / LLM-based components for contextual data extraction.

Key Responsibilities

  • Design, develop, and deploy scalable deep web crawlers capable of bypassing common anti-bot mechanisms.
  • Implement AI-integrated pipelines for data processing, entity extraction, and semantic categorization.
  • Develop dynamic scraping systems for sites that rely on JavaScript, infinite scrolling, or APIs.
  • Integrate with vector databases, LLM-based data labeling, or automated content enrichment modules.
  • Optimize crawling logic for speed, reliability, and stealth across distributed environments.
  • Collaborate on data pipeline orchestration using tools like Airflow, Prefect, or custom async architectures.

Required Expertise

  • Proven experience building deep or dark web crawlers (Playwright, Scrapy, Puppeteer, or custom async frameworks).
  • Strong understanding of browser automation, session management, and anti-detection mechanisms.
  • Experience integrating AI / ML / NLP pipelines, e.g., text classification, entity recognition, or embedding-based similarity.
  • Skilled in asynchronous Python (asyncio, aiohttp, Playwright async API).
  • Familiar with database and pipeline systems — PostgreSQL, MongoDB, Elasticsearch, or similar.
  • Ability to design robust data flows that connect crawling → AI inference → storage / visualization.

Nice to Have

  • Knowledge of LLMs (OpenAI, Hugging Face, LangChain, or custom fine-tuned models).
  • Experience with data cleaning, deduplication, and normalization pipelines.
  • Familiarity with distributed crawling frameworks (Ray, Celery, Kafka).
  • Prior experience integrating real-time analytics dashboards or monitoring tools.

What We Offer

  • Competitive freelance pay based on expertise and delivery.
  • Flexible, async-first remote collaboration.
  • Opportunity to shape an AI-first data platform from the ground up.
  • Potential for long-term partnership if the collaboration is successful.