Talent.com
No longer accepting applications
Data Engineer (Webscraping)

Solytics Partners, Prayagraj (Allahabad), IN
7 hours ago
Job description

Company Profile :

Solytics Partners is a global analytics firm, recognized with multiple industry awards for innovation and excellence. Our team comprises experts with deep knowledge of risk, analytics, AI / ML, AML / FCC, and fraud. By combining this expertise with cutting-edge technologies such as AI, Machine Learning, Generative AI, and Large Language Models (LLMs), we deliver powerful automated platforms and incisive point solutions. Our offerings enable clients to streamline and future-proof their risk, AML, and analytics processes, comply seamlessly with global regulations, and safeguard financial systems. Whether it’s solving complex challenges or driving operational efficiency, Solytics Partners is committed to empowering organizations with transformative tools to stay ahead in an evolving regulatory landscape.

Job Title : Data Engineer (Web Scraping)

Experience : 5 – 10 years of relevant experience

Location & Timings : Pune, work from office; 11:00 AM – 8:00 PM

Education Qualification : Master's or bachelor's degree in computer science, IT, or another relevant discipline from a reputed institute.

Role Type : Permanent / Full Time

Job Description : We are seeking an experienced Data Engineering & Automation Lead to design, automate, and optimize large-scale data processing and web scraping pipelines. The role involves leading a team to build and maintain high-performance ETL workflows using Apache Airflow, Apache Spark, and AWS services, while integrating AI / NLP solutions powered by OpenAI GPT and other GenAI models for intelligent data extraction and analytics.

Responsibilities :

  • Design, automate, and maintain ETL and data processing pipelines using Apache Airflow and Apache Spark.
  • Build, monitor, and optimize web scraping and data extraction workflows for global compliance and risk data sources.
  • Lead and manage web scraping and data engineering teams, ensuring delivery excellence, code quality, and scalability.
  • Create, design, and document automation workflows and secure data-sharing systems using AWS (Lambda, S3, API Gateway, SQS).
  • Implement AI and NLP integrations using OpenAI GPT and GenAI models for intelligent data extraction, tagging, and analytical automation.
  • Analyze large-scale datasets to identify quality gaps, improve accuracy, and optimize indexing and retrieval performance.
  • Collaborate with Backend, DevOps, and Frontend teams for data modeling, monitoring, and visualization.
  • Work closely with clients to gather and translate business requirements into scalable automation and analytics solutions.
  • Author HLD / LLD documentation, mentor junior engineers, and continuously improve automation processes and data workflows.

Required Skills :

  • Programming : Python, SQL, JavaScript
  • Data Engineering & Automation : Apache Airflow, Apache Spark, Web Scraping (Scrapy, Selenium), Pandas, NumPy
  • Databases & Storage : Elasticsearch, MongoDB, MySQL
  • Cloud & Backend : AWS (Lambda, S3, EC2, CloudWatch, SQS, SNS, EKS), Docker, Django, Flask
  • AI / ML & NLP : OpenAI GPT APIs, NER, Sentiment Analysis, Embeddings, Information Extraction
  • Monitoring & Tools : Grafana, Git, Postman, Jupyter, VS Code

Good to Have :

  • Strong understanding of Large Language Models (LLMs) and Generative AI for building intelligent data extraction and analytics agents.
  • Familiarity with risk and compliance domains, including Sanctions, PEP (Politically Exposed Persons), and AMS (Adverse Media Screening) data and processes.

Soft Skills :

  • Leadership & Team Mentoring
  • Problem-Solving & Analytical Thinking
  • Clear Technical Communication
  • Cross-functional Collaboration