PLEASE READ CAREFULLY:
IN ORDER TO BE CONSIDERED FOR THIS ROLE, YOU MUST SUBMIT AN APPLICATION HERE:
https://onboard.daply.ai/developers
Daply is hiring a Python Developer with deep experience in data scraping. You will architect reliable crawlers, pipelines, and services that turn messy web data into clean, usable datasets for analytics and AI features.
What you will do
Build distributed crawlers and parsers with strong error handling (see the sketch after this list)
Use headless browsers and smart proxy rotation to get past common anti-bot defenses
Create ETL flows that validate, dedupe, and enrich data before loading
Expose datasets through APIs and internal tools
Instrument jobs, add alerts, and drive continuous performance gains
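For a concrete flavor of this work, here is a minimal sketch of an async fetch-and-parse loop with retries, using httpx, asyncio, and BeautifulSoup from our stack; the URLs, selector, and field names are hypothetical and not our production code.

    import asyncio

    import httpx
    from bs4 import BeautifulSoup

    # Hypothetical seed list; real crawlers pull targets from a queue or database.
    SEED_URLS = ["https://example.com/page/1", "https://example.com/page/2"]

    async def fetch(client: httpx.AsyncClient, url: str, retries: int = 3) -> str | None:
        # Retry with exponential backoff; a failed URL is skipped, not fatal.
        for attempt in range(retries):
            try:
                response = await client.get(url, timeout=10.0)
                response.raise_for_status()
                return response.text
            except httpx.HTTPError:
                await asyncio.sleep(2 ** attempt)
        return None

    def parse(html: str) -> dict:
        # Pull one field with a CSS selector; real parsers validate and enrich many more.
        soup = BeautifulSoup(html, "lxml")
        title = soup.select_one("h1")
        return {"title": title.get_text(strip=True) if title else None}

    async def crawl(urls: list[str]) -> list[dict]:
        async with httpx.AsyncClient(follow_redirects=True) as client:
            pages = await asyncio.gather(*(fetch(client, u) for u in urls))
        return [parse(html) for html in pages if html]

    if __name__ == "__main__":
        print(asyncio.run(crawl(SEED_URLS)))

In practice a loop like this sits behind a scheduler, with proxy rotation and headless rendering layered in where sites require it.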
Must haves
Expert Python with Scrapy, Playwright, or Selenium, plus requests or httpx
Advanced HTML parsing with lxml and BeautifulSoup, using XPath or CSS selectors
Proven proxy rotation, session management, and CAPTCHA-solving experience
Strong SQL with PostgreSQL and practical NoSQL experience such as MongoDB or Redis
Async programming with asyncio or Trio for high-throughput crawling
Data processing with pandas or Polars and job scheduling with Airflow or Prefect (see the ETL sketch after this list)
Docker, CI, and a major cloud such as AWS or GCP
Clean, tested, production-grade code
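As an illustration of the validate, dedupe, and enrich step described under "What you will do", here is a minimal sketch with pandas and Parquet; the records, column names, and output path are hypothetical.

    import pandas as pd

    # Hypothetical scraped records; real pipelines read from a staging store.
    records = [
        {"url": "https://example.com/a", "title": "Widget A", "price": "19.99"},
        {"url": "https://example.com/a", "title": "Widget A", "price": "19.99"},
        {"url": "https://example.com/b", "title": None, "price": "oops"},
    ]
    df = pd.DataFrame(records)

    # Validate: coerce prices to numbers and drop rows missing required fields.
    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    df = df.dropna(subset=["title", "price"])

    # Dedupe on the natural key before loading downstream.
    df = df.drop_duplicates(subset=["url"])

    # Enrich with a load timestamp, then write columnar output for analytics.
    df["loaded_at"] = pd.Timestamp.now(tz="UTC")
    df.to_parquet("clean_records.parquet", index=False)

A real flow would run a step like this as an Airflow or Prefect task rather than a standalone script.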
Nice to have
Kafka or Kinesis for streaming ingestion
Columnar storage such as Parquet on S3, with DuckDB or BigQuery for analytics (see the sketch after this list)
Basic NLP to extract entities from scraped text
Familiarity with legal and ethical guidelines for web data collection
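For example, a short hypothetical query over the Parquet output above with DuckDB; the file path is illustrative, and in production it would typically point at an s3:// location.

    import duckdb

    # Query a Parquet file in place; no load step is needed for ad hoc analytics.
    con = duckdb.connect()
    rows = con.execute(
        "SELECT title, COUNT(*) AS n FROM 'clean_records.parquet' GROUP BY title ORDER BY n DESC"
    ).fetchall()
    print(rows)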
Our stack
Python, Scrapy, Playwright, Selenium
requests, httpx, BeautifulSoup, lxml
PostgreSQL, Redis, MongoDB
Airflow or Prefect, Docker, AWS or GCP, Git
pandas or Polars, Parquet, S3
Why Daply
Remote-friendly and async
Work directly with founders on high impact data products
Fast moving environment with real ownership
Python Developer • Chennai, IN