About the Role :
We are seeking a highly skilled Web Scraping & Python API Developer to build and maintain scalable systems that extract data from websites and APIs. The ideal candidate has hands-on experience with web scraping frameworks, RESTful API development, and data integration techniques.
Responsibilities :
- Design and develop robust, scalable web scraping scripts using Python (e.g., Scrapy, BeautifulSoup, Selenium).
- Build and maintain RESTful APIs to serve scraped data to internal systems or clients.
- Handle anti-bot mechanisms such as CAPTCHAs, and work with JavaScript-rendered pages and IP rotation.
- Optimize scraping processes for speed, reliability, and data integrity.
- Parse and normalize structured and unstructured data (HTML, JSON, XML).
- Monitor and maintain scraping pipelines; handle failures and site structure changes.
- Implement logging, error handling, and reporting mechanisms.
- Collaborate with product managers and data analysts to define data requirements.
- Ensure compliance with website terms of service and data use regulations.
Requirements :
- 3+ years of experience with Python, especially in data extraction and web automation.
- Strong knowledge of web scraping libraries (Scrapy, BeautifulSoup, Requests, Selenium).
- Experience with REST API development (FastAPI, Flask, or Django REST Framework).
- Proficient with data handling libraries (Pandas, JSON, Regex).
- Experience working with proxies, headless browsers, and CAPTCHA solving tools.
- Familiarity with containerization (Docker) and deployment on cloud platforms (AWS, GCP, Azure).
- Strong understanding of HTML, CSS, JavaScript (from a scraping perspective).
- Experience with version control (Git) and agile development methodologies.
Nice to Have :
- Experience with GraphQL scraping.
- Familiarity with CI/CD pipelines and DevOps tools.
- Knowledge of data storage solutions (PostgreSQL, MongoDB, Elasticsearch).
- Prior experience with large-scale web crawling infrastructure.
Benefits :
- Competitive salary and performance bonuses.
- Flexible work hours and remote work option.
- Opportunity to work on high-impact, data-driven products.
- Learning budget for conferences, books, and courses.
(ref : hirist.tech)