Location: India (remote – Bangalore / Karnataka area preferred)
Type: Full-time contractor / employee
Urgency: Position to be filled ASAP
About the role
You will be a core member of the team building a data platform that maps economic, advertising and real-estate actors using public / open data sources (social networks, marketplaces, registers, press) for public administration.
Key responsibilities
Design, build and maintain ingestion pipelines from APIs, web scrapers and open data sources (batch & incremental / delta loads).
Implement data workflows using Airflow (or similar) and Python / Spark for cleaning, normalization and entity resolution.
Model and optimize datasets in PostgreSQL / PostGIS, S3-compatible object storage (MinIO) and Elasticsearch.
Implement provenance tracking (URL, timestamp, hash) and basic quality checks (coverage, error rate, freshness).
Work closely with Data Analysts, BI and DevOps to ensure reliable, secure and scalable data flows.
Document data models, schemas, and pipelines for handover to local and client teams.
Must-have skills
3–6 years of experience as a Data Engineer.
Strong Python (Pandas, PySpark or Spark) and SQL.
Hands-on with Airflow (or similar orchestrator), Kafka or other streaming / queue systems.
Experience with PostgreSQL (indexes, partitioning, query optimization).
Good understanding of data modelling, lineage and data quality.
Comfortable working in Linux environments, Docker and Git.
Experience with distributed systems and performance tuning.
Nice-to-have
Experience with PostGIS, GeoServer / MapLibre or geo-analytics.
Experience with OSINT / public-data / web-scraping projects (Playwright / Selenium).
Knowledge of Neo4j or other graph DBs.
Exposure to security and compliance (data residency, GDPR-like frameworks).
Senior Data Engineer • Borivali, Maharashtra, India