We’re hiring an experienced Data Engineer to design and build modern data pipelines that power advanced analytics, AI, and healthcare insights.
If you thrive in cloud-native environments and love transforming complex, multi-source data into meaningful intelligence — this role is for you.
⚙️ What You’ll Work On
Design and maintain scalable batch and streaming data pipelines using Google Cloud Dataflow, Datastream, and Airbyte
Develop and optimize ETL / ELT processes across AWS Postgres, Google FHIR Store, and BigQuery
Build unified data models integrating EHR / FHIR, claims, HL7, CRM, and transactional data
Implement dbt transformations, including OBT (One Big Table) models, to create curated semantic layers for AI / BI pipelines (see the sketch after this list)
Ensure data quality, lineage, validation, and HIPAA compliance across all pipelines
Collaborate with AI / ML, BI, and product teams to deliver data-driven insights
Drive cost optimization and performance tuning for BigQuery and streaming systems
Contribute to architectural decisions and mentor junior engineers on best practices
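To give candidates a concrete feel for the dbt work involved, below is a minimal sketch of an incremental BigQuery model; the model, source, and column names (fct_claims, stg_claims, claim_id) are illustrative assumptions, not our actual codebase.

-- models/marts/fct_claims.sql (illustrative sketch; all names are hypothetical)
{{
  config(
    materialized = 'incremental',
    unique_key   = 'claim_id',
    partition_by = {'field': 'service_date', 'data_type': 'date'},
    cluster_by   = ['payer_id']
  )
}}

select
    claim_id,
    patient_id,
    payer_id,
    service_date,
    billed_amount,
    updated_at
from {{ ref('stg_claims') }}
{% if is_incremental() %}
-- on incremental runs, only reprocess rows that changed since the last build
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Partitioning plus incremental materialization keeps daily runs scanning only fresh data, which feeds directly into the BigQuery cost-optimization goals above.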
🧠 What You Bring
Strong healthcare experience with claims and EHR data
Expert-level SQL, including complex joins, window functions, and advanced transformations (see the deduplication example after this list)
Proven skills in data modeling (star schema, snowflake)
Hands-on experience with GCP
Expertise in BigQuery partitioning, clustering, and cost optimization (see the DDL sketch after this list)
Familiarity with Cloud Storage, Dataflow, and Datastream
Experience building scalable ETL / ELT pipelines, batch and streaming, using tools like Dataflow, dbt, Datastream, and Airflow / Composer
Proficiency with dbt for staging, fact, and mart layering; incremental models (as in the sketch above); and partition / cluster strategies
Knowledge of data snapshotting techniques
Strong focus on data quality and governance: dbt tests (unique, not_null, relationships); handling duplicates, schema drift, and late-arriving data; and ensuring referential integrity (see the test sketch after this list)
Ability to design modular and reusable data models
Version control experience (Git)
Security and compliance skills for handling PHI / PII, including column-level masking, tokenization, and anonymization, plus a working understanding of HIPAA requirements (see the masking sketch after this list)
Familiarity with secure access patterns, IAM roles, and restricted permissions
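As an example of the SQL depth we expect, here is a common pattern for deduplicating late-arriving claim updates with a window function (raw_claims and its columns are hypothetical names):

-- Keep only the most recent version of each claim (names are hypothetical)
select * except (rn)
from (
  select
      c.*,
      row_number() over (
        partition by claim_id
        order by updated_at desc
      ) as rn
  from raw_claims c
) t
where rn = 1;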
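BigQuery cost optimization typically starts with partitioned, clustered tables so queries prune the data they scan; a minimal DDL sketch with assumed dataset and column names:

-- Partition on the date column most queries filter by; cluster on common predicates
create table analytics.fct_claims
partition by service_date
cluster by payer_id, patient_id as
select claim_id, patient_id, payer_id, service_date, billed_amount
from staging.stg_claims;

-- A query that filters on the partition column scans only the matching partitions
select sum(billed_amount)
from analytics.fct_claims
where service_date between '2024-01-01' and '2024-03-31';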
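On the quality side, dbt schema tests cover basics like unique and not_null, while singular tests express richer rules such as referential integrity; a sketch (fct_claims and dim_patients are hypothetical models):

-- tests/assert_no_orphan_claims.sql (a dbt singular test: it fails if rows are returned)
-- Flags claims that reference a patient missing from the patient dimension
select c.claim_id
from {{ ref('fct_claims') }} c
left join {{ ref('dim_patients') }} p using (patient_id)
where p.patient_id is null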
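And for PHI / PII handling, a simple column-level masking pattern, offered as a sketch only; a production setup would lean on BigQuery policy tags or dynamic data masking plus a managed tokenization service, and the mrn / dob columns are hypothetical:

-- Pseudonymize direct identifiers before exposing data to analysts (names hypothetical)
create or replace view analytics.patients_masked as
select
    to_hex(sha256(cast(mrn as string))) as mrn_token,  -- deterministic pseudonym via hashing
    'REDACTED' as patient_name,                        -- drop free-text identifiers
    extract(year from dob) as birth_year,              -- generalize rather than expose full DOB
    state
from analytics.patients;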
Location: Pune (Work from Office)
☁️ Tech Stack: GCP, Dataflow, Datastream, BigQuery, Python, SQL, dbt
Domain: Health Tech / Cloud Data Engineering
Join a growing healthtech product team and help build the data backbone that powers AI, predictive analytics, and better patient outcomes.