We’re hiring an experienced Data Engineer to design and build modern data pipelines that power advanced analytics, AI, and healthcare insights.
If you thrive in cloud-native environments and love transforming complex, multi-source data into meaningful intelligence — this role is for you.
⚙️ What You’ll Work On
- Design and maintain scalable batch and streaming data pipelines using Google Cloud Dataflow, Datastream, and Airbyte
- Develop and optimize ETL / ELT processes across AWS Postgres, Google FHIR Store, and BigQuery
- Build unified data models integrating EHR / FHIR, claims, HL7, CRM, and transactional data
- Implement transformations in dbt / OBT to create curated semantic layers for AI / BI pipelines
- Ensure data quality, lineage, validation, and HIPAA compliance across all pipelines
- Collaborate with AI / ML, BI, and product teams to deliver data-driven insights
- Drive cost optimization and performance tuning for BigQuery and streaming systems
- Contribute to architectural decisions and mentor junior engineers on best practices
🧠 What You Bring
- Strong healthcare experience with claims and EHR data
- Expert-level SQL, including complex joins, window functions, and advanced transformations
- Proven data modeling skills (star and snowflake schemas)
- Hands-on experience with GCP
- Expertise in BigQuery partitioning, clustering, and cost optimization
- Familiarity with Cloud Storage, Dataflow, and Datastream
- Experience building scalable batch and streaming ETL / ELT pipelines using tools like Dataflow, dbt, Datastream, and Airflow / Composer
- Proficiency with dbt: staging, fact, and mart layering; incremental models; partition and cluster strategies
- Knowledge of data snapshotting techniques
- Strong focus on data quality and governance: dbt tests (unique, not_null, relationships), handling duplicates, schema drift, and late-arriving data, and ensuring referential integrity
- Ability to design modular, reusable data models
- Version control experience (Git)
- Security and compliance skills for handling PHI / PII: column-level masking, tokenization, anonymization, and a basic understanding of HIPAA requirements
- Familiarity with secure access patterns, IAM roles, and restricted permissions

📍 Location: Pune (Work from Office)
☁️ Tech Stack: GCP, Dataflow, Datastream, BigQuery, Python, SQL, dbt
🏥 Domain: HealthTech / Cloud Data Engineering
Join a growing healthtech product team and help build the data backbone that powers AI, predictive analytics, and better patient outcomes.