We’re hiring an experienced Data Engineer to design and build modern data pipelines that power advanced analytics, AI, and healthcare insights.
If you thrive in cloud-native environments and love transforming complex, multi-source data into meaningful intelligence, this role is for you.
⚙️ What You’ll Work On
- Design and maintain scalable batch and streaming data pipelines using Google Cloud Dataflow, Datastream, and Airbyte
- Develop and optimize ETL / ELT processes across AWS Postgres, Google FHIR Store, and BigQuery
- Build unified data models integrating EHR / FHIR, claims, HL7, CRM, and transactional data
- Implement transformations in dbt / OBT to create curated semantic layers for AI / BI pipelines
- Ensure data quality, lineage, validation, and HIPAA compliance across all pipelines
- Collaborate with AI / ML, BI, and product teams to deliver data-driven insights
- Drive cost optimization and performance tuning for BigQuery and streaming systems
- Contribute to architectural decisions and mentor junior engineers on best practices
What You Bring
- Strong healthcare experience with claims and EHR data
- Expert SQL skills, including complex joins, window functions, and advanced transformations
- Proven data modeling skills (star schema, snowflake)
- Hands-on experience with GCP
- Expertise in BigQuery partitioning, clustering, and cost optimization
- Familiarity with Cloud Storage, Dataflow, and Datastream
- Experience building scalable ETL / ELT pipelines, batch and streaming, using tools like Dataflow, dbt, Datastream, and Airflow / Composer
- Proficiency with dbt for staging, fact, and mart layering; incremental models; partition and cluster strategies
- Knowledge of data snapshotting techniques
- Strong focus on data quality and governance: dbt tests (unique, not_null, relationships), handling duplicates, schema drift, and late-arriving data, and ensuring referential integrity
- Ability to design modular, reusable data models
- Version control experience (Git)
- Security and compliance skills for handling PHI / PII, including column-level masking, tokenization, anonymization, and a basic understanding of HIPAA requirements
- Familiarity with secure access patterns, IAM roles, and restricted permissions

Location: Pune (Work from Office)
☁️ Tech Stack: GCP, Dataflow, Datastream, BigQuery, Python, SQL, dbt
Domain: HealthTech / Cloud Data Engineering
Join a growing healthtech product team and help build the data backbone that powers AI, predictive analytics, and better patient outcomes.