About the Role
We are seeking a Senior Data Engineer with deep expertise in building data pipelines and integrating healthcare data. You will report to the Head of AI and Engineering and be responsible for designing, building, and operating the backbone of our AI-powered applications. This is a critical role where you will create scalable pipelines that process clinical encounter data, retrieve associated medical documents, and deliver validated information to our machine learning systems for inference. You will work in close collaboration with software engineers, ML specialists, and our clinical partners to bring our vision to life.
What you’ll do
Design, build, and operate scalable, event-driven data pipelines that support real-time clinical workflows.
Integrate with EHR / EMR systems (e.g., Epic) using standards such as FHIR and HL7.
Develop reliable systems to ingest, track, and update clinical events in real time.
Ensure downstream ML systems receive high-quality, structured data for inference and decision support.
Establish observability, monitoring, and error handling for robust pipeline performance.
Collaborate with ML engineers to align data flows with model needs (e.g., document ingestion, embedding storage, inference triggers).
Continuously improve pipeline design to maximize scalability, resilience, and automation.
What we’re looking for
5+ years of experience building and maintaining production-grade data pipelines.
Strong background in workflow orchestration and event-driven systems.
Proficiency with relational databases and SQL (e.g., Postgres) as well as handling unstructured data.
Experience operating in cloud environments and working with distributed, highly available systems.
Track record of delivering data infrastructure that supports ML or analytics applications.
Excellent communication and collaboration skills; able to drive projects independently while working as part of a cross-functional team.
Bonus
Knowledge of HIPAA and healthcare data security / privacy requirements.
Familiarity with FHIR / HL7 data modeling in production settings.
Experience supporting machine learning inference pipelines in real-world applications.
Expertise with image data.
Exposure to vector databases, embedding stores, or document retrieval systems.
Background in healthcare IT, clinical workflows, or health data analytics.
Senior Data Engineer • Panchkula, Haryana, India