Job Title : Lead Data Engineer
Experience : 8–10 Years
Location : Remote (first week from the Noida office, then remote)
Job Overview :
We are looking for an accomplished Lead Data Engineer with 8–10 years of experience in
architecting, building, and managing complex data ecosystems using Python, Databricks
(Delta Lake, Spark, MLflow), and modern orchestration frameworks. The candidate should
possess strong leadership and technical acumen to drive data strategy, pipeline
optimization, and AI / ML workflow automation across large-scale projects.
The role involves architecting scalable data solutions, ensuring data governance and
compliance (HIPAA, SOC 2, GDPR), and mentoring data engineering teams to deliver high-performance solutions in the healthcare or life sciences domains.
Key Responsibilities :
- Lead the architecture, design, and implementation of scalable and reliable data
solutions using Databricks, Delta Lake, and Spark.
- Oversee the development and optimization of ETL / ELT pipelines using Python and modern orchestration tools (Airflow, Prefect, Dagster).
- Drive data platform modernization and migration to cloud-native architectures (AWS, Azure, or GCP).
- Collaborate with Data Scientists, Analysts, and Architects to enable advanced analytics and ML model deployment using MLflow.
- Establish and enforce data engineering standards, best practices, and code reviews across teams.
- Ensure data governance, lineage, and compliance with regulatory frameworks (HIPAA, SOC 2, GDPR).
- Manage performance tuning, cost optimization, and scalability planning for data workflows.
- Mentor and guide junior data engineers, fostering a culture of technical excellence.
- Partner with stakeholders to align business objectives with data strategy and delivery roadmap.
Technical Skills Required :
- Languages : Python, SQL, PySpark
- Data Platforms : Databricks, Delta Lake, MLflow, Apache Spark
- Workflow Orchestration : Airflow, Prefect, Dagster
- Cloud Platforms : AWS, Azure, or GCP (with focus on data services)
- Data Architecture : Data Lakes, Warehouses, and Streaming Architectures
- DevOps & Version Control : Git, Jenkins, Terraform, CI / CD pipelines
- Governance & Security : HIPAA, SOC 2, GDPR compliance expertise
- Nice to Have : Knowledge of Kafka, dbt, and Infrastructure-as-Code (IaC) tools
Soft Skills :
- Proven leadership and mentorship capabilities
- Excellent communication and client interaction skills
- Strategic mindset with strong analytical and problem-solving ability
- Ability to manage multiple projects and distributed teams in Agile environments
Education :
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or related field
Chances of Closure :
Yes