Job Title : Lead Data Engineer
Experience : 8–10 Years
Location : Remote (first week on-site at the Noida office, then remote)
Job Overview :
We are looking for an accomplished Lead Data Engineer with 8–10 years of experience in
architecting, building, and managing complex data ecosystems using Python, Databricks
(Delta Lake, Spark, MLflow), and modern orchestration frameworks. The candidate should
possess strong leadership and technical acumen to drive data strategy, pipeline
optimization, and AI/ML workflow automation across large-scale projects.
The role involves architecting scalable data solutions, ensuring data governance and
compliance (HIPAA, SOC 2, GDPR), and mentoring data engineering teams to deliver
high-performance solutions in the healthcare or life sciences domains.
Key Responsibilities :
Lead the architecture, design, and implementation of scalable and reliable data
solutions using Databricks, Delta Lake, and Spark.
Oversee the development and optimization of ETL/ELT pipelines using Python and
modern orchestration tools (Airflow, Prefect, Dagster).
Drive data platform modernization and migration to cloud-native architectures
(AWS, Azure, or GCP).
Collaborate with Data Scientists, Analysts, and Architects to enable advanced
analytics and ML model deployment using MLflow.
Establish and enforce data engineering standards, best practices, and code reviews
across teams.
Ensure data governance, lineage, and compliance with regulatory frameworks
(HIPAA, SOC 2, GDPR).
Manage performance tuning, cost optimization, and scalability planning for data
workflows.
Mentor and guide junior data engineers, fostering a culture of technical excellence.
Partner with stakeholders to align business objectives with data strategy and delivery
roadmap.
Technical Skills Required :
Languages : Python, SQL, PySpark
Data Platforms : Databricks, Delta Lake, MLflow, Apache Spark
Workflow Orchestration : Airflow, Prefect, Dagster
Cloud Platforms : AWS, Azure, or GCP (with focus on data services)
Data Architecture : Data Lakes, Warehouses, and Streaming Architectures
DevOps & Version Control : Git, Jenkins, Terraform, CI/CD pipelines
Governance & Security : HIPAA, SOC 2, GDPR compliance expertise
Nice to Have : Knowledge of Kafka, dbt, and Infrastructure-as-Code (IaC) tools
Soft Skills :
Proven leadership and mentorship capabilities
Excellent communication and client interaction skills
Strategic mindset with strong analytical and problem-solving ability
Ability to manage multiple projects and distributed teams in Agile environments
Education :
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
Lead Data Engineer • Hyderabad, Telangana, India