Job Summary :
We are seeking three skilled and motivated Databricks Engineers to significantly contribute to our data engineering initiatives within the pharmaceutical and life sciences domain.
This role demands strong, hands-on expertise in Databricks, Python, and SQL to design, build, and optimize scalable, robust data pipelines. The Engineer will be responsible for transforming complex, industry-specific datasets, ensuring data quality and accessibility, and enabling critical analytics that drive business intelligence and compliance.
Responsibilities :
Pipeline Development :
- Design, implement, and maintain high-performance, scalable ETL / ELT data pipelines primarily utilizing Databricks (Delta Lake, Databricks SQL, and Workspace features).
- Leverage strong hands-on experience with Databricks to manage large-scale data processing workloads on Spark clusters and optimize job execution for efficiency and cost.
- Develop and maintain data transformation logic using Databricks Notebooks written in Python and / or SQL.
Data Management :
- Apply solid programming skills in Python (including libraries like Pandas and PySpark) for complex data manipulation, cleansing, validation, and automation of data ingestion workflows.
- Utilize proficiency in SQL for ad-hoc data querying, complex transformations, stored procedure logic, and effective troubleshooting of data discrepancies across the data lakehouse.
- Implement and enforce data governance and quality checks within the pipeline to ensure the accuracy and reliability of all downstream data.
Compliance Focus :
- Apply proven experience working with pharmaceutical or life sciences data, including familiarity with industry-specific data structures (e.g., clinical trials, patient data, R&D data) and standards.
- Ensure all data solutions adhere to relevant compliance considerations and regulatory standards specific to the pharmaceutical and life sciences domain.
Support :
- Collaborate effectively with data scientists, BI developers, and cross-functional teams to understand data needs and ensure high data accessibility and performance.
- Participate in code reviews, contribute to technical documentation, and provide production support for data pipelines in a fast-paced environment.
Skills and Experience :
- 3-5 years of dedicated experience in data engineering or a closely related role.
- Strong hands-on expertise with Databricks for data processing, pipeline development, and managing Delta Lake architecture.
- Proficiency in SQL for complex querying, data transformation, and troubleshooting data issues.
- Solid programming skills in Python for data manipulation, scripting, and automation.
- Proven experience working with pharmaceutical or life sciences data, including familiarity with industry data structures and compliance considerations.
- Experience with cloud platforms (e.g., AWS, Azure) and their associated data services.
Skills :
- Hands-on experience with Delta Live Tables (DLT) for declarative pipeline implementation.
- Familiarity with database version control and migration tools.
- Experience with CI / CD implementation for Databricks jobs.
- Knowledge of advanced data governance practices and tools.
- Understanding of statistical modeling and machine learning concepts in a life sciences context.
(ref : hirist.tech)