Technical Leadership : Provide technical leadership, guidance, and mentorship to data engineering teams in designing, developing, and implementing robust and scalable Databricks solutions.
Architecture & Design : Develop and implement scalable data pipelines and architectures using the Databricks Lakehouse platform, ensuring data quality, consistency, and accessibility.
Data Engineering : Lead the ingestion, transformation, and processing of both batch and streaming data from diverse sources into the Databricks environment.
Performance Optimization : Analyze resource utilization within Databricks clusters, ensure clusters are used efficiently, and troubleshoot performance bottlenecks in data pipelines and queries.
Security & Compliance : Implement and enforce best practices for data governance, access control, data lineage, data security, and compliance within the Databricks ecosystem.
Collaboration : Work closely with data engineers, data analysts, data scientists, and business stakeholders to understand data requirements and deliver effective data solutions.
Cloud Integration : Manage and optimize Databricks environments on various cloud platforms, including AWS and Azure (and potentially GCP), leveraging their respective data services.
Monitoring & Automation : Set up comprehensive monitoring tools and automate data workflows and operational tasks for increased efficiency and reliability.
SQL Development : Develop and optimize complex SQL queries for data extraction, transformation, and loading.
Required Skills & Qualifications :
8+ years of hands-on experience in Databricks, Apache Spark, and big data processing.
Proficiency in Python, Scala, or SQL for data engineering and development within Databricks.
Strong knowledge of Delta Lake, Unity Catalog, and MLflow.
Extensive experience with ETL (Extract, Transform, Load) processes and ELT (Extract, Load, Transform) methodologies.
Proven experience working with cloud platforms, specifically AWS and Azure.
Excellent problem-solving, analytical, and leadership skills.
Strong understanding of data warehousing and data lakehouse concepts.
Experience with version control tools (Git).
Strong written and verbal communication skills, with the ability to articulate complex technical concepts clearly.
Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field, or equivalent practical experience.