Position Summary:
We are seeking an experienced and forward-thinking Databricks Architect to lead the design and implementation of scalable data solutions using the Databricks Lakehouse Platform. This role requires strong technical leadership, a deep understanding of big data and analytics, and the ability to architect solutions that empower enterprise data initiatives across data engineering, advanced analytics, and machine learning workloads.
The ideal candidate will have extensive experience with Apache Spark, Delta Lake, PySpark/Scala, and cloud platforms (Azure, AWS, or GCP), along with a proven ability to define best practices for architecture, governance, security, and performance optimization on the platform.

Responsibilities:
- Design and implement end-to-end modern data architectures leveraging Databricks Lakehouse, Delta Lake, and cloud-native technologies.
- Define scalable architecture for data ingestion, ETL/ELT pipelines, data processing, analytics, and data science workflows.
- Develop reference architectures and solution blueprints for various business and technical use cases.
- Lead the development of robust data pipelines and ETL frameworks using PySpark/Scala and Databricks notebooks.
- Enable streaming and batch data processing using Apache Spark on Databricks.
- Collaborate with DevOps teams to implement CI/CD pipelines for Databricks workloads using tools like GitHub, Azure DevOps, or Jenkins.
- Optimize Databricks clusters, Spark jobs, and data workflows for performance, scalability, and cost efficiency.
- Implement caching, partitioning, Z-Ordering, and data compaction strategies on Delta Lake (see the sketch after this list).
- Define and implement data governance standards using Unity Catalog, role-based access control (RBAC), and data lineage tracking.
- Ensure data compliance and security policies are enforced across data pipelines and storage layers.
- Maintain metadata catalogs and ensure data quality and observability across the pipeline.
- Engage with business analysts, data scientists, product owners, and solution architects to gather requirements and translate them into technical solutions.
- Present architectural solutions and recommendations to senior leadership and cross-functional teams.
- Provide technical guidance and mentorship to data engineers and junior architects.
- Conduct code reviews, enforce coding standards, and foster a culture of engineering excellence.
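To illustrate the kind of Delta Lake tuning this role covers, here is a minimal PySpark sketch of a partitioning, Z-Ordering, and compaction pass under stated assumptions: the table, column, and path names (events, event_date, user_id, /mnt/raw/events) are hypothetical placeholders, not part of the role description.

```python
# Minimal sketch of a Delta Lake layout and optimization pass on Databricks.
# All table, column, and path names are hypothetical examples.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # pre-provided in Databricks notebooks

# Partition by a low-cardinality column so queries can prune whole directories.
events = spark.read.format("json").load("/mnt/raw/events")
(events.write.format("delta")
       .mode("overwrite")
       .partitionBy("event_date")
       .saveAsTable("events"))

# Compact small files and co-locate rows on a high-cardinality filter column.
spark.sql("OPTIMIZE events ZORDER BY (user_id)")

# Drop data files no longer referenced by the table (default 7-day retention).
spark.sql("VACUUM events")
```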
Qualifications & Skills:
- Expert-level knowledge of Databricks, including Delta Lake, Unity Catalog, MLflow, and Workflows.
- Strong hands-on experience with Apache Spark, especially using PySpark or Scala.
- Proficient in building and maintaining ETL/ELT pipelines in a large-scale distributed environment (see the ingestion sketch after this list).
- In-depth understanding of at least one cloud platform: AWS (S3, Glue, EMR), Azure (ADLS, Synapse), or GCP (BigQuery, Dataflow).
- Familiarity with SQL and data modeling techniques for both OLAP and OLTP systems.
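As a hedged illustration of the pipeline experience listed above, the following PySpark sketch uses Spark Structured Streaming with Databricks Auto Loader to land raw files in a Delta table; the source path, schema and checkpoint locations, and the table and column names (orders, order_id, bronze_orders) are all hypothetical.

```python
# Minimal sketch of an incremental ingest pipeline with Structured Streaming
# and Databricks Auto Loader; all paths and table names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Discover and read only new files as they land in cloud storage.
raw = (spark.readStream
            .format("cloudFiles")                       # Databricks Auto Loader
            .option("cloudFiles.format", "json")
            .option("cloudFiles.schemaLocation", "/mnt/meta/orders_schema")
            .load("/mnt/raw/orders"))

# Light cleansing: stamp ingest time, drop rows missing the business key.
clean = (raw.withColumn("ingested_at", F.current_timestamp())
            .filter(F.col("order_id").isNotNull()))

# Write to a Delta table; the checkpoint gives exactly-once delivery.
(clean.writeStream
      .format("delta")
      .option("checkpointLocation", "/mnt/chk/orders")
      .trigger(availableNow=True)  # process the backlog, then stop (batch-style)
      .toTable("bronze_orders"))
```

The availableNow trigger runs the stream as an incremental batch job; dropping it yields a continuously running stream, so the same code serves both the streaming and batch processing modes mentioned in the responsibilities.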