Role: Databricks Engineer
Position: Remote
Type: Contract
Duration: 3-6 months (extendable)
Role Overview
We are seeking a Mid-Level Databricks Engineer with strong data engineering fundamentals and hands-on experience building scalable data pipelines on the Databricks platform. The ideal candidate will be comfortable working with distributed data processing, designing Delta Lake architectures, implementing CI/CD for data workloads, and optimizing Spark jobs for performance and reliability.
Required Skills
- Strong hands-on experience building data pipelines using scripting languages, including Python, PySpark, and SQL
- Strong working experience with Databricks (Unity Catalog is a plus) and Azure Data Factory (ADF)
- Hands-on experience with Azure DevOps Repos and CI/CD
- Strong understanding of modern Data Architecture, Data Warehousing concepts, Data Modelling, and ETL/ELT processes
- Understanding of Agile ways of working
- Experience working with dbt is a plus
- Experience in the retail domain is a plus
Platform Engineering & Operations
- Configure and optimize Databricks clusters, including autoscaling, cluster sizing, and runtime selection.
- Manage databases, schemas, and Delta Lake tables, including vacuuming, Z-ordering, partitioning, and optimization (see the sketch after this list).
- Work with catalogs, Unity Catalog (UC), and access controls for secure data governance.
- Monitor and troubleshoot performance issues across jobs, clusters, and workflows.
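As a rough illustration of the Delta Lake maintenance work above, the following PySpark sketch runs OPTIMIZE with Z-ordering and a VACUUM on one table; the table and column names are placeholders, not part of this posting.

```python
from pyspark.sql import SparkSession

# On Databricks a `spark` session already exists; building one here keeps the
# sketch self-contained if run elsewhere.
spark = SparkSession.builder.getOrCreate()

table = "main.sales.orders"   # hypothetical Unity Catalog table
zorder_col = "order_date"     # hypothetical, frequently filtered column

# Compact small files and co-locate rows on the filter column.
spark.sql(f"OPTIMIZE {table} ZORDER BY ({zorder_col})")

# Drop data files no longer referenced by the table; 168 hours (7 days)
# is Delta's default retention threshold.
spark.sql(f"VACUUM {table} RETAIN 168 HOURS")
```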
CI/CD & DevOps
- Implement CI/CD pipelines for Databricks workflows using GitHub Actions, Azure DevOps, or GitLab (see the sketch after this list).
- Work with Databricks Repos and Git workflows (branching, PRs, code reviews).
- Version and promote data pipelines across environments.
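For the promotion step, one common pattern is for the CI pipeline (GitHub Actions, Azure DevOps, or GitLab) to call the Databricks SDK or CLI against each environment. Below is a minimal sketch assuming the databricks-sdk Python package; the job name, notebook path, and cluster ID are placeholders.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

# Credentials come from the environment (e.g. DATABRICKS_HOST / DATABRICKS_TOKEN
# set as CI secrets), so the same script can target dev, test, and prod.
w = WorkspaceClient()

created = w.jobs.create(
    name="orders-pipeline",  # hypothetical job name
    tasks=[
        jobs.Task(
            task_key="transform",
            notebook_task=jobs.NotebookTask(
                notebook_path="/Repos/prod/pipelines/transform"  # hypothetical path
            ),
            existing_cluster_id="1234-567890-abcd123",  # hypothetical cluster ID
        )
    ],
)
print(f"Created job {created.job_id}")
```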
Cloud & Integration
- Integrate Databricks with cloud services such as:
  - Azure: ADLS Gen2, Key Vault, ADF, Event Hub, Synapse
  - AWS: S3, Glue, IAM, Lambda, Kinesis (depending on the cloud environment)
- Work with data ingestion frameworks, messaging systems, and event-driven architectures (an ADLS Gen2 example follows this list).
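As one concrete integration example, the sketch below reads a Delta table from ADLS Gen2 using a service principal whose credentials live in a Key Vault-backed secret scope. The storage account, scope, key names, and tenant ID are all placeholders, and `spark`/`dbutils` are assumed to be the globals a Databricks notebook provides.

```python
# Runs inside a Databricks notebook, where `spark` and `dbutils` are provided.
account = "mystorageacct"  # hypothetical ADLS Gen2 account
suffix = f"{account}.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{suffix}", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{suffix}",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
# Service-principal credentials pulled from a Key Vault-backed secret scope.
spark.conf.set(
    f"fs.azure.account.oauth2.client.id.{suffix}",
    dbutils.secrets.get(scope="adls", key="sp-client-id"),
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.secret.{suffix}",
    dbutils.secrets.get(scope="adls", key="sp-client-secret"),
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{suffix}",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
)

df = spark.read.format("delta").load(f"abfss://raw@{suffix}/orders")
```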
Quality, Testing & Documentation
- Implement unit tests and integration tests using pytest, SQL tests, or dbx testing frameworks (see the sketch after this list).
- Maintain pipeline documentation, data dictionaries, and technical design artifacts.
- Ensure high data quality, reliability, and observability.
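Here is a minimal pytest sketch for a PySpark transformation, runnable locally with pyspark and pytest installed; the function under test (add_line_total) is a made-up example, not from this posting.

```python
import pytest
from pyspark.sql import SparkSession
import pyspark.sql.functions as F


def add_line_total(df):
    """Toy transformation: line_total = quantity * unit_price."""
    return df.withColumn("line_total", F.col("quantity") * F.col("unit_price"))


@pytest.fixture(scope="session")
def spark():
    # Small local session; no cluster required for unit tests.
    return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()


def test_add_line_total(spark):
    df = spark.createDataFrame([(2, 5.0), (3, 1.5)], ["quantity", "unit_price"])
    result = add_line_total(df).collect()
    assert [row.line_total for row in result] == [10.0, 4.5]
```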