Senior Databricks Engineer
Job Summary :
We are seeking a highly skilled and experienced Senior Data Engineer with deep expertise in Databricks to join our Digital Capital team. The ideal candidate has over 6 years of experience working in Databricks to drive operational excellence across the platform; to develop, optimize, and maintain data pipelines; and has a solid foundation in traditional enterprise data warehousing. You will play a critical role in building and maintaining our next-generation data platform, ensuring data quality, reliability, and accessibility for various analytical and operational needs across CDM.

Responsibilities :
- Databricks Platform : Act as a subject matter expert for the Databricks platform within the Digital Capital team, providing technical guidance, best practices, and innovative solutions.
- Databricks Workflows and Orchestration : Design and implement complex data pipelines using Azure Data Factory or Qlik Replicate.
- End-to-End Data Pipeline Development : Design, develop, and implement highly scalable and efficient ETL / ELT processes using Databricks notebooks (Python / Spark or SQL) and other Databricks-native tools.
- Delta Lake Expertise : Utilize Delta Lake for building reliable data lake architecture, implementing ACID transactions, schema enforcement, time travel, and optimizing data storage for performance.
- Spark Optimization : Optimize Spark jobs and queries for performance and cost efficiency within the Databricks environment. Demonstrate a deep understanding of Spark architecture, partitioning, caching, and shuffle operations.
- Data Governance and Security : Implement and enforce data governance policies, access controls, and security measures within the Databricks environment using Unity Catalog and other Databricks security features.
- Collaborative Development : Work closely with data scientists, data analysts, and business stakeholders to understand data requirements and translate them into Databricks-based data solutions.
- Monitoring and Troubleshooting : Establish and maintain monitoring, alerting, and logging for Databricks jobs and clusters, proactively identifying and resolving data pipeline issues.
- Code Quality and Best Practices : Champion best practices for Databricks development, including version control (Git), code reviews, testing frameworks, and documentation.
- Performance Tuning : Continuously identify and implement performance improvements for existing Databricks data pipelines and data models.
- Cloud Integration : Experience integrating Databricks with other cloud services (e.g., Azure Data Lake Storage Gen2, Azure Synapse Analytics, Azure Key Vault) for a seamless data ecosystem.
- Traditional Data Warehousing & SQL : Design, develop, and maintain schemas and ETL processes for traditional enterprise data warehouses. Demonstrate expert-level proficiency in SQL for complex data manipulation, querying, and optimization within relational databases.

Requirements :
- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
- Minimum of 6+ years of relevant experience in data engineering, with a significant portion dedicated to building and managing data solutions.
- Demonstrable expert-level proficiency with Databricks, including :
1. Extensive experience with Spark (PySpark, Spark SQL) for large-scale data processing.
2. Deep understanding and practical application of Delta Lake.
3. Hands-on experience with Databricks Notebooks, Jobs, and Workflows.
- Experience with Unity Catalog for data governance and security.
- Proficiency in optimizing Databricks cluster configurations and Spark job performance.
- Strong programming skills in Python.
- Expert-level SQL proficiency with a strong understanding of relational databases, data warehousing concepts, and data modeling techniques (e.g., Kimball, Inmon).
- Solid understanding of relational and NoSQL databases.
- Experience with cloud platforms (preferably Azure, but AWS or GCP with Databricks experience is also valuable).
- Excellent problem-solving, analytical, and communication skills.
- Ability to work independently and collaboratively in a fast-paced environment.

Qualifications :
- Databricks Certifications (e.g., Databricks Certified Data Engineer).
- Experience with CI / CD pipelines for data engineering projects.

(ref : hirist.tech)
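As a concrete flavor of the Delta Lake and Spark-optimization work described above, the following is a minimal Databricks SQL sketch; all table and column names are hypothetical and chosen only for illustration:

```sql
-- Hypothetical example: a partitioned Delta table with schema enforcement.
CREATE TABLE IF NOT EXISTS sales_bronze (
  order_id   BIGINT,
  order_date DATE,
  amount     DECIMAL(18, 2)
)
USING DELTA
PARTITIONED BY (order_date);

-- ACID upsert: MERGE applies updates and inserts atomically.
MERGE INTO sales_bronze AS target
USING staged_orders AS source
  ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Compact small files and co-locate rows for faster point lookups.
OPTIMIZE sales_bronze ZORDER BY (order_id);

-- Time travel: query the table as of an earlier version.
SELECT * FROM sales_bronze VERSION AS OF 1;
```

Candidates would be expected to reason about sketches like this one, e.g. choosing a partitioning column with appropriate cardinality and knowing when OPTIMIZE / ZORDER pays off.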