We are seeking a skilled Databricks Architect to design, implement, and optimize scalable data solutions within our cloud-based data platform. This role requires extensive knowledge of Databricks (Azure/AWS), data engineering, and a deep understanding of data architecture principles, with the ability to drive strategy, best practices, and hands-on implementation for high-performance data processing and analytics solutions.
Responsibilities:
- Solution Architecture: Design and architect end-to-end data solutions using Databricks on Azure/AWS, covering data ingestion, processing, and storage.
- Delta Lake Implementation: Leverage Delta Lake and the Lakehouse architecture to create robust, unified data structures that support advanced analytics and machine learning (see the Delta Lake sketch after this list).
- Data Processing Development: Design, develop, and automate large-scale, high-performance data processing systems (batch and/or streaming) to drive business growth and enhance the product experience (see the streaming sketch after this list).
- Performance Tuning: Ensure optimal performance of data pipelines and workloads by implementing best practices for resource management, auto-scaling, and query optimization in Databricks (see the tuning sketch after this list).
- Engineering Best Practices: Advocate for high-quality software engineering practices in building scalable data infrastructure and pipelines.
- Architecture/Solution Development: Develop architectures and solutions for large data projects using Databricks.
- Project Leadership: Lead data engineering projects to ensure pipelines are reliable, efficient, testable, and maintainable.
- Data Modeling: Design data models optimized for storage, retrieval, and critical product and business requirements (see the modeling sketch after this list).
- Logging Architecture: Understand and influence logging to support data flow, implementing logging best practices as needed.
- Standardization and Tooling: Contribute to shared data engineering tools and standards to boost productivity and quality for data engineers across the company.
- Collaboration: Work closely with leadership, engineers, program managers, and data scientists to understand and meet data needs.
- Partner Education: Use data engineering expertise to identify gaps and improve existing logging and processes for partners.
- Data Governance: Collaborate with stakeholders to build data lineage, data governance, and data cataloging using Unity Catalog (see the governance sketch after this list).
- Agile Project Management: Lead projects using agile methodologies.
- Communication: Communicate effectively with stakeholders at all organizational levels.
- Team Development: Recruit, retain, and develop team members, preparing them for increased responsibilities and challenges.
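The sketches below illustrate a few of the responsibilities above. They are minimal, hedged examples rather than prescribed implementations. First, a bronze-to-silver Delta Lake flow, assuming a Databricks notebook where `spark` is predefined; all paths, table names, and columns are hypothetical.

```python
from pyspark.sql import functions as F

# Land raw CSV in a bronze Delta table (path and table names are illustrative).
raw = (spark.read
       .option("header", "true")
       .csv("/mnt/landing/orders/"))

raw.write.format("delta").mode("overwrite").saveAsTable("bronze.orders")

# Clean and deduplicate into a silver table that serves analytics and ML.
silver = (spark.table("bronze.orders")
          .dropDuplicates(["order_id"])
          .withColumn("order_ts", F.to_timestamp("order_ts")))

silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")
```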
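For the batch/streaming responsibility, a Structured Streaming sketch that incrementally ingests JSON files into Delta. The schema, paths, and `availableNow` trigger are assumptions chosen for the example.

```python
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

# Streaming file sources require an explicit schema (fields are hypothetical).
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

stream = (spark.readStream
          .schema(schema)
          .json("/mnt/landing/events/"))       # hypothetical landing path

(stream.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/chk/events/")  # enables exactly-once recovery
    .trigger(availableNow=True)   # process the backlog, then stop (incremental batch)
    .toTable("bronze.events"))
```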
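For performance tuning, a sketch of two common levers on Databricks: routine Delta maintenance and an autoscaling job-cluster spec. Table, column, retention, and node-type values are illustrative.

```python
# Compact small files and co-locate a frequently filtered key.
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Remove stale files outside the 7-day retention window.
spark.sql("VACUUM silver.orders RETAIN 168 HOURS")

# Autoscaling cluster spec (field names follow the Databricks Jobs API;
# the node type shown is an Azure example).
cluster_spec = {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "autoscale": {"min_workers": 2, "max_workers": 8},
}
```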
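For data modeling, a sketch of a query-optimized fact table in Delta; the dimensional design and partition column are illustrative choices, not a mandated model.

```python
spark.sql("""
    CREATE TABLE IF NOT EXISTS gold.fact_orders (
        order_id     STRING,
        customer_key BIGINT,
        order_date   DATE,
        amount       DECIMAL(18, 2)
    )
    USING DELTA
    PARTITIONED BY (order_date)  -- partition on the most common filter column
""")
```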
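For governance, a sketch of Unity Catalog securables and least-privilege grants; the catalog, schema, and group names are hypothetical.

```python
# Create securables, then grant an analyst group read-only access.
spark.sql("CREATE CATALOG IF NOT EXISTS analytics")
spark.sql("CREATE SCHEMA IF NOT EXISTS analytics.sales")
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data_analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.sales TO `data_analysts`")
spark.sql("GRANT SELECT ON SCHEMA analytics.sales TO `data_analysts`")
```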
Requirements:
- 10+ years of relevant industry experience.
- ETL Expertise: Skilled in custom ETL design, implementation, and maintenance.
- Data Modeling: Experience in designing and developing data models for reporting systems.
- Databricks Proficiency: Hands-on experience with Databricks SQL workloads.
- Data Ingestion: Expertise in ingesting data from offline files (e.g., CSV, TXT, JSON) as well as API, database, and CDC sources, with prior project experience in each (a CDC sketch follows this list).
- Pipeline Observability: Skilled in setting up robust observability for end-to-end pipelines and Databricks on Azure/AWS.
- Database Knowledge: Proficient in relational databases and SQL query authoring.
- Programming and Frameworks: Experience with Java, Scala, Spark, PySpark, Python, and Databricks.
- Cloud Platforms: Cloud experience required (Azure/AWS preferred).
- Data Scale Handling: Experience working with large-scale data.
- Pipeline Design and Operations: Proven experience in designing, building, and operating robust data pipelines.
- Performance Monitoring: Skilled in deploying high-performance pipelines with reliable monitoring and logging.
- Cross-Team Collaboration: Able to work effectively across teams to establish overarching data architecture and provide guidance.
- ETL Optimization: Ability to optimize ETL pipelines to reduce data transfer and storage costs.
- Auto Scaling: Skilled in using Databricks SQL's auto-scaling feature to adjust worker counts based on workload.
Tech Stack:
- Cloud Platform: Azure/AWS.
- Azure/AWS: Databricks SQL Serverless, Databricks SQL, Databricks workspaces, Databricks notebooks, Databricks job scheduling, Data Catalog.
- Data Architecture: Delta Lake, Lakehouse concepts.
- Data Processing: Spark Structured Streaming.
- File Formats: CSV, Avro, Parquet.
- CI/CD: CI/CD for ETL pipelines.
- Governance Model: Databricks SQL unified governance model (Unity Catalog) across clouds, supporting open formats and APIs.
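As a hedged illustration of the CDC ingestion requirement, a Delta `MERGE` upsert; the change-feed source table, join key, and the `op` column convention are assumptions for this sketch.

```python
from delta.tables import DeltaTable

target = DeltaTable.forName(spark, "silver.customers")
changes = spark.table("bronze.customer_changes")   # hypothetical CDC feed

# Apply inserts, updates, and deletes keyed on customer_id.
(target.alias("t")
   .merge(changes.alias("c"), "t.customer_id = c.customer_id")
   .whenMatchedDelete(condition="c.op = 'DELETE'")
   .whenMatchedUpdateAll(condition="c.op = 'UPDATE'")
   .whenNotMatchedInsertAll(condition="c.op = 'INSERT'")
   .execute())
```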
Skills Required:
Data Modeling, Spark, Databricks, Azure, AWS, ETL