About the Role
We are looking for a highly skilled Data Engineer with strong hands-on experience in Databricks, PySpark, GCP, and cloud data infrastructure. The ideal candidate will be responsible for designing, developing, and maintaining scalable data pipelines and infrastructure to support analytics, reporting, and data-driven decision-making.
Key Responsibilities
- Design, build, and optimize scalable data pipelines and ETL processes using Databricks and PySpark.
- Implement and manage Unity Catalog for data governance, lineage, and access control.
- Develop and maintain infrastructure as code using Terraform for GCP services.
- Create and manage CI/CD pipelines for data engineering workflows.
- Work with GCP services such as Cloud Build, BigQuery, Firestore, and other related tools.
- Ensure high data quality, security, and availability across all data platforms.
- Collaborate with cross-functional teams including data scientists, analysts, and DevOps engineers.
- Troubleshoot data-related issues and optimize performance.
Required Skills
- Databricks – Expert hands-on experience
- Unity Catalog – Strong understanding and implementation knowledge
- PySpark – Advanced coding and optimization skills
- SQL – Proficient in writing complex queries and performance tuning
- Terraform – Experience in automating infrastructure deployments
- CI/CD – Proficient in setting up automated build and deployment pipelines
- GCP services – Hands-on experience with Cloud Build, BigQuery, Firestore, etc.
Good to Have
- Knowledge of Git and Agile methodologies