About the Role
We are looking for a highly skilled Data Engineer with strong hands-on experience in Databricks, PySpark, GCP, and cloud data infrastructure. The ideal candidate will be responsible for designing, developing, and maintaining scalable data pipelines and infrastructure to support analytics, reporting, and data-driven decision-making.
Key Responsibilities
- Design, build, and optimize scalable data pipelines and ETL processes using Databricks and PySpark.
- Implement and manage Unity Catalog for data governance, lineage, and access control.
- Develop and maintain infrastructure as code using Terraform for GCP services.
- Create and manage CI/CD pipelines for data engineering workflows.
- Work with GCP services such as Cloud Build, BigQuery, Firestore, and other related tools.
- Ensure high data quality, security, and availability across all data platforms.
- Collaborate with cross-functional teams including data scientists, analysts, and DevOps engineers.
- Troubleshoot data-related issues and optimize performance.
Required Skills
- Databricks – Expert hands-on experience
- Unity Catalog – Strong understanding and implementation knowledge
- PySpark – Advanced coding and optimization skills
- SQL – Proficient in writing complex queries and performance tuning
- Terraform – Experience in automating infrastructure deployments
- CI/CD – Proficient in setting up automated build and deployment pipelines
- GCP Services – Hands-on experience with Cloud Build, BigQuery, Firestore, etc.
Good to Have
- Knowledge of Git and Agile methodologies