We are looking for a Data Engineer with strong experience in CI/CD practices, Databricks (Spark), Python, GitHub, and SQL. The ideal candidate has hands-on expertise in building and automating data pipelines, managing multi-environment deployments, and working with modern DevOps and configuration management tools.
Key Responsibilities
- Design, implement, and manage CI/CD pipelines using GitHub Actions or equivalent tools.
- Configure and maintain GitHub branch protection rules and workflows to ensure smooth code integration and deployment.
- Build, optimize, and debug Spark jobs in Databricks for large-scale data processing.
- Automate Databricks deployments using Databricks Asset Bundles (DABs) or similar frameworks.
- Develop, test, and maintain robust data pipelines using Python and SQL.
- Implement unit testing, validation, and quality checks for data workflows.
- Manage multi-environment deployments (Dev, Stage, Prod) ensuring reliability and consistency.
- Write and maintain configuration files (YAML, JSON) for workflows, pipelines, and automation.
- Troubleshoot and resolve configuration or syntax issues in YAML and related files.
- Keep critical Databricks notebooks running reliably to ensure business continuity.
- Migrate legacy reports into modern, self-serve dashboards (Snowflake, Tableau, Power BI).
- Partner with cross-functional teams to understand needs and deliver data models, insights, and reporting they can trust.
- Put guardrails in place with data contracts, quality checks, and documentation to ensure long-term reliability.
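To give a concrete sense of the CI/CD and YAML responsibilities above, a minimal GitHub Actions workflow for a data project might look like the sketch below. The workflow, file, and script names are illustrative only, not a company standard:

```yaml
# .github/workflows/ci.yml -- illustrative sketch, not a prescribed setup
name: ci
on:
  pull_request:
    branches: [main]  # pairs with branch protection requiring this check to pass
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest tests/  # unit tests and data-quality checks run on every PR
```

With branch protection configured to require this check, code cannot merge to `main` until the tests pass.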
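Multi-environment Databricks deployments with Asset Bundles are typically described in a `databricks.yml` with one target per environment; a sketch follows, where the bundle name and workspace hosts are placeholders:

```yaml
# databricks.yml -- sketch; bundle name and hosts are placeholders
bundle:
  name: example_pipelines
targets:
  dev:
    mode: development
    workspace:
      host: https://dev-workspace.cloud.databricks.com
  stage:
    workspace:
      host: https://stage-workspace.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://prod-workspace.cloud.databricks.com
```

Deployments to a given environment then run via `databricks bundle deploy -t <target>`, often invoked from the CI/CD workflow itself to keep Dev, Stage, and Prod consistent.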
What we’re looking for:
- Strong skills in SQL, Databricks (Spark), and Python/R.
- Strong knowledge of GitHub and CI/CD practices (branch protection, GitHub Actions workflows).
- Experience implementing unit tests and quality checks in data workflows.
- Knowledge of Databricks Asset Bundles (DABs) or similar automation tools.
- Familiarity with YAML configuration management, debugging, and error resolution.
- Strong problem-solving and debugging skills in data engineering workflows.
- Experience with Snowflake or another cloud data warehouse.
- Hands-on experience with Tableau/Power BI dashboards and a good sense for usability.
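As a flavor of the unit testing and data-quality checks this role involves, here is a minimal Python sketch; the function name, column names, and thresholds are hypothetical, not from this posting:

```python
# Minimal sketch of a batch data-quality check; all names are illustrative.

def validate_rows(rows, required, non_negative=()):
    """Return a list of human-readable violations for a batch of records.

    rows: iterable of dicts (e.g. records sampled from a pipeline stage)
    required: column names that must be present and non-null
    non_negative: numeric columns that must be >= 0
    """
    violations = []
    for i, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:
                violations.append(f"row {i}: missing required column '{col}'")
        for col in non_negative:
            value = row.get(col)
            if value is not None and value < 0:
                violations.append(f"row {i}: negative value in '{col}': {value}")
    return violations


rows = [
    {"order_id": 1, "amount": 10.5},
    {"order_id": None, "amount": -2.0},
]
problems = validate_rows(rows, required=["order_id", "amount"],
                         non_negative=["amount"])
print(problems)
```

In practice such checks would be wrapped as unit tests (e.g. with pytest) and run in CI, or as a validation step gating promotion from Stage to Prod.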