Job Title : Data Engineer
About the Role :
We are seeking a highly skilled Data Engineer to join our team. The ideal candidate will have expertise in Databricks, Python, and SQL.
Key Responsibilities :
- Design, develop, and maintain scalable ETL / ELT data pipelines using Databricks (PySpark) on Azure / AWS / GCP (see the sketch after this list).
- Develop clean, reusable, and performant Python code for data ingestion, transformation, and quality checks.
- Write efficient and optimized SQL queries for querying structured and semi-structured data.
- Work with stakeholders to understand data requirements and implement end-to-end data workflows.
- Perform data profiling and validation to ensure data quality and integrity.
- Optimize data pipelines for performance and reliability, and integrate data from various sources (APIs, flat files, databases, and cloud storage such as S3 or ADLS).
- Build and maintain Delta tables using the Delta Lake format for ACID-compliant streaming and batch pipelines.
- Work with Databricks Workflows to orchestrate pipelines and scheduled jobs.
- Collaborate with DevOps and cloud teams to ensure secure, scalable, and compliant infrastructure.
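For illustration, here is a minimal PySpark sketch of the kind of pipeline these responsibilities describe: ingest raw CSV, apply a transformation and a basic quality filter, and write a Delta table. All paths, column names, and table names are hypothetical placeholders, not a prescribed implementation.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession named `spark` is predefined; the builder call
# below only matters when running this sketch outside a notebook.
spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Ingest: read raw CSV files from cloud storage (hypothetical path).
raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("s3://example-bucket/raw/orders/"))

# Transform: cast types and derive a business column (hypothetical schema).
orders = (raw
          .withColumn("order_ts", F.to_timestamp("order_ts"))
          .withColumn("net_amount", F.col("amount") - F.col("discount")))

# Quality check: keep only rows that satisfy basic integrity rules.
clean = orders.filter(F.col("order_id").isNotNull() & (F.col("net_amount") >= 0))

# Load: write as a Delta table for ACID-compliant downstream consumption.
clean.write.format("delta").mode("overwrite").saveAsTable("analytics.orders_clean")
```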
Technical Skills Required :
Core Technologies :
- Databricks (Spark on Databricks, Delta Lake, Unity Catalog)
- Python with strong knowledge of PySpark
- SQL at an advanced level: joins, window functions, CTEs, aggregation (see the example below)
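For example, "advanced SQL" here means queries in the following style: a CTE plus a window function to pick each customer's latest order, shown as a hedged sketch run through spark.sql against the hypothetical table from the sketch above.

```python
# CTE + window function: latest order per customer. Assumes the Databricks
# `spark` session and the hypothetical table from the earlier sketch.
latest = spark.sql("""
    WITH ranked AS (
        SELECT customer_id, order_id, net_amount, order_ts,
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY order_ts DESC) AS rn
        FROM analytics.orders_clean
    )
    SELECT customer_id, order_id, net_amount, order_ts
    FROM ranked
    WHERE rn = 1
""")
latest.show()
```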
ETL & Orchestration :
- Databricks Workflows / Jobs
- Airflow, Azure Data Factory, or similar orchestration tools
- Auto Loader, Structured Streaming (preferred; see the sketch below)
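Incremental ingestion with Auto Loader and Structured Streaming typically looks like the sketch below. The cloudFiles source is Databricks-specific, and all paths and table names are hypothetical placeholders.

```python
# Auto Loader (Databricks `cloudFiles` source) streaming into a bronze
# Delta table; every path and name here is a hypothetical placeholder.
events = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/events/")
          .load("s3://example-bucket/raw/events/"))

(events.writeStream
 .option("checkpointLocation", "s3://example-bucket/_checkpoints/events/")
 .trigger(availableNow=True)  # drain the backlog, then stop (batch-style run)
 .toTable("analytics.events_bronze"))
```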
Cloud Platforms (any one or more) :
- Azure : Databricks on Azure, ADLS, ADF, Synapse
- AWS : Databricks on AWS, S3, Glue, Redshift
- GCP : Dataproc, BigQuery, GCS
Data Modeling & Storage :
- Experience working with Delta Lake, Parquet, Avro
- Understanding of dimensional modeling, data lakes, and lakehouse architectures (see the upsert sketch below)
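One common lakehouse pattern behind these bullets is maintaining dimension tables with Delta Lake's MERGE. A minimal Type-1 (overwrite-in-place) upsert sketch, where the table names and the key column are hypothetical:

```python
from delta.tables import DeltaTable

# Type-1 dimension upsert via Delta MERGE; table names and the
# customer_id key are hypothetical, and `spark` is the Databricks session.
updates = spark.table("staging.customer_updates")

(DeltaTable.forName(spark, "analytics.dim_customer").alias("t")
 .merge(updates.alias("s"), "t.customer_id = s.customer_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())
```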
Monitoring & Version Control :
- CI / CD pipelines for Databricks via Git, Azure DevOps, or Jenkins
- Logging, debugging, and monitoring with tools like Datadog, Prometheus, or cloud-native tools
Optional / Preferred :
- Knowledge of MLflow, Feature Store, or MLOps workflows
- Experience with REST APIs, JSON, and data ingestion from third-party services
- Familiarity with dbt (Data Build Tool) or Great Expectations for data quality (see the sketch below)
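Great Expectations formalizes checks of the following kind; as a hedged plain-PySpark equivalent, with a hypothetical table and hypothetical rules:

```python
from pyspark.sql import functions as F

# Basic data-quality assertions in plain PySpark; the table name and the
# rules are hypothetical, and `spark` is the Databricks session.
df = spark.table("analytics.orders_clean")

null_keys = df.filter(F.col("order_id").isNull()).count()
dup_keys = df.count() - df.dropDuplicates(["order_id"]).count()
neg_amounts = df.filter(F.col("net_amount") < 0).count()

assert null_keys == 0, f"{null_keys} rows with a null order_id"
assert dup_keys == 0, f"{dup_keys} duplicate order_id values"
assert neg_amounts == 0, f"{neg_amounts} rows with negative net_amount"
```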
Soft Skills :
- Strong analytical, problem-solving, and debugging skills
- Clear communication and documentation skills
- Ability to work independently and within cross-functional teams
- Agile / Scrum working experience