Role: Data Engineer
Exp: 5+ years
Location: Remote / Hybrid
Work Timing: 2 PM - 10 PM UK Time Zone
Immediate Joiners
Key Skills: Python, PySpark, Azure Cloud, Azure Functions for data extraction from different APIs, Azure Synapse, Databricks, JSON, SQL
We are looking for an experienced Data Engineer (5+ years) with strong hands-on expertise in building scalable data pipelines, API integrations, cloud-native solutions, and big data processing. The ideal candidate must have deep experience with Python, PySpark, the Azure Data Platform (ADLS, ADF, Functions, Synapse, Databricks), and SQL-based data modeling.
Key Responsibilities
1. Data Pipeline Development
Design, develop, and maintain end-to-end ETL/ELT pipelines using Azure Data Factory, Databricks, and PySpark.
Build scalable data ingestion workflows from various sources including REST APIs, databases, cloud storage, and streaming systems.
Implement incremental load, CDC (Change Data Capture), and high-volume data ingestion frameworks.
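As context for candidates, the incremental-load bullet refers to the high-water-mark pattern: only rows modified after the last stored watermark are pulled on each run. A minimal sketch (field names and sample data are hypothetical):

```python
def incremental_load(records, last_watermark):
    """Return only records modified after the stored watermark,
    plus the new watermark value (high-water-mark pattern)."""
    new_rows = [r for r in records if r["modified_at"] > last_watermark]
    # ISO-8601 timestamps compare correctly as strings.
    new_watermark = max((r["modified_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark

# Hypothetical source rows with ISO-8601 modification timestamps.
source = [
    {"id": 1, "modified_at": "2024-01-01T10:00:00"},
    {"id": 2, "modified_at": "2024-01-02T09:30:00"},
    {"id": 3, "modified_at": "2024-01-03T08:15:00"},
]
rows, wm = incremental_load(source, "2024-01-01T12:00:00")
```

In a real pipeline the watermark would be persisted (e.g. in a control table) and the filter pushed down to the source query rather than applied in memory.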
2. API Data Extraction & Azure Functions
Develop Azure Functions (Python) to extract data from REST/SOAP APIs, handling pagination, throttling, and authentication (OAuth2, JWT, API keys).
Implement retry logic, monitoring, error handling, and logging in API ingestion pipelines.
Parse and transform complex JSON / XML structures.
3. Big Data Processing (PySpark / Databricks)
Write optimized PySpark jobs for data processing, cleansing, aggregation, and transformations.
Work with Delta Lake for ACID transactions, schema evolution, and time-travel queries.
Optimize Spark jobs using partitioning, caching, broadcast joins, and AQE.
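For candidates unfamiliar with the broadcast-join optimization mentioned above: in PySpark it is requested with `fact_df.join(F.broadcast(dim_df), "key")`. Conceptually, the small dimension table is shipped whole to every partition so the large fact table never shuffles. A pure-Python analogue of that map-side join (data and names invented for illustration):

```python
def broadcast_join(fact_partitions, small_dim, key):
    """Map-side join: the small dimension is 'broadcast' as a dict
    lookup available to every partition, so no shuffle of the fact
    rows is needed (the idea behind PySpark's broadcast() hint)."""
    dim_index = {row[key]: row for row in small_dim}  # the broadcast variable
    joined = []
    for part in fact_partitions:          # each partition joins locally
        for fact in part:
            dim = dim_index.get(fact[key])
            if dim is not None:           # inner-join semantics
                joined.append({**fact, **dim})
    return joined

facts = [[{"k": 1, "amt": 10}], [{"k": 2, "amt": 20}, {"k": 9, "amt": 5}]]
dims = [{"k": 1, "name": "a"}, {"k": 2, "name": "b"}]
out = broadcast_join(facts, dims, "k")
```

The trade-off: broadcasting only pays off while the dimension fits comfortably in each executor's memory; beyond that, a shuffle join (or AQE's automatic strategy switch) is the better choice.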
4. Azure Cloud Data Platform
Utilize Azure services such as:
Azure Data Lake Storage (ADLS Gen2)
Azure Data Factory (ADF)
Azure Synapse Analytics (SQL Pool / Serverless SQL)
Azure Databricks
Azure Key Vault
Azure Functions
Deploy and manage solutions using CI/CD pipelines with Azure DevOps.
5. Data Modeling & SQL
Design and develop data warehouse models (Star Schema, Snowflake).
Write advanced SQL queries for transformations, validation, and performance tuning.
Create and manage tables, partitions, stored procedures, and views in Azure Synapse / SQL Server.
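The star-schema modeling above boils down to fact tables joined to dimension tables and rolled up. A self-contained sketch using Python's stdlib `sqlite3` as a stand-in for Synapse/SQL Server (table and column names are illustrative, not from any real warehouse):

```python
import sqlite3

# In-memory database standing in for a warehouse; schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER, product_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'books'), (2, 'toys');
    INSERT INTO fact_sales VALUES (10, 1, 9.5), (11, 1, 5.0), (12, 2, 3.0);
""")

# Classic star-schema rollup: fact joined to a dimension, then aggregated.
rows = conn.execute("""
    SELECT p.category, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

In Synapse the same query shape applies, with the fact table typically hash-distributed on the join key and dimensions replicated.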
6. Data Quality & Governance
Implement data validation, profiling, and quality checks.
Maintain metadata, lineage, and documentation using appropriate tools.
Ensure compliance with security policies, encryption, and access management.
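The profiling and quality checks listed above typically start with per-column metrics such as null rate, cardinality, and duplicate detection. A minimal sketch (function name and sample rows are hypothetical):

```python
def profile_column(rows, column):
    """Basic data-quality profile for one column: null percentage,
    distinct count, and a duplicate flag."""
    values = [r.get(column) for r in rows]
    non_null = [v for v in values if v is not None]
    return {
        "null_pct": round(100 * (len(values) - len(non_null)) / len(values), 1),
        "distinct": len(set(non_null)),
        "has_duplicates": len(non_null) != len(set(non_null)),
    }

data = [{"id": 1}, {"id": 2}, {"id": 2}, {"id": None}]
report = profile_column(data, "id")
```

In a Databricks pipeline these checks would run on DataFrames (or via a framework such as Great Expectations) with failures routed to alerts rather than returned as a dict.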
Required Technical Skills
Core Skills
Python (5+ years)
PySpark / Spark SQL
Azure Cloud (ADF, ADLS, Functions, Synapse, Databricks)
REST API integration
JSON, XML Parsing
SQL (T-SQL / Spark SQL / Synapse SQL)
ETL / ELT development
Delta Lake