Skill : Azure Data Engineer
Total Experience Required : 4 to 8 years
Location : Mumbai / Pune / Bangalore / Gurgaon
Overview :
We are seeking a highly skilled and motivated Data Engineer to join our growing data engineering team. The ideal candidate will be responsible for designing, developing, and maintaining scalable data pipelines and data lake solutions using Azure Data Services, Apache Spark, and Big Data technologies. This role requires strong hands-on experience in data integration, transformation, and storage using tools such as Azure Synapse, Azure Data Factory, Databricks, SQL Server, and PySpark. The successful candidate will work closely with data scientists, analysts, and business stakeholders to ensure the availability, quality, and reliability of data assets for analytics and AI workloads.
Responsibilities :
- Design, build, and maintain scalable and reliable data pipelines using Azure Data Factory, Azure Synapse, and Databricks.
- Develop and optimize large-scale data processing jobs using PySpark and Spark SQL on Azure Databricks or Synapse Spark pools.
- Manage large datasets stored in data lakes (ADLS Gen2) and integrate them with enterprise data warehouses (e.g., SQL Server, Synapse).
- Implement robust data transformation, cleansing, and aggregation logic to support analytics and reporting use cases.
- Collaborate with BI developers, analysts, and data scientists to provision clean, reliable, and timely datasets.
- Optimize data flows for performance and cost efficiency on Azure cloud platforms.
- Implement data governance, lineage, and security practices across the data architecture.
- Troubleshoot and resolve data pipeline failures, and design pipelines for high availability and fault tolerance.
- Participate in code reviews, adhere to version control practices, and maintain high coding standards.
Mandatory Skill-sets :
- Azure Data Factory (ADF), Azure Synapse Analytics, Azure Databricks
- Apache Spark, PySpark
- SQL Server / T-SQL / Synapse SQL
- Azure Data Lake Storage Gen2
- Big Data ecosystem knowledge (Parquet, Delta Lake, etc.)
- Git, DevOps pipelines for data engineering
- Performance tuning of Spark jobs and SQL queries
Preferred Skill-sets :
- Python for data engineering workflows
- Azure Monitor, Log Analytics for pipeline observability
- Power BI (data modeling, DAX)
- Delta Lake
- Experience with CI / CD for data pipelines (YAML pipelines, ADF integration)
- Knowledge of data quality tools and frameworks