We are helping our client hire a Lead Data Engineer. This is a full-time role based in Pune, and the client is looking for an immediate joiner (maximum 15 days' notice).
Lead Data Engineer | Azure | Python | PySpark
Job Type : Full Time
Job Positions : 2
Location : Pune (Hybrid model 3 days WFO required)
Job Description :
We are looking for a Lead Data Engineer with 7+ years of hands-on experience in architecting and optimizing complex data pipelines, with a strong command of the Azure cloud ecosystem, Python, and PySpark. This hybrid role is based out of Pune and requires deep technical expertise in building scalable, resilient, and secure data platforms that drive business intelligence.
Key Responsibilities :
- Design and develop complex SQL queries and Python scripts for data extraction, transformation, and processing.
- Build and optimize scalable data pipelines and architectures for performance, cost-efficiency, and resiliency.
- Lead on-prem to cloud data migration initiatives, especially into Azure-based environments.
- Develop and manage data models and implement effective ETL frameworks across large datasets.
- Implement batch and real-time data ingestion strategies using tools like Azure Data Factory, Kafka, and Spark (a minimal batch sketch follows this list).
- Utilize Azure Synapse, Azure Data Lake, Azure SQL, and other Azure-native services for data orchestration and storage.
- Ensure data quality, lineage, governance, and enforce security protocols including encryption and access control.
- Automate data workflows to improve delivery speed and minimize manual effort.
- Collaborate with cross-functional teams including Data Scientists, Analysts, and Platform Engineers.
- (Optional but preferred) Participate in managing CI/CD pipelines, infrastructure-as-code, and monitoring data platform health.
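To make the batch side of these responsibilities concrete, below is a minimal PySpark sketch of the ingest-cleanse-persist pattern the role describes. It is illustrative only: the storage account, container paths, and column names (order_id, order_ts, amount) are hypothetical, not this client's actual pipeline.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-batch-ingest").getOrCreate()

# Raw CSV landed in Azure Data Lake Gen2; the abfss URI is illustrative.
raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("abfss://raw@examplelake.dfs.core.windows.net/orders/"))

# Basic cleansing and typing before persisting to the curated zone.
curated = (raw
           .dropDuplicates(["order_id"])
           .withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("order_date", F.to_date("order_ts"))
           .filter(F.col("amount") > 0))

# Write as a Delta table partitioned by date for downstream Synapse /
# lakehouse queries (assumes the Delta Lake connector is on the cluster).
(curated.write
 .format("delta")
 .mode("overwrite")
 .partitionBy("order_date")
 .save("abfss://curated@examplelake.dfs.core.windows.net/orders/"))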
Core Skills & Requirements :
- 7+ years of proven experience in data engineering, with a focus on designing scalable data architectures, building automated pipelines, and working across large, complex datasets in enterprise environments.
- Deep expertise in writing, optimizing, and debugging complex, high-performance SQL queries across relational and cloud databases.
- Advanced hands-on experience using Python for data wrangling, automation, and ETL/ELT pipeline orchestration.
- Proficient in distributed data processing using PySpark for big data pipelines in real-time and batch modes.
- Azure Synapse Analytics for scalable query processing and data warehousing.
- Azure Data Factory (ADF) for orchestrating pipelines and data integration.
- Azure Data Lake (Gen2) for storage and structured/unstructured data ingestion.
- Azure SQL, Azure Cosmos DB, and exposure to Azure Fabric.
- Practical experience with Apache Spark for in-memory computation.
- Skilled in end-to-end ETL/ELT pipeline design, development, and optimization.
- Experience in on-premise to cloud migration projects, especially to Azure-based environments.
- Knowledge of data modeling, delta-lake architecture, and lakehouse patterns for scalable analytics solutions.
- Focus on resiliency, cost-efficiency, and performance optimization of data workflows.
- Understanding of CI/CD concepts, with exposure to implementing automated deployments for data solutions.
- Experience with infrastructure-as-code, environment provisioning, and pipeline monitoring tools.
- Hands-on implementation of data security measures, including encryption, RBAC, auditing, and PII protection.
- Familiarity with governance standards, compliance practices, and best practices for enterprise data platforms.
- Strong analytical and problem-solving abilities.
- Effective communication and collaboration in cross-functional agile teams.
- Self-driven and proactive in identifying quality gaps and proposing solutions.
- Willingness to continuously learn and adapt to new technologies and testing methodologies.
Must-Have Skills : Python & PySpark, Advanced SQL, Azure Synapse Analytics, Azure Data Factory (ADF), Azure Data Lake & Azure SQL, Azure Cloud Data Migration, Data Security
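By way of illustration, the "Advanced SQL" requirement above has in mind queries that combine aggregation with window functions. The sketch below is hypothetical (the curated_orders table and its columns are invented) and is run through PySpark to stay on the same stack.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("advanced-sql-example").getOrCreate()

# Assumes a curated_orders table is already registered in the metastore,
# e.g. a Delta table produced by a pipeline like the sketch above.
monthly_top_customers = spark.sql("""
    WITH monthly_spend AS (
        SELECT customer_id,
               date_trunc('month', order_ts) AS order_month,
               SUM(amount)                   AS total_spend
        FROM curated_orders
        GROUP BY customer_id, date_trunc('month', order_ts)
    )
    SELECT customer_id, order_month, total_spend,
           RANK() OVER (PARTITION BY order_month
                        ORDER BY total_spend DESC) AS spend_rank
    FROM monthly_spend
""").where("spend_rank <= 10")

monthly_top_customers.show()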
Academic : Post Graduate / Graduate in Engineering / Technology / MBA
(ref : hirist.tech)