About the Role:
We are looking for an experienced Lead Data Engineer with deep expertise in Big Data technologies, particularly within the Google Cloud Platform (GCP) ecosystem. The ideal candidate has a strong command of PySpark/Spark, SQL, and Python, and a proven track record of building, optimizing, and managing large-scale data pipelines and cloud-native data platforms.
Key Responsibilities:
- Lead the design and implementation of scalable ETL/ELT pipelines using Spark (batch and streaming) and Python
- Architect and optimize BigQuery solutions using advanced SQL, partitioning, clustering, and materialized views (see the table sketch after this list)
- Guide the team on GCP services: Dataproc, GCS, BigQuery, Cloud Composer, Cloud Functions, IAM, and Cloud Logging
- Conduct code reviews and mentor team members on Spark optimization (caching, memory management, broadcast joins, skew handling; see the join sketch below)
- Drive Airflow DAG development, configuration management, and orchestration workflows (a minimal DAG sketch follows this list)
- Solve complex data engineering problems and contribute to architectural decisions for performance, scalability, and cost-efficiency
- Ensure data quality, governance, and security best practices are enforced across all data platforms
- Support team readiness through technical ramp-up and ongoing skill enhancement
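
For the BigQuery item above, a minimal sketch of creating a date-partitioned, clustered table through the google-cloud-bigquery Python client; the project, dataset, table, and column names are hypothetical.

```python
from google.cloud import bigquery

# Hypothetical project id for illustration only.
client = bigquery.Client(project="my-gcp-project")

ddl = """
CREATE TABLE IF NOT EXISTS `my-gcp-project.analytics.events`
(
  event_ts    TIMESTAMP NOT NULL,
  customer_id STRING,
  event_type  STRING,
  payload     STRING
)
-- Partition by day so queries filtered on event date scan fewer bytes.
PARTITION BY DATE(event_ts)
-- Cluster by the most common filter/join columns to prune scans further.
CLUSTER BY customer_id, event_type
"""

# Run the DDL; .result() blocks until the query job completes.
client.query(ddl).result()
```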
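For the Spark optimization item, a short PySpark sketch of a broadcast join, with Spark 3 AQE skew-join settings as a complementary mitigation; the bucket paths, table shapes, and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("broadcast-join-demo").getOrCreate()

# Spark 3's adaptive execution can split skewed partitions at join time.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")

# Hypothetical inputs: a large fact table and a small lookup table.
# On Dataproc, gs:// paths resolve through the GCS connector.
orders = spark.read.parquet("gs://my-bucket/orders/")        # large
countries = spark.read.parquet("gs://my-bucket/countries/")  # small

# broadcast() ships the small table to every executor, so the large side
# joins locally without a shuffle -- a common fix for skewed join keys.
enriched = orders.join(broadcast(countries), on="country_code", how="left")

enriched.write.mode("overwrite").parquet("gs://my-bucket/orders_enriched/")
```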
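For the Airflow item, a minimal DAG sketch (assuming Airflow 2.4+) showing branching and an XCom push; the DAG id, schedule, and task logic are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import BranchPythonOperator, PythonOperator

def choose_branch(**context):
    # Hypothetical rule: full load on the first of the month, else incremental.
    return "full_load" if context["ds"].endswith("-01") else "incremental_load"

def load(mode, **context):
    # Push a row count downstream via XCom for monitoring/failure handling.
    context["ti"].xcom_push(key="row_count", value=42)

with DAG(
    dag_id="example_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    branch = BranchPythonOperator(
        task_id="choose_branch", python_callable=choose_branch
    )
    full = PythonOperator(
        task_id="full_load", python_callable=load, op_kwargs={"mode": "full"}
    )
    incr = PythonOperator(
        task_id="incremental_load",
        python_callable=load,
        op_kwargs={"mode": "incremental"},
    )
    # Only the branch chosen by choose_branch runs; the other is skipped.
    branch >> [full, incr]
```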
Must-Have Skills:
- Hands-on experience with PySpark/Spark core concepts, internals, transformations, and tuning
- Strong knowledge of SQL and BigQuery, including window functions, CTEs, performance tuning, and joins (see the query sketch below)
- Proficiency in Python with strong problem-solving abilities
- Deep experience with GCP components: BigQuery, Dataproc, GCS, and Cloud Composer
- Understanding of Airflow, including XComs, variables, schema-based DAG creation, and branching
- Exposure to Hive, partitioning (static/dynamic), and bucketed tables
- Familiarity with data pipeline orchestration, monitoring, and failure handling
- Solid grasp of data security (column-level, row-level, IAM roles)
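
To ground the SQL/BigQuery expectation, a small sketch combining a CTE with a window function to keep only the latest row per key; the table and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table/columns: keep the most recent row per customer_id.
sql = """
WITH ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY customer_id
      ORDER BY updated_at DESC
    ) AS rn
  FROM `my-gcp-project.analytics.customers_raw`
)
SELECT * EXCEPT (rn)
FROM ranked
WHERE rn = 1
"""

for row in client.query(sql).result():
    print(dict(row))
```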