Job Title: Senior Data Lake Implementation Specialist
Experience: 10–12+ Years
Location: Bangalore
Type: Full-time / Contract
Notice Period: Immediate
Job Summary
We are looking for a highly experienced Data Lake Implementation Specialist to lead and execute scalable data lake projects using technologies such as Apache Hudi, Hive, Python, Spark, Flink, and cloud-native tools on AWS or Azure. The ideal candidate has deep expertise in designing and optimizing modern data lake architectures, backed by strong programming and data engineering skills.
Key Responsibilities
- Design, develop, and implement robust data lake architectures on cloud platforms (AWS / Azure).
- Implement streaming and batch data pipelines using Apache Hudi, Apache Hive, and cloud-native services such as AWS Glue and Azure Data Lake.
- Architect and optimize ingestion, compaction, partitioning, and indexing strategies in Apache Hudi.
- Develop scalable data transformation and ETL frameworks using Python, Spark, and Flink.
- Work closely with DataOps/DevOps teams to build CI/CD pipelines and monitoring tools for data lake platforms.
- Ensure data governance, schema evolution handling, lineage tracking, and compliance.
- Collaborate with analytics and BI teams to deliver clean, reliable, and timely datasets.
- Troubleshoot performance bottlenecks in big data processing workloads and pipelines.
Must-Have Skills
- 4+ years of hands-on experience in Data Lake and Data Warehousing solutions
- 3+ years of experience with Apache Hudi, including insert/upsert/delete workflows, clustering, and compaction strategies
- Strong hands-on experience in AWS Glue, AWS Lake Formation, or Azure Data Lake / Synapse
- 6+ years of coding experience in Python, especially in data processing
- 2+ years of working experience with Apache Flink and/or Apache Spark
- Sound knowledge of Hive, Parquet/ORC formats, and Delta Lake vs. Hudi vs. Iceberg
- Strong understanding of schema evolution, data versioning, and ACID guarantees in data lakes
Nice To Have
- Experience with Apache Iceberg and Delta Lake
- Familiarity with Kinesis, Kafka, or any streaming platform
- Exposure to dbt, Airflow, or Dagster
- Experience in data cataloging, data governance tools, and column-level lineage tracking
Education & Certifications
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- Relevant certifications in AWS Big Data, Azure Data Engineering, or Databricks
Skills Required
ORC, Airflow, AWS Glue, Kafka, Apache Hive, Kinesis, dbt, Spark, Azure Data Lake, Python