Talent.com
Senior Data Engineer - Apache Spark

Apolis
Delhi, IN
27 days ago
Job description

About the Role:

We are looking for a highly skilled Senior Data Engineer with strong expertise in Apache Spark and Databricks to join our growing data engineering team. You will be responsible for designing, developing, and optimizing scalable data pipelines and applications using modern cloud data technologies. This is a hands-on role requiring deep technical knowledge, strong problem-solving skills, and a passion for building efficient, high-performance data solutions that drive business value.

Responsibilities:

  • Design, develop, and implement scalable data pipelines and applications using Apache Spark and Databricks, adhering to industry best practices.
  • Integrate data from multiple source systems: legacy warehouses, modern cloud data platforms (Databricks, Google BigQuery), streaming systems (Kafka), and APIs.
  • Apply Lambda and Kappa architectures where appropriate to balance real-time and batch processing needs.
  • Troubleshoot complex issues related to data ingestion, transformation, and pipeline execution, and ensure data quality, lineage, and governance practices are embedded into data pipelines.
  • Work with data modelling teams to implement canonical models (customer, contract, usage, network).
  • Implement and monitor consent management and regulatory compliance (e.g., GDPR) in data pipelines.
  • Perform in-depth performance tuning and optimization of Spark applications within the Databricks environment.
  • Collaborate with cross-functional teams including data scientists, analysts, and architects to deliver end-to-end data solutions.
  • Continuously evaluate and adopt new technologies and tools in the Databricks and cloud ecosystem.
  • Collaborate with DataOps, architects, and monitoring teams to establish observability and incident management practices.
  • Document technical designs, development processes, and operational procedures.

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 10+ years of experience in data engineering or big data development.
  • 6+ years of hands-on experience with Apache Spark and Databricks.
  • Deep understanding of Spark internals, Spark Streaming, Kafka, and Delta Lake.
  • Experience designing Lambda and/or Kappa architectures for hybrid batch and streaming data processing.
  • Experience developing solutions using Azure Data Services, including: Azure Databricks, Azure Data Factory, Azure DevOps, Azure Functions, Event/Message services, and Azure SQL Database.
  • Proficient in PySpark or Scala.
  • Strong experience with open-source big data technologies, including Apache Flink, Airflow, NiFi, and Kafka.
  • Strong experience in performance tuning, cost optimization, and cluster management in Databricks.
  • Solid understanding of data warehousing, ETL/ELT pipelines, and data modelling.
  • Experience working with cloud platforms (Azure preferred; AWS/GCP a plus).
  • Familiarity with governance and compliance frameworks (e.g., GDPR, PII handling, consent management).
  • Familiarity with Agile/Scrum methodologies.
  • Proficiency in German and English (written and verbal).

Preferred Qualifications:

  • Databricks Certified Professional Data Engineer certification is a strong plus.
  • Strong communication skills, both written and verbal, with the ability to convey technical concepts to non-technical stakeholders.
  • Proven ability to work independently as well as in a team-oriented, collaborative environment.
(ref: hirist.tech)
