JOB DESCRIPTION / REQUIREMENT
Role Overview:
We are looking for a Senior Data Engineer with 5+ years of experience in Big Data technologies. The ideal candidate will have strong hands-on experience with Spark, PySpark, and Databricks, and be capable of building scalable, reliable data pipelines. Knowledge of DevOps practices and containerization tools is a plus.
Primary Responsibilities:
Develop and maintain scalable data pipelines for large-scale data processing.
Translate business requirements into technical specifications and efficient ETL code.
Work independently to develop and optimize Spark/PySpark pipelines.
Write complex business logic using PySpark and Spark SQL.
Understand and manage Spark clusters and parallel data processing.
Develop unit tests for ETL components to ensure data integrity and quality.
Build modular, reusable code functions to streamline development.
Work with Amazon Athena: create tables and partitions, and write complex SQL queries.
Integrate data from relational databases (Oracle, SQL Server) and NoSQL databases (e.g., MongoDB).
Technical Skills Required (Primary):
Apache Spark, PySpark, Spark SQL
Databricks
Python (for Data Engineering use cases)
Relational Databases: Oracle, SQL Server
NoSQL: MongoDB
Big Data Architecture and scalable pipeline design
Java or Scala (good to have)
Secondary / Desirable Skills:
AWS IAM: Role and policy creation
Docker: Image creation and container management
Camunda: Workflow orchestration and process management
Kubernetes: Container orchestration (good to have)
Db • Vellore, Tamil Nadu, India