Data Engineer • Bhubaneswar, Odisha, India

Job Description
Required Skills
Python Programming: Strong ability to write clean and efficient code.
Spark SQL: Good understanding of Spark SQL for distributed data processing.
Data Processing: Experience with large datasets and structured data manipulation.
SQL Fundamentals: Ability to write queries and optimize database performance.
Problem-Solving: Analytical mindset to debug and optimize workflows.
Preferred Skills
AWS Cloud Services: Familiarity with S3, Redshift, Lambda, and EMR is an advantage.
ETL Development: Understanding of ETL processes and data engineering principles.
Version Control: Experience using Git for collaborative development.
Big Data Tools: Exposure to Hive, PySpark, or similar technologies.
Roles & Responsibilities
Develop and optimize Python scripts for data processing and automation.
Write efficient Spark SQL queries for handling large-scale structured data.
Assist in ETL pipeline development and maintenance.
Support data validation and integrity checks across systems.
Collaborate with teams to implement cloud-based solutions (AWS preferred).
Optimize performance of data queries and workflows.
Troubleshoot and debug issues in existing applications.
Document processes and ensure best practices in coding and data handling.