The ideal candidate will have expertise in SQL, Python, and Big Data concepts.
- Assist in building scalable data pipelines using Python and SQL.
- Support data modeling activities for analytics and reporting use cases.
- Perform data cleansing, transformation, and validation using PySpark.
- Collaborate with data engineers and analysts to ensure high data quality and availability.
- Work on Hadoop ecosystem tools to process large datasets.
- Contribute to data documentation and maintain version-controlled scripts.
Job Requirements
- Strong SQL skills, analytical thinking, and problem-solving ability.
- Strong proficiency in Python for data processing and scripting.
- Good knowledge of SQL – writing complex queries, joins, and aggregations.
- Understanding of data modeling concepts – star/snowflake schemas, fact/dimension tables.
- Familiarity with the Big Data / Hadoop ecosystem – HDFS, Hive, Spark.
- Basic exposure to PySpark is a strong plus.