The developer must have sound knowledge of Apache Spark and Python programming.
Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.
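For illustration, a minimal PySpark sketch of such a read-merge-enrich-load flow (the paths, bucket, and column names are hypothetical placeholders, not part of this role's actual systems):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl_sketch").getOrCreate()

# Read from external sources (CSV and Parquet shown as examples).
orders = spark.read.option("header", "true").csv("s3a://example-bucket/raw/orders/")
customers = spark.read.parquet("s3a://example-bucket/raw/customers/")

# Merge the two datasets and enrich with a derived column.
enriched = (
    orders.join(customers, on="customer_id", how="left")
          .withColumn(
              "order_value",
              F.col("quantity").cast("double") * F.col("unit_price").cast("double"),
          )
)

# Load into the target destination (partitioned Parquet here).
enriched.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/curated/orders_enriched/"
)
```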
Create Spark jobs for data transformation and aggregation. Produce unit tests for Spark transformations and helper methods.
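A hedged sketch of what such a unit test can look like with pytest and a local SparkSession; the transformation under test (add_order_value) and its columns are illustrative assumptions:

```python
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def add_order_value(df):
    """Transformation under test: derive order_value from quantity * unit_price."""
    return df.withColumn("order_value", F.col("quantity") * F.col("unit_price"))


@pytest.fixture(scope="module")
def spark():
    # Small local session so tests run without a cluster.
    session = SparkSession.builder.master("local[1]").appName("tests").getOrCreate()
    yield session
    session.stop()


def test_add_order_value(spark):
    source = spark.createDataFrame([(2, 10.0), (3, 5.0)], ["quantity", "unit_price"])
    result = add_order_value(source).collect()
    assert [row.order_value for row in result] == [20.0, 15.0]
```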
Design data processing pipelines to perform batch and real-time / streaming analytics on structured and unstructured data.
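As a minimal Structured Streaming sketch, the built-in "rate" source is used below so it runs anywhere; a real pipeline would typically read from Kafka or cloud storage instead:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream_sketch").getOrCreate()

# Streaming source: emits (timestamp, value) rows at a fixed rate.
events = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

# Real-time aggregation over 1-minute event-time windows.
counts = events.groupBy(F.window(F.col("timestamp"), "1 minute")).count()

# Write running counts to the console for demonstration purposes.
query = (
    counts.writeStream
          .outputMode("complete")
          .format("console")
          .start()
)
query.awaitTermination(timeout=60)
```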
Spark query tuning and performance optimization. Good understanding of different file formats (ORC, Parquet, Avro) and compression techniques to optimize queries / processing.
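A short sketch of writing the same DataFrame in columnar formats with explicit compression codecs; the output paths are illustrative assumptions:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("formats_sketch").getOrCreate()

df = spark.range(1_000_000).withColumnRenamed("id", "event_id")

# Parquet with Snappy compression (fast, splittable, a common analytics default).
df.write.mode("overwrite").option("compression", "snappy").parquet("/tmp/events_parquet")

# ORC with Zlib compression (higher compression ratio, common in Hive ecosystems).
df.write.mode("overwrite").option("compression", "zlib").orc("/tmp/events_orc")

# Avro (row-oriented, schema-evolution friendly) requires the external spark-avro package:
# df.write.mode("overwrite").format("avro").save("/tmp/events_avro")

# Columnar formats let Spark prune columns and row groups, so queries that select
# a subset of columns read far less data than with CSV or JSON.
spark.read.parquet("/tmp/events_parquet").select("event_id").explain()
```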
SQL database integration.
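A hedged sketch of SQL database integration via Spark's JDBC data source; the JDBC URL, table names, and credentials are placeholder assumptions, and the matching JDBC driver jar must be on the Spark classpath:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc_sketch").getOrCreate()

jdbc_url = "jdbc:postgresql://db-host:5432/sales"  # hypothetical database
props = {"user": "etl_user", "password": "***", "driver": "org.postgresql.Driver"}

# Read a table from the relational database.
orders = spark.read.jdbc(url=jdbc_url, table="public.orders", properties=props)

# Write aggregated results back to a reporting table.
orders.groupBy("status").count().write.jdbc(
    url=jdbc_url, table="public.order_status_counts", mode="overwrite", properties=props
)
```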
Hands-on expertise in cloud services such as AWS.
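For example, a sketch of pointing Spark at S3 through the s3a connector; it assumes hadoop-aws is on the classpath (preconfigured on EMR/Glue), and the bucket names and credential provider are illustrative assumptions:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("aws_sketch")
    # Resolve credentials from the environment, instance profile, or AWS config.
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
    .getOrCreate()
)

# Read curated data from S3 and write a derived dataset back.
df = spark.read.parquet("s3a://example-bucket/curated/orders_enriched/")
df.write.mode("overwrite").parquet("s3a://example-bucket/marts/orders_summary/")
```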
Mandatory skills
Spark, Python
Desired skills
Spark, Python
Skills Required
Spark, Python, AWS, PySpark
Developer • Hyderabad / Secunderabad, Telangana