Job Title : PySpark Developer
Location : Hyderabad
Experience Required : 4-8 years
Keywords : AWS, PySpark, Databricks
Skills and Experience Required :
- 3-6 years of hands-on development in PySpark.
- Experience with Databricks and performance tuning using Spark UI.
- Strong understanding of AWS services, Kafka, and distributed data processing.
- Proficient in partitioning, caching, join optimization, and resource configuration (see the sketch after this list).
- Familiarity with data formats like Parquet, Avro, and ORC.
- Exposure to orchestration tools (Airflow, Databricks Workflows).
- Scala experience is a strong plus.
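A minimal PySpark sketch of the optimization techniques listed above (partitioning, caching, and a broadcast join). All paths, column names, and configuration values are hypothetical placeholders, not part of the role description:

# Hypothetical example: partitioning, caching, and broadcast-join optimization in PySpark.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("optimization-sketch")
    .config("spark.sql.shuffle.partitions", "200")  # example resource configuration; tune per cluster
    .getOrCreate()
)

orders = spark.read.parquet("s3://example-bucket/orders/")        # large fact table (hypothetical path)
customers = spark.read.parquet("s3://example-bucket/customers/")  # small dimension table (hypothetical path)

# Broadcast the small table so the join avoids a full shuffle.
enriched = orders.join(broadcast(customers), "customer_id")

# Cache a DataFrame that multiple downstream actions will reuse.
enriched.cache()

# Repartition on the write key and persist as partitioned Parquet.
(
    enriched.repartition("order_date")
    .write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/enriched_orders/")
)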
Key Roles and Responsibilities :
- Develop and maintain large-scale data pipelines using PySpark and Databricks.
- Optimize and troubleshoot Spark applications using Spark UI and performance tuning techniques.
- Work with large datasets and implement best practices for partitioning, caching, join optimization, and cluster resource management.
- Build efficient data ingestion, transformation, and aggregation logic from multiple sources using Kafka and other streaming technologies (see the sketch below).
- Leverage AWS services such as S3, EMR, Lambda, Glue, and Redshift for cloud-based data processing and orchestration.
- Implement workflows using tools such as Airflow or Databricks Workflows.
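As a rough illustration of the ingestion responsibility above, the following Structured Streaming sketch reads from Kafka and writes Parquet to S3. The broker address, topic, schema, and bucket names are assumptions for illustration only:

# Hypothetical example: Kafka ingestion to S3 with PySpark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("kafka-ingestion-sketch").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")  # hypothetical broker
    .option("subscribe", "events")                        # hypothetical topic
    .load()
)

# Kafka delivers the payload as bytes; parse the JSON value into typed columns.
events = (
    raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Stream the parsed records to Parquet on S3, with checkpointing so the query can recover after restarts.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/events/")                            # hypothetical bucket
    .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")  # hypothetical bucket
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()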