PySpark Developer - ETL / Python

Nazztec Private Limited, Hyderabad
Job description

We are looking for a highly skilled PySpark Developer with strong experience in Python programming, Apache Spark (PySpark), and AWS cloud services. The ideal candidate will design and build scalable data pipelines, perform data transformations, and support data engineering initiatives in the cloud.

Key Responsibilities :

  • Design, develop, and maintain ETL / ELT data pipelines using PySpark.
  • Perform data ingestion, cleansing, and transformation from various structured and unstructured sources.
  • Work with large-scale datasets in distributed computing environments using Apache Spark.
  • Deploy and manage data pipelines in AWS (S3, Lambda, EMR, Glue, Redshift, etc.).
  • Collaborate with data engineers, analysts, and business teams to deliver data-driven solutions.
  • Write optimized, modular, and reusable Python code for data engineering workflows.
  • Perform troubleshooting, performance tuning, and optimization of Spark jobs.

Required Skills :

  • 4+ years of hands-on experience with PySpark and Apache Spark.
  • Strong proficiency in Python and its data libraries (Pandas, NumPy, etc.).
  • Experience working with AWS cloud services such as S3, Glue, EMR, Lambda, Redshift.
  • Solid understanding of data warehousing concepts, data lakes, and ETL pipelines.
  • Experience with SQL and relational databases.
  • Good understanding of data formats like JSON, Parquet, Avro, CSV.
  • Familiarity with version control tools (Git) and CI / CD workflows.
Preferred Skills (Good to Have) :

  • Experience with Airflow or other workflow orchestration tools.
  • Knowledge of DevOps practices in AWS environment.
  • Exposure to big data ecosystems and real-time data processing.
(ref : hirist.tech)
