Join SAA
Synergistic AI Analytics
We Are Hiring — PySpark Data Engineering Intern
Location: Bhopal, MP (Work from Office)
Internship Duration: 3–6 Months
Experience: Fresher / Final-year student / 0–2 years
Are you passionate about data and ready to build real data engineering projects in a modern cloud ecosystem?
Key Responsibilities
- Assist in building PySpark-based ETL pipelines.
- Work with data lakes, Delta Lake, and structured/semi-structured data flows.
- Process data across Bronze → Silver → Gold layers (a short sketch follows this list).
- Write and optimise SQL for transformation and reporting.
- Participate in performance tuning (cluster optimisation, caching, partitioning).
- Support job orchestration in Databricks / ADF.
- Learn and assist with data lineage and data quality checks.
- Participate in CI/CD and Git-based workflows.
- Support testing, debugging, and validation for data migration workloads.
- Contribute to metadata-driven frameworks and automation scripts.
- Prepare documentation and collaborate in Agile sprints.
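To give a flavour of the day-to-day work, here is a minimal PySpark sketch of a Bronze → Silver step. The paths, table, and column names are hypothetical, and Delta Lake support is assumed to be available on the cluster:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bronze_to_silver").getOrCreate()

# Read raw events from the Bronze layer (hypothetical path).
bronze = spark.read.format("delta").load("/mnt/lake/bronze/events")

# Silver step: drop malformed rows, deduplicate on the business key,
# and enforce proper timestamp/date types.
silver = (
    bronze
    .filter(F.col("event_id").isNotNull())
    .dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("event_date", F.to_date("event_ts"))
)

# Write back partitioned by date so downstream Gold jobs scan less data.
(
    silver.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("/mnt/lake/silver/events")
)
```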
Key Requirements
Strong foundation in PySpark, Python, SQL, and cloud data concepts. Strong problem-solving mindset and ability to learn fast. Candidates must be strong in at least two of the following key areas; exceptional analytical skills are required:
1) PySpark: DataFrames, transformations, joins, debugging (preferred).
2) SQL: strong query writing, joins, window functions (see the example after this list).
3) Python: solid logic, functions, data structures, error handling.
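For illustration, this is the kind of window-function query point 2 refers to; the orders data below is a made-up stand-in for a real table:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("window_fn_demo").getOrCreate()

# Tiny in-memory sample standing in for a real orders table (hypothetical schema).
spark.createDataFrame(
    [("c1", "o1", "2024-01-01"),
     ("c1", "o2", "2024-02-01"),
     ("c2", "o3", "2024-01-15")],
    ["customer_id", "order_id", "order_ts"],
).createOrReplaceTempView("orders")

# Latest order per customer via ROW_NUMBER over a per-customer window.
spark.sql("""
    SELECT customer_id, order_id, order_ts
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY order_ts DESC
               ) AS rn
        FROM orders
    ) t
    WHERE rn = 1
""").show()
```

The same logic can also be written with PySpark's Window API (pyspark.sql.window.Window) instead of a SQL string; both forms come up in day-to-day work.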
Exposure to the Azure data ecosystem (ADF, Databricks, Synapse) is a plus.
❌ Not suitable for candidates seeking remote / hybrid options.
❌ Not suitable for candidates without strong fundamentals in PySpark, Python, and SQL.
Skills Required
Git, ADF, PySpark, Databricks, Azure, SQL, Python