We're Hiring : PySpark Developer (Databricks)
- Experience : 4+ Years in Data Engineering / Distributed Systems
- Location : Offshore (Sompo IT Systems Development Unit)
- Joiners : Immediate
- Budget : Competitive
Role Summary :
Design, build, and optimize large-scale data pipelines on the Databricks Unified Analytics Platform, leveraging PySpark, big data frameworks, and cloud solutions to support advanced analytics and machine learning workflows.
Key Responsibilities :
- Develop & manage scalable ETL / ELT pipelines using PySpark on Databricks.
- Optimize data pipelines for performance, scalability & cost.
- Deploy & manage Databricks on Azure (ADF, Delta Lake, SQL).
- Build CI / CD pipelines (Azure DevOps / GitHub Actions / Jenkins).
- Collaborate with cross-functional teams to deliver robust data solutions.
Technical Skills :
- Python, PySpark, Databricks (Lakehouse, Delta Lake, Databricks SQL).
- Azure Cloud (Data Lake, Synapse), Hadoop, Spark, Kafka.
- SQL & NoSQL databases (PostgreSQL, SQL Server, MongoDB, Cosmos DB).
- Familiarity with MLflow and Kubernetes (added advantage).
General Skills :
- Strong problem-solving & analytical mindset.
- Excellent communication & organizational skills.
- Experience in regulated industries (insurance) preferred.
Education & Certifications :
- Bachelor's in CS, Data Engineering, or a related field.
- Databricks, PySpark, or cloud certifications highly desirable.
Skills Required :
PySpark, Azure, Databricks, Data Engineering