Big Data Lead (8+ Years)
Location: Pune/Nagpur (WFO)
Experience Required: 8+ Years
Job Summary:
We are looking for a Big Data Lead with strong expertise in PySpark and the broader Big Data ecosystem. The candidate will design scalable data solutions, lead a team of data engineers, and ensure the performance and reliability of our data platforms.
Key Responsibilities:
- Design and develop scalable data pipelines using PySpark.
- Lead implementation across the Hadoop ecosystem (HDFS, Hive, Sqoop, etc.).
- Drive architecture, design, and best practices for data engineering solutions.
- Tune and optimize the performance of distributed data processing systems.
- Collaborate with business, delivery, and cross-functional teams.
- Manage and mentor a team of data engineers.
- Ensure adherence to data quality, reliability, and governance standards.
- Manage workflow orchestration using Airflow/Oozie.
Required Skills:
- Strong experience in Big Data Engineering with PySpark.
- Deep knowledge of HDFS, Hive, Sqoop, and the wider Hadoop ecosystem.
- Strong expertise in SQL/HiveQL for querying and transforming large datasets.
- Proven experience in performance tuning and optimization.
- Experience working in Agile environments.
- Strong leadership, problem-solving, and communication skills.
Good to Have:
- Experience with Kafka / Spark Streaming.
- Knowledge of data modeling and data warehousing.
- Exposure to DevOps and CI/CD pipelines.
- Experience working on cloud-based data platforms.