Role & Responsibilities:
- Evaluate the domain, financial, and technical feasibility of solution ideas with the help of all key stakeholders
- Design, develop, and maintain highly scalable data processing applications
- Write efficient, reusable, and well-documented code
- Deliver big data projects using Spark, Scala, Python, and SQL
- Maintain and tune existing Spark applications for optimal performance
- Identify opportunities to optimize existing Spark applications (see the tuning sketch after this list)
- Work closely with QA, Operations, and other teams to deliver error-free software on time
- Actively lead and participate in daily Agile/Scrum meetings
- Take responsibility for Apache Spark development and implementation
- Translate complex technical and functional requirements into detailed designs
- Investigate alternatives for data storage and processing to ensure the most streamlined solutions are implemented
- Serve as a mentor for junior staff members by conducting technical training sessions and reviewing project outputs
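
To illustrate the kind of Spark tuning work described above, here is a minimal PySpark sketch. The application name, input paths, and configuration values are hypothetical; real settings would depend on the cluster size and data volume:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical tuning configuration; real values depend on the workload.
    spark = (
        SparkSession.builder
        .appName("orders-enrichment")                   # hypothetical application name
        .config("spark.sql.shuffle.partitions", "400")  # size partition count to the data volume
        .config("spark.sql.autoBroadcastJoinThreshold",
                str(64 * 1024 * 1024))                  # auto-broadcast tables up to 64 MB
        .getOrCreate()
    )

    orders = spark.read.parquet("/data/orders")         # hypothetical paths
    countries = spark.read.parquet("/data/countries")   # small dimension table

    # Broadcast the small table explicitly to avoid a shuffle join.
    enriched = orders.join(F.broadcast(countries), "country_code")

    # Cache only when the result is reused by several downstream actions.
    enriched.cache()
    enriched.write.mode("overwrite").parquet("/data/orders_enriched")

Typical tuning levers, as in the sketch, are the shuffle partition count, the join strategy, and selective caching of reused results.
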
Qualifications:
- Engineering graduates with a computer science background preferred, with 8+ years of software development experience with Hadoop framework components (HDFS, Spark, Scala, PySpark)
- Excellent verbal, written, and presentation skills
- Ability to present and defend a solution with technical facts and business proficiency
- Understanding of data-warehousing and data-modeling techniques
- Strong data engineering skills
- Knowledge of Core Java, Linux, SQL, and any scripting language
- At least 6 years of experience using Python/Scala, Spark, and SQL
- Knowledge of shell scripting is a plus
- Knowledge of Core and Advanced Java is a plus
- Experience in developing and tuning Spark applications
- Excellent understanding of Spark architecture, DataFrames, and Spark tuning
- Strong knowledge of database concepts, systems architecture, and data structures is a must
- Process-oriented with strong analytical and problem-solving skills
- Experience writing standalone Python applications using the PySpark API (see the sketch below)
- Knowledge of the Delta Lake (delta.io) package is a plus
(ref: hirist.tech)
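
For context on the last two items, a minimal sketch of a standalone PySpark application that writes and reads a Delta Lake (delta.io) table. It assumes the delta-spark package is available on the classpath; the application name, sample data, and table path are hypothetical:

    from pyspark.sql import SparkSession

    # Standard Delta Lake session configuration (requires the delta-spark package).
    spark = (
        SparkSession.builder
        .appName("delta-demo")  # hypothetical application name
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    # Hypothetical sample data.
    df = spark.createDataFrame(
        [(1, "alice"), (2, "bob")],
        ["id", "name"],
    )

    # Write and read back a Delta table; the path is illustrative.
    df.write.format("delta").mode("overwrite").save("/tmp/users_delta")
    spark.read.format("delta").load("/tmp/users_delta").show()

    spark.stop()
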