Key Responsibilities :
- Develop high-quality, secure, and scalable data pipelines using Spark with Scala / Python / Java on Hadoop or object storage such as MinIO.
- Leverage technologies and solutions to innovate with increasingly large data sets.
- Drive automation and efficiency in data ingestion, data movement, and data access workflows through innovation and collaboration.
- Understand, implement, and enforce software development standards and engineering principles in the Big Data space.
- Contribute ideas to help ensure that required standards and processes are in place and actively look for opportunities to enhance standards and improve process efficiency.
- Perform assigned tasks and handle production incidents independently.
Must Have :
- 5-9 years of experience in Data Warehouse / Data Lake / Lake House related projects in a product or service-based organization.
- Expertise in Data Engineering and implementing multiple end-to-end DW projects in a Big Data environment handling petabyte-scale data.
- Solid experience building complex data pipelines through Spark with Scala / Python / Java on Hadoop or object storage.
- Experience working with databases like Oracle and Netezza, with strong SQL knowledge.
- Proficient in working within an Agile / Scrum framework, including creating user stories with well-defined acceptance criteria and participating in sprint planning and reviews.
- Write and maintain Unix shell scripts, Oracle SQL, and PL/SQL, and perform SQL tuning.
- Optimize and troubleshoot Spark applications for performance, scalability, and fault tolerance.
- Use Git-based version control systems and CI/CD pipelines (e.g., Jenkins).
- Implement and manage Hive external tables, partitions, and various file formats.
- Work across on-premises and cloud environments (AWS, Azure, Databricks).
- Strong experience with the Hadoop ecosystem and Cloudera Data Platform (CDP).
- Experience building NiFi pipelines.
Preferred :
- Strong analytical skills for debugging production issues, identifying root causes, and implementing mitigation plans.
- Strong communication skills, both verbal and written.
- Ability to multitask across multiple projects and interface with external / internal resources.
- Proactive, detail-oriented, and able to function under pressure in an independent environment, with a high degree of initiative and self-motivation to drive results.
- Willingness to quickly learn and implement new technologies and to participate in POCs to explore the best solution for the problem statement.
- Experience working with diverse and geographically distributed project teams.
Skills & Qualifications :
- Bachelor’s degree in Information Systems, Information Technology, Computer Science, or Engineering, or equivalent work experience.