Key Responsibilities:
- Cluster Management: Design, implement, and manage large-scale Hadoop clusters to support business data requirements.
- Configuration & Tuning: Configure and optimize Hadoop components, including HDFS, MapReduce, YARN, and Hive, for maximum performance and stability.
- ETL Development: Develop and maintain custom applications and scripts to extract, transform, and load (ETL) data into the Hadoop ecosystem (a minimal sketch follows this list).
- Integration: Integrate Hadoop with other Big Data tools and technologies to enable comprehensive data solutions.
- Maintenance & Upgrades: Perform regular maintenance, patching, and version upgrades to ensure cluster reliability and high availability.
- Monitoring & Troubleshooting: Continuously monitor Hadoop environments, identify potential issues, and resolve technical problems promptly (see the monitoring sketch after this list).
- Collaboration: Work closely with data scientists, analysts, and other teams to understand data processing needs and optimize Hadoop usage.
- Documentation: Maintain detailed documentation of configurations, processes, and best practices.
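To make the ETL responsibility concrete, here is a minimal PySpark sketch of the extract-transform-load pattern described above: it reads raw CSV files landed on HDFS, applies a small cleanup, and loads the result into a Hive table. Every path, table name, and column name is an assumption invented for illustration, not taken from this posting.

```python
from pyspark.sql import SparkSession

# Minimal ETL sketch; assumes Spark with Hive support on the cluster.
spark = (SparkSession.builder
         .appName("orders-etl")  # hypothetical job name
         .enableHiveSupport()
         .getOrCreate())

# Extract: raw CSVs landed on HDFS (hypothetical path).
raw = spark.read.option("header", "true").csv("hdfs:///landing/orders/*.csv")

# Transform: drop rows missing a key field and normalize a column name
# (hypothetical columns).
clean = (raw.dropna(subset=["order_id"])
            .withColumnRenamed("orderDate", "order_date"))

# Load: append into a managed Hive table (hypothetical name).
clean.write.mode("append").saveAsTable("analytics.orders")

spark.stop()
```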
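Likewise, a first-pass monitoring check can lean on the stock `hdfs dfsadmin -report` command rather than any extra tooling. The sketch below assumes the `hdfs` CLI is on the PATH of a configured cluster node and simply scans the report for missing blocks.

```python
import subprocess

# Hedged sketch: poll basic HDFS health with the stock CLI.
# check=True raises CalledProcessError if the command fails.
report = subprocess.run(
    ["hdfs", "dfsadmin", "-report"],
    capture_output=True, text=True, check=True,
)

# dfsadmin -report prints a "Missing blocks:" line in its summary.
if "Missing blocks: 0" not in report.stdout:
    print("WARNING: HDFS reports missing blocks; investigate DataNodes")
```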
Education & Experience:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Proven experience in Hadoop administration, implementation, and troubleshooting.
- Experience with Big Data technologies such as Hive, Pig, Spark, or HBase is a plus.

Skills Required:
Hadoop Ecosystem, MapReduce, HiveQL, Pig Latin
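Since HiveQL is among the required skills, the sketch below shows a representative query, run through PySpark's Hive support for consistency with the ETL example above. The `analytics.orders` table and its columns are hypothetical carry-overs from that sketch, not details from this posting.

```python
from pyspark.sql import SparkSession

# Minimal sketch; assumes a Hive-enabled Spark installation.
spark = (SparkSession.builder
         .appName("hiveql-example")  # hypothetical app name
         .enableHiveSupport()
         .getOrCreate())

# HiveQL aggregation over the hypothetical table loaded by the ETL sketch.
spark.sql("""
    SELECT region, COUNT(*) AS order_count
    FROM analytics.orders
    GROUP BY region
    ORDER BY order_count DESC
""").show()

spark.stop()
```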