Your Role:
We are looking for a Senior Cloudera Developer (Data Engineer) with extensive experience in Spark, the Hadoop ecosystem, and GCP services to design and implement data solutions:
- Design, develop, and implement data solutions using Cloudera technologies such as Hadoop, Spark, and Hive.
- Collaborate with data engineers to optimize data pipelines and data processing workflows.
- Work closely with data analysts and data scientists to ensure data quality and integrity.
- Troubleshoot and resolve issues with data processing and storage systems.
- Stay up to date with trends and best practices in Cloudera development.
- Participate in code reviews and provide feedback to team members.
- Perform ETL processes using Spark.
- Integrate Spark Streaming with other technologies like Kafka.
- Deploy and manage Spark applications on a Hadoop cluster or GCP Dataproc.
Must-Have Criteria:
- 6-10 years of experience in data engineering.
- Expertise in Apache Spark, including architecture and fault-tolerance mechanisms.
- Proficiency with Spark DataFrames and Spark SQL for querying structured data.
- Experience with ETL processes using Spark.
- Strong knowledge of Python.
- Bachelor's degree in Computer Science, IT, or a related field.
- Proven experience as a Cloudera Developer or similar role.
- Solid understanding of Cloudera technologies: Hadoop, Spark, Hive.
- Experience with data modeling, data warehousing, and data integration.
- Programming skills in Java, Scala, or Python.
- Excellent problem-solving and communication skills.
- Ability to work independently and in a team environment.
Desired Skills:
- Experience optimizing Spark execution plans.
- Experience integrating Spark Streaming with technologies like Kafka.
- Familiarity with the Hadoop ecosystem including HDFS, Hive, and the Cloudera stack.
- Experience deploying and managing Spark applications on Hadoop clusters or GCP Dataproc.
- Experience with Java programming.
- Knowledge of DevOps tools and practices (CI/CD, Docker).
- Practical experience with GCP services: Dataproc, Cloud Functions, Cloud Run, Pub/Sub, BigQuery.
(ref: hirist.tech)