Key Responsibilities:
- Design, architect, and implement end-to-end big data solutions using MapR, Apache Hadoop, and associated ecosystem tools (e.g., Hive, HBase, Spark, Kafka).
- Lead data platform modernization efforts including architecture reviews, platform upgrades, and migrations.
- Collaborate with data engineers, data scientists, and application teams to gather requirements and build scalable, secure data pipelines.
- Define data governance, security, and access control strategies in the MapR ecosystem.
- Optimize performance of distributed systems, including storage and compute workloads.
- Guide teams on best practices in big data development, deployment, and maintenance.
- Conduct code reviews and architecture assessments.
- Provide mentorship to junior engineers and technical leadership across big data initiatives.
Qualifications and Requirements:
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- 6+ years of experience in data architecture, with at least 3 years working specifically on MapR and Hadoop ecosystems.
- Expertise in MapR-DB, MapR Streams, and MapR-FS.
- Proficiency with big data tools: Apache Spark, Kafka, Hive, HBase, Oozie, Sqoop, Flume.
- Strong programming skills in Java, Scala, or Python.
- Solid understanding of distributed systems, high availability, and cluster management.
- Experience with data ingestion, transformation, and ETL pipelines.
- Familiarity with security controls (Kerberos, Ranger, Knox, etc.) and data governance.
- Experience with CI/CD, Docker, and Kubernetes is a plus.
Desirable Skills and Certifications:
- Certifications such as Cloudera Certified Professional (CCP), MapR Certified, or Hortonworks HDP Certification.
- Exposure to cloud-based big data platforms such as AWS EMR, Azure HDInsight, or GCP Dataproc.
- Experience with NoSQL and real-time data streaming architectures.
- Ability to communicate architectural concepts to both technical and non-technical stakeholders.
Skills Required
NoSQL, Docker, Kubernetes