Job Title : Big Data Developer
Experience : 7 to 10 years
Location : Hyderabad
Notice Period : Immediate to 20 Days
About the Role :
We are looking for a highly skilled and experienced Big Data Developer to join our data engineering team. The ideal candidate will have a strong background in big data technologies, with hands-on experience in building scalable data pipelines and infrastructure. If you're passionate about working on large-scale distributed systems and cutting-edge open-source technologies, we'd love to connect with you.
Key Responsibilities :
- Design, develop, and maintain scalable data ingestion, transformation, and enrichment pipelines.
- Work with Apache Spark (batch and streaming) for processing large volumes of data efficiently.
- Utilize Scala and Python to implement robust data engineering solutions.
- Manage and optimize HDFS and lead migration strategies across storage layers.
- Implement and manage table formats like Apache Iceberg, Delta Lake, or Apache Hudi for schema evolution, ACID compliance, and time travel features.
- Run Spark / Flink workloads on Kubernetes using tools like Spark-on-K8s operator or Flink-on-K8s.
- Leverage distributed object storage systems such as Ceph or AWS S3.
- Use Infrastructure-as-Code (Terraform, Helm) to provision and manage data infrastructure.
Required Skills & Experience :
- 7 to 10 years of experience in Big Data Engineering.
- Proficiency in Scala and Python.
- Expertise in Apache Spark (batch and streaming).
- Strong understanding of HDFS internals and hands-on experience with migration strategies.
- Hands-on experience with Apache Iceberg (or a similar table format such as Delta Lake or Apache Hudi).
- Experience running Spark / Flink on Kubernetes.
- Familiarity with distributed blob storage solutions such as Ceph or AWS S3.
- Experience building high-throughput data pipelines for large-scale datasets.
- Strong knowledge of Terraform and Helm for infrastructure provisioning.
Preferred Qualifications :
- Contributions to open-source big data projects.
- Exposure to performance tuning in Spark / Flink.
- Experience working in cloud-native environments (AWS / GCP / Azure).
(ref : hirist.tech)