We are seeking a highly skilled and motivated Big Data Engineer with strong experience in building, optimizing, and managing large-scale data processing pipelines. The ideal candidate will have hands-on expertise in Scala, Python, Apache Spark, Kafka, and/or Apache NiFi, and will work closely with cross-functional teams to design and implement scalable data solutions that drive business intelligence and analytics.
Key Responsibilities:
- Design, build, and maintain large-scale distributed data processing systems using Spark, Kafka, and/or NiFi.
- Write efficient and reusable code in Scala and Python for data transformation, ingestion, and enrichment.
- Collaborate with data scientists, analysts, and engineers to ensure data quality, consistency, and availability.
- Optimize data pipelines for performance, reliability, and scalability.
- Integrate diverse data sources, ensuring data governance and security compliance.
- Monitor and troubleshoot data workflows, resolving bottlenecks and issues.
- Participate in code reviews, architecture discussions, and best practice development.
- Stay updated on emerging big data technologies and frameworks to continuously improve system efficiency.
Required Skills and Experience:
- Strong programming experience in Scala and Python for data engineering.
- Extensive experience with Apache Spark, including optimization, tuning, and resource management.
- Hands-on experience with Kafka for real-time data streaming and messaging systems.
- Experience with Apache NiFi for data flow automation and integration.
- Working knowledge of Hadoop ecosystem components is a plus.
- Understanding of cloud-based data solutions (AWS, Azure, or GCP) is desirable.
- Familiarity with relational and NoSQL databases (e.g., PostgreSQL, MongoDB, Cassandra).
- Experience working with data warehousing solutions and ETL pipelines.
- Good analytical, problem-solving, and debugging skills.
- Excellent communication and teamwork abilities.

(ref: hirist.tech)