Job Description
HEROIC Cybersecurity (HEROIC.com) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.
You will be responsible for designing and managing fully automated big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing, all built for scale, speed, and reliability.
This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.
About Us: HEROIC Cybersecurity (HEROIC.com) is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging, and exciting. At HEROIC, you’ll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.
Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data
Requirements
Minimum 5 years of experience with Cassandra / DataStax Enterprise in production environments
Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
Strong understanding of NoSQL architecture, sharding, replication, and high availability
Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
Proficient in at least one programming language: Python, Java, or Scala
Experience building large-scale automated data ingestion systems or ETL workflows
Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
Excellent written and spoken English communication skills
Prior experience with cybersecurity or dark web data (preferred but not required)
Senior Infrastructure Engineer • Pune, India