Job Title: Principal / Senior Data Engineer (Product Development)
Location: Hyderabad / Pune (Onsite)
Job Type: Full-time, Payroll
Duration: Long-term
Job Description:
We are looking for an experienced Principal / Senior Data Engineer to join our product development team. You will design and build high-performance, distributed data systems that support large-scale data ingestion and processing. This role requires deep expertise in parallel processing, distributed computing, and modern cloud data platforms.
Key Responsibilities:
- Understand and translate complex design specifications into scalable solutions.
- Build and optimize data ingestion pipelines for distributed systems with parallel processing using Golang, C++, and Java.
- Ingest data from multiple sources, including:
1. Cloud Storage: Amazon S3, Azure Blob Storage, Google Cloud Storage
2. Databases & Data Warehouses: Snowflake, Google BigQuery, PostgreSQL
3. Files, Kafka, and data lakehouses (Apache Iceberg)
- Implement high-availability (HA) loading, cross-region replication, and robust load monitoring and error reporting.
- Develop and maintain Spark connectors and integrate with third-party systems such as Kafka and Kafka Connect.
- Work in an Agile development environment and follow CI/CD best practices.
Requirements:
- Strong experience building distributed systems with parallel processing.
- Hands-on expertise in one or more of the following: Kafka, ZooKeeper, Spark, stream processing.
- Proficiency in Kafka Connect, Kafka Streams, Kafka security, and customization.
- Experience with Spark connectors and event-driven architectures.
- Familiarity with Agile methodologies and modern CI/CD practices.
Must-have Technical Skills:
- Programming: Java (4+ yrs), C++ (3+ yrs), Golang (3+ yrs, Advanced)
- Data Platforms: Snowflake (3+ yrs, Advanced), BigQuery (3+ yrs, Intermediate), PostgreSQL (4+ yrs, Advanced)
- Streaming & Processing: Apache Kafka (4+ yrs, Advanced), Apache Spark (3+ yrs, Intermediate)
- Cloud Storage: AWS S3 (4+ yrs, Advanced), Azure (4+ yrs, Advanced), Google Cloud (4+ yrs, Intermediate)
- Distributed Computing: 2+ yrs experience
- Agile & CI/CD: 3-4+ yrs, Intermediate to Advanced
Nice to Have:
- Experience with gRPC and multi-threading.
- Knowledge of ZooKeeper / etcd / Consul.
- Familiarity with distributed consensus algorithms (Paxos / Raft).
- Exposure to Docker and Kubernetes.