Key Responsibilities:
- Kafka Cluster Setup & Configuration:
  - Deploy and configure Apache Kafka clusters in both on-premises and cloud environments.
  - Configure Kafka brokers, ZooKeeper nodes, and related components such as Kafka Connect, Kafka Streams, and Kafka MirrorMaker.
  - Tune and optimize Kafka configurations for high throughput, low latency, and fault tolerance.
- Cluster Management & Maintenance:
  - Monitor the health of Kafka brokers, topics, partitions, and consumer groups.
  - Perform routine Kafka cluster maintenance, including upgrades, patching, and failover testing.
  - Ensure the Kafka infrastructure scales appropriately to meet growing data ingestion and processing requirements.
- Monitoring & Troubleshooting:
  - Use monitoring tools such as Prometheus, Grafana, and Confluent Control Center to ensure optimal performance of Kafka clusters.
  - Troubleshoot and resolve issues related to Kafka brokers, network latency, consumer lag, partitioning, and other performance bottlenecks.
  - Implement automated alerting and notification systems for Kafka-related incidents.
- Security & Compliance:
  - Configure SSL/TLS encryption for secure communication between Kafka brokers and clients.
  - Implement SASL authentication and Kafka ACLs to enforce security policies and restrict access to Kafka topics.
  - Ensure Kafka clusters meet compliance requirements and best practices for security and data governance.
- Data Replication & Backup:
  - Set up and maintain data replication between Kafka clusters using MirrorMaker and other replication tools.
  - Implement automated backup and disaster recovery procedures for Kafka data.
  - Ensure data durability and high availability through Kafka replication and fault-tolerant architectures.
- Performance Tuning:
  - Monitor and optimize Kafka's disk usage, network throughput, and memory utilization.
  - Identify and mitigate latency issues and improve throughput for high-volume data pipelines.
  - Fine-tune Kafka's internal mechanisms, such as log compaction and message retention, for better performance.
- Capacity Planning & Scaling:
  - Plan for Kafka cluster expansion, adding brokers and partitions so the system can handle increasing data volumes.
  - Perform capacity planning based on forecasted data volumes, load, and throughput.
  - Maintain Kafka's high availability and ensure minimal service disruption during scaling operations.
- Documentation & Best Practices:
  - Document all Kafka configurations, troubleshooting steps, and operational procedures.
  - Establish best practices for Kafka cluster management, security, and performance tuning.
  - Provide training and guidance to other team members and stakeholders on Kafka-related operations.
Skills and Qualifications:
Mandatory Skills:
- Strong experience with Apache Kafka (Kafka broker configuration, scaling, and performance tuning).
- Proficiency in Kafka ecosystem tools (e.g., ZooKeeper, Kafka Connect, Kafka Streams).
- Experience with Kafka security configurations (SSL/TLS encryption, SASL authentication, ACLs).
- Strong experience in Kafka cluster management, including upgrades, patching, and failover testing.
- Proficiency in monitoring and alerting tools (e.g., Prometheus, Grafana, Confluent Control Center).
- Hands-on experience in troubleshooting Kafka-related issues such as consumer lag, broker failures, and network bottlenecks.
- Experience with cloud-based Kafka solutions (e.g., Confluent Cloud, AWS MSK, Azure Event Hubs).
- Knowledge of data replication tools such as Kafka MirrorMaker and Kafka Connect.
- Familiarity with Linux/Unix systems for managing Kafka environments.
- Strong scripting skills (e.g., Bash, Python, Groovy) for automating routine tasks.
Skills Required:
Kafka, Bash, Python, Groovy, AWS Admin