Strimzi Kafka
Strimzi is an open-source project that provides a set of resources for running Apache Kafka on Kubernetes. It offers features such as automatic creation and management of Kafka clusters, providing a convenient way to run Kafka on a cloud-native platform while improving the resilience and scalability of the clusters.
Kubernetes excels at running stateless apps. Stateful apps, in contrast, require persistent storage so that data survives application or container restarts. Examples of stateful apps include databases, message brokers, and file servers.
Kubernetes provides various features for running stateful apps, such as StatefulSets, which give each replica of a stateful app a stable, unique network identity, and Persistent Volumes and Persistent Volume Claims, which provide a way to persistently store data in a storage backend. Additionally, Kubernetes offers features like network attachment definitions (NAD) and volume snapshots, which further improve the management of stateful apps on Kubernetes.
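As a quick illustration of how persistent storage is requested, here is a minimal PersistentVolumeClaim sketch. The claim name is hypothetical, and the nfs-client storage class is an assumption that matches the class used later in this post.

# Minimal PVC sketch; claim name and storage class are assumptions
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: nfs-client
  resources:
    requests:
      storage: 1Gi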
Create a new kafka namespace for the Strimzi Kafka Operator
kubectl create namespace kafka
Install the CRDs
The CRDs define the schemas used for the custom resources (CRs, such as Kafka, KafkaTopic and so on) you will be using to manage Kafka clusters, topics and users.
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
The YAML files for ClusterRoles and ClusterRoleBindings downloaded from strimzi.io contain a default namespace of myproject. The query parameter namespace=kafka updates these files to use kafka instead. By specifying -n kafka when running kubectl create, the definitions and configurations without a namespace reference are also installed in the kafka namespace. If there is a mismatch between namespaces, then the Strimzi cluster operator will not have the necessary permissions to perform its operations.
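As a quick sanity check (not part of the official quickstart), you can confirm that the Strimzi CRDs and the cluster operator deployment were created:

kubectl get crd | grep strimzi.io
kubectl get deployment strimzi-cluster-operator -n kafka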
Verify the running pods in the kafka namespace
kubectl get pod -n kafka --watch
View the operator log with the following command
kubectl logs deployment/strimzi-cluster-operator -n kafka -f
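If you prefer to block until the operator is up instead of tailing its log, a rollout status check works as well:

kubectl rollout status deployment/strimzi-cluster-operator -n kafka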
Create a Kafka cluster using the following YAML
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: k8s-kafka-cluster
spec:
  kafka:
    version: 3.3.2
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
      - name: external
        port: 9094
        type: loadbalancer
        tls: false
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
      inter.broker.protocol.version: "3.3"
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 10Gi
          deleteClaim: false
          class: nfs-client
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 10Gi
      deleteClaim: false
      class: nfs-client
  entityOperator:
    topicOperator: {}
    userOperator: {}
The snippet exposes Kafka outside the Kubernetes cluster on port 9094 using loadbalancer as the service type. Also note the additional class: nfs-client attribute, which allocates storage from the nfs-client storage class.
      - name: external
        port: 9094
        type: loadbalancer
        tls: false
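Assuming the manifest above is saved as kafka-cluster.yaml (the filename is arbitrary), apply it and wait for the cluster to become ready:

# Verify the nfs-client storage class exists before applying
kubectl get storageclass nfs-client

kubectl apply -f kafka-cluster.yaml -n kafka
kubectl wait kafka/k8s-kafka-cluster --for=condition=Ready --timeout=300s -n kafka

# Once ready, check the persistent volume claims bound for Kafka and ZooKeeper
kubectl get pvc -n kafka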
Create a topic to start sending data
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: k8s-kafka-cluster
spec:
  partitions: 3
  replicas: 1
  config:
    retention.ms: 7200000
    segment.bytes: 1073741824
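Save the topic definition (for example as kafka-topic.yaml, an arbitrary name), apply it in the kafka namespace, and confirm the topic is created:

kubectl apply -f kafka-topic.yaml -n kafka
kubectl get kafkatopic my-topic -n kafka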
Use kcat to view metadata about the Kafka cluster
kcat -b 10.10.10.23:9094 -L
The LoadBalancer IP assigned to the external Kafka bootstrap service is 10.10.10.23 in my case.
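If you are unsure which address to use, it can be read from the LoadBalancer service that Strimzi creates for the external listener. The service name here assumes Strimzi's usual <cluster>-kafka-<listener>-bootstrap naming convention:

kubectl get service k8s-kafka-cluster-kafka-external-bootstrap -n kafka \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'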
Publish messages to the topic
kcat -b 10.10.10.23:9094 -t my-topic -P
Type the messages, then press Ctrl+D to send them to the topic
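Messages can also be piped in non-interactively, which is handy for quick tests:

echo "hello from kcat" | kcat -b 10.10.10.23:9094 -t my-topic -P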
Consume messages from the topic
kcat -b 10.10.10.23:9094 -t my-topic -C
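To replay everything from the beginning of the topic and exit once the end is reached, add the offset and exit flags:

kcat -b 10.10.10.23:9094 -t my-topic -C -o beginning -e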
Next, we will build an autoscaling system based on the number of messages in a Kafka topic using KEDA. Stay tuned :)