Strimzi Kafka
Strimzi is an open-source project that provides a set of resources for running Apache Kafka on Kubernetes. It offers features such as automatic creation and management of Kafka clusters, providing a convenient way to run Kafka on a cloud-native platform while improving the resilience and scalability of the clusters.
Kubernetes excels at running stateless apps. Stateful apps, in contrast, require persistent storage so that data survives application or container restarts. Examples of stateful apps include databases, message brokers, and file servers.
Kubernetes provides various features for running stateful apps, such as StatefulSets, which give each replica of a stateful app a stable, unique network identity, and Persistent Volumes and Persistent Volume Claims, which provide a way to persistently store data in a storage backend. Additionally, Kubernetes offers features like network attachment definitions (NAD) and volume snapshots, which further improve the management of stateful apps on Kubernetes.
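As a quick illustration of how persistent storage is requested, here is a minimal PersistentVolumeClaim sketch. The claim name is hypothetical, and the nfs-client storage class is an assumption that matches the class used later in this post.

# Minimal PVC sketch; claim name and storage class are assumptions
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: nfs-client
  resources:
    requests:
      storage: 1Gi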
Create a new kafka namespace for the Strimzi Kafka Operator
kubectl create namespace kafka
Install the CRDs
The CRDs define the schemas used for the custom resources (CRs, such as Kafka, KafkaTopic and so on) you will be using to manage Kafka clusters, topics and users.
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
The YAML files for ClusterRoles and ClusterRoleBindings downloaded from strimzi.io contain a default namespace of myproject. The query parameter namespace=kafka updates these files to use kafka instead. By specifying -n kafka when running kubectl create, the definitions and configurations without a namespace reference are also installed in the kafka namespace. If there is a mismatch between namespaces, then the Strimzi cluster operator will not have the necessary permissions to perform its operations.
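As a quick sanity check (not part of the official quickstart), you can confirm that the Strimzi CRDs and the cluster operator deployment were created:

kubectl get crd | grep strimzi.io
kubectl get deployment strimzi-cluster-operator -n kafka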
Verify the running pods in the kafka namespace
kubectl get pod -n kafka --watch
View the operator log with the following command
kubectl logs deployment/strimzi-cluster-operator -n kafka -f
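If you prefer to block until the operator is up instead of tailing its log, a rollout status check works as well:

kubectl rollout status deployment/strimzi-cluster-operator -n kafka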
Create a Kafka cluster using the following YAML
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: k8s-kafka-cluster
spec:
  kafka:
    version: 3.3.2
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
      - name: external
        port: 9094
        type: loadbalancer
        tls: false
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
      inter.broker.protocol.version: "3.3"
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 10Gi
          deleteClaim: false
          class: nfs-client
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 10Gi
      deleteClaim: false
      class: nfs-client
  entityOperator:
    topicOperator: {}
    userOperator: {}
The snippet exposes Kafka outside the Kubernetes cluster on port 9094 using loadbalancer as the service type. Also note the additional class: nfs-client attribute, which allocates storage from the nfs-client storage class.
      - name: external
        port: 9094
        type: loadbalancer
        tls: false
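Assuming the manifest above is saved as kafka-cluster.yaml (the filename is arbitrary), apply it and wait for the cluster to become ready:

# Verify the nfs-client storage class exists before applying
kubectl get storageclass nfs-client

kubectl apply -f kafka-cluster.yaml -n kafka
kubectl wait kafka/k8s-kafka-cluster --for=condition=Ready --timeout=300s -n kafka

# Once ready, check the persistent volume claims bound for Kafka and ZooKeeper
kubectl get pvc -n kafka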
Create a topic to start sending data
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: k8s-kafka-cluster
spec:
  partitions: 3
  replicas: 1
  config:
    retention.ms: 7200000
    segment.bytes: 1073741824
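Save the topic definition (for example as kafka-topic.yaml, an arbitrary name), apply it in the kafka namespace, and confirm the topic is created:

kubectl apply -f kafka-topic.yaml -n kafka
kubectl get kafkatopic my-topic -n kafka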
Use kcat to view metadata about the Kafka cluster
kcat -b 10.10.10.23:9094 -L
The LoadBalancer IP assigned to the external Kafka bootstrap service is 10.10.10.23 in my case.
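If you are unsure which address to use, it can be read from the LoadBalancer service that Strimzi creates for the external listener. The service name here assumes Strimzi's usual <cluster>-kafka-<listener>-bootstrap naming convention:

kubectl get service k8s-kafka-cluster-kafka-external-bootstrap -n kafka \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'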
Publish messages to the topic
kcat -b 10.10.10.23:9094 -t my-topic -P
Type the messages, then press Ctrl+D to send them to the topic
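Messages can also be piped in non-interactively, which is handy for quick tests:

echo "hello from kcat" | kcat -b 10.10.10.23:9094 -t my-topic -P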
Consume messages from the topic
kcat -b 10.10.10.23:9094 -t my-topic -C
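To replay everything from the beginning of the topic and exit once the end is reached, add the offset and exit flags:

kcat -b 10.10.10.23:9094 -t my-topic -C -o beginning -e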
Next, we will build an autoscaling system based on the number of messages in a Kafka topic using KEDA. Stay tuned :)