Kafka on Kubernetes: Step-by-step guides to run Kafka on the most popular k8s platforms

What is Kafka?

Apache Kafka is an event streaming platform that runs as a cluster of nodes called “brokers” and was initially developed as a message queue. Today, Kafka can process and store massive amounts of data while letting applications publish and consume messages, which are stored as records within what is called a topic. Kafka is typically used to broker data efficiently between systems or to let applications react to streams of data in real time. In addition to being a popular message queue for distributed systems, it is commonly used to stream data in IoT use cases.
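To make the topic and record idea concrete, here is a minimal in-memory sketch in Python. It is not the Kafka client API: the `ToyBroker` class, its `publish` and `consume` methods, and the `sensor-readings` topic name are all hypothetical and exist only for illustration. Real Kafka additionally splits topics into partitions and replicates them across brokers, which this sketch leaves out.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (not a real Kafka API).

    Each topic is modeled as an append-only list of records.
    """

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, value):
        """Append a record to the topic and return its offset (position in the log)."""
        log = self.topics[topic]
        log.append(value)
        return len(log) - 1

    def consume(self, topic, offset):
        """Return all records at or after `offset`; reading does not remove them."""
        return self.topics[topic][offset:]


broker = ToyBroker()
broker.publish("sensor-readings", {"device": "a1", "temp_c": 21.5})
broker.publish("sensor-readings", {"device": "a1", "temp_c": 21.7})

# Two independent consumers read the same topic, each tracking its own offset.
dashboard_offset = 0
alerts_offset = 1
dashboard_records = broker.consume("sensor-readings", dashboard_offset)
alert_records = broker.consume("sensor-readings", alerts_offset)
```

The point of the sketch is that records stay in the topic after they are read, so several applications can consume the same stream at their own pace, each starting from its own offset.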

What is Kubernetes?

Kubernetes is an open-source platform that runs as a cluster of worker nodes and control plane (master) nodes, which together let teams deploy, manage, scale, and automate containerized workloads such as Kafka. Kubernetes can manage many applications at large scale, including stateful applications such as databases and streaming platforms. Kubernetes originated at Google, which conceived the software after using similar technology to run production workloads for over a decade.

Running Kafka on Kubernetes

There are several reasons why running Kafka on Kubernetes is appealing. First, if your organization is standardizing on Kubernetes as its application platform, that alone is a good reason to look at running Kafka there too. Running Kafka on Kubernetes also lets organizations simplify operations such as upgrades, scaling, restarts, and monitoring, which are largely built into the Kubernetes platform.

Some additional items to consider when running Kafka on Kubernetes: