Stateless vs Stateful

Architect’s Corner

This post has been a very long time in the making, and even my title has some inherent flaws! I want to have a more in-depth discussion about containers, informed by my travels as a cloud architect. My hope (as always) is to approach this subject with curiosity and hospitality.
First, I want to start by saying that I have a horse in this race: I work for Portworx, on a product that solves problems around stateful applications in Kubernetes. This opens the door to countless arguments, discussions, and pontifications about stateful vs. stateless microservices. I tell you this so that you can better understand my thinking on this journey (or have a reason to dismiss me outright if you would like).

What are Stateless Microservices?

I would have to give a lot of credit to my friend and colleague Alex, who took a job at a cloud-native security company. He was the first one who told me “If you are putting stateful data in containers, you are doing something wrong!” It has taken years, but this is my good-natured rebuttal.

Before I go further, I want to clear up a couple of terms. What do ‘stateless’ and ‘stateful’ actually mean, and what do they encompass? The rest of this discussion turns on the differences between stateful and stateless microservices, so it is worth being clear about what each one does.

I feel that we have come to use the terms Stateful and Stateless services to talk about persistent storage that is connected to a container, as well as a bit of design philosophy. I think it can be useful shorthand, but let’s be honest with ourselves: all containers have state. If I modify a file on a running container, that file will persist, right up until the point that the container goes away. Containers going away is a common occurrence, and knowing this is essential when designing any system for Kubernetes.
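
To make the "all containers have state" point concrete, here is a toy sketch in Python. It is not how a container runtime works internally: `ToyContainer` and `replace_container_demo` are made-up names, a temporary directory stands in for the container's writable layer, and a second directory stands in for a persistent volume. It only illustrates which files survive when a container is replaced.

```python
import tempfile
from pathlib import Path
from typing import Optional


class ToyContainer:
    """Toy stand-in for a container: a private writable layer plus an optional volume."""

    def __init__(self, volume: Optional[Path] = None):
        # Every new container starts with a fresh, empty writable layer.
        self._layer = tempfile.TemporaryDirectory()
        self.root = Path(self._layer.name)
        # The volume (if any) lives outside the container and outlives it.
        self.volume = volume

    def stop(self) -> None:
        # When the container goes away, its writable layer goes with it.
        self._layer.cleanup()


def replace_container_demo() -> dict:
    """Write to both locations, replace the container, and report what survived."""
    with tempfile.TemporaryDirectory() as volume_dir:
        volume = Path(volume_dir)

        first = ToyContainer(volume)
        (first.root / "scratch.txt").write_text("written to the writable layer")
        (first.volume / "data.txt").write_text("written to the volume")
        visible_while_running = (first.root / "scratch.txt").exists()
        first.stop()

        # A replacement container mounts the same volume but gets a new layer.
        second = ToyContainer(volume)
        result = {
            "visible_while_running": visible_while_running,
            "scratch_after_replace": (second.root / "scratch.txt").exists(),
            "volume_after_replace": (second.volume / "data.txt").exists(),
        }
        second.stop()
        return result


print(replace_container_demo())
```

The file in the writable layer is real state while the container runs, but only the file on the volume is still there for the replacement container.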

I would modify Alex’s statement a little: It is not desirable to persist configuration data in a container. If we try, we will end up with a configuration mess that is no different from the VMs I managed in a previous life where configurations had to be applied imperatively instead of declaratively. Why even run containers if you are going to do that? (Honestly, I have done this a lot in my lab, usually because containers still provide some value in portability and abstraction).

So the question I have is: when is it appropriate to persist data in a container? What are the benefits, and what are the drawbacks?

It is all about the data

Ok, maybe not all about the data, but the data for an application is very important. Traditionally, companies would pay a lot of money to other companies to store data (such as SAN appliances), and manage and structure data (in the form of databases and other data structures). This ends up being complicated, and sometimes risky.

We are always looking for ways to make our lives easier, and cloud architecture provided an easy answer. AWS, Azure, and GCP sell managed data services that you consume as a service. This can take away much of the risk of running your own infrastructure and gives you consumption-based costs. Given how nice this sounds, why would anyone ever bother persisting data on Kubernetes?

I have a few answers for this, but it really does depend. Here are some of the things that my customers have told me about the types of data they put on Kubernetes.

Kubernetes as a deployment platform

One of the first things we need to wrap our walnut around is why we would even consider using Kubernetes to deploy persistent applications. The reason is that it can provide the same benefits to persistent applications that it does for non-persistent or stateless microservices:

Consistency – Containers in general simplify a lot by including all the required libraries in the image, which solves a whole class of “Well, it worked fine on my computer!” application issues.
Image registries – Containers can be delivered from a registry that is provided by the software vendor, which can remove much of the need to install or integrate software. Don’t believe me? Compare the instructions for installing a database application on Kubernetes with those for Linux servers. The more complex the deployment, the easier the Kubernetes installation will be by comparison.
Object Abstraction – This one is Kubernetes-specific. By breaking apart the objects required to build a data service, we can simplify scaling it. Load balancer configurations are built-in. Service accounts and passwords can be auto-generated. Scaling can be done in the blink of an eye.
Operator Patterns and CRDs – One of my favorite things about Kubernetes is that it can be taught new tricks. Instead of needing complex instructions to install a vendor’s data service, I can simply deploy an operator. Now Kubernetes knows what a MongoDB cluster is, and all of the requirements, health checks, and replica configurations to make it work. It also solves day 2 operation issues by being smart enough to know how to upgrade a data service, instead of simply replacing the running image.
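
The operator idea in the last item can be sketched as a reconcile loop: compare the desired state with what is observed, and take one small step toward it. The Python below is a toy model under stated assumptions, not code from any real operator. `Cluster`, `reconcile`, and `converge` are hypothetical names, and the version strings are placeholders. It shows why an operator can upgrade a data service one replica at a time instead of simply replacing every running image at once.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Observed state of a hypothetical data service: the version of each replica."""
    versions: List[str] = field(default_factory=list)


def reconcile(observed: Cluster, desired_replicas: int, desired_version: str) -> str:
    """Take a single step toward the desired state and describe what was done."""
    if len(observed.versions) < desired_replicas:
        observed.versions.append(desired_version)
        return "scale up"
    if len(observed.versions) > desired_replicas:
        observed.versions.pop()
        return "scale down"
    # Replica count is right: upgrade at most one replica per step.
    for index, version in enumerate(observed.versions):
        if version != desired_version:
            observed.versions[index] = desired_version
            return f"upgrade replica {index}"
    return "in sync"


def converge(observed: Cluster, desired_replicas: int, desired_version: str,
             max_steps: int = 100) -> List[str]:
    """Call reconcile repeatedly until nothing is left to do."""
    actions = []
    for _ in range(max_steps):
        action = reconcile(observed, desired_replicas, desired_version)
        if action == "in sync":
            break
        actions.append(action)
    return actions


cluster = Cluster(versions=["6.0", "6.0"])
print(converge(cluster, desired_replicas=3, desired_version="7.0"))
print(cluster.versions)
```

A real operator would also check replica health between steps and know which version jumps are allowed; that domain knowledge is what the vendor packages into the operator.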

The above transforms Kubernetes – in my mind – from a simple container orchestrator, into a platform (dare I say a cloud operating system?). When I want to consume a new bit of software, Kubernetes makes the process of installing and configuring the software simple.

Kubernetes also has the advantage of being an open standard, meaning that it runs (more or less) the same in any cloud, or any infrastructure.

What sort of data should I consider persisting in Kubernetes?

Transitional Data – Remember that all containers save state; we just can’t count on it being there when the container restarts. Even if the data isn’t a source of record for the application, there can be value in persisting the data so that recovery from a pod failure is faster.

Imagine you are running applications such as Kafka, Flink, or RabbitMQ. Many times these applications are simply collecting, transforming, and routing messages between systems. They also write what they are doing to disk, and when the container restarts, having access to the data it was working on can save some expensive rebuilds. It also opens up architectures where the messaging system is actually the system of record. Remember, if we don’t tell the system to do something different, you are still paying for this space somewhere (on the disks of your Kubernetes nodes if nothing else).
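
Here is a rough sketch of why persisting transitional data speeds up recovery. It is plain Python and has nothing to do with how Kafka, Flink, or RabbitMQ store their data. The `process` and `work_after_restart` functions and the `checkpoint.json` file name are assumptions made for illustration. The worker records its progress in a state directory; if that directory survives a restart (as it would on a persistent volume), the worker resumes where it left off instead of starting over.

```python
import json
import tempfile
from pathlib import Path


def process(messages, state_dir: Path, crash_after=None):
    """Sum the messages, checkpointing progress in state_dir after each one."""
    checkpoint = state_dir / "checkpoint.json"
    state = {"offset": 0, "total": 0}
    if checkpoint.exists():
        # Resume from the last recorded position.
        state = json.loads(checkpoint.read_text())
    handled = 0
    for value in messages[state["offset"]:]:
        if crash_after is not None and handled == crash_after:
            raise RuntimeError("simulated pod failure")
        state["total"] += value
        state["offset"] += 1
        handled += 1
        # Note: a production worker would write this atomically.
        checkpoint.write_text(json.dumps(state))
    return state["total"], handled


def work_after_restart(messages, persistent: bool) -> int:
    """Return how many messages the restarted worker has to handle."""
    with tempfile.TemporaryDirectory() as first_dir:
        try:
            process(messages, Path(first_dir), crash_after=3)
        except RuntimeError:
            pass  # The "pod" died partway through.
        if persistent:
            # Same state directory: the checkpoint is still there.
            _, handled = process(messages, Path(first_dir))
            return handled
    # No persistence: the restarted worker sees an empty directory.
    with tempfile.TemporaryDirectory() as fresh_dir:
        _, handled = process(messages, Path(fresh_dir))
        return handled


messages = [1, 2, 3, 4, 5]
print("with persistent state:", work_after_restart(messages, persistent=True))
print("without persistent state:", work_after_restart(messages, persistent=False))
```

With five messages the difference is trivial; with hours of accumulated work, it is the difference between a quick restart and an expensive rebuild.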

Databases – “The biggest challenge is understanding that rubbing Kubernetes on Postgres won’t turn it into Cloud SQL” – Kelsey Hightower.

Putting databases on Kubernetes can be scary. I have heard countless stories from customers who, after a failure, adopted a policy that says: “We don’t put persistent data on Kubernetes.” Often the failure happened because we treated our database application like a “stateless” application. We need to ensure that we have accounted for a few things:

Database resilience – ensuring that we have replicas of the database to ensure availability. This should be aware of the datacenter topology (availability zones, rack awareness, etc).
Upgrades – Although the configuration should be part of the container manifest, the data is not. This can lead to issues when it is time to upgrade because we can jump versions unexpectedly.
Backups – Don’t forget about simply having database backups. We backed up databases when they were running on servers, we back up databases when they are running on managed services, and we need to back up databases when they are on Kubernetes.
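
As a partial illustration of the first two items, the Python sketch below builds a StatefulSet manifest as a dictionary and prints it as JSON. The `postgres_statefulset` function, the `demo-db` name, the image tag, and the storage size are all illustrative assumptions. The manifest is deliberately incomplete (for example, it sets no database password), and it does not make Postgres replicate anything: it only pins the image version, spreads the pods across availability zones, and gives each pod its own persistent volume claim. Replication, upgrades, and backups remain the job of an operator or of you.

```python
import json


def postgres_statefulset(name="demo-db", replicas=3,
                         image="postgres:16.3", storage="10Gi"):
    """Build an illustrative, incomplete StatefulSet manifest as a dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Topology awareness: keep the pods evenly spread across zones.
                    "topologySpreadConstraints": [{
                        "maxSkew": 1,
                        "topologyKey": "topology.kubernetes.io/zone",
                        "whenUnsatisfiable": "DoNotSchedule",
                        "labelSelector": {"matchLabels": labels},
                    }],
                    "containers": [{
                        "name": "postgres",
                        # A pinned tag, so a restart cannot silently change versions.
                        "image": image,
                        "volumeMounts": [{
                            "name": "data",
                            "mountPath": "/var/lib/postgresql/data",
                        }],
                    }],
                },
            },
            # Each replica gets its own claim, which outlives the pod.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


print(json.dumps(postgres_statefulset(), indent=2))
```

The configuration lives in the manifest, but the data lives in the volume claims, which is exactly why upgrades and backups need their own plan.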

Any data where we value portability

Some data benefits from being versioned and tested right next to the application. Whenever this is true, it can be beneficial to put that data in Kubernetes. All of the abstractions that Kubernetes provides mean that this sort of data can be portable. Some examples I have run into:

WordPress sites – Because it can be useful to have the DB, web assets, and images versioned together, I can allow a WordPress site to move between clouds, copy the entire site for testing, or integrate the stack as part of a CD pipeline that tests the data.
Scientific ML workloads – I am able to package the image, source data, methods, and outputs into a namespace and reproduce it anywhere.
Caching – Having a cache of a data service next to the applications accessing it can allow you to scale your applications easily between any cloud.

The above are just some limited examples.

What data should I not persist in Kubernetes?

There are plenty of times that I would not put data in Kubernetes:

When you are running a database that doesn’t have a supported operator/chart/vendor.

If my database doesn’t have a supported way of installing it on Kubernetes, then I would consider not doing it.

I also don’t mean to say that support needs to come from the vendor (lots of open-source databases don’t have a vendor in the traditional sense). But unless you want to forge a new trail by being the first one to install Oracle 12c on containers, use a supported option such as:
– CrunchyData
– Portworx Data Services
– MongoDB Enterprise Operator

When you do not want to deal with the risk of data

There is a time and place to use a cloud service, and if you don’t want to have the risk of running your own data service, I would absolutely consider using a service offering from a cloud vendor such as:
– AWS RDS
– Azure Cosmos DB

I’m not going to spend my time listing out all 13 million cloud offerings, but I would consider the following:

Using a vendor typically costs more than hosting it yourself. As surprising as this is, this tends to be a prevailing business reality. A company that has a lot of data can do it for less money running it themselves in Kubernetes or VMs on the same cloud vendor.
We sacrifice some portability – It is more of a challenge to move from AWS RDS to Azure’s hosted SQL service than it is to move a Kubernetes namespace between the two clouds.
We sacrifice some of the things that make data on Kubernetes great – Integration with CD pipelines, automated testing, developer agility, and many more are some of the things we give up.

Again, this isn’t to say that I wouldn’t put data in a cloud-hosted service (I do it all of the time!) but it comes down to the right tool for the right job. Cloud vendors provide a great service in that it is just that: a service. I can swipe a credit card and pay by the transaction.

Conclusion

This article about the difference between stateless and stateful data – and how Kubernetes treats it – turned out a lot longer than I had planned when I sat down to write it. This is a big topic and a debate that is still raging in the industry. In the hundreds of conversations I have had across many companies and disciplines, I mostly hope for one thing: that we could have honest and curious conversations.

The issue isn’t that persisting data on Kubernetes is bad, it is that developers made that decision without knowing what the benefits and drawbacks were (rubbing k8s on Postgres for lack of a better term). The issue isn’t using the cloud data service, it is that someone made that choice before realizing that their data would need to move later, or that the cloud SLA didn’t match the user’s expectation.

I have always liked the point Carl Sagan made about Einstein’s papers on relativity: the tone was humble and curious. Technologies are changing fast. It is exhilarating. It is frightening. It is more than anyone can learn in a lifetime. It’s best to stay curious.