Leveraging Databases in Multi-Cloud Kubernetes

Architect’s Corner

A multi-cloud approach lets you combine multiple private, public, and hybrid cloud architectures. By strategically distributing your cloud needs across different providers, you gain operational efficiency, faster disaster recovery, and a more resilient architecture while benefiting from economies of scale.

One advantage of running databases in multi-cloud Kubernetes is access to more deployment options through different cloud vendors. You can also gain better redundancy across your microservices.

However, running databases in a multi-cloud Kubernetes system has its share of challenges because of the granular configuration and maintenance required, especially for stateful applications. Data regulations, business policies, vendor lock-in, and the cost of running databases add another layer of complexity.

Nonetheless, applying proper practices while deploying your databases in Kubernetes can help you get better value from your multi-cloud environment, scale applications faster, and reduce cloud management costs.

Let’s discuss the importance of multi-cloud architecture for Kubernetes, look at ways to leverage databases more efficiently in a multi-cloud Kubernetes environment, and explore use cases where Kubernetes requires multi-cloud data services.

The Benefits of Multi-Cloud Architecture for Kubernetes

More businesses are adopting a multi-cloud strategy to deploy their stateful databases. In a Gartner survey of public cloud users, 81 percent of respondents said they work with two or more providers. This is because distributing databases across different cloud providers offers benefits you wouldn’t get from a single cloud service.

Adopting a multi-cloud approach also serves as a business continuity strategy by providing reliable platforms for managing stateful databases across multiple cloud services.

High Availability

Distributing stateful databases across multiple cloud vendors in various locations helps keep applications highly available for users regardless of their location. This approach also lets you tailor your solutions to each region’s data regulations without affecting the quality of the services you deliver to users.

Flexibility

Working with multiple cloud providers exposes your Kubernetes environment to different vendors’ top solutions. Also, because no single solution provides everything, you can spread your needs across various options to get more value from your subscriptions.

How to Leverage Databases More Efficiently

Kubernetes’s original “stateless” design handled ephemeral workloads that don’t require storing data when replacing and updating containers. But as cloud-native development grew popular, developers needed a simple way to manage stateful apps and databases in Kubernetes.

So, Kubernetes added constructs that help you persist your databases across different cloud platforms. These include three storage objects (PersistentVolume, PersistentVolumeClaim, and StorageClass) and two workload controllers (StatefulSet and DaemonSet).

Kubernetes PersistentVolumes

By default, the storage attached to a pod is ephemeral and shares the pod’s life cycle, so its data is lost when the pod is deleted or replaced.

PersistentVolumes (PVs) provide portability, data retention, and scalability. A PersistentVolume is a cluster-level storage resource whose life cycle is independent of any pod that uses it, so the data outlives the pod and a replacement pod can mount the same volume.
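
As a rough illustration, the sketch below builds a PersistentVolume manifest as a Python dictionary and prints it as JSON, a format kubectl accepts alongside YAML. The volume name, size, and path are hypothetical, and the hostPath backend is an assumption chosen only because it needs no cloud provider; it suits single-node testing, not production.

```python
import json


def persistent_volume(name, size, host_path):
    """Build a PersistentVolume manifest (illustrative values only)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            # Retain keeps the underlying data after the claim is released.
            "persistentVolumeReclaimPolicy": "Retain",
            # Assumption: a hostPath backend for local, single-node testing.
            "hostPath": {"path": host_path},
        },
    }


# Hypothetical name, size, and path.
pv = persistent_volume("db-pv", "10Gi", "/mnt/data/db")
print(json.dumps(pv, indent=2))
```

Saving the output to a file and passing it to kubectl apply -f would register the volume with the cluster.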

PersistentVolumeClaims

PersistentVolumeClaims (PVCs) are requests made by users for persistent storage properties such as size, access mode, and performance. PVCs allow users to consume PersistentVolumes the same way pods request and consume node resources.

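When you create a PersistentVolumeClaim, you specify the storage you need, such as the requested size and access modes, and you can name the StorageClass that should satisfy the claim. Kubernetes then binds the claim to a matching PersistentVolume or, if the StorageClass supports it, provisions a new one.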
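
A claim can be sketched the same way, as a Python dictionary printed as JSON for kubectl. The claim name, size, and the StorageClass name fast-db are hypothetical placeholders, not values from any particular cluster.

```python
import json


def persistent_volume_claim(name, size, storage_class):
    """Build a PersistentVolumeClaim manifest (illustrative values only)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # The claim refers to a StorageClass by name.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical claim name, size, and StorageClass name.
pvc = persistent_volume_claim("db-data", "10Gi", "fast-db")
print(json.dumps(pvc, indent=2))
```

A pod then mounts the storage by referencing the claim name, without needing to know which volume or cloud backs it.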

StorageClass

StorageClass is a Kubernetes object that enables you to define storage categories and describe the attributes you assign to each PersistentVolume. These objects work with PersistentVolumes and PersistentVolumeClaims to ensure PersistentVolumes have the required resources before mounting them to pods.

StorageClasses enable Kubernetes to provision PersistentVolumes dynamically without manually making calls to a cloud provider. This way, Kubernetes doesn’t have to locate a pre-deployed volume.
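
One way to use this in a multi-cloud setup is to define a StorageClass with the same name in every cluster, each backed by that cloud’s own provisioner, so the same claim works everywhere. The sketch below assumes the AWS EBS and Google Compute Engine Persistent Disk CSI drivers are installed; the class name fast-db and the chosen disk types are hypothetical.

```python
import json

# Assumption: these CSI drivers are installed in the respective clusters.
# Adjust the provisioner names and parameters to your environment.
PROVISIONERS = {
    "aws": ("ebs.csi.aws.com", {"type": "gp3"}),
    "gcp": ("pd.csi.storage.gke.io", {"type": "pd-ssd"}),
}


def storage_class(name, cloud):
    """Build a StorageClass manifest for the given cloud."""
    provisioner, parameters = PROVISIONERS[cloud]
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "parameters": parameters,
        "reclaimPolicy": "Retain",
        # Delay provisioning until a pod that uses the claim is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


# Same class name in every cluster, different backend per cloud.
classes = {cloud: storage_class("fast-db", cloud) for cloud in PROVISIONERS}
for cloud, manifest in classes.items():
    print(f"# apply in the {cloud} cluster")
    print(json.dumps(manifest, indent=2))
```

Because claims refer to the class only by name, the database manifests themselves can stay identical across providers.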

Beyond these storage objects, Kubernetes provides two workload controllers, StatefulSet and DaemonSet, that let you deploy stateful apps in a containerized environment and automate database operations from within Kubernetes. This approach reduces the need for separately managed data services.

StatefulSet Controller

StatefulSet is a workload API object for managing stateful applications. Its controller assigns each pod a unique, persistent identity that the pod keeps across rescheduling. StatefulSets are useful for applications that require one or more of the following:

Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, automated rolling updates.

For stable, persistent storage, the StatefulSet controller gives each pod its own PersistentVolumeClaim. The claim uses the storage class named in the claim template or, if none is named, the cluster’s default storage class. By default, Kubernetes doesn’t delete these claims or their PersistentVolumes when you scale down pods, so the data is still there if the pod comes back.
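
The per-pod storage described here comes from the volumeClaimTemplates field of a StatefulSet. The following is a minimal sketch, assuming a hypothetical three-replica PostgreSQL set named pg and a StorageClass named fast-db; a real deployment would also need credentials, a headless Service, and health probes.

```python
import json


def stateful_set(name, replicas, storage_class, size):
    """Build a StatefulSet manifest with one claim template per pod."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Name of the headless Service that gives pods stable DNS names.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": "db",
                        "image": "postgres:16",
                        "volumeMounts": [{
                            "name": "data",
                            "mountPath": "/var/lib/postgresql/data",
                        }],
                    }],
                },
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class,
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def claim_names(sts):
    """Claims are named <template>-<statefulset>-<ordinal>."""
    name = sts["metadata"]["name"]
    return [
        f'{tpl["metadata"]["name"]}-{name}-{i}'
        for tpl in sts["spec"]["volumeClaimTemplates"]
        for i in range(sts["spec"]["replicas"])
    ]


# Hypothetical set name, StorageClass name, and size.
sts = stateful_set("pg", 3, "fast-db", "10Gi")
print(json.dumps(sts, indent=2))
print(claim_names(sts))
```

Each pod keeps the same claim across restarts, which is what lets a database replica find its data again after being rescheduled.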

For ordered, graceful deployment and scaling, StatefulSets provide a standardized method. With N replicas, they create pods in order from ordinal 0 to N-1 and delete them in reverse order, from N-1 to 0. This ordered pod management helps keep data available in stateful applications.

For ordered, automated rolling updates, the StatefulSet controller updates pods incrementally, one at a time, to limit downtime.
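
The ordering guarantees can be made concrete with a small sketch. It is not controller code; it only lists the order in which a StatefulSet with the default pod management policy and the RollingUpdate strategy would act on its pods, as described in the Kubernetes documentation. The set name db is hypothetical.

```python
def creation_order(name, replicas):
    """Pods are created from ordinal 0 up to replicas - 1."""
    return [f"{name}-{i}" for i in range(replicas)]


def termination_order(name, replicas):
    """Pods are terminated in reverse, from the highest ordinal down to 0."""
    return list(reversed(creation_order(name, replicas)))


def rolling_update_order(name, replicas):
    """Rolling updates follow the termination order, one pod at a time."""
    return termination_order(name, replicas)


print(creation_order("db", 3))        # ['db-0', 'db-1', 'db-2']
print(termination_order("db", 3))     # ['db-2', 'db-1', 'db-0']
print(rolling_update_order("db", 3))  # ['db-2', 'db-1', 'db-0']
```

For a replicated database, this means the pod with ordinal 0, often the primary, is the first to start and the last to be stopped or updated.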

DaemonSet Controller

The DaemonSet controller is another native Kubernetes object. It ensures that all nodes, or a selected set of nodes, run a copy of a pod. Though it is a less common way to run databases, the controller adds a database pod to each node that joins the cluster, except nodes with taints the pod doesn’t tolerate. Likewise, when you remove a node, the accompanying pod is garbage collected.

Since pods contain network resources and shared storage, a DaemonSet controller schedules pods and ensures they run on every available node. The controller is typically used for background tasks that need no user intervention, like monitoring services, log collection, and maintenance task distribution across nodes.
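
For comparison with the StatefulSet, here is a minimal sketch of a DaemonSet manifest for a node-level log collector. The name and container image are hypothetical placeholders, and the toleration shown is an assumption for clusters that should also run the collector on control-plane nodes.

```python
import json


def daemon_set(name, image):
    """Build a DaemonSet manifest that runs one pod per eligible node."""
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    # Without a matching toleration, tainted nodes are skipped.
                    "tolerations": [{
                        "key": "node-role.kubernetes.io/control-plane",
                        "operator": "Exists",
                        "effect": "NoSchedule",
                    }],
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Hypothetical name and image.
ds = daemon_set("log-collector", "example.com/log-collector:1.0")
print(json.dumps(ds, indent=2))
```

Unlike a StatefulSet, the manifest has no replicas field: the number of pods follows the number of eligible nodes.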

Multi-Cloud Data Service Use Cases

In a multi-cloud architecture where applications are designed to be interconnected and portable, developers are free to choose from a variety of database solutions. This flexibility allows you to choose tools that suit the unique needs of your software.

Modern applications comprise numerous microservices, often supported by multiple SQL and NoSQL databases like PostgreSQL, MySQL, or Cassandra, along with streaming, AI or machine learning (ML), and search data services. Managing these databases consistently and efficiently is time-consuming and prone to human error and downtime.

A managed data service automates Day-2 operations (housekeeping and maintenance tasks). Most often, such a service is delivered through a single-click deployment.

Teams don’t have to configure backups, high availability, automated capacity management, or data migration. The managed service enables production-ready deployments from one console.

Cloud bursting is one multi-cloud Kubernetes use case. In cloud bursting, you set up private cloud deployments to use public cloud resources when traffic spikes and overflows or “bursts.” Cloud bursting enables you to leverage better performance, lower costs, or higher throughput when you have demanding workloads. An online retailer might benefit from this approach on Black Friday, for example.
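
The bursting decision itself can be sketched in a few lines. This is a simplified illustration, not a scheduler: the function name, the replica counts, and the idea of measuring capacity in replicas are all assumptions made for the example.

```python
def plan_burst(required_replicas, private_capacity):
    """Fill the private cloud first, then overflow to the public cloud."""
    private = min(required_replicas, private_capacity)
    public = max(0, required_replicas - private_capacity)
    return {"private": private, "public": public}


# Normal day: everything fits in the private cloud.
print(plan_burst(required_replicas=8, private_capacity=10))
# Traffic spike: the overflow "bursts" to the public cloud.
print(plan_burst(required_replicas=25, private_capacity=10))
```

In practice, the required replica count would come from an autoscaler or traffic forecast, and the public share would be scheduled onto a cluster in the public cloud.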

Another use case for multi-cloud Kubernetes is increasing redundancy across your workloads. A multi-cloud Kubernetes approach enables you to replicate your databases across multiple cloud platforms without configuring workloads to meet each cloud provider’s requirements individually.

You can still achieve redundancy without Kubernetes, for example, by hosting your workloads on virtual machines at different cloud providers, but doing so requires granular, provider-specific configuration, which is complex and costly. A financial organization might use the multi-cloud Kubernetes approach to meet regulatory requirements for record-keeping.

Summary

Running databases within a multi-cloud Kubernetes environment has its benefits and challenges. Where you run a database (in Kubernetes or in a cloud provider’s managed service) and which cloud service you choose depend on the use case, so evaluate your choices case by case. Although running databases within a multi-cloud environment can be challenging, Portworx can help by enabling a common data management layer across each cloud vendor you are using. Learn more about how the Portworx Data Services database as a service (DBaaS) deploys your data service.

Tim Darnell

Tim is a Principal Technical Marketing Manager within the Cloud Native Business Unit at Pure Storage. He has held a variety of roles in the two decades spanning his technology career, most recently as a Product Owner and Master Solutions Architect for converged and hyper-converged infrastructure targeted for virtualization and container-based workloads. Tim joined Pure Storage in October of 2021.
