Apache Cassandra is an open-source distributed database management system designed to handle large amounts of data across commodity servers. Running Cassandra containers as part of a Mesosphere DC/OS, Kubernetes, or Docker cluster is challenging, however. If you are thinking about containerizing Cassandra, this guide answers the question “how do I run Cassandra in Docker containers?”
Problems with containerized Cassandra include:
Cassandra works best when deployed on bare-metal servers instead of shared storage because of its intense requirements around disk i/o and throughput. That means that when running Cassandra in a container, you need to schedule your containers on the same host where your data is located.
You can force your scheduler, for example, Kubernetes, Mesosphere DC/OS or Docker Swarm, to run your Cassandra containers only on a host that already has a copy of the data with Portworx. Portworx enforces these types of scheduling decisions using host labels. By labeling hosts based on which volumes they have and passing these constraints to the scheduler, your containers will run in the location that minimizes rebuild time and maximizes performance using direct attached storage. In the case of Amazon EBS, Google Persistent Disk or other block device, it is also possible to use labels to force Cassandra to run on hosts with a block device already mounted.
For production environments, Cassandra recommends placing replicas in your cluster using the Network Topology strategy. This strategy takes into consideration what are called Fault Domains when placing replicas in a ring. A fault domain could be a host, a rack, or a datacenter.
The problem with the Network Topology strategy is that it is cumbersome to implement manually. Additionally, if you hand place your containers in the optimum network topology, you can’t take advantage of automated scheduling. Portworx can automatically understand the topology of your datacenter and its fault domains and constrain automated scheduling decisions based on these factors.
In the cloud, Portworx automatically handles this placement by querying a REST API endpoint on the cloud provider for zone & region information associated with each host. Then with this understanding of your datacenter topology, Portworx can automatically place your partitions across fault domains, providing extra resilience against failures even before employing replication.
On prem, Portworx can also influence scheduling decisions by reading the topology defined in a yaml file like cassandra-topology.properties.
Cassandra itself is capable of replicating data and if a node dies, a new node brought into the cluster will be populated with data from other healthy nodes, a process known as bootstrapping. The bootstrap process puts load on your cluster as cluster resources are used to stream data to new nodes. This load reduces read/write performance of Cassandra, which slows your application down. The key with bootstrapping is to finish as quickly as possible yet this is increasingly difficult as the cluster is under stress.
Like Cassandra, Portworx also can bootstrap a node by replicating data from another host. Portworx replication, however is at the block-level and as a result, is faster than Cassandra replication. Block-level replication is faster for two main reasons: 1- Block level granularity allows for massive parallelization of network transfers so data is transfered faster. 2- Cassandra is written in Java which is not as performant for the task of data transfer as C++ which is the language Portworx is written in.
By using Portworx replication alongside Cassandra replication, you can reduce the time to recovery for your Cassandra cluster by completing the bootstrap operation faster.
The above operational best practices have been concerned with reliability and performance. Now we can look at efficiency. Most organizations run multiple Cassandra rings, and when each ring is architected as outlined above, you can achieve fast and reliable performance. However, since Cassandra is a resource intensive application, the costs of operating multiple rings can be considerable.
With Portworx, it is possible to run multiple Cassandra rings on the same cluster. First, Portworx can provide container-granular volumes to multiple Cassandra replicas running on the same host. On prem, these volumes can use local direct attached storage which Portworx formats as a block device and “slices” up for each container. Alternatively in the cloud, a single or multiple network-attached block devices like AWS EBS or Google Persistent disk can be used, likewise with Portworx slicing each block device into multiple container-granular block devices.
Then, the same placement algorithms described above apply to multiple rings. Since Portworx is application aware, you can pass in the equivalent of a Cassandra ring id as a group id in volume. Using this id Portworx will make sure that it does not colocate data for two Cassandra instances that belong to the same ring in the same node. Each of your rings will be spread across the cluster, maximizing resiliency in the face of hardware failure.