Kubernetes Tutorial: How to Create Local Snapshots of MySQL Persistent Volume Claims on GKE

This tutorial is a walk-through of how to create local snapshots of MySQL persistent volume claims on Google Kubernetes Engine (GKE) by Certified Kubernetes Administrator (CKA) and Certified Kubernetes Application Developer (CKAD) Janakiram MSV.

TRANSCRIPT:

Janakiram MSV: Hi. In this demo, I’m going to walk you through how to create a local snapshot of a MySQL database running on Google Kubernetes Engine with Portworx. So we have a pod that’s currently running MySQL as part of a deployment with one replica. And because we created the Portworx storage cluster with a replication factor of three, this is a highly available instance of MySQL. Now we’re going to take this a step further by creating a snapshot of the running pod, or rather the database instance behind it. So let’s first make sure that the data that’s currently available remains intact when we take a snapshot and restore it. We’ll grab the name of the pod that’s currently running the MySQL instance, access the MySQL shell, and query the sample data. We have a sample database with a bunch of rows, so let’s make sure that’s available. There we go. Now that this is in place, let me exit and get back to the prompt.
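The commands for this step might look like the following sketch; the pod label, the root password, and the table queried are assumptions, since the transcript does not show them on screen:

```shell
# Grab the name of the pod running MySQL (the app=mysql label is an assumption)
POD=$(kubectl get pods -l app=mysql -o jsonpath='{.items[0].metadata.name}')

# Open the MySQL shell inside that pod (the root password is an assumption)
kubectl exec -it "$POD" -- mysql -uroot -ppassword

# Inside the MySQL shell, verify the sample data, e.g.:
#   mysql> USE classicmodels;
#   mysql> SELECT * FROM customers LIMIT 5;
#   mysql> exit
```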

Now what we’re going to do is create a local snapshot. A snapshot in Portworx is a point-in-time backup of a running volume, or of a running pod backed by a volume. When we create a local snapshot, Portworx stores that snapshot within the storage cluster itself, in the local environment of the Portworx storage cluster. That’s why it is called a local snapshot. So how do we create one? It all starts with the VolumeSnapshot object type. This is a Kubernetes primitive, an artifact that points to a snapshot, and what is important here is that we point the snapshot at an existing PVC. This is the Persistent Volume Claim that is currently backing the MySQL pod. So we are pointing the snapshot at the PVC that is powering the current production database of MySQL. Let’s go ahead and create the snapshot.
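A VolumeSnapshot definition along these lines would do this; px-mysql-snapshot is the snapshot name used later in the demo, while the source PVC name px-mysql-pvc is an assumption:

```yaml
apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: px-mysql-snapshot
  namespace: default
spec:
  # The PVC currently backing the MySQL pod (the name is an assumption)
  persistentVolumeClaimName: px-mysql-pvc
```

Saved to a file, it is created with kubectl create -f, like any other Kubernetes object.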

This is going to create a new snapshot, which becomes available to us through these commands: when we run kubectl get volumesnapshot, it shows that we have successfully created a snapshot. We can also verify this with kubectl get volumesnapshotdatas, which further confirms that we have been able to take a snapshot of the running database. Alright, now what we’re going to do is simulate data corruption. How do we do that? By deleting one of the databases in the MySQL instance. So let’s grab the name of the pod, and then access the MySQL shell. Within that, let me show you the databases we have. We have a database called classicmodels. Now we are going to drop this database, which is how we simulate data corruption: drop database classicmodels. And now, when we run show databases, it is gone. We no longer have any customer data available within the database instance. Our job now is to restore it from the snapshot. So how do we do that?
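The verification and the simulated corruption might look like this sketch; the pod label and password are again assumptions:

```shell
# Confirm the snapshot objects exist
kubectl get volumesnapshot
kubectl get volumesnapshotdatas

# Simulate data corruption by dropping the sample database
POD=$(kubectl get pods -l app=mysql -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$POD" -- mysql -uroot -ppassword
#   mysql> SHOW DATABASES;
#   mysql> DROP DATABASE classicmodels;
#   mysql> SHOW DATABASES;    -- classicmodels is gone
#   mysql> exit
```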

Well, once we have a snapshot, it is as simple as creating a new PVC from it. So we are going to create a new PVC from the local snapshot that we took in the previous step. Let’s take a look at the definition of this PVC. Everything remains the same; this is like any other dynamically provisioned Persistent Volume Claim, except that it has an annotation pointing to the snapshot we took earlier. px-mysql-snapshot is the name of the snapshot that we created in the previous step. By associating this PVC with the snapshot, we are essentially mounting a new volume that will be restored from the snapshot, which means the data also gets recovered. This also gives us an opportunity to create the new PVC with an increased size. The original PVC was 1GB, but because we are restoring from the snapshot, we can go ahead and expand it to 2GB.
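Based on that description, the clone PVC might look like this sketch; the annotation ties it to the snapshot taken earlier, and the size matches the 1GB-to-2GB expansion mentioned:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-mysql-snap-clone
  annotations:
    # Restore this PVC from the local snapshot taken earlier
    snapshot.alpha.kubernetes.io/snapshot: px-mysql-snapshot
spec:
  accessModes:
    - ReadWriteOnce
  # The STORK snapshot storage class installed with Portworx
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 2Gi   # expanded from the original 1Gi
```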

So let’s go ahead and create a new Persistent Volume Claim based on the snapshot. This is possible because of one of the storage classes registered within the Kubernetes cluster: if you notice, this is the stork-snapshot-sc storage class. The magic of snapshots is delivered through this storage class, which is very specific to Portworx, and it is installed by default when you set up a Portworx cluster on Kubernetes with STORK enabled. So, with that in place, let’s create the new PVC, which is pointing to the snapshot. Now we can verify this by getting the PVCs. We actually have two PVCs: the original PVC backing the production pod, and the clone, which is pointing to the snapshot. The clone is the most recent one, created just six seconds ago.
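This step could be sketched as follows; the manifest filename is an assumption:

```shell
# Confirm the STORK snapshot storage class is registered in the cluster
kubectl get storageclass stork-snapshot-sc

# Create the clone PVC from its manifest (filename is an assumption)
kubectl create -f px-mysql-snap-clone.yaml

# List both PVCs: the original and the clone restored from the snapshot
kubectl get pvc
```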

Alright. Now, with the PVC in place, we can go ahead and create a new deployment, with a new pod as part of it. Nothing changes in this definition; it’s almost the same deployment definition as before. The only difference is the claim name, which points to the PVC we just created, px-mysql-snap-clone. All we have to do is point this deployment’s pod at the PVC that was created from the snapshot earlier in this demo.
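A minimal deployment along these lines would match that description; the image tag, labels, and root password are assumptions, while the claim name comes from the demo:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-snap
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql-snap
  template:
    metadata:
      labels:
        app: mysql-snap
    spec:
      containers:
      - name: mysql
        image: mysql:5.6           # image tag is an assumption
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: password          # credential handling is simplified here
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-data
        persistentVolumeClaim:
          # The PVC restored from the snapshot, as named in the demo
          claimName: px-mysql-snap-clone
```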

So now a pod gets created, and it is pointing to the PVC, which is, in turn, pointing to the snapshot. Perfect. Now let’s go ahead and create this new version of the database, restored from the snapshot. This results in two pods: this pod is currently being created, and we still have the corrupted pod, because we left it as is and didn’t terminate it. As soon as the new pod becomes available, we can access the MySQL shell and check out the data. At this point, we have two pods. The first is the original one, where the database is corrupted, and the second is associated with the PVC restored from the snapshot.

So now we are going to grab the pod name, the most recent one, which is restored from the snapshot, do an exec to access the MySQL shell, and verify that the data is available. So, use classicmodels; looks like everything is intact, and we can now query the table. Perfect. So even though the original database is corrupted, we are able to access the point-in-time snapshot that was created earlier, which ensures a graceful restoration of the data captured in the snapshot. And all of this from kubectl. This is part of storage ops, where you are able to take point-in-time snapshots and restore them in case of data corruption or database unavailability. I hope you found this video useful. Thanks for watching.