Kubernetes Tutorial: How to Create Cloud Snapshots of MongoDB Persistent Volume Claims on GKE
October 18, 2018
This tutorial is a walk-through of how to create cloud snapshots of MongoDB persistent volume claims on Google Kubernetes Engine (GKE) by Certified Kubernetes Administrator (CKA) and Application Developer (CKAD) Janakiram MSV.
Janakiram MSV: Hi. In this demo, I am going to walk you through the steps involved in taking a cloud snapshot of MongoDB running on Google Kubernetes Engine. Let's explore the environment. We have one and only one pod currently running MongoDB, and this pod is part of a deployment. MongoDB here comes with some sample data, which we're going to use as part of the use case, so let's look at it: it has a collection with some records. Now what we are going to do is take a snapshot of the PVC backing this pod. Since this is a cloud snapshot, we need to perform a few additional steps before we can take a snapshot of this PVC, so let me walk you through them.
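The environment check above might look like the following sketch. The pod label `app=mongo` and the database and collection names (`mydb`, `sample`) are assumptions for illustration; substitute whatever your deployment actually uses.

```shell
# List the single MongoDB pod (label is an assumed example)
kubectl get pods -l app=mongo

# Capture the pod name for later steps
POD=$(kubectl get pods -l app=mongo -o jsonpath='{.items[0].metadata.name}')

# Peek at the sample data through the Mongo shell
kubectl exec -it "$POD" -- mongo mydb --eval "db.sample.find().pretty()"
```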
The very first thing you need to do is create a new credential. This credential connects Portworx with the cloud service provider's object storage service; in other words, it enables Portworx to authenticate with the cloud storage provider in order to store the snapshots. In GKE, open the APIs & Services console and create a new service account key that will be added to the Portworx storage cluster. When you create the service account, make sure you give it the Storage Admin role. Give it a name, call it px-snap, and when you click on create, it downloads a JSON file. This JSON file is the key that enables Portworx to talk securely to Google Cloud Platform's object storage service.
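The same service account and key can be created from the gcloud CLI instead of the console. This is a sketch assuming an authenticated gcloud session; `my-gcp-project` is a placeholder for your project ID, while `px-snap` matches the name used above.

```shell
PROJECT=my-gcp-project   # placeholder: your GCP project ID

# Create the service account used by Portworx for cloud snapshots
gcloud iam service-accounts create px-snap --display-name "px-snap"

# Grant the Storage Admin role so Portworx can create buckets and objects
gcloud projects add-iam-policy-binding "$PROJECT" \
  --member "serviceAccount:px-snap@${PROJECT}.iam.gserviceaccount.com" \
  --role "roles/storage.admin"

# Download the JSON key that Portworx will use to authenticate against GCS
gcloud iam service-accounts keys create px-snap.json \
  --iam-account "px-snap@${PROJECT}.iam.gserviceaccount.com"
```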
To save time, I’ve already gone ahead and created the service account JSON key. Once we have the key in place, we need to use it to create a set of credentials within Portworx. To proceed, we need to copy the service account key we just created to one of the nodes running the Portworx storage cluster, so let’s go ahead and do that. We are going to get the node name from one of the running pods. This can be any node, but it is convenient to look it up with kubectl. With the node name in hand, we will copy over the JSON file we downloaded from the Google Cloud console.
This is the JSON file that was created as part of the service account creation. Now we’re going to copy this file from our local development machine to one of the Kubernetes nodes of the cluster. Before we scp, I need to set the node name in a variable; then we securely copy the key from the local machine, on which we downloaded it, to the node. Now that the file is copied, we are all set to create the actual credential within Portworx. I’m going to SSH into the same node to which we copied the service account key, and when we verify, we can see that the JSON file is indeed on the node.
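The copy-and-verify steps above might be sketched as follows. For GKE nodes, `gcloud compute scp` and `gcloud compute ssh` handle the SSH key plumbing; the pod label used to discover the node is an assumption from the earlier steps.

```shell
# Pick the node hosting one of the running pods (any Portworx node works)
NODE=$(kubectl get pods -l app=mongo -o jsonpath='{.items[0].spec.nodeName}')

# Copy the service account key from the workstation to that node
gcloud compute scp px-snap.json "$NODE":~/px-snap.json

# SSH in and confirm the key arrived
gcloud compute ssh "$NODE" --command "ls -l ~/px-snap.json"
```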
Now we’re going to invoke pxctl with the credentials parameter. Currently, when you list the credentials, the output is empty, because we haven’t yet created a Portworx credential to talk to the cloud service provider. In the next step, we create a credential associated with the key we created and copied over to the node: we specify the Google cloud provider, the project name, which is specific to your GCP project, and the JSON key we copied from our development workstation to the node. When we run the list command again, we’ll notice that Portworx has created the credentials it needs to talk to the object storage service.
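On the node, the pxctl invocation looks roughly like this sketch. Flag names vary between Portworx releases, so treat these as an assumption and check `pxctl credentials create --help` on your cluster; the project ID is a placeholder and `gcloud-cred` is just a name of our choosing.

```shell
# Before: no credentials configured
/opt/pwx/bin/pxctl credentials list

# Create a Google credential from the copied JSON key
# (flag names are assumptions; confirm against your pxctl version)
/opt/pwx/bin/pxctl credentials create \
  --provider google \
  --google-project-id my-gcp-project \
  --google-json-key-file ~/px-snap.json \
  gcloud-cred

# After: the new credential appears in the list
/opt/pwx/bin/pxctl credentials list
```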
Portworx uses this credential whenever you request a cloud snapshot; it is the link to the object storage service, which in this case is Google Cloud Storage. I want to call out that before you can execute this command and use cloudsnap, you’ve got to make sure that STORK is enabled, which it is by default. Most importantly, you’ve got to set the Secrets Store Type to Kubernetes, which is done when you generate the Portworx spec through the spec generator available at install.portworx.com. When the secret is stored as part of Kubernetes, it is seamlessly integrated with Portworx and later used by the cloud snapshot workflow to access the cloud storage service.
Alright. We can see that Google Cloud Storage currently has no buckets; it is empty. As soon as we create a cloudsnap, this should get populated. Let’s leave this as is and come back to the command prompt. We are now done with the creation of credentials, and we can proceed with the remaining steps of creating a snapshot, which, by the way, is not very different from creating a local snapshot. When we look at the snapshot definition, the only difference is an annotation that indicates to Kubernetes that it should create a cloud snapshot instead of a local snapshot, and this simple annotation makes a lot of difference. Before we create it, I want to remind you that the magic of snapshots is done through STORK, the STorage Orchestration Runtime for Kubernetes, created by Portworx. STORK is responsible for orchestrating the entire workflow, all the way from creating a cloud snapshot to restoring it and, eventually, placing the pod on the same node where the volume is actually available.
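A minimal cloud-snapshot definition might look like the sketch below. The `portworx/snapshot-type: cloud` annotation is what turns a STORK snapshot into a cloud snapshot; the snapshot and PVC names (`mongo-cloud-snapshot`, `px-mongo-pvc`) are assumed examples.

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mongo-cloud-snapshot
  annotations:
    portworx/snapshot-type: cloud   # this annotation requests a cloud snapshot
spec:
  persistentVolumeClaimName: px-mongo-pvc   # assumed name of the MongoDB PVC
EOF

# Verify the snapshot objects
kubectl get volumesnapshot
```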
With that in place, we are all set, so let’s go ahead and create the cloud snapshot. We can verify by running get volumesnapshot: we now have a new snapshot. We can also check that the data is still readily available, and it is, so we can now consume this snapshot as part of a PVC. The other way of verifying is to refresh the object storage browser of Google Cloud Platform. Here, we notice that a new bucket has been created, which is responsible for holding all the snapshots, and this is an indication that the workflow so far has been successful. Perfect. With the snapshot in place, we can access the data in the pod and simulate data corruption. We access the Mongo shell of the original pod where the data is available, and because we have successfully taken a snapshot, we can safely drop the collection to simulate a corruption event. Now we can no longer access the data, but since we have taken the snapshot, we should be able to restore it. Next, we are going to create a new PVC, and this PVC is going to come from the snapshot that we took in the previous steps. Let’s take a look at the definition. If you are familiar with the local snapshot concept of Portworx, this is exactly the same; the PVC doesn’t even know that it is actually coming from the cloud.
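A PVC restored from the snapshot could be sketched like this. STORK recognizes the `stork-snapshot-sc` storage class together with the snapshot annotation; the snapshot name, clone name, and storage size are assumptions carried over from the earlier sketches.

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-mongo-snap-clone
  annotations:
    # Tell STORK which snapshot to restore from (assumed name)
    snapshot.alpha.kubernetes.io/snapshot: mongo-cloud-snapshot
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: stork-snapshot-sc   # STORK's snapshot-restore class
  resources:
    requests:
      storage: 2Gi   # assumed; match the original PVC's size
EOF

# We should now see two PVCs: the original and the clone
kubectl get pvc
```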
Let’s go ahead and create a new PVC from the cloud snapshot. We can verify by querying the PVCs; we should have two. The first was originally bound to the initial pod that we used for populating the sample data and then simulating the corruption; the second is the PVC created from the cloud snapshot. It is now ready to be consumed, so we will create a new MongoDB instance backed by the PVC, which is indeed restored from the cloud snapshot. Let’s take a look at the MongoDB pod definition as part of the deployment. Again, this is not different at all; we are simply pointing it at the PVC created in the previous step, px-mongo-snap-clone, which is exactly the name we defined for the restored PVC. Because of this association, MongoDB will retain the data that was created by the original pod.
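The restored deployment might look like the following sketch; it differs from the original only in the `claimName`, which points at the clone PVC. The deployment name, labels, and image tag are assumptions, and `schedulerName: stork` is included so STORK can place the pod on a node where the volume is available.

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-snap
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongo-snap
  template:
    metadata:
      labels:
        app: mongo-snap
    spec:
      schedulerName: stork   # let STORK co-locate the pod with its volume
      containers:
      - name: mongo
        image: mongo:3.6     # assumed image tag
        volumeMounts:
        - name: mongo-data
          mountPath: /data/db
      volumes:
      - name: mongo-data
        persistentVolumeClaim:
          claimName: px-mongo-snap-clone   # the PVC restored from the cloudsnap
EOF
```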
Now let’s create the pod that’s going to access the data and make sure it comes up. When we run kubectl get pods, we see the original pod, on which the data has been corrupted, and another pod restored from the snapshot, which has been up and running for only a few seconds. Let’s access the shell: we invoke the Mongo shell through the kubectl exec command and check whether the data is still available, and there we go. Even though the data was corrupted on the original pod, because we had taken a snapshot, we were able to successfully restore it and associate it with another pod that is now running exactly the same MongoDB instance with the same set of data. That’s how simple it is to create a cloud snapshot and restore it to another instance of MongoDB, without any downtime and without losing data. I hope you found this video useful. Thanks again for watching.
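The final verification could be sketched as follows, reusing the assumed label, database, and collection names from the earlier steps.

```shell
# Both pods should be listed: the corrupted original and the restored clone
kubectl get pods

# Exec into the restored pod and confirm the collection is back
NEWPOD=$(kubectl get pods -l app=mongo-snap -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$NEWPOD" -- mongo mydb --eval "db.sample.find().count()"
```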