Kubernetes Tutorial: How to Deploy PostgreSQL on Google Kubernetes Engine (GKE)

This tutorial is a walk-through of how to deploy PostgreSQL on Google Kubernetes Engine (GKE) by Certified Kubernetes Administrator (CKA) and Application Developer (CKAD) Janakiram MSV.

TRANSCRIPT:

Janakiram MSV: Hi. In this demo I want to show you how to install a highly available Postgres database backed by Portworx and Kubernetes. Since we’re going to use Portworx as the storage back-end, we don’t need to configure additional parameters to ensure high availability of Postgres, which means you don’t need to create PostgreSQL as a StatefulSet or use any other mechanism to achieve high availability. Since the storage back-end powering our database is already highly available and redundant, we automatically achieve HA of our database. So let’s see how to do this.

The very first thing that we’re going to do is to create what is called a storage class. If you are familiar with Kubernetes, you know that the storage class associates a specific storage back-end with the volumes and volume claims that we’ll create in the next step. This is a storage class specific to Portworx. While everything looks almost the same as in other storage classes, what is unique to this configuration is the parameter called replication factor. In this case we are setting the replication factor to 3, which indicates that the data written to any of the Portworx volumes is going to be replicated across three nodes, and this is the magic that Portworx brings to achieve high availability of stateful workloads.
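A storage class like the one described here might look roughly like the following sketch. The name px-repl3-sc and the replication factor of 3 come from the demo. The file name, the in-tree provisioner string kubernetes.io/portworx-volume, and the "repl" parameter key are assumptions that are not shown in the transcript, so check them against your Portworx version.

```shell
# Write a StorageClass manifest for Portworx with a replication factor of 3.
# Assumptions: the file name, the provisioner string, and the "repl" key are
# not shown in the transcript; adjust them to match your Portworx install.
cat > px-repl3-sc.yaml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: px-repl3-sc
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "3"
EOF

# Create the storage class; this needs kubectl pointed at the GKE cluster.
kubectl create -f px-repl3-sc.yaml \
  || echo "kubectl create failed; is the cluster reachable?" >&2
```

The replication factor is quoted because storage class parameters are passed to the provisioner as strings.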

So, let’s go ahead and create the storage class. This is going to create a storage class called “px-repl3-sc”. Once we have the storage class, the next step is to create a persistent volume claim. Now again, if you have a background in Kubernetes, you know that a PVC is always bound to a PV, which is a persistent volume. But thanks to dynamic provisioning, we can straight away associate the PVC with the storage class, bypassing the step of creating the volume first and then creating a volume claim. We do that by mentioning the storage class in the claim and pointing it to the storage class that we just created. We’re also making sure that we claim at least 1 GB for the application.
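As a rough sketch, a claim that relies on dynamic provisioning could be written as follows. The claim name px-postgres-pvc, the storage class px-repl3-sc, and the 1 GB size come from the demo. The file name, the ReadWriteOnce access mode, and the use of the storageClassName field (instead of an annotation) are assumptions.

```shell
# Write a PersistentVolumeClaim that points at the Portworx storage class.
# No PersistentVolume is created by hand; the provisioner creates one on demand.
# Assumptions: file name, access mode, and the storageClassName field.
cat > px-postgres-pvc.yaml <<'EOF'
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: px-postgres-pvc
spec:
  storageClassName: px-repl3-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

# Create the claim, then check that its status shows Bound.
kubectl create -f px-postgres-pvc.yaml \
  || echo "kubectl create failed; is the cluster reachable?" >&2
kubectl get pvc px-postgres-pvc \
  || echo "could not read the claim" >&2
```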

So, let’s now create the PVC based on the storage class that we just created. This is going to result in a 1 GB PVC which is already bound, and this is the concept of dynamic provisioning, where we can bypass the creation of a volume before claiming the volume space. Alright, with those two in place, the storage class and the PVC, we now need to create the pod as part of the deployment running PostgreSQL. Because PostgreSQL expects a username and password, we are going to follow the best practice of creating a Kubernetes secret. For that, I’m going to create a plain vanilla text file that contains a dummy password we are going to use for this demo. We’ll ensure that we are getting rid of the newline character. And then from this file, we are going to create a generic secret, which is called “postgres-pass”. So, when we do “kubectl get secret”, we’ll see that the password is available as a secret.
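The secret steps could be sketched as below. The password value and both file names are placeholders that do not appear in the transcript, and postgres-pass is the likely spelling of the secret name spoken in the demo.

```shell
# Dummy password for the demo (assumed value; pick your own).
echo "postgres123" > password.txt

# Strip the trailing newline so it does not become part of the password.
tr -d '\n' < password.txt > .strippedpassword.txt \
  && mv .strippedpassword.txt password.txt

# Create a generic secret from the file; the key inside the secret
# takes the file name, here "password.txt".
kubectl create secret generic postgres-pass --from-file=password.txt \
  || echo "kubectl create secret failed; is the cluster reachable?" >&2

# Confirm that the secret is stored in the cluster.
kubectl get secret postgres-pass \
  || echo "could not read the secret" >&2
```

Because the secret key is derived from the file name, a deployment that consumes this secret would reference the key password.txt.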

Now it’s time for us to create the actual deployment of Postgres. So let’s take a closer look at the YAML definition. This is configured as a deployment with replicas set to one. As I mentioned, even if we are running just one instance, thanks to the storage back-end, we’ll still achieve HA. Then we are going to use the generic secret that we just created, which is already stored within the cluster in the default namespace. Finally, we are associating this deployment with a volume resulting from the PVC. So, the claim name is now pointing to “px-postgres-pvc”, which was created here, and this is in turn pointing to the storage class, and this is how the pods are going to be backed by the Portworx storage back-end.
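A deployment along these lines might look like the sketch below. The single replica, the secret, and the claim name px-postgres-pvc follow the demo. Everything else is an assumption for illustration: the file name, the deployment name and app=postgres label, the image tag, the secret name postgres-pass with key password.txt, the volume name, and the PGDATA subdirectory.

```shell
# Write a single-replica Postgres Deployment backed by the Portworx claim.
# Assumptions: names, labels, image tag, secret key, and PGDATA path are
# illustrative and not taken from the transcript.
cat > postgres-app.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:16
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-pass
              key: password.txt
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: postgredb
      volumes:
      - name: postgredb
        persistentVolumeClaim:
          claimName: px-postgres-pvc
EOF

# Create the deployment; this needs kubectl pointed at the GKE cluster.
kubectl create -f postgres-app.yaml \
  || echo "kubectl create failed; is the cluster reachable?" >&2
```

Pointing PGDATA at a subdirectory of the mount keeps Postgres from refusing to initialize when the volume root is not empty.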

Alright, so let’s go ahead and create the Postgres deployment. This is going to result in the creation of the pod, so let’s put this in watch mode. In just a few seconds, we’ll notice that this pod, which is a part of the deployment, will move into a ready state with the status becoming Running, and that indicates the successful configuration and deployment of PostgreSQL. There we go. Now we have the Postgres pod up and running. How do we ensure that everything has been configured correctly and we are ready to run our workloads on top of Postgres? Well, let’s grab the volume resulting from the creation of the PVC. So we’ll run this command to grab the volume name associated with this PVC. Printing it will show us a unique ID, which is bound to the Portworx back-end.
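One way to script these two checks is sketched below. The helper function pvc_volume_name and the label app=postgres are hypothetical names introduced for illustration; the claim name px-postgres-pvc comes from the demo. Interactively, "kubectl get pods -w" gives the watch mode shown in the video.

```shell
# Look up the name of the PersistentVolume bound to a claim.
# (Hypothetical helper, not part of the original demo.)
pvc_volume_name() {
  kubectl get pvc "$1" -o jsonpath='{.spec.volumeName}'
}

# Script-friendly alternative to watching the pod list:
# block until the pod is Ready. The label app=postgres is an assumption.
kubectl wait --for=condition=Ready pod -l app=postgres --timeout=120s \
  || echo "pod not ready; check 'kubectl get pods'" >&2

# Grab the volume name behind the claim and print it.
VOL=$(pvc_volume_name px-postgres-pvc) \
  || echo "could not read the claim; is the cluster reachable?" >&2
echo "Volume backing px-postgres-pvc: ${VOL}"
```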

Now, we’re going to grab the name of a pod that is part of the DaemonSet running Portworx. Because we are pointing it to the kube-system namespace with the label key-value pair name equal to portworx, this is going to give us the name of the Portworx pod. On that, we are going to execute a command, which will invoke pxctl, the binary associated with Portworx, and we’re going to inspect the volume that is backing our Postgres database. So, volume inspect, followed by the volume name that we grabbed from this, and this is going to show us some pretty interesting details.
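The inspection step could be sketched as follows. The kube-system namespace, the pxctl "volume inspect" subcommand, and the claim name px-postgres-pvc come from the demo. The label selector name=portworx is the likely reading of the spoken label, and the helper function names and the pxctl path /opt/pwx/bin/pxctl are assumptions.

```shell
# Name of the PersistentVolume bound to a claim (hypothetical helper).
pvc_volume_name() {
  kubectl get pvc "$1" -o jsonpath='{.spec.volumeName}'
}

# Name of the first Portworx pod from the DaemonSet in kube-system.
# Assumption: the pods carry the label name=portworx.
portworx_pod_name() {
  kubectl get pods -n kube-system -l name=portworx \
    -o jsonpath='{.items[0].metadata.name}'
}

VOL=$(pvc_volume_name px-postgres-pvc) \
  || echo "could not read the claim" >&2
PX_POD=$(portworx_pod_name) \
  || echo "could not find a Portworx pod" >&2

# Run pxctl inside the Portworx pod to inspect the volume.
# Assumption: pxctl is installed at /opt/pwx/bin/pxctl in that pod.
if [ -n "$VOL" ] && [ -n "$PX_POD" ]; then
  kubectl exec "$PX_POD" -n kube-system -- \
    /opt/pwx/bin/pxctl volume inspect "$VOL" \
    || echo "pxctl volume inspect failed" >&2
fi
```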

So, this is the name of the Portworx volume, and because it is dynamically provisioned, you will see that it is associated with the PVC and is 1 GB in size, again based on the PVC that we defined earlier. It is highly available based on a replication factor of three. It is attached to one of the SSD disks that are part of the GCP infrastructure and the GKE cluster. And because we have set the replication factor to 3, it’s automatically available on all three nodes of our cluster. Replication status is all green, which means up and running. This points to a very healthy installation of Portworx, as well as of the PostgreSQL database.

In the next video, we’re going to demonstrate high availability by deleting a running pod and recreating it with all the data intact. Thanks for watching.