How to Run HA PostgreSQL on GKE

This post is part of our ongoing series on running PostgreSQL on Kubernetes. We’ve published a number of articles about running PostgreSQL on Kubernetes for specific platforms and for specific use cases. If you are looking for a specific Kubernetes platform, check out these related articles.

Running HA PostgreSQL on Amazon Elastic Container Service for Kubernetes (EKS)

Running HA PostgreSQL on Azure Kubernetes Service (AKS)

Running HA PostgreSQL on Red Hat OpenShift

Running HA PostgreSQL on IBM Cloud Kubernetes Service (IKS)

Running HA PostgreSQL on IBM Cloud Private

Running HA PostgreSQL with Rancher Kubernetes Engine (RKE)

And now, onto the post…

Google Kubernetes Engine (GKE) is a managed, production-ready environment for deploying containerized applications in Google Cloud Platform. Launched in 2015, GKE is one of the first hosted container platforms, and it builds on what Google learned from running services like Gmail and YouTube in containers for over 12 years. GKE lets customers get up and running with Kubernetes quickly by removing the need to install, manage, and operate their own Kubernetes clusters.

Portworx is a cloud-native storage platform to run persistent workloads deployed on a variety of orchestration engines including Kubernetes. With Portworx, customers can manage the database of their choice on any infrastructure using any container scheduler. It provides a single data management layer for all stateful services, no matter where they run.

This tutorial is a walk-through of the steps involved in deploying and managing a highly available PostgreSQL cluster on Google Kubernetes Engine.

In summary, to run HA PostgreSQL on Google Cloud Platform you need to:

Launch a GKE cluster
Install a cloud native storage solution like Portworx as a DaemonSet on GKE
Create a storage class defining your storage requirements like replication factor, snapshot policy, and performance profile
Deploy Postgres using Kubernetes
Test failover by killing or cordoning a node in your cluster

How to launch a GKE cluster

When launching a GKE cluster to run Portworx, you need to ensure that the cluster is based on Ubuntu. Due to certain restrictions with GKE clusters based on Container-Optimized OS (COS), Portworx requires Ubuntu as the base image for the GKE Nodes.

The following command creates a 3-node GKE cluster in zone asia-south1-a. You can modify the parameters to suit your environment.

$ gcloud container clusters create "gke-px" \
--zone "asia-south1-a" \
--username "admin" \
--cluster-version "1.8.10-gke.0" \
--machine-type "n1-standard-4" \
--image-type "UBUNTU" \
--disk-type "pd-ssd" \
--disk-size "100" \
--num-nodes "3" \
--enable-cloud-logging \
--enable-cloud-monitoring \
--network "default" \
--addons HorizontalPodAutoscaling,HttpLoadBalancing,KubernetesDashboard

Once the cluster is ready, configure the kubectl CLI with the following command:

$ gcloud container clusters get-credentials gke-px --zone asia-south1-a
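Because Portworx needs Ubuntu-based nodes, it can be worth confirming the node image before going further. The following is a minimal sketch rather than part of the original walkthrough: the function names node_os_images and all_nodes_ubuntu are made up for illustration, and it assumes kubectl is already pointed at the new cluster.

```shell
# Prints the OS image reported by every node, one per line.
node_os_images() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.nodeInfo.osImage}{"\n"}{end}'
}

# Succeeds only if every node reports an Ubuntu image.
all_nodes_ubuntu() {
  images=$(node_os_images) || return 1
  [ -n "$images" ] || return 1
  # grep -v selects lines that do NOT mention Ubuntu; any such line is a failure.
  if printf '%s\n' "$images" | grep -qiv 'ubuntu'; then
    return 1
  fi
  return 0
}

# Usage:
#   all_nodes_ubuntu && echo "all nodes run Ubuntu" || echo "found a non-Ubuntu node"
```

If the check fails, the cluster was probably created with a different --image-type and would need to be recreated with UBUNTU.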

Portworx requires a ClusterRoleBinding for your user. Without it, the Portworx installation fails with the error clusterroles.rbac.authorization.k8s.io "portworx-pvc-controller-role" is forbidden.

Let’s create a ClusterRoleBinding with the following command:

$ kubectl create clusterrolebinding cluster-admin-binding \
--clusterrole cluster-admin \
--user $(gcloud config get-value account)

Installing Portworx in GKE

Installing Portworx on GKE is not very different from installing it on any other Kubernetes cluster. The Portworx GKE documentation has the steps involved in running a Portworx cluster in a Kubernetes environment deployed on Google Cloud Platform.

Once the GKE cluster is up and running, and Portworx is installed and configured, we will deploy a highly available PostgreSQL database.

Creating a Postgres storage class in Kubernetes

Through Storage Class objects, an admin can define different classes of Portworx volumes that are offered in a cluster. These classes are used during the dynamic provisioning of volumes. The Storage Class defines the replication factor, IO profile (e.g. for a database or a CMS), and priority (e.g. SSD or HDD). These parameters affect the availability and throughput of the workload and can be specified for each volume. This is important because a production database will have different requirements than a development Jenkins cluster.

In this example, the Storage Class that we deploy has a replication factor of 3 with IO profile set to “db_remote”, and priority set to “high”. This means that the storage will be optimized for low latency database workloads like Postgres and automatically placed on the highest performance storage available in the cluster.

$ cat > px-repl3-sc.yaml <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
    name: px-repl3-sc
provisioner: kubernetes.io/portworx-volume
parameters:
   repl: "3"
   io_profile: "db_remote"
   priority_io: "high"
EOF

$ kubectl create -f px-repl3-sc.yaml
storageclass "px-repl3-sc" created

Creating a Postgres PVC

We can now create a Persistent Volume Claim (PVC) based on the Storage Class. Thanks to dynamic provisioning, the claim is fulfilled without explicitly provisioning a Persistent Volume (PV).

$ cat > px-postgres-pvc.yaml <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
   name: px-postgres-pvc
   annotations:
     volume.beta.kubernetes.io/storage-class: px-repl3-sc
spec:
   accessModes:
     - ReadWriteOnce
   resources:
     requests:
       storage: 1Gi
EOF

$ kubectl create -f px-postgres-pvc.yaml
persistentvolumeclaim "px-postgres-pvc" created
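With dynamic provisioning, the claim should move from Pending to Bound once Portworx has created the backing volume. As a rough sketch, assuming kubectl is configured for the cluster, a small polling loop can wait for that transition. The helper names pvc_phase and wait_for_pvc_bound are hypothetical and not part of any tool.

```shell
# Prints the current phase of a PVC (for example Pending or Bound).
pvc_phase() {
  kubectl get pvc "$1" -o jsonpath='{.status.phase}'
}

# Polls until the PVC is Bound, or gives up after a number of attempts.
# $1 = PVC name, $2 = max attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_pvc_bound() {
  name=$1
  attempts=${2:-30}
  interval=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(pvc_phase "$name")" = "Bound" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Usage:
#   wait_for_pvc_bound px-postgres-pvc || echo "PVC is still not Bound"
```

A claim that never reaches Bound usually points at the storage class or the Portworx installation, so it is worth checking before deploying Postgres on top of it.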

The password for PostgreSQL will be created as a secret. Run the following commands to create the secret in the correct format.

$ echo postgres123 > password.txt
$ tr -d '\n' < password.txt > .strippedpassword.txt && mv .strippedpassword.txt password.txt
$ kubectl create secret generic postgres-pass --from-file=password.txt
secret "postgres-pass" created
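The tr step matters because echo appends a newline, which would otherwise become part of the password stored in the secret. The sketch below illustrates the effect by counting bytes before and after stripping. It works in a scratch directory (an assumption made here so that no real password.txt is touched) and does not talk to the cluster.

```shell
# Work in a scratch directory so no real password.txt is overwritten.
workdir=$(mktemp -d)

# echo writes the 11-character password plus a trailing newline.
echo postgres123 > "$workdir/password.txt"
before=$(wc -c < "$workdir/password.txt" | tr -d ' ')

# Strip the newline the same way as in the tutorial.
tr -d '\n' < "$workdir/password.txt" > "$workdir/.strippedpassword.txt" \
  && mv "$workdir/.strippedpassword.txt" "$workdir/password.txt"
after=$(wc -c < "$workdir/password.txt" | tr -d ' ')

echo "bytes before: $before, after: $after"   # prints "bytes before: 12, after: 11"

rm -rf "$workdir"
```

Writing the file with printf '%s' postgres123 > password.txt would avoid the newline in the first place.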

How to deploy Postgres on GKE

Finally, let’s create a PostgreSQL instance as a Kubernetes Deployment object. For simplicity’s sake, we will deploy just a single Postgres pod. Because Portworx provides synchronous replication for High Availability, a single Postgres instance might be the best deployment option for your Postgres database. Portworx can also provide backing volumes for multi-node Postgres deployments. The choice is yours.

$ cat > postgres-app.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      schedulerName: stork
      containers:
      - name: postgres
        image: postgres:9.5
        imagePullPolicy: "Always"
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_USER
          value: pgbench
        - name: PGUSER
          value: pgbench
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-pass
              key: password.txt
        - name: PGBENCH_PASSWORD
          value: superpostgres
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: postgredb
      volumes:
      - name: postgredb
        persistentVolumeClaim:
          claimName: px-postgres-pvc
EOF

$ kubectl create -f postgres-app.yaml
deployment "postgres" created

Make sure that the Postgres pod reaches the Running state.

$ kubectl get pods -l app=postgres -o wide --watch

Wait until the STATUS column shows Running for the Postgres pod, then press Ctrl+C to stop watching.

We can inspect the Portworx volume with the pxctl tool, which runs inside the Portworx pod.

$ VOL=`kubectl get pvc | grep px-postgres-pvc | awk '{print $3}'`
$ PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
$ kubectl exec -it $PX_POD -n kube-system -- /opt/pwx/bin/pxctl volume inspect ${VOL}
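The grep and awk pipeline above depends on the column layout of kubectl get pvc. As an alternative sketch, the bound volume name can be read directly from the claim's spec.volumeName field with jsonpath. The function name pvc_volume_name is made up for this example.

```shell
# Prints the name of the PersistentVolume bound to a PVC.
# $1 = PVC name
pvc_volume_name() {
  kubectl get pvc "$1" -o jsonpath='{.spec.volumeName}'
}

# Usage, mirroring the commands above:
#   VOL=$(pvc_volume_name px-postgres-pvc)
#   kubectl exec -it $PX_POD -n kube-system -- /opt/pwx/bin/pxctl volume inspect ${VOL}
```

Reading the field by name keeps the lookup working even if the tabular output gains or reorders columns.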

The output from the above command confirms the creation of the volume that backs the PostgreSQL database instance.

Failing over PostgreSQL on GKE

Let’s populate the database with 5 million rows of sample data.

We will first find the pod that’s running PostgreSQL to access the shell.

$ POD=`kubectl get pods -l app=postgres | grep Running | grep 1/1 | awk '{print $1}'`
$ kubectl exec -it $POD -- bash

Now that we are inside the pod, we can connect to Postgres and create a database.

# psql
pgbench=# create database pxdemo;
pgbench=# \l
pgbench=# \q

By default, pgbench will create 4 tables (pgbench_branches, pgbench_tellers, pgbench_accounts, and pgbench_history) with 100,000 rows in the main pgbench_accounts table. This creates a simple 16MB database.

The -s option is used for multiplying the number of rows entered into each table. In the command below, we enter a “scaling” option of 50. This tells pgbench to create a database with 50 times the default size.

What this means is our pgbench_accounts table now has 5,000,000 records. It also means our database size is now 800MB (50 x 16MB).
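The scaling arithmetic can be checked with a few lines of shell. This is only an illustration of the numbers quoted above (100,000 rows and roughly 16MB per unit of scale). The variable names are arbitrary.

```shell
scale=50
rows_per_scale=100000   # pgbench_accounts rows at scale factor 1
mb_per_scale=16         # approximate database size in MB at scale factor 1

rows=$((scale * rows_per_scale))
size_mb=$((scale * mb_per_scale))

echo "scale=$scale -> $rows rows, ~${size_mb}MB"   # prints "scale=50 -> 5000000 rows, ~800MB"
```

Note that roughly 800MB of data leaves little headroom on the 1Gi volume requested in the PVC.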

# pgbench -i -s 50 pxdemo;

Wait for pgbench to finish populating the tables. After that’s done, let’s verify that the pgbench_accounts table is populated with 5 million rows.

# psql pxdemo
\dt
select count(*) from pgbench_accounts;
\q
exit
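To compare the row count before and after a failover without opening an interactive session, the same query can be run through kubectl exec. This is a sketch under the assumptions of this tutorial (database pxdemo, and the PGUSER variable set in the Deployment). The function name count_accounts is hypothetical.

```shell
# Prints the number of rows in pgbench_accounts as a bare number.
# $1 = name of the Postgres pod
# psql flags: -A unaligned output, -t rows only (no header or footer).
count_accounts() {
  kubectl exec "$1" -- psql -At -d pxdemo -c 'select count(*) from pgbench_accounts;'
}

# Usage:
#   BEFORE=$(count_accounts "$POD")
#   ... run the failover test, look up the new pod name ...
#   AFTER=$(count_accounts "$POD")
#   [ "$BEFORE" = "$AFTER" ] && echo "row count unchanged"
```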

Now, let’s simulate a node failure by cordoning off the node on which PostgreSQL is running.

$ NODE=`kubectl get pods -l app=postgres -o wide | grep -v NAME | awk '{print $7}'`
$ kubectl cordon ${NODE}

node "gke-gke-px-default-pool-ce0f28fc-tmxk" cordoned
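A cordoned node stays unschedulable until it is explicitly uncordoned, so it helps to keep track of which node was cordoned. The sketch below shows one way to look up the node with jsonpath instead of awk, plus a small wrapper around kubectl uncordon for cleanup after the test. Both function names are invented for this example.

```shell
# Prints the node that the first pod matching app=postgres is scheduled on.
postgres_node() {
  kubectl get pods -l app=postgres -o jsonpath='{.items[0].spec.nodeName}'
}

# Marks a node schedulable again once the failover test is finished.
# $1 = node name
restore_node() {
  kubectl uncordon "$1"
}

# Usage:
#   NODE=$(postgres_node)
#   kubectl cordon ${NODE}
#   ... run the failover test ...
#   restore_node ${NODE}
```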

Executing kubectl get nodes confirms that scheduling is disabled for one of the nodes.

$ kubectl get nodes

We will now go ahead and delete the PostgreSQL pod.

$ POD=`kubectl get pods -l app=postgres -o wide | grep -v NAME | awk '{print $1}'`
$ kubectl delete pod ${POD}

pod "postgres-6f754d4454-kssvd" deleted

As soon as the pod is deleted, the Deployment creates a replacement, which is scheduled on a node that holds a replica of the data. STorage ORchestrator for Kubernetes (STORK), Portworx’s custom storage scheduler, makes this possible by co-locating the pod with its data, so that an appropriate node is selected for scheduling the pod.

Let’s verify this by running the command below. Notice that a new pod has been created and scheduled on a different node.

$ kubectl get pods -l app=postgres

Let’s find the pod name and exec into the container.

$ POD=`kubectl get pods -l app=postgres | grep Running | grep 1/1 | awk '{print $1}'`
$ kubectl exec -it $POD -- bash

Now use psql to make sure our data is still there.

# psql pxdemo
pxdemo=# \dt
pxdemo=# select count(*) from pgbench_accounts;
pxdemo=# \q
# exit

Observe that the database table is still there and all of its content is intact!

Performing Storage Operations on Postgres

After testing end-to-end failover of the database, let’s perform StorageOps on our GKE cluster.

Expanding the Volume with no downtime

We will now run a bigger benchmark that runs out of space, to show how easy it is to add space to a volume dynamically.

Open a shell inside the container.

$ POD=`kubectl get pods -l app=postgres | grep Running | awk '{print $1}'`
$ kubectl exec -it $POD -- bash

Let’s use pgbench to run a baseline transaction benchmark, which will try to grow the volume to more than 1 GiB and fail.

# pgbench -c 10 -j 2 -t 10000 pxdemo
# exit

There may be multiple errors during the execution of the above command. The first error indicates that the pod is running out of space.

PANIC: could not write to file "pg_xlog/xlogtemp.73": No space left on device
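One way to see how close a filesystem is to full, from inside the pod or on any Linux host, is to parse the output of df. The helpers below are a minimal sketch with made-up names. They assume the POSIX df -P output format, where the fifth column is the used percentage, and the 90 percent default threshold is arbitrary.

```shell
# Prints the used-space percentage (without the % sign) of the filesystem holding a path.
# $1 = path, for example /var/lib/postgresql/data inside the Postgres pod
usage_percent() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds when usage is at or above a threshold (default 90 percent).
# $1 = path, $2 = threshold in percent
nearly_full() {
  [ "$(usage_percent "$1")" -ge "${2:-90}" ]
}

# Usage inside the pod:
#   nearly_full /var/lib/postgresql/data && echo "data volume is almost full"
```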

Since the Kubernetes version used here doesn’t support resizing a PVC after creation, we perform this operation directly on Portworx with the pxctl CLI tool.

Let’s get the volume name and inspect it through the pxctl tool.

If you have access, SSH into one of the nodes and run the following command.

$ VOL=`/opt/pwx/bin/pxctl volume list --label pvc=px-postgres-pvc | grep -v ID | awk '{print $1}'`
$ /opt/pwx/bin/pxctl volume inspect $VOL

Notice that the volume is within 10% of being full. Let’s expand it to 2 GiB using the following command.

$ /opt/pwx/bin/pxctl volume update $VOL --size=2


Update Volume: Volume update successful for volume 234738950250161130

Summary

Portworx can be easily deployed on Google Kubernetes Engine to run stateful workloads in production. Through the integration of STORK, DevOps and StorageOps teams can seamlessly run highly available database clusters in GKE. They can perform traditional operations such as volume expansion, backup, and recovery for cloud-native applications in an automated and efficient manner.