Portworx Guided Hands On-Labs. Register Now
This post is part of our ongoing series on running PostgreSQL on Kubernetes. We’ve published a number of articles about running PostgreSQL on Kubernetes for specific platforms and for specific use cases. If you are looking for a specific Kubernetes platform, check out these related articles.
Running HA PostgreSQL on Amazon Elastic Container Service for Kubernetes (EKS)
Running HA PostgreSQL on Google Kubernetes Engine (GKE)
Running HA PostgreSQL on Red Hat OpenShift
Running HA PostgreSQL on IBM Cloud Kubernetes Service (IKS)
Running HA PostgreSQL on IBM Cloud Private
Running HA PostgreSQL with Rancher Kubernetes Engine (RKE)
And now, onto the post…
We’ve been excited to partner with Microsoft, including enabling innovative customers like Beco to build an IoT cloud on Microsoft Azure. Working with the Azure Kubernetes Service (AKS) team and Brendan Burns, distinguished engineer at Microsoft, we’re excited to showcase how Portworx runs on AKS to provide seamless support for any Kubernetes customer. Brendan and Eric Han, our VP of Product, were both part of the original Kubernetes team at Google and it is exciting to watch Kubernetes mature and extend into the enterprise.
Today’s post will look at Azure Kubernetes Service (AKS), a managed Kubernetes offering from Microsoft, which makes it easy to create, configure, and manage a cluster of virtual machines that are preconfigured to run containerized applications.
Portworx, is a cloud-native storage platform to run persistent workloads deployed on a variety of orchestration engines including Kubernetes. With Portworx, customers can manage the database of their choice on any infrastructure using any container scheduler. It provides a single data management layer for all stateful services, no matter where they run.
This tutorial is a walk-through of the steps involved in deploying and managing a highly available PostgreSQL cluster on AKS.
In summary, to run HA PostgreSQL on AKS you need to:
- Create an AKS cluster following these instructions, which this guide refers to as a set of compute nodes
- Provision storage nodes with Managed Disks in order to allow for compute nodes to scale independently of storage
- Install cloud native storage solution like Portworx as a daemon set on AKS
- Create storage class defining your storage requirements like replication factor, snapshot policy, and performance profile
- Deploy PostgreSQL using Kubernetes
- Test failover by killing or cordoning node in your cluster and confirming that data is still accessible
- Optional: dynamically resize Postgres volume, snapshot and backup Postgres to Azure object storage
How to set up an AKS cluster
Portworx is fully supported on Azure Kubernetes Service. Run the following commands to configure a 3 node cluster in Europe West. More on Azure AKS is available here.
$ az group create --name px --location westeurope $ az aks install-cli $ az aks create --resource-group px --name pxdemo --node-count 3 --generate-ssh-keys $ az aks get-credentials --resource-group px --name pxdemo
When the cluster is ready, verify it with the following command:
$ kubectl get nodes NAME STATUS ROLES AGE VERSION aks-nodepool1-12748671-0 Ready agent 55m v1.9.6 aks-nodepool1-12748671-1 Ready agent 55m v1.9.6 aks-nodepool1-12748671-2 Ready agent 55m v1.9.6
Provision Azure Storage Nodes
Running storage nodes separately from compute nodes will allow us to independently scale compute and storage resources. We use Portworx to manage the storage nodes and also access storage from the compute nodes.
Follow these steps to create three storage node VMs. Afterwards, we attach the Managed Disk to the storage nodes in the Azure portal using these instructions. Finally, we ssh into the VM and install Portworx on each of the storage nodes using the commands below.
latest_stable=$(curl -fsSL 'https://install.portworx.com/1.4/?type=dock&stork=false' | awk '/image: / {print $2}') # Download OCI bits (reminder, you will still need to run `px-runc install ..` after this step) sudo docker run --entrypoint /runc-entry-point.sh \ --rm -i --privileged=true \ -v /opt/pwx:/opt/pwx -v /etc/pwx:/etc/pwx \ $latest_stable # Basic installation where sudo /opt/pwx/bin/px-runc install -c CLUSTER-NAME \ -k etcd://[etcd-service]:2379 \ -a -f # Reload systemd configurations, enable and start Portworx service sudo systemctl daemon-reload sudo systemctl enable portworx sudo systemctl start portworx
For the CLUSTER-NAME
, use the same string in each of your installs for that cluster.
Installing Portworx in AKS
Here is an example spec for deploying Portworx on Azure. Note that the [cluster-name]
should be replaced with cluster name provided to the storage nodes (in the prior step). Also the Kubernetes version used here is 1.9.6 and the etcd-service should be updated with an etcd endpoint.
$ kubectl apply -f portworx_azure_example_spec.yaml
After all the components of Portworx are installed, check for the Portworx Pods running running on each node as a DaemonSet. Example PX daemonset spec can be found here
$ kubectl get pods -n=kube-system -l name=portworx NAME READY STATUS RESTARTS AGE portworx-kw2kz 1/1 Running 0 43m portworx-pkgdf 1/1 Running 0 43m portworx-w6hkg 1/1 Running 0 43m
Get the first Pod name of the DaemonSet, and check the status of Portworx cluster.
$ PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
Notice that the total capacity available to Portworx is an aggregate of all the disks that we attached, which adds up to 600GiB.
Once the AKS cluster is up and running, and Portworx is installed and configured, we will deploy a highly available PostgreSQL database.
Creating a Postgres storage class in Kubernetes
Through Storage Class objects, an admin can define different classes of Portworx volumes that are offered in a cluster. These classes will be used during the dynamic provisioning of volumes. The Storage Class defines the replication factor, IO profile (e.g. for a database or a CMS), and priority (e.g. SSD or HDD). These parameters impact the availability and throughput of workload and can be specified for each volume. This is important because a production database will have different requirements than a development Jenkins cluster.
In this example, the Storage Class that we deploy has a replication factor of 3 with IO profile set to “db_remote”, and priority set to “high”. This means that the storage will be optimized for low latency database workloads like PostgreSQL and automatically placed on the highest performance storage available in the cluster.
$ cat > px-repl3-sc.yaml << EOF kind: StorageClass apiVersion: storage.k8s.io/v1beta1 metadata: name: px-repl3-sc provisioner: kubernetes.io/portworx-volume parameters: repl: "3" io_profile: "db_remote" priority_io: "high" EOF
$ kubectl create -f px-repl3-sc.yaml storageclass "px-repl3-sc" created
Creating a Postgres PVC on AKS
We can now create a Persistent Volume Claim (PVC) based on the Storage Class. Thanks to dynamic provisioning, the claims will be created without explicitly provisioning Persistent Volume (PV).
$ cat > px-postgres-pvc.yaml << EOF kind: PersistentVolumeClaim apiVersion: v1 metadata: name: px-postgres-pvc annotations: volume.beta.kubernetes.io/storage-class: px-repl3-sc spec: accessModes: - ReadWriteOnce resources: requests: storage: 1Gi EOF
$ kubectl create -f px-postgres-pvc.yaml persistentvolumeclaim "px-postgres-pvc" created
The password for PostgreSQL will be created as a secret. Run the following commands to create the secret in the correct format.
$ echo postgres123 > password.txt $ tr -d '\n' .strippedpassword.txt && mv .strippedpassword.txt password.txt $ kubectl create secret generic postgres-pass --from-file=password.txt secret "postgres-pass" created
Deploying Postgres on AKS
Finally, let’s create PostgreSQL instance as a Kubernetes deployment object. For simplicity sake, we will just be deploying a single Postgres pod. Because Portworx provides synchronous replication for High Availability, a single Postgres instance might be the best deployment option for your Postgres database. Portworx can also provide backing volumes for multi-node Postgres deployments. The choice is yours.
$ cat > postgres-app.yaml << EOF apiVersion: apps/v1 kind: Deployment metadata: name: postgres spec: strategy: rollingUpdate: maxSurge: 1 maxUnavailable: 1 type: RollingUpdate replicas: 1 selector: matchLabels: app: postgres template: metadata: labels: app: postgres spec: schedulerName: stork containers: - name: postgres image: postgres:9.5 imagePullPolicy: "Always" ports: - containerPort: 5432 env: - name: POSTGRES_USER value: pgbench - name: PGUSER value: pgbench - name: POSTGRES_PASSWORD valueFrom: secretKeyRef: name: postgres-pass key: password.txt - name: PGBENCH_PASSWORD value: superpostgres - name: PGDATA value: /var/lib/postgresql/data/pgdata volumeMounts: - mountPath: /var/lib/postgresql/data name: postgredb volumes: - name: postgredb persistentVolumeClaim: claimName: px-postgres-pvc EOF
$ kubectl create -f postgres-app.yaml deployment "postgres" created
Wait till the Postgres pod is in running state.
We can inspect the Portworx volume by accessing the pxctl tool running with the Postgres pod.
$ VOL=`kubectl get pvc | grep px-postgres-pvc | awk '{print $3}'` $ PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}') $ kubectl exec -it $PX_POD -n kube-system -- /opt/pwx/bin/pxctl volume inspect ${VOL}
Failing over PostgreSQL on AKS
Let’s populate the database will 5 million rows of sample data.
We will first find the pod that’s running PostgreSQL to access the shell.
$ POD=`kubectl get pods -l app=postgres | grep Running | grep 1/1 | awk '{print $1}'` $ kubectl exec -it $POD bash
Now that we are inside the pod, we can connect to Postgres and create a database.
# psql pgbench=# c\q pgbench=# \l pgbench=# \q
By default, pgbench will create 4 tables (pgbench_branches, pgbench_tellers, pgbench_accounts, and pgbench_history) with 100,000 rows in the main pgbench_accounts table. This creates a simple 16MB database.
The -s option is used for multiplying the number of rows entered into each table. In the command below, we enter a “scaling” option of 50. This tells pgbench to create a database with 50 times the default size.
What this means is our pgbench_accounts table now has 5,000,000 records. It also means our database size is now 800MB (50 x 16MB).
# pgbench -i -s 50 pxdemo;
Wait for pgbench to finish populating the table. After that’s done, let’s verify that the pgbench_accounts is populated by 5 million rows.
# psql pxdemo \dt select count(*) from pgbench_accounts; \q exit
Now, let’s simulate the node failure by cordoning off the node on which PostgreSQL is running.
$ NODE=`kubectl get pods -l app=postgres -o wide | grep -v NAME | awk '{print $7}'` $ kubectl cordon ${NODE} node "ip-172-20-57-55.ap-southeast-1.compute.internal" cordoned
Executing kubectl get nodes confirms that scheduling is disabled for one of the nodes.
$ kubectl get nodes
We will now go ahead and delete the PostgreSQL pod.
$ POD=`kubectl get pods -l app=postgres -o wide | grep -v NAME | awk '{print $1}'` $ kubectl delete pod ${POD} pod "postgres-556994cbd4-b6ghn" deleted
As soon as the pod is deleted, it is relocated to the node with the replicated data. STorage ORchestrator for Kubernetes (STORK), Portworx’s custom storage scheduler allows co-locating the pod on the exact node where the data is stored. It ensures that an appropriate node is selected for scheduling the pod.
Let’s verify this by running the below command. We will notice that a new pod has been created and scheduled in a different node.
$ kubectl get pods -l app=postgres
Let’s uncordon the node to bring it back to action.
$ kubectl uncordon ${NODE} node "ip-172-20-57-55.ap-southeast-1.compute.internal" uncordoned
Finally, let’s verify that the data is still available.
Let’s find the pod name and exec into the container.
$ POD=`kubectl get pods -l app=postgres | grep Running | grep 1/1 | awk '{print $1}'` $ kubectl exec -it $POD bash
Now use psql to make sure our data is still there.
# psql pxdemo pxdemo=# \dt pxdemo=# select count(*) from pgbench_accounts; pxdemo=# \q pxdemo=# exit
Observe that the database table is still there and all the content intact!
Performing Storage Operations on Postgres on AKS
After testing end-to-end failover of the database, let’s perform StorageOps on our AKS cluster.
Expanding the Volume with no downtime
We will now run a bigger benchmark to run out of space to show how easy it is to add space to a volume dynamically.
Open a shell inside the container.
$ POD=`kubectl get pods -l app=postgres | grep Running | awk '{print $1}'` $ kubectl exec -it $POD bash
Let’s use pgbench to run a baseline transaction benchmark which will try to grow the volume to more than 1 Gib and fail.
$ pgbench -c 10 -j 2 -t 10000 pxdemo $ exit
There may be multiple errors during the execution of the above command. The first error indicates that Pod is running out of space.
PANIC: could not write to file “pg_xlog/xlogtemp.73”: No space left on device
Since Kubernetes doesn’t support modifying the PVC after creation, we perform this operation directly on Portworx with the pxctl cli tool.
Let’s get the volume name and inspect it through the pxctl tool.
If you have access, SSH into one of the nodes and run the following command.
POD=`/opt/pwx/bin/pxctl volume list --label pvc=px-postgres-pvc | grep -v ID | awk '{print $1}'` $ /opt/pwx/bin/pxctl v i $POD
Notice that the volume is within 10% of being full. Let’s expand it using the following command.
$ /opt/pwx/bin/pxctl volume update $POD --size=2 Update Volume: Volume update successful for volume 834897770479704521
Taking Snapshots of a Kubernetes volume and restoring the Postgres database
Portworx supports creating snapshots for Kubernetes PVCs.
Let’s create a snapshot for the PVC we created for Postgres.
$ cat > px-snap.yaml <<EOF apiVersion: volumesnapshot.external-storage.k8s.io/v1 kind: VolumeSnapshot metadata: name: px-postgres-snapshot namespace: default spec: persistentVolumeClaimName: px-postgres-pvc EOF
kubectl create -f px-snap.yaml volumesnapshot "px-postgres-snapshot" created
You can see all the snapshots using the below command.
$ kubectl get volumesnapshot,volumesnapshotdata
With the snapshot in place, let’s go ahead and delete the database.
$ POD=`kubectl get pods -l app=postgres | grep Running | grep 1/1 | awk '{print $1}'` $ kubectl exec -it $POD bash $ psql drop database pxdemo; \l \q exit
Since snapshots are just like volumes, we can use it to start a new instance of PostgresSQL. Let’s create a new instance of PostgresSQL by restoring the snapshot data.
$ cat > px-snap-pvc.yaml <<EOF apiVersion: v1 kind: PersistentVolumeClaim metadata: name: px-postgres-snap-clone annotations: snapshot.alpha.kubernetes.io/snapshot: px-postgres-snapshot spec: accessModes: - ReadWriteOnce storageClassName: stork-snapshot-sc resources: requests: storage: 2Gi EOF
kubectl create -f px-snap.yaml persistentvolumeclaim "px-postgres-snap-clone" created
From the new PVC, we will create a PostgreSQL pod.
$ cat > postgres-app-restore.yaml << EOF apiVersion: apps/v1 kind: Deployment metadata: name: postgres-snap spec: strategy: rollingUpdate: maxSurge: 1 maxUnavailable: 1 type: RollingUpdate replicas: 1 selector: matchLabels: app: postgres-snap template: metadata: labels: app: postgres-snap spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: px/running operator: NotIn values: - "false" - key: px/enabled operator: NotIn values: - "false" containers: - name: postgres image: postgres:9.5 imagePullPolicy: "IfNotPresent" ports: - containerPort: 5432 env: - name: POSTGRES_USER value: pgbench - name: PGUSER value: pgbench - name: POSTGRES_PASSWORD valueFrom: secretKeyRef: name: postgres-pass key: password.txt - name: PGBENCH_PASSWORD value: superpostgres - name: PGDATA value: /var/lib/postgresql/data/pgdata volumeMounts: - mountPath: /var/lib/postgresql/data name: postgredb volumes: - name: postgredb persistentVolumeClaim: claimName: px-postgres-snap-clone EOF
kubectl create -f postgres-app-restore.yaml deployment "postgres-snap" created
Verify that the new pod is in running state
$ kubectl get pods -l app=postgres-snap
Finally, let’s access the data created by the benchmark tool earlier in the walkthrough.
$ POD=`kubectl get pods -l app=postgres-snap | grep Running | grep 1/1 | awk '{print $1}'` $ kubectl exec -it $POD bash $ psql pxdemo \dt select count(*) from pgbench_accounts; \q exit
Notice that the table is still there with the data intact.
Summary
Portworx can be easily deployed on Azure Kubernetes Service to run stateful workloads in productions. Through the integration of Portworx and AKS, DevOps and DataOps teams can seamlessly run highly available database clusters in AKS. They can perform traditional operations such as volume expansion, snapshots, backup and recovery for the cloud-native applications.
Share
Subscribe for Updates
About Us
Portworx is the leader in cloud native storage for containers.
Thanks for subscribing!
Janakiram MSV
Contributor | Certified Kubernetes Administrator (CKA) and Developer (CKAD)Explore Related Content:
- aks
- azure
- postgresql