This post is part of our ongoing series on running MongoDB on Kubernetes. We’ve published a number of articles about running MongoDB on Kubernetes for specific platforms and for specific use cases. If you are looking for a specific Kubernetes platform, check out these related articles.
Running HA MongoDB on Azure Kubernetes Service (AKS)
Running HA MongoDB on Amazon Elastic Container Service for Kubernetes (EKS)
Running HA MongoDB on Google Kubernetes Engine (GKE)
Running HA MongoDB on IBM Cloud Kubernetes Service (IKS)
Running HA MongoDB on IBM Cloud Private
Running HA MongoDB with Rancher Kubernetes Engine (RKE)
Failover MongoDB 300% faster and run only 1/3 the pods
And now, onto the post…
Red Hat OpenShift is a comprehensive enterprise-grade application platform built for containers powered by Kubernetes. OpenShift lets developers quickly build, develop, and deploy applications on nearly any infrastructure, public or private.
OpenShift comes in four flavors – OpenShift Origin, OpenShift Online, OpenShift Container Platform, and OpenShift Dedicated. OpenShift Origin is the upstream, open source version which can be installed on Fedora, CentOS or Red Hat Enterprise Linux. OpenShift Online is the hosted version of the platform managed by Red Hat. OpenShift Container Platform is the enterprise offering that can be deployed in the public cloud or within an enterprise data center. OpenShift Dedicated is a single-tenant, highly-available cluster running in the public cloud.
For this walk-through, we are using a cluster running OpenShift Origin.
Portworx is a cloud-native storage platform for running persistent workloads deployed on a variety of orchestration engines, including Kubernetes. With Portworx, customers can manage the database of their choice on any infrastructure using any container scheduler. It provides a single data management layer for all stateful services, no matter where they run.
Portworx is Red Hat certified for Red Hat OpenShift Container Platform and PX-Enterprise is available in the Red Hat Container Catalog. This certification enables enterprises to confidently run high-performance stateful applications like databases, big and fast data workloads, and machine learning applications on the Red Hat OpenShift Container Platform. Learn more about Portworx & OpenShift in our Product Brief.
This tutorial is a walk-through of the steps involved in deploying and managing a highly available MongoDB database on OpenShift.
In summary, to run HA MongoDB on OpenShift you need to:
- Create an OpenShift cluster running at least three nodes
- Install a cloud native storage solution like Portworx as a daemon set on OpenShift
- Create a storage class defining your storage requirements like replication factor, snapshot policy, and performance profile
- Deploy MongoDB using Kubernetes
- Test failover by killing or cordoning a node in your cluster and confirming that data is still accessible
- Dynamically resize MongoDB volume
- Take a snapshot and backup MongoDB to object storage
How to install and configure an OpenShift Origin cluster
OpenShift Origin can be deployed in a variety of environments, ranging from VirtualBox to a public cloud IaaS such as Amazon, Google, or Azure. Refer to the official installation guide for the steps involved in setting up your own cluster. For this guide, we run an OpenShift Origin cluster in Microsoft Azure, following the instructions in the Azure documentation.
Your OpenShift cluster setup should look similar to the configuration below. It is recommended that you run at least three nodes for the HA configuration.
$ oc get nodes
NAME                 STATUS    ROLES     AGE       VERSION
mycluster-infra-0    Ready     none      1d        v1.9.1+a0ce1bc657
mycluster-master-0   Ready     master    1d        v1.9.1+a0ce1bc657
mycluster-node-0     Ready     compute   1d        v1.9.1+a0ce1bc657
mycluster-node-1     Ready     compute   1d        v1.9.1+a0ce1bc657
mycluster-node-2     Ready     compute   1d        v1.9.1+a0ce1bc657
Though almost all the steps can be performed through the OpenShift Console, we are using the oc CLI. Please note that most kubectl commands are available through the oc tool, so you may find the two used interchangeably.
Installing Portworx on OpenShift
Since OpenShift is based on Kubernetes, the steps involved in installing Portworx are not very different from the standard Kubernetes installation. Portworx documentation has a detailed guide with the prerequisites and all the steps to install on OpenShift.
Before proceeding further, ensure that Portworx is up and running on OpenShift.
$ oc get pods -n=kube-system -l name=portworx
We can check the status of Portworx by running the following commands:
$ PX_POD=$(oc get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
$ oc exec -it $PX_POD -n kube-system -- /opt/pwx/bin/pxctl status
Once the OpenShift Origin cluster is up and running, and Portworx is installed and configured, we will deploy a highly available MongoDB database.
Creating a storage class for MongoDB
Through storage class objects, an admin can define different classes of Portworx volumes that are offered in a cluster. These classes will be used during the dynamic provisioning of volumes. The storage class defines the replication factor, IO profile (e.g. for a database or a CMS), and priority (e.g. SSD or HDD). These parameters impact the availability and throughput of a workload and can be specified for each volume. This is important because a production database will have different requirements than a development Jenkins cluster.
In this example, the storage class that we deploy has a replication factor of 3, with the I/O profile set to “db_remote,” and priority set to “high.” This means that the storage will be optimized for low-latency database workloads like MongoDB and automatically placed on the highest-performance storage available in the cluster. Notice that we also specify the filesystem, xfs, in the storage class.
$ cat > px-mongo-sc.yaml << EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: px-ha-sc
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "3"
  io_profile: "db_remote"
  priority_io: "high"
  fs: "xfs"
EOF
Create the storage class and verify it is available in the default namespace.
$ oc create -f px-mongo-sc.yaml
storageclass.storage.k8s.io "px-ha-sc" created

$ oc get sc
NAME                PROVISIONER                     AGE
generic (default)   kubernetes.io/azure-disk        52m
px-ha-sc            kubernetes.io/portworx-volume   13s
stork-snapshot-sc   stork-snapshot                  17m
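Optionally, you can describe the storage class to double-check the parameters that will be applied to dynamically provisioned volumes; this quick check is our addition to the original flow:

$ oc describe sc px-ha-sc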
Creating a MongoDB PVC on OpenShift
We can now create a Persistent Volume Claim (PVC) based on the Storage Class. Thanks to dynamic provisioning, the claims will be created without explicitly provisioning a persistent volume (PV).
$ cat > px-mongo-pvc.yaml << EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: px-mongo-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: px-ha-sc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

$ oc create -f px-mongo-pvc.yaml
persistentvolumeclaim "px-mongo-pvc" created

$ oc get pvc
NAME           STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
px-mongo-pvc   Bound     pvc-4a43eaca-999f-11e8-9135-000d3a1a1cdf   1Gi        RWO            px-ha-sc       15s
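Here the claim bound immediately. If a claim ever stays in the Pending state, describing it surfaces the provisioning events; this is an optional troubleshooting step, not part of the original walkthrough:

$ oc describe pvc px-mongo-pvc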
Deploying MongoDB on OpenShift
Finally, let’s create a MongoDB instance as a Kubernetes deployment object. For simplicity’s sake, we will just be deploying a single Mongo pod. Because Portworx provides synchronous replication for High Availability, a single MongoDB instance might be the best deployment option for your MongoDB database. Portworx can also provide backing volumes for multi-node MongoDB replica sets. The choice is yours.
$ cat > px-mongo-app.yaml << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo
spec:
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      schedulerName: stork
      containers:
      - name: mongo
        image: mongo
        imagePullPolicy: "Always"
        ports:
        - containerPort: 27017
        volumeMounts:
        - mountPath: /data/db
          name: mongodb
      volumes:
      - name: mongodb
        persistentVolumeClaim:
          claimName: px-mongo-pvc
EOF
$ oc create -f px-mongo-app.yaml
deployment.extensions "mongo" created
The MongoDB deployment defined above is explicitly associated with the PVC, px-mongo-pvc, created in the previous step.
This deployment creates a single pod running MongoDB backed by Portworx.
$ oc get pods -l app=mongo
NAME                    READY     STATUS    RESTARTS   AGE
mongo-97b758c4c-d845d   1/1       Running   0          1m
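In this walkthrough we will interact with MongoDB by exec'ing directly into the pod. If other applications in the cluster needed to reach the database, you would typically front the deployment with a Service. Below is a minimal sketch; the Service name mongo is our own choice and not something the original deployment defines:

$ cat > px-mongo-svc.yaml << EOF
apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  selector:
    app: mongo          # matches the label on the MongoDB pod template
  ports:
  - port: 27017         # port the Service listens on
    targetPort: 27017   # containerPort of the mongo container
EOF
$ oc create -f px-mongo-svc.yaml

In-cluster clients could then connect using mongodb://mongo.default.svc.cluster.local:27017.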
We can inspect the Portworx volume by accessing the pxctl tool running within the Portworx pod.
$ VOL=`oc get pvc | grep px-mongo-pvc | awk '{print $3}'`
$ PX_POD=$(oc get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
$ oc exec -it $PX_POD -n kube-system -- /opt/pwx/bin/pxctl volume inspect ${VOL}
Volume          :  270718527425014856
Name            :  pvc-4a43eaca-999f-11e8-9135-000d3a1a1cdf
Size            :  1.0 GiB
Format          :  xfs
HA              :  3
IO Priority     :  LOW
Creation time   :  Aug 6 17:36:46 UTC 2018
Shared          :  no
Status          :  up
State           :  Attached: mycluster-node-2 (10.2.0.4)
Device Path     :  /dev/pxd/pxd270718527425014856
Labels          :  pvc=px-mongo-pvc
Reads           :  130
Reads MS        :  249
Bytes Read      :  2326528
Writes          :  108
Writes MS       :  189
Bytes Written   :  2453504
IOs in progress :  0
Bytes used      :  10 MiB
Replica sets on nodes:
    Set 0
      Node : 10.2.0.6 (Pool 0)
      Node : 10.2.0.5 (Pool 0)
      Node : 10.2.0.4 (Pool 0)
Replication Status :  Up
Volume consumers :
    - Name          : mongo-97b758c4c-vlhrr (5ff99af2-999f-11e8-9135-000d3a1a1cdf) (Pod)
      Namespace     : default
      Running on    : mycluster-node-2
      Controlled by : mongo-97b758c4c (ReplicaSet)
The output from the above command confirms the creation of the volume backing the MongoDB database instance.
Failing over MongoDB pod on OpenShift
Populating sample data
Let’s populate the database with some sample data.
We will first find the pod that’s running MongoDB to access the shell.
$ POD=`oc get pods -l app=mongo | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD mongo
MongoDB shell version v4.0.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.0
Welcome to the MongoDB shell.
…..
Now that we are inside the shell, we can populate a collection.
db.ships.insert({name:'USS Enterprise-D',operator:'Starfleet',type:'Explorer',class:'Galaxy',crew:750,codes:[10,11,12]})
db.ships.insert({name:'USS Prometheus',operator:'Starfleet',class:'Prometheus',crew:4,codes:[1,14,17]})
db.ships.insert({name:'USS Defiant',operator:'Starfleet',class:'Defiant',crew:50,codes:[10,17,19]})
db.ships.insert({name:'IKS Buruk',operator:' Klingon Empire',class:'Warship',crew:40,codes:[100,110,120]})
db.ships.insert({name:'IKS Somraw',operator:' Klingon Empire',class:'Raptor',crew:50,codes:[101,111,120]})
db.ships.insert({name:'Scimitar',operator:'Romulan Star Empire',type:'Warbird',class:'Warbird',crew:25,codes:[201,211,220]})
db.ships.insert({name:'Narada',operator:'Romulan Star Empire',type:'Warbird',class:'Warbird',crew:65,codes:[251,251,220]})
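As a quick sanity check, we can confirm that all seven documents were inserted; count() is a standard mongo shell helper, and this step is our addition to the original flow:

db.ships.count()
7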
Let’s run a few queries on the Mongo collection.
Find one arbitrary document:
db.ships.findOne()
{
  "_id" : ObjectId("5b5c16221108c314d4c000cd"),
  "name" : "USS Enterprise-D",
  "operator" : "Starfleet",
  "type" : "Explorer",
  "class" : "Galaxy",
  "crew" : 750,
  "codes" : [ 10, 11, 12 ]
}
Find all documents, using nice formatting:
db.ships.find().pretty()
…..
{
  "_id" : ObjectId("5b5c16221108c314d4c000d1"),
  "name" : "IKS Somraw",
  "operator" : " Klingon Empire",
  "class" : "Raptor",
  "crew" : 50,
  "codes" : [ 101, 111, 120 ]
}
{
  "_id" : ObjectId("5b5c16221108c314d4c000d2"),
  "name" : "Scimitar",
  "operator" : "Romulan Star Empire",
  "type" : "Warbird",
  "class" : "Warbird",
  "crew" : 25,
  "codes" : [ 201, 211, 220 ]
}
…..
Show only the names of the ships:
db.ships.find({}, {name:true, _id:false})
{ "name" : "USS Enterprise-D" }
{ "name" : "USS Prometheus" }
{ "name" : "USS Defiant" }
{ "name" : "IKS Buruk" }
{ "name" : "IKS Somraw" }
{ "name" : "Scimitar" }
{ "name" : "Narada" }
Find one document by attribute:
db.ships.findOne({'name':'USS Defiant'})
{
  "_id" : ObjectId("5b5c16221108c314d4c000cf"),
  "name" : "USS Defiant",
  "operator" : "Starfleet",
  "class" : "Defiant",
  "crew" : 50,
  "codes" : [ 10, 17, 19 ]
}
Exit from the client shell to return to the host.
Simulating node failure
Now, let’s simulate node failure by cordoning off the OpenShift node on which MongoDB is running.
$ NODE=`oc get pods -l app=mongo -o wide | grep -v NAME | awk '{print $7}'`
$ oc adm cordon ${NODE}
node "mycluster-node-1" cordoned
The above command disabled scheduling on one of the nodes.
$ oc get nodes
NAME                 STATUS                     ROLES     AGE       VERSION
mycluster-infra-0    Ready                      none      1h        v1.9.1+a0ce1bc657
mycluster-master-0   Ready                      master    1h        v1.9.1+a0ce1bc657
mycluster-node-0     Ready                      compute   1h        v1.9.1+a0ce1bc657
mycluster-node-1     Ready,SchedulingDisabled   compute   1h        v1.9.1+a0ce1bc657
mycluster-node-2     Ready                      compute   1h        v1.9.1+a0ce1bc657
Now, let’s go ahead and delete the MongoDB pod.
$ POD=`oc get pods -l app=mongo -o wide | grep -v NAME | awk '{print $1}'`
$ oc delete pod ${POD}
pod "mongo-97b758c4c-d845d" deleted
As soon as the pod is deleted, it is relocated to a node with the replicated data. STORK (STorage ORchestrator for Kubernetes), Portworx's custom storage scheduler, co-locates the pod on the exact node where the data is stored, ensuring that an appropriate node is selected for scheduling the pod.
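If you want to confirm that STORK is deployed, you can look for its pods in the kube-system namespace; the label below matches the standard Portworx install spec, though it may differ in your environment:

$ oc get pods -n kube-system -l name=stork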
Let's verify the failover by running the command below. We will notice that a new pod has been created and scheduled on a different node.
$ oc get pods -l app=mongo -o wide
NAME                    READY     STATUS    RESTARTS   AGE       IP           NODE
mongo-97b758c4c-sssfg   1/1       Running   0          18s       10.129.0.7   mycluster-node-2
Let’s uncordon the node to bring it back to action.
$ oc adm uncordon ${NODE}
node "mycluster-node-1" uncordoned
Finally, let’s verify that the data is still available.
Verifying that the data is intact
Let’s find the pod name and run the ‘exec’ command, and then access the Mongo shell.
$ POD=`oc get pods -l app=mongo | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD mongo
MongoDB shell version v4.0.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.0
Welcome to the MongoDB shell.
…..
We will query the collection to verify that the data is intact.
Find one arbitrary document:
db.ships.findOne()
{
  "_id" : ObjectId("5b5c16221108c314d4c000cd"),
  "name" : "USS Enterprise-D",
  "operator" : "Starfleet",
  "type" : "Explorer",
  "class" : "Galaxy",
  "crew" : 750,
  "codes" : [ 10, 11, 12 ]
}
Find all documents, using nice formatting:
db.ships.find().pretty()
…..
{
  "_id" : ObjectId("5b5c16221108c314d4c000d1"),
  "name" : "IKS Somraw",
  "operator" : " Klingon Empire",
  "class" : "Raptor",
  "crew" : 50,
  "codes" : [ 101, 111, 120 ]
}
{
  "_id" : ObjectId("5b5c16221108c314d4c000d2"),
  "name" : "Scimitar",
  "operator" : "Romulan Star Empire",
  "type" : "Warbird",
  "class" : "Warbird",
  "crew" : 25,
  "codes" : [ 201, 211, 220 ]
}
…..
Show only the names of the ships:
db.ships.find({}, {name:true, _id:false})
{ "name" : "USS Enterprise-D" }
{ "name" : "USS Prometheus" }
{ "name" : "USS Defiant" }
{ "name" : "IKS Buruk" }
{ "name" : "IKS Somraw" }
{ "name" : "Scimitar" }
{ "name" : "Narada" }
Find one document by attribute:
db.ships.findOne({'name':'Narada'})
{
  "_id" : ObjectId("5b5c16221108c314d4c000d3"),
  "name" : "Narada",
  "operator" : "Romulan Star Empire",
  "type" : "Warbird",
  "class" : "Warbird",
  "crew" : 65,
  "codes" : [ 251, 251, 220 ]
}
Observe that the MongoDB collection is still there and all the content is intact! Exit from the client shell to return to the host.
Performing Storage Operations on MongoDB
After testing end-to-end failover of the database, let’s perform StorageOps on our OpenShift cluster.
Expanding the OpenShift Volume with no downtime
Currently, the Portworx volume that we created at the beginning is 1GiB in size. We will now expand it to double the storage capacity.
First, let’s get the volume name and inspect it through the pxctl tool.
If you have access, SSH into one of the nodes and run the following command.
$ VOL=`/opt/pwx/bin/pxctl volume list --label pvc=px-mongo-pvc | grep -v ID | awk '{print $1}'`
$ /opt/pwx/bin/pxctl v i $VOL
Volume          :  270718527425014856
Name            :  pvc-4a43eaca-999f-11e8-9135-000d3a1a1cdf
Size            :  1.0 GiB
Format          :  xfs
HA              :  3
IO Priority     :  LOW
Creation time   :  Aug 6 17:36:46 UTC 2018
Shared          :  no
Status          :  up
State           :  Attached: mycluster-node-2 (10.2.0.4)
Device Path     :  /dev/pxd/pxd270718527425014856
Labels          :  pvc=px-mongo-pvc
Reads           :  130
Reads MS        :  249
Bytes Read      :  2326528
Writes          :  108
Writes MS       :  189
Bytes Written   :  2453504
IOs in progress :  0
Bytes used      :  10 MiB
Replica sets on nodes:
    Set 0
      Node : 10.2.0.6 (Pool 0)
      Node : 10.2.0.5 (Pool 0)
      Node : 10.2.0.4 (Pool 0)
Replication Status :  Up
Volume consumers :
    - Name          : mongo-97b758c4c-vlhrr (5ff99af2-999f-11e8-9135-000d3a1a1cdf) (Pod)
      Namespace     : default
      Running on    : mycluster-node-2
      Controlled by : mongo-97b758c4c (ReplicaSet)
Notice the current Portworx volume. It is 1GiB. Let’s expand it to 2GiB.
$ /opt/pwx/bin/pxctl volume update $VOL --size=2
Update Volume: Volume update successful for volume 270718527425014856
Check the new volume size. It is expanded to 2GiB.
$ /opt/pwx/bin/pxctl v i $VOL
Volume          :  270718527425014856
Name            :  pvc-4a43eaca-999f-11e8-9135-000d3a1a1cdf
Size            :  2.0 GiB
Format          :  xfs
HA              :  3
IO Priority     :  LOW
Creation time   :  Aug 6 17:36:46 UTC 2018
Shared          :  no
Status          :  up
State           :  Attached: mycluster-node-2 (10.2.0.4)
Device Path     :  /dev/pxd/pxd270718527425014856
Labels          :  pvc=px-mongo-pvc
Reads           :  135
Reads MS        :  257
Bytes Read      :  2347008
Writes          :  230
Writes MS       :  364
Bytes Written   :  3235840
IOs in progress :  0
Bytes used      :  11 MiB
Replica sets on nodes:
    Set 0
      Node : 10.2.0.6 (Pool 0)
      Node : 10.2.0.5 (Pool 0)
      Node : 10.2.0.4 (Pool 0)
Replication Status :  Up
Volume consumers :
    - Name          : mongo-97b758c4c-vlhrr (5ff99af2-999f-11e8-9135-000d3a1a1cdf) (Pod)
      Namespace     : default
      Running on    : mycluster-node-2
      Controlled by : mongo-97b758c4c (ReplicaSet)
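As an aside, we resized the volume here with pxctl directly. On newer Portworx and Kubernetes releases, the same expansion can be requested through the PVC itself when the storage class sets allowVolumeExpansion: true; the patch below is a sketch of that alternative approach, not a step from this walkthrough:

$ oc patch pvc px-mongo-pvc -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'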
Taking Snapshots of an OpenShift volume and restoring the database
Portworx supports creating snapshots for OpenShift PVCs.
Let’s create a snapshot for the PVC we created for MongoDB.
$ cat > px-mongo-snap.yaml << EOF
apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: px-mongo-snapshot
  namespace: default
spec:
  persistentVolumeClaimName: px-mongo-pvc
EOF
$ oc create -f px-mongo-snap.yaml
volumesnapshot.volumesnapshot.external-storage.k8s.io "px-mongo-snapshot" created
Verify the creation of the volume snapshot.
$ oc get volumesnapshot
NAME                AGE
px-mongo-snapshot   7s
$ oc get volumesnapshotdatas
NAME                                                       AGE
k8s-volume-snapshot-0cb1c49f-9325-11e8-bae2-0a580a800005   7s
With the snapshot in place, let’s go ahead and delete the database.
$ POD=`oc get pods -l app=mongo | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD mongo
db.ships.drop()
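Optionally, confirm that the collection is gone before exiting the shell; with the collection dropped, count() should return zero. This verification is our addition to the original flow:

db.ships.count()
0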
Since snapshots are just like volumes, we can use them to start a new instance of MongoDB. Let's create a new instance of MongoDB by restoring the snapshot data.
$ cat > px-mongo-snap-pvc.yaml << EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-mongo-snap-clone
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: px-mongo-snapshot
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 2Gi
EOF

$ oc create -f px-mongo-snap-pvc.yaml
persistentvolumeclaim "px-mongo-snap-clone" created
From the new PVC, we will create a MongoDB pod.
$ cat > px-mongo-snap-restore.yaml << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-snap
spec:
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  selector:
    matchLabels:
      app: mongo-snap
  template:
    metadata:
      labels:
        app: mongo-snap
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: px/running
                operator: NotIn
                values:
                - "false"
              - key: px/enabled
                operator: NotIn
                values:
                - "false"
      containers:
      - name: mongo
        image: mongo
        imagePullPolicy: "Always"
        ports:
        - containerPort: 27017
        volumeMounts:
        - mountPath: /data/db
          name: mongodb
      volumes:
      - name: mongodb
        persistentVolumeClaim:
          claimName: px-mongo-snap-clone
EOF

$ oc create -f px-mongo-snap-restore.yaml
deployment.extensions "mongo-snap" created
Verify that the new pod is in the Running state.
$ oc get pods -l app=mongo-snap
NAME                         READY     STATUS    RESTARTS   AGE
mongo-snap-85474d56c-f2ff7   1/1       Running   0          3m
Finally, let’s access the sample data created earlier in the walkthrough.
$ POD=`oc get pods -l app=mongo-snap | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD mongo
MongoDB shell version v4.0.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.0
Welcome to the MongoDB shell.
…..
db.ships.find({}, {name:true, _id:false})
{ "name" : "USS Enterprise-D" }
{ "name" : "USS Prometheus" }
{ "name" : "USS Defiant" }
{ "name" : "IKS Buruk" }
{ "name" : "IKS Somraw" }
{ "name" : "Scimitar" }
{ "name" : "Narada" }
Notice that the collection is still there with the data intact. We can also push the snapshot to Amazon S3 if we want to create a Disaster Recovery backup in another Amazon region. Portworx snapshots also work with any S3 compatible object storage, so the backup can go to a different cloud or even an on-premises data center.
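For reference, pushing a Portworx snapshot to an object store is handled by cloudsnaps. The sketch below shows the general shape of the pxctl commands run from a node, with placeholder credentials; exact flags vary across Portworx versions, so consult the docs for your release rather than copy-pasting:

$ # register S3-compatible credentials with Portworx (values are placeholders)
$ /opt/pwx/bin/pxctl credentials create --provider s3 \
    --s3-access-key <ACCESS_KEY> --s3-secret-key <SECRET_KEY> \
    --s3-region us-east-1 --s3-endpoint s3.amazonaws.com

$ # back up the MongoDB volume to the configured object store
$ /opt/pwx/bin/pxctl cloudsnap backup pvc-4a43eaca-999f-11e8-9135-000d3a1a1cdf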
Summary
Portworx can be easily deployed on Red Hat OpenShift to run stateful workloads in production. Through the integration of Portworx and OpenShift, DevOps and DataOps teams can seamlessly run highly available database clusters on OpenShift. They can perform traditional operations such as volume expansion, snapshots, and backup and recovery for cloud-native applications.
Janakiram MSV
Contributor | Certified Kubernetes Administrator (CKA) and Developer (CKAD)

Explore Related Content:
- databases
- kubernetes
- mongodb
- openshift
- red hat