March 18, 2022
Choosing a Kubernetes Operator for PostgreSQL
PostgreSQL (Postgres) is a robust, open-source, object-relational database known for its reliability, high performance, and extensibility. The database management system offers features including foreign key constraints, complex queries, multi-version concurrency control (MVCC), updatable views, and high availability. These features make PostgreSQL well suited to providing resilient, highly available data services to Kubernetes (K8s) applications.
However, to run PostgreSQL database components in a Kubernetes environment, you need an operator that automatically provisions, modifies, and monitors a PostgreSQL cluster. The operator watches PostgreSQL cluster manifests (nodes, security, fault tolerance, and site topology) and makes the necessary adjustments.
PostgreSQL Kubernetes operators include:
- Crunchy Data Postgres Operator
- Zalando Postgres Operator
- KubeDB Postgres Operator
- StackGres Operator
- Percona Postgres Operator
Let’s take a closer look at these PostgreSQL Kubernetes operators. We’ll review each operator’s key features and installation process to help you decide which one best meets your needs.
Crunchy Data Postgres Operator
Crunchy Postgres Operator for Kubernetes (PGO) automates deployment and simplifies management of Kubernetes-enabled PostgreSQL clusters. The operator also provides pod customization and PostgreSQL configuration features, delivering high availability, monitoring, and data recovery through the pgBackRest open-source utility.
Significant PGO features include:
- Persistent storage configuration, working primarily with dynamic StorageClasses, hostPath volumes, and Network File System (NFS)
- PostgreSQL provisioning, which ensures the operator deploys healthy clusters
- Horizontal scaling for added redundancy and high availability
- Custom PostgreSQL configuration, which allows users to configure PostgreSQL workloads in production environments
- Transport Layer Security (TLS) to secure interactions between data servers and the Postgres operator (prerequisites for enabling TLS security include a CA certificate, TLS private key, and TLS certificate)
- pgMonitor library to monitor your PostgreSQL cluster and host environment health, metrics, and performance
- PostgreSQL user management to help manage user authentication, onboarding, and removal from PostgreSQL
- Node affinity deployments that enable the operator to schedule and deploy PostgreSQL clusters to Kubernetes nodes using node affinity
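The TLS prerequisites listed above are typically handed to the operator as Kubernetes Secrets. As a hedged sketch (the Secret name and the exact key layout PGO expects are version-specific; treat the names here as illustrative):

```yaml
# Illustrative only: a TLS Secret holding the CA certificate, server
# certificate, and private key that the cluster spec can reference.
apiVersion: v1
kind: Secret
metadata:
  name: hippo-cluster-tls    # hypothetical name
type: Opaque
data:
  ca.crt: <base64-encoded CA certificate>
  tls.crt: <base64-encoded TLS certificate>
  tls.key: <base64-encoded TLS private key>
```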
You can install the PGO PostgreSQL operator in two ways:
- Through a Postgres operator installer (works with Ansible)
- From a marketplace such as OperatorHub.io or Google Cloud Marketplace
How to Set Up PGO
You need these prerequisites to install and run PGO:
- Kubernetes v1.13+
- Application ports
- A Kubernetes cluster such as Amazon Elastic Kubernetes Service (Amazon EKS)
For detailed steps on installing and configuring PGO, dive into Crunchy Data’s documentation. The Crunchy operator’s architecture relies on Kubernetes custom resources (CRs) and CustomResourceDefinitions (CRDs) to deploy and manage PostgreSQL clusters.
For the operator to provision PostgreSQL clusters with custom resources, you need to start by adding attributes for defining a PostgreSQL cluster to the pgclusters.crunchydata.com CRD. Include details such as the cluster name, storage, secret references, sidecars, and high availability rules.
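Pulling those attributes together, the custom resource might be sketched as follows. This is a hedged sketch only: field names vary across PGO versions, so verify against the pgclusters.crunchydata.com CRD reference before use.

```yaml
# Hedged sketch of a Pgcluster custom resource, not a verbatim example.
apiVersion: crunchydata.com/v1
kind: Pgcluster
metadata:
  name: hippo
spec:
  clustername: hippo           # cluster name
  primarystorage:              # storage for the primary instance
    size: 1Gi
    storagetype: dynamic
  usersecretname: hippo-users  # secret reference (illustrative)
  replicas: 2                  # replicas for high availability
```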
Next, the PostgreSQL Operator runs various tasks to deploy healthy clusters, including creating Deployments for the pgBackRest repository and the PostgreSQL primary instance.
Once you set up the PostgreSQL instance, the operator creates a pgBackRest repository and uses it to provision replicas. The pgBackRest repository also provides functions and features like the following:
- Automatic healing of failed primaries through the “delta restore” component
- Cluster cloning capabilities
- Full, incremental, and differential backups
- Complete and point-in-time data restores
See Crunchy’s PostgreSQL Operator documentation for further guidance on setting up custom configurations, disaster recovery, user roles, and authentication.
Zalando Postgres Operator
Patroni powers Zalando’s Postgres Operator. The operator integrates into CI/CD pipelines so teams can provision Postgres clusters automatically without direct access to the Kubernetes API, which eliminates manual operations. The operator also lets you manage multiple Postgres clusters across namespaces.
Much of the operator scope simplifies deploying Patroni-powered clusters on Kubernetes — rolling updates, provisioning, and cleaning up Postgres clusters — while Patroni handles high availability cluster bootstrapping operations.
How to Set Up the Zalando Postgres Operator
We recommend running Kubernetes locally using one of the following options:
- Minikube for creating single-node Kubernetes clusters
- k3d and kind for creating multi-node Kubernetes clusters
Also, be sure to install kubectl to interact with the Kubernetes infrastructure.
You can install the Postgres Operator one of three ways:
- Using the Helm chart
- With kustomize manifests: using kubectl v1.14 or newer, run kubectl apply -k github.com/zalando/postgres-operator/manifests
- Manually, using YAML manifests (note that you’d need to adapt the manifests to your Kubernetes environment)
Refer to Zalando’s documentation for detailed installation guidelines.
Once installation is complete and the operator is running, you can deploy a Postgres cluster. First, submit a Postgres cluster manifest with the command below:
kubectl create -f manifests/minimal-postgres-manifest.yaml
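The minimal manifest referenced above defines a team ID, instance count, volume size, users, databases, and Postgres version. A sketch along the lines of Zalando’s published example (adjust names and versions to your environment):

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
spec:
  teamId: "acid"
  numberOfInstances: 2     # one primary plus one replica
  volume:
    size: 1Gi
  users:
    zalando:               # database role and its flags
      - superuser
      - createdb
  databases:
    foo: zalando           # database name: owner role
  postgresql:
    version: "14"
```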
After the Postgres Operator adds and validates the cluster manifests, it creates endpoint and service resources plus a StatefulSet to generate new pods based on the manifest’s indicated instances.
Next, the database pods run the Spilo image with Patroni and are identified by numeric suffixes. One pod serves as the primary and the others as replicas, exposed through the corresponding endpoints and services.
KubeDB Postgres Operator
AppsCode developed KubeDB, which enables you to create operators for various databases. KubeDB automates routine PostgreSQL operations in Kubernetes, such as cluster provisioning, backup, recovery, patching, failure discovery, and repair. Additionally, the KubeDB operator reduces complexity by allowing you to manage one stack for all your stateful and stateless applications.
Other core KubeDB features include:
- The ability to choose between one-off backups and a backup schedule you prefer
- Encrypted, deduplicated backups, so you only pay for the cost of incremental storage
- The PostgreSQL operators use PersistentVolumeClaims (PVC) to provision disks for database instances, improving performance
- Prometheus-enabled monitoring for the KubeDB operator and its databases
- Supports PostgreSQL, MySQL, Redis, Memcached, MongoDB, and Elasticsearch
How to Set Up KubeDB
You’ll need a Kubernetes cluster to run KubeDB.
Since the KubeDB operator monitors Postgres objects in a Kubernetes environment, the KubeDB operator spins up a new StatefulSet plus two cluster IP services after creating the Postgres object.
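As a rough sketch, the Postgres object the operator watches might look like the following. The API group/version and field names differ across KubeDB releases, so verify against KubeDB’s CRD documentation; the names here are illustrative.

```yaml
apiVersion: kubedb.com/v1alpha2   # version varies by KubeDB release
kind: Postgres
metadata:
  name: demo-postgres
  namespace: demo
spec:
  version: "13.2"                 # catalog version name (illustrative)
  replicas: 3
  storageType: Durable
  storage:
    storageClassName: standard
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
```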
What differentiates KubeDB from the other PostgreSQL operators is that it manages clusters with its own in-house tooling instead of Patroni. Also, KubeDB maintains well-organized documentation in a stand-alone GitHub repository.
Though the company offers a free community operator, all the core components required to deploy a functional operator are only available in their subscription-based enterprise edition. Refer to KubeDB’s documentation for further installation instructions.
StackGres Operator
The StackGres Operator provides multiple frameworks for setting up resources, configuring persistent volumes, and defining configuration files, much like Zalando’s operator. The configurations are fully validated against best practices, and the StackGres operator supports deployment at both an individual and an enterprise level.
Some core challenges that teams face when handling distributed infrastructure are observability, performance tuning, and abstracting away the complexities of intertwined platforms. One of StackGres’s solutions to these challenges is bundling the Envoy proxy into its operator, which helps pinpoint errors through observability and supports consistent performance tuning.
In doing so, the operator performs its Kubernetes tasks while at the same time collecting metrics and tuning the platform to increase its fault tolerance and availability.
How to Set Up StackGres
StackGres installation is relatively straightforward: you can quickly install it with a kubectl command or via Helm for production-ready and customizable installations:
kubectl apply -f 'https://sgres.io/install'
Or, via Helm:
helm install --namespace stackgres stackgres-operator --set-string adminui.service.type=LoadBalancer https://stackgres.io/downloads/stackgres-k8s/stackgres/latest/helm/stackgres-operator.tgz
After installation, confirm that the operator is ready for use with this command:
kubectl wait -n stackgres deployment -l group=stackgres.io --for=condition=Available
You’ll see the pods are Running when the operator is ready to use:
➜ kubectl get pods -n stackgres -l group=stackgres.io
NAME                                  READY   STATUS    RESTARTS   AGE
stackgres-operator-78d57d4f55-pm8r2   1/1     Running   0          3m34s
stackgres-restapi-6ffd694fd5-hcpgp    2/2     Running   0          3m30s
You can now create, scale, and customize Postgres clusters. Check out the StackGres documentation for further guidance on configuring and deploying StackGres clusters.
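For instance, a minimal SGCluster resource might be sketched as follows (field names follow recent StackGres releases; verify against the SGCluster CRD reference):

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: simple
spec:
  instances: 2          # one primary plus one replica
  postgres:
    version: '14'
  pods:
    persistentVolume:
      size: '5Gi'
```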
Percona Postgres Operator
The Percona Distribution for PostgreSQL Operator is Percona’s addition to the operator landscape, based on Crunchy Data’s operator. Percona’s architecture is quite similar to Crunchy Data’s, with a few additional features:
- Percona’s in-house monitoring solution, Percona Monitoring and Management (PMM), does the heavy lifting in monitoring. It also has a user-friendly web console for viewing high availability (HA) proxy metrics.
- Percona’s operator is storage-agnostic and can work with network file systems (NFSs), hostPath volumes, and diverse storage classes.
The Postgres Operator adds the components below to the Postgres container suite when deployed:
- Primary Postgres database
- pgBackRest utility for backups and disaster recovery
- Patroni-based high availability component
- pg_stat_monitor for monitoring and query performance statistics
- pgBouncer for PostgreSQL connection pooling
How to Set Up the Percona Postgres Operator
The latest Percona Postgres operator (1.10) officially supports these platforms:
- Red Hat OpenShift versions 4.7–4.9
- Amazon Elastic Kubernetes Service (EKS) 1.18–1.21
- Google Kubernetes Engine (GKE) versions 1.19–1.22
You can also install Percona via Minikube or through Helm charts. You’ll need the kubectl tool to deploy the operator on all platforms.
Setting up Percona on each platform requires different configurations, all of which are available in their documentation.
The operators we’ve highlighted meet the base-level threshold for running highly available, fault-tolerant PostgreSQL clusters on Kubernetes. However, you’d still need to provide dynamic, container-native storage so that Postgres pods maintain availability even when a pod reschedules.
You need an end-to-end data management solution that meets your availability, performance, security, compliance, and service-level agreement (SLA) requirements. Learn more about how the Portworx Data Services database-as-a-service platform helps deploy and automatically manage your Kubernetes data services.