
Amazon Web Services (AWS) is a leader in cloud services, with managed databases that keep information available around the clock. It offers more than fifteen database options, including relational and NoSQL engines.

Kubernetes is another popular cloud service, helping organizations manage and scale containerized applications and workloads. This open-source platform’s self-healing nature ensures almost no downtime during an outage, and organizations choose it for its performance and scalability.

AWS offers three ways to manage and run Kubernetes clusters and their images, depending on your security and role management needs: Amazon Elastic Compute Cloud (EC2) instances for DIY Kubernetes clusters, Amazon Elastic Kubernetes Service (EKS) as a managed Kubernetes solution, and Amazon Elastic Container Registry (ECR) to host your container images.

In this article, we’ll explore various AWS database services and discuss how Kubernetes works with each. We’ll consider Amazon Relational Database Service (RDS) — which works with Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server — and the Amazon Redshift data warehouse. We’ll also explore NoSQL databases, including Amazon DynamoDB, Amazon ElastiCache, Amazon DocumentDB, Amazon Neptune, Amazon Timestream, and Amazon Quantum Ledger Database (QLDB).

Using AWS Relational Database Services with Kubernetes

Amazon Relational Database Service (RDS)

A relational database is often the simplest choice for Internet applications, including simple websites and mobile applications. Amazon RDS is a managed relational database service that provides six engine options: Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server.

RDS is easy to use through the AWS console and standard SQL. Additionally, RDS Read Replicas help achieve higher read throughput and increased availability, and Multi-AZ (Multi-Availability Zone) deployments improve resilience. Furthermore, you pay only for what you use.

When using RDS, consider an open-source database engine for the application. Depending on the properties you need, you can choose MySQL, MariaDB, or PostgreSQL. MySQL, the most popular option on AWS, is a good fit when the application does not require high concurrency or advanced replication.

Consider a patient record management application in a hospital, where we must maintain data integrity in a stateless Kubernetes environment. To ensure the accuracy and safety of the hospital’s data, we need to configure a Kubernetes pod to handle the database. We also need to implement Kubernetes disaster recovery methods to keep the service continuously available to users.

We could use MariaDB for its fast data replication and its compatibility with MySQL. Say, for example, a travel agency needs to provide details to its customers worldwide. We can meet this goal by combining multiple AWS Availability Zones, MariaDB’s fast replication, and Kubernetes scalability.

To connect the MariaDB instance, running via AWS RDS, with Kubernetes, we need to perform four main steps:

  1. Create an RDS instance with MariaDB as the engine
  2. Configure security groups to allow traffic
  3. Create and deploy a Kubernetes external service manifest
  4. Connect the RDS database instance

Also, we can leverage some open-source repositories to connect Kubernetes to an AWS RDS instance.
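Step 3 above — the external service manifest — can be sketched in code. The example below builds a Kubernetes `ExternalName` Service, which gives pods a stable in-cluster DNS name for the RDS endpoint; the service name and RDS hostname are hypothetical placeholders.

```python
# Sketch of step 3: an ExternalName Service lets pods reach the RDS endpoint
# through a stable in-cluster DNS name instead of the raw AWS hostname.
import json

def external_name_service(name: str, rds_endpoint: str) -> dict:
    """Build a Service manifest that aliases an external RDS endpoint."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ExternalName",
            "externalName": rds_endpoint,
        },
    }

manifest = external_name_service(
    "mariadb-service",  # hypothetical service name
    "mydb.abc123.us-east-1.rds.amazonaws.com",  # placeholder RDS endpoint
)
print(json.dumps(manifest, indent=2))
```

Once this manifest is applied, application pods can connect to `mariadb-service` on the database port, and Kubernetes DNS resolves it to the RDS endpoint.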

PostgreSQL excels at processing many concurrent requests. Applications with many simultaneous users, such as Blackboard, use PostgreSQL in AWS RDS to handle this degree of activity. To use this engine with AWS RDS and connect it with Kubernetes, we follow the same steps as above but select PostgreSQL as the engine.

Many existing applications are moving toward cloud infrastructure. Government websites, for example, often rely on commercial databases such as Oracle and SQL Server. Connecting a Kubernetes cluster to an Oracle database requires an operator, which is challenging in AWS architecture because the native Oracle Database operator for Kubernetes is still in development.

The steps for connecting the AWS RDS SQL Server instance with Kubernetes are the same as above. However, as SQL Server is not open-source, select the “license included” model or provide your own license.

To minimize cost without compromising database access speed, we can use Amazon Aurora. It costs as little as an open-source database but delivers the performance of a commercial one, and it is compatible with MySQL and PostgreSQL workloads. Large organizations such as the United Nations and Verizon use MySQL- and PostgreSQL-compatible Aurora. To use Aurora with a Kubernetes cluster in AWS, we can use AWS Controllers for Kubernetes (ACK), which lets us provision and manage other Amazon services from within a Kubernetes cluster.
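With ACK, an Aurora instance can be declared as a Kubernetes custom resource. The sketch below shows such a manifest as a Python dict; the `apiVersion`, field names, and values reflect the ACK RDS controller as commonly documented, but they vary across ACK releases, so treat them as assumptions to verify against your installed controller.

```python
# Sketch of an ACK DBInstance custom resource for an Aurora MySQL-compatible
# instance. Names and field spellings are assumptions -- check them against
# the version of the ACK RDS controller installed in your cluster.
ack_db_instance = {
    "apiVersion": "rds.services.k8s.aws/v1alpha1",
    "kind": "DBInstance",
    "metadata": {"name": "aurora-example"},  # hypothetical resource name
    "spec": {
        "dbInstanceIdentifier": "aurora-example",
        "engine": "aurora-mysql",
        "dbInstanceClass": "db.r5.large",
        "masterUsername": "admin",
    },
}
```

Applying a manifest like this lets the ACK controller create and reconcile the Aurora instance on your behalf, keeping database provisioning in the same GitOps workflow as the rest of the cluster.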

Amazon Redshift

Amazon Redshift is a data warehouse solution for large applications, and it can drive business intelligence and real-time analysis. With a Kubernetes cluster in place, Redshift can serve as an analysis layer that provides meaningful insights from the data. To connect Redshift to the Kubernetes cluster, we first grant the necessary access permissions, then use the connection string to connect from the cluster.
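Redshift speaks the PostgreSQL wire protocol, so once security groups allow the traffic, a pod can connect with a standard PostgreSQL driver. The sketch below builds such a connection string; the cluster hostname, database, and credentials are placeholders.

```python
# Sketch: building a Redshift connection string for use from a pod.
# Redshift is PostgreSQL-compatible on the wire; 5439 is its default port.
def redshift_url(host: str, database: str, user: str, password: str,
                 port: int = 5439) -> str:
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"

url = redshift_url(
    "example-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
    "analytics", "awsuser", "change-me",
)
```

In a real deployment, the credentials would come from a Kubernetes Secret rather than being hard-coded.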

Using AWS NoSQL Database Services with Kubernetes

Amazon DynamoDB

Amazon DynamoDB is a managed key-value database service that developers typically use for e-commerce and gaming. It is easier and lighter to leverage DynamoDB from a containerized Kubernetes system. Using an AWS service operator helps reconcile DynamoDB’s statefulness with Kubernetes’ stateless architecture, and open-source Kubernetes operators and Helm charts can simplify connecting DynamoDB to Kubernetes.
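As a sketch of the key-value model, the example below shapes a DynamoDB `put_item` request for a hypothetical game-scores table. The table name and attributes are illustrative; in a pod, credentials would typically come from the pod’s IAM role (for example, IAM Roles for Service Accounts on EKS).

```python
# Sketch: shaping a DynamoDB put_item request for a key-value game-state
# table. Table and attribute names are hypothetical.
def put_item_request(table: str, player_id: str, score: int) -> dict:
    return {
        "TableName": table,
        "Item": {
            "PlayerId": {"S": player_id},  # partition key (string type)
            "Score": {"N": str(score)},    # DynamoDB numbers are sent as strings
        },
    }

req = put_item_request("game-scores", "player-42", 1500)
# A real call would hand this to the SDK, e.g.:
#   boto3.client("dynamodb").put_item(**req)
```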

Amazon ElastiCache

Amazon ElastiCache provides an in-memory cache service for low-latency responses, supporting use cases such as caching, session management, and geospatial apps. We can use it as a caching layer for Kubernetes with either the Redis or Memcached engine, and service controllers and Helm charts are helpful for connecting ElastiCache to Kubernetes. ElastiCache alongside AWS RDS can provide faster reads because of the added caching layer.
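The “faster reads” come from the cache-aside pattern: check the cache first, and only query RDS on a miss. The sketch below illustrates the pattern with a plain dict standing in for the Redis client; in a pod, you would point a real Redis client at the ElastiCache endpoint instead, and `query_database` would be a real RDS query.

```python
# Sketch of the cache-aside pattern ElastiCache enables in front of RDS.
# A plain dict stands in for Redis here; all names are illustrative.
cache: dict = {}

def query_database(user_id: str) -> str:
    # Placeholder for a real RDS query.
    return f"profile-for-{user_id}"

def fetch_user(user_id: str) -> str:
    if user_id in cache:                # cache hit: skip the database entirely
        return cache[user_id]
    value = query_database(user_id)     # cache miss: read from the database
    cache[user_id] = value              # populate the cache for next time
    return value
```

A production version would also set a TTL on cached entries so stale profiles eventually expire.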

Amazon DocumentDB

Amazon DocumentDB is a managed, MongoDB-compatible document database service suited to workloads such as content management, catalogs, and user profiles. After the Kubernetes cluster is running, we create a pod with the correct image to leverage DocumentDB.
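Because DocumentDB is MongoDB-compatible, a pod connects to it with a standard MongoDB driver and a connection string. The sketch below builds such a string; the options mirror the ones in AWS’s sample connection strings (DocumentDB typically requires TLS), and the hostname and credentials are placeholders.

```python
# Sketch: a DocumentDB connection string a pod could hand to a MongoDB
# driver. Hostname and credentials are placeholders; option values mirror
# AWS's sample strings but should be verified for your cluster.
def documentdb_uri(host: str, user: str, password: str) -> str:
    return (
        f"mongodb://{user}:{password}@{host}:27017/"
        "?tls=true&replicaSet=rs0&readPreference=secondaryPreferred"
        "&retryWrites=false"
    )

uri = documentdb_uri(
    "docdb.cluster-abc123.us-east-1.docdb.amazonaws.com",  # placeholder
    "appuser", "change-me",
)
```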

Amazon Neptune

Amazon Neptune is a graph database service for highly connected data. Visualizing the output of graph queries can help detect anomalies and fraud. Direct connections to Neptune are restricted to within its VPC, so for development, a bastion server is helpful to create a tunnel that exposes a Neptune endpoint to Kubernetes.
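The bastion approach can be sketched as follows: an SSH tunnel forwards a local port to the Neptune endpoint inside the VPC, after which the Gremlin endpoint is addressable as `localhost`. The endpoint and bastion names below are placeholders.

```python
# Sketch: Neptune's Gremlin endpoint is a WebSocket URL on port 8182
# (Neptune's default). Endpoint and bastion names are placeholders.
def gremlin_endpoint(host: str, port: int = 8182) -> str:
    return f"wss://{host}:{port}/gremlin"

# With a tunnel such as:
#   ssh -L 8182:<neptune-endpoint>:8182 ec2-user@<bastion-host>
# the cluster endpoint becomes addressable locally:
local = gremlin_endpoint("localhost")
```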

Amazon Timestream

Amazon Timestream is a time-series database for the Internet of Things (IoT), industrial telemetry, DevOps, and other domains where time-stamped data is crucial. You can even connect a Kubernetes cluster running on a Raspberry Pi to Timestream.

Amazon Quantum Ledger Database

Amazon Quantum Ledger Database (QLDB) is a ledger database that records data such as supply chain information, banking transactions, and registrations. It provides an immutable, transparent transaction log that can be cryptographically verified. According to AWS documentation, you can connect to QLDB from a microservice-based architecture running in Amazon Elastic Kubernetes Service (EKS). However, this requires manual effort, as no Kubernetes operator for QLDB is available.
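The idea behind QLDB’s verifiable log can be illustrated with a simple hash chain: each entry’s digest covers the previous digest, so tampering with any record changes every later digest. This is a sketch of the concept only, not the QLDB API.

```python
# Conceptual sketch of a verifiable ledger: each digest chains over the
# previous one, so earlier records cannot be altered undetected.
import hashlib

def append(chain: list, record: str) -> None:
    prev = chain[-1][1] if chain else b""
    digest = hashlib.sha256(prev + record.encode()).digest()
    chain.append((record, digest))

ledger: list = []
append(ledger, "txn-1: open account")
append(ledger, "txn-2: deposit 100")
```

Verifying the ledger means recomputing each digest from the record and the previous digest and comparing; QLDB performs an analogous (Merkle-tree-based) verification for you.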

Conclusion

AWS and Kubernetes are widely used technologies. Amazon offers a variety of relational and NoSQL databases that we can use with Kubernetes, and the database you choose depends greatly on your application’s needs. It may take some manual work to leverage Kubernetes with AWS services. To save developers the time and resources needed to navigate these challenging configurations, consider Portworx’s database-as-a-service (DBaaS) platform. Learn more about how Portworx Data Services can deploy data services on Kubernetes with just one click.


Tim Darnell

Tim is a Principal Technical Marketing Manager within the Cloud Native Business Unit at Pure Storage. He has held a variety of roles in the two decades spanning his technology career, most recently as a Product Owner and Master Solutions Architect for converged and hyper-converged infrastructure targeted for virtualization and container-based workloads. Tim joined Pure Storage in October of 2021.
