PX-Autopilot for Capacity Management tackles these problems head-on by providing automated operations for storage infrastructure…
June 18, 2020
Automated Storage Pool Expansion with PX-AutoPilot
Hello and welcome back to Portworx Lightboard Sessions. Today, we’re gonna be talking about AutoPilot, a feature under capacity management for automatically managing both volume and pool sizes. Today, we’re gonna focus on how you automatically manage PVCs in Kubernetes. So when Portworx is deployed to Kubernetes, you can integrate monitoring and metrics, right? These are things like LogDNA, Prometheus, Datadog, those kinds of things. They’re usually deployed side-by-side with Kubernetes or externally to it, but in either case, Portworx knows how to export its own volume metrics: things like how much capacity is being used in a particular volume, how many I/Os are going to a volume, how many replicas exist and where, what the throughput of the volume is, etcetera.
Portworx makes those metrics available through an API that it exposes automatically, so something like Prometheus can target this API directly and gather them. Once that is in place, you can deploy what is called AutoPilot. As the name suggests, this is all about automatically performing operations when conditions are met. When we talk about AutoPilot, there are rules, and these rules are made up of conditions and actions. Conditions are evaluated against what the metrics provide, things like capacity, and that’s what we’re gonna focus on today.
Now, let’s say we have a couple of applications running in our cluster, and one of them is Postgres. This Postgres database has a 10 gigabyte volume, and it’s a Portworx-provided volume. The metrics for this volume get sent over to the metrics API, and AutoPilot pulls them from your deployment of Prometheus, or whatever you use for metrics. Once AutoPilot is aware of these metrics, you can create rules. You target this application and its volume with a selector: if the application has a label like app equals Postgres, you hand that label to your AutoPilot rule. We said the rule is made up of conditions and actions, so let’s separate the two here.
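As a rough illustration of how a label selector picks out the Postgres volume, here is a minimal Python sketch. It is not AutoPilot's implementation. The PVC names, the label values, and the `matches` helper are all hypothetical, and the sketch only assumes the usual rule that every key/value pair in the selector has to be present on the object.

```python
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Return True when every key/value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical PVCs and labels, for illustration only.
pvcs = [
    {"name": "postgres-data", "labels": {"app": "postgres"}},
    {"name": "redis-data", "labels": {"app": "redis"}},
]

# The selector handed to the rule: app equals postgres.
selector = {"app": "postgres"}

targeted = [pvc["name"] for pvc in pvcs if matches(selector, pvc["labels"])]
print(targeted)
```

Only the PVC carrying the matching label ends up in `targeted`; the Redis volume is left alone.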
A condition may be that the volume’s usage is greater than 60%. This means that if you have a 1 gigabyte volume and usage crests 60% of that 1 gigabyte, the rule kicks into action. And once you have a condition, you need an action, right? We can say grow by 200%, so a 1 gigabyte volume becomes 3, because 200% of 1 is 2 and that’s what gets added. Now, Portworx can already resize volumes; AutoPilot just eliminates the manual intervention of editing a YAML file or doing the resize yourself, right?
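The arithmetic behind that condition and action can be sketched in a few lines of Python. This is only a worked example of the numbers from the talk (a 60% usage threshold and 200% growth). The function names are hypothetical and are not part of any Portworx API.

```python
def usage_percent(used_gib: float, size_gib: float) -> float:
    """Usage of a volume as a percentage of its size."""
    return 100.0 * used_gib / size_gib


def condition_met(used_gib: float, size_gib: float, threshold_percent: float = 60.0) -> bool:
    """The condition from the talk: usage is strictly greater than the threshold."""
    return usage_percent(used_gib, size_gib) > threshold_percent


def grow(size_gib: float, scale_percent: float = 200.0) -> float:
    """The action from the talk: add scale_percent of the current size to the volume."""
    return size_gib + size_gib * scale_percent / 100.0


# A 1 GiB volume at 70% usage crosses the 60% threshold...
print(condition_met(0.7, 1.0))
# ...and growing by 200% adds 2 GiB, for 3 GiB in total.
print(grow(1.0))
```

Note that "grow by 200%" here means the new size is the old size plus 200% of it, which is why 1 gigabyte becomes 3 rather than 2.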
Another thing to keep in mind is that you can put limits on this. You can say never bigger than, let’s say, 500 gigabytes. If you have a rogue application that’s taking in lots of data, you don’t want its volume to grow past 500 gigabytes, because you know the limits of the underlying storage, or you have limits on certain namespaces or groups within your DevOps organization, and you wanna impose those limits, right? The other part of this, which we’ll cover in another video, is that if a volume does outgrow what you have underneath Portworx, you can scale Portworx itself automatically with AutoPilot, but we’re gonna focus on the volumes here.
Let’s say Postgres starts using its volume and crosses 6 of the 10 gigabytes. The rule kicks in, and AutoPilot waits 30 seconds for conditions to stabilize, so the condition needs to stay true for those 30 seconds. If it does, it triggers the action, and 10 gigabytes becomes 30, automatically. Kubernetes becomes aware of it, Portworx becomes aware of it, and Postgres has more space to work with. And the rule remains in effect: if usage continues to grow and passes 60% of that 30, it will trigger again.
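To make the limit and the 30-second wait concrete, here is a small Python sketch of both ideas. It rests on two assumptions that the talk does not confirm: that growth is simply clamped at the maximum size, and that "stabilize" means the condition has held continuously across the most recent samples. The helper names and the sample data are hypothetical.

```python
def next_size(size_gib: float, scale_percent: float = 200.0, max_size_gib: float = 500.0) -> float:
    """Grow by scale_percent of the current size, clamped at max_size_gib (an assumption)."""
    grown = size_gib + size_gib * scale_percent / 100.0
    return min(grown, max_size_gib)


def held_for(samples: list[tuple[float, float]], threshold_percent: float = 60.0,
             window_s: float = 30.0) -> bool:
    """samples are (timestamp_seconds, usage_percent) pairs, oldest first.

    Returns True when usage has stayed above the threshold, without
    interruption, for at least window_s up to the latest sample.
    """
    if not samples:
        return False
    latest = samples[-1][0]
    start = None
    for timestamp, usage in reversed(samples):
        if usage <= threshold_percent:
            break  # the condition was false here, so the streak starts after this sample
        start = timestamp
    return start is not None and latest - start >= window_s


# Repeated triggers starting from the 10 GiB Postgres volume, capped at 500 GiB.
sizes = [10.0]
while sizes[-1] < 500.0:
    sizes.append(next_size(sizes[-1]))
print(sizes)  # [10.0, 30.0, 90.0, 270.0, 500.0]

# Usage stays above 60% from t=10s to t=40s, so the 30-second window is satisfied.
print(held_for([(0, 50.0), (10, 65.0), (20, 70.0), (40, 72.0)]))  # True
```

The fourth resize would have reached 810 gigabytes, so in this sketch the cap holds it at 500.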
Now, to give you some control over how often it triggers, you can set a cool-down period in the rule, so that after it triggers, it won’t trigger again for two minutes, five minutes, etcetera. These settings are all fairly fine-grained and configurable. So this is what AutoPilot can do for you, and there are many use cases. It works with both StatefulSets and single deployments of databases and stateful applications in Kubernetes. Hopefully, you find it useful. I will add links below in the description of this video to resources where you can learn more and try it out yourself. Thanks.
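For reference, the cool-down behavior described above can be sketched as a simple gate in Python. This is an illustration under stated assumptions, not AutoPilot's code: the `CooldownGate` class is hypothetical, and the five-minute (300 second) value is just one of the examples mentioned in the talk.

```python
class CooldownGate:
    """Allow an action to fire only if the cool-down since the last firing has elapsed."""

    def __init__(self, cooldown_s: float) -> None:
        self.cooldown_s = cooldown_s
        self.last_fired_s = None  # no firing has happened yet

    def try_fire(self, now_s: float) -> bool:
        """Return True and record the firing time if the action may run at now_s."""
        if self.last_fired_s is not None and now_s - self.last_fired_s < self.cooldown_s:
            return False
        self.last_fired_s = now_s
        return True


# Five-minute cool-down; the condition is true at each of these timestamps (in seconds).
gate = CooldownGate(cooldown_s=300)
results = [gate.try_fire(t) for t in (0, 60, 299, 300, 310)]
print(results)  # [True, False, False, True, False]
```

The action fires at 0 seconds, is suppressed until the 300 seconds have elapsed, fires again at 300, and is then suppressed again.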