Optimizing Nova Compute Host Utilization With OpenStack
For about three years, I have been talking with businesses that want to deploy OpenStack. Some have questions about which distribution to use, some want to know about multi-cloud management, and others wonder how best to manage their clouds once they are installed. The list of questions is certainly long. One common theme I have heard, however, is: “Don’t worry, we have the hardware under control, so we are in good shape there.” I understand the efficiency goal here: hardware usually has the longest lead time of any part of an infrastructure project. There are, however, some inherent complexities that arise when deploying a software solution like OpenStack, and getting the hardware right for an OpenStack cloud deployment is not always as obvious as it seems. I want to take some time to explore how to approach the hardware you are going to use when you deploy OpenStack, specifically the nova compute node hardware.
When optimizing the nova compute hosts, the resources should be viewed as a logical resource pool. A resource pool can be looked at from several angles. At the most basic level, a single server defines a resource pool. The next level is a group or rack of servers, and from there you can take this up to the site or availability zone level. The compute node is a foundational element of your cloud and will greatly affect the utilization, performance and economics of your deployment. The design needs to factor in both the workloads to be run and the hardware that will be deployed. Evaluating either of these factors in isolation puts your deployment at risk.
OpenStack has been designed to be flexible. What I have seen is that most people create one or two flavor types when they start out, then define more on an ad hoc basis. The OpenStack scheduler is then configured to cast (place) VMs based only on available RAM or CPU. While this works out of the gate, a cloud operated in this fashion develops fragmentation over time. I have seen this first hand. If the hypervisors all have different amounts of CPU, RAM and disk, and the flavor creation process is not strategic, you end up in a situation where the cloud as a whole has the resources to cast a new VM, but the RAM is on one node, the CPU on another and the ephemeral disk on a third. In this situation you cannot launch new VMs and, more importantly, you have spare but unusable resources stranded across many nodes. As you scale out this infrastructure, the unused resources grow very quickly. The diagram below illustrates this unusable space on a single compute node.
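The fragmentation problem can also be shown with a few lines of Python. This is only a minimal sketch: the node names, the leftover capacities and the `adhoc.large` flavor are all hypothetical values chosen for illustration, and the `fits` check is a simplified stand-in for what a scheduler does, not Nova code.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_vcpus: int
    free_ram_gb: int
    free_disk_gb: int


@dataclass
class Flavor:
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int


def fits(node: Node, flavor: Flavor) -> bool:
    """A VM needs all three resources on the SAME node."""
    return (
        node.free_vcpus >= flavor.vcpus
        and node.free_ram_gb >= flavor.ram_gb
        and node.free_disk_gb >= flavor.disk_gb
    )


# Hypothetical leftovers on three mismatched hypervisors.
nodes = [
    Node("node-a", free_vcpus=1, free_ram_gb=32, free_disk_gb=40),
    Node("node-b", free_vcpus=8, free_ram_gb=2, free_disk_gb=60),
    Node("node-c", free_vcpus=2, free_ram_gb=4, free_disk_gb=500),
]

# Hypothetical ad hoc flavor requested by a tenant.
request = Flavor("adhoc.large", vcpus=4, ram_gb=16, disk_gb=200)

# Summed across the pool, there is more than enough of everything...
pool_has_capacity = (
    sum(n.free_vcpus for n in nodes) >= request.vcpus
    and sum(n.free_ram_gb for n in nodes) >= request.ram_gb
    and sum(n.free_disk_gb for n in nodes) >= request.disk_gb
)

# ...but no single node can host the VM.
candidates = [n.name for n in nodes if fits(n, request)]

print("pool has capacity:", pool_has_capacity)
print("nodes that can host the VM:", candidates)
```

In this made-up pool the totals comfortably cover the request, yet the candidate list is empty, which is exactly the stranded capacity described above.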
To avoid this issue, the most efficient way to look at flavors is to think of them as fractions of the physical hardware node. When you combine that with flavors that are multiples of each other, you can really maximize your space utilization on a compute node. Together, such a group of flavors is called a class. Your environment can also be built to support multiple classes, each of which is mapped to a distinct set of compute hardware.
The smallest VM in your class is referred to as the base class; in the example below this is std.vm.1. The other VMs in that class are multiples of the base class. This configuration is not built in to OpenStack, but OpenStack does provide the tools needed to manage a class of VMs that uses it.
If we take the following hardware and couple it with the flavor specification below, we can easily calculate how many VMs will run on a single piece of hardware.
2 x 8 Core-16 Thread
128 GB (6 GB Reserved for Hypervisor)
12 x 1TB Drives, RAID 6. Boot Volume not included
CPU Oversubscription set at 2
VMs Per Node
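The calculation can be sketched in Python as follows. The hardware figures come from the list above, reading "2 x 8 Core-16 Thread" as two CPUs with 16 threads each and taking RAID 6 to cost two drives of capacity. The flavor sizes are assumptions: the base flavor dimensions, the names other than std.vm.1 and the 1/2/4/8 multiples are hypothetical values picked so that the class divides the node evenly.

```python
# Hardware figures from the list above.
SOCKETS = 2
THREADS_PER_SOCKET = 16        # 8 cores, 16 threads per CPU
CPU_OVERSUBSCRIPTION = 2
RAM_GB = 128
RAM_RESERVED_GB = 6            # kept back for the hypervisor
DRIVES = 12
DRIVE_GB = 1000                # 1 TB drives
RAID6_PARITY_DRIVES = 2        # RAID 6 uses two drives' worth of parity

node_vcpus = SOCKETS * THREADS_PER_SOCKET * CPU_OVERSUBSCRIPTION
node_ram_mb = (RAM_GB - RAM_RESERVED_GB) * 1024
node_disk_gb = (DRIVES - RAID6_PARITY_DRIVES) * DRIVE_GB

# ASSUMPTION: hypothetical base flavor sizing, not taken from the article.
BASE_FLAVOR = {"vcpus": 1, "ram_mb": 1792, "disk_gb": 150}

# ASSUMPTION: hypothetical class of four flavors, each a multiple of the base.
CLASS_MULTIPLES = {"std.vm.1": 1, "std.vm.2": 2, "std.vm.4": 4, "std.vm.8": 8}

# The scarcest resource decides how many base-sized VMs fit on the node.
base_slots = min(
    node_vcpus // BASE_FLAVOR["vcpus"],
    node_ram_mb // BASE_FLAVOR["ram_mb"],
    node_disk_gb // BASE_FLAVOR["disk_gb"],
)

# Larger flavors consume a whole number of base slots.
vms_per_node = {
    name: base_slots // multiple for name, multiple in CLASS_MULTIPLES.items()
}

print("node capacity:", node_vcpus, "vCPUs,", node_ram_mb, "MB RAM,", node_disk_gb, "GB disk")
print("base slots per node:", base_slots)
for name, count in vms_per_node.items():
    print(f"{name}: {count} per node")
```

With these assumed sizes the vCPU count is the limiting resource, so sizing the base flavor's RAM and disk close to their per-vCPU share of the node keeps the unused remainder small.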
To ensure that these VMs are launched on the correct hardware, you can use the host aggregate facility in OpenStack Nova. Host aggregate properties allow you (amongst other things) to map metadata to a group of physical compute nodes. A host can be mapped to more than one aggregate. In the example above, the four VM types can be mapped to a single compute node or to an array of nodes, depending on the scale of your deployment. Additionally, this mapping is exposed only to cloud administrators, not to cloud tenants. Tenants therefore do not need to understand where to launch VMs; they only need to know which flavor they require. Host aggregates can be used in any number of ways to help manage your compute resources.
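The idea of matching flavors to aggregates through metadata can be modelled in a few lines of Python. This is a simplified, standalone model and does not call Nova; in a real deployment the matching is done by the scheduler, which ships a filter for this purpose (AggregateInstanceExtraSpecsFilter). All aggregate names, host names, the `class` metadata key and the flavor names other than std.vm.1 are hypothetical.

```python
# ASSUMPTION: hypothetical aggregates, hosts and metadata for illustration.
AGGREGATES = {
    "agg-std": {
        "hosts": {"compute-01", "compute-02"},
        "metadata": {"class": "std"},
    },
    "agg-highmem": {
        "hosts": {"compute-03"},
        "metadata": {"class": "highmem"},
    },
}

# ASSUMPTION: hypothetical per-flavor properties set by the administrator.
FLAVOR_EXTRA_SPECS = {
    "std.vm.1": {"class": "std"},
    "std.vm.2": {"class": "std"},
    "mem.vm.1": {"class": "highmem"},
}


def eligible_hosts(flavor_name: str) -> list[str]:
    """Return hosts whose aggregate metadata matches every flavor property."""
    wanted = FLAVOR_EXTRA_SPECS[flavor_name]
    hosts: set[str] = set()
    for aggregate in AGGREGATES.values():
        metadata = aggregate["metadata"]
        if all(metadata.get(key) == value for key, value in wanted.items()):
            hosts |= aggregate["hosts"]
    return sorted(hosts)


for flavor in FLAVOR_EXTRA_SPECS:
    print(flavor, "->", eligible_hosts(flavor))
```

The tenant only ever names a flavor; the lookup from flavor to hardware stays on the administrator's side, which is the separation described above.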
You can see from the illustration below that if you cast VMs at a single node, they map neatly regardless of which combination of flavors in that class is used. By using nova host aggregates, a simple mapping of compute nodes to flavors can be maintained. This ensures that each flavor or flavor class lands on the correct compute hardware, which in turn gives the most favorable hardware resource utilization. In addition to the hardware efficiencies, there are operational efficiency gains to be had when hardware has been standardised, but that is a topic for another blog post.
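The claim that any combination of flavors in a class maps neatly onto a node can be checked with a small simulation. This sketch assumes a hypothetical node that holds 64 base-sized slots and a hypothetical class whose flavors take 1, 2, 4 or 8 slots; it places random flavors until nothing more fits and then confirms that no capacity is left stranded.

```python
import random

# ASSUMPTION: hypothetical node size and flavor class, in base-flavor slots.
NODE_SLOTS = 64
FLAVOR_SLOTS = [1, 2, 4, 8]


def fill_node(rng: random.Random) -> list[int]:
    """Place randomly chosen flavors on one node until it is full."""
    free = NODE_SLOTS
    placed = []
    while free > 0:
        # Only flavors that still fit in the remaining space are candidates.
        candidates = [size for size in FLAVOR_SLOTS if size <= free]
        size = rng.choice(candidates)
        placed.append(size)
        free -= size
    return placed


rng = random.Random(0)
for _ in range(1000):
    placement = fill_node(rng)
    # Every slot is used: nothing is left over that a flavor cannot consume.
    assert sum(placement) == NODE_SLOTS

print("1000 random flavor mixes all filled the node completely")
```

Because every flavor is a whole multiple of the base flavor and the base flavor divides the node evenly, the remaining space is always a whole number of base slots, so at least the smallest flavor can use it.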
There are many ways to combine hardware and flavor types to meet the needs of your specific environment and workloads. Many public cloud providers use this system today, as you can see from their standard offerings. Keeping these concepts in mind while you are planning your hardware order will be key to a successful and maintainable OpenStack deployment.
Determining workloads is a critical step in any cloud deployment. Based on the workloads that will run, you will be able to define the virtual machine flavors. These requirements, along with the desired performance characteristics, are critical to determining the hardware that you will need to run your cloud. The reality is that in most environments, a range of flavors will be needed to support the workloads that will be deployed. By using flavors that are matched to hardware sizes, as the example above shows, you will prevent the creation of unusable compute resources within your cloud. This will in turn drive your utilization higher, which is key to a successful deployment.
Author: Seth Fox