Running Linux as a guest of z/VM presents a different set of performance problems from running Linux on a discrete server. One such problem area is how Linux manages its memory in a guest virtual machine environment. In such environments, multiple guest operating systems are hosted on top of a host operating system or hypervisor (in this case, z/VM’s Control Program, or CP). The problem of overcommitting physical memory is solved either by dynamically adjusting the memory sizes of the guests or through transparent host paging. Both approaches can introduce significant overhead when memory is heavily overcommitted, because of frequent resize requests or high paging activity. This article discusses the design and implementation of a novel approach to this problem, called Cooperative Memory Management (CMM), on Linux for System z and the z/VM hypervisor.
Memory pressure, the lack of free, allocatable memory when it’s needed, stems from the fact that guest operating systems such as Linux use all the memory given to them, usually devoting any “extra” virtual memory to file cache. As a result, static “partitioning” of the system would be significantly limited by real system memory. Static memory partitioning is also contrary to the nature of many systems today, which often show bursts of high utilization. z/VM virtualization technologies can effectively exploit this variability.
Memory overcommitment is an attribute of the application mix that runs on a system and can’t be eliminated. It occurs when a process is started and allocates more memory at start-up than it really needs. This allows the process to begin actual work sooner, but it also causes available memory to run out sooner. The resulting memory pressure must be dealt with either by pushing it back into the guest operating system or by resolving it in the hypervisor. Either way, there’s potential for high paging rates in the hypervisor, the guest, or both.
High paging rates have a non-linear impact on application and system response times and limit the number of guests you can effectively deploy. This non-linear performance impact makes memory overcommitment unique compared to overcommitting other resources. Nevertheless, through proper global memory management, you can significantly reduce the symptoms of memory overcommitment.
How Does It Work?
The two main approaches to real memory management among multiple virtual guests running on a hypervisor are:
1. Dynamic partitioning, in which individual guests are forced to dynamically change their memory size to accommodate a global memory strategy.
2. Memory virtualization, in which the hypervisor pages guest memory in a way similar to how any virtual memory operating system overcommits real memory to applications.
Both approaches have strengths and weaknesses; both can support overcommitment of available real memory.
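To make the notion of overcommitment concrete, the following minimal sketch computes an overcommit ratio as the sum of the guests' virtual memory sizes divided by the real memory of the host. The function name and the guest sizes are hypothetical values chosen for illustration; they don't come from z/VM or from any measured system.

```python
def overcommit_ratio(guest_sizes_mb, real_memory_mb):
    """Return total guest virtual memory divided by host real memory.

    A result above 1.0 means the guests together have been given more
    virtual memory than the host has real memory to back it.
    """
    if real_memory_mb <= 0:
        raise ValueError("real_memory_mb must be positive")
    return sum(guest_sizes_mb) / real_memory_mb


# Hypothetical configuration: four 2 GB guests on a host with 4 GB real memory.
guests = [2048, 2048, 2048, 2048]
ratio = overcommit_ratio(guests, 4096)
print(f"overcommit ratio: {ratio:.1f}")  # overcommit ratio: 2.0
```

With either approach, a ratio above 1.0 has to be absorbed somewhere: by shrinking guests under dynamic partitioning, or by hypervisor paging under memory virtualization.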
The z/VM hypervisor takes the second approach, mapping guest virtual memory into the real memory of the System z machine. If there aren’t enough real memory frames to contain all the active guests’ virtual memory pages, some pages are moved to expanded storage (XSTOR). Once expanded storage becomes full, guest pages are moved from expanded storage to DASD paging space.
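The three-level hierarchy described above (real memory, then expanded storage, then DASD) can be illustrated with a toy model. The sketch below is not CP's actual paging algorithm: the `TieredStore` class, the least-recently-used choice of victim, and the frame counts are all assumptions made for illustration only.

```python
from collections import OrderedDict


class TieredStore:
    """Toy three-tier page store: real memory -> XSTOR -> DASD.

    Assumption: victims are chosen least-recently-used from real memory
    and oldest-first from XSTOR. The real CP algorithm is more elaborate.
    """

    def __init__(self, real_frames, xstor_frames):
        self.real_frames = real_frames
        self.xstor_frames = xstor_frames
        self.real = OrderedDict()   # least recently used page first
        self.xstor = OrderedDict()  # oldest demoted page first
        self.dasd = set()

    def touch(self, page):
        """Reference a page, bringing it into real memory if needed."""
        if page in self.real:
            self.real.move_to_end(page)  # mark as most recently used
            return
        # Page-in from a lower tier, or first reference to the page.
        self.xstor.pop(page, None)
        self.dasd.discard(page)
        self.real[page] = True
        if len(self.real) > self.real_frames:
            victim, _ = self.real.popitem(last=False)
            self._demote_to_xstor(victim)

    def _demote_to_xstor(self, page):
        """Move a page to XSTOR, spilling the oldest XSTOR page to DASD."""
        self.xstor[page] = True
        if len(self.xstor) > self.xstor_frames:
            oldest, _ = self.xstor.popitem(last=False)
            self.dasd.add(oldest)

    def location(self, page):
        """Return the tier that currently holds the page."""
        if page in self.real:
            return "real"
        if page in self.xstor:
            return "xstor"
        if page in self.dasd:
            return "dasd"
        return "absent"


# Hypothetical sizes: 2 real frames and 1 XSTOR frame, four pages touched once.
store = TieredStore(real_frames=2, xstor_frames=1)
for page in "ABCD":
    store.touch(page)
print({p: store.location(p) for p in "ABCD"})
# {'A': 'dasd', 'B': 'xstor', 'C': 'real', 'D': 'real'}
```

In this model the pages touched longest ago sink furthest down the hierarchy, which mirrors the order of movement described in the text: first to expanded storage, and to DASD only once expanded storage is full.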
Figure 1 provides a simplified view of the z/VM memory management mechanism, showing some inactive virtual storage pages in each Linux guest. These inactive virtual memory pages must be recovered for use by other guests, whether Linux-based or not.