IT Management

Feb 12 ’09

Here’s a definition of virtualization from Wikipedia:

In computing, virtualization is a broad term that refers to the abstraction of computer resources. Abstraction is the process or result of generalization by reducing the information content of a concept or an observable phenomenon, typically in order to retain only information which is relevant for a particular purpose.

There you have it. That definition is a bit abstract, so let’s make one of the hottest topics in computing a little more concrete.

I was first exposed to computer virtualization when I joined IBM after graduating from college in 1973. An instructor told me to “just think of virtualization as being able to carry 10 pounds of sand in a five-pound bag.”

From that point, I went on to learn in great detail how a hardware Dynamic Address Translation (DAT) box enabled the processor to operate as if it had more main memory than was physically attached. Relating to the bag of sand analogy, the value proposition was that it was easier for programmers to create 10 pounds of sand (code) when the physical capacity of the computer was limited to five pounds of sand. All that was needed was an auxiliary sand bag and a DAT box to switch grains of sand between bags in a way that was transparent to the executing program.
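Here’s how that bag-switching looks in a minimal Python sketch. The page size, page-table entries, and frame-selection rule below are made-up stand-ins chosen purely for illustration, not real DAT behavior:

```python
# Toy model of Dynamic Address Translation (DAT): a page table maps
# virtual page numbers to real frames; a page not in real memory is
# fetched from auxiliary storage (the "second bag of sand") on demand.
# All sizes, table contents, and the frame-selection rule are invented.

PAGE_SIZE = 4096                        # hypothetical page size

page_table = {0: 7, 1: 3}               # virtual page -> real frame (resident pages)
auxiliary_storage = {2: "page image"}   # pages currently paged out to disk

def bring_in(page):
    """Simulate pulling a page from auxiliary storage into a free frame."""
    auxiliary_storage.pop(page, None)
    return max(page_table.values(), default=0) + 1   # naive frame choice

def translate(virtual_address):
    """Translate a virtual address to a real address, faulting if needed."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:          # page fault: bring the page in first
        page_table[page] = bring_in(page)
    return page_table[page] * PAGE_SIZE + offset

if __name__ == "__main__":
    for va in (100, 4096 + 8, 2 * 4096 + 5):   # the third access "faults"
        print(f"virtual {va:6d} -> real {translate(va)}")
```

The point is simply that the executing program hands over a virtual address and gets back real storage, never knowing whether the page first had to be fetched from the auxiliary bag.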

OK, just three more definitions to lay a proper foundation: Real means you can see it and it’s really there. Transparent means it’s there but you can’t see it. Virtual means you can see it, but it’s not really there.

Prior to memory virtualization, if a program exceeded real memory capacity, a subset of the code was loaded into executable memory (main memory) as a starting point, the remainder was stored on auxiliary storage (disk or tape), and pieces were subsequently loaded as needed. This meant the application programmer carried the additional burden of creating and loading the program’s segments. When IBM virtualized the mainframe’s operating system 40 years ago, application programmers were liberated from dealing with program segmentation and overlays because the operating system created the illusion of having more memory capacity than actually existed.

At program load, just prior to execution, the operating system divides the program into small pieces called pages and moves them between main memory and disk as execution proceeds. As you can imagine, there’s a performance cost for this illusion. The good news is that for the majority of programs, 80 percent of the activity occurs within 20 percent of the code, which keeps paging overhead tolerable.
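The 80/20 claim is easy to see in a toy simulation. The sketch below compares a skewed reference pattern against a uniform one; the page counts, frame count, and probabilities are invented for illustration, not taken from any real workload:

```python
# Rough simulation of demand paging under an 80/20 reference pattern,
# compared with uniform access, using a least-recently-used (LRU) policy.
import random
from collections import OrderedDict

TOTAL_PAGES = 100   # the program's virtual pages ("10 pounds of sand")
REAL_FRAMES = 20    # frames of real memory ("the 5-pound bag")
HOT_PAGES = 20      # the 20 percent of pages that draw 80 percent of activity
REFERENCES = 20_000

def fault_rate(skewed):
    resident = OrderedDict()        # page -> None, ordered by recency of use
    faults = 0
    for _ in range(REFERENCES):
        if skewed and random.random() < 0.8:
            page = random.randrange(HOT_PAGES)      # hot 20% of the code
        else:
            page = random.randrange(TOTAL_PAGES)    # anywhere in the program
        if page in resident:
            resident.move_to_end(page)              # hit: refresh recency
        else:
            faults += 1                             # page fault: bring it in
            if len(resident) >= REAL_FRAMES:
                resident.popitem(last=False)        # evict least recently used
            resident[page] = None
    return faults / REFERENCES

random.seed(2009)
print(f"80/20 locality : {fault_rate(skewed=True):.0%} of references fault")
print(f"uniform access : {fault_rate(skewed=False):.0%} of references fault")
```

With the skewed pattern, most references land on pages already resident; with uniform access, the same five-pound bag pages almost constantly.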

From these humble beginnings, resource virtualization has become popular to varying degrees on platforms of all sizes. Virtualization techniques are used to manage operating systems, storage devices, and networks. The payoff comes through consolidation and reduced overall complexity, which decrease hardware costs, improve reliability, cut energy consumption, and translate into various operational efficiencies.

The terms logical and virtual are virtually synonymous. For example, on the mainframe a Logical Partition (LPAR) is the virtualization of physical processor and memory resources that ultimately allows multiple operating systems to run on the same machine. LPARs were essentially derived by taking a subset of the Virtual Machine (VM) operating system, which provides software partitioning of resources, and putting that functionality into hardware microcode for better performance.

The predecessor to z/OS is MVS, which stands for Multiple Virtual Storage. MVS excelled at running mixed workloads by allowing each application or subsystem to have its own virtual address space. This remains one of the pillars of the mainframe’s renowned bulletproof security design. It also introduced the notion of horizontal scalability, since each address space is assigned the system’s full range of potential memory addresses. By the way, each address space shares the operating system, which resides in common virtual memory.
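A tiny sketch helps show what “its own virtual address space plus a shared common area” means in practice. Everything below, including the class, frame numbers, and range sizes, is invented for illustration and is not the actual MVS storage layout:

```python
# Sketch of the MVS idea: each address space has private virtual storage,
# but all of them map one common area where the operating system lives.
# Frame numbers and page ranges are arbitrary placeholders.

COMMON_FRAMES = {page: 1000 + page for page in range(4)}   # OS pages, one copy

class AddressSpace:
    def __init__(self, name, first_private_frame):
        self.name = name
        # Private pages map to frames no other address space can reach.
        self.private = {page: first_private_frame + page for page in range(4)}

    def resolve(self, area, page):
        """Return the real frame backing a page in 'common' or 'private'."""
        table = COMMON_FRAMES if area == "common" else self.private
        return table[page]

batch = AddressSpace("BATCH01", first_private_frame=2000)
cics = AddressSpace("CICS01", first_private_frame=3000)

# Both address spaces see the same frame for a common (OS) page...
assert batch.resolve("common", 0) == cics.resolve("common", 0)
# ...but their private pages are backed by different, isolated frames.
assert batch.resolve("private", 0) != cics.resolve("private", 0)
print("common page 0 is shared; private page 0 is isolated")
```

Because the private tables never overlap, one address space has no way to even name storage belonging to another, which is the isolation the security design rests on.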

The mainframe hardware has been expanded over the decades. IBM has effectively put virtualization on steroids, adding features for high-speed communication between address spaces and support for data-only virtual spaces and hiperspaces.

When comparing platforms, remember that not all virtualization is created equal; an underpowered environment will get bogged down. The mainframe has added extensive hardware assist during its 40 years of refinement to turbo-charge hypervisors, which in turn facilitate “virtual guests” in ways that are well beyond the industry’s “next best.” And because IBM owns the entire stack (hardware and software), it will be hard for any other provider to come up with a more tightly knit technology.