Mar 1 ’03

A VM Renaissance: VM and Linux

by Philip H. Smith III in z/Journal

Anyone who has followed the Virtual Machine (VM) Operating System (OS) over the last 20 years knows that times have often been tough for VM and its users. The good news is that VM is undergoing a renaissance. The term “renaissance” is particularly apt in this case. Derived from words meaning “new birth,” renaissance is defined as, 1) a revival of intellectual or artistic achievement and vigor: the Celtic Renaissance, or 2) the period of such a revival. VM has indeed been reborn in a new role.

History

IBM’s VM OS has never followed a predictable path. Created to allow testing multiple copies of OS/360 on a single piece of expensive hardware, VM’s flexibility quickly made it the original personal computing environment. Each user was able to customize his or her virtual machine and run any programs available, without requiring data center assistance or authorization. VM spread to IBM installations across the business spectrum, and was also popular with universities because of its flexibility and efficiency.

In the early 1980s, IBM announced an Object Code Only (OCO) policy for VM. Source code would not be available for future releases. Since VM installations traditionally made extensive use of the source for debugging and local system enhancements, this was disappointing. Customers lobbied IBM and slowed OCO progression, but were unsuccessful until other factors caused VM to drop “below the radar” for IBM executives. After that, source was gradually restored for many modules. Later, IBM announced that future modules would have source provided unless there was a specific asset-protection (trade secret) issue involved.

Even without the OCO battle, parts of IBM seemed to hate its own product at times during the 1980s and 1990s. Customers reported IBM representatives telling them “VM is going away,” and “VM will be unsupported soon.” VM’s support of new hardware often ran months or years behind MVS, which helped bolster its image as a dying product. This occurred even though IBM itself depended heavily on VM, using it for corporate e-mail and many critical systems, including:

However, fear, uncertainty, and doubt (FUD) led many committed VM installations to forge plans to migrate off of VM. Fueling the FUD were the messages received by customer management, combined with the client/server movement’s ascent — and, later, distributed systems. Some sites have succeeded, although many companies are now six to eight years into two-year migration plans, with no end in sight!

Meanwhile, IBM’s downsizing hit the VM team hard. From a peak of more than 1,000 programmers, planners, and support staff, the VM organization has shrunk to fewer than 150. However, as one might hope, the current crew is top-notch, and has achieved amazing results in supporting existing code and adding new function and hardware support.

Another factor in VM’s decline during the 1990s was the shift that occurred as e-mail moved from the mainframe (PROFS/OfficeVision) to the LAN (Microsoft Exchange and Lotus Notes). End users no longer wanted VM accounts; they wanted PCs on their desktop. Attempts to add a VM Graphical User Interface (GUI) were less than successful, due both to performance issues and difficulty mapping 3270 displays to modern GUIs.

During these “decades of doubt,” the VM faithful persevered. Customers at local and national user groups, such as SHARE, continued to find innovative ways to use VM. In some cases, upper management “forgot” that they used VM, and were occasionally surprised to find that critical systems were humming quietly along on this “dead” OS.

The IBM VM team churned out new releases, often annually or even more frequently. As systems grew, and the built-in limit of 16 megabytes (MB) became an issue, the VM/SP High-Performance Option (VM/SP HPO) emerged. HPO was VM/SP with improved scheduling and the ability to use up to 64MB of real memory, although the storage above the 16MB line was used only for high-performance page space.

From the mid- to late-1980s, VM/SP was the choice of small shops, and VM/SP HPO was used on large systems. The 16MB line quickly became a serious limitation for all mainframe OSes and IBM responded with the 3090 — the first machines featuring 31-bit address support for up to 2 gigabytes (GB) of real storage.

With the advent of 31-bit hardware, VM’s origins as a test platform for MVS came to the fore again, with releases of:

These systems provided only rudimentary versions of the VM end-user environment, lacking support for much beyond system maintenance and operation, and were intended only to support MVS guests.

Finally, 1988 brought a “real” 31-bit VM: VM/XA System Product (VM/XA SP). VM/XA SP’s early days were rocky. Considerable new code had been written, and despite the usual extended testing inside IBM, some significant problems existed. Worse, some problems were with system maintenance tools themselves, which had also been improved to support the newer, more complex operating environment. Many VM system programmers from that era remember with horror having to repeatedly rebuild their entire VM maintenance system after attempts to apply service left it in an unusable state. With much hard work by IBM, SHARE, and IBM customer councils, the VM maintenance philosophy and toolset matured and stabilized.

But even VM/XA SP was a stopgap. Some Conversational Monitor System (CMS) and Control Program (CP) facilities present in VM/SP were still not present (such as IUCV), and so some shops were forced to run VM/SP HPO awhile longer. Meanwhile, IBM was hard at work on the next hardware generation, Enterprise Systems Architecture (ESA). After three releases of VM/XA SP (1.0, 2.0, and 2.1), IBM released VM/ESA 1.0. This new version converged functionality of VM/SP and VM/XA, providing a 31-bit CMS with all of VM/SP’s features and facilities.

VM/ESA quickly became the VM. Of course, a few installations couldn’t upgrade due to obsolete hardware, or chose not to for budgetary reasons, but uptake was quick.

In 1996, VM/ESA received a version change, with the release of VM/ESA 2.1. Through four releases, VM/ESA V2 achieved remarkable stability. VM installations typically measured up-time in months, with most outages caused by hardware changes, not software problems. VM systems, like their MVS brethren, had achieved Reliability, Availability, and Serviceability (RAS) goals any CIO could be proud of. Contributing factors to this included:

Many VM users felt it was too late. Intel-based machines had taken the desktop from CMS, and Intel- and Unix-based servers were making great inroads in the back office. The Year 2000 (Y2K) problem put a brief spotlight on mainframes again, although often not a flattering one. The reality was as mainframe folks expected: most of the work required minor and easily identified code changes, and Y2K was largely a non-event.

The Turnaround

Throughout 1999, a new phenomenon was quietly awakening the VM community. A loose coalition of VM and Linux users decided that the traditional strengths of the mainframe, coupled with Linux’s flexibility and power, would be a marriage made in heaven. After discussion on the VM/ESA mailing list, they started working and, after several months, had a bootable kernel called “Bigfoot.”

Some of those involved with the project had an ulterior motive. They knew that IBM’s Böblingen, Germany team had started work on Linux for System/390, but internal politics blocked them from releasing it. Some Bigfoot team members wanted to force IBM to release their port, thus making it official. In January 2000, IBM released Linux for System/390. Although it was somewhat less than a commercial distribution (users had to download it from a site at Marist College), it legitimized the concept, and the Bigfoot project was quickly abandoned.

The revolution was quiet. No front-page stories in Computerworld resulted (though some trade press did carry the story). Microsoft stock didn’t plunge. But within weeks, the System/390 port was downloaded hundreds of times and a vibrant Linux/390 community was growing.

Then, in February 2000, LinuxPlanet’s Scott Courtney wrote an article about an experiment in running multiple copies of Linux on one VM system. A consultant with access to a spare system spent a weekend building 41,000-odd Linux copies, each running an Apache Web server — and it worked! Admittedly, it was a stunt. The pages served were static, and the load was artificially generated and relatively minimal. But it proved the point. A single VM system could successfully replace multiple servers, using multiple copies of Linux.

Throughout 2000, IBM’s commitment to Linux grew. With the introduction of zSeries hardware in the fall of 2000, “Linux for System/390” became “Linux for System/390 and zSeries” — demonstrating that IBM had serious plans for Linux on the mainframe. In December, then-CEO Lou Gerstner announced that the company would spend $1 billion on Linux in 2001, and more in subsequent years.

Challenges and Opportunities

If the mainframe’s only payoff for Linux was yet another hardware platform, there would be little point in having done the port. Mainframe MIPS are far more expensive than Intel megahertz (MHz), so Linux on zSeries would just be an expensive solution to a problem nobody had posed.

Linux on VM’s real value lies in server consolidation — the ability to migrate dozens or hundreds of servers from distributed hardware to virtual machines running on a single z/VM system. A typical distributed environment includes spare machines for fail-over due to hardware or software problems, and “best practices” recommendations require planning for peak loads. The result is often tens or hundreds of machines running at less than 10 percent capacity, with attendant power, floorspace, cooling, networking, and administration requirements. With a single VM system, “spare” machines (extra Linux guests) require minimal additional resources. By sharing the system with multiple applications, all unlikely to peak simultaneously, you can safely over-commit hardware.
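The over-commit argument comes down to simple arithmetic, which can be sketched as follows. This is a rough illustration, not a sizing method: the function name, the `headroom` margin, and the idea of measuring capacity in "one distributed machine" units are all assumptions made for the example, and real sizing must account for the difference between mainframe and Intel processor capacity.

```python
import math


def consolidation_estimate(servers, avg_utilization, headroom=0.5):
    """Estimate the capacity needed to host a consolidated workload.

    The result is expressed in units of one distributed machine's capacity.

    servers         -- number of distributed machines being consolidated
    avg_utilization -- average busy fraction of each machine (0.0 to 1.0)
    headroom        -- hypothetical safety margin on top of the combined
                       average load (0.5 means 50 percent extra)
    """
    if not 0.0 <= avg_utilization <= 1.0:
        raise ValueError("avg_utilization must be between 0 and 1")
    if headroom < 0:
        raise ValueError("headroom must not be negative")
    # Workloads that rarely peak together can be sized near the sum of
    # their averages rather than the sum of their individual peaks.
    combined_average = servers * avg_utilization
    return math.ceil(combined_average * (1.0 + headroom))


if __name__ == "__main__":
    # 100 machines, each about 10 percent busy, with 50 percent headroom.
    print(consolidation_estimate(100, 0.10))
```

The distributed approach provisions every machine for its own peak; the shared approach provisions once for the combined load, which is why lightly used servers are the best consolidation candidates.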

Reduced Total Cost of Ownership (TCO), attributable to savings on floorspace, cooling, networking, and administration, makes a compelling case for Linux on VM, even without considering Linux’s low acquisition costs. With the large number of tools and applications available for Linux, Linux under VM can replicate almost any distributed systems infrastructure.

The ability to easily clone an existing server for testing is also attractive. With deployment times for a new machine measured in seconds rather than days, developers and support staff can be much more responsive to problems. For example, if a Web-based application running on Apache on Linux has a problem, creating a “sandbox” for testing a fix is a matter of copying the existing server’s disks to a new virtual machine. If the underlying data is read-only, the process is even simpler. With no copying required, you can create a new virtual machine with links to the data. For companies with widely varied applications, this capability alone can justify exploring Linux on zSeries.

IBM quickly realized that, for Linux on zSeries to become the choice of CIOs, several hurdles needed to be overcome, including:

The $1 billion Linux commitment let IBM respond to these in a fashion atypical of the Big Blue of old.

Evolution Follows Revolution

While Linux on VM offers excellent value now, there’s room for improvement, and both VM and Linux are evolving.

Networking is an area of relative complexity. Linux guests typically use TCP/IP, and each is its own IP host. On VM, the entire system is traditionally a single host, with individual virtual machines providing services on specific ports, managed through VM’s TCP/IP stack. If a guest wishes to provide its own TCP/IP stack, it either owns a real Open Systems Adapter (OSA) port, or connects to VM’s stack via a virtual channel-to-channel adapter.

For dozens or hundreds of Linux guests, this is unwieldy at best. The VM TCP/IP service isn’t designed for dynamic reconfiguration, so adding and deleting guests on-the-fly is difficult. Managing a separate channel-to-channel adapter for each Linux guest is tedious.

z/VM 4.2 introduced Guest LANs and virtual HiperSockets. Like other real hardware virtualized by VM, these are constructs, built and managed by VM’s CP component, which emulate real LANs and real HiperSockets. Linux images running on VM are unaware whether the communications devices they’re using are real or virtual. One clue, should a guest care, is throughput. Since Guest LANs run at memory speed, they offer throughput of several gigabits per second!

Using virtual HiperSockets, tens, hundreds, or thousands of Linux guests can interoperate on a single z/VM system simply by plugging into the Guest LAN. Connectivity to the outside can be achieved through a real OSA or other device attached to a TCP/IP stack on VM or to a Linux guest, acting as a router. To configure Guest LANs, you can use the CP command, the SYSTEM CONFIG file, and individual users’ virtual machine entries in the VM user directory.

Guest LANs quickly proved a perfect fit for Linux on VM. Guest LANs make it trivial to connect Linux guests to internal and external TCP/IP networks. Their throughput means network latency between guests is almost zero, and collisions and dropped packets are nonexistent. When the other end of a connection is an MVS (OS/390, z/OS) system running on the same physical system, the high speed and reliability of the connection are even more attractive.

z/VM 4.3 improved Guest LAN support to enable services such as Dynamic Host Configuration Protocol (DHCP) and Samba, which require IP broadcast support.

An interesting cultural issue has been that IBM considers HiperSockets and QDIO technology proprietary, so the Linux drivers for these devices are OCO. This doesn’t sit well with the Linux faithful, but IBM has responded quickly to any problems with the drivers and issues have been minimal. Since all other mainframe-specific changes have been submitted to the Linux kernel maintainers per normal practice, it’s clear that IBM is truly joining the Linux community.

z/VM 4.3 introduced the CP SIGNAL command, which lets a privileged user remotely tell Linux guests to shut down. Linux kernel patches exist to respond appropriately to this signal. Since Linux, like other Unix variants, uses a default file system that must be partly rebuilt after an ungraceful shutdown, this eases managing multiple Linux guests on a VM system.

Linux is learning how to share nicely with others. Since Linux never had to deal with a shared environment, it was designed to use all available real hardware resources. When running on VM, that real hardware is actually virtual, shared with other guests. It makes sense for Linux to use all Random Access Memory (RAM) for file cache when running on an Intel system. However, it’s counterproductive for it to use virtual memory for file cache. Both VM and its underlying DASD subsystem are better positioned to make intelligent decisions about what data should be cached; they also do it more efficiently. Making Linux guests smaller often helps performance by reducing overall system paging load.

Linux was designed to wake up every 10 milliseconds (or every millisecond, on later kernels) to look for work. Under VM, this means that idle Linux guests are never truly idle. They wake up frequently and thus are kept in queue and have large working sets (their in-use storage cannot be paged out). A patch exists to suppress this periodic wakeup and efforts are under way to make the Linux scheduler more intelligent about shared environments and to resolve other issues such as use of standard label tapes.
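To see why the periodic wakeup matters at scale, the following sketch multiplies it out. The function name and the guest count are hypothetical; only the 10-millisecond and 1-millisecond intervals come from the text above.

```python
def idle_wakeups_per_second(guests, tick_ms):
    """Return the timer wakeups per second caused by idle guests.

    guests  -- number of idle Linux guests on the VM system
    tick_ms -- interval between timer wakeups in each guest, in milliseconds
    """
    if guests < 0:
        raise ValueError("guests must not be negative")
    if tick_ms <= 0:
        raise ValueError("tick_ms must be positive")
    ticks_per_guest = 1000 / tick_ms  # wakeups per second in one guest
    return guests * ticks_per_guest


if __name__ == "__main__":
    # A hypothetical system with 500 idle guests.
    for tick in (10, 1):
        rate = idle_wakeups_per_second(500, tick)
        print(f"{tick} ms tick: {rate:.0f} wakeups per second")
```

Each of those wakeups makes a guest that is doing no useful work look active to VM, which is why suppressing the timer lets truly idle guests drop out of queue.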

Culture Clash

A cultural issue far larger than OCO drivers is the gulf between the groups who must be involved in any Linux project — mainframe, distributed systems, and networking. For decades, these groups have been somewhat at odds, as each has felt the others periodically encroaching on their territory. Sites considering Linux on the mainframe must recognize this and often require management mandate to force cooperation.

Age and terminology can be stumbling blocks. When the grizzled VM system programmer asks the Linux kid, “How much storage do you need for this guest?” and the answer is, “Oh, six gig oughta do it,” they’re speaking different languages. One means memory; the other means disk space. Even there, the terms are different: DASD vs. disk, cylinders vs. gigabytes. This terminology confusion isn’t impossible to overcome, but can engender frustration, particularly during early stages when there’s no common language available.
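A small conversion helper can bridge the cylinders-versus-gigabytes gap. The sketch below assumes 3390-type DASD geometry (15 tracks per cylinder, 56,664 bytes per track), which the article does not specify; it uses decimal gigabytes and raw capacity, so the space usable by Linux after formatting will be somewhat lower. The function and constant names are illustrative.

```python
import math

# Assumed 3390 geometry; other device types differ.
BYTES_PER_TRACK = 56_664
TRACKS_PER_CYLINDER = 15
BYTES_PER_CYLINDER = BYTES_PER_TRACK * TRACKS_PER_CYLINDER


def cylinders_for_gigabytes(gigabytes):
    """Return the whole 3390 cylinders needed to hold the given raw size.

    gigabytes -- requested disk space in decimal gigabytes (10**9 bytes)
    """
    if gigabytes < 0:
        raise ValueError("gigabytes must not be negative")
    return math.ceil(gigabytes * 10**9 / BYTES_PER_CYLINDER)


def gigabytes_for_cylinders(cylinders):
    """Return the raw capacity of the given cylinder count in decimal GB."""
    if cylinders < 0:
        raise ValueError("cylinders must not be negative")
    return cylinders * BYTES_PER_CYLINDER / 10**9


if __name__ == "__main__":
    # The Linux kid's "six gig", restated in the system programmer's terms.
    print(cylinders_for_gigabytes(6), "cylinders")
```

Agreeing on one such conversion early, and on whether "storage" means memory or disk, removes a whole class of misunderstandings from guest sizing discussions.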

The groups’ values also often differ. The mainframe community values process, structure, control, and reliability. They’re used to folks having sharply defined areas of responsibility, and believe in exercising careful change control and compatibility testing, with a focus on the strategic applications whose operation cannot be disrupted. They understand that hardware is expensive, and must be used efficiently and completely, and that users and applications must be “good citizens,” sharing the system with others.

The Linux people value entrepreneurial spirit and the flexibility it affords, leading to more experimentation. Since Linux is so young and evolving so fast, Linux folks expect to know a little bit about many things (e.g., networking), and assume that programs will usually be free to interoperate and have source code.

Typical complaints from Linux advocates include:

It’s easy to see how these complaints arise. Neither group is right or wrong and both can learn from each other. It’s important for them to figure out why they’re saying what they’re saying.

For mainframe folks, this means realizing that this is, indeed, a brave new world and that there isn’t always a well-known set of time-tested procedures for a given situation. The Linux group must recognize that more than 30 years of mainframe operations have resulted in best practices worth examining. Systems programming isn’t a matter of doing things the way you like to do them, or the way that’s most convenient. It’s doing things right, where “right” is defined by those who have come before, and have scars from past mistakes to prove it.

Another area of conflict lies in documentation. IBM has, for decades, produced the best documentation in the computing world. IBM documentation is generally so complete and coherent that third-party books on IBM topics are relatively rare. Linux folks, on the other hand, are used to having three levels of documentation:

So, when Linux users are first exposed to the mainframe, their first reaction often is, “Where’s the documentation?” They find few how-to materials. Then they visit Barnes & Noble and don’t find any books. If they even have access to VM source, they find that it’s written in Assembler, with which they’re unfamiliar! No wonder they’re frustrated. Having paper manuals handy and files in Adobe Acrobat PDF is essential for helping folks learn to navigate the IBM documentation.

The mainframe team is often surprised at the poor quality of Linux documentation. Books accompanying commercial Linux distributions are often wildly incomplete and poorly indexed.

Internet sites such as Slashdot, LinuxVM, LinuxDoc, and IBM’s VM and Linux on zSeries Resources pages can help answer questions and resolve specific problems. The Linux-390 mailing list, hosted by Marist College, is also helpful. IBM and other vendors offer classes on VM and Linux on zSeries. A more economical choice is attending SHARE, the national user group for IBM users. SHARE meets twice annually, each meeting offering a full week of sessions on various topics. Refer to the resources section for more details.

Installations with separate networking groups and mainframe groups often encounter problems integrating the Linux guests with the rest of the network. These are not technical issues, and can be overcome with effort on both sides. Networking folks are used to the mainframe being a single host. With Linux on the mainframe, there are suddenly whole networks “hiding” inside the zSeries black box. Since the networking folks cannot see or manipulate the cables, routers, etc., they’re often uncomfortable with this situation.

This can lead to difficulties setting up guests. External routers must be reconfigured to add new routes; traffic over the OSA or other real hardware connecting the mainframe to the LAN may increase dramatically, etc. As with any networking project, it helps to draw a diagram. This lets the network group see that there isn’t anything dramatically new. The new LANs are inside the box and may come and go without external intervention. It’s helpful to show the network team the output of VM NETSTAT and CP QUERY LAN commands, so they can see that it’s similar to what they’re used to.

Mainframe Linux in Action

So how many organizations have worked through the cultural and technical issues, and migrated important applications to Linux on zSeries? Numbers are difficult to pinpoint. Some companies are reluctant to share information because they feel they’ve gained competitive advantage and don’t want competitors to follow them.

Linux on zSeries success stories include:

Universities, a traditional bastion of both VM and Linux, are also heavily using Linux on zSeries. Marist College uses it for classes. Each student can be given a virtual machine, and can build and modify the Linux environment without affecting others.

Several financial organizations on Wall Street are known to be running quiet but extensive pilot programs, as are some large insurance companies. The expected early uses of mainframe Linux were infrastructure applications: Web, file, and print serving, e-mail, etc. The big surprise is that, while these applications are frequently seen, usage has quickly expanded to include business applications such as WebLogic, WebSphere, and DB2 Connect. This suggests customers are realizing the value of Linux on zSeries even faster than Linux fans hoped!

Conclusion

With Linux for zSeries and System/390, VM was indeed reborn, although its new role, hosting multiple copies of another OS, is strikingly similar to that for which it was first created. No longer is VM the “poor cousin” to its MVS-derived brethren, with hardware support lagging by months or years. In fact, VM has supported the latest IBM hardware advances ahead of z/OS!

Linux on the mainframe forces mainframe, distributed systems, and networking teams to converge. As the groups learn from each other, companies will realize value far beyond that which Linux itself brings.

Linux is young and dynamic. Employees are finding something new to learn and enhancing their value while adding to their companies’ bottom lines. Linux is a part of every IT installation’s future. With z/VM’s added value, Linux on zSeries is likely to be a part of the most successful companies’ futures!