Sep 1 ’03

Linux Under z/VM: CHALLENGES & OPPORTUNITIES

by Philip H. Smith III in z/Journal

As anyone who has read a trade magazine in the last year knows, Linux is making inroads in the enterprise on multiple platforms, especially on IBM zSeries hardware. This article explores assorted positives and perils of Linux on zSeries, from both a business and technical perspective.

WHY LINUX ON zSERIES?

The combination of horizontal and vertical scalability, hardware reliability, and I/O bandwidth offered by the mainframe improves Linux’s already significant appeal, particularly for installations with zSeries systems installed or that have used mainframes in the past.

However, unlike an Intel-based install, which entails taking an unused machine, downloading a free copy of Linux, and installing it, mainframe Linux typically requires more planning and discussion. This is not due to weakness or deficiency in mainframe Linux; rather, it reflects the critical nature and rigid change-control policies typically in force in mainframe installations. These policies comprise a set of checks and balances that operate similarly in concept to those in most systems of government: While they make some things more difficult, through the same mechanisms, they ensure greater reliability and integrity of the process.

It is also true that there are relatively few public success examples of Linux on zSeries. IBM literature talks about Winnebago, Boscov’s Department Stores, Korean Air, and a handful of others. But unless you happen to build recreational vehicles, run a department store chain, or offer commercial air travel, it is easy to dismiss these as not being relevant to your business. Other success stories exist, but are “cloaked” under nondisclosure: These businesses recognize that their exploitation of Linux constitutes a competitive advantage at this stage of Linux market penetration, which they want to preserve.

SCALABILITY

While the proven reliability and I/O bandwidth of IBM mainframe hardware (z990 machines offer up to 512 channels, each with up to 256 devices) add value, the ability to scale both horizontally and vertically is the biggest benefit Linux on zSeries offers over Linux on other platforms.

The modern data center has evolved from the centralized “glass house” with a small number of mainframes to racks of so-called “small” or “distributed” servers, often numbering in the hundreds. This offers numerous advantages and disadvantages (see Figure 1).

Modern, well-managed data centers have noticed that these “inexpensive” machines are not that inexpensive after all. But the past is gone: Other factors, particularly growth of connectivity enabling inexpensive, reliable data sharing across long distances, mean that the traditional centralized, tightly controlled, isolated data center is usually not a realistic alternative.

BEST PRACTICES: LINUX WITH DATA CENTER RELIABILITY, AVAILABILITY, AND SERVICEABILITY

The 40-plus years of industry experience in the mainframe world have produced a set of data center “best practices,” which are processes and procedures that are often not well-understood in the distributed world. Issues such as religiously followed backup procedures, disaster recovery testing, complete audit trails, thorough testing, change management, and robust security policies have proven their value repeatedly in the mainframe environment, and translate well to Linux. While perhaps it’s difficult to directly quantify their value, these practices have proven themselves, and applying them to Linux systems only further improves Linux’s value.

WHERE THE MONEY GOES

Savvy CIOs who have analyzed their IT budgets know that cost allocations have shifted over the years. Typically, costs are divided into four areas: hardware, software, facilities, and staff.

Twenty-five years ago, when a single machine of any power cost millions of dollars, required water-cooling, and could run (at most) a handful of simultaneous applications, hardware and facilities were a significant portion of annual data center costs. Software was largely developed in-house or even provided free by IBM, and changed relatively slowly. Staff costs were fairly predictable and static: Upgrading to a bigger machine usually did not entail adding staff, unless dramatic application changes were also undertaken.

Today, the equation has changed. In any data center, staff costs are by far the biggest piece of the pie. Hardware has become much cheaper, and the data center can get by with less elaborate building facilities. However, software costs have increased dramatically: Not only are products from Independent Software Vendors (ISVs) increasingly expensive, but software’s fundamental pricing model has evolved to include fees based on machine size and number of “seats” (users).

These changes dramatically affect total-cost-of-ownership equations. Not only are costs in different areas, but they are in different categories: Hardware and software purchases are capital costs, whereas facilities (once built) and staff are operational costs. Operational costs are often easier to bear, since they come in small “bites,” unlike multi-million-dollar hardware or software purchases (although at $7,000 per square foot, Tokyo data center space drives up operational costs pretty quickly!).

Distributed computing has also led to growth of rogue or departmental servers: Real or perceived data center unresponsiveness leads a group needing a particular application to just buy a machine and get it running — never mind that doing so often involves significant amounts of someone’s time to install and maintain the machines. Unofficial support time being unallocated and untracked doesn’t make it free, and it usually comes at the expense of a real job not being done.

Machine-size pricing and applications available for multiple systems also contribute to this problem: When a group requesting an application on the mainframe hears that the software alone will cost $100,000 because the software is priced according to the size of the entire mainframe, they often switch to a “cheaper” version on a distributed server.

LINUX TO THE RESCUE

Linux can help with several of these issues, and Linux on zSeries can help with them all.

HARDWARE

Linux, still a relatively small system, runs effectively on low-end hardware. Many installations have earned their Linux stripes by installing a downloaded copy on an unused 486 or low-end Pentium that was gathering dust, proving viability of the intended use, and then moving it to larger platforms (or not!) as the need arose.

While zSeries hardware is hardly “low-end,” its superior resource controls mean that virtualizing Linux under z/VM on zSeries often dramatically reduces raw MIPS required. (Linux can, of course, run in an LPAR, but the fixed hardware allocation and restrictions on the number of images usually make this unpalatable for other than short-term use.) Some analysts estimate that the average distributed server runs at less than 10 percent utilization, with the rest of the processing power reserved for growth or peak usage. Anyone who has stressed even a high-end Intel machine knows that when the machine is busy performing two or three tasks, there’s little point in trying to get it to do anything else — you just have to wait.

On the other hand, well-run mainframes often run at 100 percent utilization while providing excellent response to thousands of applications and interactive users. This means that server consolidation using Linux under z/VM can allocate resources using that 10 percent factor, leaving a global pool of extra resource for peak loads, and can measure and tune appropriately to provide required service levels.
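The sizing argument above can be made concrete with a little arithmetic. The following is only a rough sketch: the server count, the abstract capacity units, and the peak headroom fraction are hypothetical values chosen for illustration, and only the 10 percent average utilization comes from the estimate quoted earlier.

```python
def consolidated_capacity(servers, capacity_per_server,
                          avg_utilization=0.10, peak_headroom=0.50):
    """Estimate the shared capacity needed to host `servers` workloads.

    avg_utilization is the 10 percent figure quoted in the text.
    peak_headroom is a hypothetical shared reserve for peak loads,
    expressed as a fraction of the average demand.
    """
    installed = servers * capacity_per_server
    average_demand = installed * avg_utilization
    shared_pool = average_demand * peak_headroom
    return {
        "installed": installed,
        "average_demand": average_demand,
        "with_shared_pool": average_demand + shared_pool,
    }


# Hypothetical farm: 200 servers, each rated at 1.0 abstract capacity unit.
estimate = consolidated_capacity(servers=200, capacity_per_server=1.0)
print(estimate)
```

The capacity units here are deliberately abstract. Processing power is not directly comparable across architectures, so a real study would measure actual workloads rather than rely on a single ratio.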

SOFTWARE

Linux, of course, is free to acquire, as are many Linux applications. Even vendor applications for Linux typically have cheaper pricing models than traditional ISVs, based partly on the fact that the market won’t (yet?) support the older schemes, and partly on the vast numbers of free, open source applications. While many installations prefer vendor-supported products for reasons of support and service, the existence of alternatives means that, for once, installations often really can abandon a vendor perceived as rapacious. And, since “Linux is Linux,” almost any application that runs on an Intel machine will run on zSeries.

BUILDING FACILITIES

Smaller, cheaper hardware typically requires less in the way of facilities, although when Linux is being installed on Intel machines to replace Windows applications, the common distributed “one application per machine” approach works against this principle. On the other hand, when hundreds of Intel servers are consolidated onto Linux virtual machines on zSeries, facilities savings can be significant for the following items:

STAFF

The experienced Windows administrator who installs a single Linux machine is often bewildered by promises of administrative savings. Like any system, a single Linux machine requires administration, and while plenty of Linux administrative tools exist, they are not necessarily easier to use than Windows equivalents, particularly given the inevitable learning curve.

However, longer-term savings inherent in Linux’s proven security compared to Windows mean that Linux machines typically require less “care and feeding” than an equivalent number of Windows machines. When Linux machines are virtualized under z/VM on zSeries, the ability to perform maintenance to multiple machines simultaneously also dramatically reduces effort.

ACQUISITION

So, is Linux free or not? Linux for zSeries distributions are available for download, but do not include support. SuSE and Red Hat offer distributions with priced support; IBM offers support, but is not a Linux distributor. If an installation has experience with Linux on other platforms, a preference may exist based on the tools available with a given distribution.

If no Linux experience exists, the best advice is to visit LinuxVM at www.linuxvm.org. This site includes links to various distributions (and other resources), and should help you make an informed decision.

Whether to pay for support is a complex question, and depends largely on the skill level of your installation staff. Perversely, support is often most needed during the pilot phase, when management is least willing to pay for it. This suggests that, if possible, acquiring a distribution with a paid but time-limited support license may help ensure success.

In any case, the CIO needs to know, “How much will Linux save us?” As always, the answer is, “It depends.” In some cases, it may be possible to justify a Linux project purely based on savings on capital expenses, facilities, and software licenses. Others may require estimates of staff time saved applying service individually to dozens or hundreds of machines, or the value of improved reliability, availability, and serviceability, all of which can be difficult to quantify. No matter what, beware of generating numbers that seem “too good to be true” (even if often accurate)!

VENDOR SUPPORT

Any company has vendor applications that are critical to its operations. Unfortunately, not all of these vendors offer Linux versions, and some offer Linux versions, but not for Linux on zSeries.

Ask about vendor plans for Linux support and for Linux on zSeries support. Since “Linux is Linux,” the hardest part of porting most applications to zSeries is access to hardware; current mainframe installations may be able to work with their vendors to provide testing access in exchange for license concessions, further improving the value proposition of a migration. In more than one instance, merely asking about a zSeries version of an application has prompted a vendor to create one in a very short time.

It’s also important to analyze whether the product or the function is the critical item. Often alternatives exist. You need to take into consideration whether the cost of migration — the disruption, training, client reconfiguration, etc. — is worth more or less than the return. The term “best-of-breed” has been overused in the computing industry; “good enough” is often sufficient. Therefore, even if an alternative product lacks features or functions, it may be sufficient to justify a migration. Be aware, however, of “hidden uses,” where groups or departments use products in unanticipated but important ways. These can completely derail a migration if not ferreted out and understood in advance.

APPLICABILITY

When asked, “Why aren’t we using Linux?” the usual response is, of course, the question, “What should we use it for?” The good and bad news is that there are many answers to this; the good news is that there’s a fair chance that several choices make sense for you. The bad news is that you have to winnow those choices!

Organizations use Linux on zSeries in varied ways, and usage is rapidly evolving. Eighteen months ago, common uses were “infrastructure” applications such as Domain Name Server (DNS), file and printer sharing, e-mail, etc. While these are still excellent choices, static (HTML) and now dynamic Web serving (IBM WebSphere, BEA WebLogic, et al.) quickly entered the picture, and now applications of all types are available.

Vendor products such as IBM DB2 Connect, Oracle, and mail solutions from Bynari and Steltor (now Oracle Collaboration Suite) have made strong inroads. Common open source applications include the Apache Web server with its plethora of add-ons, and Samba for file and printer sharing.

Poor choices for mainframe Linux are CPU-intensive applications. Intel MIPS are far cheaper than zSeries MIPS, and are better choices for such uses.

IBM maintains a list of vendor products available for Linux on zSeries (see the sidebar for more details).

LINUX AND z/VM: A SOLID MARRIAGE

As discussed earlier, virtualizing Linux machines under z/VM on zSeries offers great promise for cost savings. Like any such marriage, however, it introduces issues that must be worked out for a harmonious relationship.

VM network configuration is always a consideration, not because it is deficient or particularly difficult, but because it’s unfamiliar. The process differs from what Linux folks know. VM staff have traditionally not rearranged their networks all that often and are unfamiliar with the process. In addition, once VM itself is set up, external routing must be configured.

With z/VM 4.3, an IFCONFIG command similar to what Linux staff are accustomed to was introduced, which greatly eases network configuration. Other z/VM TCP/IP enhancements further help dynamically configure the network. This is particularly important given the increasing rarity of non-TCP/IP-attached terminals: If a z/VM TCP/IP configuration error occurs, it is often necessary to gain access to the system console to restore a previous, working configuration.

In most installations, the networking group considers the mainframe an “endpoint” on the network, with a single IP address. When a block of addresses is requested for Linux guests to be created, the result is often an extended discussion, with iterations required before routing is correctly set up. It is often best to gather all affected staff, draw the proposed network on the board as if Linux guests were separate machines and, once it’s agreed on, only then draw a box around the outside, showing that the guests are all inside the mainframe. This usually allows everyone to understand the proposed layout. The same approach is effective when a Storage Area Network (SAN) will be involved with a mainframe Linux project.
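Before that whiteboard session, it can help to write down which guest gets which address from the requested block. The sketch below uses Python's standard ipaddress module; the guest names are hypothetical, and 192.0.2.0/28 is a documentation-only range standing in for whatever block the networking group actually assigns.

```python
import ipaddress


def plan_guest_addresses(block, guest_names):
    """Assign one host address from `block` to each guest, in order."""
    network = ipaddress.ip_network(block)
    hosts = list(network.hosts())
    if len(guest_names) > len(hosts):
        raise ValueError("address block too small for the requested guests")
    return dict(zip(guest_names, (str(host) for host in hosts)))


# Hypothetical guest names; 192.0.2.0/28 is reserved for documentation.
plan = plan_guest_addresses("192.0.2.0/28",
                            ["LINUX01", "LINUX02", "LINUX03"])
for name, address in plan.items():
    print(name, address)
```

This is a simplification: a real plan would typically also reserve addresses for the z/VM TCP/IP stack or a router guest, which the sketch does not model.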

Once configuration and routing issues are understood, however, networking with z/VM is fast and easy. With guest LANs — virtualized network connections within the physical machine, created and operated by the Control Program (CP) — there are no network cards to install, no wires to plug in. Throughput is at memory speeds — about 2.4GB per second on older machines, and even faster on z990s. And, with CP managing the connectivity, there are no collisions or dropped packets!

CULTURE WARS

The biggest challenge for any Linux on zSeries project is the inevitable “culture clash” between mainframe and other IT staff. For more than 20 years, these two groups have, in most shops, eyed each other warily across a gulf. Mainframers have snorted derisively at the “toy machines,” while the distributed staff laughed about gray-haired dinosaurs with their line-mode interfaces. Existing Unix and Windows personnel have often waged war in an effort to protect their turf, and the network group is often also separate with its own prejudices.

Suddenly, a zSeries Linux project requires these groups to work together for a common goal — one that provides a learning experience for all of them. Even something as seemingly simple as terminology becomes a significant issue; some terms are unfamiliar, and some even have different meanings (see Figure 2)!

Recognize that Linux on z/VM implementation requires collaboration and a sense of ownership across multiple teams. Migrating applications from Windows, for example, often involves all of the teams (see Figure 3).  

PAIN FOR LINUX PEOPLE . . .

Linux system administrators are familiar with poorly (or un-) documented code; however, they are used to having code as a reference. In this regard, z/VM itself, for which most source code is provided, is familiar (although few Linux users know S/390 Assembler). What Linux users are not used to is the quantity and quality of IBM documentation. They are used to acquiring documentation as a series of how-to documents from the Web, supplemented by reference books from various publishers.

When confronted with an IBM bookshelf of z/VM manuals alone, the typical reaction is one of mild to moderate shock. “How are we supposed to find anything?!” they cry. Of course, there is a good reason why IBM refers to its technical writing staff as “Information Developers”: IBM has been creating computer documentation longer than anyone, and produces the most usable manuals in the world. Once the organization of IBM documentation is understood, this hurdle is usually overcome.

Still, it is true that gaining VM expertise is not as easy as it once was. VMers in many installations have retired or moved on, and IBM’s VM curriculum has been pared down drastically over the years, although IBM and others are once again introducing new courses. IBM used to publish a “VM Primer” manual. Used copies are still valuable for beginning VMers, and are often available from used-book Websites. Prentice-Hall recently published Linux on the Mainframe (ISBN 0-13-101415-3) by John Eilert, Maria Eisenhaendler, Ingolf Salm, and Dorothea Matthaeus, which may be helpful.

SHARE is also an invaluable resource. A week at SHARE offers a full VM curriculum, as well as the opportunity to talk to dozens of long-time VMers and Linux on zSeries experts — all for far less than most courses. In addition, local user groups, both VM- and Linux-focused, often offer useful sessions or at least contact with peers facing the same issues.

Other mainframe aspects provide more shock for distributed folks: 3270 terminals, 3480 tape cartridges, record formats, serial numbers ... not to mention EBCDIC instead of ASCII! (Being a true Linux, Linux on zSeries does, however, use ASCII.)

For Linux users, some utilities are available on z/VM using the OpenExtensions Shell and Utilities, but most familiar tools such as vi, emacs, diff, and grep are not otherwise usable. Equivalents exist, but still require a learning curve. The z/VM function used by most will likely be the system editor, XEDIT. While very different from traditional Linux editors, XEDIT is very powerful, and some long-time Linux users have even found that they prefer it!

. . . AND PAIN FOR MAINFRAME PEOPLE

One of the biggest issues for mainframe programmers is the fact that the Linux command language is case-sensitive. In addition, people create files and commands in mixed case! Thus, “logout” is not the same as “Logout” or “LOGOUT,” although most would agree that creating commands with similar names is asking for trouble. This is surprisingly hard to learn: Programmers with years of experience find themselves entering a command that they are certain is correct, only to have it fail because of a case issue.
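One way to spot this kind of trouble before it bites is to look for names that differ only in case. The short Python sketch below is illustrative; the list of command names is made up, and the helper function is a hypothetical name rather than part of any standard tool.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that are identical except for letter case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical command names, as they might appear in a directory listing.
commands = ["logout", "Logout", "LOGOUT", "ls", "grep"]

# To Linux, these are different names, so the comparison is False.
print("logout" == "LOGOUT")
print(case_collisions(commands))
```

On a live system, the same check could be run against the output of os.listdir() for each directory on the command search path.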

ASCII is as much an obstacle for mainframe folks as EBCDIC is for Linux folks. Many mainframe veterans have the entire character set memorized in hexadecimal — in EBCDIC. This is somewhat less than useful in an ASCII environment, alas.
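For anyone who wants to see the difference rather than memorize it, Python's standard codecs can display both encodings side by side. The sketch below assumes code page 037 (cp037, US/Canada), which is only one of several EBCDIC code pages; an installation may well use a different one.

```python
def hex_codes(text, encoding):
    """Return the hexadecimal byte values of `text` in the given encoding."""
    return " ".join(f"{byte:02X}" for byte in text.encode(encoding))


# Compare a few familiar characters in ASCII and in EBCDIC (cp037).
for sample in ("A", "a", "0", " "):
    print(repr(sample),
          "ASCII:", hex_codes(sample, "ascii"),
          "EBCDIC:", hex_codes(sample, "cp037"))

# Round trip: text to EBCDIC bytes and back to text.
ebcdic_bytes = "LINUX".encode("cp037")
print(hex_codes("LINUX", "cp037"), "->", ebcdic_bytes.decode("cp037"))
```

The same encode and decode calls are the basis for converting data files moved between the two environments, provided the correct code page is known.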

Free and commercial tools are available to ease the transition to Linux for mainframe users. This includes REXX for Linux (Regina REXX, uni-REXX, and S/REXX), and there are XEDIT and ISPF implementations (THE [The Hessling Editor], S/EDIT, and uni-XEDIT/uni-SPF). Note that THE and Regina REXX are free. For more information, see the sidebar.

On an ongoing basis, Internet mailing lists VMESA-L and LINUX-390 are invaluable for both Linux and mainframe staff. Hosted at Marist College and the University of Arkansas, they offer friendly, helpful peer assistance with z/VM and Linux problems.

LONG-TERM HEADACHES

All of the aforementioned issues, however, are short-term pain, eventually cured through learning. More significant are long-term issues such as DASD, system, and user management.

Each Linux guest requires a new Linux install — typically about 2GB of data. This takes a while to load, and seems wasteful of both DASD (which is cheap these days, but hardly free) and time. There are ways to share significant amounts of Linux data read/only, but they are non-trivial without vendor products or significant Linux experience.

Linux performance management is still somewhat a black art, particularly on zSeries. Besides the fact that the few Linux tuning APIs are poorly documented, there is a larger issue: Linux was built for single-user machines, so it assumes it owns the entire physical system. This can make it a greedy guest.

For example, it uses all available memory to cache file buffers, which is probably not particularly wise since z/VM caches data in memory and so do most hardware controllers. Particularly when read/only file sharing is in effect, this means that a given byte of read/only data could exist in multiple Linux caches as well as z/VM minidisk cache and hardware cache!

The solution is to keep virtual storage sizes for Linux guests as small as possible.  This usually means reducing storage until Linux starts swapping, and then adding a small amount more (or letting Linux swap a bit). Reducing storage this way prevents Linux from using storage for file buffers, which usually isn’t appropriate on z/VM. Linux guest throughput has been drastically improved by reducing virtual storage size — in one case, from 2GB to 64MB!
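A simple way to judge whether a guest is oversized is to see how much of its storage is holding file buffers and whether it is swapping at all. The Python sketch below parses text in the format of the Linux /proc/meminfo file. The sample figures are invented for illustration, and which fields appear can vary by kernel version, so treat this as a starting point rather than a tuning tool.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def storage_report(info):
    """Summarize file cache size and swap use, both in kB."""
    cache_kb = info.get("Buffers", 0) + info.get("Cached", 0)
    swap_used_kb = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "cache_kb": cache_kb,
        "swap_used_kb": swap_used_kb,
        "cache_fraction": cache_kb / info["MemTotal"],
    }


# Hypothetical sample values; on a live guest, read /proc/meminfo instead.
SAMPLE = """MemTotal:  2097152 kB
MemFree:     65536 kB
Buffers:    262144 kB
Cached:    1310720 kB
SwapTotal:  524288 kB
SwapFree:   524288 kB
"""

report = storage_report(parse_meminfo(SAMPLE))
print(report)
```

In this invented sample, three-quarters of the guest's storage is file cache and no swap is in use, which is the pattern the text describes as a candidate for a smaller virtual storage size.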

When idle, Linux is not particularly frugal with CPU. After all, it’s idle, and “unused cycles are wasted cycles.” Of course, on a shared system, this is not something well-behaved guests practice. By default, Linux wakes up every few milliseconds and looks for pending work. The result on z/VM is that an unfettered Linux guest never becomes idle in z/VM terms. Since Linux guests tend to have fairly large working sets (the number of pages of virtual memory actually in use) — typically 100 percent of their virtual storage size — this severely limits the number of Linux guests that can run simultaneously, particularly if virtual storage sizes are not carefully tuned.

The result is common and predictable: Things run fine until a magic number of guests is reached, and then they just seem to stop. At that point, the z/VM Control Program (CP) notices that it is going to overcommit real storage, so it declines to run some of the guests, instead putting them on the “Eligible list.” This is correct behavior: The alternative, with well-behaved guests, is to spend most of the system’s time paging in each guest in turn, running it briefly, then paging it back out so the next guest can run. However, Eligible list processing is based on the assumption that guests will voluntarily become idle periodically, or will at least perform I/O or some other operation that will allow other guests to run. Since an idle Linux guest never becomes idle and does no I/O, the result is that it is the idle guests that wind up using all the CPU!
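The “magic number” can be approximated with simple arithmetic. The Python sketch below is a deliberately crude model: it assumes each guest's working set is a fixed fraction of its virtual storage size (100 percent by default, as described earlier) and counts guests until real storage would be overcommitted. It ignores paging, CP's own storage needs, and everything else the real scheduler considers, and the 8192MB machine is hypothetical.

```python
def guests_before_overcommit(real_storage_mb, guest_sizes_mb,
                             working_set_fraction=1.0):
    """Count how many guests fit before working sets exceed real storage."""
    used = 0.0
    count = 0
    for size in guest_sizes_mb:
        working_set = size * working_set_fraction
        if used + working_set > real_storage_mb:
            break
        used += working_set
        count += 1
    return count


# Hypothetical machine with 8192MB of real storage available to guests.
untuned = guests_before_overcommit(8192, [2048] * 20)   # 2GB guests
tuned = guests_before_overcommit(8192, [64] * 200)      # 64MB guests
print(untuned, tuned)
```

Even in this toy model, the effect of trimming virtual storage sizes on the number of guests that can run at once is easy to see.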

A “patch” for the Linux kernel, known as the “notimer patch,” addresses this problem. It changes the Linux kernel behavior to avoid the wakeup loop when the guest is idle, and increases the number of viable Linux guests on a system by several orders of magnitude.

If Linux must swap, it should use VDISK (virtual disk-in-storage): pseudodisk space implemented via z/VM real memory and paging space. This provides the fastest performance while not allocating fixed amounts of real DASD.

In any case, carefully watch paging: With several large Linux guests, insufficient paging space can result in z/VM ABENDs fairly easily.

Long-term solutions to these issues are in development, but in the meantime, Linux is still perfectly usable, provided these problems are understood and compensated for.

DON’T BE SWAYED BY FEAR, UNCERTAINTY, AND DOUBT!

With a weak economy, a recent war, and headlines about layoffs, many employees are nervous. When they hear someone discuss “server consolidation,” what they hear is “more layoffs.” Other issues, such as the recent SCO lawsuit against Linux and IBM for alleged misuse of Unix source code, as well as Microsoft and Sun publicity about why they believe Windows or Solaris are better, also can discourage consideration of Linux.

Linux on zSeries, however, is an opportunity for companies to save money and provide better service with the same staff (or to grow headcount more slowly), and for staff to learn new, marketable skills. While Linux may not be the right choice for every installation, it’s at least worth evaluating for most enterprises!