May 1 ’03

Linux on zSeries: Leading the Way in Virtual Machine Technology

by Editor in z/Journal

Linux on the zSeries is leading the revival of virtual machine technology, originally developed by IBM in the 1960s, and is supporting breakthrough research in Grid computing for the scientific community.

This article explores how researchers at the University of Florida, Gainesville, and Northwestern University, Chicago, are employing Linux, IBM’s zSeries, and virtual machine technology to create powerful Grid computing systems. At the core of this Grid computing system at the University of Florida is the IBM z800. In addition, both universities are using Intel Pentium-based PCs.

Professors José Fortes and Renato Figueiredo of the University of Florida, and Professor Peter Dinda of Northwestern University, are leading Grid computing research at the Advanced Computing and Information Systems (ACIS) Laboratory (www.acis.ufl.edu/) at the University of Florida. Fortes is the university’s BellSouth Eminent Scholar, a chaired position endowed by the telecommunications company. He and Figueiredo have been doing groundbreaking work on Grid computing for the past seven years, beginning with a project at Purdue University in 1995.

Fortes explained that their original work began when users who were doing research in computer device simulation requested that they be able to test from remote locations.

“We ended up with the ability to share computer resources among many different users who are distributed geographically with resources being in different labs at different locations,” Fortes said.

As Figueiredo explained, “It is research that started in 1995 when Dr. Fortes and I were at Purdue University. At Purdue, we developed this system called PUNCH, which stands for Purdue University Network Computing Hubs. It’s basically a network computing system that lets people run applications through a conventional Web browser interface. It was a portal to our grid resources for running applications.”

Having moved to Florida, where they were offered an opportunity to oversee Grid computing at the ACIS Lab, Fortes and Figueiredo are working to build on their pioneering research at Purdue with a new emphasis on virtual machine technology.

Fortes explained that although the research at Purdue and now at Florida primarily serves the needs of scientific research, Grid computing also plays a role in business applications. For example, a financial institution based in New York might choose to set up an Internet-based grid so that its branch offices in Europe could use the company’s computer resources in New York. This would save the company the expense of purchasing and maintaining hardware and software on both continents.

According to IBM documentation on the z800’s inclusion in the project, the University of Florida’s approach to Grid computing is unique in that it relies extensively on the use of virtualization technology at the machine, network, data, and application levels to dynamically create virtual information grids per user and/or per application. The intended users of this Grid computing approach include worldwide communities of scientists and engineers in nanotechnology and computer science. The university describes the middleware developed by the ACIS Lab as “In-VIGO,” meaning that it enables scientific simulations and design to take place In Virtual Information Grid Organizations.

The University of Florida project will employ the IBM eServer z800 running z/VM, IBM’s latest virtualization software, and will use the IBM Enterprise Storage Server to house the terabyte-level data files involved in cutting-edge scientific research.

Fortes explained that IBM’s decades of research in virtualization are “built-in” to the z800.

“We wanted to be able to have access to a very efficient virtualization technology that clearly demonstrated that what we wanted to do could be done with good performance and functionality,” he said.

IBM’s commitment to the Linux operating system, as implemented on the z800, was critical to Fortes’ project because most scientific research applications are Unix-based and many have now been ported to Linux.

“The superior virtualization capabilities available on the IBM eServer z800, which allow the mainframe to be shared by multiple researchers, each with separate and distinct applications on a single piece of hardware, make it uniquely qualified for research in the Grid computing arena,” said Erich Clementi, general manager, IBM eServer zSeries, in announcing IBM’s involvement in the project. “The combination of IBM’s powerful storage and server technologies will play a key role in the university’s In-VIGO project.”

Grid Computing Meets Budgetary Constraints

In the development of Grid computing, necessity has been the mother of invention as the computational needs for scientific research have outstripped the computing resources available at individual universities and labs.

In tight budgetary times, most universities cannot afford the powerful supercomputers needed for cutting-edge research in areas such as nuclear physics and the human genome. Grid computing links computers, mostly relatively inexpensive Intel-based PCs, over the Internet so that universities can share their resources. This reduces the costs associated with purchasing and maintaining a single, massive computer system.

“In a lot of cases, the scientific community has reached the point where massively parallel systems or systems that can scale to where they can deliver really immense amounts of computing cycles have become too expensive to manage,” said David Boyes, president and CTO of Sine Nomine Associates, Ashburn, VA.

Boyes, who is a leading independent researcher in Grid computing, explained: “If you look at what’s being done in computational physics or quantum hydrodynamics or any other area where the science community would go out and buy a supercomputer, a state university or a small research lab can’t afford that kind of iron.”

As the computational demands of scientific research grew beyond the capacity of a single university’s computing resources, the scientific community, backed by funding from the National Science Foundation (NSF), began looking at ways to link available computing power from a number of sources.

“So, the researchers within the Center for Research on Parallel Computation (CRPC) tried to develop a method for taking available hardware and gluing it [individual computers] together in a more uniform programming model,” Boyes explained.

In most cases, that meant harnessing the collective computing power of relatively inexpensive but plentiful Intel machines, he explained. Even old 386 and 486 machines could be used to add capacity to the grid.

The philosophy behind Grid computing is to create an open source system on a common platform. In this model, universities and laboratories combine their computing resources to provide the high-volume cycles needed for cutting-edge research. Scientists and engineers can access the grid through any standard Web browser.

The University of Florida project provides a Web portal, Figueiredo explained. “It’s responsible for allocating resources and doing management of load and data and interacting with the user, and providing the user with an interface to the application. It’s an interactive interface.”

According to Figueiredo, the core of the grid for the research project at Florida includes Intel Pentium III machines, plus a cluster of Pentium 4 machines housed at Northwestern.

“The Pentium clusters and the IBM z800 make up the core of our grid,” he explained. “But there are other machines at other sites that will be linked logically to our grid. There are resources at Purdue that we plan to integrate. We have on-going relationships with other universities.”

The Florida project will provide grid resources for existing applications, including solid-state device modeling, nanotechnology, Computer-Aided Design (CAD), the design of very large-scale integrated logic, and bio-medical engineering and imaging, according to Figueiredo.

“We’re also beginning a collaboration with the Coastal Engineering Department here in Florida to look at simulation of ocean and coastal dynamics,” Figueiredo said.

The Role of Virtual Machine Technology

VM, which grew out of IBM’s research and development in the 1960s, is now a key technology in Florida’s cutting-edge implementation of Grid computing. This technology provides a level of security that was missing in earlier Grid projects, according to Figueiredo.

“We are building on this idea that basically is not new, this idea of virtualization, and I think the technology of virtual machines has been the subject of a renaissance with the zSeries and Linux and the virtualization of Intel machines,” he said. “Part of the reason for the renaissance of virtualization is the idea of trying to consolidate resources and use them as efficiently as possible and in the process making sure that access is secure. That’s a big functionality difference between a virtual machine model and a multi-user operating system.”

The virtual machine model will provide secure access to applications that could not be made available in earlier Grid experiments.

“One of the things we learned in the process of doing this at Purdue was that in many cases since we were trying to run applications that were not modified, we had to worry about the issues of trust and security,” Figueiredo said. “We had to make sure that if we enabled the user to run an application on the system we trusted the user, so that we had reasonable confidence they would not compromise the physical resources. In practice, we could not trust the user of the application. Therefore, in PUNCH we faced a situation where we could not publish an application because it wasn’t secure, even though we would have liked to make that application available.”

Virtual machine technology provides security on two levels, he explained. Because the user does not interact directly with the underlying hardware, that hardware is not vulnerable, even if there are security holes in the operating system running the user’s application. The model also protects other users accessing the same physical resources because they are on completely decoupled virtual machines.

The Intel machines at the University of Florida are running a virtual machine platform provided by VMware, Palo Alto, CA. The platform is a “cousin” of the VM technology running on the IBM z800, according to Ed Bugnion, chief architect and co-founder of VMware, which is an IBM business partner.

“Our products are about running virtual machines on Intel servers,” Bugnion explained. “By virtual machine we mean in the traditional mainframe sense. Each virtual machine is independent of the other virtual machine. Furthermore, each virtual machine is completely isolated from the other virtual machine so you can consolidate a large number of servers on the same server.”

“In the case of the research at the University of Florida, you can also connect them and use these virtual machines as building blocks for a grid experiment,” Bugnion said.

The IBM zSeries: The Heart of the Grid Project

IBM’s pioneering development of virtual machine technology was one key to selecting the z800 to be the heart of the Grid project at Florida.

“The z800 makes use of highly efficient z/VM virtualization capabilities and supports Linux-based environments and applications — this played a determining role in our decision to deploy IBM technology and seek NSF funding to acquire the machine,” Fortes said in announcing the z800 selection by the university. “We believe that Grid resources of the future will be able to provide virtualization capabilities similar to those already available when using the z800.”

The Grid researchers acknowledge that they are building on work from the early days of computer science.

“IBM has the original virtual machine technology dating back to the System/360 in the late ’60s and early ’70s that is still the most well-established virtual machine technology available,” Figueiredo said. “That brought us to consider the zSeries.”

According to IBM, the university plans to integrate three components: the z800 running z/VM (IBM’s advanced virtualization software), a 3.36TB IBM Enterprise Storage Server (code-named Shark), and a 32-node IBM eServer xSeries cluster running VMware and Linux. Scientific research involving terabytes of data will rely on the IBM Enterprise Storage Server.

The Intel machines in the grid can handle the number-crunching involved in experiments, but one area where the IBM z800 shines is in working with huge data files.

“The way we see the zSeries machine being used is for the applications that are more data-intensive, I/O-intensive, rather than computing-intensive,” Figueiredo said. “That’s the major differentiator of the zSeries from traditional Unix boxes as we see it, the high available bandwidth and memory bandwidth for applications that are data-intensive.”

Boyes, who did pioneering research on employing existing IBM technology in Grid systems, explained that scientists today are often working with very large data files.

“A typical nuclear physics experiment generates 150 to 200GB every few seconds,” he said. “If you need to handle a 14TB file, then that’s something z/OS does in its sleep. IBM spent the past 15 years developing management of information on that scale to the point where it is pretty much a science. A remote user can take advantage of that to store enormous files or just manipulate data on a large scale. That’s something the z800 is very, very good at. This is why it’s interesting to keep mainframes in the loop. There are things they are good at.”

Linux and the Grid

IBM’s commitment to Linux, including the implementation on the zSeries, is important to Grid research, which has been based almost entirely on Unix. Linux is seen as having a number of advantages in the scientific community, including the fact that it is open source and it is freely available. Figueiredo pointed out that scientific applications developed over the past 30 years in Unix are already making the transition to Linux.

“Many applications in the Unix environment already have been ported to Linux, including the middleware needed to run many of the applications,” Figueiredo said.

For this very practical reason, Grid computing and the scientific applications that run on it are likely to remain Unix/Linux-based. Grid researchers do not see other operating systems, specifically Microsoft Windows, gaining much traction in the scientific community, at least at present.

“Doing research in this area and publishing applications that people in the scientific community specifically use, it would not be an easy task for our users to interface with operating systems other than Unix,” Figueiredo explained. “People develop applications in C, C++, Java, or FORTRAN and they expect compilers of the type that you find in a regular Unix box and if you leave that environment it’s difficult to port applications.”

Reliability and the ease with which a Linux system can be managed are additional pluses for using the operating system in the Grid environment. In fact, Figueiredo explained that it would be hard to do the Florida project without Linux.

“From our perspective, without Linux it would be difficult to use existing software,” he said. “The components that are available for doing Grid computing have been developed using Open Grid Services Architecture (OGSA). Having Linux enables us to support this architecture.”

As for the bottom line, Linux offers one very important plus for researchers seeking to get the most out of every dime in their usually tight budgets: it is freely available. In addition, as Figueiredo pointed out, the Linux-based system is easy for the university and scientific community to manage with limited professional IT resources and a support staff that is often made up primarily of students and graduate students.