The onsite unveiling of the IBM z10 Enterprise Class (EC) processor was a highlight of the February 2008 SHARE user event in Orlando. The z9 on the SHARE Technology Expo floor on Monday, Feb. 25, was replaced overnight with a shiny new z10. Numerous SHARE sessions examined the features of the new machine.
The announcement wasn’t a surprise to most. Rumors of the new machines had been widely disseminated, and a detailed presentation by Charles Webb, IBM Fellow and hardware architect, was posted on the Web, with subsequent “leaks” of the URL to public lists (www2.hursley.ibm.com/decimal/IBM-z6-mainframe-microprocessor-Webb.pdf).
During the Technology Expo’s Monday session, speculation was that the z9 on display was actually a z10 with a z9 door panel. When the z10 was unveiled, however, it was clear this wasn’t the case: The stripe on the machine is now green instead of blue, and the chassis is taller and deeper than the z9. The increased clearance requirements may cause hassles for installations already tight on data center space, though the new system is still much smaller than mainframes of old.
The z10 continues the IBM tradition of more, better, balanced function, as well as faster CPUs. Its core processor chips are closely related to those in IBM’s current midrange machines, System p and System i (recently merged and renamed simply “Power”). In those systems, the chips are known as Power 6 or p6; during development, the z10 chip was known as z6. Common components were engineered by a single team, while the z- and p-specific areas were developed in parallel.
IBM says large portions of some significant components use identical silicon between the p6 and z10. This shared DNA speeds development of aspects such as memory controllers, floating-point processors, and I/O bus controllers. The z10 is the first visible result of a long-rumored IBM project called eCLipz, which IBM has never fully explained, except to admit that the “e” is for eServer and the “ipz” is for the three hardware platforms, System i, System p, and System z. IBM has spent the last 10 years working to converge underlying hardware platforms as much as possible to reduce development and manufacturing costs. eCLipz has been claimed to be many things, most often “consolidating all IBM non-x86 server platforms onto a single, common architecture” or “System z running on Power via emulation.” These haven’t yet proved to be practical, but starting with the z900/z800 and p590/p595, IBM has increasingly shared parts between z and p. The z10 represents the first significant shared circuitry on an IBM processor chip.
Even without the common lineage with p6, the 6 in z6 could have been considered appropriate for z/Architecture machines starting with the first z900s; it represents the sixth significant revision of the venerable mainframe architecture (see Figure 1). (This timeline conveniently ignores some minor innovations such as 26-bit addressing on the 3033 and 3081 in 1981, as well as non-mainstream systems such as the 9370, P/370 and P/390, and Multiprise machines.)
The z10 represents a major step forward for System z. Previous z9 systems were powerful, at about 580 MIPS per CPU. The new machines increase that by more than 50 percent, to about 920 MIPS per CPU; IBM puts the net increase at about 62 percent. With up to 64 processors, a full z10 chassis provides a staggering amount of processing power in a single box.
With a clock speed of 4.4 GHz, one might have expected the z10 to be at least twice as fast as the previous machines (whose clocks ran at only 1.7 GHz). However, things aren’t that simple. As even Intel has admitted in recent years, realistic speed estimates depend on many factors, including overall architecture, instruction set complexity, and workload type. This is also why IBM has traditionally been reluctant to quote MIPS ratings: A machine that’s twice as fast for one workload may not be so for another. Still, MIPS ratings at least give a general metric for comparing machines (a 920 MIPS box is clearly faster than a 580 MIPS box). IBM does say that for CPU-intensive workloads, the new processor offers about twice the speed of the previous generation.
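As a back-of-the-envelope check on those figures, the short Python sketch below compares the clock ratio with the per-CPU MIPS ratio. The function and variable names are illustrative, and the inputs are the approximate numbers quoted in this article, not official IBM ratings.

```python
def ratio_and_percent_gain(old, new):
    """Return (new/old ratio, percent increase over old)."""
    ratio = new / old
    return ratio, (ratio - 1.0) * 100.0

# Approximate per-CPU figures quoted in the article (assumed inputs)
Z9_MIPS, Z10_MIPS = 580.0, 920.0
Z9_GHZ, Z10_GHZ = 1.7, 4.4

mips_ratio, mips_gain = ratio_and_percent_gain(Z9_MIPS, Z10_MIPS)
clock_ratio, clock_gain = ratio_and_percent_gain(Z9_GHZ, Z10_GHZ)

# Rough "work per clock cycle" of the z10 relative to the z9
per_cycle = mips_ratio / clock_ratio

print(f"MIPS:  {mips_ratio:.2f}x ({mips_gain:.0f}% increase)")
print(f"Clock: {clock_ratio:.2f}x ({clock_gain:.0f}% increase)")
print(f"Relative work per cycle: {per_cycle:.2f}")
```

By these rough numbers the clock runs about 2.6 times faster while the quoted MIPS rating rises about 1.6 times, which illustrates why clock speed alone is a poor predictor of delivered performance.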
The z10 includes the first major redesign of IBM mainframe instruction processing since the 9672 G4 machines in 1997. The impetus for this rework lies in the faster clock speed. Chip designers use FO4 (fan-out of 4), the delay of an inverter driving four copies of itself, as a clock-independent measure of how aggressively a chip’s core is pipelined. A lower FO4 count per cycle means more aggressive pipelining; the previous System z machines had a 28 FO4 cycle, whereas z10s use a 15 FO4 cycle. At such short cycle times, a machine counts as a high-frequency design, and different considerations apply to its pipeline.
Charles Webb explains that this rework breaks instruction processing into smaller chunks, so the machine has more instructions in flight at once. Instead of six steps of 28 FO4 each, the z10 pipeline has 13 steps of 15 FO4 each. Each step takes less time, which reduces latency even though the pipeline itself is longer.
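The trade-off between pipeline depth and cycle time can be sketched with simple arithmetic. The Python snippet below is a simplified model, assuming each pipeline step takes exactly one clock cycle and ignoring stalls. The step counts and clock speeds are the figures quoted in this article, and the function name is hypothetical.

```python
def pipeline_latency_ns(steps, clock_ghz):
    """Time for one instruction to pass through every step,
    assuming one step per clock cycle (simplified model)."""
    cycle_ns = 1.0 / clock_ghz  # 1 GHz corresponds to a 1 ns cycle
    return steps * cycle_ns

# Figures quoted in the article (assumed inputs)
z9_steps, z9_fo4, z9_ghz = 6, 28, 1.7
z10_steps, z10_fo4, z10_ghz = 13, 15, 4.4

z9_ns = pipeline_latency_ns(z9_steps, z9_ghz)
z10_ns = pipeline_latency_ns(z10_steps, z10_ghz)

# Total pipeline length measured in FO4 units
z9_total_fo4 = z9_steps * z9_fo4
z10_total_fo4 = z10_steps * z10_fo4

print(f"z9:  {z9_steps} steps, {z9_total_fo4} FO4 total, {z9_ns:.2f} ns")
print(f"z10: {z10_steps} steps, {z10_total_fo4} FO4 total, {z10_ns:.2f} ns")
```

Under these assumptions the z10 pipeline is longer when counted in FO4 units, yet an instruction passes through it in less wall-clock time, because each cycle at 4.4 GHz is much shorter than a cycle at 1.7 GHz.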