When a vendor introduces a new generation of processors, two questions are commonly asked: How big is it, and how much does it cost?
The second question is fairly straightforward. Aside from currency conversion across countries, the price is the price. The first question, however, is much more problematic. The size, or capacity, of a processor is open to various forms of interpretation. Over time, these metrics have been overstated, understated, and confused to the point where consultants have been enlisted to make sense of them. This article attempts to shed some light on this subject and dispel some commonly held myths regarding mainframe sizing.
LSPR IS BORN
Approximately 30 years ago, IBM decided to formalize its internal benchmarking of new mainframe processors and make the results publicly available. The goal was to help customers understand the relative capacity of new models. However, equally important, IBM was trying to reduce the high demand for data center test time. Back in the early days of mainframe computing, many companies spent large quantities of time and money developing their own benchmarks and requested access to IBM data centers to measure the newest models. IT shops relied on these benchmarks for developing their own tables of relative capacity. These results were used by capacity planners and financial analysts charged with evaluating the cost of different proposals. By standardizing a set of benchmarks and making these results public, IBM hoped to reduce the confusion over relative capacity and the growing demand for test time.
The result was the IBM Large Systems Performance Reference (LSPR) benchmarks and the tables of relative capacity these benchmarks produced. I will not discuss the LSPR benchmark methodology here; however, there are several sources of information on how these benchmarks are conducted. The IBM LSPR results are publicly available at www.ibm.com/servers/eserver/zseries/lspr/.
A MIPS IS NOT A MIPS
Despite the wide use of the term, no one measures MIPS today. Back in the early days of computing, benchmarks were constructed that actually did measure MIPS. These included kernels, or small routines, that ran over and over. The total number of instructions executed, divided by elapsed time, resulted in millions of instructions per second (MIPS). MIPS was the key metric for determining the capacity of a processor. However, MIPS can be a misleading indicator of capacity, as it measures only the rate of executing instructions, and only those instructions in the kernel. MIPS do not measure the ability to process real work. As computers evolved and became more sophisticated, MIPS-based benchmarks became less able to predict the true capacity of these processors. Over time, benchmarks also evolved, and MIPS were replaced with other measurement techniques. But the name lives on. Customers and consultants (and even IBM, on occasion) still quote MIPS as measures of capacity, but that is not what IBM measures today when they run capacity benchmarks. LSPR benchmarks measure the internal throughput rate (ITR) of processors. This is a measure of the amount of real work that completes in a period of time, normalized to the processor running 100 percent busy.
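As a sketch of how those early kernel benchmarks arrived at a MIPS figure, consider the arithmetic below. The kernel size, iteration count, and timing are invented for illustration; only the formula (total instructions divided by elapsed time) comes from the text.

```python
# Hypothetical kernel-style MIPS calculation.
# A kernel with a known instruction count is run repeatedly and timed;
# all input numbers here are assumed, not real measurements.
instructions_per_iteration = 1_000   # assumed instruction count of the kernel
iterations = 5_000_000               # number of times the kernel loop ran
elapsed_seconds = 80.0               # assumed wall-clock time for the run

total_instructions = instructions_per_iteration * iterations
mips = total_instructions / elapsed_seconds / 1_000_000
print(f"{mips:.1f} MIPS")  # 62.5 MIPS
```

Note that nothing in this calculation says anything about useful work completed, which is exactly why the metric fell out of favor.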
Since there is a separate LSPR benchmark for each major subsystem (CICS/DB2, IMS, TSO, and multiple batch workloads), the measure of real work is different for each benchmark. For example, for CICS/DB2, the measure of work is the number of CICS transactions completed. For the batch benchmarks, it is the number of job steps completed. When the number of transactions (or job steps) completed is divided by the total elapsed time (wall clock time), the result is the external throughput rate (ETR). When the ETR is divided by the average processor utilization, the ETR is converted to ITR. So, now you may be wondering why IBM converts ETR to ITR. The reason is that IBM can never guarantee that each benchmark will run at exactly the same utilization. Therefore, comparing the ETR on one run to the ETR on another run wouldn’t make sense, especially if the two processors ran at different utilization levels. Dividing by the utilization normalizes the throughput rate to what would happen (theoretically) if all the processors ran at 100 percent busy. This levels the playing field and allows us to compare apples to apples.
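The ETR-to-ITR conversion just described can be sketched in a few lines. The transaction count, elapsed time, and utilization below are invented for illustration; they are not actual LSPR results.

```python
# Sketch of the ETR -> ITR normalization described above.
# All input values are assumed, for illustration only.
transactions_completed = 540_000   # e.g., CICS transactions completed in the run
elapsed_seconds = 600.0            # total wall-clock time of the benchmark
avg_cpu_utilization = 0.90         # average processor busy during the run

etr = transactions_completed / elapsed_seconds   # external throughput rate
itr = etr / avg_cpu_utilization                  # normalized to 100 percent busy

print(f"ETR = {etr:.0f} transactions/sec")  # ETR = 900 transactions/sec
print(f"ITR = {itr:.0f} transactions/sec")  # ITR = 1000 transactions/sec
```

Two runs with different utilization levels but the same ITR represent the same processor capacity, which is the whole point of the normalization.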
The ITR is a measure of the maximum possible throughput rate we could achieve with the processor at 100 percent busy. Although this is a theoretical metric, it eliminates the impact of I/O and memory limitations. That leaves us with a measure of the maximum amount of work the processor can deliver, assuming no limitations other than processor speed. The numbers supporting LSPR tables are the ITR results for each benchmark. When we calculate the ratio of the ITR of one model to another, we have the internal throughput rate ratio (ITRR) between these models. That metric is the accepted measure of the relative capacity between these machines.
So, how do we get from ITRs to MIPS? Well, history has a lot of inertia. Some customers and consultants over the age of 50 still believe a certain processor from 20 years ago was truly capable of executing X amount of MIPS. When IBM publishes new LSPR tables, these people update their MIPS charts by extrapolating the relative capacity shown in the LSPR results to MIPS. For example, several years ago, a common standard used by many consultants (and IBM) was to rate the 9672-R15 at 63 MIPS. When the LSPR OS/390 V2 R4 results were published for the G5 (fifth generation of CMOS processors) models, the ITRR comparing the R66 to the R15 was 10.136. This ratio is based on a “Default Mix” of several LSPR workloads (more on this later). Using this ratio, a MIPS chart was created using the R15 as the base machine and showed the R66 as 638.6 MIPS or 63 x 10.136. If we were to use a different base machine (other than the R15), or assign a different MIPS rating (other than 63 MIPS) to the base machine, then all the other MIPS ratings would also change. The most important message from this discussion is that IBM and the LSPR benchmarks do not measure MIPS; they measure relative capacity using the ITR metric. Keep this in mind when I discuss how LSPR results are used and misused.
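The extrapolation described above amounts to one multiplication. Using the figures from the text (the 63 MIPS rating conventionally assigned to the 9672-R15 and the published 10.136 ITRR for the R66):

```python
# The MIPS-chart extrapolation from the text: pick a base machine, assign it
# a MIPS rating by convention, then scale by the published LSPR ITRR.
base_mips = 63.0           # rating assigned to the 9672-R15 (a convention, not a measurement)
itrr_r66_vs_r15 = 10.136   # LSPR "Default Mix" ITRR, R66 vs. R15, as cited above

r66_mips = base_mips * itrr_r66_vs_r15
print(f"R66 = {r66_mips:.1f} MIPS")  # R66 = 638.6 MIPS
```

Change the base machine or its assigned rating and every derived MIPS number changes with it, which is why the only measured quantity here is the ITRR itself.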