Blog

Mar 26 ’13

A company running an obsolete z890 two-way machine with what amounted to 0.88 processors (332 MIPS) planned a migration to a distributed system consisting of 36 UNIX servers. The production workload consisted of applications, database, testing, development, security, and more. Five years later, the company was running the same workload in the 36-server, multi-core distributed environment (41x more cores than the z890), but its four-year TCO had climbed from $4.9 million to $17.9 million, according to an IBM Eagle study. The lesson, the Eagle team notes: cores drive platform costs in distributed systems.

Then there is the case of a 3500 MIPS shop that budgeted $10 million for a one-year migration to a distributed environment. Eighteen months into the project, already six months late, the company had spent $25 million and managed to offload only 350 MIPS. In addition, it had to increase staff to cover the overrun, implement steps to replace mainframe automation, acquire more distributed capacity than initially predicted (to support only 10% of total MIPS offloaded), and extend the dual-running period (at even more cost due to the schedule overrun). Not surprisingly, the executive sponsor is no longer there.
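
The figures quoted in these two anecdotes are easy to sanity-check. A minimal sketch, using only the numbers given above (the derived ratios are my arithmetic, not Eagle's):

```python
# Sanity-check the figures quoted in the two Eagle anecdotes above.

# Case 1: z890 shop's four-year TCO before and after moving to 36 servers.
tco_before = 4.9e6   # USD, four-year TCO on the z890
tco_after = 17.9e6   # USD, four-year TCO on the distributed environment
tco_multiple = tco_after / tco_before              # ~3.65x growth

# Case 2: 3500 MIPS shop, $10M one-year budget, $25M spent in 18 months.
total_mips = 3500
offloaded_mips = 350
budget = 10e6        # USD, original one-year budget
spent = 25e6         # USD, actual spend at 18 months

fraction_offloaded = offloaded_mips / total_mips   # 0.10, i.e. 10% of MIPS
cost_overrun = spent / budget                      # 2.5x the budget
cost_per_offloaded_mips = spent / offloaded_mips   # ~$71,429 per MIPS moved

print(f"TCO grew {tco_multiple:.2f}x")
print(f"Offloaded {fraction_offloaded:.0%} of MIPS at {cost_overrun:.1f}x budget")
print(f"~${cost_per_offloaded_mips:,.0f} per offloaded MIPS")
```

The last figure is the telling one: whatever the per-MIPS cost projections were at the outset, the realized cost of each MIPS actually moved dwarfed them.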

If the goal of a migration to a distributed environment is cost savings, the IBM Eagle team, after three years of performing such analyses, has concluded that most migrations are failures. Read the Eagle FAQ here.

The Eagle TCO team was formed in 2007, and it reports that it has since completed more than 300 user studies. Often its studies are used to determine the best platform among IBM’s various choices for a given set of workloads, usually as part of a Fit for Purpose study. In other cases, the Eagle analysis is aimed at enabling a System z shop to avoid a migration to a distributed platform; it also can be used to secure a new opportunity for the System z. Since 2007, the team reports, its TCO studies have secured wins amounting to over $1.6 billion in revenue.

Along the way, the Eagle team has learned a few lessons. For example: Re-hosting projects tend to be larger than anticipated. The typical one-year projection will likely turn into a two- or three-year project.

The Eagle team also offers the following tips, which can help existing System z shops that aren’t necessarily looking to migrate but just want to minimize costs:

  • Update hardware and software; for example, one bank upgraded from z/OS 1.6 to 1.8 and reduced each LPAR’s MIPS by 5 percent (the monthly software cost savings paid for the upgrade almost immediately)
  • Take advantage of sub-capacity, which may produce free workloads
  • Consolidate Linux on System z, which invariably saves money: many IT people don’t realize how many Linux virtual servers can run on a System z core. (A debate raging on LinkedIn focused on how many virtual instances can run on an IFL, with quite a few suggesting a max of 20. The official IBM figure: consolidate up to 60 distributed cores or more on a single System z core, and thousands on a single footprint; a single System z core = an IFL.)
  • Changing the database can impact capacity requirements and therefore costs
  • Workloads amenable to specialty processors, like the IFL, zIIP, and zAAP, reduce mainframe costs through lower cost/MIPS and fewer general processor cycles
  • Consider the System z Solution Edition (I have long viewed the Solution Edition program as the best System z deal going, although you absolutely must be able to operate within the strict usage constraints the deal imposes.)
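
The gap between the consolidation ratios in the list above is worth seeing in numbers. A rough sizing sketch, where the 500-server fleet is hypothetical and one distributed virtual server is treated as one core purely for illustration:

```python
import math

# Hypothetical fleet: 500 one-core Linux servers to consolidate onto IFLs.
# (Equating one virtual server with one distributed core is a simplifying
# assumption for this sketch.)
servers = 500

# IBM's official figure: up to 60 distributed cores per System z core (IFL).
ifls_at_ibm_ratio = math.ceil(servers / 60)          # 9 IFLs

# The conservative LinkedIn estimate: a max of 20 instances per IFL.
ifls_at_conservative_ratio = math.ceil(servers / 20) # 25 IFLs

print(f"IBM ratio: {ifls_at_ibm_ratio} IFLs; "
      f"conservative: {ifls_at_conservative_ratio} IFLs")
```

Even at the conservative 20:1 ratio, the core count (and therefore the per-core software licensing) collapses dramatically compared with the distributed footprint.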

The Eagle team also suggests other things to consider, especially when the initial cost of a distributed platform looks attractive to management. To begin with, the System z responds flexibly to unforeseen business events; a distributed system may have to be augmented or the deployment re-architected, both of which drive up cost and slow responsiveness. Also, the cost of adding incremental workloads to System z is less than linear. Similarly, the cost of administrative labor is lower on System z, and the System z cost per unit of work is much lower than with distributed systems.

I am generally skeptical of TCO analyses from vendors. To be useful, the analysis needs context, technical details (components, release levels, and prices), and specific, verifiable quantitative results. In addition, there are soft costs that must be considered. In the end, the lowest acquisition cost, or even the lowest TCO, isn’t necessarily the best platform choice for a given situation or workload. Determining the right platform requires both quantifiable analysis and judgment.