It’s important to pay attention to your storage configuration. Careful planning in evenly distributing data and workloads will yield better response times, more resilient operations, and higher throughput from your storage hardware. This article reveals the hidden influence of balance, which resources it impacts, and what balancing techniques you can use to improve performance and throughput without upgrading your hardware. We’ll discuss how storage tuning is uniquely different from processor tuning, and we’ll show that significant throughput and response time improvements may be possible with just a few well-chosen optimizations.
An unbalanced storage system affects both performance and cost. When hardware resources aren’t evenly loaded during peak periods, delays occur on the busiest components even though the total capacity is more than sufficient to handle the workloads. As a result, hardware may be replaced or upgraded unnecessarily, which wastes money and other resources. Unfortunately, this often happens because the most important metrics for the internal storage system components have low visibility. If you only look at the z/OS side of I/O, these imbalances can be hard to find, resolve, and prevent.
The mainframe performance perspective has always been that Workload Manager (WLM) optimizes the throughput in the z/OS environment by prioritizing work and assigning resources. This load balancing works well for identical processors in a complex. However, for storage, it’s a different story. The kind of optimization WLM performs simply isn’t possible for I/O since the location of the data is fixed. WLM can only manage the components that are shared, such as the channels and Parallel Access Volume (PAV) aliases. The internal disk storage system resources are mostly out of WLM’s control, and utilization levels of the internal components of the storage system hardware are unknown to z/OS and WLM, so work can’t be directed to optimize balance.
Let’s review how the level of balance on the major internal components of a disk storage controller influences the performance and throughput and how to create the necessary visibility to detect imbalances.
In a z/OS environment, front-end balance relates to the FICON channels and adapter cards. Most installations maintain a good balance between the FICON channels. z/OS will nicely balance the load between the channels in one path group and, with multiple path groups, most installations have ways to ensure each path group does about the same amount of work.
The less visible components here are the host adapter boards. Multiple FICON ports are attached to one host adapter board, and the ports on a board share its logic, processor, and bandwidth resources. So, it’s important to carefully design the layout of the port-to-host adapter board configuration. Link statistics provide a good way to track imbalance. In the example in Figure 1, the load on each of the FICON channels is the same, but the links aren’t evenly distributed over the host adapter boards. The resulting differences in load between the boards negatively influence the response times for the links on the busiest boards.
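To make the host adapter point concrete, here is a minimal Python sketch of how per-link throughput could be rolled up to board level. The port names, board names, throughput figures, and the port-to-board mapping are all hypothetical; in practice they would come from your link statistics and your storage system's configuration documentation.

```python
from collections import defaultdict


def adapter_loads(link_mb_per_sec, port_to_adapter):
    """Sum per-link throughput (MB/s) for each host adapter board."""
    loads = defaultdict(float)
    for port, mb_per_sec in link_mb_per_sec.items():
        loads[port_to_adapter[port]] += mb_per_sec
    return dict(loads)


def imbalance_ratio(loads):
    """Busiest board's load divided by the average board load (1.0 = balanced)."""
    average = sum(loads.values()) / len(loads)
    return max(loads.values()) / average


# Hypothetical example: six equally loaded links, unevenly spread over two boards.
links = {"P0": 50, "P1": 50, "P2": 50, "P3": 50, "P4": 50, "P5": 50}
mapping = {"P0": "HA0", "P1": "HA0", "P2": "HA0", "P3": "HA0",
           "P4": "HA1", "P5": "HA1"}

loads = adapter_loads(links, mapping)
print(loads)                             # {'HA0': 200.0, 'HA1': 100.0}
print(round(imbalance_ratio(loads), 2))  # 1.33
```

Although every link carries the same load, board HA0 handles twice as much traffic as HA1, which is exactly the kind of imbalance that stays hidden when you look only at channel-level numbers.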
RAID Parity Groups
Redundant Array of Inexpensive Disks (RAID) parity groups contain the actual data the applications want to access. The throughput of a storage system largely depends on the throughput of the RAID parity groups. A common misconception is that a disk storage system with a large amount of cache hardly uses its disks because it does most of its I/O operations from cache or to cache. Although it’s true that under normal circumstances virtually all operations occur via cache, many of those operations do cause disk activity in the background. The only operations that don’t cause a disk access are the random read hits; all others do access the disks at some point. For instance, sequential reads are mostly cache hits from the host’s point of view, but the data must still be read from disk.
As for writes, all writes are done to cache, but they need to be written to disk sooner or later, too. Moreover, for many of the current RAID schemes, a single write on the front-end causes more than one disk I/O on the back-end. For RAID 1 or RAID 10, a write takes two disk operations since all data is mirrored. For RAID 5, a random write takes four operations; for RAID 6, it takes six operations because of the more complicated way parity updates work for these RAID schemes. Sequential writes are much more efficient on RAID 5 and RAID 6 than random writes, but they will still generate more than one back-end I/O per front-end I/O.
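The arithmetic above can be turned into a rough estimate of back-end load. The following Python sketch is an illustration only: it covers a purely random workload, uses the per-write disk operation counts quoted above, and assumes that every random read miss costs exactly one disk operation. The function name and the sample rates are hypothetical, and sequential I/O is deliberately left out because the text gives no fixed factor for it.

```python
# Back-end disk operations per random front-end write, as quoted in the text.
RANDOM_WRITE_DISK_OPS = {"RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def backend_disk_ops(read_rate, read_hit_ratio, write_rate, raid_level):
    """Estimate back-end disk ops/s for a purely random front-end workload.

    read_rate and write_rate are front-end I/Os per second;
    read_hit_ratio is the fraction of random reads served from cache.
    Assumption: each random read miss costs one disk operation.
    """
    read_miss_ops = read_rate * (1.0 - read_hit_ratio)
    write_ops = write_rate * RANDOM_WRITE_DISK_OPS[raid_level]
    return read_miss_ops + write_ops


# Hypothetical workload: 1,000 reads/s at a 75% hit ratio plus 200 writes/s.
for level in ("RAID10", "RAID5", "RAID6"):
    print(level, backend_disk_ops(1000, 0.75, 200, level))
# RAID10 650.0
# RAID5 1050.0
# RAID6 1450.0
```

In this example the 200 writes per second account for most of the disk activity on RAID 5 and RAID 6, even though reads outnumber writes five to one on the front-end.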