Storage

Latency is an important topic in the world of storage and storage area networking (SAN). The accelerating adoption of solid-state drives (SSDs) by System z installations makes minimizing FICON SAN latency paramount.

My colleague, David Lytle, wrote an outstanding article titled “Building I/O Fabric Super Highways in the Data Center” in the June/July 2014 issue of Enterprise Tech Journal. In this column, I will take the baton from David and go into the subject of latency in more detail. I will discuss latency and delay and how they relate to response time, then look at some of the factors that contribute to delay and latency in a FICON SAN. I will close with three SAN architectural technologies that are ideal for minimizing latency.

First, let’s define latency and response time. I like the formal definitions used by Huseyin Simitci in his textbook Storage Network Performance Analysis.

• Latency: The amount of time between the initiation of an action and the actual start of the action. This typically is an end-to-end definition.
• Response Time: The time it takes to finish a given operation. Operations include reads, writes, searches or any mix of the above. The response time is measured from the initiation of the storage I/O operation (request) to the completion of the operation. In other words, response time is an end-to-end measurement that includes wait times (delays and latencies) and service times (actual work times by storage devices).
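
To make these definitions concrete, here is a minimal sketch in Python using purely hypothetical numbers (they do not come from any real measurement). It simply adds the wait portion and the service portion of a single I/O, per the end-to-end definition of response time above.

```python
# Hypothetical decomposition of one I/O's response time (illustrative numbers only).
queue_wait_ms   = 1.8   # delays and latencies: time spent waiting before the work starts
service_time_ms = 0.2   # service time: actual work time by the storage device

response_time_ms = queue_wait_ms + service_time_ms
print(f"Response time: {response_time_ms:.1f} ms "
      f"({queue_wait_ms} ms waiting + {service_time_ms} ms service)")
# -> Response time: 2.0 ms (1.8 ms waiting + 0.2 ms service)
```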

Latency is related to transfer rates (bandwidth) as well as IOPS. However, latency is not really concerned with how much data is being moved or how many requests for data are being made; it is concerned with how quickly each I/O operation is handled. Latency can also be measured either as one-way delay or as round-trip delay, the latter being common with remote data replication.
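
One standard way to see how IOPS and latency are related yet distinct is Little's Law (outstanding I/Os = IOPS × response time). Little's Law is not something from this column, and the numbers below are assumptions chosen purely for illustration, but the sketch shows two workloads delivering identical IOPS at very different latencies, with the difference showing up as how much work must be kept in flight.

```python
# Little's Law: outstanding I/Os = IOPS * response time (all numbers illustrative).
def outstanding_ios(iops: float, response_time_s: float) -> float:
    return iops * response_time_s

# Same IOPS, very different latency -- the difference shows up as queue depth:
print(outstanding_ios(10_000, 0.0002))  # 10,000 IOPS at 0.2 ms -> 2 I/Os in flight
print(outstanding_ios(10_000, 0.0050))  # 10,000 IOPS at 5.0 ms -> 50 I/Os in flight
```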

Contributing factors to SAN latency can be found in the network physical layer as well as in the SAN switching devices themselves. Because a SAN is built on fiber optic technology, any discussion of SAN latency is largely a discussion of the speed of light. According to Wikipedia, the speed of light in a vacuum is 299,792,458 meters per second, which equates to a latency of 3.33 microseconds for every kilometer of path length. Fiber optic cable is not an ideal vacuum, however, and we must account for the index of refraction, which for most fiber optic cables today is approximately 1.5. Sparing you the physics details, this means that light travels about 1.5 times faster in a vacuum than it does through your FICON SAN. So our latency actually works out to roughly 5 microseconds for every kilometer of fiber.
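
A quick back-of-the-envelope calculation makes those figures tangible. The sketch below uses the numbers from the paragraph above, plus an assumed 100 km synchronous replication link (a distance chosen purely for illustration), to show how propagation delay accumulates, especially on a round trip.

```python
# Propagation delay through fiber, from the figures discussed above.
C_VACUUM_M_PER_S = 299_792_458   # speed of light in a vacuum, meters per second
N_FIBER          = 1.5           # approximate refractive index of fiber optic cable

speed_in_fiber_m_per_s = C_VACUUM_M_PER_S / N_FIBER          # ~200,000 km/s
delay_us_per_km = 1_000 / speed_in_fiber_m_per_s * 1e6       # one-way delay per km
print(f"{delay_us_per_km:.1f} us per km")                    # -> ~5.0 us per km

# Assumed 100 km replication link (illustrative distance, not from the column):
distance_km = 100
round_trip_us = 2 * distance_km * delay_us_per_km
print(f"{round_trip_us:.0f} us round trip")                  # -> ~1,000 us, about 1 ms
```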

The SAN switching devices themselves can also have internal delays that contribute to overall latency, and buffer credit configuration problems on interswitch links (ISLs) can add latency as well. In general, any architectural feature of a SAN switching device that minimizes the potential for head-of-line blocking, or that reduces the time a frame spends inside the switching device, will reduce latency in the SAN. Keep in mind that storage access by applications is generally more latency sensitive than typical network access: storage subsystem response times today are measured in a few milliseconds or less (with SSDs pushing well below a millisecond), while typical network applications measure latency in tens of milliseconds.
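
Returning to buffer credits for a moment, a rough, first-principles estimate shows why long ISLs need generous credit allocations. The figures below are assumptions used only for illustration (an 8 Gbps Fibre Channel link with roughly 800 MB/s of usable bandwidth, full-size ~2 KB frames, a 50 km ISL, and the roughly 5 microseconds per kilometer derived earlier); with too few credits, the sender stalls waiting for credits to return, and every stall adds latency.

```python
import math

# Rough estimate of the buffer credits needed to keep a long-distance ISL streaming.
# All figures below are illustrative assumptions, not values from the column.
usable_mb_per_s  = 800      # approximate usable data rate of an 8 Gbps FC link
frame_bytes      = 2_148    # roughly a full-size Fibre Channel frame
distance_km      = 50       # assumed ISL length
delay_us_per_km  = 5.0      # propagation delay from the earlier calculation

# 1 MB/s equals 1 byte per microsecond, so bytes / (MB/s) gives microseconds.
serialization_us = frame_bytes / usable_mb_per_s        # ~2.7 us to put one frame on the wire
round_trip_us    = 2 * distance_km * delay_us_per_km    # frame travels out, credit returns

credits_needed = math.ceil(round_trip_us / serialization_us)
print(credits_needed)   # ~187 credits; with fewer, the link idles while waiting for credits
```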

If I were an end user implementing SSD storage technology in my mainframe environment, I would look to take advantage of three key SAN switching device architectural features for minimizing latency: local switching, cut-through routing and virtual channel technology. Space constraints for a column preclude a detailed discussion of these features, so I will cover them in a future full-length article.