IT Management

Suppose we asked, “Will the real mainframe virtual tape functionality please stand up?” Four candidates—four unique, yet complementary, solutions—would stand up. All four could legitimately claim to perform a tape virtualization function in a z/OS environment. Although a number of companies have products in the virtual tape space, IBM will serve as an example here, as it covers all four solutions. I recently took an analyst briefing from IBM on the new ProtecTIER solution (another “face” to tape virtualization), so I thought it was a perfect time to examine all the faces of tape virtualization.

The original use of virtual tape was (and is) to more fully utilize the capacity of physical tape cartridges. Historically, writing data sets directly to tape cartridges (say, from one application at a time) has been very inefficient, leaving much of the capacity of a directly written cartridge unused. With the average capacity of tape cartridges rising rapidly and no end to this trend in sight, writing directly to tape becomes a worse idea each year.

The first use of virtual tape—to collect and concatenate data on disk to form large, virtual volumes for better space utilization of tape cartridges (i.e., tape stacking)—has proved to be an acceptable, well-regarded alternative. The result is that a physical tape library and the number of tape cartridges can be sized as close to optimum efficiency as possible.
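To make the space-utilization argument concrete, here is a minimal sketch of tape stacking in Python. It assumes data sets are simply concatenated in arrival order and that no data set spans cartridges; the function names, data set sizes, and the 800 GB cartridge capacity are illustrative assumptions, not figures from any IBM product.

```python
def stack_volumes(dataset_sizes_gb, cartridge_capacity_gb):
    """Concatenate data sets, in arrival order, onto as few cartridges as needed.

    Returns a list of cartridges, each a list of the data set sizes it holds.
    Spanning a data set across cartridges is deliberately not modeled.
    """
    cartridges = []
    current, used = [], 0
    for size in dataset_sizes_gb:
        if size > cartridge_capacity_gb:
            raise ValueError("data set exceeds one cartridge; spanning not modeled")
        if used + size > cartridge_capacity_gb:
            # The current cartridge is full enough; start a new one.
            cartridges.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        cartridges.append(current)
    return cartridges


def utilization(cartridges, cartridge_capacity_gb):
    """Fraction of total cartridge capacity that actually holds data."""
    stored = sum(sum(c) for c in cartridges)
    return stored / (len(cartridges) * cartridge_capacity_gb)


# Hypothetical workload: eight data sets (sizes in GB) and 800 GB cartridges.
sizes = [40, 120, 15, 300, 75, 60, 200, 10]
capacity = 800

direct = [[s] for s in sizes]            # one data set per cartridge
stacked = stack_volumes(sizes, capacity)

print(len(direct), round(utilization(direct, capacity), 3))
print(len(stacked), round(utilization(stacked, capacity), 3))
```

In this made-up workload, writing directly uses eight cartridges at roughly 13 percent utilization, while stacking uses two at roughly 51 percent. The gap widens as cartridge capacity grows and data set sizes stay the same.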

However, the use of disk cache as a front-end to tape led to a second tape virtualization solution. Using intelligent management software on a server (what IBM called a Virtual Tape Server [VTS]), this second virtualization solution broke the connection between logical and physical tape drives, allowing storage of “virtual tape volumes” either on disk or tape. Because physical tape drives are expensive compared to virtual tape drives, users of the second solution can now “manage to need” (dynamically determine how many “tape drives,” physical or virtual, are needed at any one time to keep all job streams working most efficiently) rather than “manage to cost” (determine how to use the capacity of all available physical tape drives most efficiently to postpone the purchase of a new drive for as long as possible).
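One way to picture “manage to need” is to compute how many drives a job schedule wants at its busiest moment, then compare that with the physical drives on the floor. The sketch below does this with a simple sweep over job start and end times. The job list, the time units, and the count of four physical drives are hypothetical, and this is not a description of how VTS itself allocates drives.

```python
def peak_drive_demand(jobs):
    """Return the largest number of tape drives needed at any one time.

    jobs is a list of (start, end, drives) tuples; a job holds its drives
    from start up to, but not including, end.
    """
    events = []
    for start, end, drives in jobs:
        events.append((start, drives))    # drives acquired
        events.append((end, -drives))     # drives released
    # Sorting tuples puts releases (negative) before acquisitions at the
    # same time, so a drive freed at time t can be reused at time t.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical batch window: (start hour, end hour, drives requested).
jobs = [(0, 4, 2), (1, 3, 4), (3, 6, 3), (4, 5, 1)]
physical_drives = 4

need = peak_drive_demand(jobs)
shortfall = max(0, need - physical_drives)
print(need, shortfall)
```

With these numbers the peak need is six drives, so a shop limited to four physical drives would either delay jobs (“manage to cost”) or present two more virtual drives backed by disk cache (“manage to need”).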

The IBM Virtualization Engine TS7700 family is a follow-up to VTS. Whereas IBM’s VTS products are peer-to-peer, the new family uses a grid architecture to considerably scale both performance and capacity.

A third use of virtual tape emerged as a z/OS software implementation useful for certain niche applications. IBM Virtual Tape Facility (VTF) for Mainframe (VTFM) uses mainframe-attached 3390 disk for virtual tape storage. This gives time-sensitive tape applications consistent high performance, along with virtual tape replication that is completely consistent with all replicated disk data, including catalogs, control files, and indexes.

IBM’s VTFM feature set, scalability, and multiple implementation options make it a very useful, general purpose virtual tape solution for clients with smaller tape libraries or for clients with unique tape processing needs. One such feature, unique to VTFM, is Parallel Access Tape (PAT), which allows multiple applications concurrent read access to data on its virtual tapes.

The fourth use of virtual tape is the data deduplication capability that’s derived from open systems “virtual tape library” deduplication. Much of the data written to tape is backup data, and most of what is written from one backup to the next is unchanged. Data deduplication takes advantage of this by storing only one copy of the repeated data and eliminating the duplicates.
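The idea can be sketched in a few lines of Python. The example below uses fixed-size chunks identified by SHA-256 digests, which is a generic textbook approach chosen for illustration; it is an assumption on my part and not a description of how ProtecTIER or any other product implements deduplication. The class name, chunk size, and volume names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: each unique chunk is kept exactly once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}    # digest -> chunk bytes, stored once
        self.volumes = {}   # volume name -> ordered list of digests

    def write(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.volumes[name] = digests

    def read(self, name):
        """Rebuild a volume from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in self.volumes[name])

    def stored_bytes(self):
        """Bytes physically held after deduplication."""
        return sum(len(c) for c in self.chunks.values())


# Two hypothetical nightly backups that differ in only one 4 KB chunk.
blocks = [bytes([i]) * 4096 for i in range(10)]
monday = b"".join(blocks)
tuesday = b"".join(blocks[:9]) + bytes([99]) * 4096

store = DedupStore()
store.write("monday", monday)
store.write("tuesday", tuesday)

logical = len(monday) + len(tuesday)
print(logical, store.stored_bytes())
```

Here the two backups total 81,920 logical bytes, but only 45,056 bytes are stored, because nine of Tuesday's ten chunks were already present from Monday. Both volumes can still be read back in full.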

Compared to the second and third approaches, data deduplication saves additional disk space. It also means a shop can keep older data on disk as well as current data, offering the ability to partially or totally eliminate the need to go to physical tape in order to find the correct data to restore. This fourth use of virtual tape is exemplified in the new IBM System Storage TS7680 ProtecTIER Deduplication Gateway for System z.

Users shouldn’t think of these four solutions as “either/or” choices. Rather, they form a spectrum that runs from all physical tape, through some disk, to lots of disk. Physical tape technology is mandatory for the first solution but not for the other three, and all four allow disk caching. Thus, users can employ each solution for a different segment, or tier, of their tape library, depending on such factors as performance, duration of data retention, number of copies maintained, access patterns, cost, manageability, and local and remote recovery/restart needs. All four solutions could stand up as tape virtualization solutions.