Mar 18 ’10

Tape: A Collapsing Star

by Randy Chalfant 

When a star is born, the mass of the star determines how long it will live, ranging from millions to trillions of years.

Magnetic tape for data storage began life in 1951 and served as a primary recording medium for some users into the ’70s. That continuous use built tape’s market share over other tenable, long-term storage alternatives. But as the storage market universe continued to expand, other storage stars were created. The star power of tape is now fading; in an astronomical sense, it has exhausted its supply of market fuel, swelled into a red giant, and is now rapidly collapsing into a white dwarf that will someday become a black hole in our memory.

For today, though, even a white dwarf has value. With minor exceptions, the value of tape shines only for the large enterprise. For the non-enterprise-class market, distance and portability are the only star qualities tape has left.

The demise of tape won’t be a big bang that collapses into a black hole in a single instant. Instead, tape will die because it can’t follow or match current market conditions and requirements. Problems that limit tape’s popularity include mislabeling, labels that fall off, media failures, transport failures, library failures, media server failures, slow performance, risk of loss or theft, and the high cost of replacing, adding, or re-mastering tape for use by new generations. For all those reasons, users will simply stop writing data to tape; they’ll opt for disk instead. Every stronghold of value that tape once held has been supplanted by disk. Protection, performance, reliability, energy conservation, management, and cost all favor disk.

When Tape Made Sense

Cave art is a pictographic history that has survived some 40,000 years—surpassing anything we know about data retention even today. You simply can’t beat data survivability when it’s etched in stone.

According to the University of California, Berkeley, from 40,000 years ago until the year 2000, all recorded data represented a total of 2.7 exabytes. Tape could function well in that era. However, IDC now estimates that roughly 487 exabytes were added to the digital universe in 2008. IDC further estimates that five times that number, or about 2.5 zettabytes, will be captured by 2012. Because not all data needs to be recorded, the total amount of disk storage capacity shipped is estimated to reach 110 exabytes by 2012. The sheer size exceeds tape’s ability to reasonably, reliably, and economically store, manage, and retrieve data.

Gartner estimates that 15 percent of all backups fail. Additionally, 10 to 50 percent of all subsequent restores from tape fail, depending on the elapsed time since the backup occurred. Restoring data from tape older than five years fails 40 to 50 percent of the time. Much of this goes unnoticed: both Gartner and Storage Magazine report that some 34 percent of companies never test a restore from tape, and of those that do test, 77 percent have experienced failures in their tape backups.

The patterns are clear. Data is growing at rates hard to conceive. Administration of the infrastructure to keep up with data growth is challenging. The management of disk to house all the data isn’t trivial, but managing disk is easier than managing tape.

Restoring from tape is unreliable in the best of conditions. When a restore from tape does work, the performance of the restore is poor. A failed restore or slow performance of a restore can spell disaster if the failure is part of a major business application’s data loss. Boston Computing Network found that seven out of 10 small firms that experience a major data loss are out of business in a year—and that assumes they’ve completed a backup and have the option of a restore. Ironically, backups to tape frequently aren’t completed in the course of a defined backup window. Without a timely backup, a restore is of little or no value.

From the early ’50s until the late ’90s, the volume of data made sense for tape technology. However, for the volume occurring now and what’s anticipated, tape can’t reasonably sustain the value it once held.

Hollerith or IBM punched-card readers disappeared from the data center in the ’80s as data volumes grew faster and larger than those devices could reasonably sustain. Similarly, tape is near the end of its useful life as active media. It may still exist to be tucked away inside a mountain, or for data portability, but even those uses will diminish. Environmental issues associated with tape are problematic, and growth in communications bandwidth will soon remove the need to truck a tape to a remote location.

Protection

From a business perspective, backups are important, but restores are everything. From an operational perspective, backup is easy and recovery is hard. However, those who deal with the rigors of day-to-day backup issues may feel otherwise. Tape and, in particular, backups have always been an administrative challenge. Tape was positioned for years as the least expensive alternative in an expanding universe of data, but price per byte stored isn’t the only important factor.

Unless you have the latest high-end tape transport and library, you can expect reliability issues and complex management when using tape. You need many transports and libraries to stand a chance of getting nightly backups done. And then there’s the high cost of the media. Every couple of years, as new transports are announced, you discover the old media will no longer work, and all of it has to be replaced. That’s expensive and disruptive.

Unless you can afford a large library capable of storing the incremental amount of data required for ongoing protection, your tape media is at risk from environmental issues and mishandling when tapes are ejected from a library. Cartridge reliability is an issue, and so is the potential for lost or damaged media.

When you consider these issues, the price per byte stored on tape becomes far more expensive than the base measurement. Moreover, issues associated with the operations of tape are numerous; they include reliability, performance, network connectivity, resource conflicts, scheduling, media management, and more. Using disk as a library alternative offers a flexible, reliable, high-performance, efficient solution.

Simplify and Save

Features from backup software vendors have made backup to disk a logical choice for simple and flexible backup and recovery. For applications such as VMware, Microsoft Exchange, and SharePoint, protection, recoverability, and performance are key. While tape is still used, it’s rare to see it used exclusively today. The benefits of Disk-to-Disk (D2D) backup are great, which is why at least 70 percent of all backups are first written to disk.

Speed Matters 

To improve backup performance to tape, backup software will gather data from multiple job streams, typically 15 or more, and then interleave the data into a super block, which is then sequentially written to tape.

To recover a single application, you must read all the data from the tape, stripping away 14 out of every 15 records to get the one you need; this is multiplied by the need to read through the numerous super blocks of sequential files that are often spread across 30 or more tapes. The performance problems are obvious, and the process is prone to failure from media and transport reliability issues, which have no redundancy whatsoever. A single uncorrectable read error can cost you the entire backup.
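
To make the read amplification concrete, here is a minimal sketch of that interleaving pattern; the stream count and record labels are illustrative assumptions, not the behavior of any particular backup product.

# Illustrative sketch: 15 job streams interleaved into super blocks on tape,
# and the read amplification paid to restore just one of those streams.
STREAMS = 15               # typical stream count cited above
BLOCKS = 4                 # arbitrary number of super blocks for the example

tape = []                  # stands in for the sequential tape image
for block in range(BLOCKS):
    # one record from each stream, round-robin, forms a super block
    tape.extend((stream, f"data-{stream}-{block}") for stream in range(STREAMS))

# Restoring a single application (stream 3) still means reading every record
# and discarding 14 out of every 15.
wanted = [payload for stream, payload in tape if stream == 3]
print(f"records read: {len(tape)}, records kept: {len(wanted)}, "
      f"read amplification: {len(tape) // len(wanted)}x")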

Protection objectives are measured as Recovery Point Objectives (the amount of data at risk) and Recovery Time Objectives (the amount of downtime you can tolerate); both are a concern with tape. Recovering 10 TB from a Linear Tape-Open 3 (LTO-3) drive can easily take nearly four days, compared to 2.5 hours from disk used as a protection library. What’s your cost of downtime?
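
Working backward from those two figures gives a feel for the gap; the sketch below simply restates the article’s 10 TB example as effective throughput and a downtime ratio, not as vendor specifications.

# Back-of-the-envelope view of the 10 TB restore example above.
TB = 10**12                            # decimal terabyte, in bytes

data_bytes = 10 * TB
tape_seconds = 4 * 24 * 3600           # "nearly four days" from LTO-3
disk_seconds = 2.5 * 3600              # 2.5 hours from a disk protection library

print(f"effective LTO-3 restore rate: {data_bytes / tape_seconds / 1e6:.0f} MB/s")
print(f"effective disk restore rate:  {data_bytes / disk_seconds / 1e6:.0f} MB/s")
print(f"downtime ratio: tape takes {tape_seconds / disk_seconds:.0f}x longer")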

Disk Performance

When using disk as a protection library, the problem is solved. Backup software can index all the data as it stores it directly to disk. Whether you have to recover a single sub-object from VMware, an email message from Exchange, or a SharePoint document, you can recall each one individually. Because it’s a random-access recovery, it’s fast. The Network Data Management Protocol (NDMP) lets you write directly to disk for easy configuration and management of network-based backups. With NDMP, network congestion is minimized because the data path and control path are separated.
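
The advantage comes from the index. A hypothetical sketch, with made-up object names and a plain byte buffer standing in for the disk library, shows why an indexed store can pull back one object without scanning everything else.

# Hypothetical sketch of indexed, random-access restore from a disk library.
store = bytearray()        # stands in for the disk-based protection library
index = {}                 # object name -> (offset, length) written at backup time

def backup(name, payload):
    index[name] = (len(store), len(payload))     # record the location as we write
    store.extend(payload)

def restore(name):
    offset, length = index[name]
    return bytes(store[offset:offset + length])  # read only the bytes we need

backup("exchange/alice/message-0042", b"quarterly numbers attached")
backup("sharepoint/reports/q1.docx", b"...document contents...")
print(restore("exchange/alice/message-0042"))    # direct recall, no sequential scan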

With disk used as a protection library, backup can occur locally—from file servers direct to disk, while management can occur from a central location. Because it’s indexed by the backup application directly to disk, it’s simple. Because of the decreased infrastructure complexity, it’s easy and more efficient.

Client encryption can protect the data all the way through the network into the backup server and onto disk. It’s a faster restore that’s easier to manage and occurs at or below the cost of tape.

Consider Complexity

There are many compelling reasons to use disk over tape as a protection library—chief among them the ease of managing and using disk.

Managing tape cartridges is complex. Many users complain they can’t recycle backup tapes fast enough—forcing them to constantly buy more media. Backup typically uses a Grandfather-Father-Son (GFS) retention plan. The backup schedule generally includes daily incremental backups and weekly full backups. Over the course of a year, every TB of primary disk causes 25 TB to be written to tape to protect it. The cost to implement, maintain, and manage this is extreme. If you were backing up 42 TB of disk, over the course of a year you’d need roughly 6,300 LTO-2 tapes, assuming 80 percent utilization of each cartridge. At $26 per cartridge, the cost is $163,000, and you have 25 copies of the data to manage and maintain.
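
The cartridge arithmetic can be reproduced with a couple of stated assumptions (LTO-2’s 200 GB native capacity and the 80 percent utilization mentioned above); the result lands in the same ballpark as the 6,300-cartridge, $163,000 figure.

# Rough reproduction of the annual GFS media-cost example.
primary_tb = 42
annual_multiplier = 25            # TB written to tape per TB of primary disk per year
lto2_capacity_gb = 200            # assumed LTO-2 native capacity
utilization = 0.80                # 80 percent of each cartridge actually used
cost_per_cartridge = 26           # dollars

data_written_gb = primary_tb * annual_multiplier * 1000
cartridges = data_written_gb / (lto2_capacity_gb * utilization)
print(f"cartridges per year: {cartridges:,.0f}")
print(f"media cost per year: ${cartridges * cost_per_cartridge:,.0f}")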

A restore of a single user or application can easily require loading and reading 10 to 30 cartridges or more. Finding the right cartridges and having every one of them work without failure is a concern. The number of people required to manage a tape library typically exceeds the number necessary to manage disk as a protection library.

A tape library is typically a serialized resource. Backup jobs are scheduled by priority; resources are switched and allocated to a job. When that job completes, resources are switched again and the process continues; one backup job serialized behind another—all requiring vigilant monitoring and administration.

Disk Workflow Management Simplicity

Disk used as a protection library lets you simultaneously share resources among multiple servers—whether on a Storage Area Network (SAN) or over the network by way of Internet Small Computer System Interface (iSCSI). There’s no monitoring, no switching, and no hassle. Backup jobs run simultaneously, with multiple streams at once, avoiding tape’s requirement that each backup job wait until the previous one completes and resources are switched. Using iSCSI-enabled disk, you can also easily collect or move data offsite over a WAN for geographically protected data.

Using disk as a protection library, backups are routed through a centralized backup infrastructure; you can even use de-duplication to greatly reduce the total amount of storage required. Overall, you can expect up to a 20-fold reduction in stored data, with significant improvements in backup and restore performance. Using a post-processing approach, there’s no need to continually add servers to keep up with the de-duplication load for backups.
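
As an illustration of how post-process de-duplication trims what’s kept on disk, this minimal sketch hashes fixed-size chunks after a backup has landed and stores each unique chunk only once; the chunk size and sample data are invented for the example.

# Minimal post-process de-duplication sketch: identical chunks are detected
# by content hash after the backup lands on disk and are stored only once.
import hashlib

CHUNK = 4096   # fixed chunk size; real products vary this

def dedupe(backup_bytes, chunk_store):
    """Return a recipe of chunk hashes; add each unique chunk to the store once."""
    recipe = []
    for i in range(0, len(backup_bytes), CHUNK):
        chunk = backup_bytes[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)    # keep only the first copy
        recipe.append(digest)
    return recipe

chunk_store = {}
nightly_full = (b"A" * 40 * CHUNK) + (b"B" * 2 * CHUNK)   # mostly repeated data
recipe = dedupe(nightly_full, chunk_store)
print(f"logical chunks: {len(recipe)}, unique chunks stored: {len(chunk_store)}")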

Using disk as a Virtual Tape Library (VTL) provides all the advantages of disk as a protection library while letting you write to tape on the back-end if you need data portability, the most useful function of tape for the small to medium enterprise.

Tape Availability

Known chemical processes degrade magnetic tape. The binder systems used in today’s tape are based on polyester polyurethanes. These polymers degrade in a process known as hydrolysis, in which the polyester linkage breaks in reaction with water. One of the byproducts of this degradation is organic acid; organic acids accelerate the rate of hydrolytic decomposition by attacking and degrading the magnetic particles. A degraded magnetic particle means you can’t read data, which shows up as a permanent read error.

The lifetime of a tape is defined as the length of time a tape can be archived until it will fail to perform and can’t be read, at which point you can expect significant data loss. The degree of hydrolysis of a tape binder system is a critical property that will determine the life of a magnetic tape.

Temperature and humidity dramatically affect shelf life. A 10-degree temperature change can reduce the life of a tape by 10 or more years. If an administrator loads a cart of tapes and takes them to a non-raised-floor room, temperature and humidity changes will accelerate the effects of thermal decay, which will destroy data in as little as five years. Five years is a long time, but no longer than the length of time you normally keep your disk before you refresh it. So, why risk not getting your data back when you know you can with disk?

According to the Library of Congress and the National Media Lab, widely fluctuating temperature or relative humidity (RH) severely shortens the life span of all tape. This is one of the main reasons tape is viable only for the large enterprise that can afford a library large enough to keep tape on a raised floor, handled by a robot.

There are many other considerations. The design of the cartridge and the transport are critical to tape reliability. Only recently have tape transports become reasonably reliable. Today’s enterprise-class transports are rated in the 400,000-hour range, and a well-managed cartridge, kept in a temperature- and humidity-controlled environment and left stagnant (unused, since use would otherwise shorten its life), has a shelf life of 15 to 30 years.

When data is stored on a cartridge, the cartridge must be kept in a temperature- and humidity-controlled environment and shouldn’t be handled if you want to maintain the integrity of the data. While that cartridge sits in a slot for 10 years, three generations of transports will have been introduced into the market; over a full shelf life of about 20 years, at least six generations of transport will have evolved. Unless you kept the transport you wrote the cartridge with, the system software, the operating system, the computer hardware, the operations manuals, and ample spare parts, along with the recorded media, you can’t get your data. Even with all those things and in perfect environmental conditions, your chances of getting data back are about 27 percent. Does that realistically protect your business and mitigate legal risk?

By the way, if anything at all went wrong with that tape, or the other 30 cartridges that were used for a backup, there’s no redundancy and you can’t get your data. IT organizations deal with this by re-mastering data onto new transports and new media with every generation they change. Changing out media and re-mastering is expensive.

The mechanism for reading and writing tape is much more complicated than disk, where you have a nice, flat, stable surface that spins without flexing in a sealed, contaminant-free enclosure. Tape, by contrast, is like taking a spool of paper outside and unwinding it in the breeze. The challenge for a tape transport is keeping that surface flat and tracking well enough to read anything that was written. It’s difficult, and many factors can make it all stop working. Again, disk is simple by comparison, which is why disk reliability numbers are in the 1.2 million-hour range vs. less than half that for the best tape transport. If you’re using Digital Linear Tape (DLT), transport life is more like 250,000 hours, and the shelf life of the media is closer to 10 years in perfect conditions.

Disk Availability

Disk, unlike tape, has a multitude of reliability and protection elements, such as Redundant Array of Inexpensive Disks (RAID), that are built in and commonly used. Because tape lacks RAID-type capabilities, when one tape out of a multiplexed backup job group fails, the integrity of the whole restore collapses.

Disk has long been trusted as highly available. Disk used as a protection library is no exception. Whether you need a remote office, small office, entry-level system, or enterprise class with petabytes of capacity, disk products are redundant, protected and highly available, serving the need to recover and meet regulatory requirements.

RAID 6 is ultra reliable, protecting you from double drive failures, and providing an extra level of protection for your recovery data, a wise choice when using disk as a protection library.
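
For a sense of the trade-off, the quick arithmetic below shows usable capacity and tolerated failures for an assumed 12-drive RAID 6 group; the drive count and size are illustrative, not a recommendation.

# RAID 6 at a glance: two drives' worth of parity per group, so any two
# simultaneous drive failures are survivable. Figures below are assumptions.
drives = 12
drive_tb = 1.0
parity_drives = 2

usable_tb = (drives - parity_drives) * drive_tb
overhead = parity_drives / drives
print(f"raw: {drives * drive_tb:.0f} TB, usable: {usable_tb:.0f} TB, "
      f"parity overhead: {overhead:.0%}, drive failures tolerated: {parity_drives}")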

Conclusion 

With more than 70 percent of users backing up to disk today, disk is a wise choice. It’s more affordable than tape when you consider labor and downtime costs. The ability to rapidly back up and recover is paramount for business continuity planning. Backing up to disk is fast; recovery is faster. A Massive Array of Idle Disks (MAID) approach can also save a great deal of energy, reducing comparable energy costs for power and cooling by as much as 60 percent. Finally, de-duplication reduces the total amount of space required to maintain your backup archive.

Using disk as a protection library will make the people you have more efficient, enabling them to do more, while you pay less. Using disk as a protection library will help you get your business back up fast.