Ensuring Recoverability in Mainframe Environments

4 Pages

Explosive information growth has many corporations struggling to protect their information assets. Data loss, never an attractive proposition, has become riskier than ever, especially given new corporate and regulatory mandates for information retention. The threat of terrorism, regional blackouts, and natural disasters has many enterprises reviewing and retooling their existing plans and procedures to ensure their data is adequately protected and recoverable.

Major disasters are not the only events that can force an organization to recover data and applications. The most common causes are failures resulting from programmatic or human error, which often lead to data corruption or loss.

Protecting information assets isn’t simple. It requires resources and vigilance to ensure data is always recoverable, as well as in a consistent, coherent state. The challenge is how to balance recoverability against non-stop computing needs where data is constantly being generated and modified.

Recoverability vs. Continuous Availability

Organizations implement disaster recovery plans to help them strike this balance. Such plans take into account how critical each application is to business operations, and define a recovery time objective and a recovery point objective for each one.

The Recovery Time Objective (RTO) is the maximum allowable time an application may be offline. The Recovery Point Objective (RPO) is the maximum amount of data, usually expressed as a span of time, that the organization can afford to lose. For example, if the last backup was taken 12 hours before the failure, the organization could potentially lose 12 hours of data. If the organization can’t afford to lose that much data, a shorter recovery point objective is required. Once these objectives are defined, the organization applies various data protection and recovery strategies to meet them.
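The arithmetic behind the 12-hour example can be sketched in a few lines of Python. This is only an illustration: the function names `data_loss_exposure` and `meets_rpo`, the timestamps, and the 4-hour RPO are assumptions made for the example, not part of any particular product or plan.

```python
from datetime import datetime, timedelta


def data_loss_exposure(last_backup: datetime, failure: datetime) -> timedelta:
    """Return the span of data written after the last backup and lost at failure."""
    if failure < last_backup:
        raise ValueError("failure time precedes the last backup")
    return failure - last_backup


def meets_rpo(last_backup: datetime, failure: datetime, rpo: timedelta) -> bool:
    """Return True if the potential data loss stays within the RPO."""
    return data_loss_exposure(last_backup, failure) <= rpo


# Hypothetical timestamps: backup at midnight, failure at noon the same day.
last_backup = datetime(2024, 1, 1, 0, 0)
failure = datetime(2024, 1, 1, 12, 0)

print(data_loss_exposure(last_backup, failure))            # 12:00:00
print(meets_rpo(last_backup, failure, timedelta(hours=4)))  # False
```

With an assumed RPO of four hours, a 12-hour gap between backup and failure misses the objective, which is the situation that calls for more frequent copies or a different protection technique.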

Tape-based backup methods form the foundation of most data protection strategies. However, tape is slow relative to other technologies and recovery speed is becoming a critical element in many organizations’ recovery plans. So tape is now often the “last line of defense” in data protection rather than the primary strategy. In addition, tape has become the medium of choice for long-term data retention and archiving.

Disk-based backup and Point-in-Time (PIT) copies or snapshots provide faster recovery than standard tape backup. PIT copies have the added advantage of improving the recovery point: the amount of data that can be lost is limited to the interval between PIT copies, so the risk of data loss is reduced. Asynchronous replication reduces that risk further, shrinking the RPO to only a few transactions or data writes.

For instant physical data recovery and the lowest RPO, data center managers turn to synchronous disk mirroring. It is used mainly for local protection and recovery, because application performance can be adversely affected if the secondary copy is located at a distance from the primary. Synchronous mirroring and asynchronous replication are therefore often used in combination for the most critical applications: mirroring for local protection and replication for distance protection. For example, an organization might implement synchronous mirroring within a data center to protect against local failures. To protect against a sitewide failure, such as a power failure or natural disaster, the data would be replicated to a second data center or Disaster Recovery (DR) site. Further protection could be provided with PIT copies and/or tape backups of either the local or remote copy of the data. Clearly, such a multi-tiered, multi-site mirrored environment represents a significant investment in hardware, software, and management.

Benefits of Regular Testing
