All these data protection strategies, from tape backup to synchronous mirroring, are physical recovery methods. Logical data recovery is different, yet equally important. It focuses on recovering business processes and business applications, that is, returning the data to a consistent, known state from which applications can be properly restarted.
Generally, the faster the recovery and the more granular the recovery point, the higher the cost of implementation and ongoing management, and the greater the complexity of the storage and network environment. This added cost and complexity make it all the more important to test DR plans regularly and confirm that data is recoverable. You don’t want to make a substantial investment in hardware and software only to find your data isn’t recoverable when you need it.
Another reason to test DR plans and procedures regularly is that even the highest level of synchronous mirroring may not protect against data corruption or loss resulting from logical failures or physical events. For example, if an application failure corrupts data, the corrupt data will be copied to the mirror until the failure is discovered. Likewise, if mirrored data is deleted, it is deleted on both mirrors. Recovery then requires locating a copy of the data taken before the failure or deletion.
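The point about mirrors faithfully copying bad data can be illustrated with a minimal Python sketch. It is a toy model, not a representation of any particular mirroring product: the dictionaries, the account keys and values, and the `mirrored_write` and `mirrored_delete` helpers are all hypothetical names chosen for illustration.

```python
import copy

# Hypothetical data set held on a primary volume and its synchronous mirror.
primary = {"acct-1": 100, "acct-2": 250}
mirror = dict(primary)

# Point-in-time copy taken BEFORE any failure (for example, a nightly backup).
point_in_time_copy = copy.deepcopy(primary)


def mirrored_write(key, value):
    """Apply a write to both copies, as a mirror would."""
    primary[key] = value
    mirror[key] = value


def mirrored_delete(key):
    """Apply a delete to both copies, as a mirror would."""
    primary.pop(key, None)
    mirror.pop(key, None)


# An application bug writes a corrupt value, and a record is deleted by mistake.
mirrored_write("acct-1", -999999)
mirrored_delete("acct-2")

# The mirror is an exact copy of the damaged primary, so it cannot help.
mirror_matches_primary = (mirror == primary)

# Only the earlier point-in-time copy still holds the good data.
restored = copy.deepcopy(point_in_time_copy)
```

In this sketch the mirror ends up identical to the damaged primary, while `restored` contains the original two records. Any updates made after the point-in-time copy was taken would still have to be reapplied, which is where the recovery point comes in.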
Failure of the link between the mirrors is another reason to test DR plans regularly. With synchronous mirroring, the application must receive write verification from both the primary and secondary copies before proceeding, so a link failure could result in a hung application. With asynchronous replication, the application need not wait for write verification from the secondary, so a link failure might instead leave the secondary copy lagging significantly behind the primary. Swiftly diagnosing and fixing link failures can help prevent application slowdowns and minimize the risk of data loss.
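The different effects of a link failure on the two modes can be sketched in Python. This is a simplified model under stated assumptions: the `MirroredVolume` class, its methods, and the `LinkDown` exception are hypothetical, and a hung application is represented by raising an exception rather than actually blocking.

```python
from collections import deque


class LinkDown(Exception):
    """Stands in for a write that cannot get verification from the secondary."""


class MirroredVolume:
    """Toy model of a primary copy with one remote secondary copy."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = []
        self.secondary = []
        self.pending = deque()  # writes not yet applied to the secondary
        self.link_up = True

    def write(self, record):
        if self.synchronous:
            # Synchronous: the write completes only if both copies confirm it.
            if not self.link_up:
                raise LinkDown("no write verification from secondary")
            self.primary.append(record)
            self.secondary.append(record)
        else:
            # Asynchronous: complete on the primary, ship to the secondary later.
            self.primary.append(record)
            self.pending.append(record)
            self.drain()

    def drain(self):
        """Send queued writes to the secondary while the link is up."""
        while self.link_up and self.pending:
            self.secondary.append(self.pending.popleft())

    def lag(self):
        """Number of writes by which the secondary trails the primary."""
        return len(self.primary) - len(self.secondary)
```

A short usage example: with the link down, the synchronous volume refuses the write, while the asynchronous volume accepts it and falls behind until the link is restored.

```python
sync_vol = MirroredVolume(synchronous=True)
sync_vol.link_up = False
try:
    sync_vol.write("txn-1")
except LinkDown:
    print("synchronous write stalled")

async_vol = MirroredVolume(synchronous=False)
async_vol.link_up = False
async_vol.write("txn-1")
async_vol.write("txn-2")
print(async_vol.lag())  # secondary is two writes behind

async_vol.link_up = True
async_vol.drain()
print(async_vol.lag())  # secondary has caught up
```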
A further consideration in DR planning and testing is to ensure personnel at remote sites have the information and tools they need to rapidly restore operations if the primary site fails—especially in the event of a natural disaster or power failure that might disrupt communications or prevent personnel from traveling to the remote site.
All these factors taken together demonstrate the need to take a proactive stance toward recovery planning and to implement appropriate policies and procedures—supported by technology—to ensure recovery.
Software Technologies for Mainframe Recovery
The complexity of the modern data center environment is evident in:
- Multiple applications with varying recovery objectives, sometimes sharing critical or non-critical data sets
- Tiered storage and recovery architectures, including backup, replication and mirroring