Ensuring Recoverability in Mainframe Environments


-        A wide variety of storage devices from several different vendors.

Such complexity is the primary reason to deploy recovery assurance software. What should an organization look for in software that helps ensure recoverability? Generally, managing and assuring recovery requires:

-        System recoverability: No data center is recoverable unless the operating system is intact. The recovery assurance software you choose should track all system-related data sets and ensure they’re available at the remote site, whether on backup tape, on disk as backup files or point-in-time (PIT) copies, or as a result of replication or mirroring.

-        Application recoverability: Once the system is available, the business-critical applications are the next priority for recovery. Even in a mirrored environment, a missing file can cause a delay in recovering and restarting these applications, so the technology you select should eliminate these delays by ensuring all critical data is available and intact. In addition, the solution you choose should be able to find critical data regardless of where it’s physically located.

-        Prioritization of application recovery: In a major outage or disaster, not all applications need to be restored simultaneously. Look for a solution that enables a customized, phased restore process and prevents the accidental overlay of restored data sets, especially if you have multiple applications with varying recovery objectives.

-        Automation of critical data set identification: Identifying critical, non-critical and allocate-only application data sets helps ensure your data is recoverable. Automating this identification process, regardless of the media the data resides on, streamlines recovery management. Ongoing monitoring of application changes is necessary to ensure critical data isn’t inadvertently left out of the recovery scenario. You should also look for a solution with reporting capabilities that present the analysis in historical or daily views and illustrate application interdependencies.
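The last two requirements can be illustrated with a minimal sketch. It assumes a hypothetical inventory in which each application carries a restore phase and a list of data sets tagged as critical, non-critical or allocate-only; the class names, fields and sample data set names are invented for illustration and do not reflect any particular product.

```python
from dataclasses import dataclass, field

@dataclass
class DataSet:
    name: str
    category: str                                # "critical", "non-critical" or "allocate-only"
    copies: list = field(default_factory=list)   # known copies, e.g. ["tape", "mirror"]

@dataclass
class Application:
    name: str
    phase: int                                   # 1 = restore first (hypothetical convention)
    data_sets: list

def missing_critical(apps):
    """Return (application, data set) names for critical data sets with no known copy."""
    return [(app.name, ds.name)
            for app in apps
            for ds in app.data_sets
            if ds.category == "critical" and not ds.copies]

def restore_plan(apps):
    """Group application names by restore phase, lowest phase first."""
    plan = {}
    for app in sorted(apps, key=lambda a: (a.phase, a.name)):
        plan.setdefault(app.phase, []).append(app.name)
    return plan

# Hypothetical sample inventory.
apps = [
    Application("BILLING", 2, [DataSet("BILL.MASTER", "critical", ["tape"]),
                               DataSet("BILL.WORK", "allocate-only")]),
    Application("PAYMENTS", 1, [DataSet("PAY.LEDGER", "critical", ["mirror"]),
                                DataSet("PAY.AUDIT", "critical")]),
]

print(missing_critical(apps))   # [('PAYMENTS', 'PAY.AUDIT')]
print(restore_plan(apps))       # {1: ['PAYMENTS'], 2: ['BILLING']}
```

A real tool would build this inventory automatically from job and catalog information rather than by hand; the sketch only shows the kind of check and ordering involved.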

Even minor physical or logical failures may result in suspended or halted batch processing. Organizations that rely on batch processing should therefore incorporate swift resumption of batch operations into their recovery strategy. A key element in restoring or rerunning a batch cycle is using an appropriate synchronization point. To simplify restoration of batch operations, select recovery software that helps you identify the appropriate synchronization point and restart or rerun the batch as appropriate. Ideally, you should be able to isolate specific applications so more critical batches can run first.
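As a rough illustration of choosing a synchronization point, the sketch below picks the most recent one taken at or before the failure and lists the jobs that had not completed by then. The log layout, job names and timestamps are assumptions made for the example; real batch schedulers record this information in their own formats.

```python
from datetime import datetime

def latest_sync_point(sync_points, failure_time):
    """Return the most recent (label, time) sync point at or before failure_time, or None."""
    candidates = [sp for sp in sync_points if sp[1] <= failure_time]
    return max(candidates, key=lambda sp: sp[1]) if candidates else None

def jobs_to_rerun(job_log, sync_time):
    """Return jobs not completed by sync_time, in their original order.

    job_log holds (job name, completion time) pairs; None means the job never finished.
    """
    return [name for name, finished in job_log
            if finished is None or finished > sync_time]

# Hypothetical sample data.
sync_points = [
    ("SP1", datetime(2024, 1, 1, 20, 0)),
    ("SP2", datetime(2024, 1, 1, 22, 0)),
    ("SP3", datetime(2024, 1, 2, 0, 0)),
]
job_log = [
    ("JOB1", datetime(2024, 1, 1, 19, 30)),
    ("JOB2", datetime(2024, 1, 1, 21, 45)),
    ("JOB3", datetime(2024, 1, 1, 22, 40)),
    ("JOB4", None),
]
failure_time = datetime(2024, 1, 1, 23, 15)

label, sync_time = latest_sync_point(sync_points, failure_time)
print(label)                              # SP2
print(jobs_to_rerun(job_log, sync_time))  # ['JOB3', 'JOB4']
```

The sketch treats every job as independent; in practice, job dependencies would also determine which jobs must be rerun and in what order.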

It can be difficult to determine the resources necessary to recover in a given scenario, and it may not be practical to apply the same level of resource investment to each scenario. To help you size various recovery scenarios and reserve only the resources you need, simulation or modeling is helpful. Some recovery assurance solutions include the ability to model recovery scenarios tailored to the number of applications or jobs to be recovered; this streamlines planning, testing and resource allocation efforts, saving time and money. This type of modeling can also help identify unnecessary or redundant backups, enabling further cost savings.
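To make the idea of scenario modeling concrete, here is a deliberately simple sketch that sizes a scenario by the volume of data to restore and an assumed restore throughput. The application sizes, throughput and stream count are hypothetical figures, and the linear model ignores contention, tape mounts and similar real-world effects.

```python
def size_scenario(app_sizes_gb, selected, throughput_gb_per_hour, parallel_streams=1):
    """Return (total GB to restore, estimated hours) for the selected applications.

    Assumes restore time scales linearly with data volume and stream count.
    """
    total_gb = sum(app_sizes_gb[name] for name in selected)
    hours = total_gb / (throughput_gb_per_hour * parallel_streams)
    return total_gb, hours

# Hypothetical application sizes in GB.
app_sizes_gb = {"PAYMENTS": 400, "BILLING": 600, "REPORTING": 1000}

# Scenario A: only the two most critical applications.
print(size_scenario(app_sizes_gb, ["PAYMENTS", "BILLING"], 200, parallel_streams=2))
# (1000, 2.5)

# Scenario B: every application.
print(size_scenario(app_sizes_gb, list(app_sizes_gb), 200, parallel_streams=2))
# (2000, 5.0)
```

Comparing the two results shows how restricting a scenario to critical applications reduces the resources that need to be reserved.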

To further simplify management, look for a solution that provides a central console from which to view the status of applications, backups, replicas, and mirrors. This capability, combined with reports, can substantially improve the ability to demonstrate recoverability to non-IT personnel such as business management, auditors, and other interested parties.

