Operating Systems

Figure 3 shows a real-world example of the user catalog configuration on one system at a major z/OS installation. There are more than 1.3 million data sets on the system, and nominally 25 user catalogs, yet a full 79 percent of the data sets—almost 1.1 million—are cataloged in just five catalogs. The largest catalog has 377,441 data sets cataloged in it, representing 27 percent of the data sets on this system. If any of these five catalogs suffers an outage of any kind, a large number of data sets immediately become unavailable. Consider the 241 aliases on these five catalogs, representing the applications that will be affected by a catalog outage. 

Figure 4 illustrates the catalog environment on one system at a major banking facility where 34 user catalogs contain entries for 2.7 million data sets. Here again, a huge percentage (74 percent) of these data sets are cataloged in just five catalogs, and a whopping 39 percent (more than 1 million) are cataloged in a single catalog! Lose any of these catalogs for a few hours and you have a major business disruption.   

Similar situations emerge at virtually every OS/390 or z/OS data center, large or small. In the worst cases, every data set in the installation is cataloged in a single catalog. That’s a disaster waiting to happen!
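
The concentration percentages quoted for Figures 3 and 4 are simple to reproduce for any installation once the number of entries per catalog is known. The following is a minimal sketch in Python, not a tool from the article: the function name `top_n_share`, the catalog names, and the entry counts are all hypothetical, and the counts are assumed to have been gathered already by whatever means the installation uses.

```python
def top_n_share(entries_per_catalog, n=5):
    """Return the fraction of all cataloged data sets held by the n largest catalogs."""
    total = sum(entries_per_catalog.values())
    if total == 0:
        return 0.0  # no entries at all, so no concentration to report
    largest = sorted(entries_per_catalog.values(), reverse=True)[:n]
    return sum(largest) / total


# Hypothetical catalog names and entry counts, for illustration only.
catalogs = {
    "UCAT.PROD1": 400_000,
    "UCAT.PROD2": 250_000,
    "UCAT.PROD3": 150_000,
    "UCAT.TEST1": 100_000,
    "UCAT.TEST2": 50_000,
    "UCAT.MISC1": 30_000,
    "UCAT.MISC2": 20_000,
}

share = top_n_share(catalogs, n=5)
print(f"Top 5 catalogs hold {share:.0%} of data sets")
```

A high share for a small `n` indicates the same kind of exposure described above: an outage on any one of those few catalogs makes a large part of the installation's data unavailable.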


Obstacles to Overcome

There are several reasons for this unfortunate situation:

- Many z/OS systems programmers cling to a decades-old belief that ICF catalogs don’t break. This is rubbish, but opinions die hard. The result is a lack of attention to this area.  

- Taking any action to improve the catalog environment typically requires one or more catalogs to be taken out of service while corrective action occurs. Many systems programmers are afraid to touch catalogs for fear they’ll cause more problems than they fix. In reality, many catalogs are already broken in one way or another, and the problem areas are either ignored or sidestepped with band-aid procedures.

- Non-stop processing is also a barrier. Online systems such as CICS or DB2 run for weeks at a time. When the online systems aren’t accessing the data, batch jobs are. With so many aliases (applications) using any given catalog, it’s difficult to schedule downtime on a catalog.   
