IT Management

Keeping DB2 objects well-maintained is imperative, but reorganizations can be costly and time-consuming. You can reorganize objects on a schedule or when thresholds are exceeded. For some types of workloads, physically disorganized objects don’t degrade application performance, and reorganizing those objects is a waste of resources.

The main reason to consider reorganizing a DB2 index or table space is increasingly poorly performing SQL. Look at your indexes first. Healthy indexes deliver direct cost savings to an application. But also keep an eye on table spaces. Reorganizing a table space with its associated indexes restores the clustering order of table data, enabling good sequential I/O. Before reorganizing table spaces because of bad cluster ratio, ask if the clustering index is really used by application SQL. Reducing or eliminating secondary extents is a valid reason to reorganize indexes and table spaces. Even though modern disk systems can quickly retrieve data from disk, secondary extents cause an increase in CPU path length for each time the objects that reside on a secondary extent need to be retrieved. Over time this can add up to millions of SQL statements.   Regardless of the overhead secondary extents cause, they often are a sign that the initial data set size wasn’t correctly defined. And here’s yet another reason to run a reorganization: changing data set sizes.

To Reorg or Not to Reorg, That’s the Question

Let’s look at a general step-by-step approach to automate DB2 table space and index reorganizations.

Step 1: Collect statistics. Collect physical statistical data on DB2 objects and filter out those objects that are physically disorganized. If you use IBM RUNSTATS, consider using the HISTORY option to collect statistics over time. Vendors provide their own object statistics database. Canned reports help you understand trends and do capacity planning.

Collecting DB2 catalog statistics at the wrong time might cause performance problems because access paths could change at later binds. Rather than using RUNSTATS for reorg automation, use Real-Time Stats (RTS). RTS tables contain “since” statistical values, which represent a certain number of activities on a particular table space or index since the last REORG/COPY/RUNSTATS.

Collect accounting statistics of SQL transactions over time and correlate those with the physical object statistics to further reduce the number of reorg candidates.

Step 2: Object selection/exclusion. An important part of the automation process is the concept of object selection and exclusion. The simplest way to select objects for reorganization is to use the IBM LISTDEF utility command. Consider excluding objects that require special, non-automated treatment. Another way to group for reorganizations is to select all objects related to a particular application.

Step 3: Threshold analysis. The main part of the automation process is evaluating each object against a threshold. Certain situations require a combination of conditions with AND/OR, and if you have collected statistical data over time, you can compare current data with older values to qualify. Consider the level of performance degradation for the object since the last REORG in the threshold analysis. 

Step 4: JCL generation. Once you have a list of eligible objects, it’s time to generate some JCL. Don’t reorganize indexes if you also reorganize the owning table space. If you need to resize the object, adjust the physical allocation of data sets through an ALTER before the reorganization. If you’re reorganizing many objects and you must finish those reorganizations by a certain time, prioritize the jobs. You could generate and submit the largest and worst reorganized object first or balance objects over multiple jobs (put a few large objects into one job and many small objects into another).

Step 5: Execution. It’s time to talk to the job scheduling person. Provide information for a maintenance window on how often, on which day of the week, at which time, and for how long reorganizations should run. You want to have reorganizations done in general, and the more you know about how and when your applications use DB2 data, the less intrusive your reorganizations will be.

Conclusion

DB2 reorganization isn’t an art. If you automate it once, you can get it off your desk and spend your time on the complex tasks. And don’t forget: The best reorganization is the one that solves a performance problem, and the fastest reorganization is no reorganization.