Jul 21 ’09
Database Reorganization Strategies for DB2 z/OS
When a DB2 database is first created and loaded, it’s in fine shape; data is in sequence and free space elements exist. However, over time, the objects can become disorganized, especially the indexes. There are two main reasons to reorganize a DB2 database: I/O performance and space reclamation/consolidation.
In the first case, performance is critical and is closely monitored in production DB2 applications. A long-running transaction may abort or a user may cancel it. Many DB2 applications are Web-enabled. A user doesn’t know or care what the cause of the long-running transaction is and may simply abort the transaction and go to another Website. Once this business is lost, it’s lost forever.
In the second case, effective space management can lead to lower operating costs and higher availability. Even in newer releases of DB2, you can eventually run out of available space for inserting or loading data, resulting in a failed transaction or an outage.
DB2 objects are designed to support both online transactions and query reporting. Online transactions are generally record-level events (although set-level updates are possible). Queries generally sweep many records into a result set. The query records may be joined from several tables, resulting in significant work to derive the result set.
Inserts, updates, and deletes affect data location and space usage. A new row inserted into a table can result in hundreds of I/O events due to processing on the table and all its indexes. Indexes want to maintain data sequence, so an insert may cause data movement and page splits in the index. These index leaf page splits can become quite expensive, so it’s important to maintain enough free space in the index pages to prevent splits. Reorganizing the index sorts all the index entries and reclaims wasted space, making subsequent processing more efficient.
Query processing presents a different challenge. Queries generally pull data from several tables into a result set. The join processing can become complicated and it isn’t unusual for a query to process millions of rows of data to derive a relatively small result set. A disorganized table space or index can result in processing many more pages than would be required if the objects were properly organized. DB2 generates an access path during the bind process. The access path is affected by the catalog statistics on the objects, so a disorganized object can result in an ineffective access path and wasted resources.
Free Space Reclamation/Consolidation
When you define indexes and table spaces, you allocate free space with CREATE or ALTER parameters. As application data is modified, some data movement occurs and free space can be consumed. When no space is available for an insert, DB2 must do some work to make room for the new data. For index processing, this can result in moving hundreds of pages. It’s important to monitor the appropriate statistics and trigger reorganization when the objects reach a critical free space shortage.
DB2 data is stored in VSAM data sets. It’s possible to run out of space on these types of data sets, so monitoring space at the data set level is critical. Failure to monitor space usage can result in a significant application outage.
When to Reorganize
DB2 reorganizations are expensive; they consume processing cycles, disk or tape, operator time, and so on. They also result in an outage (albeit brief due to newer online reorganization capabilities). Most DBAs don’t reorganize until they have to. The decision to reorganize can be driven by two high-level events— statistics exceeding thresholds and application performance degradation.
DB2 keeps two kinds of statistics on its objects:
• Those captured during execution of a RUNSTATS utility or an Independent Software Vendor (ISV) equivalent. RUNSTATs statistics can modify the access path DB2 chooses, so some users are reluctant to update them once access paths are set as preferred. Other users update these statistics often so they can be assured they have the most recent data on their objects.
• Real Time Statistics (RTS). RTS are dynamically captured and don’t modify access paths. Instead, they’re good indicators of change in the object since that last utility event (such as reorganization).
Some of these statistics are good indicators that the object is disorganized. In particular, pay attention to index statistics that indicate leaf page disorganization (LEAFDIST, LEAFFAR, and CLUSTERRATIO). When these values show disorganization, it may be time to reorganize the index. Many users periodically monitor these statistics and trigger a reorganization of an object when a predefined threshold is reached.
DB2 users depend on performance monitoring tools to manage application throughput. You can easily identify an application workload that’s doing excessive I/O processing because of disorganization. Workloads can be tracked and compared over time to identify degradation in performance. Performance degradation is an indicator that reorganization may be required to restore peak performance.
DB2 IFCID 199, which captures metrics with respect to I/O at the object space level, is the performance record of interest for deciding when to reorganize. These metrics can be captured and stored so performance can be measured over time.
When Not to Reorganize
When statistics show some disorganization, is it time to automatically trigger reorganization? What if the application processing is actually maintaining peak performance? If space allocation is adequate, reorganization of a well-performing object is a waste of time and resources.
Asynchronous I/O vs. Synchronous I/O
DB2 prefetches pages into the buffer pool. As soon as DB2 recognizes the least bit of sequential access, it begins to fetch pages from disk so they’ll be in buffer pools before your application needs them—to dramatically reduce I/O wait times and improve application performance. This is known as asynchronous I/O because it happens in parallel with (indeed, ahead of) application processing and is especially effective for workloads that scan lots of data. Asynchronous I/O is sensitive to data being organized and kept in cluster sequence. Maintaining current statistics and triggering reorganization based on threshold exceptions supports this scan-oriented workload well.
When an application does a more discrete or random access, it will probably be serviced by synchronous I/O, which occurs when an application demands a specific row. Synchronous I/O is more tolerant of data disorganization— it will probe an index, obtain the information to directly access the data, then retrieve it (assuming the data isn’t already in the buffer pool).
DB2 can use PREFETCH differently, depending on the situation:
• With viable indexes, the DB2 Optimizer will try to use PREFETCH to asynchronously read in the data pages.
• If it can determine that sequential processing is reasonable when the access path is determined, DB2 will call for SEQUENTIAL PREFETCH.
• If it identifies the need for many specific records that aren’t sequentially located, DB2 may call for Row ID (RID)-based LIST PREFETCH.
• Even if the access path calls for random access (which involves synchronous I/O), run-time monitoring may invoke DYNAMIC PREFETCH if the pages requested appear to be even loosely sequential.
Effective Reorganization Strategy
The most effective reorganization strategy combines analysis of statistics with analysis of workload performance over time. Look at RTS and IFCID 199 information together to make informed decisions regarding reorganization scheduling. Reorganize only those objects that would benefit from the process.
If statistics indicate that space soon may be exhausted, either enlarge the object or reorganize it to reclaim any free space. DB2 9 makes it almost impossible to run out of space—a universal table space can be up to 128TB! Universal table spaces provide for up to 4,096 partitions and they no longer require a partitioning range key. Partitions can be allocated as a specified size; when a partition gets full, another partition is automatically created and data begins to be inserted into it. Performance needs will drive most future reorganization decisions.
Workload analysis should focus on identifying the type of workload causing I/O. If the workload is largely a query scan-related workload (asynchronous I/O), then the object will be sensitive to cluster sequence and should be reorganized when performance degrades. If the workload is largely random (synchronous I/O), then the object will be somewhat impervious to the benefits of reorganization.
The aforementioned analysis assumes data is available to analyze. Periodically capturing IFCID 199 and RTS information and storing it for future analysis will take some setup and processing. To create a performance database to be used for workload comparison, you could use the table definition in Figure 1. This data can be periodically captured (as often as daily) and stored in the table. The data in the performance table along with data in the DB2 catalog can then be analyzed to spot degradation trends and disorganization.
The following indicators could be triggers for reorganization:
• Table space has more than 10 percent rows relocated FAR from original (as indicated in the TABLE_FARINDREF column in the performance table).
• CLUSTERRATIO less than 90 percent
• Index with more than 10 percent LEAF pages FAR from original (as indicated in the INDEX_ REORGLEAFFAR column in the performance table)
• More than designed number of rows have been inserted or deleted.
You can identify performance degradation by these factors:
• Baseline PREFETCH pages are more than two times synchronous I/O.
• AND synchronous I/Os have increased 20 percent since baseline.
• AND GETPAGES per synchronous I/O have fallen 20 percent.
The combination of statistics and workload analysis should yield a set of reorganization candidates—those objects that are both disorganized and experiencing degraded performance over time. Reorganizing these objects will result in more efficient DB2 processing, reducing both I/O and CPU required to derive the result set for scan-oriented queries.
Reorganize only when you need to. Combine the analysis of object statistics with the performance metric comparison for the workload. Schedule reorganization for the object (table space and/or index) when the workload performance degrades and the statistics indicate disorganization. This will ensure the time, resource, and outage needed for reorganization is worth the effort.
• Activate RTS (part of DB2 9 but available for V8 with setup)
• Activate DB2 statistics Class 8 and begin recording IFCID 199 data
• Create a performance table to store data
• Set up a periodic job to summarize 199 data, collect RTS data, and populate your performance table
• Use performance and RTS data to adjust your reorganization strategy.
Running needless reorganization is a waste of time and resources and reduces DB2 application availability. Identifying those objects with workload that would benefit from reorganization can reduce the total cost of ownership for DB2 and yield more efficient application processing.