Some care must be taken since the load utility must know the physical partition number. Since after partition rotation the physical and logical partitions are no longer synchronized, the DBA must implement some process to generate the load statement with the correct partition number. This can be done with programming or by using a language such as REXX.
Note that these schemes also require coordination with other DBA processes such as reorg and image copy since the physical partitions requiring backup (and potential recovery) change over time. Again, you can use a programmatic process to generate correct copy and reorg statements or Real-Time Statistics (RTS) to automate these jobs.
For schemes where the purge criteria can’t be conveniently matched with the partitioning scheme, purging must occur using an application. Here, to avoid resource constraints, you must design the purge application to peacefully coexist with other concurrent table activity. For volatile tables that may be accessed at any time of day the purge application must be coded as restartable, since it may encounter a deadlock or timeout. In addition, you should be able to tune the commit frequency easily so the DBA can adjust it to minimize any timeouts and deadlocks other applications experience. This is commonly done by having commit frequency information placed in a common area (such as a DB2 table) that can be modified when needed, avoiding program logic changes.
What happens if the purge process executes more slowly than rows are added to the table? The usual answer in this case is to partition the table based on other criteria and then execute multiple, simultaneous instances of the purge process. To avoid deadlocks, each instance is designed to purge from a different partition. Again, if a partition rotation scheme is in place, the partition choice must be programmed in some fashion.
Partitioning schemes that favor quick purge (via emptying partitions with old data) tend to lessen the I/O load by limiting transaction activity to partitions with current data. However, by concentrating data access to a small portion of the table you must beware of causing availability constraints since row access activity is now confined to a relatively small part of the table. Application throughput is excellent for inserting applications while queries may possibly deadlock or timeout due to page contention. This can be alleviated in several ways, each with advantages and disadvantages. Options include querying with an isolation level of uncommitted read, implementing row-level locking, or shortening transaction length via more frequent commits.
There are three common reasons for table clustering:
- To support high-performance, multi-row access (such as I/O parallelism)
- To support distribution of volatile pages to avoid hot spots
- To allow some joins to work faster.
In each case, rows with “nearby” keys are expected to be accessed in the same transaction or in the same SQL statement. Increasing their physical proximity tends to decrease table getpages and avoid random table I/Os. However, this may not always be the case.
Another possible benefit of clustering is when many (hundreds or more) rows with nearby keys are accessed. Clustering may favor either sequential prefetch or dynamic prefetch of pages, reducing I/O wait times and improving application throughput.
Some partitioning schemes will clash with clustering needs. For example, partitioning schemes that distribute row inserts to avoid hot spots may prevent clustering in the preferred manner.