Jun 1 ’11

z/Data Perspectives: Partitioning Pros and Cons

by Craig S. Mullins in z/Journal

Before deciding whether or not to use a DB2 partitioned table space, it’s wise to weigh the pros and cons. The following applies to partitioning in general and all types of partitioning, including index-controlled, table-controlled, and universal partition-by-range.

Advantages of a Partitioned Table Space

Each partition can be placed on a different disk volume to increase access efficiency. Of course, this is more difficult, if not impossible, to accomplish with RAID storage devices, so this doesn’t generally apply to shops using modern disk storage arrays.

Partitioned table spaces can be used to store large amounts of data. The maximum size of segmented table spaces is 64GB.

START and STOP commands can be issued at the partition level. By stopping only specific partitions, the remaining partitions are available to be accessed, thereby promoting higher availability.

Free space (PCTFREE and FREEPAGE) can be specified at the partition level, enabling the DBA to isolate data “hot spots” to a specific partition and tune accordingly.

Query I/O, CPU, and Sysplex parallelism enable multiple engines to access different partitions in parallel, usually resulting in reduced elapsed time. DB2 can access non-partitioned table spaces in parallel, too, but partitioning can optimize parallelism by removing disk contention.

Table space scans on partitioned table spaces can skip partitions that are excluded based on the query predicates. Skipping entire partitions can improve overall query performance for table space scans.

By mixing clustering and partitioning, you can design to decrease data contention. For example, if the table space will be partitioned by DEPTNO, each department (or range of compatible departments) could be placed in a separate partition. Each range of departments is in a discrete physical data set, thereby reducing inter-departmental contention due to multiple departments co-existing on the same data page.

You can further reduce contention by creating Data Partitioned Secondary Indexes (DPSIs). Prior to V8, some contention remained for data in non-partitioned indexes. Defining a Non-Partitioned Secondary Index (NPSI) on a table in a partitioned table space causes you to lose some of the benefits of partition-level independence for utility operations because access to an NPSI isn’t broken apart by the partitioning scheme.

DB2 creates a separate compression dictionary for each table space partition. Multiple dictionaries tend to cause better overall compression ratios. In addition, it’s more likely the partition-level compression dictionaries can be rebuilt more frequently than non-partitioned dictionaries. Frequent rebuilding of the compression dictionary can lead to a better overall compression ratio.

The REORG, COPY, and RECOVER utilities can execute on table spaces at the partition level. If these utilities are set to execute on partitions instead of on the entire table space, valuable time can be saved by processing only the partitions that need to be reorganized, copied, or recovered. Partition independence and resource serialization further increase the availability of partitions during utility processing.

Modifying partitioning details, changing partition key ranges, and rotating partitions is more flexible than ever with online schema change support.

Disadvantages of Partitioning

Only one table can be defined in a partitioned table space. This isn’t really a disadvantage, merely a limitation.

Prior to DB2 V8, updating the partitioning key was problematic. Before the limit key values could be updated, the PARTKEYU DSNZPARM parameter had to be set to YES. And if updates were allowed, it was likely they ran slowly. If PARTKEYU was set to NO, you had to DELETE the row and then re-INSERT it to change a value in a column of a partitioning key. But things have changed. As of V8, a partitioning index is no longer required. Furthermore, any partitioning key can be updated without worrying about the PARTKEYU parameter, which is no longer even supported.

The range of key values for which data will be inserted into the table should be known and stable before you create and define the limit keys for the partitioning index. These ranges should distribute the data throughout the partitions according to the access needs of the applications using the data. If you provide a stop-gap partition to catch all the values lower (or higher) than the defined range, monitor that partition to ensure it doesn’t grow dramatically or cause performance problems if it’s smaller or larger than most other partitions.

Summary

More and more of your DB2 table spaces will be partitioned because of the advent of universal table spaces and the deprecation of support for simple table spaces. Learn the benefits of partitioning and use them to your advantage.