IT Management

Horizontal partitioning is the separation of table rows into two or more physical data sets. It may be accompanied by a similar physical partitioning of some indexes. DB2 partitioned tables let the designer choose a row storage strategy based on the partitioning. Horizontal partitioning will affect the database backup and recovery methods and elapsed times. Figure 2 lists some strategies and their usefulness.

Vertical partitioning involves splitting a table vertically—designating some columns to remain in the table while others are moved to another table or tables. In this way, a single table is denormalized into two or more tables, each containing a subset of columns from the original. This is usually accompanied by replicating the primary key of the original table across all the resulting tables. To re-create an original row, the tables must be joined on their primary key columns. The DBA usually chooses vertical partitioning for performance reasons; Figure 3 lists additional decision-making factors.

Key clustering is a technique the DBA uses to ensure inserted rows don’t create hot spots in the data. One common way this happens is when new records are assigned a key based on a sequentially ascending value (such as an identity column or a sequence) and the table is clustered on this key. The result is that newly inserted rows are added at the physical end of the table.

Assuming the DBA implements page-level locking, a row insert requires an exclusive lock on the page where the row is placed, and the lock is kept until the transaction issues a commit. Until the commit, the page is locked, preventing SQL access to rows on that page.

If transaction volume is low enough, or commits happen immediately after inserts are complete, then this may not be a problem. However, for a high-volume, critical business application, this may not be possible. Instead, the DBA has several options for spreading inserts across the pageset to minimize hot spots.

One possibility is to cluster the table based on a column other than the key. Indeed, this is typical for many applications since clustering is best used in instances where SQL statements access many rows, allowing DB2 to use prefetch to increase performance.

Another possibility is to cluster by a surrogate key chosen semi-randomly. Some choices include:

  • Using the microseconds portion of the CURRENT TIMESTAMP special register
  • Using the RAND function
  • Creating a trigger that populates the value of the key upon row insert based on a user-defined function.

You can also use these methods of key value creation to assign rows to partitions, assuming table horizontal partitioning is accomplished using the key column.

4 Pages