Index compression is available only after you’ve upgraded your DB2 9 subsystem to NFM.
APPEND clause: A data warehouse can have a tremendous amount of SQL INSERT and online LOAD utility processing. Normally, DB2 makes every attempt to place a row in the correct place in a table. To accomplish this, it must perform a search. The more it has to search, the more expensive the INSERT or LOAD process becomes. Some warehouses may refresh one or more tables using INSERT or LOAD on a nightly basis—all expensive, time-consuming operations.
DB2 9 provides a solution. DB2 has added a new APPEND clause to the CREATE and ALTER TABLE SQL statements. If APPEND is set to YES, DB2 places a new row at the end of the table, ignoring any forms of clustering defined to that table. The end of a table can be defined several ways, depending on the type of object being accessed. For example:
• For a segmented table space, it’s placed at the end of the table.
• For partitioned table spaces and partition by range universal table spaces, the row is placed at the end of the appropriate partition.
• For partition by growth universal table spaces, the row is placed in any partition with space at the end.
The APPEND clause can’t be used with tables created in LOB, XML, or workfile table spaces. It’s also ignored by the REORG utility, so REORG can still be used to restore cluster order if that order is necessary.
The APPEND clause is available only after you’ve upgraded your DB2 9 subsystem to NFM.
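As a sketch of the syntax (the table, column, and table space names here are hypothetical), enabling and disabling append processing looks like this:

```sql
-- Hypothetical warehouse fact table. APPEND YES tells DB2 to skip the
-- clustering search and simply place new rows at the end of the table
-- (or the appropriate partition), speeding up INSERT and LOAD.
CREATE TABLE SALES_FACT
      (SALE_DATE  DATE          NOT NULL,
       STORE_ID   INTEGER       NOT NULL,
       AMOUNT     DECIMAL(11,2) NOT NULL)
   IN DWDB.DWTS1
   APPEND YES;

-- Append behavior can be switched off later via ALTER TABLE:
ALTER TABLE SALES_FACT APPEND NO;
```

Because REORG ignores the APPEND setting, a periodic REORG can put the appended rows back into clustering sequence without changing the table definition.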
Data-partitioned secondary index: A data warehouse usually means large amounts of data. That translates to partitioned table spaces, and partitioned table spaces often call for Data-Partitioned Secondary Indexes (DPSIs). DB2 9 has made a few improvements to DPSIs that should make them more usable and popular. For example, a DPSI can now get index-only access and a DPSI index can be defined as unique when columns in the DPSI key are a superset of partitioning columns. A few more subtle DPSI improvements are enhanced page range screening to help avoid hitting every partition, more parallelism, and more index look-aside.
Remember the lessons learned in DB2 Version 8 if you use DPSIs. Always include a local predicate on the leading column(s) of the DPSI index to avoid scanning the entire index structure. Also, remember that when using a DPSI, you will see an increase in the number of VSAM data sets that DB2 will need to use. Make sure you appropriately adjust your DSMAX value.
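A minimal sketch of both points (table, index, and column names are hypothetical): a DPSI is created with the PARTITIONED keyword, and queries against it benefit from predicates that limit the partitions touched.

```sql
-- A DPSI: the PARTITIONED keyword gives each table space partition
-- its own independent index structure.
CREATE INDEX DWIX1
    ON SALES_FACT (STORE_ID)
    PARTITIONED;

-- A predicate on the leading DPSI column (STORE_ID) avoids scanning the
-- whole index, and the predicate on the assumed partitioning column
-- (SALE_DATE) lets page range screening skip non-qualifying partitions.
SELECT STORE_ID, SUM(AMOUNT)
  FROM SALES_FACT
 WHERE SALE_DATE BETWEEN '2008-01-01' AND '2008-01-31'
   AND STORE_ID = 42
 GROUP BY STORE_ID;
```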
All these DPSI index improvements are available in Conversion Mode (CM).
Prefetch improvements: In a data warehousing environment, some queries will use prefetch; it’s just normal. As mentioned earlier, there are situations where a stage 2 predicate is your only option, and stage 2 processing means prefetch. The good news for all of those stage 2 warehouse queries is that prefetch has improved in DB2 9. The prefetch quantity has increased: when a buffer pool grows beyond 40,000 pages, a larger prefetch quantity is used. If the buffer pool is greater than 40,000 pages and less than or equal to 80,000 pages, the prefetch quantity is 64 pages for both SQL processing and utilities. The utility prefetch quantity jumps to 128 pages when VPSIZE * VPSEQT exceeds 80,000 pages. Of course, this description of prefetch quantities assumes a 4K page size; these numbers must be adjusted for other page sizes.
When DB2 chooses to use dynamic prefetch has also changed significantly in DB2 9. Dynamic prefetch is used for index scans and for table access through an index. When dynamic prefetch is picked at bind time, DB2 can still switch between prefetch and non-prefetch access at run time, based on the data pages actually being accessed, using something called sequential detection. In addition, dynamic prefetch doesn’t use a triggering page and can scan forward or backward, something that is handy for a backward index scan. Finally, dynamic prefetch engines can also run in parallel. A few utilities, available via APAR PK44026, are also recent converts to dynamic prefetch. For example, dynamic prefetch is enabled for:
• UNLOAD phase of REORG INDEX
• The UNLOAD and BUILD phases of REORG TABLESPACE PART
• The BUILD phase of LOAD REPLACE PART
• The RUNSTATS phase of RUNSTATS INDEX.
The changes to dynamic prefetch in DB2 9 apply to single table access, multitable join, outer join, subquery, and union. Sequential prefetch, as of DB2 9, is used only for table space scans.
Dynamic prefetch can deliver significant performance advantages over sequential prefetch because it isn’t dependent on the optimizer making the correct decision at bind time; the access method can change on the fly as data patterns change.
The prefetch changes are available in DB2 9 CM.
Clone tables: Clone table support in DB2 9 could be an excellent way to minimize an outage caused by performing a total refresh on a data warehouse table. Let’s say warehouse table X is completely replaced every evening for use the following day. Let’s also assume that access to other tables in the warehouse depends on being able to get to the data in table X. In the time it takes each evening to replace (refresh) table X, none of the tables dependent on it are available. Clone table support could be the solution.
With clone table support, a matching table, or a clone, is generated. The clone table structure is identical in every way to the original base table; it’s even created in the same table space. This table clone can be populated with new data while the original base table is still being accessed. Both SQL INSERTs and the LOAD utility can be used to get data into the table clone. Once the clone has been populated with the appropriate data, the SQL EXCHANGE statement can be executed to switch which table DB2 accesses. Prior to issuing the EXCHANGE, DB2 uses the original base table. After issuing the EXCHANGE, DB2 is referencing what was the clone table. The entire switch is extremely fast and completely transparent to applications and users. DB2 tracks which is the clone and which is the base table by manipulating the special naming convention in the underlying VSAM data sets.
As DB2 is using what was the clone as the new base table, what used to be the base table before the EXCHANGE is now the clone and is eligible to be loaded with data for the next table space flip. This is similar to the concept of an online LOAD REPLACE.
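A minimal sketch of the refresh cycle (the table names are hypothetical) uses the new ADD CLONE and EXCHANGE syntax:

```sql
-- Create the clone; the base table must reside in a universal table space.
ALTER TABLE SALES_FACT
    ADD CLONE SALES_FACT_CLONE;

-- Populate SALES_FACT_CLONE here with INSERTs or the LOAD utility while
-- applications continue to read SALES_FACT ...

-- Then flip which underlying VSAM data sets DB2 treats as the base table.
EXCHANGE DATA BETWEEN TABLE SALES_FACT AND SALES_FACT_CLONE;
```

After the EXCHANGE, queries against SALES_FACT transparently see the freshly loaded data, and the old data sets become the clone for the next refresh.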
Although clone tables aren’t really a performance option, they can improve your availability. You can’t work on the performance of anything if you can’t get to the object.
Clone tables are available only after you’ve upgraded your DB2 9 subsystem to NFM. In addition, you can clone only base tables created in universal table spaces.
Reordered Row Format (RRF) is one of those significant changes that most wouldn’t even know about if not for a couple of user group presentations. This change affects how a table’s variable length columns are ordered when written to disk. A data warehouse, by its nature, can have lots of variable length columns.
A challenge that has existed for DB2 professionals since DB2’s beginnings is the placement of fixed length (CHAR) columns vs. variable length (VARCHAR) columns in a newly created table definition. There have been conflicting views on what’s correct, efficient, and the best coding practice. Most have agreed that the VARCHAR columns should go at the end of the row, or at least that infrequently updated variable length columns should go last. However, no matter what order you use to define a table’s columns, once DB2 hits a VARCHAR column when retrieving a row, it must calculate the starting position of each subsequent column in that row. Until DB2 reads the length of the VARCHAR column (the two-byte prefix on every VARCHAR column), it has no idea where to find the beginning of the next column that needs to be retrieved.
DB2 9 provides a potential solution to this quandary. Once you get to DB2 9 for z/OS NFM, all VARCHAR columns in newly created table spaces are placed at the end of the row. DB2 9 can make this change because the row format itself has changed. DB2 9 introduces Reordered Row Format (RRF), a straightforward concept: a DB2 9 NFM row contains all the fixed length columns first, followed by pointers to the beginning of each VARCHAR column, followed by the actual variable length data. Rather than scan what could be some lengthy columns just to find the beginning of the column it’s looking for, DB2 only needs to scan the list of pointers to find the location of the beginning of the column you want.
You’ll get RRF for any table space created in DB2 9 NFM; additionally, any table space, or table space partition, will be converted to RRF when a REORG or LOAD REPLACE is run against it. Be cautious when working with partitioned table spaces. If you REORG only selected partitions of a table space, you’ll end up with some partitions, the ones that have been reorganized, in the new RRF. The remaining partitions that haven’t yet been reorganized will stay in basic row format, the term used to describe the row format prior to DB2 9 for z/OS NFM.
RRF is available only after you’ve upgraded your DB2 9 subsystem to NFM.
Optimization Service Center (OSC) is a great free tool available for download from DB2’s Website. Straightforward to use and easy to learn, it helps programmers, DBAs, and systems types determine what your SQL is really doing—providing a wealth of information about SQL behavior with minimal effort on your part. Should you come across an issue you believe IBM needs to resolve, it has a phenomenal way of getting all the information to DB2 Level 2 support with the click of a button. OSC can be used in any mode (CM or NFM) of DB2 for z/OS Version 8 or DB2 9 for z/OS.
DB2 for z/OS Version 8 was almost as robust as DB2 9 in terms of the enhancements a BI application or data warehouse can use. Here’s a list of a few of the more significant V8 enhancements that will augment your data warehouse experience:
• Materialized query tables
• Automatic space management
• Backward index scan
• Sparse index for star join
• SQL statements up to 2MB
• Up to 4,096 partitions
• Statement IDs of cached statements as input to EXPLAIN
• Control Interval size larger than 4KB
• More tables in join
• Common table expressions
• Recursive SQL
• Index-only access for VARCHAR columns
• Longer table and column names.
This article doesn’t even come close to discussing all the performance enhancements in DB2 9, not to mention all the general enhancements to DB2. Your data warehouse or BI application could benefit from some of the new SQL being delivered by DB2 9, such as the TRUNCATE, MERGE, INTERSECT, EXCEPT, RANK, and ROW_NUMBER statements and functions. SQL improvements that should be of interest are INSTEAD OF triggers, cultural sort, caseless comparisons, and FETCH FIRST in fullselect. You also will get histograms, more parallelism, more System z Integrated Information Processor (zIIP) eligible workload, cross-query block optimization, generalized sparse index, in-memory data cache method, pureXML, and much more. These are only some of the new features and enhancements that could benefit a data warehouse using DB2 for z/OS.
As you gain all these useful features that can directly affect the performance of your data warehouse and BI applications, you also may be able to reduce the amount of CPU DB2 9 uses. In testing performed by IBM’s Silicon Valley Lab, some CPU savings over DB2 Version 8 NFM were obtained by moving to DB2 9 CM, with additional savings realized when moving to DB2 9 NFM. However, the most significant CPU reductions were realized when the move to DB2 9 NFM was completed and when the new features and functions of DB2 9 NFM were finally used.
DB2 9 is an extremely robust version that everyone should be planning to upgrade to regardless of their warehousing situation. If you’re running a data warehouse that uses DB2 for z/OS, there are several new features and improved functions that should interest you. If you’re considering implementing a new warehouse or moving an existing warehouse to DB2, then DB2 9 for z/OS is definitely the release you should plan to install if you want the absolute most from your relational database management system.