Sep 1 ’03
DB2 YDB Server for z/OS Version 8
This and the previous article (z/Journal, June/July 2003) highlight the many different areas where DB2 is being enhanced to leverage the new z/OS capabilities and provide new features that take databases to the next level. This article examines the Very Large Database (VLDB) enhancements along with migration considerations for getting to DB2 UDB Server for z/OS Version 8. We also examine other enhancements, such as the new index type, Data Partition Secondary Indexes (DPSIs), statistic improvements, and new DSNZPARMS that provide numerous dynamic adjustment capabilities.
Most of the new features in DB2 Version 8 can be somewhat related to the performance, availability, and scalability considerations for VLDBs. All of the availability considerations of column redefinition and addition are beginning to be addressed through online schema evolution. Modifying a database table or index no longer requires the object to be offline. Now changes can be incorporated as needed without interrupting processing. Other availability enhancements include the ability to define or change a table’s clustering and also have it separated from the partitioning schemes. These definition and design flexibilities provide many alternative design opportunities for huge application performance improvements without the availability concerns of unloading and reloading VLDB data. Schema evolution is a huge benefit for everyone involved with the management and maintenance of any size database, but especially for VLDBs.
Additionally, the expansion to 4,096 partitions, along with the ability to add a partition at the end of a partition and rotate partitions from the beginning to the end of the partition, will help VLDBs tremendously. This flexibility, along with the new DPSI index structures, offers users the ability to run applications or maintenance against multiple partitions in parallel, which will dramatically reduce overall elapsed time.
The expansion of the memory on the z/OS platform offers a unique opportunity for VLDBs. DB2 Version 8 expansion of all its memory-related components such as buffer pools, local agent for parse trees, the RID Pool, the Sort Pool, along with better handling for compression dictionaries and the EDM Pool, allow more application processing. Expanding the memory capabilities of these components provides more flexibility to focus memory resources on VLDB issues such as buffering more data, and handling more concurrent users and more applications. As CPUs become faster and disk space becomes less expensive, efficient memory usage is the best way to address the ever-increasing I/O requirements of VLDB workloads.
Version 8 also addresses the processing of VLDB workloads through the expansion of the active and archive logs available to a system. Many VLDB enterprises were processing so many transactions that logging and retaining the archive data was a major recovery consideration. DB2 Version 8 addresses these issues by expanding the number and size of the logs available to an active DB2 system. In Version 8, the active log can now hold up to 372GB of active data and the archive logs can hold up to 40TB of data. These enlarged capabilities greatly improve the recovery prospects for high-transaction VLDB systems. These log expansions, along with the integration and improvements of z/OS 1.5 DFSMS, provide even better handling for VLDB system checkpoints, backups, and recoveries through FlashCopy facilities. Through DFSMS FlashCopy facilities, DB2 Version 8 backups and recoveries are easier to set up and execute, while the new utility options of Backup System and Restore System speak for themselves.
DPSIs AND INDEXING IMPROVEMENTS
DPSIs are another major enhancement in Version 8 that every DB2 installation will benefit from. The DPSI design improvements were developed for Version 8 in response to the many availability and performance considerations of Non-Partitioning Indexes (NPIs). Since the NPI structure references all the table entries across all partitions, it became a single point of recovery, performance, and locking consideration. Since partitioned tables are commonly some of the largest tables defined for an application, defining many NPIs on a large table has become a major liability for maintenance, concurrency, and availability issues. Also, as data growth continues, the NPIs sometimes contain the most database row entries of any single database object in an entire shop. As data growth approaches the petabyte range, DB2 continues its leadership in technology and performance as it addresses this NPI design issue with the new DPSIs.
DPSIs are extremely important because of their relationship with several of the other new features in Version 8 associated with online schema evolution. The new DPSI implementation, tablespace data clustering options, expanded number of partitions available, and the partitioning rotation options provide many flexible combinations.
Practically speaking, DPSIs are vital because they eliminate the large data considerations from recovery, load, and concurrency issues for DBAs and applications. By splitting the index entries into smaller partition parts, DPSIs provide the ability to divide and conquer the large data issues. For example, previous BUILD2 phases required by loads, reorganizations, or recoveries of NPIs are no longer needed, as DPSIs prevent a huge maintenance time outage. Also, since DPSIs are partitioned along the same scheme as the data, parallelism for all maintenance processes and application SQL is now possible, saving significant amounts of time for all activities. Concurrency is also improved because the locks are no longer associated with a single NPI data set. In a data sharing environment, DPSIs provide better granularity for locks because they spread the locks across numerous Partitioned Data Sets (PDSes).
As with any enhancement, DPSIs bring a few new caveats for the application and database designer. DPSIs split up the data along the partitioning scheme, which means there is a larger number of DB2 data sets to manage, backup, and recover. This is especially vital to remember, since DB2 Version 8 offers up to 4,096 partitions, and having just three DPSI indexes could alone result in 12,288 index data sets. Reviewing the DSMAX settings and the associated DBD size within your EDM Pool definition will be important, as the number of DPSIs and their associated data sets redefine your old NPI indexes. Also, since DPSIs allow parallelism to be leveraged across the maintenance utilities, the utilization of Sorting, and the possible use of the new SORTDEVT and SORTNUM, parameters may need to be reviewed.
Probably the most important caveat about DPSIs is that they can’t be defined as unique. Because of this, DPSIs may only be useful if the queries include a partitioning key and there aren’t many possible partitions to search. For applications to leverage the performance benefits of DPSIs, they will need to have good statistics because of their structural difference from old NPIs and their potential for parallelism. It is vital that RUNSTATS capture good DPSI column distribution statistics because of the difference between clustering of the data and how it relates to the various partitions of the DPSI structure. As enterprises evaluate the new partitioning, clustering, and DPSI design alternatives, sites should validate and retest their index designs. DPSIs offer so much flexibility that you may be able to eliminate extra processing I/Os and maintenance by combining indexes previously needed by applications.
DPSIs combined with the other index improvements offer better performance support options. Version 8 improves indexes in many new ways, including providing better support for variable length (VARCHAR) column usage. Previously, the VARCHAR column was padded to its largest possible length, potentially making the index extremely large compared to the actual data. This was a problem because the column length limit on indexes was 255 bytes, but now has been expanded in Version 8 to 2,000 bytes. Also, depending on the padding decision, the VARCHAR index information may now be available to processes with “index-only access,” providing a performance improvement for some applications.
In addition, Version 8 index and optimizer improvements provide better handling for applications. First, indexes can now support a backward scan, which may help eliminate indexes that were previously necessary for performance. Also, indexes can now be utilized when the host variable and the target column have different definitions. Decimal and Float variable types are now comparable, which is important for languages that do not have a decimal data type. This feature is very valuable, as DB2 z/OS becomes the ultimate server for any computer language such as Java, C++, C#, and any code base such as EBCDIC, ASCII, or Unicode. With these new features and the ability to handle SQL predicates up to 32KB, the DB2 z/OS implementation further extends its list of DB2 Family-compatible features.
COLLECTION OF NON-INDEXED COLUMN STATISTICS
Along with the index improvements are improvements to the RUNSTATS utility to collect statistics on non-indexed columns. These RUNSTATS statistics will help the DB2 optimizer choose better access paths and especially improve Join opportunities for all queries using non-indexed columns, which are common in data warehousing, data mining, and ad hoc queries. DB2 Version 5 previously supported gathering non-indexed columns statistics into the SYSCOLDIST DB2 Catalog table through the DSTATS tool. This tool used SQL to gather the statistics and provided limited flexibility. Version 8 includes the functionality in RUNSTATS and adds numerous parameters, allowing more efficient statistics gathering and sort handling.
For this new functionality, RUNSTATS has added the following new parameters to control the collection of these non-indexed column statistics: COLGROUP, FREQVAL, LEAST, MOST, BOTH, SORTDEVT, and SORTNUM. These parameters provide flexibility to control the grouping of columns, the number of values captured, their highest and/or lowest cardinality values, and provide the ability to customize RUNSTATS SORT parameters.
For example, capturing these statistics in the SYSCOLDISTSTATS and SYSCOLDIST DB2 Catalog tables can help the DB2 optimizer understand that there is only a limited number of values for a particular column in a table. This knowledge helps the optimizer revise its default assumption of uniform column data values and recalculate the filter factors and possibly change the access path, thereby improving query performance.
Another VLDB and availability enhancement in Version 8 is better granularity and handling of I/O type errors that require DB2 to take data set pages offline and add them to the Logical Page List (LPL) for potential recovery. These rare LPL error situations typically are encountered in extraordinary situations, such as DASD I/O failures, Group Buffer Pool failures, or Log Apply failures. DB2 in Version 7 restricted access by adding the entire partition or page set to the LPL list. Taking these complete page sets offline could result in a major outage caused by only one or two LPL problem pages.
In DB2 Version 8, recovery from LPL situations has been further automated, as DB2 first attempts to heal itself by automatically trying to recover from the situation. Additional messages and error codes are also being developed to further detail the cause of the LPL errors. The most important improvement to Version 8 LPL handling is that DB2 now only puts the individual pages in LPL offline restricted status when it encounters an error. This allows other database activity to continue while keeping the integrity of the problem pages guarded for recovery. By restricting only the directly LPL affected DB2 pages, potential outages and recoveries are minimized.
The DB2 catalog will now be stored in Unicode tables. Before Version 8, there was a restriction within DB2 on joining tables with different Coded Character Set Identifiers (CCSIDs), so that Unicode, EBCDIC, and ASCII tables cannot be accessed in the same SQL statement. Since the catalog will now be in Unicode, but the majority of user data will still be in EBCDIC encoding, this restriction would cause extreme difficulty for many vendors and users who have written applications combining catalog information and user data. Thankfully, in Version 8, these restrictions have been addressed. Tables with different CCSIDs can now be included in one SQL statement. The CCSID of the results will be determined by the default CCSID for the DB2 subsystem.
Using Unicode solves the problem of handling characters uniformly worldwide. It also helps address the lack of compatibility between the mainframe EBCDIC CCSIDs and other Unix and Web workloads. As different EBCDIC or ASCII code pages are inconsistent about the code points for some characters (such as “$” and “|”), the move to Unicode parsing solves this issue. However, one restriction to the use of Unicode deals with authorization IDs. Authorization IDs sent to a DB2 Version 8 server must conform to the security package guidelines (i.e., RACF). Distributed requesters and servers now exchange information about the various CCSIDs they understand and will use Unicode where possible.
DB2 Version 7 included the basic infrastructure for supporting Unicode, including storage and conversion of Unicode-encoded data and coordination with OS/390 Unicode Systems Services. DB2 Version 8 builds on this support by adding several new functionalities, with probably the most important one being Unicode parsing. Unicode parsing of SQL statements affects many portions of DB2:
- Any SQL statement sent for parsing is converted to Unicode UTF-8 before parsing
- The Precompiler writes SQL statements into DBRMs in Unicode
- There are new Unicode hexadecimal string constants
- Long identifiers have been extended from 18 bytes to a new maximum length of 128 bytes. Note that some of the long identifiers will be lengthened to the new 128-byte maximum, but not all names will grow. Also note that the column name maximum is still 30 bytes, as in the entire DB2 family products.
DB2 uses two forms of Unicode called UTF-8 and UTF-16. Each form specifies one set of code points that handles many character sets. UTF-8 uses 1 byte (8 bits) for common characters (code points 0 through 127) that are compatible with ASCII. Code points 128 and beyond each occupy 2, 3, or 4 bytes. DB2 uses UTF-8 for stored data in the Catalog and for SQL statements that are to be parsed. UTF-16 uses 2 bytes (16 bits) for most characters.
Since the Precompiler executes outside of DB2, there is a new Precompiler option that tells it whether or not to allow Version 8 new syntax; i.e., whether SQL statements are to be interpreted as being in EBCDIC or Unicode, whether they must be converted to Unicode, and whether or not the DBRM must be marked as Version 8-dependent. Bind and Rebind are also enhanced to take EBCDIC statements and convert them to Unicode. The additional CPU cost required in doing Unicode conversion is incurred in Bind (both static and dynamic), Precompilation, Execute Immediate, and other facilities. This cost is not expected to be significant.
In addition to parsing SQL statements in Version 8, Unicode parsing of Utility control statements is now entirely in Unicode. Output to the SYSPRINT data set and the MVS console will continue to be in EBCDIC with conversion taking place as required. There is a new utility stored procedure DSNUTILU. It is identical to DSNUTILS except that the inputs are in Unicode and dynamic allocation of data sets is removed.
The DB2 UDB for z/OS Version 8 Open DataBase Connectivity (ODBC) driver has been enhanced to include implicit data conversion of Unicode data bound to non-character or non-graphic columns. It will also now support execution of SQL statements encoded in Unicode. A new ODBC INI keyword, CURRENTAPPENSCH, will allow users to indicate which encoding scheme (Unicode, EBCDIC, or ASCII) the ODBC driver will assume for input/output host variable data, SQL statements, and all character string arguments of the ODBC APIs that are passed by the application. The new keyword will belong to the COMMON stanza. If this keyword is present in the INI file, the ODBC driver will set the Current Application Encoding Scheme special register on behalf of the application to the value specified by flowing the appropriate SET statement to DB2 upon a successful connect. If this keyword is not present, then the driver will assume EBCDIC as the default application encoding scheme. In order to avoid inconsistent data states inside the ODBC driver, users will not be allowed to issue Set Current Application Encoding Scheme as an SQL statement against the ODBC driver. Also, Unicode encoding is not supported for the actual INI file itself. Both keywords and their values must still be specified in EBCDIC.
V8 MIGRATION CONSIDERATIONS
A migration to DB2 V8 is only supported from DB2 V7 and is done in two phases: New Release migration and New Function migration.
The New Release migration is similar to migrations done in the past (say from DB2 V6 to DB2 V7). It invokes CATMAINT to migrate the catalog and directory to the Version 8 level, and includes:
- Adding entries for new catalog objects (such as SYSIBM.SYSEBCDC)
- Adding new columns to existing catalog tables
- Changing data types of certain columns
- Adding or changing catalog referential constraints.
Note that the column changes in this phase of migration are those involving FOR BIT DATA and no column lengths are being changed during the Compatibility Mode migration. Prior to running this migration, you must ensure that your DB2 V7 subsystem has installed the appropriate fallback SPE and DB2 has been started at least once.
In a data-sharing environment, all subsystems must have the fallback SPE applied before a migration is attempted.
After migration to New Function mode, there will be some incompatibilities, which are being extensively documented. See the appropriate manuals for further details. The important incompatibilities are:
- Unicode Parser – Based on a new ZPARM setting, all SQL statements may be written into IFCID records in Unicode UTF-8 (not EBCDIC).
- Customers will no longer be able to define or run LANGUAGE COMPJAVA stored procedures after migration. They must be converted to LANGUAGE JAVA.
- New Function migration massively re-tailors the catalog, including converting it to Unicode. In particular:
- Many catalog columns (especially those defining long names) are greatly expanded
- Most of the catalog and directory tablespaces are converted to Unicode
- Some tablespaces are reassigned to 8KB and 16KB virtual pools
- The SYSLINKS and SYSPROCEDURES catalog tables are dropped •SYSDUMMY1 is moved to a new tablespace
- Many indexes are changed to NOT PADDED.
This migration is done one catalog tablespace at a time using the Online Reorg utility. New Function migration is only supported from Compatibility Mode. All active DB2 subsystems must be at the proper catalog level. Once a system has been migrated to New Function mode in Version 8, there is no fallback to Version 8 Compatibility Mode or to DB2 Version 7.
Here are some additional considerations:
- Plans and packages dependent on some of these catalog objects may be invalidated
- You should ensure that Image Copies of all catalog and directory tablespaces exist or the New Function Mode conversion process cannot occur
- Ensure that you have defined and sized the appropriate 8KB and 16KB virtual pools
- Some catalog indexes may be placed in Advisory Reorg Pending state (AREO*) during the process and will be rebuilt as the process completes.
Once you have migrated to New Function mode, several incompatibilities may exist. Here are the most important ones:
- CREATE INDEX now defaults to the ZPARM padding setting chosen. NOT PADDED is the default ZPARM chosen during installs, but defaults to PADDED for subsystems that are migrated.
- Tablespace partitioning can now be table-based instead of index-based; hence, the syntax for CREATE TABLESPACE has changed.
- Queries that accessed the catalog prior to its conversion to Unicode may have to be rewritten due to the Unicode values or collating sequence.
DB2 supports the coexistence of only two releases at a time. While DB2 V7 and DB2 V8 subsystems can coexist, this is only possible if the V8 subsystem(s) are in Compatibility Mode. Once any DB2 subsystem is in New Function Mode, all of the subsystem members will be in New Function Mode.
THE INSTALLATION/MIGRATION PROCESS
During install/migrate, the precompiler default NEWFUN parameter is set based on the value of Install Type on panel DSNTIPA1 as follows:
- INSTALL: NEWFUN=YES
- MIGRATE: NEWFUN=NO
- ENFM: NEWFUN=YES
- UPDATE: NEWFUN setting taken from previous CLIST run.
Run the CLIST in MIGRATE mode. The CLIST customizes the usual set of jobs for migration from DB2 V7 to V8 Compatibility Mode. Note that the DB2 V8 Installation Verification Program (IVP) jobs are not customized at this point. Execute the jobs to migrate to DB2 V8 Compatibility Mode. To verify migration, run the IVP jobs from Version 7. When ready, run the CLIST and specify Enable New Function Mode (ENFM) for Install Type. The CLIST will then:
- Calculate space required for the VSAM shadow data sets required to support Enable New Function Mode
- Display a new Install panel listing the 18 DB2 catalog and directory tablespaces to be transformed in the next step. This list includes updateable fields for device type, volume specification, and space allocation for each shadow data set.
- Display a summary of required space allocation; changes are allowed
- Generate a new install jobs and customize the 8 DB2 V8 IVP jobs.
At this point, you can run the new install jobs to convert the DB2 catalog and directory to Enable New Function Mode. Note that there are several additional tasks that you must perform prior to enabling Version 8.
The minimum size of BP0 must be increased to 2,000 pages, and at least one 8KB and one 16KB bufferpool must be available to accommodate the longer row sizes of the catalog and directory tables.
The new generated install jobs perform the following functions:
- Invoke CATENFM to signal the beginning of processing
- Allocate the new shadow data sets
- Convert the catalog and directory tablespaces from EBCDIC to Unicode via Reorg
- Verify that all tablespaces have been converted
- Convert to New Function Mode
- Regenerate DSNHDECP to change the value of the new Precompiler NEWFUN parameter to YES.
Without any doubt, the migration process is much more involved because of the new functionality. With all this added functionality, Unicode catalog, and larger catalog table rows, estimates indicate that 10 percent larger buffer pools may be required.
ADDITIONAL ONLINE ZPARMS
Beginning in DB2 V7, DBAs and systems programmers were able to change subsystem parameters (ZPARMs) dynamically by assembling a new subsystem parameter module and then using the SET SYSPARM command to load these new parameters into storage.
DB2 V8 extends the list of ZPARMs that may be changed this way. Basically, those parameters required for serviceability or that exist in DSNHDECP are not changeable. Here are a few of the changeable parameters:
- SYSADM1, SYSADM2, SYSOPR1, SYSOPR2 (Install SYSADM and Install SYSOPR)
- CACHEDYN, MAXKEEPD (Caching of dynamic SQL)
- SRTPOOL (Sort Pool size)
- POOLINAC, IDTHTOIN, MAXTYPE1 (inactive DDF thread controls)
- PARTKEYU (ability to update the value of a column that participates in a partitioning key).
Changes to Install SYSADM authority through the SET SYSPARM command are restricted to IDs already designated as Install SYSADM. Similarly, changes to the Install SYSOPR authority are restricted to IDs already designated as Install SYSOPR. See the DB2 Administration Guide for considerations regarding resetting these authorities, including possible revoke cascade situations. Also, when IDs are cached, the old authorization ID remains in effect after online changes.
Changes to CACHEDYN from YES to NO will not release space in the EDM Pool reserved for dynamic statements, and changes to the value of CACHEDYN will not affect the previous settings of currently running statements.
DPSIs and improved index support continue to address database challenges for VLDB applications for DB2 UDB Server for z/OS Version 8. These new features enable application and database designers to address business requirements with flexible solutions. The migration process helps everyone avoid issues and protect the databases while moving to the new 64-bit z/OS environment. The many enhancements found in DB2 UDB Server for z/OS Version 8 continue to raise the bar by leveraging the latest hardware and maximizing the performance and technology capabilities. DB2 UDB Server for z/OS Version 8 is state-of-the-art database technology. Z