Applications & Databases

Apr 1 ’06

When you first start V8, you’re running in compatibility mode (CM): the DB2 catalog is still in EBCDIC, and you’re prevented from exploiting new function. However, all SQL parsing in V8 occurs in Unicode (specifically, UTF-8), even in CM. SQL statements that are input in EBCDIC, ASCII or UTF-16 must be converted to UTF-8. In CM, all the DB2 object names that the SQL statements reference must then be converted back to EBCDIC so they can be compared to DB2 catalog data. In addition, all utility control statements must be converted to Unicode before they’re parsed. When you migrate to new-function mode (NFM), the DB2 catalog is converted to Unicode, and fallback to V7 is no longer possible. New V8 subsystems always run in NFM. SQL statement conversion is unaffected by the switch to NFM, but the DB2 object names that the SQL statements reference no longer need to be converted, because they’re already in the CCSID of the DB2 catalog.
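As a rough illustration of the conversion steps just described, the following Python sketch traces one statement through both modes. It is not DB2 code: the cp037 codec (standing in for an EBCDIC CCSID), the helper names, and the crude split-based stand-in for parsing are all assumptions made for illustration.

```python
# Illustrative only: codec names and helpers are assumptions, not DB2 internals.
EBCDIC = "cp037"  # assumed EBCDIC CCSID for the application and the CM catalog


def to_parser_encoding(statement: bytes, source_codec: str) -> bytes:
    """Convert an incoming SQL statement to UTF-8, the encoding used for parsing."""
    return statement.decode(source_codec).encode("utf-8")


def catalog_key(name_utf8: bytes, catalog_codec: str) -> bytes:
    """Return a parsed object name in the encoding the catalog uses."""
    if catalog_codec == "utf-8":
        return name_utf8  # NFM-like case: no second conversion needed
    # CM-like case: the name must be converted back to EBCDIC for the comparison
    return name_utf8.decode("utf-8").encode(catalog_codec)


# An EBCDIC statement arrives and is converted to UTF-8 for parsing.
stmt = to_parser_encoding("SELECT * FROM EMP".encode(EBCDIC), EBCDIC)
name = stmt.split()[-1]  # crude stand-in for real SQL parsing

cm_catalog = {"EMP".encode(EBCDIC)}    # catalog names held in EBCDIC
nfm_catalog = {"EMP".encode("utf-8")}  # catalog names held in Unicode

found_in_cm = catalog_key(name, EBCDIC) in cm_catalog
found_in_nfm = catalog_key(name, "utf-8") in nfm_catalog
```

In this model the lookup succeeds in both modes; the difference is only that the CM-like path performs one extra conversion per object name.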

Unicode Conversion Improvements

The improvements in Unicode conversion come from three sources. Each of these sources affects different facets of conversion performance, such as the type of characters and the length of the character strings. The three sources are:

  • V8 major and minor conversion
  • Improvements in the z/OS conversion services
  • Improvements in the zSeries processor hardware.

V8 major and minor conversion improves the performance of all types of conversions, but the improvements are most dramatic with English alphanumeric characters because DB2 doesn’t need to invoke the z/OS conversion services for them. Conversion of SQL statements and DB2 object names generally involves English alphanumeric characters, and for many years the DB2 catalog has been EBCDIC. To improve performance, V8 introduced a faster way to convert such characters without calling the z/OS conversion services. This faster way is called minor conversion; the term major conversion refers to conversions performed by the z/OS conversion services.

The zSeries instruction set has always included a TR instruction that translates one byte for one byte, based on a 256-byte translation table. A TRT instruction has also existed that can be used to test a string for certain one-byte characters. As long as a single-byte English alphanumeric character can be represented in UTF-8 by a single byte, a TR instruction can be used to convert the byte to UTF-8. V8 has built-in translate tables for most common single-byte CCSIDs. If the TRT instruction is “successful,” DB2 can translate the string using a TR instruction. The presence of shift-in and shift-out characters in a mixed string always causes the TRT instruction to fail. If the TRT fails, DB2 must invoke the z/OS conversion services. Minor conversion performs much better than major conversion because it avoids calling the z/OS conversion services, which makes it one of the significant advantages of V8 over V7.
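The test-then-translate logic can be sketched in Python as follows. This is a simplified model under stated assumptions, not DB2’s implementation: it assumes CCSID 37 (Python’s cp037 codec) as the source, builds its own tables, and treats a TRT “failure” as finding any byte that is shift-out (X'0E'), shift-in (X'0F') or has no one-byte UTF-8 form. All function names are hypothetical.

```python
# Simplified model of minor conversion; names and table rules are assumptions.
SHIFT_OUT, SHIFT_IN = 0x0E, 0x0F


def build_tables(codec: str = "cp037"):
    """Build a 256-byte translate table (TR) and a 256-byte test table (TRT)."""
    translate = bytearray(256)
    test = bytearray(256)
    for b in range(256):
        utf8 = bytes([b]).decode(codec).encode("utf-8")
        if b in (SHIFT_OUT, SHIFT_IN) or len(utf8) != 1:
            test[b] = 1            # flagged: not eligible for minor conversion
        else:
            translate[b] = utf8[0]  # one byte in, one byte out
    return bytes(translate), bytes(test)


def trt(data: bytes, test_table: bytes) -> int:
    """Model of TRT: index of the first flagged byte, or -1 if none is found."""
    for i, b in enumerate(data):
        if test_table[b]:
            return i
    return -1


def minor_convert(data: bytes, translate_table: bytes, test_table: bytes):
    """Return UTF-8 bytes, or None when the caller must use major conversion."""
    if trt(data, test_table) != -1:
        return None
    return data.translate(translate_table)  # model of TR: byte-for-byte lookup


TR_TABLE, TRT_TABLE = build_tables()
simple = minor_convert("SELECT NAME FROM T1".encode("cp037"), TR_TABLE, TRT_TABLE)
accented = minor_convert("CAFÉ".encode("cp037"), TR_TABLE, TRT_TABLE)
mixed = minor_convert(b"\xc1\x0e\x42\x42\x0f", TR_TABLE, TRT_TABLE)
```

Here the plain alphanumeric statement is translated directly, while the accented string and the string containing shift-out and shift-in bytes return None, which stands for the fallback to the z/OS conversion services.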

The z/OS conversion services contain both a 31-bit and a 64-bit service. V7 uses the 31-bit service, while V8 usually uses the 64-bit service. An enhancement known as HC3, shipped for z/OS in PTF UA05789, included significant performance improvements, especially for conversions between UTF-8 and EBCDIC or ASCII. The UTF-8 enhancement applies to both the 31-bit service and the 64-bit service. To take advantage of it, you must rebuild your conversion tables after the PTF has been applied.

With this enhancement, conversion services under z/OS 1.4 can be as much as 3.75 times faster than under z/OS 1.3. In addition to the UTF-8 enhancement, the 64-bit service implemented some improved module linkage that gives V8 an advantage over V7. Studies have shown that V8 improves the performance of major conversion over V7 and that this improvement is most evident with small strings. With a 10-byte string, V8 is 1.5 times faster. With a 200-byte string, V8 is only about 15 percent faster.

In terms of zSeries processor hardware, the z900 model introduced some new Unicode hardware instructions that are simulated by the z/OS conversion services on older processors. In addition, a z990 engine is approximately 50 percent faster than a z900 Turbo engine, but when it comes to the Unicode conversion of large strings, the z990 hardware is twice as fast. The z/OS conversion services take advantage of several new hardware instructions that are faster on the z990 and z890 processors. These instructions are:

  • CUTFU: Convert from UTF-8 to UTF-16
  • CUUTF: Convert from UTF-16 to UTF-8
  • TROO: Translate from one byte to another one byte (used to convert single-byte codes)
  • TRTO: Translate from two bytes to one byte (used to convert from UTF-16 to EBCDIC)
  • TROT: Translate from one byte to two bytes (used to convert from EBCDIC to UTF-16)
  • TRTT: Translate two bytes to two bytes (used to convert between UTF-16 and double-byte EBCDIC).
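To make the one-byte-to-two-byte case concrete, here is a small Python model of a TROT-style translation from EBCDIC to UTF-16. It is a sketch only: the real instruction works on registers and storage operands, and details such as early termination are omitted. The cp037 codec and the function names are assumptions.

```python
# Simplified model of a one-byte-to-two-byte table translation (TROT-style).
# Function names and the choice of cp037 are illustrative assumptions.

def build_one_to_two_table(codec: str = "cp037") -> bytes:
    """Build a 512-byte table: entry b holds the big-endian UTF-16 unit for byte b."""
    table = bytearray(512)
    for b in range(256):
        code_point = ord(bytes([b]).decode(codec))
        table[2 * b:2 * b + 2] = code_point.to_bytes(2, "big")
    return bytes(table)


def translate_one_to_two(src: bytes, table: bytes) -> bytes:
    """Replace each source byte with its two-byte table entry."""
    out = bytearray()
    for b in src:
        out += table[2 * b:2 * b + 2]
    return bytes(out)


TABLE = build_one_to_two_table()
utf16 = translate_one_to_two("DB2".encode("cp037"), TABLE)
```

Each input byte costs one table lookup and produces two output bytes, so the work in this model grows linearly with the length of the string.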

Like many other instructions, the CPU time of these instructions increases in proportion to the string length. In a workload involving Unicode conversion, the CPU time effects of these instructions can be high, especially with large strings. The z990 and z890 processors have provided significant enhancements to the TROO, TROT, TRTO and TRTT instructions, making them as much as five times faster than on the z900 and z800 processors. The larger the string, the larger the performance benefit of the z990 and z890 processors. Studies have shown that, with V7, the z990 reduces the conversion CPU time for 10-byte strings by a factor of 1.5, which is typical of the z990 improvement we have measured for most applications. For 200-byte strings, the improvement increases to 200 percent. Performance for other string lengths may be extrapolated from the 10-byte and 200-byte results. We see similar improvements in V8.

Conclusion

The enhancements we’ve seen in V8, z/OS conversion services, and the z990 hardware dramatically improve Unicode conversion performance. Prior to the z/OS conversion services improvement (HC3), Unicode conversion to EBCDIC was always slower than ASCII conversion to EBCDIC. Now, in V7, Unicode conversion is about 20 percent faster than ASCII conversion. In addition, V8 major Unicode conversion is about 12 percent faster than ASCII conversion, but minor conversion is still much faster. The cost of conversion can affect the total CPU time of your application. V8’s increased exploitation of Unicode means that DB2 will perform far more conversions to and from Unicode than in the past. The recent enhancements made both inside and outside DB2 will improve the performance of these conversions.
