When IBM released CICS TS 2.2 in December 2002, introducing support for Task-Related User Exits (TRUEs) in the Open Transaction Environment (OTE) architecture, a primary selling point was potentially significant CPU savings for CICS/DB2 applications defined as threadsafe. To be threadsafe, a program must be Language Environment- (LE-) conforming, and knowledgeable CICS programmers must ensure the application logic adheres to threadsafe coding standards. (For more information, see “DB2 and CICS Are Moving On: Avoiding Potholes on the Yellow Brick Road to an LE Migration,” z/Journal, April/May 2007.) Verifying that an application and its related programs are threadsafe may require knowledge of Assembler code to follow the many tentacles of the application logic. If you define a program as threadsafe but its logic isn’t, unpredictable results can occur that could compromise your data integrity. This article provides some background on what threadsafe means at the program level, how to identify and correct non-threadsafe coding, and how to ensure your programs are maximizing their potential CPU savings.
CICS was initially designed to run all of its processing under a single Task Control Block (TCB). Once the CICS dispatcher had given control to a user program, that program had complete control of the entire region until it requested a CICS service. If the program issued a command that included an operating system wait, the entire region would wait with it. As a result, CICS programming guides included a list of operating system and COBOL commands that CICS programs couldn’t use. The flip side of these limitations was the advantage that CICS programs didn’t have to be re-entrant between CICS commands.
Because all activity in the CICS region was single-threaded, the region was also restricted to the capacity of one CPU. The introduction of multi-processor mainframes raised a new issue for CICS systems staff: purchasing a faster (and more expensive) mainframe could actually slow down CICS if the individual processors on the new machine were slower than the single processor it replaced. IBM responded by attempting to offload some of the CICS workload to additional CICS-controlled MVS TCBs that could run concurrently on a multi-processing machine. For convenience, IBM labeled the main CICS TCB as the Quasi-Reentrant, or QR TCB.
The most significant implementation of this type of offloading came with the introduction of the DB2 Database Management System (DBMS). Rather than establishing one TCB for all DB2 activity, CICS would create a separate TCB for each concurrent DB2 request and switch the task to that TCB while DB2 system code ran. While all of the application programs for each task in the region still ran single-threaded, each task’s DB2 workload could run simultaneously, limited only by the total capacity of the multi-processor. On a practical level, the DB2 workload seldom approached the CICS workload, meaning CICS users were still constrained by the processing speed of a single processor. Also, each DB2 request requires two TCB swaps: one from the QR TCB to the DB2 TCB, and one back again when the request completes. While the overhead of an individual TCB swap (roughly 2,000 instructions) is slight, the two swaps for each DB2 request can together account for as much as 30 percent of total application CPU.
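To see how a "slight" per-swap cost can grow into a large share of application CPU, the arithmetic can be sketched as follows. This is only an illustration: the 2,000-instruction swap cost and the two swaps per DB2 request come from the discussion above, while the function name, the request count, and the figure for the task's remaining path length are hypothetical values chosen for the example, and instruction counts are used as a rough stand-in for CPU time.

```python
# Rough estimate of the share of a task's instructions spent on TCB swaps.
# SWAP_INSTRUCTIONS and SWAPS_PER_DB2_REQUEST follow the figures quoted in
# the text; every other number below is a hypothetical example value.

SWAP_INSTRUCTIONS = 2_000      # approximate cost of one TCB swap
SWAPS_PER_DB2_REQUEST = 2      # QR TCB -> DB2 TCB, then back again


def swap_overhead_share(db2_requests, other_instructions):
    """Return the fraction of total instructions consumed by TCB swaps.

    db2_requests       -- number of DB2 requests the task issues
    other_instructions -- all other instructions the task executes
                          (application code, CICS code, DB2 code)
    """
    swap_instructions = db2_requests * SWAPS_PER_DB2_REQUEST * SWAP_INSTRUCTIONS
    total = swap_instructions + other_instructions
    return swap_instructions / total if total else 0.0


# Hypothetical task: 100 DB2 requests and 1,000,000 other instructions.
share = swap_overhead_share(100, 1_000_000)
print(f"TCB swaps: {share:.1%} of total instructions")
```

Under these assumed numbers, the 400,000 instructions spent on swaps make up a little under 30 percent of the total. A task that issues fewer DB2 requests relative to its other work would see a correspondingly smaller share.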
Open Transaction Environment
In a classic “aha!” moment, someone at IBM realized this TCB swapping overhead could be eliminated by simply not swapping the transaction back from the DB2 TCB and allowing application code to run there. To provide support for running CICS application code outside of the QR TCB, the concept of the OTE was developed. Put simply, OTE allows an individual CICS transaction to run under its own MVS TCB instead of sharing the QR TCB. Many transactions, each under its own TCB, can run simultaneously in the same CICS region. If a transaction running in the OTE issues an operating system wait, none of the other transactions in the CICS region are affected.
The drawback of OTE is that more than one occurrence of the same program can run simultaneously, requiring CICS programs to be re-entrant between CICS calls. A simple example of the type of problem created is the common practice of maintaining a record counter in the Common Work Area (CWA) that’s used to create a unique key. Under “classic” CICS, as long as the record counter was updated before the next CICS command was issued, the integrity of the counter was assured. With OTE, it’s possible for two or more transactions to use the counter simultaneously, resulting in duplicate keys.
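The counter problem can be sketched outside of CICS. The following Python example is only an analogy, not CICS code: the class SharedCounter, its methods, and the helper collect_keys are hypothetical names, the threads stand in for transactions running on separate TCBs, and the lock stands in for whatever serialization a real program would use (for example, an EXEC CICS ENQ and DEQ around the update).

```python
import threading


class SharedCounter:
    """Stand-in for a record counter kept in shared storage such as the CWA."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def next_key_unserialized(self):
        # Read, then write, with nothing to stop another thread from reading
        # the same value in between. When only one program runs at a time
        # this is safe; with concurrent tasks it can hand out duplicate keys.
        current = self.value
        self.value = current + 1
        return current + 1

    def next_key_serialized(self):
        # The read-modify-write is performed while holding a lock, so each
        # caller receives a distinct key.
        with self._lock:
            self.value += 1
            return self.value


def collect_keys(next_key, tasks=8, keys_per_task=1000):
    """Run several concurrent 'tasks' that each request keys; return all keys."""
    keys = []
    keys_lock = threading.Lock()

    def task():
        local = [next_key() for _ in range(keys_per_task)]
        with keys_lock:
            keys.extend(local)

    threads = [threading.Thread(target=task) for _ in range(tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return keys


keys = collect_keys(SharedCounter().next_key_serialized)
print(len(keys), "keys issued,", len(set(keys)), "unique")
```

With the serialized method, every key issued is unique. Passing next_key_unserialized instead may or may not produce duplicates on any given run, which is exactly what makes this kind of error hard to catch in testing: the failure depends on timing.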
Programs that are fully re-entrant, and that don’t assume access to data in shared storage areas will automatically be serialized, are defined as “threadsafe.” It’s crucial to remember that threadsafe isn’t a determination CICS makes, but a promise the programmer makes. By marking a program as threadsafe, the programmer is stating that the program won’t cause any damage if it’s allowed to run in the OTE.
Preparing CICS Regions for Threadsafe Activity