Dec 4 ’09
Threadsafety and the CICS Open Transaction Environment: Background, Hints and Tips
IBM’s CICS transaction processing software has been enhanced in recent releases to extend its support for the Open Transaction Environment (OTE). This article describes the background and history of OTE, and offers some examples and guidance on several problems that may be encountered when exploiting OTE. This article also describes how to prepare your CICS systems and applications to efficiently exploit OTE.
Evolution of OTE
Support for OTE was first introduced in CICS Transaction Server 1.3. Initially, this support was limited to the Java programming environment within CICS, which was also introduced in that release. OTE provides a means to execute different programming environments, such as a Java Virtual Machine (JVM), under a separate z/OS Task Control Block (TCB). By contrast, traditional CICS workloads execute in a single-threaded manner under the CICS Quasi-Reentrant (QR) TCB. This is the workhorse TCB that handled all CICS application workload in previous releases.
The CICS Dispatcher domain employs rapid multi-tasking, letting each task running under the QR TCB execute until it cooperatively relinquishes control (for example, as the result of a suspend or wait). This cooperative processing model is efficient at rapidly dispatching work in CICS. However, using a single TCB to dispatch application work means CICS is single-threaded and can’t exploit multiple Central Processors (CPs) in a truly concurrent manner.
OTE-managed TCBs are called Open TCBs in CICS. They’re truly independent of the QR TCB and are dispatched and executed separately. In a multiengine hardware environment, with multiple CPs available for parallel execution of work, Open TCBs can execute code at the same time the QR TCB runs other work in the same CICS region.
Provision of separate TCBs to execute work in parallel in CICS helps address any throughput constraints that might be seen when all work is otherwise executed under the one (QR) TCB. In addition, the use of z/OS services that can suspend a TCB and which would block the QR TCB for a period of time can now be considered in CICS. Previously, these services were documented as being restricted for CICS application programming use. With OTE, and in particular the latest CICS Transaction Server V3 enhancements for OPENAPI programming, such calls may now be considered for use in CICS. This is feasible, since any effect of blocking a particular application executing code under its own Open TCB won’t affect other applications also running under their own TCBs in the same CICS system.
The evolving history of OTE in CICS can be summarized as follows:
• CICS Transaction Server 1.3 introduced OTE in CICS. Support was limited to Java applications. JVMs were executed under J8 Open TCBs. In addition, H8 Open TCBs were provided to execute an optimized Java environment. This was known as Hotpooling, and provided a compiled Java run-time environment for CICS applications. It was introduced as a temporary measure; the JVM runtime environment was acknowledged as the strategic platform for Java applications in CICS.
• CICS Transaction Server 2.2 extended OTE to also support Task Related User Exits (TRUEs) defined as OPENAPI. The DB2 TRUE was changed to exploit this. Such TRUEs were invoked under L8 Open TCBs.
• CICS Transaction Server 2.3 provided various enhancements to OTE functionality, such as user-key (J9) Open TCBs for JVMs.
• CICS Transaction Server 3.1 extended OTE to provide L8 and L9 Open TCBs for CICS-key and user-key application programs defined with OPENAPI on their program definitions. In addition, X8 and X9 TCBs were introduced for support of XPLINK in C and C++ CICS-key and user-key applications. Internally, CICS Transaction Server 3.1 also exploits S8 Open TCBs for CICS Sockets domain work, and uses L8 TCBs for CICS programs defined as OPENAPI and which must perform Web Services activity that references data on the Hierarchical File System (HFS). Such calls would block the QR TCB if executed under that. They’re a good example of CICS using its own OTE functionality to exploit multiple TCBs for its own purposes. CICS Transaction Server 3.1 also removed support for H8 TCBs.
• CICS Transaction Server 3.2 further extended support for OTE by enabling the CICS-WebSphere MQ adapter as an OPENAPI-enabled TRUE, as CICS Transaction Server 2.2 did for the DB2 TRUE. It also enabled parts of the EXEC CICS File Control component to exploit OTE. These enhancements help CICS execute parallel workloads on multiple TCBs.
There’s a performance cost in switching between TCBs. The more often CICS has to issue a TCB switch to move control from the QR TCB to an L8 TCB, for example, and back again, the more CPU usage is required.
So it’s best to reduce TCB switching as much as possible. This means programs need to be correctly defined to CICS. For example, Global User Exits (GLUEs) executed in the path of threadsafe application logic should be made threadsafe, if possible, to avoid the need for CICS to have to switch to the QR TCB when driving them. GLUEs driven in threadsafe commands may issue EXEC CICS commands that are themselves non-threadsafe, and so require switching back and forth from the QR TCB to process them. The same applies to threadsafe application programs that issue non-threadsafe EXEC CICS commands.
It’s important to correctly define programs, especially exit programs, to avoid unnecessary TCB switching. It’s also important to understand the sequence of EXEC CICS commands issued from within the applications and to recognize those commands that require TCB switching.
CICS Application Threadsafety
CICS application program definitions or program autoinstalls let the programmer define whether an application is threadsafe. A definition of CONCURRENCY(THREADSAFE) means a program was written to threadsafe standards. If it accesses any shared resources such as the CICS Work Area (CWA), it takes into account the possibility that other programs may be executing concurrently and attempting to modify the same resources.
A program that isn’t threadsafe is defined as CONCURRENCY(QUASIRENT). Such a program is quasi-reentrant only, and relies on CICS to provide serialization when it executes. CICS dispatches such programs under the QR TCB, which provides that serialization because only one task runs under the QR TCB at any time.
The threadsafety of a CICS application applies to its own logic, not to the particular EXEC CICS commands the program may issue. Some EXEC CICS commands are threadsafe, and don’t have an affinity to any particular TCB type. Other EXEC CICS commands aren’t threadsafe, and CICS will automatically switch to the QR TCB to process them. Defining an application as threadsafe means the business logic of the program is itself able to execute under an Open TCB.
JVM programs must execute under their Open TCB in CICS. While the thread of execution is in JVM, the code is executing under a J8 or J9 TCB. If the Java application invokes a JCICS method call, then CICS will invoke the appropriate EXEC CICS command mapped to that method. If this is a threadsafe EXEC CICS command, it will execute under the Open TCB. If it’s not threadsafe, CICS will switch control to the QR TCB for the duration of the request, and then switch back to the JVM’s Open TCB when control returns from CICS.
CICS OPENAPI TRUEs, such as those for DB2 and WebSphere MQ, must execute under an Open TCB. Upon return to the application after the call to the TRUE, CICS will leave control on the Open TCB if the application is defined as threadsafe, or switch back to the QR TCB if it’s defined as quasirent. If the program is threadsafe, control will remain on the Open TCB until a non-threadsafe EXEC CICS command is issued. CICS will then switch to the QR TCB to process the command.
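The switching rules described above can be sketched as a small model. This is a conceptual illustration in Python, not CICS code: the function name trace_tcbs, the command strings, and the two lookup sets are assumptions made for the sketch, and the only commands classified are ones this article itself describes (WRITEQ to temporary storage as threadsafe, SEND as non-threadsafe, DB2 as an OPENAPI TRUE call).

```python
# Conceptual model only: this is not CICS code and it calls no CICS API.
# The names and the two sets below are illustrative assumptions.

THREADSAFE_COMMANDS = {"WRITEQ TS"}   # treated as threadsafe in this sketch
OPENAPI_TRUE_CALLS = {"DB2"}          # calls routed to an OPENAPI TRUE


def trace_tcbs(commands, concurrency):
    """Return (list of (command, tcb), switch_count) for one task.

    concurrency is "THREADSAFE" or "QUASIRENT", as on the program definition.
    """
    current = "QR"            # the program is assumed to start on the QR TCB
    switches = 0
    used = []
    for command in commands:
        if command in OPENAPI_TRUE_CALLS:
            needed = "L8"     # an OPENAPI TRUE must run under an Open TCB
        elif command in THREADSAFE_COMMANDS:
            needed = current  # threadsafe commands run wherever the task is
        else:
            needed = "QR"     # non-threadsafe commands are processed on QR
        if needed != current:
            switches += 1
            current = needed
        used.append((command, current))
        # A quasirent program is switched back to QR when the TRUE returns.
        if concurrency == "QUASIRENT" and current != "QR":
            switches += 1
            current = "QR"
    return used, switches


if __name__ == "__main__":
    sequence = ["DB2", "WRITEQ TS", "DB2", "SEND"]
    for setting in ("THREADSAFE", "QUASIRENT"):
        used, switches = trace_tcbs(sequence, setting)
        print(setting, switches, used)
```

For the sequence in the usage section, the model counts two switches when the program is threadsafe and four when it is quasirent, which is the saving the threadsafe definition is meant to deliver when DB2 calls are interleaved with threadsafe commands.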
In CICS Transaction Server 3.1, programs defined with API(OPENAPI) are invoked under their own Open TCB. Like JVMs, their internal logic must run under an Open TCB. If they invoke threadsafe EXEC CICS commands, these are processed under the Open TCB. If they invoke non-threadsafe EXEC CICS commands, CICS will switch control to the QR TCB for the duration of the command, and then switch back to the Open TCB when the command completes.
CALLed Programs and Non-Serialized Resources
Switching an application program definition to make it threadsafe doesn’t guarantee that the program itself is written to threadsafe standards. All it means is that CICS can execute the program under an Open TCB if it deems it appropriate to do so. Whether the program can actually run under such a TCB is another matter. If it wasn’t written to threadsafe standards, various unpredictable results may occur. The easiest to identify is a simple abend. Such programs may not always result in abends; they may leave the data associated with the programs in an invalid state. This result is hard to detect and even harder to resolve.
One example where problems can occur is when a program that is itself threadsafe is redefined as such to CICS, but makes use of calls to a subprogram. If this was, for example, a subprogram invoked from a static COBOL CALL, it would be link-edited to the main program. Natural language calls such as this aren’t part of the EXEC CICS Application Program Interface (API); they don’t cause CICS to pass through its program domain logic. At the program level, CICS is unaware that the main program has passed control from itself to somewhere else. The program domain still believes the main program is in control.
If the subprogram wasn’t written to threadsafe standards, it wouldn’t be able to safely execute in a multi-threaded TCB environment. For example, it may have a hard-coded Register Save Area (RSA) within itself. While this is a safe technique if CICS guarantees to execute the subprogram only in a serialized, uninterrupted manner under the QR TCB, it’s not viable if multiple TCBs can simultaneously enter the subprogram. The RSA contents won’t be guaranteed to be preserved across each call to the program, and an abend or program check is the likely result.
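The save-area hazard can be shown with a deterministic model. In this Python sketch, which is an illustration and not CICS or assembler code, generators stand in for two tasks entering the same subprogram, and each yield marks a point where another TCB could run. All names (STATIC_SAVE_AREA, subprogram_static, subprogram_per_task, run_interleaved) are hypothetical.

```python
# Conceptual model only. "Registers" are plain strings and the save area is
# a dictionary; nothing here corresponds to a real CICS or z/OS interface.

STATIC_SAVE_AREA = {}   # one save area inside the load module, shared by all


def subprogram_static(caller_registers):
    """Non-threadsafe style: save the caller's registers in the shared area."""
    STATIC_SAVE_AREA["regs"] = caller_registers
    yield                             # another task may enter here
    return STATIC_SAVE_AREA["regs"]   # restores whatever is there now


def subprogram_per_task(caller_registers):
    """Threadsafe style: each invocation owns its save area."""
    save_area = {"regs": caller_registers}
    yield
    return save_area["regs"]


def run_interleaved(subprogram):
    """Enter the subprogram from two tasks before either one returns."""
    task_a = subprogram("A-registers")
    task_b = subprogram("B-registers")
    next(task_a)                      # task A saves its registers
    next(task_b)                      # task B saves its registers
    restored = []
    for task in (task_a, task_b):
        try:
            next(task)                # resume the task so it can return
        except StopIteration as done:
            restored.append(done.value)
    return restored


if __name__ == "__main__":
    print(run_interleaved(subprogram_static))    # task A gets B's registers
    print(run_interleaved(subprogram_per_task))  # each task gets its own
```

With the shared save area, task A is handed task B's registers on return; with per-invocation storage, each task gets back what it saved.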
This type of problem could be detected if the load module containing the programs had been link-edited with the RENT attribute, since a self-modifying subprogram such as this would then have abended trying to update its read-only storage. The problem could be avoided if a CICS API command such as EXEC CICS LINK was used to pass control to the subprogram. CICS would then have a program definition for the subprogram, which could be defined as quasirent to ensure that the LINKed-to program executed under the serialized QR TCB environment. EXEC CICS LINK commands may not always be appropriate, however.
A natural language call has a shorter path length than a LINK; the trade-off is that it doesn’t provide the rich options available from an EXEC CICS LINK command such as workload balancing, trace diagnostics, Execution Diagnostic Facility (EDF) hooks, etc. If a natural language call was still required, another alternative would be to rewrite the subprogram to threadsafe standards, such as giving it its own separate working storage area (DFHEISTG for assembler programs, for example). Each transaction would then get its own copy of user storage mapped to DFHEISTG, and the RSA contents would be separated between transactions and not potentially trampled on.
Non-serialized programs can lead to problems such as this. They may be made threadsafe by various techniques. If they require serialization around specific operations within them, they can employ techniques such as the use of EXEC CICS ENQUEUE and DEQUEUE commands, or instructions such as Compare and Swap (CS), Compare and Double Swap (CDS), Test and Set (TS), or even the Perform Locked Operation (PLO) instruction.
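The retry pattern used with an instruction such as Compare and Swap can be sketched as follows. Python has no CS instruction, so this illustration emulates one: the Word class and its private lock are assumptions of the sketch, there only to make the compare and the store a single indivisible step. The part that mirrors real CS usage is the loop in add_one, which reloads the value and retries whenever another thread changed the word first.

```python
import threading


class Word:
    """A storage word with an emulated compare-and-swap operation.

    The private lock exists only to make the emulated instruction atomic;
    callers never take a lock themselves.
    """

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Store new only if the word still holds expected; report success."""
        with self._guard:
            if self._value != expected:
                return False
            self._value = new
            return True


def add_one(word):
    """Retry until no other thread changed the word between load and swap."""
    while True:
        old = word.load()
        if word.compare_and_swap(old, old + 1):
            return


def run(threads=4, updates=1000):
    """Update one shared word from several threads and return its final value."""
    word = Word()

    def worker():
        for _ in range(updates):
            add_one(word)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return word.load()


if __name__ == "__main__":
    print(run())
```

Because a failed swap is simply retried, no update is lost: four threads making 1,000 updates each leave the word at 4,000.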
Shared Storage Use
The EXEC CICS ADDRESS CWA command allows an application to obtain the address of the CWA, which is a piece of storage CICS provides for shared use by any applications. Typically, this is used to act as a “commarea” between different applications. While this sort of shared storage is safe when accessed in a serialized manner, its contents can’t be guaranteed to be consistent when used in a concurrent processing environment.
CICS applications that execute under Open TCBs can potentially be referencing (or worse, updating) fields in the CWA at the same time as other applications running under the QR TCB. This can lead to data integrity problems in the CWA and unpredictable program results. The DFHEISUP utility can scan target application programs for commands that access shared storage in CICS. These are the EXEC CICS ADDRESS CWA, EXTRACT EXIT, and GETMAIN SHARED commands. The DFHEIDTH filter table is provided for use by DFHEISUP to detect these. Once detected, you can make the application(s) threadsafe with suitable changes to their logic. For example, you could use ENQ and DEQ operations around the logic that references/updates CWA fields. In cases where this isn’t appropriate, say perhaps due to no longer having access to the original source of the program(s), the CICS RDO PROGRAM definitions may need to be changed to redefine them as quasi-reentrant once more.
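The idea of bracketing CWA updates with ENQ and DEQ can be sketched with a lock standing in for the enqueued resource. This is a Python illustration, not CICS code: the CWA dictionary, its two fields, and the function names are hypothetical, and threading.Lock only approximates the behavior of an EXEC CICS ENQ/DEQ pair.

```python
import threading

# Hypothetical shared area with two fields that must stay consistent.
CWA = {"orders": 0, "items": 0}

# Plays the role of the resource named on the ENQ and DEQ commands.
_cwa_resource = threading.Lock()


def record_order(item_count):
    """Update both CWA fields as one serialized unit."""
    _cwa_resource.acquire()          # ENQ: wait until no other task holds it
    try:
        CWA["orders"] += 1
        CWA["items"] += item_count
    finally:
        _cwa_resource.release()      # DEQ: always release, even on failure


def run(tasks=8, orders_per_task=500, items_per_order=3):
    """Drive concurrent updates and return a copy of the final CWA contents."""
    CWA["orders"] = 0
    CWA["items"] = 0

    def worker():
        for _ in range(orders_per_task):
            record_order(items_per_order)

    pool = [threading.Thread(target=worker) for _ in range(tasks)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return dict(CWA)


if __name__ == "__main__":
    print(run())
```

Every program that touches the shared fields has to use the same resource for this to work; one program that updates the CWA without the bracket defeats the serialization for all of them.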
A harder problem to identify is when applications acquire their own storage by means other than the EXEC CICS GETMAIN SHARED API. An example could occur if they issue MVS GETMAIN calls to acquire OSCORE storage, rather than EXEC CICS GETMAIN storage from the CICS DSAs. If this storage is shared between applications, there’s the potential for concurrency-related problems in the same manner as those previously described when using the CWA. Since MVS GETMAIN calls are made to acquire such storage, the DFHEIDTH filter table and DFHEISUP can’t detect them. Such sharing of non-CICS managed storage must be identified by a careful analysis of the programs before redefining them as being threadsafe. Again, suitable serialization techniques would be required to ensure that any such shared storage was accessed in a well-defined manner when used in a concurrent Open TCB environment.
TCA and TCB Considerations
Addressing a task’s Task Control Area (TCA) control block used to be possible via the CSACDTA field in the CICS Common System Area (CSA). However, use of this field is reliable only when executing under the QR TCB. With the introduction of OTE, it’s not safe to assume the TCA address held in CSACDTA is the TCA of the task that’s inquiring upon the address. CSACDTA contains the address of the task currently dispatched under the QR TCB. The task that’s looking at this field may be executing under an Open TCB (possibly an L8). The wrong TCA address would be picked up by the program, leading to problems when referencing it.
Use the CICS System Programming Interface (SPI) for programs that wish to access task state information; it provides this information in a threadsafe manner and avoids the problem inherent in using CSACDTA. The CSACDTA field was renamed to CSAQRTCA in CICS Transaction Server 3.1 to discourage any remaining use of this field to access a TCA address.
An application defined as API(OPENAPI) can potentially use non-CICS APIs. Use of such APIs generally requires that the key of the TCB running the program matches the execution key as held in the Program Status Word (PSW). For this reason, CICS provides L9 TCBs for those user-key programs defined with API(OPENAPI), and L8 TCBs for those CICS-key programs. This is different from a program defined as API(CICSAPI), since this uses only CICS APIs and applications can successfully run in either CICS-key or user-key, regardless of the key associated with the TCB they’re running under. Such programs can execute under the QR TCB, an L8, or an L9 TCB.
OPENAPI programs are given control under an Open TCB. The program logic must be executed under that TCB. This means there’s potential for additional TCB switching to occur in some circumstances. For example, if a user-key OPENAPI program is given control, it will execute under an L9 TCB. If it invokes a non-threadsafe EXEC CICS command, this will require an automatic switch to the QR TCB for the duration of the command, followed by a switch back to the L9 TCB upon completion of the CICS API call. Similarly, if such a program issues a call to an OPENAPI TRUE such as DB2, this must be executed under an L8 TCB and so automatic switches to and from the L8 TCB must occur while DB2 is being called.
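The extra switching described for OPENAPI programs can be estimated with a small model. This Python sketch is an illustration, not CICS code: the function name, the command strings, and the lookup sets are assumptions. It also assumes that a CICS-key OPENAPI program, which already runs under an L8 TCB, needs no switch to call an OPENAPI TRUE.

```python
# Conceptual model only: names and classifications are illustrative.

THREADSAFE_COMMANDS = {"WRITEQ TS"}   # treated as threadsafe in this sketch
OPENAPI_TRUE_CALLS = {"DB2"}          # calls routed to an OPENAPI TRUE


def count_switches_openapi(commands, execution_key):
    """Count TCB switches for an API(OPENAPI) program.

    execution_key is "USER" (program runs on an L9 TCB) or "CICS" (L8 TCB).
    """
    home = "L9" if execution_key == "USER" else "L8"
    switches = 0
    for command in commands:
        if command in OPENAPI_TRUE_CALLS:
            needed = "L8"     # OPENAPI TRUEs run under an L8 TCB
        elif command in THREADSAFE_COMMANDS:
            needed = home     # threadsafe commands stay on the program's TCB
        else:
            needed = "QR"     # non-threadsafe commands are processed on QR
        if needed != home:
            switches += 2     # one switch out, one switch back to the home TCB
    return switches


if __name__ == "__main__":
    sequence = ["DB2", "WRITEQ TS", "SEND"]
    print("user key:", count_switches_openapi(sequence, "USER"))
    print("CICS key:", count_switches_openapi(sequence, "CICS"))
```

Under these assumptions the sample sequence costs four switches for a user-key program and two for a CICS-key program, because only the user-key program has to leave its L9 TCB for the DB2 call.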
For these reasons, be careful when deciding to define threadsafe user-key applications as API(OPENAPI). Candidates for this would be those programs that issue few, if any, non-threadsafe EXEC CICS commands. Also, you might consider programs that don’t tend to invoke OPENAPI TRUEs. Beyond that, good candidates would be programs that need to perform CPU-intensive workloads or wish to invoke non-CICS APIs.
The provision of API(OPENAPI) on program definitions lets you move application workloads off the QR TCB and on to multiple Open TCBs. Be aware, however, that IBM has documented a warning regarding the use of OPENAPI. Use of other (non-CICS) APIs in CICS is entirely at the user’s risk and discretion. IBM has not performed testing of other (non-CICS) APIs and doesn’t provide service support for their use.
There are several ways to assess an application’s threadsafety. This is important when deciding whether it’s appropriate to redefine a program from quasirent to threadsafe.
You can use the DFHEISUP utility to detect the signatures of probable commands that indicate non-threadsafe activity in a program such as the ADDRESS CWA or GETMAIN SHARED examples previously discussed.
Relinking a program as RENT, and specifying RENTPGM=PROTECT in the SIT, will result in the program being loaded into read-only DSA storage (i.e., the RDSA or ERDSA). Any attempt by the program to modify its own contents will result in message DFHSR0622 and a protection exception (abend 0C4).
The DFH0STAT report can list characteristics of exit programs executed by the program(s) of interest. This is valuable in understanding the performance impact of redefining the program(s) as threadsafe. Quasirent exit programs will incur TCB switching back to the QR TCB as they’re executed.
Likewise, CICS supplied transactions such as CEDA and CEMT can be used to analyze program attributes.
There’s no one solution to guaranteeing that a program is written to threadsafe standards and it isn’t possible to fully automate the decision as to whether or not to redefine the program. Also, as previously discussed, problems from such an incorrect decision may not immediately manifest themselves, so any user testing of a redefined program isn’t guaranteed to reveal problems due to non-serialized activity. The only way to ensure that is to perform a thorough code inspection of the program logic. The other techniques described can be helpful in such a decision, but should be used as part of an overall person-led analysis of the program(s).
One visual way of confirming the TCB usage from a threadsafe or OPENAPI program is to use CICS trace and record the events associated with a task running the program. Figure 1 shows an edited example of such a trace. An application is running as task number 19283 in CICS. When it makes a call to an OPENAPI TRUE such as DB2, CICS switches control from the QR TCB to an Open TCB—in this case, L8 TCB number L8337.
On return from DB2, control stays on the Open TCB since the program is defined as threadsafe. It issues an EXEC CICS WRITEQ command to temporary storage, which is itself a threadsafe command, so this is processed under the Open TCB, too. It then issues an EXEC CICS SEND command, which isn’t threadsafe, so CICS must switch back to the QR TCB to process the request. Control will then stay on the QR TCB until another OPENAPI TRUE call is made, or CICS switches TCBs for another reason such as an EXEC CICS LINK to an OPENAPI program.
This example shows the DSAT CHANGE_MODE trace entries. By default, these aren’t seen with standard DS level 1 tracing active. To see them, DS level 2 (or ALL) needs to be set under CETR. You can still recognize TCB switches without these CHANGE_MODE calls, since the TCB identifier in the second column will change.
This article has helped explain the background to CICS threadsafety issues with OTE and provided an overview of its evolution since CICS Transaction Server 1.3. The article also has described considerations to keep in mind when implementing Open TCB programming in CICS.