Mar 1 ’05

DFHTRAP: Assisting the CICS Systems Programmer

by Editor in z/Journal

IBM CICS has provided transaction processing services for businesses for more than 35 years. During that time, a team of dedicated IBM service employees has supported CICS and IBM’s program product family. IBM has continually enhanced CICS while adding support for many new features such as Web access, SOAP support, Java applications, etc.

IBM staff members dedicated to CICS are based in the local geographies for each country, and in larger support centers such as the Level 2 organization in Raleigh, NC, and at IBM’s Hursley Park Laboratories in the U.K. The CICS Change Team in the U.K. works with CICS Level 2 and the product developers to provide diagnosis, problem determination, and defect fixes, also known as Program Temporary Fixes (PTFs).

The CICS trace facility provides useful information for CICS problem determination; it provides thousands of unique trace points, together with state data and variable information. As CICS executes, the trace records a detailed chronological history of the events occurring within CICS. You can use trace to reconstruct the series of events that led up to a particular failure, or to examine the state of different CICS control blocks and state data as processing continues.

One key aspect of CICS trace support that’s less familiar to some users is the CICS global trap/trace exit program, DFHTRAP. You should use DFHTRAP only with guidance from IBM’s service personnel. There are good reasons for this, and the restrictions on what may and may not be done within DFHTRAP are discussed later in this article. A default version of this module is supplied with CICS. If the need arises, IBM service personnel may provide a specific replacement for DFHTRAP and offer guidance on how to use it to gather more data and help resolve a particular problem. This article defines and details DFHTRAP in the CICS MVS environment and explains how it can be used and what it provides.

The CICS Trace Domain and DFHTRAP

CICS is divided into several discrete component areas known as domains. These domains are responsible for managing the function and data associated with different parts of CICS processing. Examples of CICS domains include the:

- Dispatcher Domain, which handles the dispatching of different tasks within CICS

- Log Manager Domain, which manages the writing and reading of CICS and user log data

- Trace Domain, which encapsulates the logic and control blocks necessary to support tracing to the various trace destinations CICS supports: the internal trace table (a wraparound area of memory above the 16MB line), auxiliary trace (two BSAM-managed data sets used to hold larger volumes of trace data if required), and the Generalized Trace Facility (GTF), which can be useful when merging CICS trace information with that from other products.

Each time CICS has to record a trace entry, it will invoke Trace Domain services to write the data to its destination.

CICS provides the CETR transaction, which lets the user dynamically control many aspects of CICS trace activity. This includes switching on and off CICS internal and auxiliary tracing, setting the levels of component tracing for the various functional areas in CICS, and using selective tracing for specific transactions and terminals. In addition to the use of CETR, the CICS system initialization parameters, STNTR and SPCTR, can be used to control the initial trace settings. STNTRxx and SPCTRxx can be used to control individual component tracing at CICS start-up.

Selecting higher levels of the standard component tracing is a useful diagnostic tool for determining a problem in a particular CICS domain or component. Likewise, using CICS selective tracing can be helpful when debugging CICS application problems; it provides detailed trace information for a specific program environment.

If DFHTRAP is active, the Trace Domain will also pass control to it as part of the trace operation. This occurs after the trace entry is written (for example, to internal trace). The Trace Domain provides the exit with a parameter list of information, which is mapped by the supplied DSECT DFHTRADS.

DFHTRAP is a CICS-supplied assembler program. CICS supplies a sample version containing a basic skeleton of some limited functionality in both the load-module form and also as a source file (in the SDFHMAC library).

Note that DFHTRAP is not a SLIP trap, nor a trap the operating system or related subsystems provide. It’s shipped with CICS and, unless activated, is never executed as part of normal CICS function. If the trap hasn’t been activated, the CICS Trace Domain writes the trace data and returns control to its caller without invoking DFHTRAP. This is expected behavior.
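The flow just described can be modeled in a few lines. The following Python sketch is only an illustration, not CICS code: the class `TraceDomainModel`, its attributes, and the table size are all hypothetical. It shows a wraparound internal trace table and a trap that is driven only when it is active, and only after the entry has been written.

```python
from collections import deque


class TraceDomainModel:
    """Toy model of the flow described in the text (all names hypothetical)."""

    def __init__(self, table_size, trap=None):
        # A bounded deque behaves like a wraparound table: once it is full,
        # each new entry pushes out the oldest one.
        self.internal_table = deque(maxlen=table_size)
        self.trap = trap            # callable standing in for DFHTRAP
        self.trap_active = False    # models CSFE DEBUG,TRAP=ON / TRAP=OFF

    def write_entry(self, entry):
        # 1. The trace entry is written to its destination first.
        self.internal_table.append(entry)
        # 2. Only then, and only if the trap is active, is the trap invoked.
        if self.trap_active and self.trap is not None:
            self.trap(entry)


seen_by_trap = []
domain = TraceDomainModel(table_size=3, trap=seen_by_trap.append)

domain.write_entry("entry-1")      # trap inactive: written, trap not called
domain.trap_active = True
for name in ("entry-2", "entry-3", "entry-4"):
    domain.write_entry(name)

print(list(domain.internal_table))  # the oldest entry has wrapped out
print(seen_by_trap)                 # only entries written while active
```

In this model the first entry is traced but never seen by the trap, which mirrors the point that an inactive trap costs nothing beyond the normal trace write.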

You should use the global trap/trace exit only with the guidance of IBM’s service personnel. There are certain restrictions on what operations DFHTRAP may perform. Using DFHTRAP may result in an adverse performance impact on a CICS system if the trap has to execute many instructions to perform its analysis. Also, DFHTRAP can generate a variety of side effects such as requesting CICS system dumps or requesting a system termination. Clearly, it’s a powerful tool and its use should be limited to specific requirements. Typically, these are to detect the occurrence of an error that can’t be diagnosed by another means of problem determination. Some forms of CICS storage overlays, timing windows leading to data corruption, or randomly occurring events leading to specific types of failures, may all require use of DFHTRAP to further IBM analysis.

DFHTRAP is intended to execute specific code purely to help with the understanding and diagnosis of a particular CICS problem or situation. It’s designed so a detailed piece of online diagnosis may be performed as part of normal CICS operations. Having such a diagnostic exit within the Trace Domain is ideal because it lets the trap perform its analysis during the flow of execution from all parts of the CICS system. DFHTRAP can be used in this way without having to stop and restart the CICS system under investigation, which makes its use as transparent as possible.

Receiving Updates to DFHTRAP

If IBM requests you run a version of DFHTRAP to generate additional diagnostic information, the CICS Change Team or Level 2 service personnel will provide you with an updated version of the code containing the specific instructions necessary to analyze the particular areas of interest and carry out the required diagnostics. This may be in the form of a usermod containing a ++MOD for DFHTRAP’s object module; alternatively, the changes may be provided as a source delta in ++MACUPD format. A source delta can be applied to the source version of DFHTRAP as originally supplied in SDFHMAC and reassembled ready for use in CICS.

Only one DFHTRAP may exist in a CICS system at any time. In the unlikely event that more than one problem needs to be trapped, a composite DFHTRAP would be required to investigate the problems.

Managing DFHTRAP

To use DFHTRAP services, there’s a CICS-supplied field engineering transaction called CSFE. You can use it to activate DFHTRAP in the CICS system or to deactivate it. The format of the command to activate DFHTRAP is CSFE DEBUG,TRAP=ON. The corresponding command to deactivate it is CSFE DEBUG,TRAP=OFF. If IBM has instructed you to use DFHTRAP during CICS initialization, code TRAP=ON in the SIT, or as a CICS start-up override in the CICS JCL.

A version of DFHTRAP may need to be replaced while CICS is active. If so, you can use CSFE to turn the trap off, then use CEMT SET PROGRAM(DFHTRAP) NEWCOPY to pick up the updated version from a target library. Then, you can reissue CSFE DEBUG,TRAP=ON to begin invoking the new version of DFHTRAP.

DFHTRAP Input and Output Data

The DFHTRADS DSECT contains a series of addresses the CICS Trace Domain passes to DFHTRAP. The most important are the addresses of the Common System Area (CSA), the Task Control Area (TCA), a Register Save Area (RSA) for DFHTRAP to use, and the most recently written CICS trace entry, which is provided for the trap to analyze. If DFHTRAP is invoked early during CICS start-up, there may be no CSA within the system, in which case the CSA address passed to DFHTRAP will be zeroes. Likewise, not all tasks within CICS have a TCA environment, so there may be no TCA address to pass to DFHTRAP either.

The parameter list also addresses an 80-byte work area for use by DFHTRAP. An MVS GETMAIN issued by the Trace Domain acquires this; its contents are initialized to binary zeroes when the trap is activated. The work area and its contents then persist until the trap is deactivated. This work area is formatted in both a CICS transaction dump and CICS system dump for offline analysis.

CICS will invoke DFHTRAP for trace entries issued under its various Task Control Blocks (TCBs), but the Trace Domain serializes the trace entries themselves (and use of DFHTRAP).

When DFHTRAP executes, it may need to ask CICS to perform certain actions as a result of its online analysis. The global trap/trace exit parameter list provides the address of a return-action flag byte for this purpose. DFHTRAP can set this to indicate what actions should be performed when control returns from it to CICS. The possible actions include the following:

- The trap may elect for CICS to do nothing. This is the normal result of using a trap. On nearly every invocation no problem is detected, so the trap has no work to do in gathering diagnostic information.

- The trap may tell CICS to issue a further trace entry, passing data items of interest to be traced by this new trace call (the trace point ID is TR 0103). Note that if a TR 0103 trace entry is made, it will follow the trace entry that caused the trap to run and generate it.

- The trap can instruct CICS to take a system dump of the CICS region if a problem is discovered. The system dump code is TR1003.

Typically, when a trap discovers a problem and requests a system dump, it also instructs CICS to disable the global trap/trace exit so DFHTRAP is no longer invoked on subsequent trace entries. This is done because, if the problem persists, any subsequent invocations of DFHTRAP would also detect the problem and request further system dumps, which would be unnecessary and have a detrimental effect on the running CICS system.

Some problems are so serious that a DFHTRAP may elect to tell CICS to terminate itself when they’re detected. This action is accompanied by message DFHTR1000. It’s rare that such action is necessary, but the functionality is provided in case it’s ever required.

You can select these return actions in any combination; a given DFHTRAP can elect to set all, some, or no actions when it returns control to CICS.
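Because the actions can be combined, the return-action flag byte behaves like a set of independent bits. The Python sketch below is a minimal illustration of that idea only: the flag names and bit values are hypothetical and are not taken from DFHTRADS, whose actual layout should be read from the supplied DSECT.

```python
from enum import IntFlag


class ReturnAction(IntFlag):
    """Hypothetical stand-ins for the return-action flag bits."""
    NONE = 0
    FURTHER_TRACE = 1      # ask CICS to write a TR 0103 trace entry
    SYSTEM_DUMP = 2        # ask CICS to take a system dump
    DISABLE_TRAP = 4       # ask CICS to stop invoking the trap
    TERMINATE_CICS = 8     # ask CICS to terminate itself


def describe(actions):
    """Return the names of the individual actions set in a flag value."""
    return [flag.name for flag in ReturnAction
            if flag.value and flag in actions]


# The normal case: nothing detected, so no action is requested.
quiet = ReturnAction.NONE

# The typical "problem found" case: take a dump and disable the trap.
found = ReturnAction.SYSTEM_DUMP | ReturnAction.DISABLE_TRAP

print(describe(quiet))
print(describe(found))
```

Modeling the actions as bits makes the "all, some, or none" rule explicit: any combination is just the bitwise OR of the individual requests.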

Often, a DFHTRAP will issue a WTO when it has detected the corruption or problem it was searching for. This is a useful means of identifying the problem occurrence to the CICS systems programmer and ties in with the dump and trace diagnostics that may also be generated by CICS upon return from DFHTRAP.

The CICS system dump formatter VERBX for the TR (Trace) Domain formats information relating to the DFHTRAP being used by CICS and the trap environment. For example, the 80-byte work area contents are formatted as part of the VERBX data.

If using DFHTRAP results in a system dump when a problem is detected, it’s sufficient to just have CICS internal trace active (with a suitably large internal trace table size) because the dump contains the preceding trace entries of interest. Conversely, if DFHTRAP is used to generate additional trace entries, CICS tracing to auxiliary or GTF destinations is more appropriate.

DFHTRAP Usage Models

The DFHTRAP logic can examine the most recently issued trace entry and see whether it’s of interest for problem determination. Some supplied DFHTRAPs perform further analysis only if they’re being driven for particular trace entries. This may be because it’s only relevant to examine certain parts of CICS when certain events are occurring. Another reason may be to limit the overhead of executing a particular version of DFHTRAP. A particular problem investigation might require DFHTRAP to run down a great many chained control blocks, and this sort of operation might be too expensive in terms of CPU and elapsed time if it were performed on every trace entry. For that reason, DFHTRAPs are often provided with logic to perform their analysis only when invoked for specific trace entries or, for example, when a task switch occurs. In this way, the overhead of using DFHTRAP can be minimized.

This approach has to be tempered by the fact that if the window between successive DFHTRAP analysis points is too large, the problem being investigated has correspondingly more time in which to occur undetected. The shorter the period between DFHTRAP analysis points, the easier it is to isolate the cause of a problem, because there’s less activity within the CICS system to consider between the point when DFHTRAP last performed its analysis and the current invocation.
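The trade-off can be illustrated with a small model. In the Python sketch below, which is an illustration under stated assumptions rather than real trap logic, the trace point IDs ("AA 0001" and so on) are made up, and `expensive_check` stands in for whatever costly analysis a real trap might perform. The trap is driven for every trace entry but runs its analysis only for the selected points.

```python
def make_selective_trap(points_of_interest, expensive_check):
    """Build a trap that runs expensive_check only for selected trace points."""
    stats = {"invocations": 0, "analyses": 0}

    def trap(trace_point_id):
        stats["invocations"] += 1
        if trace_point_id not in points_of_interest:
            return False            # cheap exit: nothing to analyze
        stats["analyses"] += 1
        return expensive_check()    # True would mean "problem detected"

    return trap, stats


# Made-up trace point IDs; only "BB 0002" triggers the costly analysis.
trap, stats = make_selective_trap({"BB 0002"}, expensive_check=lambda: False)
for point in ["AA 0001", "BB 0002", "CC 0003", "BB 0002", "AA 0001"]:
    trap(point)

print(stats)
```

Widening or narrowing the set of selected trace points is the knob the text describes: fewer points mean less overhead but longer gaps between analysis points.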

A problem investigation may require DFHTRAP to be used over several iterations. As the failing environment is better identified, the trap logic can be refined each time to be more selective about when it executes, which reduces the performance impact on CICS of running the trap.

The sample version of DFHTRAP provided in the SDFHMAC library shows how to request a further trace entry from CICS when the trap returns control. This TR 0103 trace data is written to the currently active trace destinations in the same way as any other trace point written on the CICS system.

It may be that the analysis DFHTRAP performs results in a program check. With corrupted pointers in CICS, care must be taken that the trap logic doesn’t rely upon them. Typically, program checks will be the result of the corruption that the trap is attempting to catch. Likely program check types are S0C1 (operation exception) and S0C4 (protection exception). You may also see S0C7 program checks (data exceptions) if packed decimal data is involved in the area of the problem.

If a program check occurs in DFHTRAP, CICS will mark the trap as unusable to prevent its execution as part of future Trace Domain invocations. This prevents subsequent invocations of the trap from program checking for the same reason. CICS will also perform First Failure Data Capture (FFDC) by issuing a diagnostic message, DFHTR1001, and taking a CICS system dump (dump code TR1001). This will contain the failing DFHTRAP environment, the registers, and the Program Status Word (PSW) at the time of the program check. The documentation should be supplied to IBM in the normal way for further investigation.
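The "mark unusable after a failure" behavior can be sketched as follows. This Python model is purely illustrative: `GuardedTrap` and its attributes are hypothetical names, and a Python exception stands in for a program check. It is not how CICS implements its recovery.

```python
class GuardedTrap:
    """Hypothetical wrapper: a failing trap is marked unusable and skipped."""

    def __init__(self, trap):
        self.trap = trap
        self.usable = True
        self.failure = None

    def invoke(self, entry):
        if not self.usable:
            return                      # never re-drive a failed trap
        try:
            self.trap(entry)
        except Exception as exc:        # stands in for a program check
            self.usable = False
            self.failure = repr(exc)    # stands in for captured diagnostics


calls = []


def faulty_trap(entry):
    calls.append(entry)
    if entry == "bad":
        raise ValueError("simulated program check")


guard = GuardedTrap(faulty_trap)
for entry in ["ok", "bad", "ok-again"]:
    guard.invoke(entry)

print(calls, guard.usable)
```

After the simulated failure the trap is not driven for the third entry, so the same fault cannot recur on every subsequent trace call.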

One interesting aspect of the use of a supplied version of DFHTRAP is when it’s written to detect a particular example of data corruption or storage overlay. If the problem is consistent and the area of memory that’s being affected is consistently changed from one value to another, then it may be possible for the DFHTRAP to modify the storage contents back to their original value after it detects the corruption. By gathering the necessary FFDC information at the time of corruption, and using the opportunity to address and re-modify the affected piece of storage, a DFHTRAP can perform repair work in addition to generating the information necessary to determine the cause of the corruption. This online self-correction is particularly useful in cases where the corruption could otherwise lead to serious system stability or data integrity concerns. While it must be used with caution, it’s another important example of the type of work DFHTRAP may be called upon to perform when necessary.
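The detect, capture, and repair sequence can be sketched in Python. This is a conceptual model only: the storage is a plain `bytearray`, and the field offset and `EXPECTED` value are hypothetical. The point it illustrates is the ordering, with the corrupted value recorded for diagnosis before the original value is put back.

```python
EXPECTED = b"\x00\x00\x10\x00"   # hypothetical known-good field contents


def check_and_repair(storage, offset, diagnostics):
    """Detect an overlay of a 4-byte field, record it, then restore it."""
    end = offset + len(EXPECTED)
    current = bytes(storage[offset:end])
    if current == EXPECTED:
        return False                      # nothing wrong on this invocation
    diagnostics.append(current)           # capture the bad value first
    storage[offset:end] = EXPECTED        # then put the original value back
    return True


storage = bytearray(16)
storage[4:8] = EXPECTED
captured = []

first = check_and_repair(storage, 4, captured)    # field is intact
storage[4:8] = b"\xff\xff\xff\xff"                # simulate the overlay
second = check_and_repair(storage, 4, captured)   # detected and repaired

print(first, second, captured, bytes(storage[4:8]) == EXPECTED)
```

A repair of this kind only makes sense when the corruption is consistent, as the text notes; if the "good" value cannot be known with certainty, restoring it could make matters worse.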

DFHTRAP Restrictions

Again, DFHTRAP should be used only under the guidance of IBM’s service personnel. Certain functions are restricted and must not be performed within the global trap/trace exit. For example, the code added to DFHTRAP must not invoke any CICS services, either directly or indirectly. This would lead to recursion within the CICS Trace Domain environment and unpredictable results. Also, the trap logic must not cause the currently dispatched task to lose control. In other words, CICS Dispatcher services must not be invoked in a way that results in a switch of the task currently dispatched in CICS. In addition, the status of the CICS system must not be changed by the use of DFHTRAP. Invoking the trap should be a transparent operation as far as CICS is concerned. Part of this transparency requires that DFHTRAP save and restore the general purpose register values around its invocation, so control never returns to the Trace Domain with different register values.

DFHTRAP has a requirement to run with AMODE(31) and RMODE(ANY) attributes. Apart from the need to access CICS control blocks and state data above the 16MB line, there’s also the requirement that DFHTRAP always returns control to the CICS Trace Domain in AMODE(31). In addition, DFHTRAP executes in Storage Key 8 and base space environments with respect to storage protection and transaction isolation.

With the introduction of the Open Transaction Environment (OTE), CICS now provides the support for parallel dispatching of different tasks under their own TCBs. CICS now supports OTE-managed “Open” TCBs for use by environments such as Java Virtual Machines (JVMs) and OpenAPI TRUEs running within CICS. These TCBs may be dispatched and execute their own tasks within CICS simultaneously; each TCB may be executing concurrently under its own Central Processor (CP).

Parallel dispatching within CICS does impose a limitation upon some of the operations DFHTRAP may perform. If a trap needs to analyze state data that may be modified by another task running simultaneously on another TCB, then it’s important that the trap allow for this possibility. IBM may need to write the supplied trap so that certain items of state data are serialized while the trap is analyzing them, preventing concurrent updates from affecting its analysis. This is non-trivial and underscores the importance of using DFHTRAP only with guidance from IBM’s service personnel.

DFHXCTRA

DFHXCTRA is a user-replaceable program for use in much the same way as DFHTRAP, but for the External CICS Interface (EXCI) environment. It’s invoked whenever an EXCI trace entry is made. Again, the source is provided in the SDFHMAC library. It has a similar role to DFHTRAP, but is provided to perform analysis within the EXCI address space rather than CICS.

Summary

This article has explained the CICS diagnostic trap exit, DFHTRAP. We’ve looked at its typical uses and benefits for performing complex online problem determination with guidance from IBM. For more information, see the CICS bibliography at www.ibm.com/software/htp/cics/library.