Nov 18 ’14
Opening the CICS Diagnostic Toolbox
CICS has been continually developed and enhanced during its lifetime, and its problem determination data has also been extended and improved during that time. Facilities such as z/OS system dumps, internal and external tracing, exception trace data and built-in diagnostic trap mechanisms, together with a rich set of first failure data capture messages, journaling and SMF data, are all available to assist in problem determination, both at an application and at a system level.
This article, the first of two, focuses on the message and dump problem determination facilities provided with CICS, and gives details on how they may be used and what they can offer in assisting with system problem analysis.
If you encounter a problem with a CICS system, an initial step in identifying the cause is to look at the messages that have been issued.
CICS messages are prefixed with DFH (for CICS), EYU (for CICSPlex SM) or AXM (for the authorized cross-memory server environment). The DFH messages conform to one of two standardized formats: DFHnnnn or DFHccnnnn. DFH, the IBM identifier for CICS modules, is followed by either a four-digit message number, or a two-letter CICS domain or component identifier and a four-digit message number. These message identifiers are unique, making it easy to find the description of each message in the documentation that accompanies CICS.
If you have access to a running CICS system, you can use the CMAC transaction to display the description of a CICS message. The description typically includes an explanation of the events leading up to the production of the message, the action that has been or will be taken by CICS, and the action that you can take. In addition, all messages are described in the CICS Messages and Codes manuals and in the IBM Knowledge Center (which is the recommended reference point for CICS documentation).
CICS messages may be suffixed by action or severity codes. Action codes immediately follow the message number (for example, DFHDB8208D) and provide guidance on the type of action needed. The action codes used are:
• I – Where no action is required
• E – For eventual action, which is required but does not have to be taken immediately
• A – For immediate action, such as mounting a tape
• D – For immediate decision, such as a reply to a request.
Severity codes indicate whether the associated message is reporting an error, and if so, how serious it is. DFHST0210 I is an example of a message identifier, followed by a severity code. The following severity codes are used:
• I – For information only. No action is required.
• W – For an alert. Something may have gone wrong (for example, a program loop), but CICS processing continues.
• E – For an error. Action is required before CICS processing can continue.
• S – For a severe error. CICS processing is suspended until action is taken.
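As an illustration of the identifier formats and code letters described above, the following Python sketch splits a DFH message line into its parts. It is a minimal example, not a CICS-supplied tool: the names parse_dfh_message and CicsMessageId are hypothetical, and the pattern assumes exactly the two formats given in this article (DFHnnnn and DFHccnnnn), with an action code attached directly to the number and a severity code separated from it by a space.

```python
import re
from typing import NamedTuple, Optional

# Code letters and meanings as listed in the article.
ACTION_CODES = {
    "I": "no action required",
    "E": "eventual action",
    "A": "immediate action",
    "D": "immediate decision",
}
SEVERITY_CODES = {
    "I": "information",
    "W": "alert",
    "E": "error",
    "S": "severe error",
}

# Assumption: DFH, an optional two-letter domain, four digits, then either
# an attached action code (DFHDB8208D) or a space-separated severity code
# (DFHST0210 I). The lookahead requires the identifier to end at whitespace
# or at the end of the line.
_DFH_MESSAGE = re.compile(
    r"^DFH(?P<domain>[A-Z]{2})?(?P<number>\d{4})"
    r"(?P<action>[IEAD])?"
    r"(?:\s+(?P<severity>[IWES]))?"
    r"(?=\s|$)"
)


class CicsMessageId(NamedTuple):
    domain: Optional[str]    # two-letter domain or component, if present
    number: str              # four-digit message number
    action: Optional[str]    # action code letter, if present
    severity: Optional[str]  # severity code letter, if present


def parse_dfh_message(line: str) -> Optional[CicsMessageId]:
    """Return the parts of a DFH message identifier, or None if no match."""
    match = _DFH_MESSAGE.match(line.strip())
    if match is None:
        return None
    return CicsMessageId(
        match["domain"], match["number"], match["action"], match["severity"]
    )
```

Given the two identifiers quoted in this article, the sketch reports domain DB, number 8208 and action code D for DFHDB8208D, and domain ST, number 0210 and severity code I for DFHST0210 I. EYU and AXM messages are deliberately not matched.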
CICS messages are sent to one or more destinations:
• The system console
• A terminal
• A log
• A transient data (TD) queue, such as CSMT for terminal errors and abend messages.
In addition, CICS provides the XMEOUT Global User Exit point to allow system programming interaction with the message being issued.
With each new release of CICS, new messages are introduced and existing messages are updated to reflect new functionality. Of note are the DFHAP1900 system programming interface (SPI) audit messages introduced in CICS 5.1. SPI commands can change resource definitions dynamically, and an incorrect change can lead to failures, so it is useful to keep an audit record of when such commands are used. As of CICS 5.1, a DFHAP1900 message is written to the CADS TD queue when a SET, PERFORM, ENABLE, DISABLE or RESYNC command is issued from a CICS non-system task, allowing an auditor or system administrator to monitor such commands. Note that some commands, such as SET TERMINAL and PERFORM SHUTDOWN, are not audited. These exceptions are documented in the IBM Knowledge Center.
CICS transaction and system dumps provide a snapshot of storage areas within CICS at the moment they were taken. System dumps contain components of the z/OS address space, while transaction dumps are limited to transaction-related storage areas. Such detailed information is very useful in problem determination, and can be used together with trace, logs and statistics that provide information about the running of a CICS system over a period of time. As system dumps contain more information than transaction dumps, they are typically more useful and preferred by IBM support, should they be required to investigate a problem.
A dump is taken either by request of a user or in the event of a transaction or CICS abend. To request a transaction dump, issue EXEC CICS DUMP TRANSACTION from a program. Transaction dumps are written to either of the dump data sets DFHDMPA or DFHDMPB. You can determine the status of these dump data sets by entering CEMT INQUIRE DUMPDS.
In some cases you may not get a transaction dump when an abend occurs. Usually this is because either dumping is suppressed or an error occurred that prevented the dump being taken; for example, because no transaction dump data sets were available. To ensure that a dump is taken whenever possible, start your CICS region with the DUMP=YES SIT parameter and specify DUMP=YES in your transaction definitions.
Alternatively, or as a more complete debugging aid, you can request that a system dump be taken in the event of a transaction abend. To do so, enter CEMT SET TRDUMPCODE(xxxx) SYSDUMP MAX(1) ADD, where xxxx is the transaction abend code. Specify a MAX value greater than the current value, which can be seen in the CUR field when entering CEMT INQUIRE TRDUMPCODE(xxxx). The same method can be used to request that a system dump be taken when a particular CICS message is issued: enter CEMT SET SYDUMPCODE(xxxxxx) SYSDUMP MAX(1) ADD, where xxxxxx is the CICS message identifier without the DFH prefix. You can then re-create the problem that coincides with the specified transaction abend or CICS message, or wait for the problem to recur.
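If you keep a runbook or generate operator instructions, the two CEMT commands above can be assembled programmatically. The Python sketch below only builds the command text in the form quoted in this article; it does not communicate with CICS. The function names and the current_count parameter (the value you read from the CUR field) are hypothetical, and the four-character check on the abend code is an assumption made for illustration.

```python
def trdumpcode_command(abend_code: str, current_count: int = 0) -> str:
    """Build the CEMT command that requests a system dump for an abend code.

    current_count is the CUR value shown by CEMT INQUIRE TRDUMPCODE; MAX is
    set one higher so that the next occurrence still produces a dump.
    """
    code = abend_code.strip().upper()
    if len(code) != 4:
        # Assumption for this sketch: abend codes are four characters long.
        raise ValueError("expected a four-character transaction abend code")
    return f"CEMT SET TRDUMPCODE({code}) SYSDUMP MAX({current_count + 1}) ADD"


def sydumpcode_command(message_id: str, current_count: int = 0) -> str:
    """Build the CEMT command that requests a system dump for a CICS message.

    The DFH prefix is removed if present, as the command takes the message
    identifier without it.
    """
    code = message_id.strip().upper()
    if code.startswith("DFH"):
        code = code[3:]
    if not code:
        raise ValueError("empty message identifier")
    return f"CEMT SET SYDUMPCODE({code}) SYSDUMP MAX({current_count + 1}) ADD"
```

For example, trdumpcode_command("ASRA") produces the command text for the ASRA abend code with MAX(1), and passing current_count=2 raises MAX to 3.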
Conversely, it is possible to suppress dumps; for example, in cases where the solution to a problem is not yet ready to be applied, and in the meantime you do not wish to be impacted by dumps being written. You can suppress dumps for certain abend codes and messages using the CEMT SET commands described previously, but rather than specifying SYSDUMP and a MAX value, instead specify the NOSYSDUMP (or NOTRANDUMP) option.
You can also request a system dump from outside a CICS region by using the z/OS DUMP command. This command is useful should a CICS region become unresponsive. You can also use the z/OS DUMP command to issue dump requests to multiple z/OS address spaces, including multiple CICS regions, simultaneously. For further information, see the z/OS MVS System Commands manual.
Similar to using CEMT SET TRDUMPCODE or SYDUMPCODE commands to request that CICS writes a dump when a specified event occurs, you can request that z/OS take a SLIP (Serviceability Level Indication Processing) dump when a particular event occurs. The SLIP command is also described in the z/OS MVS System Commands manual.
It is also possible for a z/OS standalone dump to be captured if a serious problem has occurred and data from multiple address spaces is required.
Once you have a dump, it can be formatted using the Interactive Problem Control System (IPCS). After providing IPCS with the data set name, you can issue various commands. For example, by entering the Status System (ST SYS) command, you can determine the origin of the dump. If the dump was requested by CICS, the “Program Requesting Dump” value is DFHKETCB.
To format a transaction dump, you can use the CICS dump utility program, DFHDUxxx, where xxx is the release number of CICS used to create the dump (specify 660 for CICS 4.1, 670 for CICS 4.2, 680 for CICS 5.1 and 690 for CICS 5.2). Similarly, the CICS dump utility DFHPDxxx (where xxx is the release number of CICS used to create the dump) can be used to format system dumps. From the IPCS command entry page, enter VERBX DFHPDxxx followed by the domain(s) and level(s) you wish to format. Typically, level 1 formats a short summary of information relating to that CICS domain or component, with higher levels providing more extensive information. You can find a summary of the system dump formatting keywords and levels in the IBM Knowledge Center.
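The release-to-suffix mapping above lends itself to a small lookup. The Python sketch below derives the utility names and assembles a VERBX command string from a set of domain keywords and levels. The function names are hypothetical, the mapping covers only the four releases listed in this article, and the keyword layout (comma-separated KEYWORD=level pairs inside single quotes) is an assumption to be checked against the formatting keyword summary in the IBM Knowledge Center.

```python
from typing import Dict

# Release numbers as listed in the article.
RELEASE_SUFFIX = {"4.1": "660", "4.2": "670", "5.1": "680", "5.2": "690"}


def dump_utility_names(cics_release: str) -> Dict[str, str]:
    """Return the transaction and system dump utility names for a release."""
    try:
        suffix = RELEASE_SUFFIX[cics_release]
    except KeyError:
        raise ValueError(f"no suffix known for CICS {cics_release}") from None
    return {"transaction": f"DFHDU{suffix}", "system": f"DFHPD{suffix}"}


def verbx_command(cics_release: str, levels: Dict[str, int]) -> str:
    """Build a VERBX command for the given domain keywords and levels.

    Assumption: keywords are passed as comma-separated KEYWORD=level pairs
    in single quotes.
    """
    utility = dump_utility_names(cics_release)["system"]
    keywords = ",".join(f"{domain}={level}" for domain, level in levels.items())
    return f"VERBX {utility} '{keywords}'"
```

For a CICS 5.1 dump, verbx_command("5.1", {"KE": 1, "SM": 1}) selects DFHPD680 and requests level 1 formatting for the Kernel and Storage Manager domains.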
If the production of a system dump was preceded by one or more CICS error messages, then the two-letter domain or component identifiers within the message identifier may indicate an appropriate component to format when you start your investigation. Three CICS domains commonly formatted to investigate problems are the CICS Kernel, the Storage Manager domain and the Trace domain. In the next article on problem determination facilities, we will describe how you can alter the size of the internal trace table to capture sufficient data for problem diagnosis.
IBM Fault Analyzer for z/OS can also be used to diagnose system and application failures in CICS. Fault Analyzer can be invoked at the time of an abend via the CICS global user exits XPCABND and XDUREQ, or the Language Environment termination exit CEECXTAN. Alternatively, Fault Analyzer can be run against previously captured dumps.
We hope this article has helped explain how messages and dumps can be used for problem determination. In our subsequent article, we will delve further into the problem determination facilities and discuss tracing, traps, CICS Performance Analyzer and shared data structure diagnostics.