Modern, highly scaled CICS applications can make use of resources across many individual CICS regions running on multiple Logical Partitions (LPARs), sometimes in different time zones. Real-time monitoring has not kept pace with the way such infrastructures are now used. Service outages can have a devastating impact on business and must be minimized.
CICS systems programmers and other support personnel at large CICS sites become concerned when asked to investigate operational issues if they’re only given the name of the CICS application that needs attention. The problem is that an application name alone doesn’t pinpoint the infrastructure involved.
Consider, for example, an application initiated via one of several instances of CICS Transaction Gateway (CTG) executing on several different z/OS LPARs. Requests are sent to CICS over Logical Unit 6.2 (LU6.2) connections, and the CICS transactions involved could be started in any one of many CICS regions where the application code executes. These transactions could call on programs or resources in other specific, connected CICS regions or in one of a group of CICS regions. The resources used could include DB2 data, in which case the DB2 connection and suitably configured DB2 entries and DB2 transaction resources may be needed in those regions. The application programs may access VSAM files, either locally in the same CICS region or remotely in a File Owning Region (FOR), and special CICS Temporary Storage Queues (TSQs) may be used. When the application has a problem, it may habitually abend with a certain known transaction abend code. If the application uses the services of WebSphere MQ, then the MQConnection needs to be active in the CICS region or regions. Figure 1 shows the basic CICS infrastructure that might be involved in such an application.
It’s also often true that an accurate, timely diagram or document describing the CICS resources an application uses isn’t readily available when it’s most needed. Even if this information is available, the traditional CICS monitors can only show a small subset of the CICS infrastructure to the support analyst at one time.
Considerable time can elapse while support personnel try to understand just what should be inspected; this is frustrating for all involved. Service-impacting incidents aren’t resolved as quickly as they could be and this has a negative effect on the overall service provided.
Occasionally, there will be something obviously amiss in the CICS regions involved and the traditional CICS monitors, in situ, may provide the insight needed to begin the investigation and recovery. Just as often, the traditional CICS monitors offer no immediate clues. Even with a starting point, it can take time to understand the flow and use of CICS resources involved in the problem application.
Where problems regularly occur, the CICS support areas may start to keep some informal notes about what to look for in relation to a given application name. However, these notes must be referenced quickly and by the whole team, not just the individual who made them. It’s vital that the CICS support area, the application support team, and anyone else who is expected to support CICS-based business applications, know exactly what to look at when calls come in about a problem with an application. That knowledge often resides in informally held knowledge bases that individuals use.
What’s needed is a complete guide to the resources a given application uses and the ability to observe individual components using a CICS monitor solution. Ideally, an automated means of presenting this information should be available—one that can show the status of all aspects of the infrastructure in one easy-to-use view.
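As an illustration only, the kind of per-application guide described above can be sketched as a small data structure that lists every resource an application depends on, keyed by CICS region, together with a check that reports anything not in its expected state. This is a minimal sketch, not a description of any real monitor or CICS API: the class names, the region and resource names, and the status strings ("OK", "NOTCONNECTED", "UNKNOWN") are all hypothetical, and in practice the observed states would have to come from whatever monitoring interface a site actually has.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """One resource an application depends on (all values are hypothetical)."""
    region: str  # CICS region that owns the resource, e.g. "CICSAOR1"
    kind: str    # resource type label, e.g. "DB2CONN", "FILE", "MQCONN"
    name: str    # resource name within the region


@dataclass
class ApplicationFootprint:
    """The complete set of resources a named application uses."""
    application: str
    resources: list[Resource] = field(default_factory=list)

    def regions(self) -> set[str]:
        # Every CICS region that support staff would need to inspect.
        return {r.region for r in self.resources}


def find_problems(footprint, observed, expected="OK"):
    """Return (resource, state) pairs whose observed state is not `expected`.

    `observed` maps Resource -> state string. A resource with no reported
    state is listed as "UNKNOWN" rather than silently assumed healthy.
    """
    problems = []
    for resource in footprint.resources:
        state = observed.get(resource, "UNKNOWN")
        if state != expected:
            problems.append((resource, state))
    return problems


# Hypothetical footprint spanning a terminal-owning, an application-owning,
# and a file-owning region.
footprint = ApplicationFootprint("PAYMENTS", [
    Resource("CICSTOR1", "CONNECTION", "CTG1"),
    Resource("CICSAOR1", "DB2CONN", "DB2A"),
    Resource("CICSAOR1", "MQCONN", "MQA"),
    Resource("CICSFOR1", "FILE", "CUSTFILE"),
])

# Hypothetical observed states; the MQ connection has not reported at all.
observed = {
    footprint.resources[0]: "OK",
    footprint.resources[1]: "NOTCONNECTED",
    footprint.resources[3]: "OK",
}

for resource, state in find_problems(footprint, observed):
    print(f"{resource.region} {resource.kind} {resource.name}: {state}")
```

Given only the application name, a lookup of this kind yields both the list of regions to inspect and the subset of resources that need attention, which is the single view the support analyst lacks today.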
Tools available to the support staff should collect and display application-specific monitoring data for immediate use where it’s needed. The traditional monitors in use at most sites can only associate a user-chosen name or application name with transaction response time data. This is inadequate, but may at least provide a crude starting point. Typically, though, the systems programmer needs to log on to the specific monitor that provides this information, and visibility may be limited to one specific CICS region even though the resources may be spread across multiple CICS regions.
In summary, the operational processes for addressing CICS-related application issues are suboptimal at many sites. In the heat of an incident, support personnel don’t know and can’t visualize the entire CICS footprint of the named application reported to be having problems. Time is wasted and service-impacting incidents are prolonged.