Operating Systems

Purge Measurements

When all SYSOUT has been printed, routed or deleted, a type 26 record is written. The IEFUJP exit is invoked, and the IEFU83/84/85 exit is called before the record is written. We haven’t seen much use of the IEFUJP exit, so if you have a good reason for one, please let us know. The JES2 JOBCLASS parameter can indicate that IEFUJP is not to be called for certain job classes, all started tasks and/or all TSO users.

Type 26 is a fantastic record. If it contained total CPU time and account code information, it would be almost perfect! The primary information in the type 26 is the set of significant timestamps in the life of the job: reader start and stop, converter start and stop, execution start and stop, and output start and stop. Most service level management information can be obtained from this record alone. Additionally, for systems with NJE, all system IDs are provided here. You can tell that a job was read in on one system, converted on a second system, executed on a third system and printed on a fourth. When this information is used with the timestamps, you can also see the impact on multiple systems by time of day, job class and user.
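As a rough illustration of how those timestamps turn into service level numbers, the Python sketch below computes the time spent in each phase and the wait between phases. The dictionary keys (reader_start, reader_stop and so on) are hypothetical names chosen for this example; a real type 26 record holds binary time and date fields that must be decoded first.

```python
from datetime import datetime

# Phase names follow the timestamps described in the text; the key names
# built from them are hypothetical, not actual SMF field names.
PHASES = ("reader", "converter", "execution", "output")

def phase_report(job):
    """Return seconds spent in each phase and waiting between phases."""
    report = {}
    previous_stop = None
    for phase in PHASES:
        start = job[f"{phase}_start"]
        stop = job[f"{phase}_stop"]
        report[f"{phase}_seconds"] = (stop - start).total_seconds()
        if previous_stop is not None:
            # Gap between the end of the prior phase and the start of this one.
            wait = (start - previous_stop).total_seconds()
            report[f"wait_before_{phase}_seconds"] = wait
        previous_stop = stop
    return report

# Made-up sample job for illustration only.
sample_job = {
    "reader_start": datetime(2024, 1, 8, 9, 0, 0),
    "reader_stop": datetime(2024, 1, 8, 9, 0, 2),
    "converter_start": datetime(2024, 1, 8, 9, 0, 5),
    "converter_stop": datetime(2024, 1, 8, 9, 0, 6),
    "execution_start": datetime(2024, 1, 8, 9, 10, 0),
    "execution_stop": datetime(2024, 1, 8, 9, 25, 0),
    "output_start": datetime(2024, 1, 8, 9, 30, 0),
    "output_stop": datetime(2024, 1, 8, 9, 31, 0),
}

report = phase_report(sample_job)
```

In this made-up sample, the wait before execution is the input queue time, which is often the number a service level report cares about most.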

Another nice feature of the type 26 records is the inclusion of total SYSOUT lines written to spool, along with estimated and actual SYSOUT byte counts. You can use this to let programmers know how close they are to their limits (before abending). The JES2 JOBCLASS parameter can indicate that type 26 records are not to be created for certain job classes, all started tasks and/or all TSO users.

Here’s the bad news about the type 26 record: it doesn’t get written until every SYSOUT has been printed or purged. This might take many days if the output is held. If you want to combine all of the type 30 records with the type 26 record for a single job, you might need to hold the type 30 records for a week or more. The amount of time you allow held SYSOUT to remain on spool determines how soon the type 26 record is written after the job terminates.
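One way to picture the holding problem is a small Python sketch that keeps type 30 records in a pending table until the matching type 26 record shows up. The record layout (dictionaries with type and job_key entries) is an assumption made for this example; in practice the key would be built from whatever uniquely identifies a job at your installation.

```python
def combine_with_purge(records):
    """Pair each type 26 record with the type 30 records seen for that job.

    Returns (combined, pending). Jobs still in `pending` have not been
    purged yet, so their type 30 records must be carried forward.
    """
    pending = {}    # job_key -> type 30 records waiting for a type 26
    combined = []
    for rec in records:
        if rec["type"] == 30:
            pending.setdefault(rec["job_key"], []).append(rec)
        elif rec["type"] == 26:
            steps = pending.pop(rec["job_key"], [])
            combined.append({"purge": rec, "type30": steps})
    return combined, pending

# Made-up input: JOB1 has been purged, JOB2 still has held SYSOUT.
sample = [
    {"type": 30, "job_key": "JOB1"},
    {"type": 30, "job_key": "JOB2"},
    {"type": 26, "job_key": "JOB1"},
]
combined, pending = combine_with_purge(sample)
```

Whatever is left in the pending table at the end of a run is what has to be saved and fed into the next run, possibly for a week or more.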

TSO Command Recording

The intention of the type 32 record is to provide counts and, as an option, resource information on TSO commands. Even though it doesn’t provide response time, the capability to analyze TSO usage by command is very useful. (TSO response time is best collected in the RMF type 72 record, which is by service class period.)

A type 32 record is written at TSO logoff or interval end and contains information about each command. One drawback is that called programs all fall under the “CALL” command and CLISTs all fall under the “EXEC” command, so you can’t get information about specific programs or CLISTs. Another drawback is that aliases aren’t combined (that is, “SE” and “SEND” are reported separately). If you want specific information on CLISTs, you could write a command processor that invokes the CLIST and then track activity on the command processor.
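Since the record doesn’t combine aliases, a reporting program has to do it. Here is a minimal Python sketch, assuming the command counts have already been extracted into a dictionary; the alias table is illustrative only, and each installation would need to build its own.

```python
from collections import Counter

# Illustrative alias table (alias -> full command name). Only the SE/SEND
# pair comes from the text; the others are examples.
ALIASES = {"SE": "SEND", "E": "EDIT", "EX": "EXEC"}

def combine_aliases(command_counts):
    """Fold alias counts into the counts for the full command name."""
    totals = Counter()
    for name, count in command_counts.items():
        key = name.upper()
        totals[ALIASES.get(key, key)] += count
    return totals

merged = combine_aliases({"SE": 3, "SEND": 5, "EXEC": 2})
```

With the sample input, the three uses of “SE” and the five uses of “SEND” are reported together as eight uses of SEND.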

The type 32 record can be a useful tool for special studies to help you determine which TSO commands should be placed in PLPA and which should be relegated to the LINKLIB libraries. If DETAIL is specified in SMFPRMxx, an additional 40 bytes of data per command are added to the record to provide TCB and SRB milliseconds, TGETs, TPUTs, transactions, EXCP count and device connect time. This option is most often used by performance analysts (turn it on for only one week and study the results).

Interval Recording

Interval recording for type 30 and 32 records allows you to specify that records are to be written at specific intervals, such as every 30 minutes. A type 30.2 record is written at the end of every interval except the last, for which a type 30.3 record is written instead. These records contain the same information as the step termination record (type 30.4), but hold totals for the interval only. The primary purpose of interval recording is to provide statistics in case the system crashes during a step. For a batch job, this may not be too important, but for TSO it’s critical, since a TSO step is the entire logon session. If you do not use interval recording for TSO and you have an “unscheduled IPL” in the middle of the day, you will have lost all SMF measurements for logged-on TSO users for the period up to the crash. If you do charge back to TSO users, you will have lost revenue.

(Note to capacity planners: you’ll still get your performance group service units from RMF measurements, but no individual totals by TSO user or job.) In a chargeback environment or one where capacity planners are using SMF data, this could amount to a loss of a day’s worth of information or billing data. Interval recording is also used where you want to collect information by job by time of day. For example, my CICS address space job totals may show that I had 10M EXCPs to device 6174, but I may need to know whether those were spread throughout the day or concentrated in a single period of activity. Interval recording provides that information.

You can specify interval recording by subsystem (batch, TSO, STC and ASCH) in SMFPRMxx in SYS1.PARMLIB. Different interval times can be specified for each subsystem. An unfortunate side effect of intervals is the complexity of processing them. In general, the records need to be sorted prior to processing. For example, we may only want to use interval records if we lost the step termination record. Since the 30.4 follows the 30.2 and 30.3 records, we don’t know whether we need the interval records until we’ve already read past them. And we can’t simply turn off step termination (30.4) records, because if a step executes in less than one interval, only a 30.4 is created and no interval records are written. All in all, it’s a messy job to process them! Several installations have written exits to bypass interval records for everything but important workloads (onlines, production batch and TSO). Where do you do that? In the IEFU83/IEFU84/IEFU85 exits, of course!
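The “use interval records only when the step termination record is missing” logic can be sketched in Python as follows. This is only an outline under stated assumptions: the records are already decoded into dictionaries, the job_key, step, subtype and excps names are hypothetical, and grouping in memory stands in for the sort a production job would normally do.

```python
from collections import defaultdict

def step_excp_totals(records):
    """Return EXCP totals per (job_key, step).

    Prefer the step termination record (subtype 4). If it is missing,
    for example after a crash, sum the interval records (subtypes 2 and 3).
    """
    groups = defaultdict(list)
    for rec in records:
        if rec["subtype"] in (2, 3, 4):
            groups[(rec["job_key"], rec["step"])].append(rec)

    totals = {}
    for key, recs in groups.items():
        termination = [r for r in recs if r["subtype"] == 4]
        if termination:
            # The 30.4 already holds the step total; ignore the intervals.
            totals[key] = termination[0]["excps"]
        else:
            # No 30.4 was written, so rebuild the total from the intervals.
            totals[key] = sum(r["excps"] for r in recs)
    return totals

# Made-up input: step S1 of JOBA ended normally, TSOU1 was lost in a crash,
# and JOBB ran for less than one interval.
sample = [
    {"job_key": "JOBA", "step": "S1", "subtype": 2, "excps": 100},
    {"job_key": "JOBA", "step": "S1", "subtype": 3, "excps": 50},
    {"job_key": "JOBA", "step": "S1", "subtype": 4, "excps": 150},
    {"job_key": "TSOU1", "step": "LOGON", "subtype": 2, "excps": 40},
    {"job_key": "TSOU1", "step": "LOGON", "subtype": 2, "excps": 60},
    {"job_key": "JOBB", "step": "S1", "subtype": 4, "excps": 7},
]
totals = step_excp_totals(sample)
```

The sketch makes the ordering problem visible: the decision for a step can only be made once all of its records have been collected, which is why the records must be grouped or sorted before processing.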

Type 30, subtype 6 interval records are written for system address spaces (such as MASTER, TCPIP, XCFAS, TRACE and ZFS) if interval recording is specified for the STC subsystem. This is the only way to get information about these system address spaces. These interval records differ from subtypes 2 and 3 because they contain cumulative information, not simply totals for the interval. If you do not collect and report on this data, you could underestimate your capacity load by up to 20 percent.
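Because subtype 6 values are cumulative, per-interval figures have to be derived by subtracting each record from the next. A minimal Python sketch, assuming the cumulative values for one address space have been extracted in time order and that the counter does not reset between records:

```python
def interval_deltas(cumulative_values):
    """Convert cumulative values into per-interval amounts.

    The first value is treated as the amount used since the address
    space started (an assumption made for this sketch).
    """
    deltas = []
    previous = 0
    for value in cumulative_values:
        deltas.append(value - previous)
        previous = value
    return deltas

# Made-up cumulative CPU seconds from three consecutive subtype 6 records.
per_interval = interval_deltas([120, 300, 450])
```

Adding subtype 6 values together the way you would add subtype 2 and 3 values would count the same usage many times over, so the subtraction step is not optional.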

Conclusion

We hope this description of the life of an address space has helped you understand when SMF type 30, 6, 26 and 32 records are written, and when user exits are called. Two of the exits in Table 1, IEFU29 and IEFU29L, will be covered in a later article when we discuss dumping SMF records to an external file.
