Oct 7 ’14
SMF: An Important Component of z/OS
z/OS provides more measurement data than any other operating system, which is one of the great strengths of this platform over others. These measurements allow you to tune the system, debug problems, charge back for resources, produce management reports on resource usage and help capacity planners forecast future needs. The majority of this wonderful information is written to a data repository by the System Management Facilities (SMF) component. SMF serves as a common receptacle for data from other components of z/OS and from subsystems such as CICS and DB2.
This article provides an introduction to this most important component. A follow-up article will describe the life of an address space, how exits can be used, when type 30 records are created and how they can be used. Additional articles will include more detail on SMF, its parameters, recommendations on SMF file creation and usage, and details about the most important record types.
SMF is a base component of z/OS and has been available for more than 40 years. It provides a set of macros that applications can use to pass records to SMF for recording to a data repository. The beginning of every record must conform to a standard format, and each record is identified by a one-byte record type with a value from 0 to 255 (hex ‘00’ to ‘FF’). IBM has exclusive use of record types 0 through 127, and subsystems and applications may use 128 through 255. Applications aren’t forced to use SMF for recording and some don’t (e.g., IMS relies on its logs instead of SMF). Several of the record types also have subtypes, such as the type 30 record, which has subtypes 1 through 6 to identify unique types of records.
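To make the standard record format concrete, here is a minimal Python sketch that decodes the first fields of a record and classifies the record type by the ranges above. The field layout (record descriptor word, flag byte, type byte, time in hundredths of a second, packed-decimal date, four-character system ID) is our reading of the standard header and should be checked against the SMF manual. The function names and the sample record are made up for illustration.

```python
import struct
from datetime import date, timedelta


def record_owner(record_type: int) -> str:
    """Classify a record type using the ranges described in the text."""
    if not 0 <= record_type <= 255:
        raise ValueError("record type must fit in one byte (0-255)")
    return "IBM" if record_type <= 127 else "user"


def parse_header(record: bytes) -> dict:
    """Decode the common fields at the start of an SMF record.

    Assumed layout: length(2) segment(2) flag(1) type(1)
    time(4) date(4) system-id(4), all big-endian.
    """
    if len(record) < 18:
        raise ValueError("record too short for the assumed 18-byte header")
    length, _segment, _flag, rtype, time_hs, packed = struct.unpack(
        ">HHBBI4s", record[:14]
    )
    # Assumed date format: packed decimal 0cyydddF (c=0 is 19yy, c=1 is 20yy).
    digits = packed.hex()
    year = 1900 + 100 * int(digits[1]) + int(digits[2:4])
    day_of_year = int(digits[4:7])
    return {
        "length": length,
        "type": rtype,
        "owner": record_owner(rtype),
        "time_seconds": time_hs / 100,  # seconds since midnight
        "date": date(year, 1, 1) + timedelta(days=day_of_year - 1),
        "system_id": record[14:18].decode("cp037").strip(),  # EBCDIC text
    }


# Synthetic type 30 record header: 01:00:00 on day 280 of 2014, system SYSA.
sample = struct.pack(
    ">HHBBI4s", 18, 0, 0, 30, 360000, bytes.fromhex("0114280f")
) + "SYSA".encode("cp037")
header = parse_header(sample)
print(header)
```

Real records carry more header fields (records with subtypes add a subsystem ID and a subtype number), so treat this as a starting point only.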
The basics and use of SMF are documented in the z/OS System Management Facilities (SMF) manual (see the “Resources” section at the end of this article).
The major elements of SMF are:
- Macros. Applications use several macros to pass information to SMF and to interrogate parameters. The most common of these are the SMFWTM and SMFEWTM macros, which are used to pass records to SMF. These are documented in the SMF manual.
- Parameters. Parameters are used to control how SMF is run and which record types are recorded. The parameters reside in member SMFPRMxx of the system parmlib. They’re documented in the z/OS MVS Initialization & Tuning Reference, although some additional material is also found in the SMF manual. A future article will provide more detail about each SMFPRMxx parameter and give our recommendations.
- Exits. User-written exits are available to interrogate, change or delete records as they’re passed to SMF. These are preferably defined in the PROGxx parmlib member and are documented in the SMF manual. These will be covered in our next article.
- MVS Commands. The default is for SMF to automatically be started following an IPL. If you specify NOACTIVE (meaning SMF shouldn’t be automatically started), you can subsequently start it using the SET SMF command. Additionally, the SETSMF and SET SMF commands can be used to dynamically modify the SMF parameters. The DISPLAY SMF command can display the parameters and output data sets.
Uses of SMF Data
These disciplines make heavy use of SMF data:
- Tuning devices, jobs, network, data sets and workload manager (WLM)
- Managing and reporting resources such as CPU, external storage, memory and connections
- Configuration analysis
- Management reporting
- Problem identification
- Hardware and workload analysis
- Accounting and chargeback to internal or external customers (such as outsourcers)
- Performance and resource management
- Security and auditing issues
- Capacity planning, including the collection of resource data for planning purposes
- Data center reporting
- IBM software license charges, including sub-capacity and usage charges.
Major Record Types
Everyone should be familiar with the major SMF records, but if you’re new to SMF, here’s a shortcut:
- Type 70-79 records are created by the IBM Resource Measurement Facility (RMF) or the BMC Software CMF Monitor and are used to provide performance statistics for z/OS. The type 70 contains CPU and LPAR usage, the type 74 provides (among other things) DASD activity and response times, and the type 72 provides resource usage and response time by service class periods. These three record types and the type 30 records are the most frequently used records in most data centers.
- Type 30 records are created by MVS and are written at key points during the processing of any batch job, TSO user or started task. Our next article will provide more detail about each subtype:
  - Subtype 1 is written at the beginning of a job.
  - Subtype 2 is written at the end of all but the last interval (if the INTERVAL keyword of the SMFPRMxx parmlib member is turned on).
  - Subtype 3 is written at the end of the last interval of a step (if the INTERVAL keyword of the SMFPRMxx parmlib member is turned on).
  - Subtype 4 is written at the end of a step.
  - Subtype 5 is written at the end of a job.
  - Subtype 6 is written at the end of an interval for a system task.
- Type 14-15 records are written when non-VSAM data sets are closed and contain information about the usage of each data set. You can do a lot of performance tuning by analyzing buffer sizes and data set activity with these records.
- Type 42 records are written by the DFSMS component for SMS and non-SMS-managed data sets and controllers.
- Type 60-69 records contain information about VSAM data sets and catalogs, and can be used to tune VSAM files.
- Type 100-102 records are written by DB2 and contain accounting, trace data and performance data.
- Type 110-111 records are written by CICS and provide accounting and performance data.
- Type 113 records are written by Hardware Instrumentation Services (HIS) and provide invaluable hardware data that includes information on the type of workloads you’re running.
- Type 115-116 records are written by WebSphere MQ and provide statistics and accounting data.
- Type 120 records are written by the WebSphere Application Server (WAS) to provide performance data.
If these are on the tip of your tongue, you will be considered an expert!
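If the list isn't yet on the tip of your tongue, a small lookup table can help while reading reports. The Python sketch below simply encodes the shortlist above; the names `MAJOR_RECORD_TYPES`, `TYPE30_SUBTYPES` and `describe` are ours, and the table is far from a complete catalog of SMF record types.

```python
# Shortlist of major record types, taken from the list in this article.
MAJOR_RECORD_TYPES = [
    (range(14, 16), "non-VSAM data set close"),
    (range(30, 31), "MVS job, TSO user and started task activity"),
    (range(42, 43), "DFSMS"),
    (range(60, 70), "VSAM data sets and catalogs"),
    (range(70, 80), "RMF or CMF performance statistics"),
    (range(100, 103), "DB2"),
    (range(110, 112), "CICS"),
    (range(113, 114), "Hardware Instrumentation Services (HIS)"),
    (range(115, 117), "WebSphere MQ"),
    (range(120, 121), "WebSphere Application Server"),
]

# Type 30 subtypes as summarized in this article.
TYPE30_SUBTYPES = {
    1: "job start",
    2: "interval (all but the last)",
    3: "last interval of a step",
    4: "step end",
    5: "job end",
    6: "system task interval",
}


def describe(record_type, subtype=None):
    """Return a short description for a record type (and type 30 subtype)."""
    for types, text in MAJOR_RECORD_TYPES:
        if record_type in types:
            if record_type == 30 and subtype in TYPE30_SUBTYPES:
                return f"{text}: {TYPE30_SUBTYPES[subtype]}"
            return text
    return "not in the shortlist"


print(describe(72))
print(describe(30, 5))
print(describe(99))
```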
The SMF address space is started at IPL and runs as a SYSTEM task. Figure 1 shows the flow of SMF data through the system:
1. Applications use macros to pass records to the SMF writer (i.e., SMF address space).
2. The SMF writer uses parameters from the SMFPRMxx member(s) in SYS1.PARMLIB to determine which records to keep, which to discard and which exits to call. It then stores the records in a buffer.
3. Records are written from the buffer to either VSAM data sets formatted especially for SMF or to one or more SMF Logger logstreams (but not both).
4. The installation runs a program to extract the records from the data sets or logstream to create sequential files. The IBM-supplied program is IFASMFDP (for VSAM data sets) or IFASMFDL (for logstream data).
5. The SMF sequential files are used as input to a variety of programs, and the data is often loaded into databases. MXG from Merrill Consultants, CA-MICS, ITRM from SAS and the IBM Tivoli Decision Support (TDS) System are the most common products used for this.
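As a rough illustration of the last step, the Python sketch below walks a dumped SMF file and counts records by type. It assumes the sequential file was copied off the mainframe in binary with each record's 4-byte record descriptor word intact, that no record is spanned across segments, and that the record type sits at offset 5. Those are our assumptions about the file, not something IFASMFDP or IFASMFDL guarantees after a transfer. The helper names and sample data are hypothetical.

```python
import struct
from collections import Counter


def iter_records(data: bytes):
    """Yield each length-prefixed record from a binary SMF dump."""
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError(f"truncated descriptor at offset {offset}")
        length, _segment = struct.unpack_from(">HH", data, offset)
        # The length includes the 4-byte descriptor; 6 bytes reach the type.
        if length < 6 or offset + length > len(data):
            raise ValueError(f"bad record length {length} at offset {offset}")
        yield data[offset:offset + length]
        offset += length


def count_by_type(data: bytes) -> Counter:
    """Count records by the one-byte record type at offset 5."""
    return Counter(record[5] for record in iter_records(data))


def fake_record(record_type: int, payload: bytes = b"") -> bytes:
    """Build a synthetic record: descriptor, flag byte, type, payload."""
    return struct.pack(">HHBB", 6 + len(payload), 0, 0, record_type) + payload


sample = fake_record(30, b"abc") + fake_record(70) + fake_record(30)
print(count_by_type(sample))
```

A count like this is a quick way to see which record types dominate your volume before deciding what to keep.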
SMF VSAM Data Sets
VSAM data sets are the traditional and most common destination for the records that the SMF address space writes to disk. These data sets must be pre-formatted, and an installation defines several of them so that SMF can switch to another data set while one is being dumped. When an SMF VSAM data set fills, SMF calls the IEFU29 exit and switches to another formatted VSAM data set. The exit usually issues a message to the operator and submits a job (most commonly IFASMFDP) to dump the records to a sequential data set and to clear and format the dumped VSAM data set.
There are several problems with SMF VSAM data sets:
- Data sets can fill up and not be cleared for some reason. If this occurs and the last data set fills up, data can be lost. If an installation is using SMF for chargeback, that could represent lost revenue.
- The data sets can be easily overlaid and destroyed, thus possibly losing revenue.
- Large volumes of SMF records can overflow the buffers and/or the data sets, again losing data.
- Because of the high volume of certain records, and the risk of overflowing the buffers, some installations have turned off useful records simply to reduce the volume.
- A runaway or looping application can produce hundreds of thousands of records in a short amount of time, resulting in buffer overflows even in a robust configuration.
- Because SMF uses old VSAM macros, the SMF VSAM data sets don’t support modern enhancements such as compression or striping and the tuning options are very limited.
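The first problem is easy to see in a toy model. The Python sketch below is a deliberately simplified simulation, not a description of SMF internals: it ignores buffering, counts records rather than bytes, and the class name, data set names and capacity are all invented. It only shows why records are lost once every data set is full and none has been dumped.

```python
class SmfDataSets:
    """Toy model: fixed-capacity data sets that must be dumped before reuse."""

    def __init__(self, names, capacity):
        self.capacity = capacity
        self.used = {name: 0 for name in names}
        self.needs_dump = set()
        self.active = names[0]
        self.lost = 0

    def _switch(self):
        """Pick the first data set that is not waiting to be dumped."""
        free = [name for name in self.used if name not in self.needs_dump]
        self.active = free[0] if free else None

    def write(self, count=1):
        """Write records one at a time, switching when a data set fills."""
        for _ in range(count):
            if self.active is None:
                self._switch()
            if self.active is None:
                self.lost += 1  # nowhere to record: the data is gone
                continue
            self.used[self.active] += 1
            if self.used[self.active] == self.capacity:
                self.needs_dump.add(self.active)
                self._switch()

    def dump(self, name):
        """Model the dump job: empty the data set and make it reusable."""
        self.used[name] = 0
        self.needs_dump.discard(name)


demo = SmfDataSets(["MAN1", "MAN2"], capacity=3)
demo.write(6)       # fills both data sets; no dump job has run
demo.write(2)       # these two records have nowhere to go
demo.dump("MAN1")   # the dump job finally clears MAN1
demo.write(1)       # recording resumes
print(demo.lost, demo.active, demo.used)
```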
To resolve some of the problems just described, IBM provided an option in z/OS 1.9 to write SMF records to a logstream instead of a VSAM data set. This is now the recommended method of recording SMF data and resolves many of the previous problems.
With SMF Logger, SMF records are written to a logstream that resides in a Coupling Facility structure or on DASD (a DASDONLY logstream). CF logstreams allow multiple LPARs in a sysplex to record to the same logstream, so you could have a single, sysplex-wide repository of SMF records, for example. Also, to provide better scalability and more flexibility, you can have multiple SMF logstreams; for example, one for RMF, another for DB2, another for security records and so on. This allows a higher write rate and lets you reduce the post-processing time, because multiple offload jobs can run at the same time.
The use of SMF logstreams can avoid these problems because:
- Buffers are in 2GB dataspaces (as many as needed), so that buffers are less likely to overflow.
- Each logstream is managed by its own task, so that the write rate is increased.
- Logstreams are offloaded to DASD data sets as needed (and they don’t need to be pre-allocated), eliminating the delays that can be associated with the process to dump the SMF VSAM data sets.
- The ability to separate record types onto different structures and logstreams can increase the write rate, as well as decrease post-processing time. This also reduces the problem of one runaway application causing other applications to lose data.
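The idea of separating record types onto different logstreams can be sketched as a simple routing table. The Python below is illustrative only: the logstream names are hypothetical, the type ranges come from the record list earlier in this article, and each record type is sent to exactly one logstream, which is a simplification of what SMFPRMxx lets you define.

```python
from collections import defaultdict

# Hypothetical logstream names; the ranges follow the record list above.
DEFAULT_LOGSTREAM = "IFASMF.DEFAULT"
LOGSTREAM_RULES = [
    ("IFASMF.RMF", range(70, 80)),
    ("IFASMF.DB2", range(100, 103)),
    ("IFASMF.CICS", range(110, 112)),
]


def route(record_type: int) -> str:
    """Return the logstream for a record type (first matching rule wins)."""
    for logstream, types in LOGSTREAM_RULES:
        if record_type in types:
            return logstream
    return DEFAULT_LOGSTREAM


def group_by_logstream(record_types):
    """Group a sequence of record types by the logstream they would go to."""
    groups = defaultdict(list)
    for record_type in record_types:
        groups[route(record_type)].append(record_type)
    return dict(groups)


print(group_by_logstream([30, 72, 101, 110, 74, 30]))
```

With a split like this, a flood of type 110 records fills only the CICS logstream, and each logstream can be offloaded and post-processed by its own job.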
We strongly recommend that installations use the SMF Logger, even if it’s recording to a DASDONLY logstream. You will need to spend a little time on migration because some of your procedures will need to be changed, but the benefits far outweigh the effort. For more information, see the SMF Logger Redbook in the “Resources” section.
The SYS1.PARMLIB member called SMFPRMxx defines the types of SMF records that are recorded. It also provides all the options used by SMF during its processing. It’s an important parmlib member, although it’s seldom modified once it’s created. SMFPRMxx is processed at IPL and is selected by the SMF=xx parameter in IEASYSxx, where “xx” is the suffix of SMFPRMxx. If SMF=xx isn’t specified, SMFPRM00 is assumed.
General SMFPRMxx recommendations:
- Use this member to document and keep track of product-assigned SMF record numbers. IBM assigns record types 0 through 127, while applications may use 128 through 255. Documenting which application uses each record type is quite important, especially for non-IBM products because there’s no single manual that lists which application creates which SMF records. It isn’t uncommon to find that people have lost track of which application uses each SMF record (see Figure 2).
- SMF provides several exit points where you can examine, modify or delete SMF records during processing. Keep track of these exit programs somewhere. Some sites used the SMFPRMxx member to document the list of active exits, the location of their source, the location of the load modules and a brief description of why they’re used, but most sites today use PROGxx for that documentation. We recommend using the dynamic exit facility of parmlib member PROGxx, so that exits can be changed without an IPL and documented in PROGxx instead of in SMFPRMxx.
- Code only the parameters that differ from IBM defaults. When IBM determines that a value should change, it can change the default. If you hardcode the IBM default value, the new default won’t be picked up when IBM changes it. It’s usually to your advantage to let the new defaults apply.
- Use this member to document the changes you make to SMFPRMxx. This can often help resolve problems when they occur. Add a date to the change line as well! Figure 2 shows an example.
A future article will describe each parameter and our recommendations.
Resources
The primary SMF manual is z/OS MVS System Management Facilities (SMF), which provides the record layouts, exits, macros and several recommendations on using the SMF files:
- z/OS 1.13 MVS System Management Facilities (SMF) (SA22-7630-25)
- z/OS 2.1 MVS System Management Facilities (SMF) (SA38-0667-02).
The z/OS MVS Initialization and Tuning Reference manual describes all the SMFPRMxx Parmlib parameters:
- z/OS 1.13 MVS Initialization and Tuning Reference (SA22-7592-24)
- z/OS 2.1 MVS Initialization and Tuning Reference (SA23-1380-02).
A wonderful Redbook on the SMF Logger is SMF Logstream Mode: Optimizing the New Paradigm (SG24-7919). Also, you can find a summary of SMF record types in our SMF Reference Summary at www.watsonwalker.com/references.html.