Years ago, when a million CICS transactions a day was a lot, the data went on tape because it was so “huge.” Today, 500 million DB2 transactions may be considered small and, in many installations, the jobs that post-process System Management Facilities (SMF) data are now among the largest applications. With the rising cost of software to support the ever-burgeoning volume of data, it’s important to decide what to keep and process daily. The tests and examples here were all run with MXG, but the same techniques could be applied to other software performing the same functions.
Consider the daily volume of SMF data for a relatively small shop where not all possible SMF records are written. Figure 1 shows 17GB of data with nearly 90 percent of the volume coming from DB2 and CICS. Post-processing of the SMF data is done with MXG and is broken into the following three jobs (not including some weekly jobs; there’s no monthly processing):
• The BASE PDB, excluding CICS and DB2
• The CICS/DB2 PDB
• A job that puts some data from the first two together for reporting.
Figure 2 shows the required processing time of this data in minutes. It doesn’t seem like a lot, but the DB2/CICS job is always one of the top-10 resource-consuming jobs. On a small 2098-T04, 45 minutes of CPU time is a huge chunk of capacity to have tied up for more than an hour every morning, when the system is usually busy.
It’s time to decide if all this is a necessary, useful consumption of resources. If not, how do we fix it? To start the analysis, let’s break data into three categories:
• Tactical data is needed to solve problems. It’s most likely only needed for a matter of days or weeks to resolve any outstanding issues.
• Strategic data is needed for long-range planning. It’s typically highly summarized with retention of several years, but with the rapid evolution of technology, data older than perhaps five years is only an interesting historical artifact; it’s not usually useful for future planning.
• Accounting and security data are used for chargeback and tracking security violations. There can be legal requirements for long-term detail storage. Let the auditors decide.
Clearly, some data crosses boundaries. Job-level data could easily fall into all three categories. In those cases, the category with the longest retention wins.
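The “longest retention wins” rule can be sketched in a few lines of Python. This is only an illustration: the retention periods, the data set names, and the `retention_for` helper are all assumptions made for the example, not values from the shop described here, and the accounting figure in particular is a placeholder for whatever the auditors require.

```python
# Hypothetical retention periods, in days, for the three categories.
# These numbers are placeholders for illustration only.
RETENTION_DAYS = {
    "tactical": 30,           # assumed: a few weeks to resolve open problems
    "strategic": 5 * 365,     # assumed: about five years of summarized history
    "accounting": 7 * 365,    # assumed: stand-in for an auditor-defined period
}


def retention_for(categories):
    """Return the retention, in days, for data that falls into the given
    categories. When data crosses boundaries, the longest retention wins."""
    if not categories:
        raise ValueError("data must belong to at least one category")
    return max(RETENTION_DAYS[c] for c in categories)


# Hypothetical classification of a few kinds of data.
DATA_CATEGORIES = {
    "job_level": {"tactical", "strategic", "accounting"},
    "transaction_detail": {"tactical"},
    "daily_summary": {"tactical", "strategic"},
}

if __name__ == "__main__":
    for name, cats in sorted(DATA_CATEGORIES.items()):
        print(f"{name}: keep {retention_for(cats)} days")
```

Keeping the policy in one small table like this makes it easy to review with the auditors and to change a single number without touching the logic.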
Figure 2 shows that DB2 and CICS processing consumed 90 percent of the total CPU time, in line with their share of nearly 90 percent of the data volume in Figure 1. A series of MXG benchmark tests was run to determine the major CPU consumer: