Mar 23 ’10
Snapping the DB2 for z/OS Dynamic Statement Cache
DB2 for z/OS does a good job of caching dynamic SQL statements to avoid recompiling (aka binding) the statements on the fly. This can save significant system resources. However, we still have the problem of figuring out just what dynamic statements are coming in to DB2 and how they’re performing. DB2 for z/OS Version 8 introduced the ability to capture and report on the statements in the dynamic statement cache. Here, we’ll focus on use and management of this facility, and how to report, understand, and act on the available performance information.
Introduction to the Dynamic Statement Cache
The dynamic statement cache is an area of memory reserved inside DB2 for storing the text of dynamic SQL statements and the compiled run-time structures associated with the statement (also referred to as a prepared statement). It’s part of the set of EDM storage and is enabled by the CACHEDYN installation parameter, which by default is set to YES. The EDMSTMTC installation parameter controls the size of this cache, which can range from 5MB to 2GB. There’s a formula in the installation guide you can use to properly size the cache, but typically, a systems programmer sizes the pool based on the amount of available memory.
The primary purpose of this cache is to avoid the overhead of re-preparing a statement for every execution. The statement prepare process, where DB2 determines the access path, can be expensive, especially for complicated statements, and can result in high CPU costs. The concept is simple: If the user and statement don’t change, and no table structures or statistics change in the database, then there’s no reason to re-prepare the statement; the previous executable version can be reused.
A potential performance issue is that dynamic statements must be bound to the database during a prepare similar to the way static statements are bound. During the bind process, statements are validated, object definitions and statistics are accessed in the DB2 system catalog, the access paths are determined with the optimal path chosen, and the run-time structures are created and stored.
This process can be quite expensive, especially for more complicated statements. In some transaction environments, the prepare represents about 90 percent of the transaction CPU cost. The dynamic statement cache is used to retain the output from the bind process across executions so it can be reused, avoiding the bind overhead. If an incoming statement matches the authorization id and SQL text (byte for byte including spaces), and the object structures and catalog statistics have remained unchanged, then the run-time structures are reused and the prepare avoids the statement bind process. This can dramatically improve performance of some dynamic statements.
Leveraging the Dynamic Statement Cache
Only certain statements can take advantage of the dynamic statement cache, and these are statements normally associated with transactions—statements that execute often and repetitively. For a statement to match in the statement cache, it must match an existing statement, byte for byte, for the given authorization id. If any part of the statement doesn’t match, then the statement must go through the expensive bind process.
It’s important to code SQL statements so they have a better chance of matching in the cache. The main way is to use parameter markers for variables in the statement. If you use embedded literals, and those literal values change across statement executions, then there’ll be little or no matching. Also, you should use constants for authorization ids. A statement can only match in the cache for the same authorization id. So, you should use a three-tier architecture with connection pooling and fixed authorization ids to increase the matching in the statement cache.
Remember that the cost of matching isn’t free, but it’s not high, especially relative to the cost of the prepare/bind process. If most of the statements executed aren’t properly matching, then it’s better to turn off the statement cache. Statement cache matching overhead can represent about 10 percent of the CPU consumed per statement.
Tuning Dynamic Statements
Traditionally, application programs were written using static embedded SQL. Many modern programming languages and application development methodologies today use dynamic SQL. With dynamic SQL, it’s an ongoing challenge to predict performance and find tuning opportunities. This was always relatively easy to do with static SQL because the access path was determined during statement compile time (the bind process) and before the statement even executed. With dynamic SQL, a DBA or other tuning expert often doesn’t see the statement until it’s already executing. Unless you run specific performance traces, you usually can’t see the statements coming from a remote dynamic application.
Traditionally, tuning dynamic SQL has been difficult. You worked directly with programmers to obtain the SQL text, captured the SQL via an online monitor or trace, or captured the SQL via a local JDBC trace at the application server. The cost of dynamic SQL statements isn’t generally known. With static statements, you typically know how often the program that contains the statement is executed by looking at the number of package executions in a DB2 accounting report. From looking at the program logic, you can determine how often certain static SQL statements execute. To properly tune an SQL statement, you must know how expensive a statement is relative to what the statement does and how often it’s used. This is difficult to do with remote dynamic SQL, especially if all the SQL is coming in under the same authid, because there are no package execution counts to view.
Collecting Performance Information
DB2 for z/OS can store performance metrics related to statements stored in the dynamic statement cache. This collection of performance information isn’t automatic, but can be enabled by turning on IFCIDs 316 and 318 inside an existing trace, or by starting them in a new trace. Most shops have accounting traces, particularly classes 1,2,3,7 and 8, continuously running. You can easily attach these IFCIDs to those existing traces. There may be some overhead to collecting this additional trace information, but typically it’s not of measurable impact. The following command example starts a new monitor trace and activates the IFCIDs that gather the performance metrics for the statements in the cache:
-START TRACE(MON) IFCID(316,318)
To properly tune for optimal performance, you should start with the overall big picture of the performance of applications in a subsystem or data-sharing group. We typically use DB2 accounting reports summarized by authid. To support this, give each distinct software application or program suite a separate authorization id for their production SQL. By collecting this information identified and separated by application, you can tell which applications use the most system resources, and how many SQL statements are issued. So, you know who uses the most CPU, and which applications’ SQL statements are most efficient or inefficient. Determining which application may have inefficient SQL can be as easy as dividing the total CPU consumed by the number of statements executed.
Starting with V8, DB2 has expanded the functionality of the EXPLAIN facility. A key feature of this expansion is the ability to explain the access paths of the statements in the dynamic statement cache and expose the performance metrics stored in the dynamic statement cache. A new statement, EXPLAIN STMTCACHE, allows access to the performance information in the dynamic statement cache and also allows statements in the cache to be explained. Unfortunately, by default, you can explain statements only in the cache that matches your own authid. People with SYSADM authority can get the performance information for all the statements in the cache.
The format of the statement is quite simple. The following statement will collect all the performance information for the statements that executed under your authorization id, or for an entire subsystem if you’re SYSADM, and populates the DSN_STATEMENT_CACHE_TABLE under your schema:
EXPLAIN STMTCACHE ALL;
The next statement will perform an explain of the access path for the statement in the cache with a statement id of 11087, and populates the normal set of explain tables under the schema of the person who executes it:
EXPLAIN STMTCACHE STMTID 11087;
The DSN_STMTCACHE_TABLE is similar to the other explain tables, and must be created under your own schema. The output from the EXPLAIN STMTCACHE ALL statement is placed in this table. You can then run queries against the table to examine the performance metrics captured for the dynamic statements.
Figure 1 describes just some of the columns in the statement cache table. As you can see, there’s a significant amount of performance information gathered. Most important are such things as the number of executions, accumulated elapsed time, and accumulated CPU time. These numbers continue to be accumulated in the cache until the statement is flushed from the cache, and can be exposed every time the statement cache is explained so you can derive delta values.
Let’s take a look at a large-scale customer (online merchandise retailer) that had an established base of legacy CICS/COBOL programs, but was moving large volumes of applications into a new enterprise architecture. This new architecture involved object-based design with modern programming languages and remote application servers. What resulted was a large volume of remote dynamic SQL that no one could track and tune. Each application was assigned its own authid, so DB2 accounting information could be summarized at this level and reported by application. This enabled us to first address the application that used the most CPU.
The customer wanted a reduction in CPU consumption as their peak season approached without becoming CPU-constrained. They also didn’t want to upgrade processors. Moreover, there can be hundreds of thousands of daily product sales, so their order processing applications can be quite active. All their modern enterprise applications, including the order processing, are written in Java to run under the control of WebSphere application servers.
We collected DB2 accounting data (classes 1,2,3) from the System Management Facility (SMF) records, ran accounting reports using a reporting software package, then loaded important fields from the reports into a DB2 table. Queries were used to organize the data by daily CPU consumption. We tracked the applications that used the most CPU over several months. We examined those applications first. Figure 2 shows that the application with the DMDMGMTE authid is a daily large CPU consumer.
Once we knew which application we wanted to study, we had the DBA create a statement cache table and issue an EXPLAIN STMTCACHE ALL statement. That same DBA granted us SELECT authority to the table. Our goal was to query the table for the authid of the leading CPU consumer (DMDMGMTE) and sort the result by CPU descending. That way, we saw that the overall most-expensive statement for this application happened to be in the statement cache at the time the EXPLAIN statement ran. Figure 3 is one example of a common query we executed against the statement cache table. We filter by the authid and sort by CPU descending. Since the elapsed time and CPU time columns have a floating point data type, we convert to a decimal data type to make the output easier to read. We also apply a built-in function to the statement text so very long statements don’t cause trouble with our query tool’s limitations.
Figure 4 contains the output from the SQL statement in Figure 3. Our application that uses the most CPU has one statement that’s responsible for a large percentage of the total CPU the application consumes. Additional investigation showed the elapsed time for this query was about the same as the CPU time, meaning the query ran quickly and, without this information, the poor performance may have gone unnoticed.
Now that we’ve discovered a single statement using a significant amount of CPU, we can isolate that statement, EXPLAIN it, and see if it can be tuned, cached, or avoided! The following statement can be used to run a normal explain against that high CPU-consuming statement so we can expose the access path:
EXPLAIN STMTCACHE STMTID 1083216;
In this case, the statement was easily tuned, and significant CPU savings realized.
You’ll want to maintain the amount of data in your statement cache table as it can quickly get quite large, depending on how many dynamic statements are cached. You’ll need to plan to delete or archive the information because a large statement cache table can impact the performance of the EXPLAIN STMTCACHE ALL statement execution. Also, the statement text column (STMT_TEXT) is a Binary Large Object (BLOB), which could make it time-consuming or impossible to execute certain functions. Consider writing the statement cache table to a flat file, then performing extensive searches of that flat file to find interesting cached statements.
While we found the relationship between the authid and an application critical to determining who owned the poor performing statements, we had trouble relating the text of a statement to the method in the application where the statement was embedded. If a comment is added to the statement text, then that comment could be the perfect tool for identifying the piece of code that contains the statement. The following example shows how simple this is to do, and that embedded comment will show up in the statement text column of the statement cache table:
SELECT LASTNAME –GetEmpLastName method
WHERE EMPNO = ?
Getting the performance metrics out of the dynamic statement cache can be extremely valuable for performance tuning your dynamic remote applications. Remember to make this work part of a comprehensive performance reporting and tuning effort. Also, ensure you report the impact of tuning so management understands and supports the work you do!