z/Data Perspectives: Dynamic SQL vs. Static SQL

Many DB2 professionals aren’t comfortable determining when to choose dynamic SQL instead of static. Here we’ll examine criteria that can be used to help make that decision.

Performance sensitivity: For frequent, high-performance SQL statements, weigh the overhead required for the PREPARE against the potential for better access paths when choosing between dynamic and static. When performance is crucial, favor static SQL. Using dynamic SQL incurs a higher initial cost per SQL statement due to the PREPARE. Once prepared, the difference in execution time for dynamic vs. static diminishes. If multiple users are running the same dynamic application with the same statements, and dynamic statement caching is implemented, only the first application to issue the statement realizes the cost of preparing.

In some cases, the access paths chosen by the optimizer for dynamic SQL can be better, but authorization checking is costly and prepare statement lookup can often mean up to 30 percent more CPU time for dynamic SQL the first time the statement is executed.

Language: For COBOL, it still makes sense to favor static over dynamic as a default. Static is generally easier to code, monitor, and tune. This doesn’t mean dynamic SQL has no place within a COBOL shop, just that static is the standard and you should deviate only when absolutely necessary. For Java, C and Web developers, dynamic SQL is easier to use because of the Integrated Development Environments (IDEs) and Command Line Interfaces (CLIs) that simplify coding. Of course, the same performance issues that impact COBOL programs exist for Java, too.

Data uniformity: When data is non-uniformly distributed (e.g., beer drinkers skew male), it makes sense to favor dynamic SQL over static. Dynamic SQL can result in more efficient access paths than static for non-uniform distributions. This occurs because DB2 can potentially derive different access paths for different host variable values. Columns with a limited number of values can result in significantly different access paths for the same SQL statement when the DB2 optimizer has access to the values.

Data can also be correlated, which is a similar, yet separate, concept. For example, CITY, STATE, and ZIP_CODE data will be correlated. The ZIP_CODE determines the CITY and STATE. There will be many more occurrences of HOUSTON, TX, than there would be of HOUSTON, CA, (if, indeed, that’s even valid data).

Range predicates: These refer to greater-than (>), less-than (<), BETWEEN, and LIKE. The more frequently you need to use range predicates the more you should consider dynamic SQL. The optimizer can take advantage of distribution statistics and histogram statistics with dynamic SQL to formulate better access paths because the actual range will be known.

Repetitious execution: If the statement runs once a day, for example, it shouldn’t really matter whether dynamic or static SQL is used. Because the statement doesn’t run frequently, the impact of dynamic preparation is likely moot. As the frequency of execution increases, you should favor static SQL or dynamic SQL with KEEPDYNAMIC YES. The cost of the PREPARE becomes a smaller percentage of the overall run-time of the statement the more frequently it runs (if the cached prepare is reused).

Nature of the query: If you need all or part of the SQL statement to be generated during application execution, favor dynamic over static SQL. A common scenario involves a transaction where the user is presented with a screen and can specify various criteria for querying. The SQL statement is based on what the user enters and there can be many variations. Coding such an application with static SQL is difficult because of the numerous variations that need to be supported. But it’s a snap with dynamic SQL because the SQL can be built on the fly based on what’s entered.

Run-time environment: Favor dynamic SQL when you need to build an application where the database objects may not exist at pre-compile time. Binding a static SQL program that attempts to access a non-existent table won’t succeed unless you also specify VALIDATE(RUN); but then every time you run the program, DB2 must validate that the required database objects exist, thereby degrading performance.

RUNSTATS frequency: Consider dynamic SQL when your application needs to access data that changes frequently and dramatically. Dynamic SQL increases the probability of using the most optimal access path, based on current statistics. With static SQL, you would need to use REOPT or make sure you REBIND the program as frequently as you run RUNSTATS. Even then, if the data is skewed, you may not be getting the best access path for every SQL statement in your program.

Of course, the issues discussed here are conditional and you will need to apply knowledge of your business, application, and specific performance requirements to make informed decisions on dynamic vs. static SQL. But you can use these scenarios to drive your decision-making process.