Figure 2 shows a typical generic search where a user sees a screen with multiple search criteria they can enter. Often, this criterion is in the form of upper and lower values for a range search.

If there was an index on the three columns referenced in the WHERE clause of this query, then in the case of range predicates, DB2 can only match on the first indexed column. At best, it can do index screening on the other columns. The lower number of matching columns can result in more of the index being accessed, especially for large tables with many rows of data.

Often, the domain of the columns being searched is finite and possibly quite small. This creates the opportunity to use code tables that contain the entire domain of search values and then perform the more expensive range searches on these small tables. This accommodates an index match on more columns of the target table. This technique can be effective and efficient, especially for large tables and several range predicates.

The first query in Figure 3 leverages a non-correlated subquery and some code tables to generate a set of values to power the search. The code tables are accessed first in this query and then three-column matching index access is attained for the table being searched. So, we apply the range predicates to small code tables rather than the big table. The less efficient index (or table) access is against a small table instead of a big table. There’s a Cartesian join between the code tables to provide all value combinations to the outer portion of the query.

In the second query in Figure 3, the same query range predicates are applied to our code tables in separate, non-correlated subqueries. This avoids the potential for a large work file due to the join in the previous example, but introduces an additional query block. The access path again has three-column matching index access for our search in the outer portion of the statement, provided DB2 hasn’t rewritten the query.

In the third query in Figure 3, a common table expression is used with a Cartesian join using our range predicates to build a set of values for the search. The table produced is then joined to the table being searched. This can avoid the large work file creation and a sort of the search values. Again, the code tables are accessed first in the access path and we get three-column matching index access.


Early Out

Multi-table generic searches can be expensive because the optimizer has to pick only one path it thinks is the best access path. So, if the input is variable and optional, then the query may perform poorly if the values entered for predicates aren’t against the first table accessed in the join sequence. Coding generic searches is easy and they provide flexible search screens. However, the performance of these queries is generic at best. Figure 4 shows a generic search on an employee number, department name, or project number.

There are many solutions to the generic search problem. Some include building the query on the fly and providing dynamic literal values to the input. Such a solution could take advantage of DB2 distribution statistics. DB2 run-time optimization is also a possibility and will give DB2 the opportunity to choose the access path at run-time, but the incremental bind during execution can add more CPU time than it saves.

5 Pages