Customers in all industries often experience challenges while searching for their customers’ names in a database. Sometimes, there may be a misspelling in the name. Sometimes, the customer may not remember what name they’re registered under, and sometimes, we simply need to search a range of names. When dealing with large amounts of data, SQL accuracy and search execution performance can be challenging. However, certain DB2 techniques can help improve performance for name searches of various sorts. This article discusses a few interesting scenarios, including one involving name searching against DB2 data using a legacy key, interesting data variations, and some more generic name search examples and tips.
Generic vs. Quick Hit Techniques
Often, when providing search support, developers take a one-stop shopping approach to search screens and subsequent search queries. Managers and users ask for the ability to search on a variety of fields and any combination of them. Developers look at the myriad of combinations and quiver at the prospect of the complex code to support these searches. The result typically is a big, fat, generic search query that spans many lines of code and has no efficient access path. Sometimes, these queries even contain switches that will activate or inactivate compound predicates, such as “AND :HV1 = 1.” This type of switching doesn’t belong in SQL statements and instead should be supported by if-then-else logic in the application program with branching to multiple queries.
With this in mind, consider your search screens with relevance to the data, the business needs, and common practices of users and customers. What are the most common search field combinations? Are there any search fields you can force to be required? At a minimum, there should be fields available that can be used to provide the best search performance. These should serve as “quick hit” fields and be the basis for your primary query. Yes, that means you’re going to code multiple search queries. This is usually necessary if you want to achieve good performance, especially when searching large databases.
Assuming customers are calling in (or logging on) without knowing their customer number, Figure 1 presents a generic search query that supports a variety of optional search columns. Each compound predicate, separated by an “OR,” supports a different combination of search criteria. In this example, it sup-ports a search on name, address and partial name, or phone number and partial name. This is a powerful search, but may not deliver optimal performance. At best, the optimizer may be able to get multi-index access with carefully designed indexes. When fast response and minimal CPU consumption are important, this may be the easy programming choice, but not the best performance choice.
An easy, economical solution is to build two queries with a little extra application code to support them. While more understanding of the data and common search input is required, the return in response time and CPU savings is worth the effort. For example, if most customers who call in can provide their last name and phone number, then the query in Figure 2 could be executed before the generic query in Figure 1. Coordinated training of customer service representatives to ask for these common fields could help achieve extremely fast search times.
The Need for Boolean Term Predicates
Generic searches yield generic performance. Often, application developers are coding predicates in queries that support scrolling through a search result set as well as a direct hit search result. Again, multiple cursors are better from a performance perspective than a do-all, multi-purpose cursor.
When it comes to scrolling cursors, it’s important to use Boolean term predicates in your queries. A Boolean term predicate is a simple or compound predicate that, when evaluated to false, makes the evaluation of the entire row false. Figures 3 and 4 show queries that contain Boolean term and non-Boolean term predicates. If you look at Figure 3 and ask yourself if any one of the simple predicates alone can eliminate a row, the answer is no. So, there are no Boolean term predicates in that query. In Figure 4, the simple predicate “AND EMPNO >= ?” can alone eliminate rows from the answer set. That predicate is a Boolean term.