Oct 21 ’08
The Magic of DB2’s Prefetch
Sequential prefetch has been a part of DB2 since just about its beginning, going back to Version 1.2. When you think about what it can do for an SQL statement, it’s hard to imagine getting along without it. A few releases after V1.2, we gained a variation on sequential prefetch called list prefetch. Eventually, something called dynamic prefetch was added. Prefetch, or some form of it, always seems to be changing. If it isn’t increasing the number of prefetch engines used by sequential, list and dynamic prefetch (currently set to 600), it’s removing the VSAM Control Interval (CI) restriction from list prefetch. Regardless, prefetch can often be useful; it can improve your performance because it lets CPU and I/O processing overlap. This article discusses the latest enhancements to prefetch and the different types of prefetch function.
Usually, DB2 can process sequentially accessed pages in the buffer pool faster than the I/O subsystem can satisfy DB2’s request for another page. When DB2 is aware that this sequential access could happen, it can choose sequential prefetch as the access path. DB2’s sequential prefetch is an asynchronous process that reads some number of pages into the buffer pools before DB2 actually needs to use them. The number of pages read is usually based on the buffer pool size.
Each prefetch read occurs when a “trigger page” is reached. The trigger page also is determined by the number of pages read in each prefetch operation. The idea is to stage the pages that DB2 (actually DB2’s buffer manager component) may possibly need in the pool before DB2 actually requires them. Remember that DB2 thinks it’s sequentially reading these pages. This readahead processing should reduce the chances of DB2 having to wait for a page I/O to complete before satisfying a getpage request.
Although the number of pages in a prefetch varies based on the buffer pool size, for this example let’s assume we have an adequate buffer pool size defined to support a 32-page prefetch. When prefetch is first requested, DB2 will immediately perform two prefetch operations, each reading 32 pages into the buffer pool. This will give DB2 a total of 64 pages to work with. When the first trigger page is reached, the next 32 pages are read.
This technique of reading a full prefetch ahead is sometimes referred to as double buffering. For now, sticking with our 32-page example, let’s assume DB2 starts its first prefetch operation at page 1. The first prefetch will read pages 1 through 32, and the second prefetch, starting immediately after the first begins, will read in pages 33 through 64. When both prefetches complete, DB2 should have 64 pages in the buffer pool waiting on getpage requests.
When the trigger page (in this case, page 32) is reached, DB2 initiates another prefetch for pages 65 to 96. From this point forward, any subsequent prefetch is triggered on the next 32-page boundary (in this example, pages 64, 96, 128, 160, etc.). Again, this assumes DB2 is performing a 32-page prefetch. As DB2 is reading in the next 32 pages, DB2 is processing getpage requests against the 32 pages DB2 brought into the pool on the previous prefetch. Although our example always refers to 32 pages being read in, it doesn’t always mean that DB2 will actually read in 32 pages. If DB2 attempts to read in a page via prefetch that’s already in the buffer pool, DB2 won’t try to reread the page (see Figure 1).
What exactly is meant by the term “adequate buffers” that was used earlier? The prefetch quantity is determined by your buffer pool size (VPSIZE) and table space (or index space) page size. If you remember when prefetch was new, we made a huge deal over buffer pool sizes because the buffer pool size directly affected the number of pages prefetched. You needed a pool size greater than or equal to 1,000 pages to get prefetch to fetch 32 pages (the maximum at that time).
This long-standing value has finally changed. In DB2 9, all we previously knew is still true for buffer pools less than 40,000 pages. However, when your pool sizes exceed 40,000 pages, you’ll now get larger prefetch quantities. If your buffer pool is greater than 40,000 and less than or equal to 80,000 pages, you can get a sequential prefetch quantity of 64 pages and 64 pages for utilities. The utility page quantity jumps to 128 pages when you get to 80,000 pages less than or equal to VPSIZE * VPSEQT (although the sequential prefetch value remains at 64).
Dynamic prefetch will always have a maximum of 32 for all buffer sizes greater than 1,000, regardless of how large your buffer might be. This short discussion of prefetch quantities is based on a 4k page size. The prefetch quantity needs to be adjusted for other page sizes. Chapter 39 of the DB2 9 Performance Monitoring and Tuning Guide (SC18-9851) contains a chart that covers the prefetch quantities for all page sizes.
Tip: If you decide it’s time to start using DB2’s larger page sizes, make sure the DSNZPARM keyword DSVCI on macro DSN6SYSP, “VARY DS CONTROL INTERVAL” on panel DSNTIP7, is set to YES, the DB2 default. Any table spaces created after this is turned on will have a VSAM control interval equal to the page size.
Two more buffer pool keywords that have an effect on prefetch are VPSEQT and VPPSEQT.
VPSEQT controls the percentage of the buffer pool (VPSIZE) that can be used for sequential pages, the sequential steal threshold for that pool. Set VPSEQT to zero and you completely disable prefetch for that buffer pool. If the pool is primarily used for highly sequential data (lots of prefetch), then make this value large, going as high as 99 percent. If the pool is going to be shared by both random and sequential pages, choose a VPSEQT value low enough to allow for adequate random I/O to be successful.
VPPSEQT is the sequential threshold for parallel sequential activity and this value is a percentage of the VPSEQT value. Setting VPPSEQT to zero turns off parallel processing for that buffer pool. Because VPPSEQT is a percentage of VPSEQT, setting VPSEQT to zero has the same effect as setting VPPSEQT to zero, regardless of the percentage VPPSEQT is set. Both of these values are modified using the ALTER BUFFERPOOL command.
The PREFETCH column in the plan table is set to “S” when sequential prefetch is picked. As of DB2 9, ACCESSTYPE should be equal to “R.” Be careful to not assume that the “R” in ACCESSTYPE is a bad thing. Although an index access often is an excellent access path, there are those occasions when not using an index could be the better way to get to your data. Always do the extra analysis to determine if it’s actually the better access path.
Clustering is another aspect of DB2’s data that’s important to prefetch processing. If the data is highly clustered, you can take advantage of prefetch and reduce I/O processing. This is an example of why your design is so critical to your application’s potential performance. Picking the wrong key can defeat the advantages of using prefetch.
With sequential prefetch being such a cool feature of DB2 for so many years, you may be surprised to learn the occasions when DB2 will choose sequential prefetch as the access path by bind are about to significantly decrease. That’s because of dynamic prefetch, which is playing a far more prominent part as of DB2 9. Although dynamic prefetch has been around a long time, it really hasn’t been front-page news until recently, when it’s been showing up as the access path of choice more often than ever.
Sequential prefetch works well, but has its shortcomings. Because it’s picked at bind time, once sequential prefetch is chosen, you’re stuck with it. This is important because it’s possible for the decision to use sequential prefetch to be based on inaccurate estimations. Next, sequential prefetch requires a triggering page, which is a multiple of the prefetch quantity (our example used 32 pages), and if it’s missed, you’re stuck reading one page at a time until the next triggering page is reached. Finally, sequential prefetch works only when fetching forward; it isn’t used for a backward scan, an important fact now that DB2 has backward index access.
Dynamic prefetch has none of those shortcomings. When dynamic prefetch is picked at bind time, it can switch between sequential prefetch and index access based on the data pages accessed using sequential detection. Dynamic prefetch doesn’t rely on a triggering page, so there’s no chance of “missing it.” Dynamic prefetch also can scan both forward and backward, which is handy. Remember, full backward index scan is available as of DB2 V8. Finally, the dynamic prefetch engines can run in parallel. Parallel processing is almost always a good thing if you’re trying to reduce elapsed time.
As of DB2 9, dynamic prefetch is used for index scans and table access via table scans. A few of the utilities via APAR PK44026 also are recent converts to dynamic prefetch. Dynamic prefetch is enabled for the UNLOAD phase of REORG INDEX, the UNLOAD and BUILD phases of REORG TABLESPACE PART, the BUILD phase of LOAD REPLACE PART, and the RUNSTATS phase of RUNSTATS INDEX.
In addition, the changes to dynamic prefetch in DB2 9 apply to single table access, multi-table join, outer join, subquery, and union. All this is available in DB2 9 Conversion Mode (CM), formerly compatibility mode in V8. DB2’s traditional sequential prefetch, as of DB2 9, will only be used for table space scans.
The part that hasn’t changed is determining when you’re using dynamic prefetch. The plan table column PREFETCH will still contain a “D” when bind decides to use dynamic prefetch. However, you may have to check IFCID 0003 and IFCID 0006 to find out what actually happened if DB2 made the switch to dynamic prefetch at run-time. In this case, dynamic prefetch wouldn’t be reflected in the plan table.
Dynamic prefetch could have all kinds of performance advantages over sequential prefetch because it isn’t dependent on the optimizer making the correct decision the first time and allows the access path to be changed on the fly as data patterns change. It’s interesting that DB2 just doesn’t make dynamic prefetch its first choice for everything and then just do whatever is right at run-time. However, dynamic prefetch, using sequential detection, does incur a CPU overhead from monitoring page access patterns. For this reason, it’s not turned on in a table scan in which all consecutive pages are being read anyway. In the test performed for the DB2 9 Performance Redbook, there was up to a 50 percent improvement when performing an index scan using dynamic prefetch.
So what’s sequential detection? Although almost always referred to as dynamic prefetch today, the term dynamic prefetch was often used interchangeably with sequential detection when this function was first introduced in DB2 V2.3. Today, sequential detection is almost exclusively used to describe the mechanism that trips dynamic prefetch. Dynamic prefetch can “dynamically” switch from an index access to sequential prefetch using the sequential detection algorithm to determine if pages are being sequentially accessed. Regardless of what you call it, it’s a nice feature that lets the optimizer make an access path decision with a fallback plan changing how data pages are retrieved while an application runs.
Sequential detection can work on all page sizes (4K, 8K, 16K, and 32K) and can be used on leaf pages or data pages. Sequential detection also can be used to trigger dynamic prefetch for a backward index scan. If used to trip dynamic prefetch at run-time, there’s no way of detecting the access change in the plan table; it always tells you what was known at bind time.
Chapter 39 of the DB2 9 Performance Monitoring and Tuning Guide (SC18- 9851) contains a detailed explanation of how sequential detection works. This is my interpretation of that discussion.
Sequential detection is a two-step process. DB2 first needs to determine if the most recently accessed page is page sequential. If so, it then determines if the data is being sequentially accessed. Both calculations are performed on a moving target. When determining if a page is page sequential, DB2 checks to see if the page is p/2 (where p is the prefetch quantity) pages from the current page. If it is, it then checks to see if five or more of the last eight pages were sequentially accessed. If they were, it switches to sequential prefetch. The pages continuously checked move forward and are always checked. If it’s detected that the pages are no longer page sequential, then DB2 reverts to the original access path. DB2 will flip-flop between index and sequential access while continually checking to see if access is page sequential.
To determine if sequential detection was used, check the accounting IFCID 0003 or the performance IFCID 0006 records.
Although there seems to be less emphasis today on specify RELEASE (DEALLOCATE) at bind time, specifying RELEASE (DEALLAOCATE) does ensure that the sequential detection counters aren’t reset at each commit point. In a batch job with numerous commits, using RELEASE (COMMIT) could impact the batch application’s performance by negating the benefits of sequential detection and reducing the opportunities to take advantage of dynamic prefetch.
List prefetch probably won’t make it too high on your list of favorite access paths if you happen to be trying to obtain sub-second response time for an Online Transaction Processing (OLTP) transaction. List prefetch needs to do a little bit of work before your application gets to see any results. It can really shine in the right situation, such as when your data is clustered in one sequence, but you need to use an index whose key sequence goes completely against your clustering order, or if you just have a low cluster ratio (something less than 80 percent). Data scan via index with an index-screening predicate is another example, even with 100 percent cluster ratio. The 80 percent cluster ratio threshold for list prefetch was eliminated in DB2 9. With list prefetch, used only with an index access, you gather up all of the qualifying Record Identifiers (RIDs) that satisfy the predicates from the index access, sort the RIDs into page order, and then use sequential prefetch (or skip sequential prefetch) to read the noncontiguous but now ordered pages. Because of the faster speeds of today’s disk, and features such as Parallel Access Volume (PAV), the old limit of 180 CIs per I/O for list prefetch was removed in V7 and no limit currently exists.
List prefetch is a mixed bag. On one hand, DB2 can perform a little extra index screening at the RID level, which is a good thing. However, because you’re sorting the RID in page sequences, you lose any use of ordering you might have gained from the index access and will still have to perform a sort.
List prefetch isn’t just used for an unclustered index access. It’s also used:
• To improve restart and recovery times by the process Fast Log Apply (FLA)
• To access the inner table of a hybrid join
• For multiple index joins by getting the qualifying RIDs from each index involved, taking the union or intersection to provide a final RID list, and using that RID list via list prefetch to retrieve all the qualifying (pages).
List prefetch can even be used for reading Large Objects (LOBs) and by an incremental image copy.
Although a good way to efficiently process an unordered group of pages, list prefetch does come with rules. At static bind, if it’s determined that the list of RIDs could use 50 percent or more of the RID pool, list prefetch won’t be chosen. The RID pool is used for processing RIDs in DB2. It’s defined at installation and can be modified by changing your DSNZPARM member. Correctly sizing the RID pool will minimize the chances of RID pool failures. There are suggestions and formulas for determining the RID pool size in the DB2 Installation Guide. Setting the RID pool size to zero will disable RID processing and therefore disable list prefetch.
One additional bind time constraint on list prefetch is that it won’t be chosen by the optimizer if the table has been created with, or altered to use, the VOLATILE keyword.
At execution time, DB2 can abandon list prefetch if it accesses more than 25 percent of the rows of a single table. This last threshold, if reached, can really crush any chances of optimal performance. After already spending time processing data using list prefetch as the access method, when DB2 stops using list prefetch at run-time, DB2 has to fall back to sequential prefetch to access the data. DB2 also has a low-end check it performs for list prefetch. If DB2 is using an intersection of multiple RID lists and detects less than 32 RIDs, it stops the intersection and just uses the RIDs it has.
If you want to find out what list prefetch is actually doing, you can track IFCID 125, available with a class 8 performance trace.
If you want to try to influence the optimizer to not pick list prefetch, you can specify OPTIMIZE FOR n ROWS on the SQL SELECT statement. The OPTIMIZE FOR n ROWS clause will reduce the likelihood of list prefetch being picked.
There are reasons you may not wish to use list prefetch. One of them is the chance of increasing elapsed time by increasing the amount of processing that needs to be performed before a row is returned to the application. Another would be no index only access can be used in combination with list prefetch.
To determine if the optimizer would use list prefetch, you can check the PREFETCH column in the plan table for an “L.”
When used at the right time, prefetch can be a real asset to your application, helping it achieve its best performance. You must understand when it’s a benefit and when another access path may be a better choice. When adequate statistics are available, depend on the optimizer making the right choice. If you’d like to learn more about the basics of prefetch and how it all started, look for the article, “Managing IBM Database 2 Buffers to Maximize Performance,” which is available for download in Adobe Acrobat PDF format from IBM’s Website.