Apr 27 ’10
Choices for Caching Data in the Enterprise
As we continue to expand and combine our data and applications into enterprisewide applications and layer them in an architecture that adds abstraction and expands usage, we find our database access activity greatly increases. Service layers are being built on top of application interfaces and then connected with various other applications across the enterprise. Designs point programmers at enterprise objects or services as the source of data, rather than data servers and databases. This allows for fast deployment of new software applications and services, and provides great flexibility in the enterprise to adapt to new requirements. It can also result in greatly increased data server access, application server activity, and network traffic. One way to improve the performance of critical applications with this type of architecture is to use data caching.
Data caching is a generic term used to describe any of several techniques for placing critical data, or data commonly accessed, in memory and also possibly closer to the user. There are a few choices for caching; some requiring programming or administration efforts and some requiring no manual involvement at all.
This article will offer some choices for caching, with special focus on the DB2 for z/OS buffers.
The Need for Caching
With large-scale, high-volume applications, our primary concerns are always elapsed time per transaction and CPU consumption on the central server. With modern enterprise implementations taking over as the development standard, the layers of abstraction hide the data access layer, and developers often have no clue about the data access actually happening on the data server. Also, generic development typically results in a proliferation of data access calls.
With modern enterprise design, we find more data server calls than in legacy architectures. For example, one recent high-performance application we worked on executed about three SQL statements per transaction. This was a highly tuned application designed for a specific purpose. In contrast, another high-performance implementation we worked on, which employed a Service-Oriented Architecture (SOA) to implement an enterprise data service layer, executed about 1,000 SQL statements per transaction. This frequency of access is a primary concern for modern implementations. Caching can be extremely helpful in these situations.
Caching can help alleviate several potential performance bottlenecks:
- CPU constraints on the data server
- Network traffic between the user and application server or application server and data server
- DASD access latency.
Figure 1 indicates potential performance bottlenecks in enterprise application architecture. The caching itself can be implemented in several different ways listed here in debatable order of least to most effective:
- DASD caching
- Database caching (in our example DB2 for z/OS)
- Application-level caching
- Client-level caching
- Cache data servers.
Figure 2 highlights our choices for data caching. We’ll examine each type of caching with emphasis on the DB2 buffer pools.
Modern DASD subsystems have the potential for huge front-end controller cache. We’ve seen sites use this to the max with great results. We’ve seen average synchronous I/O response rates at about 2.7ms, and in some situations, as low as 1ms. Many believe that should be the standard for high-volume databases. At many sites, however, the response range averages between 4ms and 7ms, with some peak response rates of 12ms to 14ms. These could be situations where there’s not enough DASD cache, or perhaps an operating system bottleneck such as high I/O queue times.
Modern, high-performance DASD subsystems don’t necessarily have faster disk drives than traditional disk drives. We get the high DASD response rates with a combination of distributing the data across multiple devices in a Redundant Array of Inexpensive Disk (RAID) configuration, and through large stores of cache at the DASD subsystem level. In situations where your applications are driving high volumes of random I/O, DASD cache may be a good idea.
While some cache at the DASD subsystem level can be extremely effective in improving I/O response rates, it’s probably one of the least overall beneficial caching techniques. You should have a lot of DASD cache, but caching at the DASD level doesn’t eliminate the I/O at the operating system level while all the other caching techniques highlighted in this article do. Also, if you’re going to use a lot of data server cache (i.e., DB2 buffers), the DASD cache and data server buffers often can be redundant.
Caching in DB2
When it comes to DB2 performance, using the available cache can prove to be quite important. DB2 buffer pools are virtual storage that temporarily store pages of table spaces or indexes. When a program accesses a row of a table, DB2 places the page containing that row in a buffer. When a program changes a row of a table, DB2 must (eventually) write the data in the buffer back to disk, normally either at a DB2 system checkpoint or a write threshold.
The way DB2 buffer pools work is fairly simple by design, but tuning these operations can greatly impact application performance. The data manager issues GETPAGE requests to the buffer manager, which hopefully can satisfy the request from the buffer pool instead of having to retrieve the page from disk. We often trade CPU for I/O to efficiently manage our buffer pools. Buffer pools are maintained by subsystem, but individual buffer pool design and use should be by object granularity and, in some cases, also by application.
The virtual buffer pools contain three types of pages: Available pages are pages on an available queue and are available for stealing. The in-use pages are pages currently in use by a process and unavailable for stealing. Updated pages are pages that aren’t in use, not available for stealing, and are considered to be dirty pages waiting to be externalized.
You can have up to 80 virtual buffer pools, a limit that allows for up to 50 4K-page buffer pools (BP0–BP49), up to 10 32K-page buffer pools (BP32K–BP32K9), up to 10 8K-page buffer pools (BP8K0-BP8K9), and up to 10 16K-page buffer pools (BP16K0–16K9). The physical memory available on your system limits the size of the buffer pools, with a maximum size for all buffer pools of 1TB. It doesn’t take additional resources or CPU to search a large pool vs. a small pool. DB2 can manage large buffer pools well. However, if you exceed the system’s available memory, the system will start swapping pages from physical memory to disk, a consequence that can severely impact performance. Also, large buffer pools don’t necessarily mean you’ll have better performance.
Best Utilization of DB2’s Buffer Pools
To best use the buffer pools in DB2, we must understand how the applications use the data because this will influence how we set up the buffer pools. DB2 processes pages used in the buffer pools in two categories: random (pages read one at a time) and sequential (pages read via prefetch). These pages are queued separately: in a random Least Recently Used (LRU) queue or a Sequential Least Recently Used (SLRU) queue, where prefetch steals only from this queue. These queues (set with the VPSEQT buffer pool parameter) can be adjusted with the ALTER command. If you have heavy sequential processes using a lot of sequential processing and sequential prefetch, you’d want to be sure the buffer pools have a large sequential area. Other processes may be more random and not benefit from large sequential queues (i.e., online random index access). If DB2 doesn’t have pages available on the sequential queue, sequential prefetch can be turned off.
You also can use First-In, First-Out (FIFO) processing instead of the default LRU. With this method, the oldest pages are moved out regardless. This alternative decreases the cost of doing a GETPAGE operation and reduces internal latch contention for high concurrency. You would use this option only where little or no I/O occurs and the table space or index is resident in the buffer pool (i.e., highly used code tables and indexes that are memory resident). By pinning highly used objects into memory in their own buffer pool, you can then set it to be 100 percent random (VPSEQT=0) so no sequential queue is managed, and then turn on FIFO queuing. This can save on CPU and GETPAGE cost for these heavily used, memory-resident objects.
Your use of separate buffer pools should be based on their usage by the applications (such as buffer pools for objects that are randomly accessed vs. those that are sequentially accessed). Each pool will have its own unique settings, and the type of processing may even differ between the batch cycle and the online day. Figure 3 provides a detailed example of buffer pool object breakouts. These are generic breakouts just for this example. Actual definitions would be much finer-tuned and less generic.
Sizing DB2 Buffer Pools
Will providing a large cache always provide better performance? Not necessarily. While DB2 can manage a large cache well, it may not necessarily be beneficial due to the nature of the data access. The biggest abuse we see today with large buffer pools is when they’re used for objects that are accessed in a 100 percent random fashion with no reference of data. In this case, a large area of memory won’t help. This is often the result of a poor database design (i.e., use of random keys).
The VPSIZE parameter determines buffer pool sizes. This parameter sets the number of pages to be used for the virtual pool. To size buffer pools correctly, it’s helpful to know the residency rate of the pages for the object(s) in the buffer pool. This can be a difficult process and it requires understanding the applications and their use of the data.
Application Server-Level Caching
Modern servers can be configured with a considerable amount of memory, and this memory can be leveraged for a considerable performance boost. This potential performance gain is due to the simple concept that the least expensive call to the database is the call never made. The most expensive thing you can do in an application is call a database. With DB2 for z/OS, any call to the database is a cross-memory call across address spaces. That’s expensive and especially so if the data the application is looking for is commonly accessed data that rarely changes. This includes data stored in code tables or commonly accessed reference tables.
Imagine a large, national retailer that manages their inventory in 100 warehouses spread out across the country. Their inventory management application must deal with moving thousands of products throughout the network of warehouses, and the product and distribution requirements include critical warehouse information that’s stored in a set of database tables. This processing results in these tables being read hundreds of millions of times daily. However, the tables containing this warehouse information rarely change. If all this data was cached at the application server, then hundreds of millions of data server calls per day could be eliminated. Imagine the network and data server savings!
There are many application-level caching techniques. For example, in a CICS environment, you could use data tables; in a COBOL batch program, an array; or in a remote Java application, a global cached object. We’ve seen lots of algorithms for hashing key information into objects for quick online retrieval, with the cache broken at certain intervals so it could be refreshed from the database. This allows for a balance between the currency of the data and usefulness of the cache.
Overall, one of your best options for caching is at the application server level.
There’s some potential for caching at the client level, especially if your applications reside in a two-tier environment. This mainly involves caching things such as code values in drop-down boxes for selection on a screen. With most modern user interfaces being Web-based, thin clients, this technique is becoming less useful. However, it still can be useful for simple code values. From a security perspective, application server-level caching is a better balance between protecting the data and performance.
Cache Data Servers
There are several cache data server products on the market today. These are database products that keep 100 percent of the data in memory. The cache data server is typically placed closer to the application than the central database and is intended to speed access to common database objects or entire databases. So they’re most effective in a multi-tier environment and co-located with the application server component. They typically can be seamlessly incorporated into the data access component of an application, and so, unlike any homegrown caching method, there’s no extra programming required. The cache data servers are typically kept in sync with the main databases via replication. So, a replication product either comes bundled with the cache database or must be purchased.
Replication can be one directional or bi-directional, and as with any replication effort there can be issues to deal with such as update latency and collisions. The advantage to the cache data server would be a simplified implementation and automation.
Caching can be an extremely effective way to improve the performance of your enterprise applications, but it doesn’t replace good application and system performance tuning. The concepts of enterprise database architecture and a data services layer have merit, and when performance is critical exceptions should be made and specialized code put in place. In this way, most of your development follows the standard for data access. The applications that require that extra something would get some customization, especially for the high-volume stuff. This approach can work well. You can improve performance of an application by tweaking the process that impacts the application. You can get a performance boost without sacrificing methodology.
While local caching can help offload database server cycles to less expensive platforms, it won’t reduce overall CPU consumption. CPU utilization generally increases with an enterprise data services architecture. Tuning is the best solution and it involves some customization, at least for the high-volume stuff.