Conditional Job Control Language (JCL) in z/VSE was a major step forward when it was introduced in the ’90s. In the intervening years, z/VSE systems programmers have developed intelligent, complex job streams. The design of conditional JCL was highly functional, with few restrictions of any consequence, and there have been few requests for enhancements since, an indication that the initial design was sound. …
Solid State Drives (SSDs) based on flash memory are the premier, enterprise-class storage devices of the near future. Chances are you will be implementing SSDs in the next few years, even if the technology seems expensive today. This article explains the technology behind SSDs and provides practical implementation strategies, considering both the performance improvements SSDs provide and the reduction in spinning hard disk activity that results from moving the most active data sets to solid state. We’ll examine the pros and cons of migrating individual data sets, logical volumes, or complete storage groups and show the reductions in component count possible with SSDs based on actual z/OS workload data. …
Mainframe Executive recently spoke with Reed Mullen, consolidation and virtualization initiative leader for IBM System z. Mullen joined IBM in 1981 as a junior programmer in the VM/SP Development organization located in Endicott, NY. Over the years, he has held management, product planning, and strategy development positions in the IBM mainframe organization. In 1999, he played a key role in developing the Linux on S/390 business plan that leverages the mainframe’s ability to host large-scale virtual server environments. Since then, he has applied his expertise in the areas of strategy, business development, and technical marketing support to help grow the Linux on System z business. …
Smart CIOs recognize that it’s necessary to evaluate how well the metrics they use continue to work. Many CIOs use a set of basic metrics for IT performance, such as response time, percent availability, and ROI. These sometimes serve as a “dashboard” for IT management. …
This article examines the evolution of threadsafe execution in CICS from its inception to today, and why you should exploit threadsafe in as many of your applications as possible. Other topics discussed include: …
This article takes a detailed look at DB2 buffer pool management, an important performance consideration. The goal is to leverage virtual storage so you can keep application data closer at hand and eliminate physical I/Os—still the biggest component of DB2 application response time. The benefit is an improvement in overall service levels, with reduced CPU consumption and shorter elapsed times for users.
With DB2 V8, buffer pools moved above the 2GB bar in the DBM1 address space, so virtual storage constraints are no longer the concern they once were. The primary concern now is DB2 object placement in the buffer pools to ensure optimum performance based on your business priorities: how many pools you’ll need, what size they’ll be, and which objects go in which pools. You still need to ensure that buffer pool configurations don’t cause an increase in paging and negatively affect overall system performance.
DB2 objects (table spaces and index spaces) are physically stored as pages in VSAM data sets. The buffer manager services DB2 application data requests with a GETPAGE operation. The GETPAGE requests a specific DB2 page. The DB2 buffer pool is searched, and if the page isn’t found, a synchronous I/O operation occurs to retrieve it. Pages accessed this way are known as random pages.
DB2 pages retrieved using an asynchronous I/O operation called a prefetch are known as sequential pages. Prefetch operations retrieve multiple data pages, anticipating an application need for the additional pages. Types of prefetch include sequential, dynamic, and list prefetch. The DB2 9 optimizer favors dynamic prefetch for all sequential access except for table space scans where sequential prefetch is still used.
The retrieved page is placed in the buffer pool associated with the DB2 object being accessed where it will remain until the buffer is required for another DB2 process. While still resident in the pool, the page may be required by the same or another process, resulting in a buffer pool hit and avoiding an additional I/O to read the page again.
After a page is updated by a DB2 process, it’s asynchronously (normally) written back to the underlying VSAM data set. Writes are triggered by DB2 checkpoint processing or when certain buffer pool thresholds occur during normal processing. (Deferred write thresholds will be discussed later.) DB2 optimizes the write process to minimize I/O processing.
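The read and write path just described—search the pool, synchronous read on a miss, page steal when the pool is full, deferred writes for updated pages—can be sketched as a toy model. The class, method names, and counters below are invented for illustration; this is not DB2’s actual buffer manager, just the general shape of an LRU-managed pool:

```python
from collections import OrderedDict

class BufferPool:
    """Toy model of an LRU-managed buffer pool.
    Illustrative only -- not DB2's actual buffer manager."""

    def __init__(self, vpsize):
        self.vpsize = vpsize        # pool size in pages (VPSIZE)
        self.pages = OrderedDict()  # page_id -> dirty flag, kept in LRU order
        self.hits = 0
        self.sync_reads = 0
        self.writes = 0

    def getpage(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # buffer pool hit: no I/O
            self.hits += 1
            return
        self.sync_reads += 1                 # miss: synchronous read I/O
        if len(self.pages) >= self.vpsize:   # pool full: steal the LRU page
            victim, dirty = self.pages.popitem(last=False)
            if dirty:
                self.writes += 1             # dirty victim written first
        self.pages[page_id] = False          # clean page now resident

    def update(self, page_id):
        self.getpage(page_id)                # page must be in the pool
        self.pages[page_id] = True           # mark dirty; write is deferred

pool = BufferPool(vpsize=3)
for p in (1, 2, 3, 1, 4):  # second reference to page 1 is a hit; 4 steals page 2
    pool.getpage(p)
```

After the loop, four GETPAGEs required a synchronous read and one was a buffer pool hit, so even this toy model shows how residency avoids repeat I/O.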
This is a simplified view of what’s happening. For more detail on DB2 internals, refer to IBM publications and other content on the Web. Chapter 6 of the Performance and Tuning Guide for DB2 Version 9.1 for z/OS (SC18-9851-06) provides additional details on this topic. An IBM Redpaper (REDP-4604-00) provides similar detailed content on buffer pool management. A general search on “DB2 buffer pool tuning” on the Web will return many items of interest on this topic.
Buffer Pool Details
Because DB2 buffer pools are allocated in virtual storage in the DBM1 address space, they’re also called virtual buffer pools. DB2 virtual buffer pools are defined during DB2 installation (DSNTIP2). Initial specifications are stored in the Bootstrap Data Set (BSDS). Buffer pools defined with a buffer pool size of zero aren’t allocated.
Once the buffer pools are defined, the only way to modify their size is with the DB2 ALTER BUFFERPOOL command; there is no offline update mechanism, and any changes are persistent. Pools initially defined with VPSIZE(0) can be altered to a positive value and then allocated in the DBM1 address space. You can effectively delete an existing buffer pool by altering its size to zero, although any attempt to access a pageset assigned to that buffer pool will then fail with an unavailable buffer pool resource. The alter won’t complete while any object using the pool is open.
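For example, the following console commands (pool names and sizes are illustrative) grow one pool and effectively delete another by setting its size to zero:

```
-ALTER BUFFERPOOL(BP5) VPSIZE(50000)
-ALTER BUFFERPOOL(BP6) VPSIZE(0)
```

The first command grows BP5 to 50,000 4K buffers; the second effectively deletes BP6, after which any pageset still assigned to BP6 becomes unavailable.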
DB2 on z/OS allows for 80 buffer pools, including:
* BP0 – BP49: 50 4K pools
* BP8K0 – BP8K9: 10 8K pools (introduced in DB2 V6)
* BP16K0 – BP16K9: 10 16K pools (introduced in DB2 V6)
* BP32K – BP32K9: 10 32K pools.
The following virtual buffer pool attributes define the size and other characteristics of the buffer pool:
Virtual Buffer Pool Size (VPSIZE) defines the size of the buffer pool in pages.
Page Stealing Algorithm (PGSTEAL) indicates the page stealing algorithm DB2 will use for stealing pages from this pool. LRU (Least Recently Used) is the default option and is most often the right choice. You might use the First In, First Out (FIFO) PGSTEAL option for buffer pools where no I/O occurs after required table space or index pages are initially loaded into the pool. These pages will remain resident. Since page stealing won’t occur, the higher CPU cost of LRU isn’t necessary.
Page Fix Attribute (PGFIX) indicates if pages are fixed in real storage when initially used. Possible values are YES and NO. Use page fixed buffer pools only if there’s sufficient real storage available to support the fixed pools without causing paging. Page fixing is beneficial when there’s significant unavoidable I/O. Pages must be fixed to do I/O, so having a fixed pool avoids the need to repeatedly fix and then un-fix each page. This can save as much as 10 percent of CPU time. Use page fixing in high update situations where significant amounts of write I/O are occurring.
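As an illustration (pool names are invented for the example), a pool holding small, fully resident code tables might use FIFO, while a heavily updated pool with sufficient real storage behind it might be page fixed:

```
-ALTER BUFFERPOOL(BP10) PGSTEAL(FIFO)
-ALTER BUFFERPOOL(BP11) PGFIX(YES)
```

Note that a PGFIX change generally takes effect the next time the pool is allocated, so plan it alongside a pool reallocation.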
Automatic Buffer Pool Adjustment (AUTOSIZE) specifies whether this buffer pool uses a DB2 9 feature that allows Workload Manager (WLM) services to automatically change the size of the pool based on I/O delays for random GETPAGEs. YES and NO are the available options. With AUTOSIZE(YES), DB2 registers the buffer pool with WLM and provides information about I/O delays, along with the buffer pool size and the hit ratio for random reads. If WLM determines the buffer pool size is a problem, DB2 is instructed to increase the size of the buffer pool in 25 percent increments. The change to the virtual pool size is persistent.
WLM can also determine that the size of the pool should be reduced, which happens in 25 percent decrements. A buffer pool could be reduced in size, for example, if WLM observes that available real storage is being severely over-committed, or if an earlier increase didn’t improve performance.
This overview of the AUTOSIZE option comes from various sources but there’s little hard performance data showing the value of this option. We’ve had difficulty identifying DB2 users taking advantage of this option.
You can modify several threshold attributes using the ALTER BUFFERPOOL command. Thresholds you might consider changing include:
* Sequential Steal Threshold (VPSEQT) represents the percentage of buffer pool pages that can contain sequentially accessed pages. The default value is 80 percent. For pools with mostly random access, this value can be set to 0 (zero), preventing sequentially accessed pages from taking space in the pool. Any sequentially accessed pages are discarded immediately when they are released. Setting this value at a low number (≈10 percent) would allow some sequential pages to use the pool. If you expect the pool to be used primarily for sequential access, then VPSEQT should be very high (≈99 percent). Monitor the sequential getpage statistics for the pool to validate your assumptions about random/sequential access to the objects in the pool.
* Deferred Write Threshold (DWQT) represents the percentage of the virtual buffer pool pages that can be occupied by unavailable pages (updated or in use) before DB2 begins to schedule asynchronous writes for updated pages. This sets the value for the entire pool. The default value is 30 percent. If the threshold is reached, DB2 will schedule asynchronous write engines (up to 600) to reduce the number of unavailable pages to 10 percent below the threshold. If you were using the default threshold, DB2 would schedule write engines to reduce the current percentage of unavailable pages from 30 percent to 20 percent.
* Vertical Deferred Write Threshold (VDWQT) defines virtual buffer pool pages that can be occupied by updated pages from a single pageset before DB2 begins to schedule asynchronous writes for that pageset. This value should always be less than the DWQT. The threshold can be expressed as a percentage of the total pages in the pool or you can specify an absolute number of buffers. The default percentage is 5 percent. If you specify the absolute number of buffers, the first parameter must be 0 percent; (0,300), for example. If you specify 0 percent and don’t provide a value for the second parameter (absolute number of buffers), a deferred write of 32 pages is triggered when the number of updated buffers (dirty pages) for the data set reaches 40.
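For instance (pool names are illustrative), a mostly random pool might turn off sequential steal entirely, while a pool containing one very hot pageset might use an absolute vertical threshold:

```
-ALTER BUFFERPOOL(BP12) VPSEQT(0)
-ALTER BUFFERPOOL(BP13) DWQT(30) VDWQT(0,300)
```

The second command keeps the pool-wide deferred write threshold at its default while scheduling writes for any single pageset once it has 300 updated buffers in the pool.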
Other thresholds, which are fixed for all pools, indicate problems that have a severe impact on performance and availability. Things get progressively worse as the percentage of unavailable (in-use or updated) pages increases. These thresholds are:
* Sequential Prefetch Threshold (SPTH): When 90 percent of the available buffers in the pool are unavailable, prefetch operations are turned off for the pool.
* Data Manager Threshold (DMTH): When 95 percent of the pages in the pool are unavailable, GETPAGE requests and releases apply to rows instead of pages. That is, when more than one row is retrieved in a given page, more than one GETPAGE request and release are performed for that page. This event has a major CPU impact.
* Immediate Write Threshold (IWTH): This threshold (97.5 percent) causes synchronous writes of DB2 pages as they are updated by DB2 processes.
The –DISPLAY BUFFERPOOL DB2 command output provides detailed information on buffer pool characteristics as well as incremental performance metrics for each buffer pool. It provides counts on the number of times various thresholds have been reached and details about the pagesets in the pool. The output of this command is shown in Figure 1.
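One commonly used form of the command (shown here as an illustration; see the command reference for the full syntax) requests detail statistics accumulated since the pool was last displayed, plus statistics for the open pagesets in the pool:

```
-DISPLAY BUFFERPOOL(BP1) DETAIL(INTERVAL) LSTATS
```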
Configuring Your DB2 Buffer Pools
There are several considerations for buffer pool implementation. Most DB2 users have production systems in place and periodically revisit buffer pool sizing and thresholds to ensure they still meet performance goals. The IBM DB2 on z/OS Installation Guide (GC18-9846-08) provides information on initial sizing for DB2 buffer pools, and it’s often the best place to start for new applications or subsystems. Once applications are in test or production, you can collect accurate metrics that may lead you to consider changes to the buffer pool configuration.
Use the simplest approach that effectively meets performance requirements. Your major task is determining DB2 object placement in the various pools. This effort will lead to the physical characteristics of the buffer pool configuration for your DB2 subsystem.
A good starting point is to consider the various DB2 object types you deal with on a daily basis, including:
* System: DB2 catalog and directory objects (DSNDB01 and DSNDB06)
* Work: Storage used for temporary files and tables required for multiple DB2 functions, including global temporary tables, DB2 sort work files, etc. Because these pagesets tend to have high I/O volumes, consider page fixing for these pools. DB2 9 merges the WORKFILE and TEMP databases from DB2 V8 into a single WORKFILE database.
* Large objects: LOB/XML structures
* Application table space: DB2 application tables
* Application index space: DB2 index structures.
DB2 catalog and directory objects are placed in BP0, BP8K0, BP16K0 and BP32K, depending on the page size of the object. It’s best to keep these objects isolated in these buffer pools, so don’t put application objects in these pools.
Place WORKFILE and TEMP database objects in dedicated buffer pools. Typically, workfile pagesets are accessed sequentially and are short-lived objects. DB2 9 will try to make greater use of 32K workfile table spaces, so increase the size of the 32K buffer pool used for the WORKFILE database and the number of 32K page workfile data sets.
Large Objects (LOBs) shouldn’t share a buffer pool with other kinds of data. Set the DWQT to 0 for these pools. The DWQT sets the value for the pool. The VDWQT (Vertical Deferred Write Threshold) can be used to set the value at the individual pageset level. In either case, this allows updates to happen continuously (trickle writes) in the background to avoid massive writes at DB2 checkpoints.
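A sketch for a dedicated LOB pool (the pool name is illustrative) that forces continuous trickle writes at the pool level:

```
-ALTER BUFFERPOOL(BP32K4) DWQT(0)
```

Alternatively, leave DWQT at its default and use VDWQT to trigger trickle writes only for individual LOB pagesets.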
XML table spaces use 16K buffer pools. Place these objects in dedicated pools. Use a buffer pool other than BP16K0.
DB2 application objects are a much more complex consideration, but here are some general guidelines:
* Don’t place application objects in BP0, BP8K0, BP16K0, or BP32K. These are the default buffer pools for all user data. If you omit the BUFFERPOOL clause for a table space or index space, DB2 uses the default buffer pool for the database; if the CREATE DATABASE doesn’t specify one, the defaults set on installation panel DSNTIP1 are used, and out of the box those are BP0, BP8K0, BP16K0, and BP32K. Since these are also the buffer pools used for DB2 system objects, you should consider changing those installation defaults. Generally, it’s a good idea to always specify an explicit buffer pool for your objects; however, if you’re allowing implicit object creation, you can’t directly specify which pools to use. Changing the installation defaults for user data and indexes will help enforce your strategy for object buffer pool placement.
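In DDL, that means coding the BUFFERPOOL clause explicitly at both the database and pageset level. The names below are invented for the example:

```sql
CREATE DATABASE PAYROLL BUFFERPOOL BP2 INDEXBP BP3;
CREATE TABLESPACE PAYTS IN PAYROLL BUFFERPOOL BP2;
```

With the database-level defaults coded, even objects created without their own BUFFERPOOL clause land in your chosen pools rather than BP0.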
* Segregate indexes and table spaces in different buffer pools for concurrency and long residency times. Indexes are typically accessed differently from the tables they index and with greater frequency. The number of buffers required for the average working set size of an index space is usually smaller than its corresponding table space or tables, and you want to ensure buffer pages used by index pages remain resident and don’t get flushed during heavy table space scan activity. Like table spaces, separate frequently updated indexes from infrequently updated indexes to allow optimum buffer pool sizing and thresholds.
It’s important to understand DB2 application object characteristics when deciding on buffer pool placement. Grouping objects with similar characteristics into the same buffer pool can be a good approach to meeting your performance objectives. Consider these characteristics:
* Business priority: Perhaps the most important information you need is how critical the application is to your business. Consider placing critical application objects that require maximum performance into dedicated buffer pools to avoid resource contention with less-important application objects.
* Random or sequential access: Is the predominant level of GETPAGE activity random or sequential? Place objects that are accessed sequentially most often in a pool with a higher VPSEQT value; place objects with mostly random access in a pool with a lower VPSEQT value. If the accesses are evenly split between random and sequential, use the default values as a starting point.
* Activity against this object: How busy is this object? Activity is measured in GETPAGEs per second against the object. A highly busy object might cause buffer pool pages to be stolen from another object that’s not as busy but more important to the business.
* Update frequency: What’s the update rate for the object? Is this a heavily updated DB2 pageset? Table spaces and index spaces that are frequently updated should be separated from infrequently updated objects to allow optimum buffer pool sizing and thresholds.
* Object size: How big is this DB2 object in pages? If the object is quite large and the access is highly random, the chances of reusing a page that’s already in the pool will be minimal, so having a very large pool for the large object may provide no benefit to application performance.
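The characteristics above can be combined into a simple placement heuristic. The rules, thresholds, and pool labels below are an illustrative sketch of the thought process, not IBM guidance:

```python
def suggest_pool(obj):
    """Toy placement heuristic based on object characteristics.
    Thresholds and pool labels are illustrative, not IBM guidance."""
    if obj["critical"]:
        return "dedicated pool"                    # isolate business-critical objects
    if obj["seq_pct"] >= 70:
        return "sequential pool (high VPSEQT)"     # mostly sequential GETPAGEs
    if obj["seq_pct"] <= 30:
        return "random pool (low VPSEQT)"          # mostly random GETPAGEs
    return "mixed pool (default thresholds)"       # evenly split access

# An index space with 85 percent sequential GETPAGE activity:
suggest_pool({"critical": False, "seq_pct": 85})   # -> "sequential pool (high VPSEQT)"
```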
It can be difficult to collect many of these characteristics, especially for a new application. Use an educated guess as a starting point. In test environments, use an isolated set of buffer pools as you begin testing a new application or one that’s undergoing major changes. This will give you more complete control of the testing environment where you can more easily monitor buffer pool performance.
Important Buffer Pool Performance Metrics
Many of the buffer pool characteristics previously mentioned are collected by DB2 and reported by multiple vendor products. DB2 statistics (SMF 100) and accounting (SMF 101) records can provide buffer pool performance information at the subsystem and thread level, respectively. DB2 can also externalize multiple IFCIDs (198, 199, 201, and 202) with detailed buffer pool and object information.
For a good idea of overall buffer pool health, focus on these metrics:
* GETPAGE rate (pages per second), which is the average number of GETPAGE requests per second over time. You can instantly compare the relative amount of work supported by each buffer pool. This is a good indicator of overall workload through the buffer pool. It’s also a great metric to review at the DB2 subsystem level.
* Page arrival rate (pages read per second) is the average number of pages read into the buffer pool per second. This quickly identifies the buffer pools and the objects driving your I/O subsystem. This is also a great metric for the entire DB2 subsystem.
* Buffer pool hit ratio, which is the number of GETPAGEs resolved without a read I/O divided by total GETPAGEs. Read I/Os should include random reads as well as prefetch (sequential, dynamic, and list) I/Os. The buffer pool hit ratio measures overall buffer pool efficiency; your goal is to have it as high as possible. A lower value may indicate a need for a larger buffer pool. However, if you have large objects in the pool that are accessed randomly, the hit ratio may always be small; in that case, you might consider reducing the size of the pool instead.
The overall hit ratio can even go negative when prefetch is performing poorly, because prefetch can read in more pages than applications ever request. If prefetch reads pages that are never used (or that are stolen before they can be used), synchronous I/Os result. For randomly accessed pools, this can be curtailed by setting the sequential steal threshold very low. For sequential pools, either the pool has insufficient pages to support the combined prefetch activity, or the activity is being driven by dynamic prefetch for an application that isn’t sufficiently sequential to benefit (and that should therefore be moved to a random pool with prefetch disabled).
* Page updates per write is synchronous page updates divided by system page writes. You can determine the average number of synchronous page updates for each system page written over the length of the interval. This indicates how well the pool size and deferred write thresholds enable repeated updates to the same page to occur before the page is written.
* Pages written per write I/O is the number of pages written divided by number of write I/O operations. You can see the average number of pages written per write I/O over a selected interval with many performance management tools. Average pages per write should be as high as possible, but must be balanced against the amount of the pool tied up with updated (dirty) pages and the possibility of an undesirable spike in activity when checkpoints occur. Multiple factors can impact the number of pages per write, including system checkpoint frequency, active log size, and buffer pool size. Increasing the VDWQT may also increase the number of pages per write by increasing the number of pages in the deferred write queue. Ultimately, pages per write are based on the locality of the pages, and changes to other buffer pool characteristics may have minimal impact.
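Given raw interval counters from the statistics trace, these metrics reduce to simple arithmetic. The variable names and values below are invented for the example; real field names vary by monitor:

```python
# Sample interval counters (illustrative values, not from a real trace)
getpages       = 10_000  # total GETPAGE requests in the interval
sync_reads     = 500     # pages read by synchronous I/O
prefetch_pages = 1_500   # pages read by sequential/dynamic/list prefetch
pages_written  = 900     # pages written by deferred write processing
write_ios      = 300     # write I/O operations
interval_secs  = 100     # length of the measurement interval

pages_read      = sync_reads + prefetch_pages
getpage_rate    = getpages / interval_secs            # GETPAGEs per second
arrival_rate    = pages_read / interval_secs          # pages read per second
hit_ratio       = (getpages - pages_read) / getpages  # GETPAGEs with no read I/O
pages_per_write = pages_written / write_ios           # pages per write I/O
```

With these sample numbers the hit ratio is 0.8; note that because prefetch pages are counted, the same formula goes negative when prefetch reads more pages than applications request, which is the warning sign discussed above.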
In addition to watching these metrics, it’s important to monitor adverse events occurring in the DB2 subsystem, including:
* GETPAGE failures due to no available buffers
* GETPAGE failures due to DBM1 virtual storage shortage
* Write I/Os delayed because all available write engines are busy
* Data manager critical threshold reached
* Prefetch disabled due to buffer shortage
* Prefetch disabled due to prefetch limit
* Parallel prefetch requests denied due to buffer shortage
* Parallel groups not started due to buffer shortage
* Parallel prefetch requests halved due to buffer shortage
* Parallel prefetch requests quartered due to buffer shortage
* Sort work files that have exceeded the sequential buffer limit
* Sort-merge passes degraded due to work file sequential limit.
Buffer Pool Performance Tuning Tools
Buffer pool performance management can be a challenge, but tools are available to help you. These tools fall into two groups: general-purpose monitors and buffer pool tools.
General-purpose DB2 performance monitors let you monitor buffer pools in real-time and may provide near- and long-term historical reporting capabilities. Monitors report DB2 accounting and statistics data about buffer pool performance. Monitors can capture many of the adverse events previously mentioned, and you can use automation capabilities to automatically respond to these problems. Detailed analysis of buffer pool configurations is typically not provided. Depending on your environment, this may be the only type of DB2 buffer pool monitoring tool you need.
For other DB2 environments, “deep dive” products provide an extensive level of monitoring and analysis. These tools collect and report on real-time performance metrics and maintain detailed historical performance data that can provide long-term analysis of buffer pool performance to help you identify and respond to negative trends. These tools also recommend buffer pool configuration changes, including number and size of pools, thresholds, and DB2 object placement.
Buffer pool performance is vital to overall DB2 performance. Even well-tuned SQL statements may perform poorly if your buffer pool configuration doesn’t support the application workloads that drive your business.
The dynamics of your production environment make buffer pool tuning an ongoing task. Because today’s optimum buffer pool configuration may not be optimum in three months, it’s important to periodically revisit this task, especially as new applications are implemented…
Here we’ll discuss how to get the network information needed to do your job well. This includes first, knowing what your networks consist of and then, how your network staff is configuring and maintaining them. You will likely find several opportunities for improvement by integrating the work of your TCP/IP, SNA, and security software staff…
According to some analysts, the quantity of file-based data produced by businesses exceeded the quantity of so-called “structured” data (industry speak for electronic information produced by databases) a year or two ago. The meaning of this change, they claim, is that files have become a bigger challenge for organizations from the standpoint of data management and storage. Another consequence: mainframes, which are optimized for transaction-oriented (aka database) data handling are increasingly less appropriate for the workloads of contemporary business. I expect this message will be increasingly creeping into the rants of the anti-mainframe crowd in the near future, so I wanted to get out in front of the issue in this column. …
Regular readers of this column know it focuses on mainframe data and database management topics. Although that’s still the case, I will be stretching the confines of that definition a bit to discuss the micro-messaging phenomenon, Twitter. If you’ve tried twittering (or tweeting, as it’s also called), you know it can be addictive, but it’s also growing in popularity as a business tool for communication, even though this might seem hard to believe when you first dive into Twitter…
In many ways, the world of IT is still immature. We respond faster to hype than the average fashion addict. We’re often driven more by emotion than facts; the bully yelling the loudest gets whatever he or she wants (new servers, new architecture, etc.). Have you been to a high school reunion lately? Have you noticed what became of the class bully vs. the quiet, smart kid with the specs in the front row? Often surprising, isn’t it? In retrospect, given the choice, who would you respect more and seek out for advice or friendship? …