Mar 5 ’14
A Distributed Terminology Refresher on DB2 DDF Subsystem Parameter Settings
Today’s Fortune 500 companies are embracing distributed techniques to access mainframe data on DB2. Although online transaction processing (OLTP) workloads still dominate DB2’s local processing, more applications, such as data warehousing analytics, run remotely as distributed tasks. The interesting part of distributed DB2 is the terminology; it has its own set. If you spend time dealing only with the local side, some of these terms may be new, maybe even confusing in some rare cases. Here’s a distributed terminology refresher using a couple of subsystem parameters that affect distributed data facility (DDF) processing: MAXDBAT and CONDBAT on the DSN6SYSP macro and CMTSTAT on the DSN6FAC macro.
The DDF allows distributed relational database architecture (DRDA)-supported applications to access DB2 for z/OS data. When a thread is created in support of this type of access, it’s referred to as a DBAT, or a database access thread.
DDF runs in its own address space (ssidDIST) that’s usually started along with the other three primary DB2 address spaces: database services (ssidDBM1), system services (ssidMSTR) and internal resource lock manager (ssidIRLM). The four address spaces combined are often referred to as a DB2 subsystem, or member, if data sharing is involved.
Note: Local access to DB2’s data is via an allied thread. Allied threads are usually associated with work originating with TSO, CICS, IMS, CAF, etc.
Distributed threads come in two flavors, active and inactive. A subsystem parameter (often referred to as a DSNZPARM keyword) called CMTSTAT on the DSN6FAC macro controls which type of distributed thread processing is in effect. CMTSTAT can also be set using the “DDF THREADS” field on the DSNTIPR installation panel. This subsystem parameter can be defined as either CMTSTAT=INACTIVE (the recommended choice and the default setting for DSNZPARM since DB2 Version 8) or CMTSTAT=ACTIVE (an option now highly frowned upon because of the amount of system resources it can consume). Once CMTSTAT is set, it can’t be changed without recycling the DB2 subsystem; it isn’t one of the DSNZPARM keywords that can be modified using the –SET SYSPARM command process.
Note: Back in DB2 V8, DB2 stopped using the terms type 1 and type 2 inactive threads. Since V8, “inactive DBAT” is used in place of a type 1 inactive thread and “inactive connection” is used rather than the term type 2 inactive thread.
With CMTSTAT=INACTIVE, DBATs are pooled and associated with connections as needed. CMTSTAT=INACTIVE is sometimes referred to as inactive thread processing or thread pooling. When CMTSTAT=INACTIVE is specified, up to 150,000 concurrent inbound connections (a value set with the CONDBAT keyword on the DSN6SYSP macro) can be defined to any one DB2 subsystem. Incoming requests for DBATs that exceed the MAXDBAT maximum are processed when a DBAT becomes available. If CMTSTAT=ACTIVE is specified, however, every connection is a DBAT until it’s disconnected (the maximum is controlled with the MAXDBAT keyword), there’s no pooling, and work has to wait (queue) until a DBAT terminates.
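The pooling behavior described above can be pictured with a small toy model. This is only a sketch for building intuition, not how DDF is implemented: the class name `DbatPool` and its methods are hypothetical, and the model assumes a connection gives up its DBAT at every commit.

```python
from collections import deque


class DbatPool:
    """Toy model of CMTSTAT=INACTIVE pooling (hypothetical, not DB2 code)."""

    def __init__(self, maxdbat):
        self.maxdbat = maxdbat    # upper bound on concurrently used DBATs
        self.active = set()       # connections currently holding a DBAT
        self.waiting = deque()    # connections queued for a DBAT

    def request(self, conn):
        """Give conn a DBAT if one is free; otherwise queue it."""
        if len(self.active) < self.maxdbat:
            self.active.add(conn)
            return True
        self.waiting.append(conn)
        return False

    def commit(self, conn):
        """Return conn's DBAT to the pool and hand it to the next waiter."""
        self.active.discard(conn)
        if self.waiting and len(self.active) < self.maxdbat:
            self.active.add(self.waiting.popleft())


pool = DbatPool(maxdbat=2)
pool.request("A")
pool.request("B")
pool.request("C")   # no DBAT free, so C waits
pool.commit("A")    # A goes inactive; C picks up the pooled DBAT
```

In this model three connections share two DBATs, which is the point of thread pooling: the number of connections can be much larger than the number of threads doing work at any instant.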
Note: All DBATs run as enclaves (pre-emptible service request blocks [SRBs]), making them System z Integrated Information Processor (zIIP)-eligible. The qualifying percentage of DRDA workloads eligible for zIIP processing increased to 60 percent with the introduction of APAR PM12256.
The DSNZPARM keywords MAXDBAT and CONDBAT, as previously mentioned, are used to manage two distributed thresholds in DB2. MAXDBAT defines the maximum number of concurrent remote DDF threads (DBATs) allowed in DB2’s DBM1 address space, also referred to as the database services address space or advanced database management facility (ADMF). Its default is 200 (up from the very old DB2 V7 default of 64) and its maximum possible value is 19,999 (an increase from the DB2 9 maximum of 1,999).
The value specified for MAXDBAT when combined with the value specified for CTHREAD (max users) must have a sum less than or equal to 20,000 (2,000 in DB2 9). When DB2 receives a connection request, how that request is satisfied is dependent on the setting of CMTSTAT (described previously). For CMTSTAT=INACTIVE, the request is satisfied by a DBAT from the pool. When the thread goes inactive (normally at a commit), the DBAT is returned to the pool for use by another connection. Note that setting MAXDBAT=0 is a good way to restrict distributed transactions to a particular data sharing member. Remember, each data sharing member has its own unique DSNZPARM member, or at a minimum, the ability exists to set up each member with its own unique DSNZPARM member. A unique ZPARM member isn’t a DB2 requirement. Note that none of these defaults and maximums change in DB2 11. Figure 1 provides a comparison of the min, max and defaults for the keywords discussed here and how they will change when moving between DB2 9, DB2 10 and DB2 11.
CONDBAT, on the other hand, defines the maximum number of remote connections that DDF will allow. The CONDBAT default is 10,000 (also up from the DB2 V7 default of 64) and its maximum possible value is 150,000. CONDBAT needs to be greater than or equal to MAXDBAT. If MAXDBAT is set to 0 (zero), then CONDBAT will be treated as 0 (zero). If the CONDBAT value is reached or is set to 0 (zero), connection requests are rejected. CONDBAT should be set high enough that connection requests aren’t rejected; maintaining many connections isn’t that expensive.
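The limits quoted for MAXDBAT, CONDBAT and CTHREAD are easy to check mechanically before an assembly and link. The following is a minimal sketch that encodes only the rules stated in this article; the function names are hypothetical, and `thread_limit` is an assumed parameter (20,000 for DB2 10 and DB2 11, 2,000 for DB2 9).

```python
def check_ddf_settings(maxdbat, condbat, cthread, thread_limit=20_000):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    # MAXDBAT tops out one below the combined thread limit (19,999 or 1,999).
    if not 0 <= maxdbat <= thread_limit - 1:
        problems.append(f"MAXDBAT must be between 0 and {thread_limit - 1}")
    if not 0 <= condbat <= 150_000:
        problems.append("CONDBAT must be between 0 and 150000")
    # MAXDBAT and CTHREAD together may not exceed the combined limit.
    if maxdbat + cthread > thread_limit:
        problems.append(f"MAXDBAT + CTHREAD must be <= {thread_limit}")
    if condbat < maxdbat:
        problems.append("CONDBAT must be >= MAXDBAT")
    return problems


def effective_condbat(maxdbat, condbat):
    """CONDBAT is treated as 0 when MAXDBAT is 0."""
    return 0 if maxdbat == 0 else condbat


print(check_ddf_settings(maxdbat=200, condbat=10_000, cthread=200))
print(check_ddf_settings(maxdbat=300, condbat=200, cthread=200))
```

The first call uses the article’s defaults for MAXDBAT and CONDBAT and reports nothing; the second reports that CONDBAT is smaller than MAXDBAT. The CTHREAD value of 200 is just an illustrative input.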
Any inbound distributed access to DB2 for z/OS requires a DDF connection and a DB2 DBAT. The combination of CMTSTAT=INACTIVE, MAXDBAT > 0 and CONDBAT > MAXDBAT allows many connections to share fewer DBATs. Commit allows the DBAT to be returned to the pool. This matters because thread creation is an expensive process and a DBAT consumes approximately 200KB in the DBM1 address space, compared with a connection that uses only approximately 7.5KB of storage in the DDF address space. Reusing threads (DBATs) avoids the continuous creation and takedown of threads (DBATs), saving valuable DB2 CPU and storage resources and potentially improving performance.
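To put the storage figures in perspective, here is a back-of-the-envelope calculation. It’s only a sketch: the 200KB and 7.5KB values are the approximations quoted above, the function name is hypothetical, and real storage use varies with workload.

```python
DBAT_KB = 200.0        # approximate DBM1 storage per DBAT (from the text)
CONNECTION_KB = 7.5    # approximate DDF storage per connection (from the text)


def storage_kb(connections, dbats):
    """Return (DBM1 KB used by DBATs, DDF KB used by connections)."""
    return dbats * DBAT_KB, connections * CONNECTION_KB


# 10,000 connections sharing a pool of 200 DBATs
pooled = storage_kb(connections=10_000, dbats=200)

# The same 10,000 connections if each one needed its own DBAT
unpooled = storage_kb(connections=10_000, dbats=10_000)

print(pooled)
print(unpooled)
```

With these approximations, the pooled case needs about 40,000KB in DBM1 plus 75,000KB in the DDF address space, while one DBAT per connection would need about 2,000,000KB in DBM1 alone.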
If you’re running CMTSTAT=ACTIVE, then CONDBAT is ignored and MAXDBAT is the only value used to control both concurrent connections as well as concurrent active DBATs. In addition, regardless of CMTSTAT setting, cursors defined with WITH HOLD and packages bound with KEEPDYNAMIC (YES) won’t create an inactive connection or inactive DBAT.
Note: Both MAXDBAT and CONDBAT can be changed online using the –SET SYSPARM after an assembly and link of the subsystem parameter’s module (usually DSNZPARM). However, modifying CMTSTAT still requires the DB2 subsystem to be recycled to activate a changed setting.
Subsystem Parameter IDTHTOIN
IDTHTOIN, set on the DSN6FAC macro or with the IDLE THREAD TIMEOUT field on the installation panels, controls how long an idle active thread will hang around before it’s canceled. IDTHTOIN is often misinterpreted. Occasionally, this DSNZPARM keyword will be associated only with threads defined with CMTSTAT ACTIVE. This is inaccurate. All threads are active when they’re performing work, whether CMTSTAT is set to ACTIVE or INACTIVE. IDTHTOIN won’t cancel an inactive thread. Because KEEPDYNAMIC YES forces a thread to remain active, that thread may or may not be canceled, depending on the setting of IDTHTOIN.
Prior to DB2 V8, IDTHTOIN defaulted to 0 (zero). Setting IDTHTOIN to 0 (zero) tells DB2 to never cancel a thread no matter how long it hangs around idle. This setting is strongly suggested for an SAP implementation. However, in most environments, this keyword should usually be set to a value greater than 0 (zero). In DB2 V8, the default was changed to 120 seconds and remains at 120 seconds as the default in DB2 11. Not only does 120 seconds work well as the DB2 default, it’s also the lowest value that should be specified. DB2 checks to see if a thread is idle on average about every two minutes or 120 seconds. Therefore, it makes little sense to set this keyword to anything less than 120 seconds. In fact, it isn’t unusual to see IDTHTOIN set to values as high as 300 to 600 seconds, the range often suggested when setting up this keyword.
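The reasoning about values below 120 seconds can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that the idle check runs at fixed 120-second ticks starting at time zero; the text says only that the check happens about every two minutes on average, and the function name is hypothetical.

```python
def first_cancel_time(idle_since, idthtoin, check_interval=120):
    """Return the first check tick at which an idle thread would be canceled.

    All times are in seconds. Returns None when idthtoin is 0, because a
    setting of 0 means idle threads are never canceled.
    """
    if idthtoin == 0:
        return None
    earliest = idle_since + idthtoin
    # Round up to the next check tick using integer ceiling division.
    ticks = -(-earliest // check_interval)
    return ticks * check_interval


# A 30-second timeout still waits for the 120-second check.
print(first_cancel_time(idle_since=10, idthtoin=30))

# A 120-second timeout is noticed at the second check.
print(first_cancel_time(idle_since=10, idthtoin=120))
```

Under this assumption, a thread that goes idle at second 10 with IDTHTOIN=30 isn’t canceled until second 120, so it has been idle for 110 seconds, not 30. That is why a value smaller than the check interval buys very little.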
Note: IDTHTOIN can also be changed online using the –SET SYSPARM after an assembly and link of the subsystem parameter’s module.
Connection Management Improvements
Three connection management areas of concern are addressed in DB2 10 by APAR PM43293. First, DB2 needed a better way to manage and observe connection behavior over and above what was provided by the DSNZPARM keywords MAXDBAT and CONDBAT. Next, the behavior of KEEPDYNAMIC (YES) had to be adjusted for threads that could end up running for a longer period of time. Finally, workload manager (WLM) needed to be aware of how DBAT processing was proceeding, including additional messaging to avoid any “surprises.”
All three concerns were addressed in December 2012 by APAR PM43293, including the addition of two new DSNZPARM keywords on macro DSN6FAC: MAXCONQN and MAXCONQW. The challenge is placing controls on a connection waiting and queuing for a DBAT to address these concerns. Prior to applying the aforementioned APAR, these concerns couldn’t be addressed.
Why might it be important to have some control over queuing? It’s possible that if DBATs aren’t available to satisfy requests, DB2 could reach its CONDBAT (maximum number of remote connections) or MAXDBAT (maximum number of DBATs) limits, causing new connection requests to be rejected (00D31034). If MAXDBAT is reached before CONDBAT, the TCP/IP socket could be marked for clean up. Unfortunately, that requires a DBAT. With that, hopefully, the potential problem starts to become clearer.
The two subsystem parameters introduced at the end of 2012, MAXCONQN and MAXCONQW, were released to prevent a situation like this from occurring. The keyword MAXCONQN controls the number of connections that can be waiting for a DBAT to become available, whereas the DSNZPARM keyword MAXCONQW specifies the amount of time a connection can wait for a DBAT.
MAXCONQN and MAXCONQW can be set to ON, OFF or to a specific value; the default for both MAXCONQN and MAXCONQW is OFF, thus disabling the additional checking. When a value is used, MAXCONQN can be set to the maximum connections and MAXCONQW can be set to the maximum duration in seconds. If not specified or if set to their default, how queued connections are handled is unchanged from prior to applying APAR PM43293. These subsystem parameters only have an effect when defined to a member of a data sharing group.
When a connection is closed because one of the aforementioned subsystem parameters is exceeded, message DSNL030I is issued to the console with either reason code 00D31053 when MAXCONQN has been exceeded or 00D31054 when MAXCONQW has been exceeded. These messages are issued at no more than a five-minute interval to avoid flooding the z/OS console with an excessive number of messages.
Both MAXCONQN and MAXCONQW can be updated online using the assemble, link and –SET SYSPARM command process. The aforementioned keywords also won’t work for a DB2 subsystem that still uses the DSNZPARM setting CMTSTAT = ACTIVE.
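The queue controls can be summarized as a small decision function. This is a sketch under stated assumptions, not DB2’s actual logic: the function names are hypothetical, only OFF and numeric settings are modeled (what ON resolves to isn’t covered here), and the queue-depth check is assumed to be evaluated before the wait-time check.

```python
def _limit(setting):
    """Translate a MAXCONQN/MAXCONQW setting into a number, or None for OFF."""
    if setting == "OFF":
        return None
    if setting == "ON":
        raise ValueError("ON is not modeled in this sketch")
    return int(setting)


def queue_close_reason(queue_depth, waited_seconds,
                       maxconqn="OFF", maxconqw="OFF",
                       cmtstat="INACTIVE", data_sharing=True):
    """Return the DSNL030I reason code for closing a queued connection, or None."""
    # The parameters only take effect on a data sharing member
    # running with CMTSTAT=INACTIVE.
    if cmtstat != "INACTIVE" or not data_sharing:
        return None
    depth_limit = _limit(maxconqn)
    wait_limit = _limit(maxconqw)
    if depth_limit is not None and queue_depth > depth_limit:
        return "00D31053"   # MAXCONQN exceeded
    if wait_limit is not None and waited_seconds > wait_limit:
        return "00D31054"   # MAXCONQW exceeded
    return None


print(queue_close_reason(queue_depth=51, waited_seconds=0, maxconqn=50))
print(queue_close_reason(queue_depth=10, waited_seconds=31,
                         maxconqn=50, maxconqw=30))
```

With both settings left at OFF the function always returns None, matching the statement that queued connections are handled as they were before the APAR.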
The behavior of all the keywords discussed so far doesn’t have to be a mystery. There are invaluable details available about these keywords from a source that’s easy to access: the –DISPLAY DDF DETAIL command (see Figure 2).
The highlighted lines in the output are for messages DSNL090I and DSNL091I. DSNL090I describes the current settings in the DSNZPARM member for CONDBAT and MAXDBAT. Message DSNL091I has the current settings for MAXCONQN, labeled MCONQN, and MAXCONQW, labeled MCONQW.
KEEPDYNAMIC DBAT Refresh
To resolve a situation that existed when a distributed client used a package bound with KEEPDYNAMIC(YES) that could potentially remain active with the connection for long periods of time, an enhancement referred to as “KEEPDYNAMIC DBAT refresh” was introduced at the end of 2009. Prior to this enhancement, KEEPDYNAMIC(YES) could allow a thread’s storage footprint to continually increase, sometimes resulting in the fragmentation of the DBM1 address space’s storage. This situation could eventually lead to application failures due to storage shortages. Not using KEEPDYNAMIC(YES) wasn’t often a viable option or a solution. The entire reason for KEEPDYNAMIC(YES) was to keep prepared statements active across commits, eliminating the need to go through the prepare process potentially multiple times, thus reducing overhead. Of course, because the statement is kept active, the thread is never returned to the pool, preventing it from being eligible for timeout.
If KEEPDYNAMIC DBAT refresh is active (defined as the only thing keeping the connection active and preventing the DBAT from being pooled), DDF can terminate a DBAT and connection if it has been used for more than one hour or has been idle for more than 20 minutes. There are a few other things that must be in effect for all this to happen. For example, CMTSTAT=INACTIVE must be in effect and the client must support Sysplex Workload Balancing and/or Seamless Failover, plus a few other things that are detailed in APAR PK69339 along with examples of problematic situations that KEEPDYNAMIC DBAT refresh may help resolve.
Two reason codes also come with the other changes in support of KEEPDYNAMIC refresh. They have been added to message DSNL027I for a connection/thread termination when KEEPDYNAMIC refresh is enabled:
• 00D3003E: A connection/thread has been used for more than one hour.
• 00D3003F: A connection/thread has been idle for more than 20 minutes.
For the two aforementioned conditions, connection/thread termination won’t occur if the connection/thread is actively processing a transaction or holding a resource past a commit. DSNL027I also won’t be issued more than once in a five-minute interval if issued from the same client IP address.
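The termination rules above fit in a few lines of logic. This is a minimal sketch, not DB2’s implementation: the function and parameter names are hypothetical, and checking the one-hour condition before the 20-minute condition is an assumption made only so the function returns a single code.

```python
USED_LIMIT_SECONDS = 60 * 60    # used for more than one hour
IDLE_LIMIT_SECONDS = 20 * 60    # idle for more than 20 minutes


def refresh_reason(used_seconds, idle_seconds,
                   in_transaction=False, holds_resource_past_commit=False):
    """Return the DSNL027I reason code for a KEEPDYNAMIC refresh, or None."""
    # No termination while work is in flight or a resource is held past commit.
    if in_transaction or holds_resource_past_commit:
        return None
    if used_seconds > USED_LIMIT_SECONDS:
        return "00D3003E"
    if idle_seconds > IDLE_LIMIT_SECONDS:
        return "00D3003F"
    return None


print(refresh_reason(used_seconds=3_700, idle_seconds=5))
print(refresh_reason(used_seconds=600, idle_seconds=1_300))
print(refresh_reason(used_seconds=3_700, idle_seconds=5, in_transaction=True))
```

The three calls show a thread that has been in use too long, one that has been idle too long, and one that is left alone because it’s in the middle of a transaction.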
There’s considerably more detail about these two codes in the February 2013 and later release of the DB2 10 Codes (GC19-2971), or the October 2013 or later release of the DB2 11 Codes (GC19-4053) product publications.
Member Health Reduction
Ever since z/OS 1.8, every server has had a health value associated with it that’s passed to WLM. A value of 100 (representing a percentage) indicates that all things for that server are copasetic. On the low end, a value of 0 (percent) means the server is probably non-functional. Based on this health value, WLM can make a more informed, and hopefully more accurate, routing decision. APAR PM43293 adjusted DB2’s DDF health value, reporting it to WLM immediately when the number of connections exceeds a percentage of the CONDBAT threshold. The DB2 health value is reduced to 50 percent of the current health value when the number of connections reaches 80 percent of CONDBAT, and to 25 percent of the current health value when the number of connections exceeds 90 percent of CONDBAT. Message DSNL074I is also issued to report that 80 or 90 percent of CONDBAT has been reached. Message DSNL075I is issued when the number of connections drops back below 90 or 80 percent of CONDBAT; the DB2 health value is then increased to 50 or 100 percent of the original health value, respectively, and reported back to WLM again.
Note: The changes to the DB2 health value and the DB2 messaging are only available in a data sharing environment.
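The health adjustment is simple threshold arithmetic. The sketch below makes two assumptions that go beyond the text: `base_health` is the health value before any CONDBAT-related reduction, and the boundaries are treated as “at or above 80 percent” and “strictly above 90 percent.” The function name is hypothetical.

```python
def ddf_health(base_health, connections, condbat):
    """Return the health value reported to WLM in this simplified model."""
    if condbat <= 0:
        raise ValueError("CONDBAT must be greater than 0 for this calculation")
    # Integer comparisons avoid floating-point rounding at the thresholds.
    if connections * 10 > condbat * 9:      # above 90 percent of CONDBAT
        return base_health * 0.25
    if connections * 10 >= condbat * 8:     # at or above 80 percent of CONDBAT
        return base_health * 0.5
    return base_health


for connections in (5_000, 8_000, 9_500):
    print(connections, ddf_health(100, connections, condbat=10_000))
```

With CONDBAT at its default of 10,000 and a starting health of 100, the model reports 100 at 5,000 connections, 50 at 8,000 and 25 at 9,500.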
To check on DDF’s health value, view the DSNV507I message that’s displayed as part of the –DISPLAY THREAD(*) TYPE(SYSTEM) command and the DSNL094I message displayed as part of the output from the –DISPLAY DDF DETAIL command. A sample of the DSNL094I message is contained in the –DISPLAY output.
The best resource for additional information about the subsystem parameters discussed here is the DB2 11 Installation and Migration Guide (GC19-4056). Of course, the DB2 11 for z/OS Information Center available on the Web contains the most current DB2 product details. There are also a couple of excellent Redbooks available to satisfy your distributed curiosity. Check out DB2 9 for z/OS: Distributed Functions (SG24-6952-01) and Jim Pickel’s DB2 9 for z/OS Data Sharing: Distributed Load Balancing and Fault Tolerant Configuration (REDP-4449). Although both Redbooks are at a lower DB2 release level, they’re still excellent DB2 for z/OS distributed processing resources.