Nov 10 ’14
Demystifying Autonomic Statistics for DB2
Over the past several years, autonomic statistics has been a much-discussed feature of DB2 10 and 11. The vast majority of customers we talk with are very excited about this feature because it solves the often-asked question: Why can’t DB2 be smart enough to automatically execute RUNSTATS when required? However, after revisiting with some of these customers, we were surprised to learn that once the excitement died down, none had implemented autonomic statistics. Most of the feedback pointed to the implementation being confusing and cumbersome.
So, unless you’re using a tool, such as the IBM DB2 Automation Tool, which makes the implementation easier, you will need to learn about stored procedures, new DB2 tables, user-defined functions (UDFs) and the UNIX cron date and time format. Your interaction with the process mostly involves four tables, two UDFs, three stored procedures and DB2’s ADMT started task or your operating system scheduler.
Let’s take a look at how to implement and monitor autonomic statistics.
Stored procedures can be challenging unless you already have experience with them or program in C, Java or some other language. As part of the implementation, you will run one stored procedure via a call to add another stored procedure. Based on the results, you may run a third stored procedure. Most of the examples in the DB2 manuals are in C and Java and aren’t direct calls to the stored procedure. Unfortunately, you can’t use a batch TMP to call a stored procedure, but there are alternatives: IBM Data Studio or IBM Optim Query Tuner. With these products, you can call a stored procedure using the Data perspective followed by executing Run SQL, which invokes a SQL script. The call statement and the results will be displayed on the same screen and can be saved.
Stored procedures can also be very finicky. You must be careful when coding quotes (‘ single vs. ‘’ double-single vs. “ double). Double-single and double quotes in most instances look the same. Parts of the stored procedures will only work with one type but not the other. You may get an error message that doesn’t explain the reason for the error. However, changing a double quote to a single quote or vice versa may help resolve errors. When you see userid and password in the stored procedure syntax, you must surround with quotes a valid TSO userid and password that has enough authority.
The Install Process
The first part of the process is to determine when RUNSTATS will execute using table SYSIBM. SYSAUTOTIMEWINDOWS. If you’re installing DB2 10 or a subsequent release, DSNTIJTC creates SYSAUTOTIMEWINDOWS. If you’re migrating to DB2 10, DSNTIJEN (ENFM) creates it in step ENFM0001 – ENFM start. It’s part of table space SYSTSATW with index DSNTWX01. If DB2 is running V10 NFM or higher and SYSIBM. SYSAUTOTIMEWINDOWS is empty, DSNTRIN in job DSNTIJRT will initialize it with seven rows of data. Rows inserted can be for day of the week or month with start and end times, number of parallel RUNSTATS tasks and SSID to execute on. If the rows exist, there’s no reason to manually add the rows; you’re ready to run unless you want to modify the schedule. By default, this table already has seven rows inserted that tell the scheduler RUNSTATS can execute any day of the week (one row for each day of the week) and at any time, with a maximum of one task (don’t run in parallel). The WINDOW_ID column is 1 through 7, one per row.
The next step is adding and setting up the stored procedures. The first stored procedure, ADMIN_UTL_MONITOR, determines which objects require RUNSTATS based on out-ofdate, missing and inconsistent statistics. Alerts are written to table SYSIBM. SYSAUTOALERTS and it will schedule another stored procedure, ADMIN_UTL_ EXECUTE, to execute RUNSTATS based on the alerts. Options for ADMIN_UTL_ MONITOR include a schedule when to run, any object restrictions in terms of include or exclude, the statistics scope, if it is to run standalone, RUNSTATS sampling and thresholds. Object restrictions, for example, can exclude for monitoring any database starting with DSNDB01 (DB2 Directory) and DSNDB06 (DB2 Catalog). Another example: Only include for monitoring any database with the name TESTTHIS having table spaces starting with ‘AUTOST’. Three scopes— BASIC, PROFILE or PROFILECONSISTENCY— tell the monitor how detailed it should be when looking to resolve problems. Set stand-alone=yes if you would like to review the alerts and not execute RUNSTATS or you’re using a different scheduler outside of the ADMT started task and manually want to schedule ADMIN_ UTL_EXECUTE. The thresholds deal with RTS as well as inconsistencies, depending on the scope used. For the RTS, only SYSTABLESPACESTATS is interrogated, not SYSINDEXSPACESTATS.
Autonomic statistics first made its debut in DB2 LUW, which is the reason the UNIX cron date and time format are used for the stored procedures. Table SYSIBM. SYSAUTOTIMEWINDOWS, on the other hand, tells the scheduler when to execute RUNSTATS and uses the regular date and time format. Additional information on autonomic statistics can be found in the DB2 LUW manuals. For example, for the UNIX cron format, review DB2 Version 9.5 for Linux, UNIX and Windows, Database Administration. More than one monitor can be set. Figure 1 shows two examples: ADMIN_UTL_MONITOR_1, which has a scope of BASIC and runs daily every 30 minutes on the hour, and ADMIN_UTL_ MONITOR_2, which has the more extensive scope of PROFILE-CONSISTENCY and runs daily only at 1 a.m.
You can also schedule the monitor stored procedure to execute at different times for different objects. For example, let’s say we want all eligible objects to execute RUNSTATS 24 hours a day, seven days a week, with two exceptions: Eligible objects for database ABC should only execute from 1 a.m. to 3 a.m., and from 7 p.m. to 9 p.m. for database XYZ. One way of accomplishing this is to have three monitor stored procedures using the restrict-ts parameter: one that executes at periodic intervals for all objects, excluding databases ABC and XYZ; one that’s used for database ABC only with a specific timeframe; and one that’s used for database XYZ only with a specific timeframe.
You must use stored procedure ADMIN_ TASK_ADD to first add ADMIN_UTL_ MONITOR to execute. In this case, you’re calling one stored procedure to add another stored procedure. Let’s say, for example, you want to run ADMIN_UTL_MONITOR_2 daily at 6 a.m. instead of at 1 a.m. In this case, you need to call a different stored procedure, ADMIN_UTL_UPDATE. You will notice when looking up the syntax that unlike ADMIN_UTL_ADD, ADMIN_UTL_ UPDATE doesn’t have parameters for userid and password. The admin scheduler checks the password at ADMIN_TASK_ADD time. If it’s valid, it stores the task in the task list without the password. Internally, a different approach is used to switch to the context of the user (it runs APF-authorized and in key 0), so there’s no need to know the password of the user. Scheduled jobs shouldn’t be affected when a user changes his or her password.
You’re now running two monitor stored procedures daily: ADMIN_UTL_ MONITOR_1 runs every 30 minutes on the hour with a basic scope and ADMIN_UTL_ MONITOR_2 runs once a day with a detailed scope. Let’s say you decide you no longer need to run the more detailed monitor. In this case, you call a different stored procedure: CALL sysproc.admin_task_remove(‘ADMIN_UTL_ MONITOR_2’,0,’’). Two UDFs ADMIN_ TASK_LIST and ADMIN_TASK_STATUS are provided to view the status of the stored procedures. After the removal of ADMIN_ UTL_MONITOR_2, you will see the partial output in Figure 2.
Notice that ADMIN_UTL_MONITOR_2 isn’t listed, but ADMIN_UTL_MONITOR_1 and ADMIN_UTL_EXECUTE are. The full output will include such things as the last execution date and time. If you were to remove ADMIN_UTL_MONITOR_1 as well, you would still see a listing in both SELECTs for ADMIN_UTL_EXECUTE even though it’s no longer running. The last run date and time will show the time just prior to the removal of ADMIN_UTL_MONITOR_1 or the last time it was invoked to run based on the monitor results.
Alerts from the ADMIN_UTL_MONITOR stored procedure are written to table SYSIBM. SYSAUTOALERTS, which feeds the input for stored procedure ADMIN_UTL_EXECUTE. You don’t use ADMIN_UTL_ADD to add ADMIN_UTL_EXECUTE unless you’re running with the stand-alone=yes option, which will be discussed later. The specifics of the RUNSTATS execution output from stored procedure ADMIN_UTL_EXECUTE are written to table SYSIBM.SYSAUTORUN_HIST.
The third step (optional) is to add a stored procedure to clean up these two tables based on a date criteria. For example, Figure 3 shows you how to remove rows on the first day of the month that are more than 30 days old.
When using the ADMT started task to schedule autonomic statistics, just the fact the STC is running isn’t enough; you must also run the optional install job DSNTIJRA to provide the RACF security for pass tickets; otherwise, you will see the partial output in Figure 4.
Notice that ADMIN_TASK_LIST shows that ADMIN_UTL_MONITOR is scheduled. However, from ADMIN_TASK_STATUS, you find that pass tickets can’t be generated because DSNTIJRA wasn’t run. Another way of seeing if the monitor and execute stored procedures ran is to issue DB2 command DISPLAY PROCEDURE. Figure 5 shows an example after DSNTIJRA was run.
Another useful feature beginning with DB2 10 lets RUNSTATS use profiles that are stored in table SYSIBM.SYSTABLES_ PROFILES. RUNSTATS can use parameter SET PROFILE, USE PROFILE UPDATE PROFILE and DELETE PROFILE. Data Studio and Optim Query Tuner allow you to use the Stats Advisor, which provides more accurate statistics gathering for the optimizer to choose a better access path. The output from Stats Advisor can be a page or more based on some queries. Instead of executing RUNSTATS with all the subparameters required, you can execute it once, adding the SET PROFILE parameter, and then use parameter USE PROFILE. You can execute RUNSTATS with the PREVIEW option, whereby DB2 only prints the stats profile for each table to SYSPRINT, but normal utility execution doesn’t take place. An alternative is to execute SELECT * FROM SYSIBM. SYSTABLES_PROFILES. Autonomic statistics will use profiles if already set. If not, the process will add an entry into SYSIBM.SYSTABLES_PROFILES for an object requiring RUNSTATS. As part of this process, INDEX(ALL) will be added to the profile table even when an index doesn’t exist in case one is added in the future. The process doesn’t replace Stats Advisor, nor does it integrate with DB2 11 SYSIBM. SYSSTATFEEDBACK and DSN_STAT_ FEEDBACK. For example, you have a column on zip code where 99 percent of the rows are for 11234 and 1 percent is for 23190. Autonomic statistics doesn’t pick up the variation; you still need to run Stats Advisor or use SYSIBM.SYSSTATFEEDBACK and DSN_STAT_FEEDBACK.
Test to illustrate autonomic statistics—five table spaces with no indexes—JOHNITB4 was manually added to table SYSTABLES_ PROFILES via RUNSTATS SET PROFILE without INDEX(ALL), all others automatically added by autonomic statistics:
• JOHNITS1 inserted one row followed by regular RUNSTATS TABLE(ALL); the result: Auto RUNSTATS wasn’t executed.
• JOHNITS2 inserted many rows without any RUNSTATS; the result: Auto RUNSTATS executed.
• JOHNITS3 inserted many rows with a variation, 99 percent zip code 11234, 1 percent with 23190 followed by regular RUNSTATS TABLE(ALL); the result: Auto RUNSTATS didn’t execute. Auto stats doesn’t pick up the variation, use Stats Advisor or SYSIBM.SYSSTATFEEDBACK and DSN_STAT_FEEDBACK.
• JOHNITS4 inserted many rows with a variation, 99 percent zip code 11234, 1 percent with 23190 followed by special RUNSTATS with COLGROUP for the zip code and SET PROFILE, then double the size of the table with no RUNSTATS; the result: Auto RUNSTATS executed.
• JOHNITS5 inserted many rows followed by regular RUNSTATS TABLE(ALL), then DELETE from table (mass delete) with no RUNSTATS; the result: Auto RUNSTATS didn’t run. WARNING: RUNSTATS should have run because RTS shows mass deletes>0. See APAR PM95437. Without this fix, autonomic statistics is ignoring table spaces with an RTS datasize of zero.
Figure 6 shows partial results from SELECT *. Autonomic statistics and manual RUNSTATS can be run together, but special considerations must be made for the profile used. For more information, review the DB2 Utilities manual section titled “Combining autonomic and manual statistics maintenance.”
You can use your operating system’s scheduler instead of DB2’s built-in ADMT started task for autonomic statistics. In this case, pass the DB2 parameters to your operating system’s scheduler and schedule the ADMIN_UTL_MONITOR and the ADMIN_ UTL_EXECUTE stored procedures with the standalone parameter. Be aware that in this case, ADMIN_UTL_EXECUTE can’t be scheduled by ADMIN_UTL_MONITOR or ADMIN_UTL_EXECUTE to resolve the alerts. You must schedule ADMIN_UTL_ EXECUTE to run to resolve alerts.
You can schedule autonomic statistics in a data sharing environment. For the SYSAUTOTIMEWINDOWS table, specifying NULL for DB2_SSID would allow stored procedures ADMIN_UTL_MONITOR and ADMIN_UTL_EXECUTE to run on any member. Each stored procedure can run on different members. Keep in mind, ADMT’s TASKLIST data set and database DSNADMDB are shared by all members; therefore, all members know that one of the members ran the task. For a different take, let’s say you have a data sharing group with three members: DB2A, DB2B and DB2C. Let’s say you want ADMIN_UTL_MONITOR to run on DB2B only, but ADMIN_UTL_ EXECUTE to run on any member. For the SYSAUTOTIMEWINDOWS table, specifying DB2B for DB2_SSID would allow stored procedure ADMIN_UTL_MONITOR to run on member DB2B only, but ADMIN_UTL_ EXECUTE will run on any member. If you want both stored procedures ADMIN_UTL_ MONITOR and ADMIN_UTL_EXECUTE to run on DB2B only, invoke both with standalone= yes and use ADMIN_UTL_ADD to manually run ADMIN_UTL_EXECUTE. In this case, ADMIN_UTL_EXECUTE will require very low overhead to periodically check the window set in SYSAUTOTIMEWINDOWS.
Considerations for Autonomic Statistics
When looking at using autonomic statistics, it’s important to consider:
• On what schedule should RUNSTATS potentially execute?
• What happens if a RUNSTATS starts near the end of the time schedule and now has to run for an extensive period?
• How many RUNSTATS should run simultaneously?
• If running in a data sharing environment, on which member should RUNSTATS and the monitor execute?
• How often should the monitor stored procedure run?
• Are there objects that require special statistics?
• Which objects should be included or excluded for analysis?
• Should objects recommended for RUNSTATS execute or just be reported on?
• Which statistics scope should be run on which schedule?
• Will the ADMT scheduler be used or an operating system one?
• How often should the log file and alert history be cleaned up?
• How will the log file and alert history be reviewed for unsuccessful executions? What action should be taken?
• RUNSTATS of objects helped (or hurt) dynamic SQL, but what will you do about static and REBIND?
• Is DSNACCOX also run? If it is, how will it be used alongside autonomic statistics?
• How will I deal with auto stats running at the same time as my maintenance window for REORG then inline stats? How about scheduled RUNSTATS?
Clearly, autonomic statistics has many moving parts, but it does provide the answer to the often asked question: Why can’t DB2 be smart enough to automatically execute RUNSTATS when required? Now it can.
Although documentation is limited for setting up and using autonomic statistics, some helpful resources include:
• DB2 Installation and Migration Guide; review sections regarding the auto stats stored procedures.
• DB2 Managing Performance; see section “Automating statistics maintenance.”
• DB2 SQL Reference; see section “DB2 catalog tables.” Review SYSIBM. SYSAUTOALERTS, SYSIBM. SYSAUTORUNS_HIST and SYSIBM. SYSAUTOTIMEWINDOWS and table functions ADMIN_TASK_LIST and ADMIN_TASK_STATUS.
• DB2 Utilities manual; see section “Combining autonomic and manual statistics maintenance.”
• DB2 10 for z/OS Technical Overview Redbook; see sections on DB2-supplied stored procedures, administrative task scheduler, administration enablement, DB2 statistics routines, autonomic statistics, using RUNSTATS profiles, updating RUNSTATS profiles, deleting RUNSTATS profiles, and combining autonomic and manual statistics maintenance.
• DB2 Version 9.5 for Linux, UNIX and Windows, Database Administration; see section “Administrative Task Scheduler routines and views.”