Throughout the process, we had to remember that the compiler or the environment may have inserted “invisible” functions or hidden code into the source; this meant we had to look beyond the source code. For example, while evaluating the high Task Control Block (TCB) switching rate of a CICS task, we learned that the culprit was a third-party instrumentation facility pulled in at execution time.
We took advantage of Parallel Sysplex for performance and load balancing and to lower third-party software costs by routing workload to the machine with the required software resource. We made minor changes to our applications to accomplish this, but we achieved a significant financial return.
Additionally, some of our databases don’t participate in database federation, forcing us to coordinate multi-phase commits manually. When a commit failed partway through, we had to retrace many steps to apply manual fixes, which called data integrity into question. Even though data integrity was never actually compromised, we considered this a performance issue because of the loss of productivity it caused. We carefully designed a homegrown agent that oversees the multiple phases of a commit and triggers an undo process in case of failure.
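The agent’s control flow can be sketched roughly as follows. This is a minimal illustrative stand-in, not our production code: all names are hypothetical, and each phase is assumed to pair an update with a compensating undo action.

```python
# Sketch of a commit-oversight agent: each phase pairs an "apply" step
# with a compensating "undo" step; if any phase fails, the phases already
# completed are undone in reverse order. All names are illustrative.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Phase:
    name: str
    apply: Callable[[], None]   # performs the update on one database
    undo: Callable[[], None]    # compensating action if a later phase fails


def run_commit(phases: List[Phase]) -> bool:
    completed: List[Phase] = []
    for phase in phases:
        try:
            phase.apply()
            completed.append(phase)
        except Exception:
            # Trigger the undo process: compensate in reverse order so no
            # database is left holding a partially applied unit of work.
            for done in reversed(completed):
                done.undo()
            return False
    return True
```

The key design point is that the agent, not a human operator, retraces the steps, so a failure no longer turns into a manual forensic exercise.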
Our back-end database is DB2 for z/OS, which manages about 10TB of data. Poorly performing SQL statements are the easiest to identify and correct. DB2 is well-equipped with accounting information at the correlation-ID level, package level, or even more granular levels. Only when an SQL solution is insufficient do we resort to other approaches.
Not all performance solutions are software solutions; some are procedural. Examples include stacking up non-critical or time-insensitive updates for non-peak hours or running various audit reports together by sweeping the database just once. It may sound like a clear-cut solution, but it’s difficult to get a consensus when working within the layers of communication and time zone differences typical of a global community.
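The single-sweep idea behind the combined audit reports can be illustrated with a short sketch. The names here are hypothetical; the point is simply that one pass over the data feeds every report’s accumulator, rather than each report scanning the database on its own.

```python
# Illustrative sketch: instead of sweeping the database once per audit
# report, make a single pass and let each report accumulate what it
# needs from every row. Field and report names are invented.
from typing import Callable, Dict, Iterable

Row = Dict[str, object]


def sweep_once(rows: Iterable[Row],
               reports: Dict[str, Callable[[Row], None]]) -> None:
    for row in rows:                    # one scan of the data
        for accumulate in reports.values():
            accumulate(row)             # every report sees every row
```

With N reports, this replaces N full scans with one, at the cost of requiring the report owners to agree on a common schedule, which is exactly the consensus problem noted above.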
We also faced some interesting business functionality issues. We collect data from different sources (converging on one key) and distribute the same data to a different set of clients (diverging on another key). There’s an authentication process validating who has clearance to input the data and a filtering process at the distribution end. Both are resource-intensive procedures because of the granularity of authentication, but we proved we could reduce cost significantly by rewriting the algorithms. It was risky to change the decision-making modules that are the backbone of the business, but there was a greater risk in not trying. After careful evaluation and thorough testing, we rolled out the new algorithms and they were successful.
We solved some design issues with historical databases that mirrored operational database design. By changing the design, we could curtail exponential data growth. We keep dated information in our database, and old attributes become outdated when new ones are made effective. The old design involved explicitly applying a discontinue date to the old attribute; the new design assumes an implied discontinue date based on the presence of a new attribute. This yielded a more than 50 percent savings in the load operation.
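The implied-discontinue-date design can be sketched as follows. This is a hedged illustration with invented field names, not our actual schema: rows carry only an effective date, and an attribute’s discontinue date is derived from the next attribute’s effective date instead of being stored and updated explicitly.

```python
# Sketch of the new design: the discontinue date is implied by the next
# attribute's effective date, so the load never has to revisit and update
# old rows. Field names are illustrative stand-ins.
from datetime import date
from typing import List, Optional, Tuple


def with_implied_discontinue(
    history: List[Tuple[date, str]]            # (effective_date, value)
) -> List[Tuple[date, Optional[date], str]]:   # (effective, discontinue, value)
    ordered = sorted(history)
    result = []
    for i, (effective, value) in enumerate(ordered):
        # The next row's effective date implicitly discontinues this one;
        # the latest attribute has no discontinue date (still current).
        nxt = ordered[i + 1][0] if i + 1 < len(ordered) else None
        result.append((effective, nxt, value))
    return result
```

Because inserting a new attribute no longer requires updating the prior row’s discontinue date, the load operation touches roughly half as many rows, which is consistent with the savings described above.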
Our tuning toolkit includes:

- IBM’s Tivoli Decision Support for z/OS for general trend analysis
- PLAN_TABLE as a repository of EXPLAIN output for static SQL
- The DSN dynamic statement cache for dynamic SQL
- The Visual Explain option of IBM’s Data Studio for a graphical view of the SQL access path
- Compuware Corp.’s STROBE for in-depth analysis after we target a process for tuning
We extracted accounting information from the Tivoli database using native SQL in a format that’s comparable to a DB2PM report. We used thread-level and package-level details to pinpoint likely candidates, then we used STROBE and Data Studio for in-depth analysis.
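The candidate-selection step can be sketched in miniature. The record layout below is an invented stand-in for DB2 accounting data, not the actual Tivoli schema; the point is only that rolling CPU up by package (or correlation ID) and ranking the totals surfaces the processes worth deeper analysis.

```python
# Hypothetical sketch: roll up accounting rows by a grouping key and rank
# by aggregate CPU to pick tuning candidates. The field names here are
# illustrative, not the real DB2 accounting record layout.
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def top_cpu_consumers(records: Iterable[Dict[str, object]],
                      key: str = "package",
                      n: int = 5) -> List[Tuple[str, float]]:
    cpu_by_key: Dict[str, float] = defaultdict(float)
    for rec in records:
        cpu_by_key[str(rec[key])] += float(rec["cpu_seconds"])  # aggregate CPU
    # Highest aggregate CPU first: these become the in-depth analysis targets.
    return sorted(cpu_by_key.items(), key=lambda kv: kv[1], reverse=True)[:n]
```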