Classifying Tables

To help determine whether a table belongs in a DBMS or in memory, it’s useful to think in terms of process-related data and data that’s processed. Process-related data is information that tailors the process for a specific set of circumstances. Data to be processed consists of the primary input and output. With this in mind, we can group the entire data content used by applications into three classes: database data, reference data, and temporary data.

Database Data Class

DB2, IMS, and VSAM contain data that’s processed and represents the bulk (80 percent) of the data that applications process. This data is typically randomly processed. Examples are customers updating their own profiles online, or an entire DBMS table processed sequentially in a batch application to create monthly statements. Because data is simultaneously accessed and updated, data hardening using I/O is unavoidable and necessary. Access performance isn’t as critical as for temporary or reference data because the same data row isn’t repeatedly accessed.

Often, only a fraction of the entire DBMS table is processed in any time frame. In a z/OS environment, DBMS data is served from a single source to ensure all applications have the latest version of a row in a table. The performance issues related to record locking have little impact as this class of data would likely not be simultaneously updated from multiple locations. For example, a credit card isn’t typically simultaneously used in multiple locations. Usually, in-memory tables can’t help improve the access performance when data updates need to be retained in permanent storage.

There are circumstances where the number of read accesses far exceeds the occasional update access. The trading price of stocks is a good example. Many transactions would be asking the price of an equity, but there may be a thousand requests asking for the current price of the equity for every time there’s a price adjustment. When the read-only accesses far exceed update accesses, in-memory tables could make a substantial performance difference if used as a write-through cache. Each time a request for the equity’s price is serviced via an in-memory table, substantial performance gains are common by avoiding the trip to the DBMS, which is 30 times slower. When an update occurs, both the in-memory copy of the table’s row and the identical row in the DBMS remain in sync.

 Reference Data Class

Reference data is generally the most-accessed data by any application because it’s process-related data. All reference data, whether used to look up prices or dynamically control the flow of a program, have one property in common: Reference data is always read-only when used in transaction processing. This property allows in-memory tables to provide up to a 30-fold performance multiple over DBMS access times and CPU usage by using pure in-memory algorithms. The greatest gain in performance of an application is achievable by taking this reference data, normally kept in a DBMS, and making it available through in-memory tables.

The 80/20 rule applies for reference data. Suppose 80 percent of your database accesses are against the 20 percent of your process-related data, the reference data. By eliminating four out of five DBMS accesses in your applications, they run in 20 percent of their original time and CPU usage. This simple change in reference table access results in serious MIPS saving.

This data can change yearly, monthly, daily, or virtually never. What distinguishes this class of data is that a table is generally treated as a set of rows that belong together. Consider a table that contains prices for the month of March. This table would be useless if it was partially updated with prices from other periods. Other examples of reference tables are process parameters data, security control tables, process control data, decision tables, rules tables, codes tables, and pull-down lists.

5 Pages