Jun 5 ’13

Traveling Through Time With Temporal Tables in DB2 10 for LUW

by Phil Gunning in Enterprise Tech Journal

DB2 10 introduces temporal table support in accordance with the SQL 2011 ANSI/ISO standard and is the first commercial database to implement temporal table support as specified by the standard. Other commercial databases have their own version of temporal table implementation; this limits application transparency and portability. The three types of temporal tables are:

• System-period temporal tables
• Application-period temporal tables
• Bitemporal tables.

Here we will discuss in detail the first two and provide an overview of bitemporal table support.

Time Travel Query

IBM uses the term Time Travel Query to describe temporal table support in DB2 10. The requirement for companies to track data over time drives the need for temporal tables. For financial service firms, this could be the need to track the value of a certain portfolio for a given period of time, track the period when certain products were offered and valid, or track the value of stocks or bonds at a specific time. The need to track can be based on business requirements, internal policies, regulatory and audit requirements, Generally Accepted Accounting Principles (GAAP) or International Financial Reporting Standards (IFRS). The use of temporal tables drastically reduces the amount of application logic previously required to implement user-defined historical tables.

Temporal tables help answer questions such as:

• Does your company have full traceability of all data changes and retroactive data corrections?
• Can you delineate the price of an item during any specific current or historical period of time?
• What were the prices of stock used in a valuation or initial public offering three years ago?
• When the company merged with another company, what were the financials at the time of the merger?
• How long was Policy B offered by the company at a certain rate?
• How much was the deductible for customer X during a time span relative to a filed claim?

System-Period Temporal Tables

System-period temporal tables manage data according to system time. Each row and its corresponding history are automatically assigned a pair of system timestamps. DB2 keeps a history of rows that have been updated or deleted over time. In accordance with the SQL 2011 ANSI/ISO standard, the DB2 10 implementation lets you go “back in time” and run queries for any chosen point in the past.

Using system-period temporal tables reduces the impact on applications that only access current data since historical data is contained in the associated history table. Inserts of new rows are faster than if the history was contained in the same table. Utilities, such as LOAD and REORGANIZATION on the current table, aren’t slowed down by the presence of historical data.Moreover, you can also:

• Have different indexes on the current table vs. history table
• Recover current data and history data independently
• Have different constraints on current table vs. history table
• Use different storage options for base and history table such as compression, range partitioning, clustering and storage group and physical location on disk.

Time periods apply to system time and business time. A period indicates the starting and ending points of a time interval. DB2 uses an inclusive/exclusive approach for implementing time periods.
DB2 10’s support for system time provides for multiple versions of your data. When you define a system-period temporal table, DB2 will automatically capture changes to the table and save the old rows to a user-created history table. The history table is a separate table with the same structure as your current table. This capability alone saves developers and DBAs countless hours of application development time. In the past, developers or DBAs had to write custom application logic, stored procedures or triggers to keep track of historical data changes in separate tables. Now, DB2 10 does it for you!

Whenever a row is updated or deleted, the “before image” of the affected row is inserted into the associated history table. This is accomplished via versioning. Timestamps are automatically generated to produce an exact history of all INSERT, UPDATE and DELETE operations that occurred against the base table. Queries with temporal references to the base table are automatically and transparently run against the history table where needed. System-period temporal tables are supported in all editions of DB2 10, including Database Partitioning Feature, PureScale and High Availability Disaster Recovery (HADR). The base or history table can be range-partitioned or Multidimensional Clustering (MDC) tables.

You can hide system-period columns so they don’t appear in SELECT * queries. Base and history tables must have the same columns. When versioning is enabled, ADD column operations are propagated to the history table. You can use the new “PERIODOVERRIDE” option of the LOAD utility to load existing timestamps into period columns. You can now use the new SYSCAT.PERIODS catalog view to query the characteristics of any temporal table defined in the database.

A new column, TEMPORALTYPE, was added to the view SYSCAT.TABLES to identify temporal tables. Table types are indicated by these codes:

• A=Application-period temporal table
• B=Bitemporal table
• S=System-period temporal table
• N=Not a temporal table.

Creating a System-Period Temporal Table

You produce a system-period table by creating a regular table with system start and system end time columns and then create a history table using the same Data Definition Language (DDL) structure or the LIKE option of the CREATE TABLE statement. Figure 1 contains the DDL used to create the ITEM system-period temporal table. After creating the base table, you create the history table (see Figure 2).

You can use the ALTER TABLE statement to enable the ITEM table as a system-period temporal table with the relationship of the ITEM_HISTORY table established.

The system_start and system_end columns are the default names DB2 uses for the start and end points. However, you can name these columns any valid column name and then use them in the PERIOD SYSTEM_TIME clause to identify the table as a temporal table. These columns use a TIMESTAMP(12) data type to provide the precision necessary to prevent duplicate rows. Time periods apply to system time and business time. A period indicates the starting and ending points of a time interval.

DB2 uses the TRANS_ID column to capture the time when execution started for a transaction that impacts the row. If multiple rows are inserted or updated within a single transaction, in one or multiple tables, then the values for the TRANS_ID column are the same for all these rows and are unique from values generated for this column by any other transaction. Period columns and transaction start-ID columns can be IMPLICITLY HIDDEN from applications.

The RESTRICT ON DROP clause on the history table prevents the history table from being implicitly dropped if the system-period temporal table is dropped. For our example, this is the ITEM table, which might be similar to one a retailer uses to track the sales price of items over time. The historical data can be the basis for reports the retailer could use for planning and reviewing sales trends based on sale price or for developing financial reports based on the item’s sale price for a period. The examples given show only the year, month and day portion of the TIMESTAMP columns.

Inserts and Updates With System-Period Temporal Tables

Inserts work as before, with DB2 generating the defaults for the period start and end columns (see Figure 3). Continuing with our example in Figure 3, SKU 18934 goes on sale; the price is updated as shown in Figure 4.

With updates to system-period temporal tables, the before image of the row is copied to the history table before the update, providing a complete before and after record of data changes without a need to use applications, programs, stored procedures or triggers.

Querying System-Period Temporal Tables

When you query a system-period temporal table, the query runs against the current data (table) only by default (see Figure 5). For DB2 to automatically and transparently access the history table, the query must specify the FOR SYSTEM_TIME clause in the query.

A query with the FOR SYSTEM TIME clause might retrieve rows from the base table, history table or both. The FOR SYSTEM_TIME clause can also specify a time range using the keywords FROM and TO, or BETWEEN and AND.

System Time Special Register

Instead of specifying the FOR BUSINESS TIME clause in the query, you can set the SYSTEM TIME SPECIAL REGISTER and existing queries need not change to access the associated history table. For example, to see what the price was on 11/14/2012, you can set this new special register to the period of interest with this statement:

“SET CURRENT TEMPORAL SYSTEM_TIME ‘11/14/2012’;

After setting the special register, the query executes automatically and transparently against the history table; the result shows the price in effect for 11/14/2012. This was 194.18 per the query in example B in Figure 5.

No Write Operations in the Past

It’s important to note, however, that when the current TEMPORAL SYSTEM_TIME special register is set to a non-null value, this blocks Insert, Update and Delete operations and LOAD utilities on system-period temporal tables. Also, you can only specify the desired system time either in the query or with the special register, but not both. Doing so will result in an error being returned.

Schema Evolution

Schema changes that can’t cause loss of history propagate automatically from the base table to the history table. Changes such as adding a new column to the base table result in the column being added automatically to the history table. Increasing the size of a VARCHAR column applies to the base table and history table. However, changes such as reducing the length of a VARCHAR column or dropping a column from the base table are blocked due to potential data loss. To make those changes, you must first stop versioning.

Here’s a performance tip: Normally, a history table experiences mostly insert and read activities. As such, you can eliminate the search for free space on inserts by altering the table with the APPEND ON option. When enabled, there’s no search for free space during insert processing; this can speed up inserts into the history table.

More About History Tables

History tables can save developers and DBAs lots of time, but be aware of these rules and restrictions:

• A history table can’t be dropped explicitly. It can only be dropped implicitly when the associated system-period temporal table is dropped.
• You can’t drop a table space containing a history table that doesn’t contain its associated system-period temporal table; instead, specify both tablespaces in the DROP statement.
• History table columns can’t be added, dropped or changed explicitly.
• A history table must not be defined as a parent, child or self-referencing in a referential constraint. Referential constraints involving the history table are restricted to prevent cascaded actions to the history table.

Business Time Temporal Tables

Application-period or business time temporal tables differ from system-period temporal tables in that they’re based on your applications’ logical notion of time. Some examples of the use of application periods are:

• Variable rate mortgage products
• Insurance policies
• Credit cards with expiration dates.

Business time periods are based on application-driven changes to the time dimension of business artifacts and are based on the applications’ logical view of time. Business time, by itself, doesn’t use a history table. All data is current (current information about the past, present or future state of the business).

Creating a Table With Business Time

To create a table with business time, specify the PERIOD BUSINESS_TIME with bus_start and bus_end columns of type DATE or TIMESTAMP. To prevent overlapping periods for the same object, specify the BUSINESS_TIME WITHOUT OVERLAPS clause on the PRIMARY KEY definition. In Figure 6, for a given time, only one salary figure can be valid for an employee.
 
An application-period temporal table can be defined so that rows with the same key don’t have any overlapping periods of BUSINESS_TIME. You achieve this by adding the BUSINESS TIME WITHOUT OVERLAPS clause to a primary key, unique constraint specification, or create a unique index statement. DB2 handles enforcement.

Inserting, Updating and Deleting Data With Business Time Temporal Tables

Inserting data into an application-period temporal table is similar to inserting data into a regular table; the only special consideration is the need for the application to include the row begin and row end values that capture when the row is valid from the perspective of the business applications. This is called the BUSINESS_TIME period. To prevent business time periods from overlapping, you can create a unique constraint or index with BUSINESS_TIME_WITHOUT_OVERLAPS specified. An insert will fail if it violates the unique constraint. Figure 7 shows the issuance of a new variable rate mortgage to a customer with ACCT_ID 4389105.

Besides the regular UPDATE statement, application-period temporal tables support time range updates where the UPDATE statement includes the FOR BUSINESS PORTION OF BUSINESS_TIME clause. A row is a candidate for update if its period begin value, period end value or both fall within the range specified in the FOR PORTION OF BUSINESS_TIME clause. In our example, the Cust_Mortgage table has an index with the BUSINESS_TIME_WITHOUT_OVERLAPS clause.

Using the FOR PORTION OF BUSINESS_TIME clause avoids period overlap problems. This clause causes rows to be changed and can result in rows being inserted when the existing time period for a row that’s being updated isn’t contained fully in the range specified in the UPDATE statement. As shown in Figure 6, the update time period contains fully the period range of the existing row. As such, the Vrate column is updated to 2.86 and the bus_start and bus_end columns remain unchanged. These rules apply to updates:

• When the BUSINESS_TIME period is contained fully in the bus_start and bus_end column values, the row is updated. The bus_start and bus_end columns remain unchanged. When the bus_end value overlaps the beginning of the specified period for a qualifying row, the row is updated. In the updated row, the bus_start value is set to the begin value of the UPDATE specified period and the bus_end value is unchanged. An additional row is inserted with the original values from the row except that the bus_end value will be the starting value of the BEGIN clause.
• When the bus_start value overlaps the end of the specified period for a qualifying row, the row is updated. In the updated row, the bus_start value is unchanged; the bus_end value is set to the end value of the UPDATE specified period. An additional row is inserted with the original values from the row, except that the bus_start value will be the ending value of the UPDATE specified period.

Continuing with our example from Figure 7, a mistake was discovered with the Vrate for acct_id 4389105 for the initial 30-day mortgage period. The account holder should have received the promotional Vrate of 2.74 for the first 30 days. The account is updated to correct this error as shown in Figure 8.

Besides the regular DELETE statement, application-period temporal tables also support time range deletes where the DELETE statement includes the FOR PORTION OF BUSINES_TIME clause. A row is a candidate for deletion if its period begin column, period end column or both fall within the range specified in the FOR PORTION OF BUSINESS_TIME clause. Delete with business time is shown in Figure 9.

As shown in Figure 9, the BUSINESS_TIME period in the DELETE FROM statement covers the entire time period, and, as a result, the row is deleted. These rules apply to deletes:

• When the BUSINESS_TIME period in the DELETE FROM statement covers the entire time period for a row, the row is deleted (see Figure 9).
• When only the bus_end value falls in the specified period, the row is deleted. A new row is inserted with the original values from the deleted row except that the bus_end value is set to the beginning value of the DELETE specified period.
• When only the bus_start value falls into the specified period, the row is deleted. A new row is inserted with the original values from the deleted row, except that the bus_start value is set to the ending value of the DELETE specified period.

Use the New BUSINESS_TIME Special Register

The fact that different time periods appear in the same table can present some challenges when writing queries and trying to keep existing application code. Just like with system time, DB2 has a special register for business time. You set the special register to the business period of interest. In our example, it’s set to ‘2011-12-28’ as follows:

“SET CURRENT TEMPORAL BUSINESS_TIME ‘2011-12-28’;

Setting the CURRENT TEMPORAL BUSINESS TIME special register doesn’t affect regular tables. Only queries against temporal tables with a BUSINESS_TIME period enabled use the time set in the special register. There’s no effect on DDL statements.

Working with the CUST_MORTGAGE table from after our DELETE in Figure 8, we issue a SELECT SQL statement:

“SELECT acct_id, bus_start, bus_end
FROM Cust_Mortgage
WHERE acct_id = 4389105”;

Since only the first row in the table includes the date as specified in the special register, only the first row is returned. You can also use the FOR_BUSINESS_TIME clause in a query to specify your period of interest:

“SELECT acct_id, bus_start, bus_end
FROM Cust_Mortgage FOR BUSINESS_TIME AS OF ‘2011-11-14’
WHERE acct_id = 4389105”;

Again, continuing with our example using the CUST_MORTGAGE table, the rows that fall within the business time period of interest are returned: the row with the bus_start date of ‘2011-10-23’ and bus_end date of ‘2011-12-23.’

DB2 supports several period specification options in the FOR BUSINESS_TIME clause:

• FOR BUSINESS_TIME AS OF <value>
• FOR BUSINESS_TIME FROM <value1> TO <value2>
• FOR BUSINESS_TIME BETWEEN <value1> AND <value2>.

Our example used the AS OF option. You can also use the other options; there’s a lot of flexibility.

When the CURRENT TEMPORAL BUSINESS_TIME special register is set to a non-null value, you can use data modification statements such as INSERT, UPDATE, DELETE and MERGE against application-period temporal tables.

Bitemporal Tables

Bitemporal tables support both system time and business time in a single table. You can manage system time with full versioning of updated and deleted rows. Queries can be along one or both dimensions and you can specify the same or different values. You can combine these options to meet your objectives.

Summary

Proper use of temporal table capabilities will save hours of development time.

System-time temporal tables let you leverage versioning and the capability to separate current and historical data. In this way, developers and DBAs can take advantage of different storage and index options for the base and history tables and placement in separate buffer pools. Temporal tables can be compressed and range-partitioned for even greater flexibility and performance.

Business time temporal tables let developers manage data according to business time and business-related application logic. This gives developers more flexibility to meet users’ needs.

Bitemporal table support lets developers and DBAs combine the best capabilities of both features for maximum flexibility.

Exploiting temporal tables requires knowing how DB2 implements inclusive-exclusive periods, the rules for updates and deletes and SQL extensions and special registers.