IT Management

How It Works With DB2

The first step is identifying long-running queries and the data accessed by those queries in your data warehouse. A set of tables related to each other (either from a logical point or by referential constraints) can be referred to as a mart. A mart is in the sweet spot of the IBM Smart Analytics Optimizer if it consists of one large fact table and more smaller dimension tables, as is typically true for star or snowflake schema data models.

Once a mart is identified, a snapshot of those tables is redundantly stored and loaded in the IBM Smart Analytics Optimizer by using UNIX System Services pipes. The unload process can be controlled either by the IBM Smart Analytics Optimizer studio, a Graphical User Interface (GUI), or IBM Smart Analytics Optimizer’s administrative stored procedures.

From a logical view, there are two different types of blades inside the IBM Smart Analytics Optimizer. From a physical perspective, all blades are identical. Most available blades are referred to as worker nodes, while a smaller number are brought up as coordinator nodes. The worker nodes actually store the data while the coordinator nodes act as an interface to DB2 for z/OS. During the unload process, the data flows to the coordinator nodes and is distributed to available worker nodes.

While being loaded in the IBM Smart Analytics Optimizer, the data is compressed and distributed among available blades, allowing for massive parallel processing of those queries accessing the data. While the data is stored in main memory, it’s also stored on a separate storage unit for failover purposes. Once the offloading process completes, the mart is ready to be queried (see Figure 2).

According to an International DB2 Users Group (IDUG) study, most data warehouses are refreshed once a day or less. Regarding this information, for most OLAP applications, querying data stored on an IBM Smart Analytics Optimizer can deliver breakthroughs in terms of performance and reduce the time needed to complete today’s business processes.

Here’s the interesting question: Are there any business questions or business cases relying on SQL statements that just didn’t complete in an appropriate time and you never moved these applications to production? It might be time to reconsider these applications.

What happens if a query accesses the data that’s also available in a mart? Because of the IBM Smart Analytics Optimizer’s deep integration into DB2 for z/OS, the optimizer makes a cost-based decision if a query will be routed to the IBM Smart Analytics Optimizer for execution or whether it will be processed in the traditional way by accessing the data pages, either if they’re located in the buffer pool or must be retrieved from DASD.

Queries being routed to the IBM Smart Analytics Optimizer are expected to benefit from a significant acceleration. Results from an early beta customer experience, presented at the Information On Demand Conference 2009 in Las Vegas, have shown a query acceleration by a factor of more than 370 for a given query. But mileage will vary, depending on individual data and queries to access the data. Internally, the IBM Smart Analytics Optimizer can be looked at as kind of an MQT. But the biggest differentiator is that MQTs are designed to satisfy a limited set of queries. These queries need to match the MQT definition in regard to possible restriction, projection, grouping, and aggregation operators while the IBM Smart Analytics Optimizer matches a much larger scope of queries because they don’t assume any of these operations.

4 Pages