Jun 16 ’09

SQL-Based Middleware: Perhaps the Right “Change”

by Mike McKee in z/Journal

If you had to guess, what was the most overused word or phrase during the 2008 Presidential Election?

• Maverick

• My friends

• Joe the plumber

• Change

• Historic.

Did you select change? The word is overused in political circles and in corporate IT. How many new initiatives, methodologies, or procedures were subject to change in IT departments last year? Gartner analyst Steve Prentice states, “People resistant to change will not survive changes to the industry” (see www.cobol.com/MainMigLeg/detart.asp?cat=training&bhcp=1).

Despite that dire prediction, legacy systems remain widespread. Many of the oldest legacy databases or file systems are widely used in Fortune 1000 companies. In fact, IMS has existed for more than 40 years. Consider these facts about IMS (from IBM Systems Magazine, “Standing the Test of Time,” April 2008):

• More than 90 percent of Fortune 1000 companies use it.

• More than $3 trillion of transactions for a single IMS customer are processed daily.

• Clients that run IMS have exceeded a workload of 100 million transactions daily.

So what’s the truth regarding the need to change? Are legacy systems obsolete, or should they endure as long as they’re effective? Steve Tockey, author of Return on Software (2003), states that after a certain time, legacy systems become too costly to keep. This is primarily due to additional requirements that the legacy system can’t perform, or a lack of capacity. However, is there a type of middleware that can extend the life of legacy systems?

Some might consider using an SQL-Based Middleware (SBM) approach. SBM’s main function is providing realtime SQL data access to legacy data. This lets newer technology (Internet/ Java/Windows) applications access legacy data as if it were any other type of contemporary database. This might preclude the need to do a costly migration from IMS to Oracle. In today’s economy, SBM might make sense.

What’s SBM?

SBM is a client/server middleware software solution providing real-time access to legacy data. For example, an Internet or desktop application issues a request to the SBM. The result would be a database action, or data returned to the calling application. Characteristics of an SBM include:

• Data dictionary that stores the data layout for supported data sources.

• SQL data access to supported data sources

• Support for standards such as Open Database Connectivity (ODBC), Object Linking and Embedding Database (OLEDB), Java Database Connectivity (JDBC), SQL, and Microsoft .NET.

The architecture is straightforward. The two main pieces (client and server) exchange messages in response to an SQL request by a client application. The server issues native calls to the legacy or other data sources. If the data source is a relational database, the message sent to it will be in SQL; otherwise, it’s sent in a native format. Figure 1 shows the data flow that occurs after an SQL request.

Figures 1, 2, 3

Figures 1, 2, 3

The first step is the SQL request. Applications issue SQL requests though interfaces such as ODBC, OLEDB, JDBC, .NET and ActiveX Data Objects (ADO). Applications also might invoke wrappers such as ADO or Language Integrated Query (LINQ) that will call a particular interface.

The SQL request application could range from an Internet Web service to a custom Visual Basic program. PHP scripts, JavaBeans, SharePoint, Microsoft SQL Server SSIS, .NET applications, and application servers are a few of the potential applications that issue SQL requests through one of those interfaces. The developer must ensure SBM supports the interface. Most SBMs have at least ODBC, which permits SQL data access for the data sources the SBM supports. If the application is from a Java source, then JDBC is required. Figure 2 shows some of the more popular formats.

The SBM (client) will evaluate the SQL request. It checks for accuracy, then develops an execution plan if no errors exist. The execution plan focuses on what types of services the SBM (server) needs to perform.

A Remote Procedure Call (RPC) to the server will follow. The RPC contains the necessary arguments the server requires. The server analyzes the request and determines its type. If it’s a select query, it will issue native calls to the data source, asking for the appropriate data. If it’s a request for VSAM file data, this means opening the VSAM file (if necessary), reading the specific record(s), and returning the data to the client.

If the request is an update/delete/ insert query, the server will issue the appropriate native call to the data source and perform the modify action. After that call, it will return the status to the SBM client, which dispatches it to the requesting client. If it were an update request for an IMS segment, this would mean finding the appropriate segment and updating the data the client passed.

The internal SBM operations to achieve this simple request can be intensive. Data translation, network operations, and SQL optimization are just a few of the tasks these middleware components must deal with.

What’s a Virtual Database?

A garage is a great place to store things, but finding things can be difficult if the garage isn’t organized. A database should let you easily find information; content should be organized so users can easily access, manage, or update it.

On systems where data files (or databases) are numerous, a common data dictionary is essential. The data dictionary is a repository SBM will use that consists of:

• Physical file names

• Database names

• Tables

• Columns

• Indexes

• Other metadata.

In an SBM, the data dictionary serves as a repository for all the metadata that represents various data sources. A virtual database is another term for this repository. For the developer, they only attach to the data dictionary and aren’t concerned with the particular details of the data sources (such as how to connect).

Transforming Hierarchical Data Structures

Certain legacy data has array data located in the record definition. For instance, a COBOL File Definition (FD) representing a VSAM file might have an array of grades embedded in a student record, an IMS segment might have child segments, or an ADABAS record might have MU or PE fields (arrays of data stored within a single record). How can this type of data structure turn into relational tables?

Figure 3 displays the transformation from a COBOL FD into relational tables. The bold fields represent the primary key for those tables. An occurs clause represents an array, and triggers a normalization activity that produces a child table. In this example, a child table (MonthlyPaymentTable) was generated that contains three fields. It contains a primary key (CustomerID, MonthID), and an additional attribute (PaymentAmount). The child table (Monthly Payment) is related to its parent (Customer) by the CustomerID field. An SQL statement that joins those two tables might be something like:

select PaymentAmount, MonthID from

CustomerTable, MonthlyPaymentTable

where:

CustomerTable.

CustomerId=MonthlyPaymentTable.CustomerId

Figure 3 shows that it’s possible to represent hierarchical data in relational tables. This opens up numerous possibilities for data sources that have embedded array structures.

Importance of SQL Data Access

When the Tower of Babel builders lost their common language, they were scattered throughout the earth. Before the creation of relational databases, data access was in a Babel state. Databases and indexed files were proprietary. However, with the standardization of SQL in 1986, many relational databases recognized that supporting this standard offered distinct advantages. SQL offered easy access to data with few statements. For instance, to access an order record, the following query could be submitted:

Select order_id, customer_id, order_amt from

orders where order_id='111222'

If the goal was to create a query that joined two tables, the SQL statement could be revised to be:

Select order_id, customer_id, order_amt,

customers. customer_name from orders,

customers

where:

order_id='111222' and customers.customer_id

= orders.customer_id;

To the application developer, little must be done, other than issuing the query request. SQL is a data access language. It supports expressions, operators, and Create, Retrieve, Update, Delete (CRUD) operations. It’s a simplistic, powerful language, and is considered the de facto standard for data access.

As SQL became more accepted, the development community agreed to interfaces that simplified data access from application programs. This meant that developers could issue their SQL requests through one of those interfaces, and not have to worry about proprietary database Application Program Interfaces (APIs). A key attribute of an SBM is providing these interfaces as middleware components. Another name for these components is drivers or providers. Their main task is to receive the SQL request and control the execution logic necessary to fulfill the task. Ultimately, their request will entail making the appropriate action call on the data source.

Federated Data Access

There are times when multiple data sources are required to fulfill a business function. An example might be a VSAM record that has a lookup code that requires looking at an ADABAS record. Another request might be migrating data from one data source to another. Federated data access requires that the data dictionary be able to store more than a single data source. Several SBMs implement this feature, which lets developers use multiple data sources from a single virtual database (data dictionary).

Selecting a Provider

When evaluating an SBM solution, the vendor chosen is significant. The following are key considerations:

• Data source: Does the vendor provide support for all the needed data sources?

• Hardware/OS: Does the vendor provide a solution that runs on your hardware and operating system?

• Performance: Time the queries you intend to use and see which SBM is fastest.

• Data type support: Legacy sources tend to have many custom data types. Ensure that the SBM has all that are appropriate for your system.

• Correctness: Does the provided solution produce consistent, correct results?

• Importing legacy data: Can the SBM easily import your legacy data into its data dictionary?

• Evaluation: Does the vendor provide evaluations of their product and adequate support during the evaluation period?

• Price: A high price shouldn’t necessarily rule out a vendor; a low-cost solution might mask other hidden costs such as support.

• Security: Does the SBM offer protection for the data?

• Driver interfaces: Does the vendor supply interfaces for ODBC, JDBC, OLEDB, and .NET?

• Tools: Are there helpful tools that complement the SBM?

• Support: Are technical support and consulting continuously available?

• Vendor track record: How long has the vendor been in business?

A Google search on ODBC, JDBC, OLEDB, VSAM, Data Access, and IMS might provide several sites that link to SBM providers. Noted database expert Ken North offers a list of SBM providers at www.sqlsummit.com/ODBCVend.htm.

Conclusion

An SBM solution offers several advantages to organizations using legacy data. Real-time SQL access for Internet and desktop applications offers functionality that might postpone the need for a costly migration. Change seems to be the word for this new era, but make sure that it’s the right change.