DB2 Application With “Hooks” to Legacy
In a second customer situation, there were four large BDAM (Basic Direct Access Method) systems. The records were stored in a highly compressed proprietary format. Based on the size and complexity of each source system, and the complexity of the resulting DB2 databases, two conversion strategies were created. The first used the efficient legacy API to extract data into a display format and then, in a single pass, convert that data into a set of DB2 load files. The second employed a hand-coded set of metafiles as input to a custom REXX and COBOL process that generated the conversion programs. The REXX code-generation process took as input artifacts created during the modeling process, as well as the DB2 catalog and the legacy copybooks, which allowed the generated programs to adapt to any changes. This automated process handled most of the simple conversion business rules; for more complex rules, custom COBOL libraries were created and their source code was merged with the generated code. Code generation enabled flexibility in the database design during development and provided a reusable process for the most complicated files. Developing the conversion in-house gave the programmers intimate knowledge of the conversion programs. Once the database tables reached a point of stability, the generation process was abandoned and the generated programs were maintained by hand.
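The metadata-driven generation approach can be sketched in miniature. This is a hypothetical illustration, not the customer's REXX process: the metafile entries, field names, and rule templates below are all invented, and the real generator also consulted the DB2 catalog and legacy copybooks.

```python
# Illustrative sketch of a metadata-driven conversion-code generator.
# A metafile maps each legacy field to a DB2 column plus a named
# conversion rule; simple rules expand from templates, while complex
# rules would be satisfied by merging in hand-coded library routines.

FIELD_MAP = [
    # (legacy field, target DB2 column, conversion rule) -- invented names
    ("CUST-NO", "CUST_ID",   "strip"),
    ("BAL-AMT", "BALANCE",   "move"),
    ("OPEN-DT", "OPEN_DATE", "yymmdd_to_date"),
]

RULE_TEMPLATES = {
    "strip":          "MOVE FUNCTION TRIM({src}) TO {dst}",
    "move":           "COMPUTE {dst} = {src}",
    "yymmdd_to_date": "PERFORM CONVERT-DATE USING {src} {dst}",
}

def generate(field_map):
    """Emit one COBOL-style conversion statement per mapped field."""
    return [RULE_TEMPLATES[rule].format(src=src, dst=dst)
            for src, dst, rule in field_map]

for stmt in generate(FIELD_MAP):
    print(stmt)
```

Because the generator re-reads the metadata (and, in the real process, the DB2 catalog) on every run, a database design change only requires regenerating the programs rather than editing them by hand.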
The databases contained sensitive, strategic data. We began the process with a two-year POC performed in two phases. Because the legacy data was in a proprietary compressed format, it was difficult to evaluate. A sample of each file was converted into a simple DB2 database that matched the legacy data format, with all data stored in character display format, and an extensive series of queries was executed to analyze the condition of the data. This analysis helped reveal undocumented codes, invalid data, and the frequency of relationships in the data. The data evaluation phase was followed by a POC phase to identify and simulate application functionality, including process simulations using both SQL scripts and COBOL programs. Various tests were conducted to measure performance, application compatibility, and data integrity. Although much of this code was thrown away after the POC was completed, the information gathered was used to set the physical database design and the application API program logic. Measurements taken from the SQL scripts and POC program executions were used to obtain initial CPU capacity estimates and application elapsed time estimates. All these measurements were presented to management so that they were well aware of the performance impact of moving to DB2, could plan accordingly, and could set realistic expectations. A list of issues was developed from the POC as input to the physical database model and application API development.
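The profiling step can be illustrated with a tiny example. The real work ran SQL against a character-display DB2 staging database; the sketch below uses SQLite purely as a stand-in, and the table, columns, codes, and sample rows are all invented.

```python
# Illustrative sketch of the data-evaluation phase: load sample legacy
# records into a character-display staging table, then profile it with
# SQL to surface undocumented codes and invalid data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stage_acct (acct_no TEXT, status_cd TEXT, open_dt TEXT)")
conn.executemany(
    "INSERT INTO stage_acct VALUES (?, ?, ?)",
    [("0001", "A", "950102"),
     ("0002", "A", "961301"),   # invalid date: month 13
     ("0003", "Z", "970415"),   # 'Z' is not a documented status code
     ("0004", "A", "970415")],
)

# Frequency analysis: how often does each status code occur?
# Codes outside the documented set stand out immediately.
freq = conn.execute(
    "SELECT status_cd, COUNT(*) FROM stage_acct "
    "GROUP BY status_cd ORDER BY status_cd"
).fetchall()
print(freq)   # [('A', 3), ('Z', 1)]

# Validity check on the YYMMDD date field: month must be 01-12.
bad = conn.execute(
    "SELECT acct_no FROM stage_acct "
    "WHERE CAST(substr(open_dt, 3, 2) AS INTEGER) NOT BETWEEN 1 AND 12"
).fetchall()
print(bad)    # [('0002',)]
```

Running dozens of such frequency and validity queries over the staged sample is what revealed the undocumented codes, invalid data, and relationship frequencies described above.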
We weren’t allowed to change any application programs. A data access API already existed, and we were to rewrite that API, which constrained the DB2 physical database design decisions. Although a normalized database design was desirable for the future, we had to accommodate a “read everything” legacy approach. Through modest denormalization, indicator columns, and a careful study of the legacy API’s behavior, we were able to meet both the future design direction and the performance objectives. By understanding how the legacy API worked, we minimized DB2 access in some situations and dramatically improved performance (a fourfold improvement). All of this was verified in the POC, so there were no implementation surprises.
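One way the indicator-column technique can cut DB2 access is sketched below. This is a hedged illustration under assumed names (the real API was COBOL against DB2, and the tables here are invented): the parent row carries flags saying whether each child table holds any rows, so a “read everything” call skips the child tables that are empty.

```python
# Sketch of indicator columns avoiding DB2 access: the ACCOUNT row's
# HAS_ORDERS / HAS_NOTES flags tell the API which child tables to read.
# SQLite stands in for DB2; all names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (acct_no TEXT PRIMARY KEY,
                      has_orders CHAR(1), has_notes CHAR(1));
CREATE TABLE orders  (acct_no TEXT, order_no TEXT);
CREATE TABLE notes   (acct_no TEXT, note TEXT);
INSERT INTO account VALUES ('0001', 'Y', 'N');
INSERT INTO orders  VALUES ('0001', 'A17');
""")

def read_everything(acct_no):
    """Legacy-style call returning the whole record, touching only
    the child tables whose indicator column is set."""
    ind = conn.execute(
        "SELECT has_orders, has_notes FROM account WHERE acct_no = ?",
        (acct_no,)).fetchone()
    record = {"acct_no": acct_no}
    if ind[0] == "Y":   # indicator set: the orders read is worthwhile
        record["orders"] = conn.execute(
            "SELECT order_no FROM orders WHERE acct_no = ?",
            (acct_no,)).fetchall()
    if ind[1] == "Y":   # 'N' here means the notes query never runs
        record["notes"] = conn.execute(
            "SELECT note FROM notes WHERE acct_no = ?",
            (acct_no,)).fetchall()
    return record

print(read_everything("0001"))
```

The trade-off is that every insert and delete against a child table must also maintain the parent’s indicator, which is part of why this kind of design needs the study and POC verification described above.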
We’ve seen a similar kind of problem where IMS conversions are done one segment to one table. This creates poor-performing SQL that tries to pass the data back the way the application expects it; because the data isn’t in any sort of normalized format, the SQL becomes overly complicated.
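The core of that complication can be shown with a hedged toy example (invented tables, SQLite standing in for DB2): when one legacy record is spread across several segment tables, a single SELECT that joins them all back together produces a cartesian fan-out between unrelated child rows.

```python
# One legacy record converted segment-per-table: a root plus two
# independent child segment tables. Joining everything in one query
# multiplies the child rows together.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE root  (rec_id INTEGER);
CREATE TABLE seg_a (rec_id INTEGER, a TEXT);
CREATE TABLE seg_b (rec_id INTEGER, b TEXT);
INSERT INTO root  VALUES (1);
INSERT INTO seg_a VALUES (1, 'a1'), (1, 'a2');
INSERT INTO seg_b VALUES (1, 'b1'), (1, 'b2'), (1, 'b3');
""")

rows = conn.execute("""
    SELECT r.rec_id, a.a, b.b
    FROM root r
    LEFT JOIN seg_a a ON a.rec_id = r.rec_id
    LEFT JOIN seg_b b ON b.rec_id = r.rec_id
""").fetchall()

# 2 seg_a rows x 3 seg_b rows = 6 result rows for ONE legacy record;
# the API must untangle them (or issue one query per segment table).
print(len(rows))   # 6
```

Either way the record is reassembled, the SQL or the surrounding code ends up doing work the old hierarchical access method got for free.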
During migrations where legacy applications won’t be changing, it’s sometimes beneficial to affirm that the migration will occur in phases. The first phase accommodates the past and prepares for the future; a later phase may involve another conversion to a more normalized design once legacy access is abandoned. Planning for this, and expecting less-than-desirable performance initially, will set reasonable expectations!
DB2 Application I/O Layer Access to Legacy
Many conversion efforts are under way to move from VSAM to DB2 without any changes to the application. While it’s always preferable not to upset the user community with new application code, we must also set expectations accordingly and be prepared for less-than-optimal performance. Fortunately, there are options to help achieve decent performance, but we must evaluate each one to ensure we don’t compromise the integrity of the new DB2 relational database.
If you’re moving from VSAM to DB2 with no application changes, you must be willing to take a performance hit. Going to DB2 provides a higher level of availability, enhanced backup and recovery, and options for built-in data integrity. However, with these advantages comes a longer code path and reduced performance. In addition, if the move is from a highly denormalized record-based system to a normalized DB2 database, there’ll be an I/O impact, too. Spending time on a significant amount of data analysis can benefit your database design and API decisions; otherwise, be prepared for the performance hit.
In one client situation, a third normal form database design was developed. This database was deployed based on the model and required sophisticated logic to translate from the legacy record format to the DB2 database, which impacted both the conversion process and the API. The decision was made to use an Extract, Transform, and Load (ETL) tool to create the conversion programs. Although ETL tools are extremely useful for moving transaction data to warehouses and the like, they’re not always the best choice for a one-time conversion process. The complex conversion rules and B-level conversion code generated by the tool resulted in an extremely poor-performing conversion process. In addition, this complexity made the conversion code so confusing that, when database changes arose, developers tended to code around the conversion programs rather than change them.