XPath expressions can be used to search for values in a document. Query 3 of Figure 5 returns the names of products from the PRODUCT sample table by restricting the data to only those products that sell for less than $10. Filtering is performed by using an XMLEXISTS function to specify an XPath comparison expression to search all the XML documents in the table.             

This is important, powerful functionality, but results in a complete table space scan of the PRODUCT table. You can, however, build indexes into an XML document using XPath expressions. Figure 6 shows a CREATE INDEX DDL statement that builds an index on the price element in the XML DESCRIPTION document of the PRODUCT table; this approach can improve XML search performance.

XML to Relational

Since DB2 can generate XML from more traditional rows and columns, and work with XML natively, it only makes sense that it can also parse XML data into individual columns. So you can generate traditional relational data from XML in DB2. There are two built-in functions that enable this capability:

  • XMLCAST is useful when you must generate an individual column from an XML document.
  • XMLTABLE is useful when you need to generate multiple columns.

Figure 7 contains two examples of queries that pull elements out of the XML DESCRIPTION column of the PRODUCT sample table and return them as columns. Since these are invocations of built-in functions, they can be executed virtually anywhere you can put an expression in a statement (i.e., SELECT, INSERT, UPDATE, DELETE, etc.).


Data Validation

By default, DB2 will only validate an XML value to the extent it’s well-formed. To provide more detailed validation of documents internally in DB2, you must create an XML schema. An XML schema defines structure, content, and data types for XML documents. It consists of one or more schema documents. A schema document can define a namespace (optionally) and data types for elements and attributes. There are basic data types (i.e., integer, date, decimal, string), user-defined types, and complex element types. The schema also describes the allowed length and patterns for string values and supports type inheritance and derived data types. An XML schema document is defined in XML syntax. Detailed occurrence and value range definitions are supported in XML schemas.

XML schemas must be registered with the database before they can be used for validation. DB2-supplied stored procedures are used to register XML schemas. The XML schemas are then stored in the XML Schema Repository (XSR) and used by DB2 only. A binary representation is generated for the entire XML schema and then stored as a BLOB in the XSR. This helps with performance for validation since there’s no need to read individual schema documents each time during validation.

XML schema validation is the process of determining whether the structure, content, and data types of an XML document are valid according to an XML schema. In addition, XML schema validation strips ignorable white space from the input document. XML schema validation is done by invoking the DSN_XMLVALIDATE function from within the XMLPARSE function. DB2 10 will perform validation and XML parsing “inside the engine” using XML System Services (XMLSS).

You can validate an XML document in DB2 10 automatically by including an XML type modifier in the XML column definition in a CREATE TABLE or ALTER TABLE statement. The XML type modifier specifies one or more XML schemas with the XML data type. When a column has an XML type modifier, DB2 implicitly validates documents that are inserted into the column or updated. Or, you can do it manually by executing the DSN_XMLVALIDATE built-in function when you insert a document into an XML column or update a document in an XML column. Validation is optional when you insert data into an XML column with no XML type modifier, but is mandatory if an XML type modifier exists. The LOAD utility will also validate XML data if an XML column being loaded is associated with an XML type modifier.

5 Pages