The concept of reusable models is fairly common today. A contractor building a house can choose from thousands of premade blueprints. A dressmaker making a new outfit starts with a pattern. Software development uses patterns to promote reusable segments of code. Why, then, do data modelers and database developers start from scratch when building databases and data models? After years of activity in this industry, one would expect standard database designs to exist for common concepts such as customer, product, organization, and address.
A data model is essentially a design or blueprint of a database used to document both the structure and definition of data and information. It describes how data is represented and what it means. With today’s businesses producing more digital information than ever, the ability to organize, standardize, and understand that information is critical. Recent corporate scandals involving incorrect financial data and resulting legislation such as Sarbanes-Oxley highlight the need to have correct, traceable, and understandable information produced from an organization’s raw, distributed data. It’s easy to generate a financial statement from a database; it’s much harder to validate that the meaning of the data is correct, trace the origins of the data, and validate that calculations performed on the data are consistent.
It’s Good to Be Standard
One way to promote standards and data consistency is to use common model templates: pre-built models that contain objects and definitions for a particular subject area or industry and that can be used to jump-start a new development effort or to compare and validate against an existing one. One historic problem with data models is that those responsible for building them (mainly data modelers and database developers) have lacked a standardized template that would give them a head start and prevent them from reinventing the wheel each time they need to build a new data model.
The bulk of the efforts to date to promote standard data models have centered on particular industries, especially financial services and healthcare, which have historically put a premium on their data investments. There also have been efforts to develop vendor-neutral standards for particular industries, including healthcare and retail, but these large-scale efforts are often too broad or too expensive for the average developer who simply wants a standard definition for a particular entity (e.g., customer or product) or a sample model for just a portion of the business (e.g., accounts receivable). An entire industry model is overkill for many small organizations in terms of both price and scope.
Without the benefit of standard data model templates, developers are often left to “do their own thing,” which can have serious implications, particularly for smaller businesses. The added time and cost are the most obvious drawbacks of this approach. If an object has already been built and validated in other companies, why spend time reinventing it? An even more compelling reason for common templates is the promotion of standards within an organization or between organizations sharing data. While standards promotion sounds rather dry, its value can be seen in an example that, unfortunately, we’ve all experienced.
Let’s assume you have both a bank account and a credit card with a particular financial institution. You would think you could easily receive all your records on a single statement. But no! You receive a bank statement addressed to Jane W. Doe and a separate credit card statement addressed to Jane Doe. How can they not know you’re the same person? While there are many organizational issues that may lead to this mix-up (e.g., disjointed processes, employee error, and lack of communication), a common problem lies in the data model(s) used. Because the credit branch and banking branch each created their own “customer data entity,” there was no standard way for them to track your customer information. The banking “customer data entity” might be designed as shown in Figure 1 and the credit version might look like Figure 2.
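To make the mismatch concrete, here is a minimal sketch in Python of how two independently designed customer entities can disagree about the same person. The class names and fields (BankingCustomer, CreditCustomer, middle_initial, full_name) are assumptions made up for this illustration; they are not taken from Figure 1 or Figure 2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BankingCustomer:
    # Hypothetical banking-branch entity: name split into parts,
    # including an optional middle initial.
    first_name: str
    middle_initial: Optional[str]
    last_name: str

    def display_name(self) -> str:
        parts = [self.first_name]
        if self.middle_initial:
            parts.append(f"{self.middle_initial}.")
        parts.append(self.last_name)
        return " ".join(parts)


@dataclass(frozen=True)
class CreditCustomer:
    # Hypothetical credit-branch entity: a single free-text name field.
    full_name: str

    def display_name(self) -> str:
        return self.full_name


bank = BankingCustomer("Jane", "W", "Doe")
credit = CreditCustomer("Jane Doe")

# A naive comparison of the two records treats them as different people.
names_match = bank.display_name() == credit.display_name()

print(bank.display_name())    # Jane W. Doe
print(credit.display_name())  # Jane Doe
print(names_match)            # False
```

Had both branches started from one shared customer definition, the name would be stored in the same shape in both systems, and matching the two records would not depend on guesswork about how each one formats it.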