• Sixty-six percent of breaches involved data the victim didn’t know was on the system.
• Knowing what information is present within the organization, its purpose within the business model, where it flows and where it resides is foundational to its protection.
False Sense of Security
An often-heard response is “My organization is safe. We’ve taken all the right steps to ensure our production environment is protected!” It’s comforting to know that an organization believes all sensitive data in its production data sets and database tables is properly protected. However, this assumes that there’s no sensitive data in non-production data sets, and that no production data sets or database tables contain unexpected sensitive data, are categorized incorrectly and therefore aren’t properly protected.
For example, credit card numbers have been found in the notes fields of a customer call log database, where they should never have been entered. When this happens, either the data sets or database tables must be cleaned up, or the access permissions must be revised to ensure they fit the newly discovered sensitivity of the data.
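As an illustration of how card numbers in a free-text notes field might be found, here is a minimal Python sketch. It is not any particular product's method: the regular expression, the function names and the sample note are assumptions made for this example. It looks for runs of 13 to 19 digits (optionally separated by spaces or hyphens) and keeps only those that pass the Luhn checksum, which real card numbers satisfy.

```python
import re

# Assumed candidate pattern: 13 to 19 digits, optionally separated by
# single spaces or hyphens, and not embedded in a longer digit run.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in text that look like card numbers."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


# Hypothetical call-log note; 4111 1111 1111 1111 is a widely used test number.
note = "Caller read card 4111 1111 1111 1111 aloud; ref 1234 5678 9012 3456."
print(find_card_numbers(note))
```

The Luhn check filters out many digit strings that merely resemble card numbers, but a hit still needs human review: a number can pass the checksum by coincidence.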
A common response from some sites is that they’ve outsourced their sensitive data. Outsourcing data doesn’t relieve the organization of liability and many users still have general access to the data even though it resides on the outsourcer’s equipment. The same requirements and responsibilities to protect the data still apply, and the complexity is often increased. That’s because now the outsourcer’s employees and contractors may also have access to this data as part of their routine support and maintenance activities.
Virtually all installations have sensitive data. The first step is identifying and locating that data. Once the data sets and database tables containing this sensitive information have been identified, there are several different remedies:
• If you don’t need it, get rid of it! Review your organization’s data retention policy, which is usually tied to the creation date or the date when the data was last accessed or referenced. Fortunately, z/OS maintains both the creation date and last referenced date for all data sets, and an organization can make decisions based on this information. Be sure the process used to determine which data sets contain sensitive information doesn’t reset the last referenced dates or, alternatively, obtain and record these dates before reviewing the data sets. (If the dates are reset, you will have lost this valuable remediation information.) Then create a list of data sets that haven’t been referenced within the retention period the policy defines. These are the data sets that can be either deleted or encrypted and archived to tape. When deleting these data sets, use the z/OS erase-on-scratch feature to ensure no residual sensitive data remains on the storage devices where someone could accidentally access it.
• Determine when data sets were last accessed. Learn which of the remaining data sets potentially containing sensitive information haven’t been accessed in the recent past, say six months or a year. Many organizations like to use 13 to 15 months to include information referenced only once per year. Then remove all access privileges from these data sets, encrypt them and migrate them. These data sets shouldn’t be deleted, since there’s no assurance that a user, manager or auditor won’t need to access them again for a legitimate business purpose. You can bet someone will eventually have a legitimate need to access several of these data sets. Because they remain in the catalog, as migrated data sets do, they’re still “accessible.” While this will inconvenience a few users, the data will be protected until it can be more closely evaluated and the access privileges adjusted accordingly.
• Address data sets that are in “current” use. View the data set and validate that it does, in fact, contain sensitive information. It’s common for any kind of automated product, or manual process for that matter, to generate “false positives.” If, upon examination and research, the data set doesn’t actually contain sensitive information, it’s not a vulnerability and need not be considered further for remediation. A good process will include the ability to flag such data sets as false positives, document what was done to confirm that the hits were false, and provide an expiration date so the data set is revisited at an appropriate future date.
If a “currently used” data set contains sensitive information, the data owner must be consulted to determine if the sensitive information is really necessary based on how it’s being used. If not, it can be cleansed by deleting all occurrences of the sensitive information or masking the values so the data set becomes no longer sensitive. If the data set contains sensitive test data that needs to retain its characteristics to provide valid tests, the sensitive data can be overlaid with similar data that passes the validity checks, but doesn’t really reference the sensitive data. If the data’s sensitive meaning must remain accessible, a tough decision must be made with the data owner about how processes and application programs can be modified to protect the sensitive information to the greatest extent possible and maintain compliance with the applicable regulations and laws.
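The date-based triage described in the three remedies above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DataSetInfo record, the triage function, the category labels and the sample data set names are all hypothetical, and the dates are assumed to have been captured from the catalog before any scan could reset them. The retention period is taken as seven years and the idle threshold as roughly 15 months purely as examples.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataSetInfo:
    """Hypothetical record of dates captured before any content scan."""
    name: str
    created: date
    last_referenced: date


def triage(data_sets, today, retention_days=7 * 365, idle_days=456):
    """Sort sensitive data sets into the three remedies by last reference.

    retention_days and idle_days are assumed example values; idle_days
    of 456 approximates the 15-month window mentioned in the text.
    """
    retention_cutoff = today - timedelta(days=retention_days)
    idle_cutoff = today - timedelta(days=idle_days)
    plan = {"delete_or_archive": [], "lock_and_migrate": [], "review_with_owner": []}
    for ds in data_sets:
        if ds.last_referenced < retention_cutoff:
            # Unreferenced for longer than the retention period.
            plan["delete_or_archive"].append(ds.name)
        elif ds.last_referenced < idle_cutoff:
            # Not referenced recently: remove access, encrypt, migrate.
            plan["lock_and_migrate"].append(ds.name)
        else:
            # In current use: validate the finding and consult the owner.
            plan["review_with_owner"].append(ds.name)
    return plan


sample = [
    DataSetInfo("PAYROLL.ARCHIVE.Y2014", date(2014, 1, 5), date(2015, 3, 1)),
    DataSetInfo("CALLLOG.NOTES.OLD", date(2020, 2, 1), date(2022, 6, 1)),
    DataSetInfo("CALLLOG.NOTES.CURRENT", date(2023, 1, 1), date(2023, 12, 1)),
]
print(triage(sample, today=date(2024, 1, 1)))
```

Keeping the thresholds as parameters lets the same pass be rerun when the retention policy or the chosen idle window changes.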
Tokens and Encryption
A common approach is to replace the sensitive data with tokens or encrypt it. Both techniques typically require modification to the processes and application programs to decrypt or look up the token value to use it.
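To make the token lookup concrete, here is a minimal in-memory sketch in Python. The TokenVault class, its method names and the "tok_" prefix are assumptions for illustration only. A real token vault would be a protected, access-controlled store rather than a pair of dictionaries, but the shape of the change to an application is the same: it stores the token and calls the vault only when it genuinely needs the original value.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to real values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a random token for value, reusing any existing token."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)
        while token in self._token_to_value:   # collisions are very unlikely
            token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]


vault = TokenVault()
token = vault.tokenize("4111111111111111")    # widely used test card number
record = {"customer": "C-1001", "card": token}  # the data set stores only the token
print(record["card"].startswith("tok_"))
print(vault.detokenize(record["card"]) == "4111111111111111")
```

Because the token is random rather than derived from the value, a data set holding only tokens reveals nothing about the originals without access to the vault.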
If the processing program can’t be modified easily to encrypt/decrypt/tokenize the individual fields or even the entire record as it’s processed, a potential alternative solution is to modify the batch Job Control Language (JCL) to add a decryption step that will write the data set to a Virtual Input/Output (VIO) data set, then use that for input/update by the application program, reversing the process when access to the data set is finished.
The VIO data set is preferred over a temporary “scratch” data set because it resides in the paging storage of the system and not in an actual data set on the disk storage system, as a normal temporary data set would. This means that if there’s an abnormal termination of the program, it won’t be left around for someone else to find.
Today’s legal and regulatory environment demands that organizations identify all the locations of this sensitive data, remediate the data that can be deleted or encrypted, and determine if the access permissions for the sensitive data are appropriate.