Securing all your sensitive data is an overwhelming concept for many data center managers, security officers and privacy officers. To a great extent, they would prefer to ignore it. Decades of migrated data have been swept under the rug; this mountain of data sets represents the dirty little secret nobody wants to address. The fact is, most security staffs can’t tell you where all the sensitive data resides, which means they just can’t protect data they can’t locate and categorize.

The Legacy Data Challenge

The designers of the original IBM System/360 series hardware and software should be congratulated for their foresight. Mainframes have been around a long time, all the while maintaining upward compatibility. Unfortunately, that also means data has been kept around, too, and some of it contains sensitive information. Even the best data security teams can’t protect what they don’t know about.

An easy response is that, because of the mainframe’s isolation, only insiders even have the potential to access this data. However, a 2012 InformationWeek/Dark Reading Strategic Security Survey asked, “Which of these possible sources of breaches or espionage pose the greatest threat to your company in 2012?” Respondents could select three categories; their responses were:

• 52 percent: Authorized users or employees
• 52 percent: Cybercriminals
• 44 percent: Application vulnerabilities
• 24 percent: Public interest groups/hacktivists
• 21 percent: Service providers, consultants and auditors.

Note that insiders (authorized users or employees) are perceived to be a threat equal to cybercriminals.

Internal Threats

Criminal insiders may not have started out that way, but they can become vulnerable and eventually be compromised by criminals. This subversion need not even be overt. A subtle look the other way is often enough to enable a massive breach. For example, in December 2011, the Manhattan District Attorney indicted 55 individuals for their participation in an organized identity theft and financial crime ring. This cybercrime breach involved cooperation from corrupt employees at banks, a non-profit institution, a high-end car dealership and a real estate management company. The defendants acquired and sold the names, dates of birth, addresses, Social Security numbers and financial account information of unsuspecting victims. (For further details, see The New York Times report at

Various studies over the years have identified insiders as a top cause of data breaches. Insiders have access to production data as needed to perform their usual activities, and to much non-production data at varying levels of access. Even savvy organizations often fail to consider that insiders also have access to old test data sets, which were necessary for development and testing but were never deleted. There’s also data accumulated from mergers and acquisitions that resides unprotected or underprotected and is available to many users.

The three z/OS external security managers (IBM’s RACF and CA’s ACF2 and Top Secret) excel at protecting the assets, secrets and personal items that organizations are required to protect, but only if the data security staff knows what needs protecting. Unknown sensitive data sets aren’t properly protected simply because they sit outside the production data and the people responsible don’t know what sensitive data they contain.
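To make the idea of locating sensitive data concrete, here is a minimal sketch of a pattern scan over a copy of a data set. It is an illustration only, not how any particular product works. It assumes fixed-length 80-byte records, unpacked EBCDIC character data (decoded with Python's cp037 codec) and Social Security numbers written as NNN-NN-NNNN; the function name and the pattern are hypothetical.

```python
import re

# Assumption: SSNs appear as unpacked text in the form NNN-NN-NNNN.
# Packed-decimal or unformatted values would not be caught by this pattern.
SSN_LIKE = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def count_ssn_like_records(raw: bytes, record_length: int = 80,
                           codec: str = "cp037") -> int:
    """Count fixed-length records containing at least one SSN-like value."""
    hits = 0
    for start in range(0, len(raw), record_length):
        # Decode one record from EBCDIC (cp037 is an assumption) to text.
        text = raw[start:start + record_length].decode(codec, errors="replace")
        if SSN_LIKE.search(text):
            hits += 1
    return hits


# Two sample 80-byte records, built here only for demonstration.
sample = ("JOHN DOE 123-45-6789".ljust(80) +
          "NO SENSITIVE DATA HERE".ljust(80)).encode("cp037")
flagged = count_ssn_like_records(sample)
```

A count like this only flags candidates for review. A real inventory would also need to handle other record formats, other kinds of sensitive data and false positives.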

Historical practices of the development and Quality Assurance (QA) teams, as well as normal system users, have caused an extensive accumulation of data over the years. When creating new application programs or modifying existing ones, programmers commonly tested against copies of actual production data. It was too dangerous to use the actual production data directly and difficult to generate accurate data to simulate the real thing. QA professionals have typically taken the same shortcut.

Today, there are tools and automated systems for de-identification of data to ensure the development and test organizations never use actual production data containing sensitive information. But these policies are relatively new and can be easily circumvented when teams are dealing with a massive effort or are under a deadline to get an application completed and tested on time.
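As a rough illustration of what de-identification can look like, the sketch below replaces each Social Security number in a test record with a keyed stand-in of the same shape. It is one possible approach, not a description of any specific tool. The function names, the NNN-NN-NNNN pattern and the sample key are assumptions made for the example.

```python
import hashlib
import hmac
import re

# Assumption: SSNs appear as text in the form NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def pseudonymize_ssn(ssn: str, key: bytes) -> str:
    """Map an SSN to a repeatable nine-digit stand-in using a keyed hash."""
    digest = hmac.new(key, ssn.encode("ascii"), hashlib.sha256).hexdigest()
    digits = f"{int(digest, 16) % 10**9:09d}"
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def deidentify(record: str, key: bytes) -> str:
    """Replace every SSN-like value in a record; other text is unchanged."""
    return SSN_PATTERN.sub(lambda m: pseudonymize_ssn(m.group(), key), record)


# Hypothetical key and record, for demonstration only.
demo_key = b"example-key-kept-outside-the-test-environment"
masked = deidentify("JOHN DOE 123-45-6789 NY", demo_key)
```

Because the same input and key always yield the same stand-in, records that shared a number in production still match each other in the test copy, which keeps joins and lookups working. The key would have to be kept away from the development and test teams, since anyone holding it could test guesses against the masked values.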
