A great deal of today’s information is still kept in Virtual Storage Access Method (VSAM) files. VSAM endures because of its simplicity, maturity, and, to be honest, inertia. Even with the new technology and techniques available today, it’s still important to tune a VSAM data set’s free space attributes for performance. (This article focuses on Key-Sequenced Data Sets [KSDS], and uses the words data set and cluster interchangeably.)
Records, CI, and CA Free Space
There are several ways to tune VSAM clusters for DASD utilization and performance; the methods mostly involve Control Interval (CI) size and free space.
A CI is the VSAM unit of I/O and the structure around the logical records an application manipulates. Choosing a CI size requires weighing many factors and is generally a trade-off between saving DASD by using larger CIs vs. using smaller CIs for potentially better online performance. The application record length also matters because VSAM offers only discrete CI sizes, and the records rarely fill one of them with no space left over. A Control Area (CA) is a logical grouping of CIs, usually one cylinder in size. The programmer has no direct control over CA size. Instead, VSAM picks a CA size based on the data set’s allocation units.
In a cluster definition, the CI free space is expressed as the percentage of bytes in each CI to leave empty when the cluster is initially loaded. For CAs, the free space is the percentage of CIs left empty at the end of each CA.
Loading a Cluster
Figure 1 shows how VSAM free space distribution occurs in a freshly loaded cluster. As VSAM loads a cluster, it lays the records end to end into an empty CI until it determines the next record will leave less than the minimum amount of free space. VSAM then proceeds to put records into the next available CI. This continues until the number of remaining free CIs reaches the CA free space limit and VSAM moves on to the next CA.
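The loading sequence described above can be sketched as a small simulation. This is a simplified model, not VSAM's actual algorithm: the function name `load_cluster` and its parameters are hypothetical, CI control information is ignored, and rounding the free space percentages down is an assumption.

```python
def load_cluster(record_lengths, ci_size, cis_per_ca, ci_free_pct, ca_free_pct):
    """Return a list of CAs; each CA is a list of loaded CIs,
    and each CI is a list of record lengths.

    Simplified model: control information inside the CI is ignored,
    and the free CIs at the end of each CA are implied, not listed.
    """
    min_free = ci_size * ci_free_pct // 100      # bytes to keep empty per CI
    free_cis = cis_per_ca * ca_free_pct // 100   # CIs to keep empty per CA
    loadable = max(1, cis_per_ca - free_cis)     # CIs per CA that receive records

    cas = [[[]]]   # start with one CA holding one empty CI
    used = 0       # bytes used in the current CI
    for length in record_lengths:
        would_eat_free_space = used + length > ci_size - min_free
        if would_eat_free_space and cas[-1][-1]:
            if len(cas[-1]) == loadable:
                cas.append([[]])      # CA free space limit reached: next CA
            else:
                cas[-1].append([])    # move on to the next CI in this CA
            used = 0
        cas[-1][-1].append(length)
        used += length
    return cas


cas = load_cluster([100] * 200, ci_size=1000, cis_per_ca=10,
                   ci_free_pct=10, ca_free_pct=10)
print(len(cas), len(cas[0]), len(cas[0][0]))  # 3 9 9
```

With these made-up numbers, each CI takes nine 100-byte records before the tenth would cut into the 100 reserved bytes, and each CA takes nine loaded CIs, leaving one CI empty.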
Each CI has the amount of free space prescribed in the cluster’s definition. For instance, if you specify 10 percent free space for a 4K data CI, VSAM will reserve at least 409 free bytes. At the end of each CA are several empty CIs, depending on the CA free space in the definition. For example, a data component with 90 CIs per CA and a 10 percent CA free space will have nine free CIs per CA. The actual number of free bytes in a CI is indeterminate because it depends on the size of the records going into the CI at load time. But VSAM guarantees that a freshly loaded CI never has less free space than the CI free space definition specifies.
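The arithmetic behind these two examples is short enough to check directly. The helper names below are hypothetical, and rounding down is an assumption chosen because it matches the figures quoted above.

```python
def ci_free_bytes(ci_size: int, ci_free_pct: int) -> int:
    """Minimum bytes left empty in each CI at load time (rounded down)."""
    return ci_size * ci_free_pct // 100


def ca_free_cis(cis_per_ca: int, ca_free_pct: int) -> int:
    """CIs left empty at the end of each CA at load time (rounded down)."""
    return cis_per_ca * ca_free_pct // 100


print(ci_free_bytes(4096, 10))  # 409
print(ca_free_cis(90, 10))      # 9
```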
The free space percentages apply only when the data set is loaded. After that, VSAM uses the reserved free space when the application adds records, keeping the records in key sequence within each CI. If there isn’t enough free space in the CI, VSAM divides the records more or less evenly and leaves half of them in the original CI. The other half goes into one of the empty CIs left at the end of the CA. This is called a CI split. If the CA has no free CIs left, VSAM has to perform a CA split. To accomplish this, VSAM moves half of the CIs in the CA to empty space at the end of the cluster. The other half of the CIs stay in the original CA. With some clusters having upward of 90 CIs per CA, you can see how this turns into a lot of work.
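A CI split can be illustrated with a toy model. This is a sketch under several assumptions, not VSAM's implementation: the function name `insert_record` is hypothetical, a CI is a sorted list of keys, an empty list stands for a free CI, and CI capacity is counted in records rather than bytes.

```python
import bisect


def insert_record(ca, ci_index, key, capacity):
    """Insert key into ca[ci_index], splitting the CI if it is full.

    ca       -- list of CIs; each CI is a sorted list of keys, [] is a free CI
    capacity -- maximum records per CI (a simplification; VSAM counts bytes)
    Returns "inserted", "ci_split", or "ca_split_needed".
    """
    ci = ca[ci_index]
    if len(ci) < capacity:
        bisect.insort(ci, key)        # room left: keep the keys in sequence
        return "inserted"

    # The CI is full: look for an empty CI in the same CA.
    free_index = next((i for i, c in enumerate(ca) if not c), None)
    if free_index is None:
        return "ca_split_needed"      # no free CI: a CA split would follow

    # CI split: roughly half the records stay, the rest move to the free CI.
    records = sorted(ci + [key])
    half = (len(records) + 1) // 2
    ca[ci_index] = records[:half]
    ca[free_index] = records[half:]
    return "ci_split"


ca = [[10, 20, 30, 40], [50, 60, 70, 80], []]
print(insert_record(ca, 0, 25, capacity=4))  # ci_split
print(ca)  # [[10, 20, 25], [50, 60, 70, 80], [30, 40]]
print(insert_record(ca, 1, 65, capacity=4))  # ca_split_needed
```

In the example, the first insert uses up the only free CI, so the second insert into a full CI can no longer be handled by a CI split. The model stops there and only reports that a CA split would be required.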