IT Management

Are You Reorganizing Your VSAM Files Too Often?

4 Pages

If, however, there is insufficient free space within the CI to hold the record being inserted, VSAM’s insert processing must create more free space. This process is called a Control Interval Split (CI Split). VSAM’s insertion strategy always attempts to do the following:

  • Keep records of similar keys blocked together
  • Avoid unblocked records
  • Avoid chaining of individual records
  • Create additional free space in the (key) vicinity if it is needed.

A CI Split is relatively inexpensive in processing terms. VSAM performs the following steps to create more free space:

  1. Writes the CI being split with a split-in-progress indicator set in the CIDF field
  2. Moves (about) half of the records to a new CI buffer in storage
  3. Writes the new CI from that buffer
  4. Removes the records that were moved from the old CI buffer in storage
  5. Updates the Sequence Set (low-level) Index record to reflect the new CI and key changes
  6. Writes the old CI (without the moved records), resetting the split-in-progress indicator.

This process requires only four I/O operations to complete (the writes in steps 1, 3, and 6, plus the Sequence Set update in step 5); the remainder of the processing is all in-storage activity.
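The six steps above can be sketched as a small Python model. This is only an illustration, not VSAM's actual implementation: the `ControlInterval` class, the `insert_with_split` function, and the record-count capacity are hypothetical (real CIs are sized in bytes), and the point at which the new record is placed into a buffer is an assumption made so that the I/O count matches the four operations described.

```python
from bisect import insort
from dataclasses import dataclass, field


@dataclass
class ControlInterval:
    """Toy CI: sorted keys, with capacity counted in records (an assumption)."""
    capacity: int
    keys: list = field(default_factory=list)
    split_in_progress: bool = False


def insert_with_split(ci, key, io_log):
    """Insert key into ci, splitting it if full. Returns the new CI or None.

    io_log collects one entry per I/O performed during a split.
    """
    if len(ci.keys) < ci.capacity:
        insort(ci.keys, key)          # free space available: no split needed
        return None

    # Step 1: write the CI being split with the split-in-progress indicator set.
    ci.split_in_progress = True
    io_log.append("write old CI (split in progress)")

    # Step 2: move about half of the records to a new CI buffer (in storage).
    mid = len(ci.keys) // 2
    new_ci = ControlInterval(ci.capacity, ci.keys[mid:])
    goes_to_new = key >= new_ci.keys[0]
    if goes_to_new:
        insort(new_ci.keys, key)      # assumed placement of the new record

    # Step 3: write the new CI from that buffer.
    io_log.append("write new CI")

    # Step 4: remove the moved records from the old CI buffer (in storage).
    del ci.keys[mid:]
    if not goes_to_new:
        insort(ci.keys, key)          # assumed placement of the new record

    # Step 5: update the Sequence Set index record.
    io_log.append("write Sequence Set index record")

    # Step 6: write the old CI, resetting the split-in-progress indicator.
    ci.split_in_progress = False
    io_log.append("write old CI (split complete)")
    return new_ci
```

For example, inserting key 25 into a full four-record CI holding keys 10, 20, 30, and 40 leaves the old CI with 10, 20, and 25, creates a new CI with 30 and 40, and logs four I/O operations.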


There is little additional processing time involved in processing data after the CI Split has been completed. Two CIs now contain the records formerly contained in one, plus the newly inserted record. There will be no additional I/O activity to process the records in either of the CIs when direct processing is being done, as is typical of CICS and other online activity. Batch jobs that process the file sequentially will have to read one additional CI, but that cost is small; with good tuning and modern, cached DASD subsystems, it will likely add little or no elapsed or I/O time.


In short, when the free space needed to handle the insertion of a new record was unavailable, VSAM’s insertion strategy created additional free space right where it was needed.

It is common (but not certain) that additional records will also be inserted in the same vicinity. This clustering of insert activity arises from the file and key designs in many cases, and from the natural processing flow in applications:

  • Most inserts are at the end-of-file point, as keys are created in continually ascending key sequence (using a time stamp or sequence number). This gives one principal insertion point.
  • Most inserts are at the end of a range of keys (e.g., new accounts are opened in a branch banking situation). There could be multiple insertion points in this case.
  • Inserts are more scattered, but still clustered (e.g., new course information for several classes is added to a student’s record during college registration).

Split processing, then, is beneficial. Suppose we reorganize the file and restore all free space to the initial load configuration — what happens then? All the extra free space that we created through the CI Split process is removed, and future record insertions may have to re-create the free space over a period of time. Note that extending the length of a record is logically the same as adding a new record in this case.
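The effect can be illustrated with a toy Python model. Everything here is a simplifying assumption for illustration: CIs are plain lists with a capacity counted in records, `load`, `insert`, and `demo` are hypothetical names, and the keys are hand-picked so that inserts cluster in one key range. Real results depend on the insert pattern, CI size, and free space settings.

```python
from bisect import bisect_right, insort

CAPACITY = 6    # records per CI (toy assumption; real CIs are sized in bytes)
LOAD_FILL = 5   # records written per CI at load time, leaving one free slot


def load(keys):
    """Load (or reorganize) a file: pack sorted keys LOAD_FILL to a CI."""
    keys = sorted(keys)
    return [keys[i:i + LOAD_FILL] for i in range(0, len(keys), LOAD_FILL)]


def insert(cis, key):
    """Insert key into the CI covering it. Returns 1 if a split occurred."""
    lows = [ci[0] for ci in cis]
    idx = max(bisect_right(lows, key) - 1, 0)
    ci = cis[idx]
    if len(ci) < CAPACITY:
        insort(ci, key)
        return 0
    # CI is full: move the upper half to a new CI placed right after it.
    mid = len(ci) // 2
    new_ci = ci[mid:]
    del ci[mid:]
    cis.insert(idx + 1, new_ci)
    insort(new_ci if key >= new_ci[0] else ci, key)
    return 1


def run(cis, keys):
    """Insert all keys and return the number of CI splits."""
    return sum(insert(cis, k) for k in keys)


def demo():
    batch1 = [65, 75]                  # first inserts, clustered near key 60-100
    batch2 = [85, 95, 62, 72, 88]      # later inserts in the same vicinity

    # Case A: leave the file alone after the first batch.
    kept = load(range(10, 160, 10))
    first = run(kept, batch1)
    later_kept = run(kept, batch2)

    # Case B: reorganize after the first batch, then apply the same inserts.
    reorg = load(range(10, 160, 10))
    run(reorg, batch1)
    reorg = load(k for ci in reorg for k in ci)
    later_reorg = run(reorg, batch2)

    return first, later_kept, later_reorg


print(demo())
```

In this hand-picked case the first batch causes one split. The later inserts then fit into the free space that split created, whereas the reorganized file has to split again to absorb the same inserts.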
