The days when batch work was simple and finished early are long gone. More complex workloads, frequent changes, and a shrinking batch window due to the 24-hour business cycle of the Web make today’s challenges much greater than in the past. If nightly batch cycles aren’t completed on time—or are derailed by errors—business-critical online applications can’t be restarted, customers can’t access their accounts, merchants can’t process credit card transactions, and suppliers and partners aren’t able to access inventory systems. Some parts of the business come to a standstill. Worse, financial penalties accrue on top of the lost income. Fortunately, you can take back control by following best practices for improving workload performance.
Applying best practices effectively should begin with awareness of the problems they solve. Most IT organizations have a team that ensures the nightly batch work runs as efficiently as possible and that corrects any error conditions that emerge before the batch cycle completes. This team handles such a broad variety of tasks that its members often lack command of certain nuances, such as exactly how each application runs and what each job does.
The batch operations team compares notes with technicians who watch the system performance data and decide on small changes in buffers and Job Control Language (JCL) to improve performance. The methodology is to run a job, note the results, run it again with a change applied, and then assess whether the results show an improvement, a degradation, or no impact. If the team uses test rather than production data, results may not reflect production behavior. The nightly batch cycle may use up most, if not all, of the batch window; this leaves only a thin "cushion" against failure, or no cushion at all.
As batch windows shrink with greater use of Web applications, batch teams must play it safe. They hesitate to make changes to the nightly batch cycle because that cycle is so critical. Even a minor change to the JCL of a production job stream may require going before a change control board. There's always a significant risk that the next change will consume the batch cycle's cushion, however thin. If that happens, batch administrators may have to explain what went wrong and assure management it won't recur.
Batch administrators look to continually improve the performance of their jobs, which includes ensuring efficient and productive batch cycles, but they must overcome several hurdles to success:
- Batch workloads involve multiple systems and frequent systems changes.
- Applications are continually being enhanced or modified, undoing tuning previously achieved through simple JCL modifications.
- Frequent mergers and acquisitions create unfamiliar batch jobs that haven’t been tuned.
- Key technical people are retiring, resulting in the loss of specialized skills and knowledge.
- Few optimization techniques can be employed using simple JCL modification.
- Simple JCL modifications made at a specific time aren’t revisited when key changes in the system profile occur.
Best Practices for Batch
To really exploit best practices, IT management must realize they can't rely on simple JCL changes alone; they must ensure they have, or acquire, the requisite skills, methods, and technology, some of which may already be available but going unused.
IBM recommends these best practices for improving batch workload performance:
- Ensure you have a properly configured system. A holistic approach involves ensuring you have enough resources such as memory and CPU to get the jobs completed.
- Implement data-in-memory techniques. Accessing data from memory instead of DASD is faster. The team must identify which files are performing the most I/O. System Management Facility (SMF) records are invaluable in identifying these candidates. If the candidate is a non-VSAM file, then there’s an iterative process of determining which buffer values fit best. Since most high-level languages support varying buffer values, you can change the BUFNO value on your DD statement and execute the program. Remember to keep your data constant and check the results after each run. At some point, increasing the BUFNO will make little difference.
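As a sketch of the non-VSAM case, the DD statement below shows where the BUFNO override goes; the dataset and program names are hypothetical placeholders, not examples from the source:

```jcl
//* Hypothetical tuning step; dataset and program names are placeholders.
//* Override the buffer count for the high-I/O input file via the DCB
//* BUFNO subparameter, rerun against the same data, and compare elapsed
//* time and I/O counts (e.g., from SMF records) after each change.
//STEP010  EXEC PGM=DAILYRPT
//INFILE   DD DSN=PROD.DAILY.TRANS,DISP=SHR,
//            DCB=BUFNO=20
//REPORT   DD SYSOUT=*
```

Raising BUFNO reduces the number of physical I/O operations for sequential reads, but beyond some value the gains flatten while the job's virtual storage footprint keeps growing, which is why the iterative measure-and-compare approach matters.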
If the candidate is VSAM, then you'll need to determine whether the access pattern is sequential or random. If sequential, you can use an approach similar to the one described previously, varying BUFND instead of BUFNO. However, if the pattern is random, investigate using Local Shared Resource (LSR) buffering instead of Non-Shared Resource (NSR) buffering. The NSR methodology optimizes sequential data access by assuming the next read will involve the next record in the file. With random access, this assumption is wrong and unnecessary I/O is performed. The LSR buffering methodology assumes random access and optimizes the I/O accordingly.
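The two VSAM cases can be sketched in JCL as follows; the dataset names are hypothetical. For sequential access, BUFND can be overridden through the AMP parameter. For random access, one way to obtain LSR-style buffering without changing the program is System-Managed Buffering's ACCBIAS subparameter, which requires an SMS-managed, extended-format data set:

```jcl
//* Hypothetical examples; dataset names are placeholders.
//* Sequential VSAM access: raise the data buffer count via AMP.
//SEQIN    DD DSN=PROD.CUST.KSDS,DISP=SHR,
//            AMP=('BUFND=30')
//* Random VSAM access: request direct-optimized (LSR-like) buffering
//* through System-Managed Buffering. Requires an SMS-managed,
//* extended-format data set.
//RNDIN    DD DSN=PROD.CUST.KSDS,DISP=SHR,
//            AMP=('ACCBIAS=DO')
```

The Batch LSR subsystem (BLSR) is another route to LSR buffering for programs originally coded for NSR; which mechanism fits depends on the data set's attributes and the installation's setup.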