The linker’s responsibility is to merge several ELF object files into an executable or library. In the simplest case, this involves appending the code and data sections of the input files to each other and adjusting the referenced addresses accordingly. Linking can occur at compile time (static linking) or at run time (dynamic linking):
• Static linking: The object files are merged. The resulting executable or object file contains the code and data of all the input files. However, to keep the resulting binary as small as possible, the linker can skip input files that don’t contain symbols needed by one of the other files. A static library is an archive containing several object files (.o). If the static variant of a library (the .a file) is used, the resulting binary will contain all the code needed from that library and can run without the library on the target system. Although this may be an advantage in some cases, dynamic linking is usually preferred.
• Dynamic linking: To link dynamically, a dynamic shared object (shared library) is needed. These objects are built in a specific way that allows their code sections to be relocated to a different position in memory without being modified. Relocation normally requires the absolute addresses in the code section to be adjusted so they remain correct. This is avoided by building the code with the -fPIC/-fpic option, which forces GCC to not use absolute addresses in the generated code.
During the link step, the linker searches for the needed symbols in the shared library and records the required libraries in the resulting binary. The linker doesn’t actually copy code from the shared library into the binary. The shared library must be present at link time, and the same or a compatible (possibly newer) version must be available on the system where the resulting binary is executed.
Advantages of dynamic linking include:
• The resulting binary is small since it contains only the code of the non-library functions.
• The code section of a shared library can actually be shared between processes running in parallel. This is extremely important, considering standard libraries (such as the GNU C library), which almost every executable needs. Regardless of how many processes are running, the code section of this big library will occupy memory only once.
• Shared library code is brought into memory on demand. If a particular run of the binary (for particular input values) never calls into a library, or touches only part of it, the unused code never needs to be read into physical memory; only the shared library code actually needed will occupy memory.
• Bugfixes without relinking. Installing an updated or fixed version of a shared library will affect the next execution of the binary without requiring any other steps.
Compiler Options and Optimization
GCC provides many options to control its behavior. Besides several general options, there are also options specific to System z (see Figure 5). Some of these have an impact on the performance of the generated code. A summary of the performance improvements achieved between 1999 and 2007 (see Figure 6) suggests that it’s of particular value to exploit the System z-specific optimization options. All data given in Figure 6 was drawn from the latest System z model available in the particular year and was normalized. Whenever the measurements moved to a new System z model, overlapping measurements on both models were used for scaling. See Edelsohn et al., “Contributions to the GNU Compiler Collection” (fully cited previously) to learn how the measurements were conducted.
A detailed description of all GCC options is found in the “Using GCC” manual. This manual provides a section exclusively devoted to System z-specific options; we won’t describe these options in detail here. Also, it’s likely that additional compiler options will be provided for new System z models and improvements might be offered for existing models. Check the most recent version of the manual for details.
Two performance-relevant options specify what processor type to generate code for:
• -march=cpu-type: Exploit all instructions provided by cpu-type, including instructions that aren’t available on older models. Code compiled with this option typically won’t execute on older CPUs. See the GCC homepage for a list of supported CPU types.
• -mtune=cpu-type: Schedule instructions to best exploit the internal structure of cpu-type. This option will never introduce any incompatibility with older CPUs, but code tuned for a different CPU might run slower.
When you specify a cpu-type using the -march option, GCC’s default behavior is to perform an -mtune optimization for the same cpu-type. However, it’s possible to specify different values for -mtune and -march. If in doubt, consider this strategy:
• Decide what the oldest model is that you need to support. Specify this model as an argument of -march. This will cause the compiler to fully exploit the instruction set of that model. The code is then not executable on older models but will exploit at least a subset of the instructions provided by later models.
• Decide what your most important model will be (i.e., on what model the most workload will be run, or what model your largest customers use). Specify this model as the argument of the -mtune option. This will cause the compiler to order the instructions so the pipelines of the specified model are best exploited. This order may or may not be perfect for other models, but apart from a slight performance impact, the -mtune option will never introduce any incompatibility.
Further System z-specific code generation options include:
• -mzarch and -mesa: Generate code exploiting the instruction set of z/Architecture or ESA/390, respectively. See the GCC homepage for default values and the interaction with other options.
• -m64 and -m31: Control whether the generated code complies with the Linux for S/390 Application Binary Interface (ABI) (-m31) or with the Linux for System z ABI (-m64). See the ABI resources previously mentioned for more details about ABIs.
Debugging and Reliability
Programs don’t always work as expected. Fortunately, the GCC-based tool chain and Linux provide a rich collection of debugging tools:
• The GNU debugger GDB (visit www.gnu.org/software/gdb/) is quite powerful.
• The Data Display Debugger (see www.gnu.org/software/ddd/) is a graphical front-end for GDB and can visualize pointer-linked data structures.
Several debugging tools focus on analyzing bugs related to memory references. For example, Electric Fence (http://directory.fsf.org/project/ElectricFence/) is a neat tool and available on many platforms, including System z. Its capability is, however, limited to dynamically allocated memory.
A powerful feature is available for Linux running under VM. VM’s TRACE command offers a convenient and powerful way to debug the whole Linux system (see z/VM CP Command and Utility Reference, SC24-5967). An introduction to debugging is available in Chapter 22 of Linux on the Mainframe (John Eilert, Maria Eisenhaendler, Dorothea Matthaeus, Ingolf Salm, Prentice Hall, 2003, ISBN 0131014153). A detailed description of debugging under Linux is provided in the file /usr/src/linux/Documentation/s390/Debugging390.txt.
For debugging more difficult problems, information about register usage, stack frame layout, and other conventions is found in the “ELF Application Binary Interface Supplement.”
If a bug is related to the translation process itself, GCC and its related tools provide several options for debugging. The --save-temps option causes all major intermediate files to be saved instead of being removed when the translation is complete. Unfortunately, heavy compiler optimization makes it difficult for the debugger to depict the correlation between the original source code and the machine code. To narrow down such problems, almost all optimization passes can be activated and deactivated separately. See the GCC manual for details.
GCC is highly reliable. To a large degree, this reflects its development process. Before a source code change is added to the official repository, the modified compiler must be able to translate itself, which is an excellent test in its own right. In addition, the new compiler must pass the GCC regression test suite. This is a collection of small test cases systematically covering all language constructs; it also contains source code that suffered from past compiler bugs, and everything else that compiler developers considered worthy of inclusion in the test suite. For the languages C and C++ and the related libraries, the number of test cases totals nearly 100,000; this makes GCC a compiler you can rely on.