The Linux kernel code is stable, but even the best kernel hackers are only human and make mistakes. So, while kernel crashes are rare, they can occur and are unpleasant events: all services the machine provides are interrupted, and the system must be rebooted. A kernel dump containing the state of the crashed system is often the only way to find the cause of such a crash.
When a user-space crash occurs, a core dump is written containing memory and register contents at the time of the crash. Writing such core dumps is possible because the Linux kernel is still fully operational. This is clearly more difficult when the kernel itself crashes: either the dying kernel must dump itself, or some other program independent of the kernel must perform that task.
This article reviews Linux kernel dump methods, describes the current Kdump process, compares System z dump tools, and offers an introduction to Linux dump analysis tools.
The Linux Kernel Crash Dump (LKCD) project implemented one of the first Linux kernel dump methods. However, Linus Torvalds never accepted those patches into the Linux kernel, because the currently active kernel was responsible for creating the dump. This meant the code creating the dump relied on kernel infrastructure that could have been affected by the original kernel problem. For example, if the kernel crashed because of a disk driver failure, a successful LKCD dump was unlikely because that same code was also needed to write the dump. The LKCD project is no longer active; the last LKCD kernel patch was released for Linux 2.6.10.
Diskdump and Netdump were other Linux dump mechanisms; both had problems similar to LKCD's and were never accepted into the mainline kernel.
For Linux on System z, IBM developers used another approach: standalone dump tools. When a kernel crash occurs, the standalone dump tool is started and loads into the first 64KB of memory, which Linux doesn’t use. Available since 2001, this functionality writes the memory and register information to a separate DASD partition or to a channel-attached tape device. z/VM also supports VMDUMP, a hypervisor dump method.
Kdump, developed after LKCD and similar approaches failed to gain acceptance, uses a completely separate kernel to write the dump. With Kdump, the first (production) kernel reserves some memory for a second (Kdump) kernel; depending on the architecture, currently 128MB to 256MB are reserved. The second kernel is loaded into the reserved memory region; if the first kernel crashes, kexec boots the second kernel from memory, which then writes the dump file. Kdump was accepted upstream for Linux 2.6.13 in 2005; Red Hat Enterprise Linux 5 (RHEL5) and SUSE Linux Enterprise Server 10 (SLES 10) were the first distributions to include it.
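In practice, setting up Kdump involves two steps: reserving memory on the production kernel's command line, and loading the second kernel into that reservation with kexec. The sketch below illustrates both; the file names, reservation size, and append parameters are illustrative, not taken from this article:

```shell
# 1) Reserve memory for the Kdump kernel on the production kernel's
#    boot command line (the size is architecture-dependent), e.g.:
#
#        crashkernel=256M
#
# 2) After boot, load the dump kernel into the reserved region so that
#    kexec boots it automatically on a panic (-p loads a panic kernel).
#    File names here are illustrative:
kexec -p /boot/vmlinuz-kdump \
      --initrd=/boot/initrd-kdump \
      --append="root=/dev/sda1 maxcpus=1 irqpoll"
```

Distributions that support Kdump typically wrap these steps in a service script, so administrators only configure the crashkernel parameter and the dump target.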
Kdump is supported on the i686, x86_64, ia64, and ppc64 architectures. Depending on the architecture, the first and second kernels may or may not be the same. When the second kernel gets control, it runs in the reserved memory and doesn’t alter the rest of memory. It then exports all memory from the first kernel to user space with two virtual files: /dev/oldmem and /proc/vmcore. The /proc/vmcore file is in Executable and Linkable Format (ELF) core dump format and contains memory and CPU register information. An init script (see Figure 1) tests whether /proc/vmcore exists and copies the contents into a local file system or sends it to a remote host using scp. After the dump is saved, the first kernel is started again using the reboot command.
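A minimal sketch of such an init script might look like the following. The dump directory and the scp destination are illustrative assumptions; the scripts shipped by distributions are considerably more elaborate:

```shell
#!/bin/sh
# Sketch of a Kdump save script run by init in the second (Kdump) kernel.
# Paths and the remote host are illustrative assumptions.

VMCORE=/proc/vmcore                       # ELF core exported by the Kdump kernel
DUMPDIR=/var/crash/$(date +%Y-%m-%d-%H%M)

if [ -e "$VMCORE" ]; then
    mkdir -p "$DUMPDIR"
    # Save the dump to the local file system; alternatively send it to a
    # remote host, e.g.: scp "$VMCORE" user@remotehost:/var/crash/
    cp "$VMCORE" "$DUMPDIR/vmcore"
    # Restart the production kernel once the dump is saved
    reboot
fi
```

Because /proc/vmcore exists only when the Kdump kernel is running, the same script can safely be part of the normal boot sequence: on a regular boot it is a no-op.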