When the kdump crash dumping mechanism is enabled, the system is booted from the context of another kernel. This second kernel reserves a small amount of memory and its only purpose is to capture the core dump image in case the system crashes.
Being able to analyze the core dump significantly helps to determine the exact cause of the system failure, and it is therefore strongly recommended to have this feature enabled. This chapter explains how to configure, test, and use the kdump service in Red Hat Enterprise Linux, and provides a brief overview of how to analyze the resulting core dump using the crash debugging utility.
Installing the kdump Service
To use the kdump service on your system, make sure you have the kexec-tools package installed. To install it, type the following at a shell prompt as root:
# yum install kexec-tools
Configuring the kdump Service
Configuring the Memory Usage
To configure the amount of memory to be reserved for the kdump kernel, as root, edit the /boot/grub/grub.conf file and add crashkernel=[size]M or crashkernel=auto to the list of kernel options. Note that the crashkernel=auto option only reserves the memory if the physical memory of the system is equal to or greater than:
- 2 GB on 32-bit and 64-bit x86 architectures
A sample /boot/grub/grub.conf file
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/sda3
#          initrd /initrd
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.32-220.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-220.el6.x86_64 ro root=/dev/sda3 crashkernel=128M
        initrd /initramfs-2.6.32-220.el6.x86_64.img
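After rebooting with the modified kernel line, it can be useful to confirm that the running kernel actually received the crashkernel option. The following is a minimal sketch, not part of the kdump tooling: the function name crashkernel_value is hypothetical, and it only inspects a kernel command line string such as the contents of /proc/cmdline.

```shell
# Print the value of the crashkernel= option found in a kernel command line.
# Returns a non-zero status if the option is not present.
# (crashkernel_value is a hypothetical helper name, not a kdump command.)
crashkernel_value() {
    value=$(printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | head -n 1)
    [ -n "$value" ] && printf '%s\n' "$value"
}

# Example use on a running system:
#   crashkernel_value "$(cat /proc/cmdline)" || echo "crashkernel= is not set"
```

With the sample configuration above, the function would be expected to report 128M once the system has been rebooted.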
Configuring the Target Type
When a kernel crash is captured, the core dump can be stored as a file in a local file system, written directly to a device, or sent over a network using the NFS (Network File System) or SSH (Secure Shell) protocol. Only one of these options can be set at a time, and the default is to store the vmcore file in the /var/crash/ directory of the local file system. To change this, as root, open the /etc/kdump.conf configuration file in a text editor and edit the options as described below.
To change the local directory in which the core dump is to be saved, remove the hash sign (“#”) from the beginning of the #path /var/crash line, and replace the value with the desired directory path. Optionally, if you wish to write the file to a different partition, follow the same procedure with the #ext4 /dev/sda3 line as well, and change both the file system type and the device (a device name, a file system label, and UUID are all supported) accordingly. For example:
ext3 /dev/sda4
path /usr/local/cores
To write the dump directly to a device, remove the hash sign (“#”) from the beginning of the #raw /dev/sda5 line, and replace the value with the desired device name.
To store the dump on a remote machine using the NFS protocol, remove the hash sign (“#”) from the beginning of the #net my.server.com:/export/tmp line, and replace the value with a valid host name and directory path.
To store the dump on a remote machine using the SSH protocol, remove the hash sign (“#”) from the beginning of the #net email@example.com line, and replace the value with a valid user name and host name.
When transferring a core file to a remote target over SSH, the core file needs to be serialized for the transfer. This creates a vmcore.flat file in the /var/crash/ directory on the target system, which is unreadable by the crash utility. To convert vmcore.flat to a dump file that is readable by crash, run the following command as root on the target system:
# /usr/sbin/makedumpfile -R /tmp/vmcore-rearranged < vmcore.flat
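As an illustration of the raw device, NFS, and SSH targets described above, the relevant part of /etc/kdump.conf might look as follows. This is only a sketch: every device, host, and user name in it is a hypothetical placeholder, and because only one target can be set at a time, the alternatives are left commented out.

```shell
# /etc/kdump.conf: enable exactly one target directive.
# All device, host, and user names below are hypothetical placeholders.

# Write the dump directly to a raw device:
raw /dev/sdb1

# Or send it to an NFS export (host:/directory):
#net nfs.example.com:/export/cores

# Or send it over SSH (user@host):
#net dumpuser@ssh.example.com
```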
Configuring the Core Collector
To reduce the size of the vmcore dump file, kdump allows you to specify an external application (that is, a core collector) to compress the data, and optionally leave out all irrelevant information. Currently, the only fully supported core collector is makedumpfile.
To enable the core collector, as root, open the /etc/kdump.conf configuration file in a text editor, remove the hash sign (“#”) from the beginning of the #core_collector makedumpfile -c --message-level 1 -d 31 line, and edit the command-line options as described below.
To enable the dump file compression, add the -c parameter. For example:
core_collector makedumpfile -c
To remove both zero and free pages, use the following:
core_collector makedumpfile -d 17 -c
Refer to the man page for makedumpfile for a complete list of available options.
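The value passed to -d is a sum of bit flags, one per page type to leave out of the dump, which is why 17 (1 + 16) removes zero and free pages and 31 removes every excludable type. The sketch below shows the arithmetic. The function name dump_level and the short page type labels are hypothetical, while the bit values follow the dump level table in the makedumpfile man page.

```shell
# Compute a makedumpfile dump level from the page types to exclude.
# Bit values: 1 zero pages, 2 cache pages, 4 private cache pages,
#             8 user data pages, 16 free pages.
# (dump_level and the labels below are hypothetical names for illustration.)
dump_level() {
    level=0
    for kind in "$@"; do
        case $kind in
            zero)          level=$((level | 1)) ;;
            cache)         level=$((level | 2)) ;;
            cache-private) level=$((level | 4)) ;;
            user)          level=$((level | 8)) ;;
            free)          level=$((level | 16)) ;;
            *) echo "unknown page type: $kind" >&2; return 1 ;;
        esac
    done
    echo "$level"
}

# Example: dump_level zero free
```

The result can then be placed after -d on the core_collector line.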
Changing the Default Action
By default, when kdump fails to create a core dump, the root file system is mounted and /sbin/init is run. To change this behavior, as root, open the /etc/kdump.conf configuration file in a text editor, remove the hash sign (“#”) from the beginning of the #default shell line, and replace the value with a desired action as described below:
|Option|Action|
|reboot|Reboot the system, losing the core in the process.|
|halt|Halt the system.|
|poweroff|Power off the system.|
|shell|Run the msh session from within the initramfs, allowing a user to record the core manually.|
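Because a mistyped action is easy to overlook until a crash actually occurs, it may be worth checking the configured value against the list above. The following sketch assumes that the directive appears on a line of its own starting with the word default, and that the last such line is the one that matters. The function name kdump_default_action is hypothetical.

```shell
# Print the default action configured in a kdump.conf-style file, or fail if
# it is missing or not one of the actions listed in the table.
# (kdump_default_action is a hypothetical helper; taking the last "default"
# line in the file is an assumption made for this sketch.)
kdump_default_action() {
    conf=${1:-/etc/kdump.conf}
    action=$(sed -n 's/^default[[:space:]]\{1,\}\([^[:space:]]*\).*/\1/p' "$conf" | tail -n 1)
    case $action in
        reboot|halt|poweroff|shell) printf '%s\n' "$action" ;;
        '') echo "no default action is set in $conf" >&2; return 1 ;;
        *)  echo "unsupported default action: $action" >&2; return 1 ;;
    esac
}

# Example: kdump_default_action /etc/kdump.conf
```

Commented-out lines such as #default shell are ignored, so the function reports only an action that is actually in effect.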
Enabling the Service
To start the kdump daemon at boot time, type the following at a shell prompt as root:
# chkconfig kdump on
This will enable the service for runlevels 2, 3, 4, and 5. Similarly, typing "chkconfig kdump off" will disable it for all runlevels. To start the service in the current session, use the following command as root:
# service kdump start
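To confirm that the service is enabled for the expected runlevels, the output of chkconfig --list kdump can be inspected. The sketch below assumes the usual one-line output format, in which each runlevel appears as a number, a colon, and either on or off. The function name kdump_enabled_in_runlevels_2_to_5 is hypothetical.

```shell
# Succeed only if a "chkconfig --list" line shows runlevels 2, 3, 4, and 5
# as "on". The line format is an assumption, for example:
#   kdump  0:off  1:off  2:on  3:on  4:on  5:on  6:off
# (kdump_enabled_in_runlevels_2_to_5 is a hypothetical helper name.)
kdump_enabled_in_runlevels_2_to_5() {
    line=$1
    for rl in 2 3 4 5; do
        case $line in
            *"$rl:on"*) ;;      # this runlevel is enabled, keep checking
            *) return 1 ;;      # missing or "off"
        esac
    done
}

# Example use on a running system:
#   kdump_enabled_in_runlevels_2_to_5 "$(chkconfig --list kdump)" && echo enabled
```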
Testing the Configuration
The commands below will cause the kernel to crash. Use caution when following these steps, and by no means use them on a production machine.
To test the configuration, reboot the system with kdump enabled, and make sure that the service is running:
# service kdump status
Kdump is operational
Then type the following commands at a shell prompt:
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
This will force the Linux kernel to crash, and the address-YYYY-MM-DD-HH:MM:SS/vmcore file will be copied to the location you have selected in the configuration (that is, to /var/crash/ by default).
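Once the system has come back up, the newest dump can be located with a short helper such as the one below. This is a sketch that assumes a local file system target. The function name latest_vmcore is hypothetical, and the directory defaults to /var/crash when no argument is given.

```shell
# Print the path of the most recently modified vmcore file under a directory.
# Prints nothing if no vmcore file is found.
# (latest_vmcore is a hypothetical helper name, not a kdump command.)
latest_vmcore() {
    dir=${1:-/var/crash}
    # ls -t sorts the files found by modification time, newest first.
    find "$dir" -type f -name vmcore -exec ls -t {} + 2>/dev/null | head -n 1
}

# Example: latest_vmcore /var/crash
```

An empty result after a test crash suggests that the dump was not written to the expected location and that the target configuration should be reviewed.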