Linux OS Service ‘kdump’

This is a script which configures kdump (kernel dump). Kdump provides a memory dump into a file named vmcore when the kernel has a critical issue. Vmcore is often required to investigate the issue. The crash dump is captured from the context of a freshly-booted kernel, not from the context of the crashed kernel. Kdump uses kexec to boot into a second kernel whenever the system crashes. Kexec is a fast-boot mechanism which allows rebooting a new Linux kernel from the context of a running kernel without going through any firmware or warm start.

Service Control

To manage the kdump service for future shutdowns and reboots, use the chkconfig tool:

# chkconfig kdump on
# chkconfig --list kdump
kdump           0:off   1:off   2:on    3:on    4:on    5:on    6:off

To control the kdump service immediately, use the service tool:

# service kdump
Usage: /etc/init.d/kdump {start|stop|status|restart|propagate}
# /etc/init.d/kdump start
Kdump already running                                      [  OK  ]
# /etc/init.d/kdump stop
Stopping kdump:                                            [  OK  ]

Package name:

kexec-tools-[version]-[arch].rpm
kexec-tools-2.0.0-310.el6.x86_64M

Configuration files

The default /etc/kdump.conf file is given below:

$ cat /etc/kdump.conf
# Configures where to put the kdump /proc/vmcore files
#
# This file contains a series of commands to perform (in order) when a
# kernel crash has happened and the kdump kernel has been loaded.  Directives in
# this file are only applicable to the kdump initramfs, and have no effect if
# the root filesystem is mounted and the normal init scripts are processed
#
# Currently only one dump target and path may be configured at a time. If dump
# to configured dump target fails, the default action will be preformed.
# Default action may be configured with the "default" directive below.
#
# Basics commands supported are:
# path [path]   - Append path to the filesystem device which you are
#     dumping to.  Ignored for raw device dumps.
#      If unset, will default to /var/crash.
#
# core_collector 
#   - This allows you to specify the command to copy the
#     vmcore.  You could use the dump filtering program
#     makedumpfile, the default one, to retrieve your core,
#     which on some arches can drastically reduce core file
#     size. See /usr/sbin/makedumpfile --help for a list of
#     options. Note that the -i and -g options are not
#     needed here, as the initrd will automatically be
#     populated with a config file appropriate for the
#     running kernel.
#     For ssh dump, scp should be used instead of cp.
#
# raw  - Will write /proc/vmcore into raw [partition].
#
# nfs  - Will mount fs and copy /proc/vmcore to
#     [mnt]/[path]/%HOST-%DATE/, supports DNS.
#
# nfs4       - Will use NFSv4 instead of NFSv3
#
# net        - This is a deprecated option to transfer vmcore over
#     nfs.  Use "nfs" option instead.
#
# ssh  - Will copy /proc/vmcore to
#     [user@server]:[path]/%HOST-%DATE/ via SSH,
#     supports DNS. If makedumpfile is the core_collector,
#     it is piped to an "ssh" shell, otherwise use the
#     specified core_collector like scp.
#     NOTE: make sure user has necessary write
#     permissions on server
#
# net [user@server]     - This is a deprecated option to transfer vmcore over
#     ssh.  Use "ssh" option instead.
#
# [fs type] [partition] - Will mount -t [fs type] [partition] /mnt and copy
#      /proc/vmcore to /mnt/[path]/127.0.0.1-%DATE/.
#     NOTE:  can be a device node, label or uuid.
#
# disk_timeout [seconds]
#   - Number of seconds to wait for disks to appear prior
#     to continue to save dump. By default kdump waits
#     180 seconds for the disks to show up it needs. This
#     can be useful in some cases if disk never shows up
#     (Either because disk was removed or because kdump is
#     waiting on wrong disk).
#
# link_delay [seconds]
#   - Some network cards take a long time to initialize, and
#     some spanning tree enabled networks do not transmit
#     user traffic for long periods after a link state
#     changes.  This optional parameter defines a wait
#     period after a link is activated in which the
#     initramfs will wait before attempting to transmit
#     user data.
#
# kdump_post [binary | script]
#    - This directive allows you to run a specified
#      executable just after the memory dump process
#      terminates. The exit status from the dump process
#      is fed to the kdump_post executable, which can be
#      used to trigger different actions for success or
#      failure.
#
# kdump_pre [binary | script]
#   - works just like the kdump_post directive, but instead
#     of running after the dump process, runs immediately
#     before.  Exit status of this binary is interpreted
#     as follows:
#     0 - continue with dump process as usual
#     non 0 - reboot/halt the system
#
# extra_bins [binaries | shell scripts]
#    - This directive allows you to specify additional
#      binaries or shell scripts you'd like to include in
#      your kdump initrd. Generally only useful in
#      conjunction with a kdump_post binary or script that
#      relies on other binaries or scripts.
#
# extra_modules [module(s)]
#    - This directive allows you to specify extra kernel
#      modules that you want to be loaded in the kdump
#      initrd, typically used to set up access to
#      non-boot-path dump targets that might otherwise
#      not be accessible in the kdump environment. Multiple
#      modules can be listed, separated by a space, and any
#      dependent modules will automatically be included.
#      Module name should be specified without ".ko" suffix.
#
# options [module] [option list]
#   - This directive allows you to specify options to apply
#     to modules in the initramfs.  This directive overrides
#     options specified in /etc/modprobe.conf. Module name
#     should be specified without ".ko" suffix.
#
# blacklist [module]
#   - The blacklist keyword indicates that all of that
#     particular modules are to be ignored in the initramfs.
#     General terminology for blacklist has been that module
#     is present in initramfs but it is not actually loaded
#     in kernel. This directive can be specified multiple
#     times or as a space separated list. Module name should
#     be specified without ".ko" suffix.
#
# sshkey [path]
#   - Specifies the path of the ssh identity file you want
#     to use when doing ssh dump. It must be a private key,
#     the default value is /root/.ssh/kdump_id_rsa. When
#     progagating public key, the key is assumed to be
#     identity_file.pub which by default is
#     /root/.ssh/kdump_id_rsa.pub.
#
# default [reboot | halt | poweroff | shell | mount_root_run_init]
#   - Action to preform in case dumping to intended target
#     fails. If no default action is specified, "reboot"
#     is assumed default.
#
#     reboot: If the default action is reboot simply reboot
#      the system and loose the core that you are
#      trying to retrieve.
#     halt:   If the default action is halt, then simply
#      halt the system after attempting to capture
#      a vmcore, regardless of success or failure.
#     poweroff: The system will be powered down
#     shell:  If the default action is shell, then drop to
#      an hush session inside the initramfs from
#      where you can try to record the core manually.
#      Exiting this shell reboots the system.
#      mount_root_run_init: Mount root filesystem and run init. Kdump
#          initscript will try to save dump to root
#          filesystem in /var/crash dir. This will
#          likely require a lot more memory to
#          be reserved for kdump kernel.
#
# debug_mem_level [0-3]
#                       - Turns on debug/verbose output of kdump scripts
#                         regarding free/used memory at various points of
#                         execution. Higher level means more debugging output.
#                         0 - no output
#                         1 - partial /proc/meminfo
#                         2 - /proc/meminfo
#                         3 - /proc/meminfo + /proc/slabinfo
#
# force_rebuild [0 | 1]
#   - By default, kdump initrd only will be rebuilt when
#     necessary. Specify 1 here to force rebuilding kdump
#     initrd every time when kdump service starts.
#
# fence_kdump_args [arg(s)]
#   - Command line arguments for fence_kdump_send (it can contain
#   all valid arguments except hosts to send notification to).
#
# fence_kdump_nodes [node(s)]
#    - List of cluster node(s) separated by space to send fence_kdump
#    notification to (this option is mandatory to enable fence_kdump).


#raw /dev/sda5
#ext4 /dev/sda3
#ext4 LABEL=/boot
#ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937
#net my.server.com:/export/tmp
#net user@my.server.com
path /var/crash
core_collector makedumpfile -c --message-level 1 -d 31
#core_collector scp
#core_collector cp --sparse=always
#extra_bins /bin/cp
#link_delay 60
#kdump_post /var/crash/scripts/kdump-post.sh
#extra_bins /usr/bin/lftp
#disk_timeout 30
#extra_modules gfs2
#options modulename options
#default shell
#debug_mem_level 0
#force_rebuild 1
#sshkey /root/.ssh/kdump_id_rsa
#fence_kdump_args -p 7410 -f auto -c 0 -i 10
#fence_kdump_nodes node1 node2
Related Post