What is early kdump support?
In previous versions of CentOS/RHEL (5/6/7), the kdump service started very late in the boot sequence, so information about crashes that happened early in boot was lost. Starting with CentOS/RHEL 8, a new kdump mechanism called “early kdump support” was introduced to tackle this issue. Early kdump stores the vmlinuz and initramfs of the crash kernel inside the initramfs of the booting kernel and loads them directly into the reserved memory (crashkernel) during the early boot stage.
The “kexec-tools” package now ships a dracut module, earlykdump, made up of two scripts. It loads the crash kernel and initramfs as early as possible during the boot sequence so that a crash dump of the booting kernel can be captured.
/usr/lib/dracut/modules.d/99earlykdump/early-kdump.sh
/usr/lib/dracut/modules.d/99earlykdump/module-setup.sh
# dracut --list-modules | grep earlykdump
earlykdump
By default, early kdump support is disabled and has to be enabled manually. It supports all the dump targets and configuration parameters supported by the earlier kdump configurations in CentOS/RHEL 5, 6, and 7.
Configure kdump service
1. Configure kdump first (see the kdump configuration post), then enable the kdump service and ensure that it is running:
# systemctl enable --now kdump.service
# systemctl status kdump.service
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since Mon 2019-08-19 23:42:11 IST; 16h ago
 Main PID: 1255 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 26213)
   Memory: 0B
   CGroup: /system.slice/kdump.service

Aug 19 23:42:09 systemd[1]: Starting Crash recovery kernel arming...
Aug 19 23:42:11 kdumpctl[1255]: Kdump already running: [WARNING]
Aug 19 23:42:11 systemd[1]: Started Crash recovery kernel arming.
2. Verify that the earlykdump dracut module is available on the system:
# dracut --list-modules | grep earlykdump
earlykdump
3. Append the rd.earlykdump parameter to the kernelopts line in the /boot/grub2/grubenv file:
# cat /boot/grub2/grubenv
# GRUB Environment Block
saved_entry=4eb68bf18e86437d9c957ff4863a3288-4.18.0-80.el8.x86_64
kernelopts=root=/dev/mapper/ol-root ro crashkernel=auto resume=/dev/mapper/ol-swap rd.lvm.lv=ol/root rd.lvm.lv=ol/swap rd.earlykdump
boot_success=0
###################################################################################################
###################################################################################################
###################################################################################################
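The append in step 3 can be sketched as a small helper that adds the parameter only when it is missing, so repeated runs do not duplicate it. This is a minimal sketch, not the article's method: add_kernel_arg is a hypothetical helper name, and the commented grub2-editenv line is an assumption about how the value might be written back as root. Editing grubenv in a text editor risks changing the size of the fixed-size environment block, which is why a tool is suggested here.

```shell
#!/bin/bash
# Hypothetical helper: append a kernel argument to a kernelopts string
# only if that argument is not already present.
add_kernel_arg() {
    local opts="$1" arg="$2"
    case " $opts " in
        *" $arg "*) printf '%s\n' "$opts" ;;       # already there, leave unchanged
        *)          printf '%s\n' "$opts $arg" ;;  # append once
    esac
}

# kernelopts value from the grubenv listing above, without rd.earlykdump
current='root=/dev/mapper/ol-root ro crashkernel=auto resume=/dev/mapper/ol-swap rd.lvm.lv=ol/root rd.lvm.lv=ol/swap'
updated=$(add_kernel_arg "$current" rd.earlykdump)
echo "kernelopts=$updated"

# Assumption: on a real system, as root, the new value could be stored with
#   grub2-editenv - set "kernelopts=$updated"
```

Calling add_kernel_arg a second time on the updated string returns it unchanged.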
Re-create initramfs
1. The next step is to re-create the initramfs so that it includes the earlykdump module:
# lsinitrd | grep -i early
# dracut -f --add earlykdump
For example:
# lsinitrd | grep -i early
Arguments: -f --add 'earlykdump'
earlykdump
-rwxr-xr-x 1 root root 1940 Jun 17 10:29 usr/lib/dracut/hooks/cmdline/00-early-kdump.sh
2. Reboot the system to apply the changes:
# reboot
3. Once the server is back online, check the status of early-kdump:
# journalctl -x | grep -i early-kdump
Aug 20 16:08:09 [HOSTNAME] dracut-cmdline[196]: early-kdump is enabled.
Aug 20 16:08:10 [HOSTNAME] dracut-cmdline[196]: kexec: loaded early-kdump kernel
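As a complement to the journal check, the following sketch looks at two kernel-side indicators: whether rd.earlykdump is on the running kernel's command line, and whether a crash kernel is currently loaded. The function name check_early_kdump and its optional file arguments are hypothetical and exist only so the check can be tried against sample files. A loaded crash kernel does not prove it was loaded early, since kdump.service also loads one later in boot, so the journal messages remain the direct evidence.

```shell
#!/bin/bash
# Hypothetical helper: report whether early kdump looks active.
# $1: kernel command line file (default /proc/cmdline)
# $2: crash-kernel state file  (default /sys/kernel/kexec_crash_loaded)
check_early_kdump() {
    local cmdline_file="${1:-/proc/cmdline}"
    local loaded_file="${2:-/sys/kernel/kexec_crash_loaded}"
    local status=0

    # The parameter must appear as a whole word on the command line.
    if grep -qwF 'rd.earlykdump' "$cmdline_file" 2>/dev/null; then
        echo "rd.earlykdump: present on kernel command line"
    else
        echo "rd.earlykdump: missing from kernel command line"
        status=1
    fi

    # The state file contains 1 when a crash kernel is loaded.
    if [ "$(cat "$loaded_file" 2>/dev/null)" = "1" ]; then
        echo "crash kernel: loaded"
    else
        echo "crash kernel: not loaded"
        status=1
    fi

    return "$status"
}

check_early_kdump || echo "early kdump does not appear to be active"
```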
Testing early-kdump
Now let’s test early-kdump with a custom systemd unit file that triggers a kernel panic through SysRq.
1. Create a unit file named /etc/systemd/system/test_early_kdump.service.
# touch /etc/systemd/system/test_early_kdump.service
2. Provide appropriate permissions:
# chmod 664 /etc/systemd/system/test_early_kdump.service
The unit file should look as below:
# cat /etc/systemd/system/test_early_kdump.service
[Unit]
Description=test_early_kdump Service
Before=kdump.service

[Service]
ExecStart=/usr/local/test_early_kdump.sh
Type=simple

[Install]
WantedBy=default.target
3. Then create the script /usr/local/test_early_kdump.sh, which issues the SysRq crash command:
# cat /usr/local/test_early_kdump.sh
#!/bin/bash
/usr/bin/echo c > /proc/sysrq-trigger
4. Provide executable permission for the script:
# chmod +x /usr/local/test_early_kdump.sh
5. Reload the systemd daemon:
# systemctl daemon-reload
6. Enable the test_early_kdump service so that it starts at boot:
# systemctl enable test_early_kdump.service
7. Reboot the system:
# reboot
8. After the test, disable and remove the custom unit and script files. Because the service would trigger a crash again on a normal boot, boot the system in rescue mode using ‘systemd.unit=rescue.target’ and disable the ‘test_early_kdump’ service:
# systemctl disable test_early_kdump.service
The above command disables the custom unit file, so the system will boot normally next time.
9. Remove the custom unit file and the crash script now that the test crash is complete:
# rm /etc/systemd/system/test_early_kdump.service
rm: remove regular file '/etc/systemd/system/test_early_kdump.service'? y
# rm /usr/local/test_early_kdump.sh
10. Check for the vmcore under /var/crash/, the path configured in kdump.conf:
# ls -l /var/crash/127.0.0.1-2019-08-20-17:09:23
total 56648
-rw-------. 1 root root 57959829 Aug 20 17:09 vmcore
-rw-r--r--. 1 root root    41452 Aug 20 17:09 vmcore-dmesg.txt
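When several crash directories accumulate under /var/crash, it can help to locate the newest vmcore without reading timestamps by eye. The sketch below assumes the layout shown above, one subdirectory per crash with a vmcore file inside. The function name latest_vmcore is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the most recently modified vmcore under a
# crash directory laid out as <dir>/<host>-<timestamp>/vmcore.
# $1: crash directory (default /var/crash)
latest_vmcore() {
    local crash_dir="${1:-/var/crash}"
    local newest="" f
    for f in "$crash_dir"/*/vmcore; do
        [ -e "$f" ] || continue            # skip the unexpanded glob when nothing matches
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"                    # keep the file with the newest mtime
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

latest_vmcore /var/crash || echo "no vmcore found under /var/crash"
```

The function returns a non-zero status when no vmcore exists, which makes it usable in scripts that should only proceed after a successful dump.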