This post explains how to analyse a segfault message in the messages file and how to identify whether the problem lies on the application side or the operating system side.
What is “segfault”?
A segmentation fault (often shortened to segfault), or access violation, is a fault raised by hardware with memory protection to notify the operating system (OS) of a memory access violation. The Linux kernel responds by performing some corrective action, generally passing the fault on to the offending process by sending it a signal (signal 11, SIGSEGV). A process can in some cases install a custom signal handler that lets it recover on its own; otherwise the Linux default signal handler is used. A segfault generally causes the process to be terminated, and a core dump is generated if the core file size limit (ulimit -c) allows it.
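The signal delivery described above can be observed from a parent process. The following is a minimal Python sketch, not a diagnostic tool: the child only simulates the fault by sending SIGSEGV to itself, and the names CHILD_CODE, terminating_signal and core_dump_limit are made up for this illustration.

```python
import resource
import signal
import subprocess
import sys

# Child program: disable core dumps for itself, then send itself SIGSEGV.
# This simulates the signal only; no invalid memory access takes place,
# so the kernel does not log a "segfault at ..." line for it.
CHILD_CODE = (
    "import os, resource, signal;"
    "resource.setrlimit(resource.RLIMIT_CORE, (0, 0));"
    "os.kill(os.getpid(), signal.SIGSEGV)"
)


def terminating_signal(cmd):
    """Run cmd and return the name of the signal that killed it, or None."""
    returncode = subprocess.run(cmd).returncode
    # subprocess reports death by signal N as a return code of -N.
    return signal.Signals(-returncode).name if returncode < 0 else None


def core_dump_limit():
    """Return the (soft, hard) core file size limits of this process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print(terminating_signal([sys.executable, "-c", CHILD_CODE]))
    print(core_dump_limit())
```

A soft limit of 0 from core_dump_limit() means no core file will be written for processes that inherit that limit; this is the value that ulimit -c reports in a shell.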
How to check?
1. What a segfault signifies
A segfault typically signifies an error in one particular process or program, not an error in the Linux kernel. The kernel merely detects the error in the process or program and (on some architectures) prints information to the log, as below:
kernel: login[118125]: segfault at 0 ip 00007f4e4d5334a8 sp 00007fffe9177d60 error 15 in pam_unity_uac.so[7f4e4d530000+b000]
kernel: crond[16398]: segfault at 14 ip 00007fd612c128f2 sp 00007fff6a689010 error 4 in pam_seos.so[7fd612baf000+f5000]
kernel: crond[17719]: segfault at 14 ip 00007fd612c128f2 sp 00007fff6a689010 error 4 in pam_seos.so[7fd612baf000+f5000]
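Log lines in this format can be split into fields with a regular expression. The sketch below is one possible approach in Python, assuming the x86-64 layout shown above; SEGFAULT_RE and parse_segfault are names invented for this example. All numeric fields except the PID are printed by the kernel in hexadecimal, and the bracketed pair after the object name is the base address and size of the mapping that contains the instruction pointer.

```python
import re

# Pattern for the x86-64 "segfault at ..." kernel log line.
# The " in object[base+size]" part is optional, and the closing bracket
# is tolerated as missing in case the line was truncated.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]?)?"
)

HEX_FIELDS = ("addr", "ip", "sp", "error", "base", "size")


def parse_segfault(line):
    """Return a dict of fields from a segfault log line, or None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    for name in HEX_FIELDS:
        if fields[name] is not None:
            fields[name] = int(fields[name], 16)  # printed in hex by the kernel
    return fields


if __name__ == "__main__":
    sample = (
        "kernel: login[118125]: segfault at 0 ip 00007f4e4d5334a8 "
        "sp 00007fffe9177d60 error 15 in pam_unity_uac.so[7f4e4d530000+b000]"
    )
    print(parse_segfault(sample))
```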
2. What do the details of this message mean?
The value after "at" is the memory address whose access caused the fault. The ip value is the instruction pointer register (RIP) and the sp value is the stack pointer register (RSP) at the time of the fault. The error value is a bit mask of page fault error code bits (from arch/x86/mm/fault.c):
 * bit 0 == 0: no page found        1: protection fault
 * bit 1 == 0: read access          1: write access
 * bit 2 == 0: kernel-mode access   1: user-mode access
 * bit 3 ==                         1: use of reserved bit detected
 * bit 4 ==                         1: fault was an instruction fetch
Here is the definition of the error bits:
enum x86_pf_error_code {
        PF_PROT  = 1 << 0,
        PF_WRITE = 1 << 1,
        PF_USER  = 1 << 2,
        PF_RSVD  = 1 << 3,
        PF_INSTR = 1 << 4,
};
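The kernel prints the error value in hexadecimal (the format string in arch/x86/mm/fault.c uses %lx), so error 15 is 0x15, which is 10101 in binary. Each bit can then be read off as follows:
10101
^^^^^
||||+---> bit 0
|||+----> bit 1
||+-----> bit 2
|+------> bit 3
+-------> bit 4
Bits 0, 2 and 4 are set, so this message indicates that the application triggered a protection fault (bit 0) in user mode (bit 2) on an instruction fetch (bit 4): the process tried to execute code from memory it is not permitted to execute. It was not a write (bit 1 is clear), and no reserved bit was involved (bit 3 is clear). By the same reading, error 4 in the crond lines has only bit 2 set: a user-mode read of an address with no page mapped.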
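The decoding can also be done mechanically. The following Python sketch mirrors the x86_pf_error_code bits listed earlier and reuses the wording of the bit table; the class and function names are invented for this example. It treats the logged value as hexadecimal, because the kernel prints it with %lx.

```python
from enum import IntFlag


class PageFaultError(IntFlag):
    """Mirror of the five x86_pf_error_code bits quoted in this post."""
    PROT = 1 << 0
    WRITE = 1 << 1
    USER = 1 << 2
    RSVD = 1 << 3
    INSTR = 1 << 4


def describe_error(error_field):
    """Decode the 'error' field of a segfault log line into plain phrases.

    error_field is the text exactly as logged, e.g. "15" or "4".
    """
    code = int(error_field, 16)  # the kernel prints this value in hex
    phrases = [
        "protection fault" if code & PageFaultError.PROT else "no page found",
        "write access" if code & PageFaultError.WRITE else "read access",
        "user-mode access" if code & PageFaultError.USER else "kernel-mode access",
    ]
    if code & PageFaultError.RSVD:
        phrases.append("use of reserved bit detected")
    if code & PageFaultError.INSTR:
        phrases.append("fault was an instruction fetch")
    return phrases


if __name__ == "__main__":
    for field in ("15", "4"):
        print(field, "->", ", ".join(describe_error(field)))
```

Bit 1 only distinguishes reads from writes, so an instruction fetch is reported as a read access with bit 4 set on top of it.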