Beginner's Guide to Using “trap” to Catch Signals and Handle Errors in Shell Scripts

Shell Signal Values

A signal is a message that some abnormal event has taken place, or a message requesting that another process do something. A signal is sent from one process to another; typically, a process sends a signal to one of its own subprocesses. You can obtain more information on signals from the signal and kill man pages.

You use the kill command to send signals to processes. The root user can send any signal to any process. Other users can only send signals to processes they own. The following are some signals that can be sent from the keyboard:

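  • Send Signal 2 (INT) by typing Control-C.
  • Send Signal 3 (QUIT) by typing Control-\.
  • Send Signal 24 (TSTP) by typing Control-Z.

Control-S and Control-Q are sometimes listed with these keys, but they are terminal flow-control characters (XOFF and XON) that pause and resume terminal output; they do not send Signal 23 (STOP) or Signal 25 (CONT). Send STOP and CONT with the kill command instead.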

The INT and QUIT signals cause the currently running process associated with the device (console or window) to terminate; QUIT also produces a core dump. The TSTP signal causes the process to stop or suspend execution. The job is stopped and control returns to the shell; the process can be resumed by bringing the job back into the foreground. Because most signals sent to a process executing a shell script cause the script to terminate, the next topic describes how to have a script avoid termination from specified signals.

The Korn shell used in these examples defines 46 signals, counting the EXIT pseudo-signal (0). Each signal has a name and a number associated with it. Typing the ‘kill -l’ command displays this list:

$ kill -l
EXIT   HUP      INT      QUIT     ILL
TRAP   ABRT     EMT      FPE      KILL
BUS    SEGV     SYS      PIPE     ALRM
TERM   USR1     USR2     CLD      PWR
WINCH  URG      POLL     STOP     TSTP
CONT   TTIN     TTOU     VTALRM   PROF
XCPU   XFSZ     WAITING  LWP      FREEZE
THAW   CANCEL   LOST     RTMIN    RTMIN+1
RTMIN+2 RTMIN+3 RTMAX-3  RTMAX-2  RTMAX-1 
RTMAX

The numeric signal values range from 0 (EXIT) through 45 (RTMAX). You can confirm the numeric value of a signal by executing the kill command with the -l option, followed by the signal name; for example:

$ kill -l EXIT 
0
$ kill -l RTMAX 
45
$ kill -l KILL 
9
$ kill -l TSTP 
24

If you know the number and want to learn or confirm the name of the signal, execute the kill command with the -l option, followed by the signal number.

$ kill -l 0 
EXIT
$ kill -l 45 
RTMAX
$ kill -l 9 
KILL
$ kill -l 24 
TSTP
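
The numbers shown above come from the system used for these examples, and many of the higher values (STOP, TSTP, and CONT, for example) differ on other systems. The following is a minimal sketch, assuming a POSIX-style shell such as ksh or bash, that looks up the names of a few low-numbered signals whose values are the same on common UNIX systems. The variable name term_name is only for illustration.

```shell
# Print the name for a few signal numbers that are the same on common systems.
for n in 1 2 3 9 15
do
    printf '%2d  %s\n' "$n" "$(kill -l "$n")"
done

# Keep one lookup in a variable (illustrative name) for later use.
term_name=$(kill -l 15)
echo "Signal 15 is $term_name"
```

Using names such as TERM or TSTP in kill and trap statements, rather than numbers, keeps a script readable and avoids depending on one system's numbering.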

You can send a signal to processes running on the system by using the kill command:

kill -signal pid

where the signal is the signal number or the signal name and pid is the process identification number for the process being sent the signal. You can terminate a process by executing the kill command and using the -9 option. The -9 option sends the KILL signal to the process. You can use either the numeric value or the signal name as shown here.

kill -9 pid 
kill -KILL pid

When you do not specify a signal name or number, the TERM signal, Signal 15, is sent to the process.
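
As a hedged illustration of the kill syntax above, the following sketch starts a background sleep process, sends it the default signal, and then uses kill -l to translate the exit status reported by wait back into a signal name. It assumes a POSIX-style shell (ksh or bash); the variable names are illustrative.

```shell
# Start a long-running process in the background and remember its PID.
sleep 30 &
pid=$!

# No signal is named, so kill sends TERM (signal 15).
kill "$pid"

# wait returns a status that encodes the terminating signal;
# kill -l converts that status back to the signal name.
status=0
wait "$pid" || status=$?
sig_name=$(kill -l "$status")
echo "Process $pid was ended by signal $sig_name"
```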

Catching Signals With the trap Statement

You might want to write a script that you do not want the user to immediately terminate using Control-C, Control-\, or Control-Z. For example, the script might need to do some clean-up before exiting. To avoid having the script exit before the clean-up is done, you use the trap statement in your script to catch these signals.

The syntax for using the trap statement is:

trap 'action' signal [ signal2 ... signalx ]

where:

  • action is a statement or several statements separated by semicolons. If the action is the empty string (''), nothing is executed, but the signal is still “trapped,” which means that it is ignored. Enclose the action within a pair of single quotes.
  • signal is the name or number for the signal to be caught.

For example, if you want to trap the use of Control-C, use the following:

trap 'echo "Control-C not available"' INT

The statements to be executed can be more than one line as long as the closing quote is followed by the signal value or values to be trapped; for example:

trap 'echo "Control-C not available" 
echo "Core dumps not allowed" 
sleep 1 
continue' INT QUIT
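
A trap action can also call a shell function, which keeps longer handlers readable. The sketch below is an illustration under stated assumptions rather than part of the original examples: it uses USR1 and USR2 in place of INT and QUIT so that the script can signal itself with kill instead of waiting for keyboard input, and the names on_signal and last_signal are hypothetical. It is written in portable syntax (echo rather than the ksh print built-in).

```shell
last_signal="none"

# Hypothetical handler: record and report which signal arrived.
on_signal() {
    last_signal=$1
    echo "Caught $1; continuing"
}

# One trap statement per signal so the handler knows which one fired.
trap 'on_signal USR1' USR1
trap 'on_signal USR2' USR2

# Simulate the signals; $$ is the PID of the running script.
kill -USR1 $$
kill -USR2 $$

echo "Last signal handled: $last_signal"

# Restore the default actions.
trap - USR1 USR2
```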

The trap statement has no effect if you attempt to trap the KILL or STOP signals. The system does not let any process catch or ignore these two signals, thereby ensuring that you can always terminate or stop a process. This means that shell scripts can still be terminated using either of the following commands:

kill -9 script_PID
kill -KILL script_PID

Also, the execution of the shell script can always be suspended by sending it the STOP signal (for example, kill -STOP script_PID), because neither KILL nor STOP can ever be trapped within a Korn shell script.
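
The following sketch shows that a trap on KILL has no effect. It assumes an sh-compatible shell is installed as sh; the variable names are illustrative. A child shell tries to trap KILL and then sends KILL to itself, so no timing or keyboard input is involved.

```shell
# Run a child shell that tries to trap KILL and then sends KILL to itself.
# Inside the child, $$ is the child's own PID.
status=0
output=$(sh -c 'trap "echo caught KILL" KILL 2>/dev/null
kill -KILL $$
echo "still running"' 2>/dev/null) || status=$?

# The child never reaches its echo commands; its exit status names the signal.
end_signal=$(kill -l "$status")
echo "Child output: '${output}'"
echo "Child ended by signal: $end_signal"
```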

The following statement tells the Korn shell to restore the original actions for the signal.

trap - signal

Example of Using the trap Statement

The following trapsig.ksh example script sets up a trap statement for the signals INT (Control-C), QUIT (Control-\) and TSTP (Control-Z). The script does not illustrate any practical application; it just shows how to install signal handling for these signals. After the trap statements, a while loop is entered that prints the string Rolling... and waits for user input. The while loop and the script terminate when the user types the string dough.

The script cannot be terminated by typing Control-C, Control-\, or Control-Z. The example output shows the script being executed. The user presses the Return key after the first Rolling... prompt. The user then types a d at the second Rolling... prompt and an s at the third. The user types a Control-C, a Control-\, and a Control-Z at the next Rolling... prompts, which causes the appropriate trap statements to execute.

Finally, the user types the string dough, and the script terminates.

$ cat trapsig.ksh 
#!/bin/ksh

# Script name: trapsig.ksh

trap 'print "Control-C cannot terminate this script."' INT 
trap 'print "Control-\ cannot terminate this script."' QUIT 
trap 'print "Control-Z cannot terminate this script."' TSTP

print "Enter any string (type 'dough' to exit)." 
while (( 1 )) 
do 
       print -n "Rolling..." 
       read string
      
       if [[ "$string" = "dough" ]]  
       then 
           break  
       fi 
done

print "Exiting normally"
$ ./trapsig.ksh 
Enter any string (type 'dough' to exit). 
Rolling... 
Rolling...d 
Rolling...s 
Rolling...^C 
Control-C cannot terminate this script. 
Rolling...^\ 
Control-\ cannot terminate this script. 
Rolling...^Z 
Control-Z cannot terminate this script. 
Rolling...dough 
Exiting normally

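The following is another example of using the trap statement. This example uses two scripts, parent and child. The parent ignores QUIT, so the child inherits that setting and its own QUIT trap never runs. The parent catches INT with a no-op command, so the child can still trap INT, and the message in the output comes from the child.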

$ cat parent 
#!/bin/ksh
#
# Script name: parent
#
# trap - SIGNAL      restore the default action for SIGNAL
# trap "cmd" SIGNAL  "cmd" is executed; the child can still receive and trap SIGNAL
# trap : SIGNAL      nothing is executed; the child can still receive and trap SIGNAL
# trap "" SIGNAL     SIGNAL is ignored; the child inherits this and cannot trap SIGNAL

trap "" 3 # ignore QUIT; children ignore it too
trap : 2  # catch INT with a no-op; children still receive it

./child
$ cat child 
# 
# Script name: child

trap 'print "Control-C cannot terminate this script."' INT 
trap 'print "Control-\ cannot terminate this script."' QUIT

while true
do
      echo "Type ^D to exit..."
      read || break
done
$ ./parent 
Type ^D to exit... 
^\ 
Type ^D to exit... 
^C
Control-C cannot terminate this script. 
Type ^D to exit...
^\

Type ^D to exit... 
$
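
The inheritance rules listed in the comments of the parent script can be checked without a keyboard. This minimal sketch assumes an sh-compatible shell installed as sh and uses USR1 in place of INT and QUIT; the variable names are illustrative. Each child shell sends USR1 to itself and prints a message only if it survives.

```shell
# The child signals itself; inside sh -c, $$ is the child's PID.
child='kill -USR1 $$; echo "child survived"'

# Case 1: the parent ignores USR1, and the child inherits the ignored state.
trap '' USR1
ignored_result=$(sh -c "$child" 2>/dev/null)

# Case 2: the parent catches USR1 with the no-op ":"; a caught signal is
# reset to its default action in the child, so USR1 terminates the child.
trap : USR1
caught_result=$(sh -c "$child" 2>/dev/null) || true

trap - USR1
echo "Parent ignores USR1: '${ignored_result}'"
echo "Parent catches USR1: '${caught_result}'"
```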

Catching User Errors With the trap Statement

In addition to catching signals, you can use the trap statement to take specified actions when an error occurs in the execution of a script. The syntax for this type of trap statement is:

trap 'action' ERR

The value of the $? variable indicates when the trap statement is to be executed. It holds the exit (error) status of the previously executed command or statement, so any nonzero value indicates an error. Thus the trap action is executed whenever a command finishes with a nonzero exit status.

The following traperr1.ksh example script, which does not yet use a trap statement, asks users to enter -1 if they want to exit. If a user enters an integer other than -1, the script prints the square of the number. The script then requests another integer until the user quits the script by entering -1.

The example output shows the script being executed. The user enters the letter r. This is not an integer. The user’s input is read into the variable num, which is declared to hold only integer data. Without a trap statement, the shell prints an error message, num keeps its previous value (2), and the script prints the square of that stale value. You can avoid this problem by using a trap statement, which is shown in the next section.

$ cat traperr1.ksh
#!/bin/ksh

# Script name: traperr1.ksh

integer num=2

while (( 1 ))
do
        read num?"Enter any number ( -1 to exit ): "
        if (( num == -1 ))
        then
                break
        else
                print "Square of $num is $(( num * num )). \n"
        fi
done

print "Exiting normally"
$ ./traperr1.ksh 
Enter any number ( -1 to exit ): r
traperr1.ksh[9]: r: bad number
Square of 2 is 4.

Enter any number ( -1 to exit ): -1
Exiting normally

Example of Using the trap Statement With the ERR Signal

The following trapsig2.ksh script is the traperr1.ksh script rewritten with a trap statement, a test of the exit status, and an exec statement to redirect standard error messages to /dev/null. The redirection of standard error messages is done so that the user does not see the error messages the shell would otherwise print to the screen. You want the user to see just the message you set up with the trap statement if an error occurs.

The value of $? is saved in the status variable immediately after the read statement is executed so that you can check later whether the read statement successfully read an integer. The if statement checks to see if the user entered a -1. If so, the script breaks out of the while loop and terminates. If the user did not enter a -1 and the value of status is 0 (indicating the read statement read an integer), the square of the integer entered by the user is printed and the user is again prompted for a number.

If the user does not enter an integer, the value of $? is nonzero and it is trapped. The print statement within the trap statement is then executed. Control then returns to the while loop, prompting the user for another number. The example output shows the script being executed, with the user typing in an integer, then the letter r, another integer, and finally -1 to exit the script. The messages output by the script for the corresponding input from the user are what is expected, based on the preceding description.

$ cat trapsig2.ksh 
#!/bin/ksh

# Script name: trapsig2.ksh

integer num

exec 2> /dev/null 
trap 'print "You did not enter an integer.\n"' ERR

while (( 1 )) 
do 
       print -n "Enter any number ( -1 to exit ): " 
       read num

       status=$?

       if (( num == -1 ))  
       then 
           break
       elif (( status == 0 ))  
       then 
           print "Square of $num is $(( num * num )). \n" 
       fi 
done

print "Exiting normally"
$ ./trapsig2.ksh 
Enter any number ( -1 to exit ): 3 
Square of 3 is 9. 

Enter any number ( -1 to exit ): r 
You did not enter an integer.

Enter any number ( -1 to exit ): 8 
Square of 8 is 64. 

Enter any number ( -1 to exit ): -1 
Exiting normally
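
The ERR trap does not run for every nonzero status: a command whose status is tested by if, or handled on the left side of ||, does not trigger it, which is why the if and elif tests in trapsig2.ksh never print the message themselves. The following sketch illustrates this, assuming ksh or bash (ERR is not available in every sh implementation); err_count is an illustrative name.

```shell
# Assumes ksh or bash; the ERR trap is not part of every sh implementation.
err_count=0
trap 'err_count=$((err_count + 1))' ERR

false                       # plain failing command: the ERR action runs
true                        # success: nothing happens
if false; then :; fi        # status tested by "if": the ERR action is skipped
false || true               # failure handled by "||": the ERR action is skipped

trap - ERR
echo "ERR action ran $err_count time(s)"
```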

When to Declare a trap Statement

To trap a signal any time during the execution of a shell script, define a trap statement at the start of the script. Alternatively, to trap a signal only when certain command lines are to be executed, you can turn on the trap statement before the appropriate command lines and turn off the trap statement after the command lines have been executed.
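
The following is a hedged sketch of the second approach: the trap is turned on just before the commands that must not be interrupted and turned off right after them. USR1 stands in for INT so that the sketch can signal itself, and steps_done is an illustrative name.

```shell
steps_done=0

trap '' USR1                 # turn the trap on: ignore USR1 from here
steps_done=$((steps_done + 1))
kill -USR1 $$                # simulated signal arrives mid-way and is ignored
steps_done=$((steps_done + 1))
trap - USR1                  # turn the trap off: the default action is restored

echo "Protected commands completed: $steps_done"
```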

If a loop is being used, a trap statement can include the continue statement to make the loop start again from its beginning. You can also trap the EXIT signal so that certain statements are executed whenever the shell script terminates, whether it reaches its last line or calls exit. For example, if a shell script created temporary files, you could ensure that these are removed using a trap of the EXIT signal value.

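trap 'rm -f /tmp/tfile* ; exit 0' EXIT

NOTE: The preceding example includes an exit statement within the list of statements to be executed when the trap occurs. With the EXIT trap, the script terminates after the action runs in any case, and exit 0 only forces the exit status to 0. In a trap for a signal such as INT, an exit statement is necessary if the script is to terminate after the action runs.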
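
The cleanup pattern can be tried safely in a subshell. The following is a minimal sketch under stated assumptions: the mktemp command is available, the file name scratch.txt is hypothetical, and the action deliberately omits exit so that the subshell's own exit status is preserved.

```shell
workdir=$(mktemp -d)
tmpfile="$workdir/scratch.txt"    # hypothetical temporary file

sub_status=0
(
    # Remove the temporary file whenever this subshell exits.
    trap 'rm -f "$tmpfile"' EXIT
    echo "intermediate data" > "$tmpfile"
    exit 3                        # leave early with a nonzero status
) || sub_status=$?

if [ -f "$tmpfile" ]
then
    cleanup_result="left behind"
else
    cleanup_result="removed"
fi
rmdir "$workdir"
echo "Subshell exit status: $sub_status; temporary file: $cleanup_result"
```

Quoting "$tmpfile" inside the single-quoted action means the variable is expanded when the trap runs, not when it is set.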

The following example is a copy of the /etc/profile file with some added comments to explain the various trap statements.

#ident "@(#)profile 1.18 98/10/03 SMI "/* SVr4.0 1.3 */
# The profile that all logins get before using their own .profile.

trap "" 2 3    # trap INT (Control-C) and QUIT (Control-\)               
               # and give no feedback 
export LOGNAME PATH

if [ "$TERM" = "" ]
then
    if /bin/i386
    then
        TERM=sun-color
    else
        TERM=sun
    fi
    export TERM
fi

# Login and -su shells get /etc/profile services. 
# -rsh is given its environment in its .profile.

case "$0" in 
-sh | -ksh | -jsh)

    if [ ! -f .hushlogin ] 
    then 
        /usr/sbin/quota 
            # Allow the user to break the Message-Of-The-Day only. 
            # The user does this by using Control-C (INT). 
            # Note: QUIT (Control-\) is still trapped (disabled). 

        trap "trap '' 2"  2 
        /bin/cat -s /etc/motd 
        trap "" 2    # trap Control-C (INT) and give no feedback.

        /bin/mail -E 
        case $? in 
        0) 
            echo "You have new mail." 
            ;; 
        2) 
            echo "You have mail."   
            ;; 
        esac
    fi 
esac

umask 022 
trap 2 3      # Allow the user to terminate with Control-C (INT) or
              # Control-\ (QUIT)