Inconsistent Device Names Across Reboots Cause Mount Failure Or Incorrect Mount in Linux

The Problem

A disk partition does not mount after a system boot, whether the reboot was planned or unplanned. Before the reboot, the disk partition was mounted and working properly. The partition has mounted correctly after earlier reboots, but this time it does not.

This behavior is unrelated to the type of file system on the partition. Failures have been reported for disks with EXT4 and OCFS2 file systems, but the problem can occur with any file system type.

A typical entry in the /etc/fstab file could be similar to:

/dev/sda1          /mydir    ext4   defaults  0 0
/dev/mapper/mpath2 /otherdir ocfs2  _netdev   0 0

or various combinations of these.
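
As a quick way to see which entries are exposed to this problem, the sketch below lists fstab entries whose first field is a kernel-assigned name. The function name and the two name patterns are illustrative assumptions, not an exhaustive list of unstable names.

```shell
# List fstab entries whose first field is a kernel-assigned device name.
# Reads fstab-formatted text on stdin and prints "device mountpoint" pairs.
# Patterns covered (an assumption of this sketch): /dev/sdXN, /dev/hdXN
# and /dev/mapper/mpathN.
unstable_fstab_entries() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }
        $1 ~ /^\/dev\/(sd|hd)[a-z]+[0-9]*$/ || $1 ~ /^\/dev\/mapper\/mpath[0-9]+$/ { print $1, $2 }
    '
}

# Demonstration on the sample entries above plus one UUID-based entry.
unstable_fstab_entries <<'EOF'
/dev/sda1          /mydir    ext4   defaults  0 0
/dev/mapper/mpath2 /otherdir ocfs2  _netdev   0 0
UUID=0c960108-7649-4d8c-a28c-2f75e2f906d3 /home ext3 defaults 0 2
EOF
```

On a live system the same function could be run as unstable_fstab_entries < /etc/fstab.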

The Solution

Disk drives and partitions are addressed geographically in Linux, according to their bus position and order of discovery. SAN devices are especially vulnerable to changes in the discovery order because of varying reboot timing for either the SAN or the client.

Device file names on Linux, typically in the /dev/ directory, are assigned dynamically at each system boot. As the kernel boots, each available device is detected and a notification is sent to the UDEV (user-space device management) subsystem. By comparing information in the kernel’s device identification to the UDEV rulesets in the /etc/udev/rules.d directory, UDEV assigns a name to the device and creates a device node such as /dev/sda or /dev/mapper/mpath1 so that applications can access the device. If the detected device is a block-structured device, it often has partitions containing file systems, which may be mounted according to the specifications in the /etc/fstab file.

Although Linux makes every effort to keep the same device name across system reboots, changes in the external environment can affect the actual name choices. For example, the same SAN partition could be /dev/sda on one client but /dev/sdf on another cluster node, depending on the order in which each host discovered the device, or on which multipath link comes online first. A node commonly discovers its devices in the same order each boot, but this is not guaranteed. A method to guarantee a persistent, predictable device identification is needed.

Although Linux tries to reassign the same device name at each reboot, there is no such coordination across cluster nodes. A partition that reliably appears as /dev/sda1 on one cluster node could easily, and legitimately, consistently appear as /dev/sdj1 on another cluster node. This can make cluster-wide system administration harder than it need be. The solution provided below is also applicable on clusters where this reboot problem does not occur.

Several techniques are available to obtain a persistent, predictable device name. They are presented below, in order of difficulty.

Mounting By Label

Many file system types support associating an arbitrary string, or label, with each file system. The EXT3 file system is an example:

# /sbin/e2label /dev/sda5
/home

It is common to see an /etc/fstab entry similar to this:

LABEL=/home /home auto defaults 0 0

to locate the disk partition labeled /home, regardless of the device on which it appears. The label in /etc/fstab must match the file system label exactly, including case.

OCFS2 file systems also provide a recognizable label that may be used similarly. See the OCFS2 example below for how to determine the OCFS2 label.
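
The following sketch builds a label-based /etc/fstab line and rejects labels longer than the 16 characters that ext2/ext3/ext4 allow. The function name and the fixed "defaults 0 0" fields are assumptions made for illustration; other file system types have different label limits.

```shell
# Print an /etc/fstab line that mounts a file system by label.
# Arguments: label, mount point, file system type.
# Labels longer than 16 characters are rejected (ext2/ext3/ext4 limit).
# To set a label on a real ext3 partition: e2label /dev/sda5 /home  (not run here)
fstab_label_entry() {
    label=$1
    mountpoint=$2
    fstype=$3
    if [ "${#label}" -gt 16 ]; then
        echo "label too long for ext2/3/4: $label" >&2
        return 1
    fi
    printf 'LABEL=%s %s %s defaults 0 0\n' "$label" "$mountpoint" "$fstype"
}

fstab_label_entry /home /home ext3
```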

Mounting By UUID

Many file system types assign a Universally Unique Identifier, or UUID, to each formatted disk partition. The EXT3 and OCFS2 file systems are examples of this. The UUID is usually automatically assigned and system administrators are usually discouraged from manually changing the value.

On EXT3 and other file system types, use the blkid utility, provided as part of the e2fsprogs RPM package (newer distributions ship it in util-linux). For our example, the output looks like this:

# /sbin/blkid /dev/sda5
/dev/sda5: LABEL="/home" UUID="0c960108-7649-4d8c-a28c-2f75e2f906d3" SEC_TYPE="ext2" TYPE="ext3"

Notice that the UUID proper is, strictly speaking, only the 32 hexadecimal digits. The “-” characters simply group the digits for readability.
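
A small helper can pull a single tag out of a line of blkid output, which avoids retyping the UUID by hand. This is a sketch: blkid_tag is a hypothetical helper name, and the sample line is the one shown above.

```shell
# Extract one tag (UUID, LABEL, TYPE, ...) from a line of blkid output on stdin.
# The leading space in the pattern keeps "TYPE" from matching inside "SEC_TYPE".
# On a live system, "blkid -s UUID -o value /dev/sda5" prints the value directly.
blkid_tag() {
    sed -n "s/.* $1=\"\([^\"]*\)\".*/\1/p"
}

line='/dev/sda5: LABEL="/home" UUID="0c960108-7649-4d8c-a28c-2f75e2f906d3" SEC_TYPE="ext2" TYPE="ext3"'
uuid=$(printf '%s\n' "$line" | blkid_tag UUID)
fstype=$(printf '%s\n' "$line" | blkid_tag TYPE)
printf 'UUID=%s /home %s defaults 0 2\n' "$uuid" "$fstype"
```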

On an OCFS2 file system, the UUID is always reported by the fsck.ocfs2 utility. This utility can safely be used on a mounted disk partition if the “-n” switch is used to ensure a read-only test:

# /sbin/fsck.ocfs2 -n /dev/hda1
Checking OCFS2 filesystem in /dev/hda1:
label: OCFS2
uuid: bc d0 de d0 58 ea 43 11 bd a9 e0 66 e6 cb 37 b4 
number of blocks: 209632
bytes per block: 1024
number of clusters: 52408
bytes per cluster: 4096
max slots: 4

/dev/hda1 is clean. It will be checked after 20 additional mounts.

Again, note that the UUID proper is a string of hexadecimal digits. Here they are punctuated by spaces, but the real UUID is just the digits. To get just the UUID, a short awk(1) program suffices:

# /sbin/fsck.ocfs2 -n /dev/hda1 | /bin/awk '/uuid/ { $1 = ""; gsub( / /, "" ); print }'
bcd0ded058ea4311bda9e066e6cb37b4
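
Tools such as blkid print UUIDs in the grouped 8-4-4-4-12 form. If that form is wanted for the value extracted above, a conversion along these lines should work; uuid_hyphenate is a hypothetical helper name.

```shell
# Convert 32 bare hexadecimal digits into the 8-4-4-4-12 grouped form.
# Prints nothing if the argument is not exactly 32 hexadecimal digits.
uuid_hyphenate() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f' |
        sed -n 's/^\([0-9a-f]\{8\}\)\([0-9a-f]\{4\}\)\([0-9a-f]\{4\}\)\([0-9a-f]\{4\}\)\([0-9a-f]\{12\}\)$/\1-\2-\3-\4-\5/p'
}

uuid_hyphenate bcd0ded058ea4311bda9e066e6cb37b4
```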

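Now that we have the UUIDs, /etc/fstab entries such as these would mount the OCFS2 and EXT3 partitions. Write each UUID in the form that blkid reports for that partition; for the EXT3 example that means keeping the hyphens:

UUID=bcd0ded058ea4311bda9e066e6cb37b4     /ocfs2 ocfs2 _netdev  0 2
UUID=0c960108-7649-4d8c-a28c-2f75e2f906d3 /home  ext3  defaults 0 2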

Since the UUID is stored in the disk partition itself, it does not matter if the actual device name changes; the UUID gives us a persistent, predictable method for accessing the partition.
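
Before rebooting, it can be worth confirming that every UUID named in /etc/fstab is actually visible to the system. The sketch below assumes udev maintains links under /dev/disk/by-uuid, which is typical but should be verified on your release; the function name is hypothetical, and both paths are parameters so the check can be tried on copies.

```shell
# Report UUID= entries in an fstab file that have no matching link in the
# by-uuid directory. Defaults are the usual live locations.
missing_fstab_uuids() {
    fstab=${1:-/etc/fstab}
    by_uuid=${2:-/dev/disk/by-uuid}
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        # Print only the UUIDs for which no link exists.
        [ -e "$by_uuid/$uuid" ] || echo "$uuid"
    done
}
```

Run with no arguments, missing_fstab_uuids checks /etc/fstab against /dev/disk/by-uuid; no output means every UUID entry was found.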

Using UDEV Rulesets

Another method to have persistent, predictable device names is to leverage the same UDEV service that the kernel uses to assign the device names. This involves creating a UDEV matching rule that uses the device attributes to identify the device and then to create a device node for it. Often the rule just creates a symbolic link to the actual low-level device node the kernel assigns.

This is the technique used to implement device mapper multipath devices. The kernel creates a /dev/dm-X device; then UDEV rules and the multipathing daemon create a link under /dev/mpath/ back to the /dev/dm-X device; then the /dev/mapper/mpathN or /dev/mapper/<uuid> link is created.

Which type of /dev/mapper/ file name gets created is controlled by the /etc/multipath.conf file user_friendly_names setting. The default setting is:

defaults {
    user_friendly_names yes
}

which, oddly enough, creates the totally meaningless /dev/mapper/mpathN names. When this setting is commented out, the more imposing-looking identifier-based /dev/mapper/ names are used instead. They may be harder to type, but at least they will be portable across all cluster nodes.
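
To compare the setting across cluster nodes, something like the following could be run on each node. It is a simplified sketch: the function name is hypothetical, and it assumes the last uncommented user_friendly_names line wins, whichever section it sits in.

```shell
# Print the user_friendly_names value from a multipath.conf-style file,
# or "unset" when no uncommented setting is present.
# Simplification (an assumption of this sketch): the last uncommented
# occurrence wins, regardless of the enclosing section.
user_friendly_names_setting() {
    awk '
        { sub(/#.*/, "") }
        $1 == "user_friendly_names" { value = $2; gsub(/"/, "", value) }
        END { print (value == "" ? "unset" : value) }
    ' "${1:-/etc/multipath.conf}"
}
```

With no argument the function reads /etc/multipath.conf.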

Below is an example of a UDEV matching rule. A rule is a single line of comma-separated key-value pairs. It begins with a series of predicates, or matching expressions; these are marked by the “==” operator. If all the predicates match those of the discovered device, then the actions denoted by the assignment operators (“=” and “+=”) are taken.

SYSFS{vendor}=="iriver", SYSFS{model}=="T10", OWNER="user", MODE="0600", SYMLINK+="iriver%n"

This example creates the symbolic link /dev/iriver0 when the first IRiver T10 player is plugged into a USB port. The device will be owned by user with file access permissions 0600. The USB subsystem notices the device has been plugged in and notifies the kernel; attributes discovered about the device are also passed along to the kernel. This information eventually gets to the UDEV subsystem which starts reading the rule sets in /etc/udev/rules.d and matching the device attributes to the predicates for each rule. If all predicates match for a rule, any actions specified by that rule are executed.
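
When the same kind of rule is needed for several devices, generating the line avoids quoting mistakes. This sketch uses the SYSFS{} match keys of the udev generation described here (newer udev releases spell these ATTR{} or ATTRS{}); the function name is hypothetical.

```shell
# Print a udev rule that adds a symlink for devices matching vendor and model.
# Arguments: vendor string, model string, symlink base name.
# Key-value pairs are separated by commas; %n is left for udev to expand.
udev_symlink_rule() {
    vendor=$1
    model=$2
    link=$3
    printf 'SYSFS{vendor}=="%s", SYSFS{model}=="%s", SYMLINK+="%s%%n"\n' \
        "$vendor" "$model" "$link"
}

udev_symlink_rule iriver T10 iriver
```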

Documentation about the UDEV system, including the syntax for writing a UDEV rule, is available on-line via the udev manual page.

The challenging part of writing a UDEV rule is to know what attributes are available so that the device can be properly identified by the rule. The udevinfo utility will display device name and attributes available to the rule. For our example, checking /var/log/messages shows the IRiver device was detected as a block device and the name /dev/sdb assigned. Now, we can obtain all the device information and attributes by looking under the /block/sdb system information:

# /usr/bin/udevinfo -q all -p /block/sdb
P: /block/sdb
N: sdb
S: disk/by-id/usb-iriver_T10
S: disk/by-path/pci-0000:00:07.2-usb-0:1:1.0-scsi-0:0:0:0
E: ID_VENDOR=iriver
E: ID_MODEL=T10
E: ID_REVISION=1.00
E: ID_SERIAL=iriver_T10
E: ID_TYPE=disk
E: ID_BUS=usb
E: ID_PATH=pci-0000:00:07.2-usb-0:1:1.0-scsi-0:0:0:0

Note that the path is relative to the /sys/ directory, not the traditional root. The complete file name would be /sys/block/sdb but udevinfo assumes the /sys part.
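
The property lines (those starting with "E:") are easy to pick apart in a script. The sketch below reads udevinfo-style output on stdin and prints the value of one property; udev_property is a hypothetical helper name, and the sample text is a subset of the output above.

```shell
# Print the value of one property from udevinfo-style "E: KEY=VALUE" lines.
udev_property() {
    sed -n "s/^E: $1=//p"
}

info='E: ID_VENDOR=iriver
E: ID_MODEL=T10
E: ID_BUS=usb'
vendor=$(printf '%s\n' "$info" | udev_property ID_VENDOR)
model=$(printf '%s\n' "$info" | udev_property ID_MODEL)
printf 'vendor=%s model=%s\n' "$vendor" "$model"
```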

Choosing the minimal set of attributes can be tricky, but avoid the temptation to over-specify. Our choice here is to use the vendor name and model name, but any attribute set that identifies the object device may be used. Once the predicates and actions are chosen, put your rules in their own files in the /etc/udev/rules.d directory. Do not modify a ruleset from the distribution or your changes can be lost when the package gets updated. In our example, we used /etc/udev/rules.d/49-iriver.rules for the filename.

With your UDEV rule in place, use the udevtest utility to simulate a UDEV run and to show what would be done.
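
As a sketch of what that invocation might look like, the function below prints (without running) a dry-run command for a device. It assumes udevtest takes a /sys-relative device path in the same way udevinfo -p does, and that newer releases provide the equivalent as udevadm test; the function name is hypothetical.

```shell
# Print (do not run) a command that simulates a udev run for a device.
# devpath is /sys-relative, as in the udevinfo example, e.g. /block/sdb.
udev_dry_run_cmd() {
    devpath=$1
    if command -v udevadm >/dev/null 2>&1; then
        # Newer udev releases fold the test tool into udevadm.
        printf 'udevadm test /sys%s\n' "$devpath"
    else
        # The standalone tool named in this article.
        printf 'udevtest %s\n' "$devpath"
    fi
}

udev_dry_run_cmd /block/sdb
```

Review the printed command, then run it as root to see which rules would match and which links would be created.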
