“comm” Command in Linux with Examples

It is often useful to compare versions of text files. For system administrators and software developers, this is particularly important. A system administrator may, for example, need to compare an existing configuration file to a previous version to diagnose a system problem. Likewise, a programmer frequently needs to see what changes have been made to programs over time.

The comm utility displays a line-by-line comparison of two sorted files. The first of the three columns it displays lists the lines found only in file1, the second column lists the lines found only in file2, and the third lists the lines common to both files. The basic syntax for the “comm” command is:

# comm [options] file1 file2

Arguments

The file1 and file2 arguments are pathnames of the files that comm compares. Using a hyphen (-) in place of file1 or file2 causes comm to read standard input instead of that file.
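
Reading from standard input is handy when one of the lists is not sorted yet. The following is a minimal sketch, assuming hypothetical file names (unsorted1, file2) created in a temporary directory; it sorts the first list on the fly and passes it to comm as "-":

```shell
# Hypothetical file names; a temporary directory keeps the sketch self-contained.
dir=$(mktemp -d)
printf '%s\n' dd bb aa cc > "$dir/unsorted1"   # deliberately not sorted
printf '%s\n' cc xx yy zz > "$dir/file2"       # already sorted

# Sort the first list on the fly and hand it to comm as "-" (standard input).
# LC_ALL=C makes sort and comm agree on the collating order.
common=$(LC_ALL=C sort "$dir/unsorted1" | LC_ALL=C comm -12 - "$dir/file2")
echo "$common"   # lines present in both inputs

rm -r "$dir"
```

Setting the same locale for both sort and comm is a precaution: the two tools must use the same collating order for the comparison to be reliable.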

Options

You can combine the options. With no options, comm produces three-column output.

Option                    Function
-1                        Does not display column 1 (lines found only in file1).
-2                        Does not display column 2 (lines found only in file2).
-3                        Does not display column 3 (lines found in both files).
-i                        Compares lines case-insensitively (BSD/macOS comm; GNU coreutils comm does not provide this option).
--check-order             Checks that the input is correctly sorted, even if all input lines are pairable.
--nocheck-order           Does not check that the input is correctly sorted.
--output-delimiter=STR    Separates columns with the string STR.
--help                    Displays a help message and exits.
--version                 Displays version information and exits.

Note: comm expects both files to be sorted; with unsorted input the results are unreliable. Lines in the second column are preceded by one TAB, and those in the third column are preceded by two TABs. The exit status indicates whether comm completed normally (0) or abnormally (not 0).
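
The TAB indentation and the exit status are easy to overlook on a terminal. As a rough illustration, assuming two small sorted files created in a temporary directory, the sketch below records the exit status of comm and replaces every TAB with ">" so the column layout becomes visible:

```shell
dir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$dir/file1"
printf '%s\n' cc xx yy zz > "$dir/file2"

# Run comm, keep its output in a file, and record its exit status.
status=0
LC_ALL=C comm "$dir/file1" "$dir/file2" > "$dir/out" || status=$?
echo "exit status: $status"

# Replace each TAB with ">" so the indentation of columns 2 and 3 is visible.
visible=$(tr '\t' '>' < "$dir/out")
echo "$visible"

rm -r "$dir"
```

A line with no ">" belongs to column 1, one ">" marks column 2, and ">>" marks column 3.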

Examples of using “comm” command in Linux

Example 1: Basic usage

Let's look at a basic example of the “comm” command comparing two sorted files. The files are shown below:

# cat file1
aa
bb
cc
dd
# cat file2
cc
xx
yy
zz

The comm command compares the files line by line and reports which lines are unique to each file and which are common to both. For example:

# comm file1 file2
aa
bb
		cc
dd
	xx
	yy
	zz

The output is arranged in three columns: column 1 shows the lines found only in file1 (aa, bb, dd), column 2 shows the lines found only in file2 (xx, yy, zz), and column 3 shows the lines common to both files (cc). Unlike diff, which reports only the differences, comm lists every line of both files, so the output can be overwhelming when all you want is to check for one or two simple changes. However, it can be very useful when you are not familiar with either file and want to see how they compare.

Example 2: Suppressing columns

comm supports options in the form -n where n is either 1, 2, or 3. When used, these options specify which column(s) to suppress. For example, if we wanted to output only the lines shared by both files, we would suppress the output of columns 1 and 2:

# comm -12 file1 file2
cc

Similarly, to display only the lines unique to file1 or only the lines unique to file2, use the following commands:

# comm -23 file1 file2
aa
bb
dd
# comm -13 file1 file2
xx
yy
zz
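
Column suppression combines well with other tools. The following is a minimal sketch, assuming the same two sample files recreated in a temporary directory and a hypothetical helper named count, that prints how many lines fall into each category:

```shell
dir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$dir/file1"
printf '%s\n' cc xx yy zz > "$dir/file2"

# Hypothetical helper: print the number of lines read from standard input.
count() { awk 'END { print NR }'; }

only1=$(LC_ALL=C comm -23 "$dir/file1" "$dir/file2" | count)   # unique to file1
only2=$(LC_ALL=C comm -13 "$dir/file1" "$dir/file2" | count)   # unique to file2
both=$(LC_ALL=C comm -12 "$dir/file1" "$dir/file2" | count)    # common lines
echo "only in file1: $only1, only in file2: $only2, in both: $both"

rm -r "$dir"
```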

Example 3: Sorting check on input

The comm command provides two options that control whether the sort order of the input is checked:
1. --check-order
2. --nocheck-order

The --check-order option verifies that the input is correctly sorted while the files are compared. If a file is not sorted (here file2 contains xx, cc, yy, zz in that order), comm stops with an error, as shown below:

# comm --check-order file1 file2
aa
bb
cc
dd
	xx
comm: file 2 is not in sorted order

The --nocheck-order option, by contrast, skips the order check and compares the files even if the input is not sorted. For example:

# cat file1
aa
bb
cc
dd
# cat file2
xx
cc
yy
zz
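# comm --nocheck-order file1 file2
aa
bb
cc
dd
	xx
	cc
	yy
	zz

Because file2 is not sorted, comm cannot pair the two cc lines: cc is reported once in column 1 and once in column 2 instead of in column 3. The option suppresses the error, not the need for sorted input.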
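
Since comm relies on sorted input, it can help to verify the order before comparing. The sketch below is one possible approach, not the only one: it assumes two sample files in a temporary directory, uses sort -c to test whether each file is sorted, and sorts it in place only when needed:

```shell
dir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$dir/file1"
printf '%s\n' xx cc yy zz > "$dir/file2"   # deliberately unsorted

for f in "$dir/file1" "$dir/file2"; do
    # sort -c only checks the order; it exits non-zero if the file is unsorted.
    if ! LC_ALL=C sort -c "$f" 2>/dev/null; then
        LC_ALL=C sort -o "$f" "$f"         # sort the file in place
    fi
done

# Both files are now sorted, so the common line is detected.
common=$(LC_ALL=C comm -12 "$dir/file1" "$dir/file2")
echo "$common"

rm -r "$dir"
```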

Example 4: Delimited output

comm can also separate the columns with a user-supplied delimiter. For example, instead of the default TAB-delimited output, we can use a delimiter such as “|” (pipe). With the sorted files from Example 1:

# comm --output-delimiter="|" file1 file2
aa
bb
||cc
dd
|xx
|yy
|zz
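
A visible delimiter also makes the output easier to post-process. As a rough sketch, assuming GNU comm (for --output-delimiter), sample files in a temporary directory, and input lines that do not themselves contain "|", the number of fields on each line tells awk which column the line came from:

```shell
dir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$dir/file1"
printf '%s\n' cc xx yy zz > "$dir/file2"

# With "|" as the delimiter, the field count reveals the column:
# 1 field = only in file1, 2 fields = only in file2, 3 fields = in both.
labeled=$(LC_ALL=C comm --output-delimiter='|' "$dir/file1" "$dir/file2" |
    awk -F'|' '
        NF == 1 { print "file1-only", $1 }
        NF == 2 { print "file2-only", $2 }
        NF == 3 { print "both", $3 }
    ')
echo "$labeled"

rm -r "$dir"
```

The labels (file1-only, file2-only, both) are illustrative names chosen for this sketch, not part of comm's output.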

comm vs. diff

comm is similar to diff in that both commands compare two files. But comm can also be used like uniq; comm selects duplicate or unique lines between two sorted files, whereas uniq selects duplicate or unique lines within the same sorted file.
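
To make the relationship with uniq concrete, the sketch below finds the common lines of two sample files in two ways: with comm -12, and by merging both files and letting uniq -d report the duplicates. It assumes that neither file repeats a line internally; otherwise the uniq variant would also report lines duplicated within a single file.

```shell
dir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$dir/file1"
printf '%s\n' cc xx yy zz > "$dir/file2"

# comm compares the two sorted files directly.
with_comm=$(LC_ALL=C comm -12 "$dir/file1" "$dir/file2")

# uniq works on a single sorted stream, so the files are merged first.
with_uniq=$(LC_ALL=C sort "$dir/file1" "$dir/file2" | uniq -d)

echo "comm: $with_comm"
echo "uniq: $with_uniq"

rm -r "$dir"
```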
