TrumanWong

comm

Compare two sorted files line by line.

Summary

comm [OPTION]... FILE1 FILE2

The main purpose

  • Compare two sorted files line by line.
  • When FILE1 or FILE2 is -, read standard input.
  • With no options, three columns are output. The first column contains the lines unique to FILE1, the second column contains the lines unique to FILE2, and the third column contains the lines common to both files.
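
As an illustration of the standard-input form, the following sketch sorts one file on the fly and feeds it to comm through -. The scratch directory and file names are hypothetical, created only for this example, and LC_ALL=C is set so that sort and comm agree on the collation order.

```shell
# Hypothetical scratch files, created only for this illustration.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/sorted.txt"
printf 'cherry\nbanana\ndate\n' > "$dir/unsorted.txt"

# Sort the second file on the fly; "-" tells comm to read FILE2 from stdin.
result=$(LC_ALL=C sort "$dir/unsorted.txt" | LC_ALL=C comm "$dir/sorted.txt" -)
printf '%s\n' "$result"

rm -r "$dir"
```

Here apple lands in the first column, date in the second, and banana and cherry in the third.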

Options

-1 Suppress the first column (lines unique to FILE1).
-2 Suppress the second column (lines unique to FILE2).
-3 Suppress the third column (lines common to both files).
--check-order Check that the input is correctly sorted, even if all input lines are pairable.
--nocheck-order Do not check that the input is correctly sorted.
--output-delimiter=STR Use STR as the delimiter between output columns instead of the default TAB.
--total Output a summary line containing the number of lines in each column.
-z, --zero-terminated Use NUL instead of newline as the line delimiter.
--help Display help information and exit.
--version Display version information and exit.
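
A minimal sketch of --output-delimiter and --total, assuming a reasonably recent GNU coreutils (the --total option is not available in older releases). The file names f1 and f2 are hypothetical, and LC_ALL=C keeps the ordering and the summary wording predictable.

```shell
# Hypothetical sample files: both are already sorted.
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/f1"
printf 'b\nc\nd\n' > "$dir/f2"

# Separate the columns with a comma instead of the default TAB.
delimited=$(LC_ALL=C comm --output-delimiter=, "$dir/f1" "$dir/f2")
printf '%s\n' "$delimited"

# Keep only the summary line that --total appends to the output.
summary=$(LC_ALL=C comm --total "$dir/f1" "$dir/f2" | tail -n 1)
printf '%s\n' "$summary"

rm -r "$dir"
```

With a comma as the delimiter, a line in the second column is prefixed by one comma and a line in the third column by two, which makes the output easier to parse than TAB-indented text.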

Return value

An exit status of 0 indicates success; a non-zero exit status indicates failure.
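
The exit status can be used in scripts to detect unsorted input. The sketch below assumes GNU comm (for --check-order) and uses hypothetical file names; it stores each status in a variable rather than relying on a particular numeric failure code.

```shell
# Hypothetical sample files: one sorted, one deliberately out of order.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/sorted"
printf 'b\na\n' > "$dir/unsorted"

# Sorted input: comm is expected to succeed.
ok_status=0
LC_ALL=C comm "$dir/sorted" "$dir/sorted" > /dev/null 2>&1 || ok_status=$?

# Unsorted input with --check-order: comm is expected to fail.
bad_status=0
LC_ALL=C comm --check-order "$dir/unsorted" "$dir/sorted" > /dev/null 2>&1 || bad_status=$?

echo "sorted input: $ok_status, unsorted input: $bad_status"
rm -r "$dir"
```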

Example

Contents of aaa.txt

[root@localhost text]# cat aaa.txt
aaa
bbb
ccc
ddd
eee
111
222

Contents of bbb.txt

[root@localhost text]# cat bbb.txt
bbb
ccc
aaa
hhh
ttt
jjj

Comparison result (neither file is sorted, so --nocheck-order is used to skip the order check)

[root@localhost text]# comm --nocheck-order aaa.txt bbb.txt
aaa
                 bbb
                 ccc
         aaa
ddd
eee
111
222
         hhh
         ttt
         jjj

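The first column of the output contains the lines that appear only in aaa.txt, the second column contains the lines that appear only in bbb.txt, and the third column contains the lines that appear in both files. Columns are separated by a tab character (\t). Because the input files are not sorted, this result is unreliable: aaa appears in both files, yet it is reported once as unique to each.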

Comparing sorted files

First, sort the contents of both files with sort:

[root@localhost ~]# sort aaa.txt > aaa1.txt
[root@localhost ~]# sort bbb.txt > bbb1.txt

Comparing results:

[root@localhost ~]# comm aaa1.txt bbb1.txt
111
222
                 aaa
                 bbb
                 ccc
ddd
eee
         hhh
         jjj
         ttt

Intersection

To print the intersection of the two files, suppress the first and second columns. comm requires sorted input, so use the sorted copies aaa1.txt and bbb1.txt:

[root@localhost text]# comm -1 -2 aaa1.txt bbb1.txt
aaa
bbb
ccc

Difference set

By suppressing the columns you do not need, you can get the difference between aaa.txt and bbb.txt. Use the sorted copies aaa1.txt and bbb1.txt here as well:

Difference set of aaa.txt (lines only in aaa.txt)

[root@localhost text]# comm -2 -3 aaa1.txt bbb1.txt
111
222
ddd
eee

Difference set of bbb.txt (lines only in bbb.txt)

[root@localhost text]# comm -1 -3 aaa1.txt bbb1.txt
hhh
jjj
ttt
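
Because comm only gives meaningful set results on sorted input, the set operations above can be wrapped in small helpers that always sort first. This is a sketch, not part of comm itself: the function names set_intersection and set_difference are hypothetical, and LC_ALL=C is an assumption made so that sort and comm use the same ordering.

```shell
# Hypothetical helper: print the lines common to two possibly unsorted files.
set_intersection() {
    tmp=$(mktemp)
    LC_ALL=C sort "$2" > "$tmp"
    LC_ALL=C sort "$1" | LC_ALL=C comm -12 - "$tmp"
    rm -f "$tmp"
}

# Hypothetical helper: print the lines found in the first file but not the second.
set_difference() {
    tmp=$(mktemp)
    LC_ALL=C sort "$2" > "$tmp"
    LC_ALL=C sort "$1" | LC_ALL=C comm -23 - "$tmp"
    rm -f "$tmp"
}

# Recreate the unsorted sample files from the example section.
dir=$(mktemp -d)
printf 'aaa\nbbb\nccc\nddd\neee\n111\n222\n' > "$dir/aaa.txt"
printf 'bbb\nccc\naaa\nhhh\nttt\njjj\n' > "$dir/bbb.txt"

common=$(set_intersection "$dir/aaa.txt" "$dir/bbb.txt")
only_a=$(set_difference "$dir/aaa.txt" "$dir/bbb.txt")
only_b=$(set_difference "$dir/bbb.txt" "$dir/aaa.txt")
printf 'common:\n%s\nonly in aaa.txt:\n%s\nonly in bbb.txt:\n%s\n' \
    "$common" "$only_a" "$only_b"

rm -r "$dir"
```

Sorting inside the helpers costs an extra pass over each file, but it removes the risk of silently wrong results when a caller forgets to sort.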

Note

  1. This command is part of the GNU coreutils package. For more information, see man -s 1 comm or info coreutils 'comm invocation'.