Linux VPS comm Command Guide with Examples
The Linux comm command, with its straightforward syntax comm [options] file1 file2, is an efficient tool for comparing two sorted files line by line.
After buying a Linux VPS, learning the comm command gives you a quick way to compare sorted files on the server, for example when handling large datasets, comparing code, analyzing lists of users, or identifying differences in configuration files and logs.
The comm command is useful for a wide range of data processing and system administration tasks. This article shows how to use it to manage your Linux VPS efficiently.
Prerequisites to use comm command in Linux VPS
- A Linux VPS or another system running Linux
- Access to a terminal
- Files sorted in the collation order of the current locale, because comm only works correctly on sorted input
- Read permission on both files (check with ls -l or stat)
- Plain text content in which each line is one comparable unit of data
General comm command syntax in Linux VPS
As mentioned above, the comm command compares two sorted files. Here is its basic syntax:
comm [options] file1 file2
By default, comm reads both files line by line and prints the lines that are unique to each file as well as the lines common to both. This makes it easy to see how the two files differ and what they share.
The comm command displays the result in three columns:
- Column 1 (Left Column): Displays lines unique to the first file.
- Column 2 (Middle Column): Displays lines unique to the second file.
- Column 3 (Right Column): Displays lines common to both files.
The columns are separated by tabs: lines in column 2 are preceded by one tab and lines in column 3 by two tabs.
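As a minimal sketch of the three-column layout, the following script builds two tiny sorted files in a scratch directory and compares them. The file names a.txt and b.txt and their contents are made up for illustration, and LC_ALL=C is set so that the comparison uses plain byte order.

```shell
# Scratch directory and sample data (hypothetical names and contents)
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$workdir/a.txt"
printf 'banana\ncherry\ndate\n'  > "$workdir/b.txt"

# Column 1 (no tab): only in a.txt
# Column 2 (one tab): only in b.txt
# Column 3 (two tabs): in both files
columns=$(LC_ALL=C comm "$workdir/a.txt" "$workdir/b.txt")
printf '%s\n' "$columns"

rm -r "$workdir"
```

Here apple is printed with no indentation, date with one leading tab, and banana and cherry with two leading tabs.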
The comm Command Options in Linux VPS
The comm command has a small set of options that let you customize its behavior. Here are the most useful ones on a Linux VPS:
Options | Function |
---|---|
-1 | Suppresses the output of column 1 (lines unique to the first file). |
-2 | Hides the output of column 2 (lines unique to the second file). |
-3 | Suppresses the output of column 3 (lines common to both files). |
--check-order | Checks that both input files are sorted and fails with an error if they are not. |
--nocheck-order | Skips the sorting check. |
--output-delimiter=STR | Separates columns with the user-defined string STR instead of the default tab. |
--total | Prints a summary line at the end with the number of lines in each column. |
--help | Prints help information. |
--version | Displays the version information. |
The comm practical examples in Linux
The comm command is easier to use once you have seen it applied to different tasks. This section presents practical examples for quick comparisons.
The comm command is part of GNU coreutils, so it is available by default on Linux distributions such as Ubuntu, Debian, and CentOS.
To follow along, create two sample files using the touch command (for example, file1.txt and file2.txt) and write different data to each one.
Print the content of file1.txt and file2.txt using the cat command. For example:
cat file1.txt
Output:
Black
White
Red
Yellow
Blue
cat file2.txt
Output:
Green
Cyan
Purple
white
Black
Blue
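The lines above are not in sorted order, so sort both files before running the examples below (for instance with sort -o file1.txt file1.txt, and the same for file2.txt). With that done, let's get more familiar with the Linux comm command.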
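The sample files can also be created from the shell. The sketch below writes the two lists shown above into a scratch directory (an assumption made here so that nothing in your working directory is overwritten) and sorts each file in place with sort -o, which accepts the same file as input and output.

```shell
# Work in a scratch directory so existing files are not touched
workdir=$(mktemp -d)

printf '%s\n' Black White Red Yellow Blue > "$workdir/file1.txt"
printf '%s\n' Green Cyan Purple white Black Blue > "$workdir/file2.txt"

# Sort each file in place; LC_ALL=C selects plain byte order
LC_ALL=C sort -o "$workdir/file1.txt" "$workdir/file1.txt"
LC_ALL=C sort -o "$workdir/file2.txt" "$workdir/file2.txt"

sorted1=$(cat "$workdir/file1.txt")
sorted2=$(cat "$workdir/file2.txt")
printf 'file1.txt:\n%s\n\nfile2.txt:\n%s\n' "$sorted1" "$sorted2"

rm -r "$workdir"
```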
Comparing two sorted files using comm
The basic usage of comm is to pass it the two sorted files to compare line by line. To print the lines that differ between file1.txt and file2.txt along with the common lines, run the following command:
comm file1.txt file2.txt
Output:
                Black
                Blue
        Cyan
        Green
        Purple
Red
White
Yellow
        white
This shows that “Red”, “White”, and “Yellow” are unique to file1.txt, that “Cyan”, “Green”, “Purple”, and “white” are unique to file2.txt, and that “Black” and “Blue” are present in both. The listing assumes both files have been sorted and that the C locale is in effect (LC_ALL=C); in other locales the position of “white” relative to “White” and “Yellow” can differ.
You might wonder why White is not displayed in column 3. This is because comm is case-sensitive and treats White (uppercase W) and white (lowercase w) as different lines.
Comparing two files while ignoring case sensitivity using comm
As mentioned earlier, comm is case-sensitive by default. For a case-insensitive comparison, you can preprocess the files with an additional tool such as tr.
Convert all characters in each file to lowercase, sort the result, and save it in a new file using the following commands:
tr 'A-Z' 'a-z' < file1.txt | sort > Temp_1
tr 'A-Z' 'a-z' < file2.txt | sort > Temp_2
Because the results are written to new files, the original files remain unchanged.
Then compare the converted and sorted files using comm:
comm Temp_1 Temp_2
Output:
                black
                blue
        cyan
        green
        purple
red
                white
yellow
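The lowercase, sort, and compare steps can be wrapped in a small shell function. This is only a sketch: comm_ci is a hypothetical helper name, the demo data is made up, and the function assumes that folding everything to lowercase is acceptable for your comparison.

```shell
# comm_ci FILE1 FILE2: case-insensitive comparison (hypothetical helper)
comm_ci() {
    tmpdir=$(mktemp -d)
    # Lowercase and sort each input; the original files are left untouched
    tr '[:upper:]' '[:lower:]' < "$1" | LC_ALL=C sort > "$tmpdir/1"
    tr '[:upper:]' '[:lower:]' < "$2" | LC_ALL=C sort > "$tmpdir/2"
    LC_ALL=C comm "$tmpdir/1" "$tmpdir/2"
    rm -r "$tmpdir"
}

# Demo with made-up data
demo=$(mktemp -d)
printf 'Red\nWhite\n'   > "$demo/a.txt"
printf 'Green\nwhite\n' > "$demo/b.txt"
ci_result=$(comm_ci "$demo/a.txt" "$demo/b.txt")
printf '%s\n' "$ci_result"
rm -r "$demo"
```

In the demo, white ends up in column 3 because both inputs contain it once case is ignored.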
Suppressing specific columns in comm output
The -1, -2, and -3 options are the most frequently used comm options. Each one hides the corresponding column of the output.
For example, run comm with the -1 flag to hide column 1 (left column) when comparing file1.txt and file2.txt:
comm -1 file1.txt file2.txt
Output:
        Black
        Blue
Cyan
Green
Purple
white
Run comm with the -2 flag to suppress column 2 (middle column) when comparing file1.txt and file2.txt:
comm -2 file1.txt file2.txt
Output:
        Black
        Blue
Red
White
Yellow
To display only the lines unique to the first file and the lines unique to the second file, hide column 3 (right column) with the -3 flag:
comm -3 file1.txt file2.txt
Output:
        Cyan
        Green
        Purple
Red
White
Yellow
        white
You can also combine the -1 and -2 flags so that comm displays only the lines common to both files:
comm -12 file1.txt file2.txt
Output:
Black
Blue
Combining options lets you identify differences quickly. For example, comm prints the lines that are in file1.txt but not in file2.txt with the -23 flag (the -13 flag does the opposite and prints the lines that are only in file2.txt):
comm -23 file1.txt file2.txt
Output:
Red
White
Yellow
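The flag combinations above map onto three common set operations. The following sketch uses two made-up user lists (old_users.txt and new_users.txt are hypothetical names) to show all three side by side.

```shell
workdir=$(mktemp -d)
printf 'alice\nbob\ncarol\n' > "$workdir/old_users.txt"
printf 'bob\ncarol\ndave\n'  > "$workdir/new_users.txt"

# -12: lines in both files
# -23: lines only in the first file
# -13: lines only in the second file
in_both=$(LC_ALL=C comm -12 "$workdir/old_users.txt" "$workdir/new_users.txt")
only_old=$(LC_ALL=C comm -23 "$workdir/old_users.txt" "$workdir/new_users.txt")
only_new=$(LC_ALL=C comm -13 "$workdir/old_users.txt" "$workdir/new_users.txt")

printf 'In both:\n%s\n' "$in_both"
printf 'Only in old_users.txt:\n%s\n' "$only_old"
printf 'Only in new_users.txt:\n%s\n' "$only_new"

rm -r "$workdir"
```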
Comparing unsorted files using comm
The comm command requires sorted files to produce accurate and useful output. When GNU comm detects unsorted input, it prints a warning such as “comm: file 1 is not in sorted order”, and the output should not be trusted.
You can make comm fail as soon as it finds a line out of order by using comm --check-order:
comm --check-order unsorted_file1.txt unsorted_file2.txt
If it reports that the files are not in sorted order, you must sort them before comparing.
To do so, use the sort command and write the sorted copies to new files with the -o option:
sort -o file1_sorted.txt file1.txt
sort -o file2_sorted.txt file2.txt
After sorting, you can run the cat command to verify that each file is now sorted.
These commands create additional files, which you can delete with the rm command once the comparison is done.
Alternatively, in shells that support process substitution, such as bash, you can sort and compare in a single command without creating temporary files:
comm <(sort file1.txt) <(sort file2.txt)
In this command file1.txt and file2.txt are unsorted files.
Process substitution passes the sorted data directly to comm without creating intermediate files.
Note: Sorting with sort -u also removes duplicate lines within each file, so that only unique lines are compared (for example: sort -u file1.txt -o file1.txt).
Another option is to pass --nocheck-order to comm:
comm --nocheck-order file1.txt file2.txt
The --nocheck-order option makes comm print a result without the sorting warning. However, you cannot trust this output on unsorted files because it can be incorrect or unusable.
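Before comparing, a script can test whether a file is already sorted and sort it only when needed. The sketch below relies on sort -c, which exits with a non-zero status when its input is out of order; the file name colors.txt and its contents are made up for illustration.

```shell
workdir=$(mktemp -d)
printf 'Red\nBlue\nBlack\n' > "$workdir/colors.txt"   # deliberately unsorted

# sort -c only checks the order; it does not modify the file
if LC_ALL=C sort -c "$workdir/colors.txt" 2>/dev/null; then
    state_before=sorted
else
    state_before=unsorted
    # Sort in place so the file is ready for comm
    LC_ALL=C sort -o "$workdir/colors.txt" "$workdir/colors.txt"
fi

if LC_ALL=C sort -c "$workdir/colors.txt" 2>/dev/null; then
    state_after=sorted
else
    state_after=unsorted
fi

printf 'before: %s, after: %s\n' "$state_before" "$state_after"
rm -r "$workdir"
```

Use the same locale setting for the check, the sort, and the later comm call, since the expected order depends on it.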
Customizing column default separators in comm output
By default, columns in the comm output are separated with tabs. You can change the separator with the --output-delimiter=STR option:
comm --output-delimiter='#' file1.txt file2.txt
Output:
##Black
##Blue
#Cyan
#Green
#Purple
Red
White
Yellow
#white
In this output, lines with no # symbol are unique to file1.txt, lines with one # symbol are unique to file2.txt, and lines with two # symbols are common to both files.
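The column separator also makes the output easy to post-process. As a sketch, the script below keeps the default tab separator and lets awk count the tab-separated fields to label each line with its origin. It assumes that the compared lines themselves contain no tabs; a.txt and b.txt are hypothetical sample files.

```shell
workdir=$(mktemp -d)
printf 'apple\nbanana\n' > "$workdir/a.txt"
printf 'banana\ndate\n'  > "$workdir/b.txt"

# One field means column 1, two fields column 2, three fields column 3
labelled=$(LC_ALL=C comm "$workdir/a.txt" "$workdir/b.txt" |
    awk -F '\t' '
        NF == 1 { print "only in a.txt: " $1 }
        NF == 2 { print "only in b.txt: " $2 }
        NF == 3 { print "in both: " $3 }')
printf '%s\n' "$labelled"

rm -r "$workdir"
```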
Comparing a file with standard terminal input using comm
Using a hyphen (-) in place of one of the file names makes comm read that input from standard input (data typed directly into the terminal). The typed lines must be in sorted order too. For example, assume we have a sorted file named “file.txt” with the following contents:
Contents of file.txt:
Black
Blue
Red
To compare file.txt with the terminal input, run the following command:
comm file.txt -
Then type the lines you want to compare with file.txt, pressing Enter after each one, and finish the input with Ctrl+D. For example, type:
Black
Orange
White
The result is shown below (in the terminal, comm may print some of these lines while you are still typing):
                Black
Blue
        Orange
Red
        White
Column 1 contains lines unique to file.txt, column 2 contains lines that appear only in the terminal input, and column 3 contains lines common to both.
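The hyphen works the same way in a pipeline, which is often more practical than typing the lines by hand. A minimal sketch with made-up data, where printf stands in for whatever command produces the second sorted list:

```shell
workdir=$(mktemp -d)
printf 'Black\nBlue\nRed\n' > "$workdir/file.txt"

# The hyphen tells comm to read the second input from standard input
piped=$(printf 'Black\nOrange\nWhite\n' | LC_ALL=C comm "$workdir/file.txt" -)
printf '%s\n' "$piped"

rm -r "$workdir"
```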
Comparing directories using comm
The comm command is designed to compare files, not directories. However, you can combine comm with ls to compare directory listings. To do this, run the following command in a shell that supports process substitution, such as bash:
comm <(ls Directory1) <(ls Directory2)
Sample output:
Differences1
Differences2
Differences3
Differences4
The same file
Column 1 lists the filenames unique to Directory1, column 2 lists the filenames unique to Directory2, and column 3 lists the filenames common to both.
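In shells without process substitution, the same comparison can be done by saving one listing to a file and piping the other into comm. The sketch below uses made-up directory contents and assumes that no filename contains a newline or a tab. LC_ALL=C is applied to both ls and comm so that they agree on the sort order.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/Directory1" "$workdir/Directory2"
touch "$workdir/Directory1/a.conf" "$workdir/Directory1/shared.conf"
touch "$workdir/Directory2/b.conf" "$workdir/Directory2/shared.conf"

# Save the first listing, then feed the second through standard input
LC_ALL=C ls "$workdir/Directory1" > "$workdir/list1"
dir_diff=$(LC_ALL=C ls "$workdir/Directory2" | LC_ALL=C comm "$workdir/list1" -)
printf '%s\n' "$dir_diff"

rm -r "$workdir"
```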
Customizing the comm output to display Line Counts
The --total option makes comm append a summary line with the number of lines in each column. To see it when comparing file1.txt and file2.txt, run the following command:
comm --total file1.txt file2.txt
Output:
                Black
                Blue
        Cyan
        Green
        Purple
Red
White
Yellow
        white
3       4       2       total
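The --total option is a GNU extension, so it may be missing from other comm implementations. As a portable sketch, the same three counts can be obtained by counting the lines of each column separately with wc -l; the sample files and their contents are made up for illustration.

```shell
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n'       > "$workdir/a.txt"
printf 'banana\ncherry\ndate\nelder\n' > "$workdir/b.txt"

# Count each column on its own; tr strips padding that some wc versions add
only_a=$(LC_ALL=C comm -23 "$workdir/a.txt" "$workdir/b.txt" | wc -l | tr -d ' ')
only_b=$(LC_ALL=C comm -13 "$workdir/a.txt" "$workdir/b.txt" | wc -l | tr -d ' ')
both=$(LC_ALL=C comm -12 "$workdir/a.txt" "$workdir/b.txt" | wc -l | tr -d ' ')

printf '%s\t%s\t%s\ttotal\n' "$only_a" "$only_b" "$both"
rm -r "$workdir"
```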
Comparing files by filtering results using comm and grep
The output of comm can be piped to other commands for more complex processing. For example, you can keep only the lines unique to file1.txt that contain a specific pattern by combining comm and grep:
comm -23 file1.txt file2.txt | grep 'pattern'
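Here is the same pipeline with a concrete pattern. The account names and the admin_ prefix are invented for this sketch; the pattern ^admin_ keeps only lines that start with that prefix.

```shell
workdir=$(mktemp -d)
printf 'admin_alice\nadmin_bob\nuser_carol\n' > "$workdir/file1.txt"
printf 'admin_bob\nuser_dave\n'               > "$workdir/file2.txt"

# Lines only in file1.txt, then keep those starting with "admin_"
matches=$(LC_ALL=C comm -23 "$workdir/file1.txt" "$workdir/file2.txt" | grep '^admin_')
printf '%s\n' "$matches"

rm -r "$workdir"
```

In this run, admin_bob is dropped because it appears in both files and user_carol is dropped by grep, so only admin_alice remains.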
Conclusion
The comm command is an efficient tool for comparing sorted files, and becoming proficient with it makes many administration tasks faster.
This article has provided a guide to using the comm command on a Linux VPS. If you need more information about comm, refer to the command's man page or leave a comment below.