MERGING AND SORTING FILES IN LINUX: EASIER THAN YOU THINK

There are several reasons to choose Linux over other operating systems such as Windows and macOS. Linux is an open-source, secure, and very lightweight operating system consuming minimal system resources. It also has huge community support and has a ton of distros (variants) to choose from. While we have already posted a bunch of articles on simple file handling methods in Linux, sending email from the terminal, and more, we are going to walk you through the simple yet efficient process of merging and sorting files in Linux.

Just like with any other operation in Linux, there are several ways to sort and merge files. Which method to use depends on the user and on what needs to be accomplished. In this article, we will show you some easy yet powerful file sorting and merging methods in Linux while pointing out the differences and importance of each method.

Cat

Cat is one of the simplest commands in Linux for combining multiple files into one. List all the files that you wish to merge, then redirect the output with ">" to the new file you wish to create. If a file with the name of the final output already exists, it will be overwritten by the one being created.

Here is a very simple example of the cat command:

$ cat file1 file2 file3 file4 > Newfile

However, if you wish to append information from multiple files to an already existing file, you can use ">>" instead of ">". Below is an example:

$ cat file1 file2 file3 file4 >> Newfile
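
The difference between ">" and ">>" is easiest to see by running both. The following is only a small sketch: the file names (part1.txt, part2.txt, combined.txt) and the scratch directory are made up for illustration and are not part of the examples above.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical input files with one line each.
printf 'alpha\n' > part1.txt
printf 'beta\n' > part2.txt

# ">" creates combined.txt, or replaces its contents if it already exists.
cat part1.txt part2.txt > combined.txt

# ">>" keeps the existing contents and adds the new lines at the end.
cat part1.txt >> combined.txt

# Show the result: the two merged lines followed by the appended one.
cat combined.txt
```

Running the first cat command a second time would reset combined.txt to just the two merged lines, because ">" always starts from an empty file.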

The cat command can be used in many other ways, too. It is also one of the simplest ways of reading the content of a file. To view the content of a file called file1, simply use the command below.

$ cat file1

Join

Join is another command to merge the data of files. While it is as easy to use as the cat command, it has a catch. Unlike cat, join cannot simply combine the data of multiple files. Instead, the command allows users to merge the content of two files based on a common field.

For instance, consider that two files need to be combined. One file contains names, whereas the other file contains IDs. The join command can combine these files so that each name and its corresponding ID appear on the same line. However, users need to make sure that both files contain the common key field on which they will be joined.

Syntax

$ join [OPTION] FILE1 FILE2

Example: Assume file1.txt contains ...


1 Aarav
2 Aashi
3 Sukesh

And, file2.txt contains ...

1 101
2 102
3 103

The command ...

$ join file1.txt file2.txt

will result in:

1 Aarav 101
2 Aashi 102
3 Sukesh 103

Note that by default, the join command uses the first column of each file as the key, and it expects both files to be sorted on that column. Also, if you wish to store the joined data of the two files in another file, you can redirect the output of join:

$ join file1.txt file2.txt > result.txt
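
The join example above works because the key column in both files is already in order. When it is not, the inputs need to be sorted first. Here is a minimal sketch of that workflow; the file names (names.txt, ids.txt and the .sorted copies) are hypothetical and the data simply reuses the sample values in a shuffled order.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical input files whose key column is not in order.
printf '3 Sukesh\n1 Aarav\n2 Aashi\n' > names.txt
printf '2 102\n3 103\n1 101\n' > ids.txt

# join expects both inputs sorted on the join field, so sort them first.
sort -k1,1 names.txt > names.sorted
sort -k1,1 ids.txt > ids.sorted

# Join on the first field (the default) and save the result to a file.
join names.sorted ids.sorted > result.txt

# Each line now holds the key, the name, and the matching ID.
cat result.txt
```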

Paste

The paste command joins multiple files horizontally by merging them in parallel. It writes the corresponding lines of each specified file to standard output, separated by a tab by default.

Assume there is a file called numbers.txt containing the numbers 1 to 4, and two more files called countries.txt and capital.txt containing four countries and their corresponding capitals, respectively. The command below joins the lines of these three files, separated by a tab.

$ paste numbers.txt countries.txt capital.txt

However, you can also specify any delimiter by adding the delimiter option to the above command. For example, if we need the delimiter to be "-", you can use this command:

$ paste -d "-" numbers.txt countries.txt capital.txt

There are several other options available for the paste command, and more information can be found in its manual page (man paste).
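
To try the paste commands without preparing files by hand, the sketch below builds the three input files first. The countries and capitals are illustrative values chosen for this example, and merged.txt is a hypothetical output file name.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data; the countries and capitals are illustrative values only.
printf '1\n2\n3\n4\n' > numbers.txt
printf 'India\nFrance\nJapan\nCanada\n' > countries.txt
printf 'New Delhi\nParis\nTokyo\nOttawa\n' > capital.txt

# Default behavior: the columns are separated by a tab.
paste numbers.txt countries.txt capital.txt

# Custom delimiter: use plain ASCII quotes around the "-" character.
paste -d "-" numbers.txt countries.txt capital.txt > merged.txt

# Show the dash-separated result.
cat merged.txt
```

Note that the quotes around the delimiter must be plain ASCII quotes. Typographic (curly) quotes copied from a web page are not treated as quotes by the shell and would become part of the delimiter list.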

Sort

The sort command in Linux, as the name suggests, arranges the lines of a file in a particular order. Sort can also be combined with other Linux commands such as cat by joining the two commands with a pipe "|" symbol.

For instance, if you wish to merge multiple files, sort them alphabetically and store them in another file, you can use this command:

$ cat file1.txt file2.txt file3.txt | sort > finalfile.txt

The above command merges the files, sorts the combined content, and then stores it in finalfile.txt.

You can also use the sort command to simply sort a single file containing information:

$ sort file.txt

The command above does not change or modify the data in file.txt and is, therefore, just for displaying the sorted data on the console.
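
Sort also accepts options that change the ordering or write the result back to a file. The following sketch shows a few standard ones (-n, -r, -u and -o); the numbers.txt file and its contents are invented for this illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A hypothetical file with numbers, including one duplicate.
printf '10\n9\n100\n9\n' > numbers.txt

# Plain sort compares lines as text, so "10" and "100" come before "9".
sort numbers.txt

# -n compares by numeric value; -r reverses the order; -u drops duplicate lines.
sort -n numbers.txt
sort -nr numbers.txt
sort -nu numbers.txt

# -o writes the result to a file; naming the input file itself sorts it in place.
sort -n -o numbers.txt numbers.txt

# The file itself is now in numeric order.
cat numbers.txt
```

Using -o is the safe way to sort a file in place. Redirecting with ">" to the same file would empty it before sort had a chance to read it.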

There are several other ways of merging and sorting files and data in the Linux operating system. One of Linux's strengths is its ability to chain multiple commands together to accomplish a task. Once users become acquainted with these commands, they can save a lot of time and effort while performing tasks with more precision and efficiency.