---
tags:
  - Programming_Languages
  - shell
---

## Sorting strings: `sort`

If you have a `.txt` file containing text strings, each on its own line, you can use the `sort` command to put them in alphabetical order:

````bash
sort file.txt
````

Note that this does not change the file: the sorted lines are only written to standard output. To save the result, redirect the output to a separate file in the usual way:

````bash
sort file.txt > output.txt
````

### Options

* `-r`
  * reverse sort
* `-c`
  * check whether the file is already sorted. If it is not, `sort` reports the first line that is out of order and exits with a non-zero status
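
As a rough illustration of the two options above, the sketch below sorts a small sample file in reverse and then uses the exit status of `-c` to test whether it was already sorted. The file name `fruit.txt` and its contents are made up for the example.

````bash
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf 'pear\napple\nmango\n' > "$workdir/fruit.txt"   # hypothetical sample data

# -r: reverse alphabetical order
reversed=$(sort -r "$workdir/fruit.txt")
echo "$reversed"

# -c: check only; nothing is sorted, the exit status carries the answer
if sort -c "$workdir/fruit.txt" 2>/dev/null; then
  sorted_status="sorted"
else
  sorted_status="not sorted"
fi
echo "$sorted_status"

rm -rf "$workdir"
````

Because `-c` communicates through its exit status, it is convenient inside `if` statements in scripts.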

## Find and replace: `sed`

The `sed` program can be used for find and replace. In `sed`, find and replace is handled by the substitution command, `s`:

````bash
sed 's/word/replacement word/' file.txt
````

However, this only changes the first instance of the word on each line. To replace every instance, add the global flag `g` to the end of the expression: `s/word/replacement word/g`.
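
The difference between a plain substitution and a global one is easiest to see on a single line that contains the word twice. This is a minimal sketch; the input line and the replacement `term` are invented for the example.

````bash
line='word by word'   # hypothetical input line

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/word/term/')

# With g: every match on the line is replaced.
every_match=$(printf '%s\n' "$line" | sed 's/word/term/g')

echo "$first_only"    # term by word
echo "$every_match"   # term by term
````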

As `sed` is a stream editor, any changes you make with it only appear in standard output; they are not saved to the file. To save them, redirect the output to a new file (using `> output.txt`) after the name of the original file. This has the benefit of leaving the original file untouched whilst ensuring the result is stored permanently.
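
A short sketch of the redirect approach, using a throwaway directory and made-up file contents, shows that the original file keeps its text while the edited copy lands in `output.txt`:

````bash
workdir=$(mktemp -d)
printf 'one word\nanother word\n' > "$workdir/file.txt"   # hypothetical sample data

# Write the edited stream to a separate file; file.txt is only read.
sed 's/word/term/g' "$workdir/file.txt" > "$workdir/output.txt"

original=$(cat "$workdir/file.txt")
edited=$(cat "$workdir/output.txt")
echo "$original"
echo "$edited"

rm -rf "$workdir"
````

The output file needs a different name from the input: the shell empties the redirect target before `sed` starts reading, so redirecting onto the input file would lose its contents.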

Alternatively, you can use the `-i` option, which edits the source file in place instead of writing the result to standard output.

Note that this overwrites the original version of the file, which cannot be recovered. If this is an issue, attach a backup suffix to the `-i` option, like so:

````bash
sed -i.bak 's/word/replacement word/' file.txt
````

This creates the file `file.txt.bak` in the same directory as `file.txt`; it holds the original contents from before the replacement was carried out.

### Remove duplicates

The `sort -u` command can be used to remove duplicates:

````bash
sort -u file.txt
````

`sort -u` sorts the lines first and then keeps only one copy of each, so duplicates are removed wherever they appear in the file. Sorting first matters because duplicate detection of this kind (as in the `uniq` command) only compares adjacent lines.
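
To illustrate why adjacency matters, the sketch below runs `uniq` (which only collapses neighbouring duplicates) on unsorted input and compares it with `sort -u`. The sample data is made up for the example.

````bash
workdir=$(mktemp -d)
printf 'b\na\nb\na\n' > "$workdir/file.txt"   # duplicates that are never adjacent

# uniq alone removes nothing here, because no two equal lines are neighbours.
unsorted_uniq=$(uniq "$workdir/file.txt")

# sort -u sorts first, so each line survives exactly once.
deduped=$(sort -u "$workdir/file.txt")

# Equivalent two-step form: sort, then collapse adjacent duplicates.
piped=$(sort "$workdir/file.txt" | uniq)

echo "$unsorted_uniq"
echo "$deduped"

rm -rf "$workdir"
````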

## Split a large file into multiple smaller files: `split`

Suppose you have a file containing 1000 lines. You want to break the file up into five separate files, each containing two hundred lines. You can use `split` to accomplish this, like so:

````bash
split -l 200 big-file.txt new-file-
````

`split` appends a two-letter suffix to the prefix you give it, so the resulting five files are named as follows:

* new-file-aa
* new-file-ab
* new-file-ac
* new-file-ad
* new-file-ae

If you would rather have numeric suffixes, use the option `-d`. You can also split a file by size in bytes, using the option `-b` and specifying the size of each output file.
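
The two options can be tried out on a generated file. This sketch assumes GNU `split` (the `-d` option is not available in every implementation) and uses `seq` to build a 1000-line test file; the prefixes `part-` and `chunk-` are arbitrary names chosen for the example.

````bash
workdir=$(mktemp -d)
seq 1 1000 > "$workdir/big-file.txt"   # 1000 lines, one number per line

# -d: numeric suffixes, so the pieces are part-00, part-01, ...
split -l 200 -d "$workdir/big-file.txt" "$workdir/part-"
part_count=$(ls "$workdir"/part-* | wc -l)
first_part=$(basename "$(ls "$workdir"/part-* | head -n 1)")

# -b: split by size; each piece holds at most 1000 bytes
split -b 1000 "$workdir/big-file.txt" "$workdir/chunk-"
chunk_count=$(ls "$workdir"/chunk-* | wc -l)

echo "$first_part: $part_count line-based pieces, $chunk_count byte-based pieces"

rm -rf "$workdir"
````

Note that `-b` cuts at byte boundaries, so a line can end up divided between two pieces.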

## Merge multiple files into one with `cat`

We can use `cat` to read multiple files at once and then add a redirect to save the combined output to a file:

````bash
cat file_a.txt file_b.txt file_c.txt > merged-file.txt
````
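
One way to convince yourself that `cat` joins files in the order given is a round trip: split a file into pieces, merge the pieces again, and compare the result with the original. This is only a sketch; the file names and the `piece-` prefix are invented for the example, and `cmp -s` is used for a silent byte-by-byte comparison.

````bash
workdir=$(mktemp -d)
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > "$workdir/original.txt"

# Break the file into two-line pieces: piece-aa, piece-ab, piece-ac
split -l 2 "$workdir/original.txt" "$workdir/piece-"

# The glob expands in alphabetical order, so the pieces are merged in sequence.
cat "$workdir"/piece-* > "$workdir/merged-file.txt"

if cmp -s "$workdir/original.txt" "$workdir/merged-file.txt"; then
  roundtrip="identical"
else
  roundtrip="different"
fi
echo "$roundtrip"

rm -rf "$workdir"
````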

## Count lines, words, etc: `wc`

To count the lines, words and bytes in a file:

````bash
wc file.txt
````

When we run the command, three numbers are printed, in order: lines, words, bytes.

You can use options to get just one of the numbers: `-l` (lines), `-w` (words), `-c` (bytes).
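
As a small sketch of the single-number options, the example below counts a two-line sample file. Reading from standard input (`< file`) keeps the file name out of the output, which makes the numbers easier to capture in variables; the sample text is made up.

````bash
workdir=$(mktemp -d)
printf 'one two\nthree\n' > "$workdir/file.txt"   # 2 lines, 3 words, 14 bytes

line_count=$(wc -l < "$workdir/file.txt")
word_count=$(wc -w < "$workdir/file.txt")
byte_count=$(wc -c < "$workdir/file.txt")

echo "lines: $line_count, words: $word_count, bytes: $byte_count"

rm -rf "$workdir"
````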