55 lines
		
	
	
	
		
			2.1 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			55 lines
		
	
	
	
		
			2.1 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
---
 | 
						|
tags:
 | 
						|
  - binary
 | 
						|
---
 | 
						|
 | 
						|
# Binary encoding
 | 
						|
 | 
						|
We know that everything going on in a computer is the manipulation of binary
 | 
						|
digits. Thus all data must ultimately reduce to binary numbers manipulated
 | 
						|
through logic circuits.
 | 
						|
 | 
						|
_Encoding_ is the process of establishing a correspondence between sets of
 | 
						|
binary numbers and sets of symbols that are representative of a given type of
 | 
						|
data. For certain fundamental data types, for example alphanumeric characters
 | 
						|
and colours, there are agreed standards of encoding such that, for example, that
 | 
						|
`111111` (binary) and `3F` (hex) always corresponds to the character `?`. The
 | 
						|
reverse is obviously _decoding_: deriving the data/ symbol from the binary
 | 
						|
format.
 | 
						|
 | 
						|
> An encoding system maps each symbol to a unique sequence of bits. A computer
 | 
						|
> then interprets that sequence of bits and displays the apppropriate symbol to
 | 
						|
> the user.
 | 
						|
 | 
						|
The length of the binary number (sometimes called the _word length_)
 | 
						|
corresponding to 8-bit, 16-bit, 32-bit etc. that is used to represent a given
 | 
						|
data set is determined by the number of variations that you require to capture
 | 
						|
the entire range of the dataset:
 | 
						|
 | 
						|
> An $n$-bit binary system can encode $2^n$ different things
 | 
						|
 | 
						|
For example, say we know that there are 18 levels to a computer game. To encode
 | 
						|
a reference for each level we would need a binary number that is capable of at
 | 
						|
least 18 total variations. In this instance a 32-bit ($2^{5}$ ) number would be
 | 
						|
best because the next smallest (16-bit) would not be sufficient. Our levels
 | 
						|
would have representations as follows:
 | 
						|
 | 
						|
```
 | 
						|
00001 (1)
 | 
						|
00010 (2)
 | 
						|
00011 (3)
 | 
						|
00100 (4)
 | 
						|
...
 | 
						|
```
 | 
						|
 | 
						|
## Related points
 | 
						|
 | 
						|
Think about when you open a file format in a text editor that cannot decode it.
 | 
						|
For example trying to open a Word document in VSCode. The mangled letters it
 | 
						|
displays is the encoded binary data. When you open the file in Word, the
 | 
						|
decoding is applied and it resembles what you would expect.
 | 
						|
 | 
						|
When we save a file, the different file extensions denote different formats and
 | 
						|
these are encoding formats. For example if you save an image file as `.png`
 | 
						|
rather than `.jpg`, you are applying a different encoding algorithm to the data
 | 
						|
that compresses the raw binary data in a different way.
 |