Last Sync: 2022-10-06 11:00:05
parent
d0ca9a56f7
commit
5da93d5b97
2 changed files with 31 additions and 1 deletions
@@ -9,7 +9,7 @@ tags: [binary]

We know that everything going on in a computer is the manipulation of binary digits. Thus all data must ultimately reduce to binary numbers controlled through logic circuits.

_Encoding_ is the process of establishing a correspondence between certain binary numbers and symbols. For certain essential data types, such as alphanumeric characters and colours, there are agreed encoding standards, so that, for example, `111111` (binary), or `3F` in hex, always corresponds to the character `?`. The reverse process is _decoding_: deriving the data or symbol from its binary form.
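
The correspondence between `?`, `111111` and `3F` can be checked with a minimal sketch. It assumes Python's built-in `ord` and `chr`, which map a character to its numeric code point and back; the variable names are illustrative only.

```python
# Illustrative sketch: encode a symbol to a number, then decode it again.
symbol = "?"

# Encoding: symbol -> number, written out in binary and in hex.
code = ord(symbol)
binary = format(code, "b")
hexadecimal = format(code, "X")

# Decoding: number -> symbol, starting from either representation.
decoded_from_binary = chr(int(binary, 2))
decoded_from_hex = chr(int(hexadecimal, 16))

print(binary, hexadecimal, decoded_from_binary, decoded_from_hex)
```

Binary and hex are two spellings of the same number, so decoding from either one yields the same symbol.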

The length of the binary number is determined by the number of distinct values you need to represent. For example, suppose a computer game has 18 levels. To encode a reference for each level, we need a binary number capable of at least 18 distinct values.
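
The arithmetic behind the 18-level example can be sketched as follows. The helper `bits_needed` is a hypothetical name introduced here for illustration; it returns the smallest number of bits $n$ such that $2^n$ is at least the required number of variations.

```python
def bits_needed(variations: int) -> int:
    """Smallest number of bits n such that 2**n >= variations."""
    if variations < 1:
        raise ValueError("variations must be a positive integer")
    # (v - 1).bit_length() equals ceil(log2(v)) for v >= 1, with no floating point.
    return (variations - 1).bit_length()


levels = 18
bits = bits_needed(levels)
capacity = 2 ** bits

print(f"{levels} levels need {bits} bits (capacity {capacity})")
```

Four bits give only 16 variations, which is too few for 18 levels, so five bits (32 variations) is the smallest length that fits.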
@@ -22,3 +22,11 @@ Here, a 32-bit ($2^{5}$) number would be best because the next smallest (16-bit)
00100 (4)
...
```

> An encoding system maps each symbol to a unique sequence of bits. A computer then interprets that sequence of bits and displays the appropriate symbol to the user.

## Related points

Consider what happens when you open a file in a text editor that cannot decode its format, for example a Word document in VSCode. The mangled characters it displays are the encoded binary data, misread as plain text. When you open the file in Word, the correct decoding is applied and the content appears as you would expect.
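
The effect can be imitated with a small sketch. It does not read a real Word document. Instead it assumes a hypothetical "file" holding three numbers packed as raw 32-bit integers with Python's `struct` module, and a text editor that tries to read those bytes as UTF-8.

```python
import struct

# Hypothetical file contents: three numbers stored as raw little-endian 32-bit integers.
values = (2022, 10, 6)
raw = struct.pack("<3I", *values)

# What a text editor does: treat the bytes as text (UTF-8 assumed here),
# substituting a replacement character wherever the bytes are not valid text.
as_text = raw.decode("utf-8", errors="replace")

# What the matching program does: apply the decoding the data was written with.
recovered = struct.unpack("<3I", raw)

print(repr(as_text))
print(recovered)
```

The bytes are identical in both cases; only the decoding applied to them differs.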

When we save a file, the file extension denotes its format, and a file format is itself an encoding. For example, if you save an image as `.png` rather than `.jpg`, you are applying a different encoding algorithm, one that compresses the raw binary data in a different way.
@@ -0,0 +1,22 @@
---
title: Text encoding
categories:
  - Computer Architecture
tags: [binary, ascii]
---

# Text encoding

Text encoding is an applied instance of [binary encoding](/Hardware/Binary/Binary_encoding.md).

Around 100 characters are required to render A-Z, a-z, 0-9 and special characters. ASCII (American Standard Code for Information Interchange) covers these with a 7-bit code, giving $2^7 = 128$ code points. In practice each character is stored in one byte (8 bits) with a leading `0`. A full byte could distinguish $2^8 = 256$ values, but the upper 128 are not defined by ASCII.
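
As a rough check of this arithmetic, the sketch below assumes Python's built-in `ascii` codec and an arbitrary sample string. It encodes the string to bytes and shows that every ASCII character fits in 7 bits, so its 8-bit form starts with `0`.

```python
text = "Hi!"  # arbitrary sample string, chosen for illustration

# The ascii codec produces exactly one byte per character.
encoded = text.encode("ascii")

# Write each byte as 8 binary digits, padding with leading zeros.
bit_strings = [format(byte, "08b") for byte in encoded]

# Every ASCII value is below 2**7, so the top bit of each byte is 0.
all_leading_zero = all(bits.startswith("0") for bits in bit_strings)

seven_bit_capacity = 2 ** 7
eight_bit_capacity = 2 ** 8

print(bit_strings, all_leading_zero, seven_bit_capacity, eight_bit_capacity)
```

Decoding reverses the process: `encoded.decode("ascii")` returns the original string.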

Below are some examples of ASCII correspondences:

| Binary   | Hex | Character |
| -------- | --- | --------- |
| 00100000 | 20  | [space]   |
| 00100001 | 21  | !         |
| 00101011 | 2B  | +         |
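
Rows like these can be regenerated rather than typed by hand. The sketch below uses a hypothetical helper, `ascii_row`, that returns the 8-bit binary string and the two-digit hex string for a single ASCII character.

```python
def ascii_row(char: str) -> tuple:
    """Return (binary, hex) strings for one ASCII character."""
    code = ord(char)
    if len(char) != 1 or code > 127:
        raise ValueError("expected a single ASCII character")
    # 8 binary digits and 2 uppercase hex digits, both zero-padded.
    return format(code, "08b"), format(code, "02X")


# Rebuild the rows for space, "!" and "+".
rows = {char: ascii_row(char) for char in " !+"}

for char, (binary, hexadecimal) in rows.items():
    print(f"| {binary} | {hexadecimal} | {char!r} |")
```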

//TODO: Add notes on Unicode and UTF-8