From 537038e32c8a4a2173d035dc7064a34429c5b4ae Mon Sep 17 00:00:00 2001
From: thomasabishop
Date: Fri, 13 Oct 2023 11:26:39 +0100
Subject: [PATCH] architecture: complete updates to notes on memory

---
 Computer_Architecture/Memory/Memory.md      | 80 ++++++++++++++++++++-
 Computer_Architecture/Memory/Memory_temp.md | 60 ----------------
 2 files changed, 78 insertions(+), 62 deletions(-)
 delete mode 100644 Computer_Architecture/Memory/Memory_temp.md

diff --git a/Computer_Architecture/Memory/Memory.md b/Computer_Architecture/Memory/Memory.md
index 5ce14dc..c054756 100644
--- a/Computer_Architecture/Memory/Memory.md
+++ b/Computer_Architecture/Memory/Memory.md
@@ -39,6 +39,50 @@ SRAM (Static Random Access Memory) is also volatile memory but its electronical
 SRAM uses [flip flops](/Electronics_and_Hardware/Digital_circuits/Flip_flops.md) to store the bits. It also uses multiple transistors per bit. This makes it faster than DRAM but more expensive. DRAM is at least ten times slower than SRAM.
 
+## The role of memory in computation
+
+The following steps outline the way in which memory interacts with the processor during computational cycles, once the [bootstrapping](/Operating_Systems/Boot_process.md) process has completed and the OS kernel is itself loaded into memory.
+
+1. A file is loaded from the hard disk into memory.
+2. The instruction at the first address is sent to the CPU, travelling across the data bus part of the [system bus](/Computer_Architecture/Bus.md).
+3. The CPU processes this instruction and then sends a request across the address bus part of the system bus for the next instruction to the memory controller within the [chipset](/Computer_Architecture/Chipset_and_controllers.md).
+4. The chipset finds where this instruction is stored within the [DRAM](/Computer_Architecture/Memory/Memory.md#dram) and issues a request to have it read out and sent to the CPU over the data bus.
+
+> This is a simplified account; it is not the case that only single requests are passed back and forth. This would be inefficient and time-wasting. The kernel sends to the CPU not just the first instruction in the requested file but also a number of instructions that immediately follow it.
+
+![](/_img/memory-flow.svg)
+
+Every part of the above process - the journey across the bus, the lookup in the controller, the operations on the DRAM, the journey back across the bus - takes multiple CPU clock cycles.
+
+## CPU register and cache memory
+
+As partly indicated in the diagram above, the CPU has its own memory in the form of registers and cache memory.
+
+Registers are a form of memory positioned on the same chip as the CPU. They are very fast but can only store a small amount of data. They are used to store the results of calculations and the addresses of the next instructions to be processed.
+
+The cache is SRAM memory that is separate from the DRAM memory which comprises the main memory. It exists in order to boost performance when executing the read/request cycles of the steps detailed above.
+
+For more detail see [CPU architecture](/Computer_Architecture/CPU/CPU_architecture.md).
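+
+To get a feel for why the cache matters, here is a rough, illustrative C sketch that reads the same large array twice: once sequentially and once with a large stride. Both passes touch every byte exactly once, but the sequential pass is usually noticeably faster because neighbouring bytes are pulled into the SRAM cache together, so most reads never need to go out to DRAM. Exact timings are machine-dependent. The levels of cache involved are described below.
+
+```c
+/* Illustrative only: both passes sum every byte of the array exactly
+ * once; only the order of access changes. */
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+
+#define N (64u * 1024u * 1024u) /* a 64 MiB array of bytes */
+
+static double time_pass(const unsigned char *a, size_t step) {
+    clock_t start = clock();
+    volatile unsigned long sum = 0;
+    /* Visit every index once, in strides of `step` */
+    for (size_t offset = 0; offset < step; offset++)
+        for (size_t i = offset; i < N; i += step)
+            sum += a[i]; /* cache-friendly when step == 1 */
+    return (double)(clock() - start) / CLOCKS_PER_SEC;
+}
+
+int main(void) {
+    unsigned char *a = malloc(N);
+    if (a == NULL) return 1;
+    for (size_t i = 0; i < N; i++) a[i] = (unsigned char)i;
+
+    printf("sequential pass (step 1):  %.3f s\n", time_pass(a, 1));
+    printf("strided pass (step 4096):  %.3f s\n", time_pass(a, 4096));
+
+    free(a);
+    return 0;
+}
+```
+
+On most machines the strided pass is noticeably slower even though it performs exactly the same number of additions - the difference comes down to how well each access pattern uses the cache.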
+
+There are two types of cache memory:
+
+- L1 cache
+  - Situated on the CPU chip itself
+- L2 cache
+  - Situated outside of the CPU on its own chip
+
+The L1 cache is the fastest since the data has less distance to travel when moving to and from the CPU. This said, the L2 cache is still very fast when compared to the main memory, both because it is SRAM rather than DRAM and because it is closer to the processor than the main memory.
+
+Cache controllers use complex algorithms to determine what should go into the cache to facilitate the best performance, but generally they work on the principle that what has been previously used by the CPU will be requested again soon. If the CPU has just asked for an instruction at memory location 555 it's very likely that it will next ask for the one at 556, and after that the one at 557 and so on. The cache's controller circuits therefore go ahead and fetch these from slow DRAM to fast SRAM.
+
+## The memory hierarchy
+
+The diagram below compares the different forms of memory within a computing device in terms of speed, monetary cost and capacity:
+
+![](/_img/Memory-Hierarchy.jpg)
+
 ## Memory addresses
 
 > Computers assign numeric addresses to bytes of memory and the CPU can read or write to those addresses
@@ -57,6 +101,38 @@ Let's imagine we have a computer system that can address up to 64KB of memory an
 
 We therefore have 65,536 addresses and each address can store one byte. So our addresses go from 0 to 65,535.
 
-We now need to consider how many bits we need to represent an address on this system.
+We now need to consider how many bits we need to uniquely represent an address on this system.
 
-ChatGPT account: https://chat.openai.com/c/921e1415-4965-4fc4-af11-58db5dcef0f1
+What does this mean? Although there are 65,536 (roughly 64 thousand) bytes of memory, we can't simply refer to each byte as 1, 2, 3... because computers use binary numbers. We need a binary number to refer to a given byte in the 64KB of memory. The question we are asking is: how long does this binary number need to be in order to represent each of the 65,536 bytes?
+
+1 bit can represent two addresses: 0 and 1. 2 bits can represent four addresses: 00, 01, 10, 11. The formula is as follows: number of addresses = $2^n$ where $n$ is the number of bits.
+
+We need to reverse this formula to find out how many bits we need to represent a given number of addresses. We can do this with a [logarithm](/Mathematics/Algebra/Logarithms.md).
+
+We can reverse the formula as follows: number of bits = $\log_2(\text{number of addresses})$.
+
+In our case we have 65,536 addresses, so we need $\log_2(65536) = 16$ bits to represent each address. Thus a 16-bit memory address is needed to address 65,536 bytes.
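+
+As a quick check of the arithmetic, here is a short, illustrative C sketch that derives the address width from the number of addressable bytes using the $\log_2$ relationship above:
+
+```c
+#include <math.h>
+#include <stdio.h>
+
+int main(void) {
+    double addresses = 64.0 * 1024.0; /* 64KB of memory, one byte per address */
+    double bits = log2(addresses);    /* number of bits = log2(number of addresses) */
+    printf("%.0f addresses require a %.0f-bit address\n", addresses, bits);
+    return 0;
+}
+```
+
+Compiled with something like `cc address_width.c -lm` (the file name is arbitrary), this should print `65536 addresses require a 16-bit address`, matching the result above.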
+
+Using memory addresses we end up with tables like the following:
+
+| Memory address   | Data     |
+| ---------------- | -------- |
+| 0000000000000000 | 01000001 |
+| 0000000000000001 | 01000010 |
+| 0000000000000010 | 01000011 |
+
+This is hard to parse so we can instead use [hexadecimal numbers](/Electronics_and_Hardware/Binary/Hexadecimal_number_system.md) to represent the addresses:
+
+| Memory address (as hex) | Data (as binary) |
+| ----------------------- | ---------------- |
+| 0x0000                  | 01000001         |
+| 0x0001                  | 01000010         |
+| 0x0002                  | 01000011         |
+
+By itself, the data is meaningless but we know from [binary encoding](/Electronics_and_Hardware/Binary/Binary_encoding.md) that the binary data will correspond to some meaningful data, such as a character or a colour, depending on the encoding scheme used. The above table could correspond to the characters 'A', 'B' and 'C' (decimal 65, 66 and 67) in the ASCII encoding scheme:
+
+| Memory address (as hex) | Data (as binary) | Data (as ASCII) |
+| ----------------------- | ---------------- | --------------- |
+| 0x0000                  | 01000001         | A               |
+| 0x0001                  | 01000010         | B               |
+| 0x0002                  | 01000011         | C               |
diff --git a/Computer_Architecture/Memory/Memory_temp.md b/Computer_Architecture/Memory/Memory_temp.md
deleted file mode 100644
index 65422d9..0000000
--- a/Computer_Architecture/Memory/Memory_temp.md
+++ /dev/null
@@ -1,60 +0,0 @@
-### Relative speeds and placement of memory types
-
-SRAM is used as [cache memory](/Computer_Architecture/Memory/Role_of_memory_in_computation.md#the-role-of-the-cache) on the [motherboard](/Electronics_and_Hardware/Motherboard.md) of which there are two types: L1 (on the processor chip) and L2 (separate from the processor).
-
-The table below details the relative speeds of the different types of memory and those of other types of motherboard storage.
-
-| Storage type | Access speed (clock cycles) | Relative times slower |
-| ------------ | --------------------------- | --------------------- |
-| CPU register | 2 | |
-| L1 cache | 4 | 2x |
-| L2 cache | 6-20 | 3-10x |
-| DRAM memory | 50 | 25x |
-| Harddisk | 2000 | 1000x |
-
-## The memory hierarchy
-
-The diagram below compares the different forms of memory within a computing device in terms of speed, monetary cost and capacity:
-
-![](/_img/Memory-Hierarchy.jpg)
-
-# The role of memory in computation
-
-The following steps outline the way in which memory interacts with the processor during computational cycles, once the [bootstrapping](/Operating_Systems/Boot_process.md) process has completed and the OS kernel is itself loaded into memory.
-
-1. A file is loaded from the harddisk into memory.
-2. The instruction at the first address is sent to the CPU, travelling accross the data bus part of the [system bus](/Computer_Architecture/Bus.md).
-3. The CPU processes this instruction and then sends a request accross the address bus part of the system bus for the next instruction to the memory controller within the [chipset](/Computer_Architecture/Chipset_and_controllers.md).
-4. The chipset finds where this instruction is stored within the [DRAM](/Computer_Architecture/Memory/Memory.md#dram) and issues a request to have it read out and send to the CPU over the data bus.
-
-> This is a simplified account; it is not the case that only single requests are passed back and forth. This would be inefficient and time-wasting.
The kernel sends to the CPU not just the first instruction in the requested file but also a number of instructions that immediately follow it.
-
-![](/_img/memory-flow.svg)
-
-Every part of the above process - the journey accross the bus, the lookup in the controller, the operations on the DRAM, the journey back accross the bus - takes multiple CPU clock cycles.
-
-## The role of the cache
-
-The cache is SRAM memory that is separate from the DRAM memory which comprises the main memory. It exists in order to boost perfomance when executing the read/request cycles of the steps detailed above.
-
-There are two types of cache memory:
-
-- L1 cache
-  - Situated on the CPU chip itself
-- L2 cache
-  - Situated outside of the CPU on its own chip
-
-The L1 cache is the fastest since the data has less distance to travel when moving to and from the CPU. This said, the L2 cache is still very fast when compared to the main memory, both because it is SRAM rather than DRAM and because it is closer to the processor than the main memory.
-
-Cache controllers use complex algorithms to determine what should go into the cache to facilitate the best performance, but generally they work on the principle that what has been previously used by the CPU will be requested again soon. If the CPU has just asked for an instruction at memory location 555 it's very likely that it will next ask for the one at 556, and after that the one at 557 and so on. The cache's controller circuits therefore go ahead and fetch these from slow DRAM to fast SRAM.
-
-## Relation between cache and buffers
-
-The terms _cache_ and _buffer_ are often used interchangeably because they are both types of temporary storage used to speed up CPU operations. Also they are both mechanisms for avoiding writing data to a storage device in the midst of active computation. They are different however:
-
-- A cache is used to store a subset of data (typically transient in nature) from a more permanent or slower storage location. In the context of the CPU, the L1 is a cache whereas the DRAM is the more permanent storage location.
-- A buffer is a temporary storage area for data while it is being transferred from one place to another. It helps with "smoothing out" data transfers, ensuring that the sending and receiving entities (which might operate at different speeds) can handle the data transfer effectively.
-
-Whereas a CPU cache is a **physical** part of the processor a buffer is more of a **logical** concept implemented within the system software. However a buffer does use physical memory - it is portion of RAM set aside for temporary storage.
-
-[Registers](/Computer_Architecture/CPU/CPU_architecture.md#registers) should not be confused with caches. Unlike the caches, registers are a part of the CPU itself. They are much quicker but hold less data than the caches.