
Module 5 Dfco


Design and Characteristics of Memory Hierarchy

Computer memory can be divided into five major levels based on use as well as speed. The processor moves between these levels according to its requirements. The five levels in a system's memory hierarchy are registers, cache memory, main memory, magnetic disk, and magnetic tape.
In this article, we will take a look at the Design and Characteristics of Memory Hierarchy
according to the GATE Syllabus for CSE (Computer Science Engineering). Read ahead to
learn more.

What are the Design and Characteristics of Memory Hierarchy?
Memory Hierarchy, in Computer System Design, is an enhancement that organises memory so as to minimise access time. The Memory Hierarchy was developed based on a program behaviour known as locality of reference. Here is a figure that demonstrates the various levels of the memory hierarchy clearly:
Memory Hierarchy Design
This Hierarchy Design of Memory is divided into two main types. They are:

External or Secondary Memory


It consists of Magnetic Tape, Optical Disk, Magnetic Disk, i.e. it includes peripheral storage
devices that are accessible by the system’s processor via I/O Module.

Internal Memory or Primary Memory


It consists of CPU registers, Cache Memory, and Main Memory. It is accessible directly by the
processor.

Characteristics of Memory Hierarchy


One can infer these characteristics of a Memory Hierarchy Design from the figure given above:

1. Capacity
It refers to the total volume of data that a system’s memory can store. The capacity increases
moving from the top to the bottom in the Memory Hierarchy.

2. Access Time
It refers to the time interval present between the request for read/write and the data availability.
The access time increases as we move from the top to the bottom in the Memory Hierarchy.

3. Performance
Earlier computer systems were designed without a Memory Hierarchy, and the speed gap between the CPU registers and the Main Memory grew because of the large difference in access time. This resulted in lower system performance, so an enhancement was required. That enhancement came in the form of the Memory Hierarchy Design, which increased the system's performance. One of the primary ways to increase the performance of a system is to minimise how far down the memory hierarchy the processor must go to reach the data it needs.
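The effect described above can be quantified with the standard average memory access time (AMAT) formula. The latencies and hit rate below are illustrative assumptions, not figures from this text:

```python
# Illustrative average memory access time (AMAT) calculation.
# The numbers are assumptions for demonstration, not measurements
# of any particular system.

def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Suppose a cache hit costs 1 ns, 5% of accesses miss, and a miss
# costs 100 ns (a main-memory access).
with_cache = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)
without_cache = 100  # every access goes to main memory

print(with_cache)     # 6.0 ns on average
print(without_cache)  # 100 ns
```

Even a small, fast cache with a high hit rate brings the average access time close to the cache's own latency.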

4. Cost per bit


The cost per bit increases as one moves from the bottom to the top in the Memory Hierarchy, i.e.
External Memory is cheaper than Internal Memory.

Design of Memory Hierarchy


In computers, the memory hierarchy primarily includes the following:

1. Registers
A register is a small amount of very fast storage, typically built from SRAM (static RAM), inside the computer processor that holds a data word, commonly 32 or 64 bits. A majority of processors include a status word register and an accumulator. The accumulator is primarily used to store the results of arithmetic operations, and the status word register holds condition flags used for decision making.

2. Cache Memory
The cache basically holds chunks of frequently used information from the main memory. Cache memory is located in, or very close to, the processor. A single-core processor rarely has multiple cache levels, whereas current multi-core processors typically have two private cache levels for every individual core, plus one level that is shared among the cores.

3. Main Memory
In a computer, the main memory is the memory unit that communicates directly with the CPU. It is the primary storage unit of a computer system: a large, reasonably fast memory used to store data and programs during the computer's operations. This type of memory is made up of both ROM and RAM.

4. Magnetic Disks
In a computer, magnetic disks are circular plates fabricated from plastic or metal and coated with a magnetised material. Both faces of a disk are frequently used, and several disks can be stacked on a single spindle, with a read/write head available for every surface. All the disks in a computer rotate together at high speed.

5. Magnetic Tape
Magnetic tape is a conventional magnetic recording medium consisting of a thin magnetizable coating on a long, narrow strip of plastic film. It is used mainly to back up huge chunks of data. When a computer needs to access a tape, it first mounts it to access the information; once the information has been read, the tape is unmounted. Access to memory on magnetic tape is much slower than on the other levels, and it can take minutes to access the data on a tape.
Cache Memory
Cache memory is a high-speed memory, small in size but faster than the main
memory (RAM). The CPU can access it more quickly than the primary memory, so it is
used to keep pace with the high-speed CPU and to improve its performance.

Cache memory can only be accessed by the CPU. It can be a reserved part of the main
memory or a storage device outside the CPU. It holds the data and programs that
are frequently used by the CPU, making sure that the data is instantly available
whenever the CPU needs it. In other words, if the CPU finds the required
data or instructions in the cache memory, it doesn't need to access the primary
memory (RAM). Thus, by acting as a buffer between RAM and the CPU, it speeds up
system performance.

Types of Cache Memory:


L1: It is the first level of cache memory, called Level 1 cache or L1 cache. A
small amount of this memory is present inside the CPU itself; if a CPU has four
cores (a quad-core CPU), each core has its own L1 cache.
Because this memory is inside the CPU, it can work at the same speed as the CPU.
Its size typically ranges from 2 KB to 64 KB. The L1 cache is further split into two
caches: the instruction cache, which stores instructions required by the CPU, and the
data cache, which stores the data required by the CPU.
L2: This cache is known as Level 2 cache or L2 cache. It may be inside or outside the
CPU. Each core of a CPU can have its own separate L2 cache, or
the cores can share one L2 cache among themselves. If it is outside the CPU, it is connected
to the CPU by a very high-speed bus. The size of this cache is typically in the range of
256 KB to 512 KB. It is slower than the L1 cache.

L3: It is known as Level 3 cache or L3 cache. This cache is not present in all the
processors; some high-end processors may have this type of cache. This cache is used
to enhance the performance of Level 1 and Level 2 cache. It is located outside the CPU
and is shared by all the cores of a CPU. Its memory size ranges from 1 MB to 8 MB.
Although it is slower than L1 and L2 cache, it is faster than Random Access Memory
(RAM).

How does cache memory work with the CPU?


When the CPU needs data, it first looks inside the L1 cache. If it does not find
anything in L1, it looks inside the L2 cache. If the data is not in the L2
cache either, it looks into the L3 cache. If the data is found in cache memory, this is known
as a cache hit. On the contrary, if the data is not found inside the cache, it is called a cache
miss.

If the data is not available in any of the cache levels, the CPU looks inside the Random Access
Memory (RAM). If the RAM also does not have the data, it is fetched from the
Hard Disk Drive.

So, when a computer is started for the first time, or an application is opened for the
first time, the data is not yet available in cache memory or in RAM. In this case, the CPU gets
the data directly from the hard disk drive. Thereafter, when you restart the computer or
reopen the application, the CPU can get that data from cache memory or RAM.
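The search order described above can be sketched as a toy lookup over a list of levels. The level contents here are invented for illustration only:

```python
# Toy model of the L1 -> L2 -> L3 -> RAM -> disk lookup order described
# above. The addresses and contents are made-up examples.

def lookup(address, levels):
    """Search each level in order; return (level_name, value) on the first hit."""
    for name, store in levels:
        if address in store:
            return name, store[address]
    raise KeyError(address)

levels = [
    ("L1",   {0x10: "a"}),
    ("L2",   {0x10: "a", 0x20: "b"}),
    ("L3",   {0x10: "a", 0x20: "b", 0x30: "c"}),
    ("RAM",  {0x10: "a", 0x20: "b", 0x30: "c", 0x40: "d"}),
    ("disk", {addr: "?" for addr in range(0x00, 0x100, 0x10)}),
]

print(lookup(0x10, levels))  # ('L1', 'a')   -> cache hit in L1
print(lookup(0x40, levels))  # ('RAM', 'd')  -> miss in every cache level
```

Note that each level here contains everything the level above it contains, mirroring the usual inclusive-cache arrangement.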
Associative Memory
An associative memory can be considered as a memory unit whose stored data can be
identified for access by the content of the data itself rather than by an address or memory
location.

Associative memory is often referred to as Content Addressable Memory (CAM).

When a write operation is performed on associative memory, no address or memory location
is given to the word. The memory itself is capable of finding an empty unused location to
store the word.

On the other hand, when the word is to be read from an associative memory, the content of
the word, or part of the word, is specified. The words which match the specified content are
located by the memory and are marked for reading.

The following diagram shows the block representation of an Associative memory.


From the block diagram, we can say that an associative memory consists of a memory array
and logic for 'm' words with 'n' bits per word.

The functional registers like the argument register A and key register K each have n bits, one
for each bit of a word. The match register M consists of m bits, one for each memory word.

The words which are kept in the memory are compared in parallel with the content of the
argument register.

The key register (K) provides a mask for choosing a particular field or key in the argument
word. If the key register contains a binary value of all 1's, then the entire argument is
compared with each memory word. Otherwise, only those bits in the argument that have 1's
in their corresponding position of the key register are compared. Thus, the key provides a
mask for identifying a piece of information which specifies how the reference to memory is
made.

The following diagram can represent the relation between the memory array and the external
registers in an associative memory.
The cells present inside the memory array are marked by the letter C with two subscripts. The
first subscript gives the word number and the second specifies the bit position in the word.
For instance, the cell Cij is the cell for bit j in word i.

A bit Aj in the argument register is compared with all the bits in column j of the array
provided that Kj = 1. This process is done for all columns j = 1, 2, 3......, n.

If a match occurs between all the unmasked bits of the argument and the bits in word i, the
corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the
argument and the word do not match, Mi is cleared to 0.
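The masked, parallel match just described can be sketched in a few lines. The word, argument, and key values below are made-up examples:

```python
# Sketch of the match operation described above: each memory word is
# compared with the argument register A, but only in the bit positions
# where the key register K holds a 1. Words are modelled as bit lists;
# the result is the match register M (one bit per word).

def cam_match(words, A, K):
    M = []
    for word in words:
        hit = all(a == w for a, w, k in zip(A, word, K) if k == 1)
        M.append(1 if hit else 0)
    return M

words = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
A = [1, 0, 1, 0]   # argument register
K = [1, 1, 0, 0]   # mask: compare only the two high-order bits

print(cam_match(words, A, K))  # [1, 0, 0] -> only word 0 matches
```

With K all 1s, the entire argument would be compared against every word, exactly as the text states.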

What is Virtual Memory in OS (Operating System)?
Virtual Memory is a storage scheme that provides the user with an illusion of having a very
large main memory. This is done by treating a part of secondary memory as if it were main memory.

In this scheme, the user can load processes larger than the available main memory,
under the illusion that enough memory is available to load the process.

Instead of loading one big process in the main memory, the Operating System loads the
different parts of more than one process in the main memory.

By doing this, the degree of multiprogramming will be increased and therefore, the CPU
utilization will also be increased.

How Does Virtual Memory Work?


Virtual memory has become quite common in modern systems. In this scheme,
whenever some pages need to be loaded into the main memory for execution and
not enough memory is available for them, then instead of stopping those
pages from entering the main memory, the OS searches for areas of RAM that were least
recently used, or not referenced, and copies them to secondary
memory to make space for the new pages in the main memory.

Since all of this happens automatically, the computer appears to
have unlimited RAM.

Demand Paging
Demand Paging is a popular method of virtual memory management. In demand paging,
the pages of a process which are least used, get stored in the secondary memory.

A page is copied to the main memory when it is demanded, that is, when a page fault occurs. There
are various page replacement algorithms used to determine which pages
will be replaced. We will discuss each of them later in detail.
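As a sketch of demand paging with one such replacement policy, here is a minimal least-recently-used (LRU) simulation. The reference string and frame count are arbitrary examples:

```python
# Minimal demand-paging sketch using least-recently-used (LRU)
# replacement, one of the page replacement algorithms mentioned above.

from collections import OrderedDict

def access(page, frames, capacity):
    """Return True on a page fault, False on a hit."""
    if page in frames:
        frames.move_to_end(page)      # mark as most recently used
        return False
    if len(frames) >= capacity:
        frames.popitem(last=False)    # evict the LRU page
    frames[page] = True               # load the page on demand
    return True

frames = OrderedDict()
faults = sum(access(p, frames, capacity=3) for p in [1, 2, 3, 1, 4, 2])
print(faults)        # 5 page faults
print(list(frames))  # [1, 4, 2] remain resident
```

Pages enter main memory only when referenced, and a page is evicted only when the frames are full, which is the essence of demand paging.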

Snapshot of a virtual memory management system
Let us assume two processes, P1 and P2, each containing 4 pages, with each page 1 KB in size. The
main memory contains 8 frames of 1 KB each. The OS resides in the first two frames. In
the third frame, the 1st page of P1 is stored, and the other frames are shown filled
with different pages of the processes in the main memory.

The page tables of both processes are 1 KB each and can therefore fit in one
frame each. The page tables of both processes contain various information, which is also
shown in the image.

The CPU contains a register holding the base address of the page table, which is 5 in the
case of P1 and 7 in the case of P2. This page table base address is added to the page
number of the logical address to access the corresponding page table entry.
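The translation step can be illustrated with simple arithmetic. The page-table contents below are hypothetical, not taken from the snapshot image:

```python
# Arithmetic sketch of the address translation described above: the page
# number from the logical address indexes the process's page table, and
# the frame number found there is combined with the page offset. The
# table contents are illustrative only.

PAGE_SIZE = 1024  # 1 KB pages, as in the snapshot above

# Hypothetical page table for P1: page number -> frame number
page_table_p1 = {0: 2, 1: 5, 2: 6, 3: 7}

def translate(logical_addr, page_table):
    page = logical_addr // PAGE_SIZE
    offset = logical_addr % PAGE_SIZE
    frame = page_table[page]
    return frame * PAGE_SIZE + offset

print(translate(1100, page_table_p1))  # page 1, offset 76 -> 5*1024 + 76 = 5196
```

The offset passes through unchanged; only the page number is rewritten into a frame number.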
Advantages of Virtual Memory

1. The degree of multiprogramming is increased.
2. Users can run large applications with less physical RAM.
3. There is no need to buy more RAM.

Disadvantages of Virtual Memory

1. The system becomes slower, since swapping takes time.
2. Switching between applications takes more time.
3. The user has less hard disk space available for their own use.
MEMORY MANAGEMENT HARDWARE
What is memory management?
Memory management is the process of controlling and coordinating a
computer's main memory. It ensures that blocks of memory space are
properly managed and allocated so the operating system
(OS), applications and other running processes have the memory they
need to carry out their operations.

As part of this activity, memory management takes into account the
capacity limitations of the memory device itself, deallocating memory space
when it is no longer needed or extending that space through virtual
memory. Memory management strives to optimize memory usage so
the CPU can efficiently access the instructions and data it needs to execute
the various processes.

What are the 3 areas of memory management?


Memory management operates at three levels: hardware, operating system
and program/application. The management capabilities at each level work
together to optimize memory availability and efficiency.

Memory management at the hardware level. Memory management at
the hardware level is concerned with the physical components that
store data, most notably the random access memory (RAM) chips and
CPU memory caches (L1, L2 and L3). Most of the management that occurs
at the physical level is handled by the memory management unit (MMU),
which controls the processor's memory and caching operations. One of the
MMU's most important roles is to translate the logical addresses used by
the running processes to the physical addresses on the memory devices.
The MMU is typically integrated into the processor, although it might be
deployed as a separate integrated circuit.

Memory management at the OS level. Memory management at the OS
level involves the allocation (and constant reallocation) of specific memory
blocks to individual processes as the demands for CPU resources change.
To accommodate the allocation process, the OS continuously moves
processes between memory and storage devices (hard disk or SSD), while
tracking each memory location and its allocation status.

The OS also determines which processes will get memory resources and
when those resources will be allocated. As part of this operation, an OS
might use swapping to accommodate more processes. Swapping is an
approach to memory management in which the OS temporarily swaps a
process out of main memory into secondary storage so the memory is
available to other processes. The OS will then swap the original process
back into memory at the appropriate time.

Memory management at the program/application level. Memory management
at this level is implemented during the application development process and
controlled by the application itself, rather than being managed centrally by the OS
or MMU. This type of memory management ensures the availability of adequate
memory for the program's objects and data structures. It achieves this by combining
two related tasks:

• Allocation. When the program requests memory for an object or data
structure, the memory is allocated to that component until it is
explicitly freed up. The allocation process might be manual or
automatic. If manual, the developer must explicitly program that
allocation into the code. If the process is automatic, a memory manager
handles the allocation, using a component called an allocator to assign
the necessary memory to the object. The memory manager might be
built into the programming language or available as a separate language
module.

• Recycling. When a program no longer needs the memory space that has
been allocated to an object or data structure, that memory is released
for reassignment. This task can be done manually by the programmer or
automatically by the memory manager, a process often called garbage
collection.
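The allocation and recycling tasks above can be sketched with a toy fixed-size block allocator. Real allocators are far more sophisticated; this only illustrates the free-list idea:

```python
# Toy allocator illustrating the two tasks above: allocation of blocks
# from a free list, and recycling of freed blocks for reassignment.

class Allocator:
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))  # free list of block ids
        self.used = set()

    def alloc(self):
        if not self.free:
            raise MemoryError("out of blocks")
        block = self.free.pop()
        self.used.add(block)
        return block

    def free_block(self, block):             # "recycling"
        self.used.remove(block)
        self.free.append(block)

a = Allocator(2)
b1 = a.alloc()
b2 = a.alloc()
a.free_block(b1)   # released memory becomes available again
b3 = a.alloc()     # reuses the recycled block
print(b3 == b1)    # True
```

In a garbage-collected language the `free_block` step happens automatically; in manual management the programmer must call it explicitly, as the text describes.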

What is Vector Processing in Computer Architecture?

Vector processing is performed by a central processing unit that can operate on a complete vector with a single instruction. A vector processor is a complete unit of hardware resources that processes a sequential set of similar data elements in memory using a single instruction.
Scientific and research workloads involve many computations that require extensive, high-powered computers. Run on a conventional computer, these computations may take days or weeks to complete. Science and engineering problems can be expressed in terms of vectors and matrices and solved using vector processing.

Features of Vector Processing


There are various features of Vector Processing which are as follows −
• A vector is a structured set of elements. The elements in a vector are
scalar quantities. A vector operand includes an ordered set of n
elements, where n is known as the length of the vector.
• Each clock period processes two successive pairs of elements. During
one single clock period, the dual vector pipes and the dual sets of vector
functional units allow two pairs of elements to be processed.
As each pair of operations completes, the results are
delivered to the appropriate elements of the result register. The operation
continues until the number of elements processed equals the
count specified in the vector length register.
• In parallel vector processing, more than two results are generated per
clock cycle. Parallel vector operations are automatically started
under the following two circumstances −
o When successive vector instructions use different
functional units and different vector registers.
o When successive vector instructions use the result stream
from one vector register as the operand of another operation
that uses a different functional unit. This technique is known as
chaining.
• A vector processor performs better with longer vectors because the
pipeline's startup delay is amortised over more elements.
• Vector processing decreases the overhead of maintaining the
loop-control variables, which makes it more efficient than scalar
processing.

Array Processors or SIMD Processors


Array processors are also designed for vector computations. The difference between
an array processor and a vector processor is that a vector processor uses multiple
vector pipelines whereas an array processor employs a number of processing elements
to operate in parallel.

An array processor contains multiple ALUs, each provided with its own local memory. An ALU together with its local memory is called a Processing Element (PE). An array processor is a SIMD (Single Instruction Multiple Data) processor: using a single instruction, the same operation can be performed on an array of data, which makes it suitable for vector computations.
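The SIMD idea can be sketched in ordinary Python: conceptually, one "vector instruction" applies the same operation to every element, instead of one element per scalar instruction. A real array processor does this in parallel hardware; this is only an illustration.

```python
# Scalar (one element per instruction) versus SIMD (one conceptual
# instruction over a whole array of data). Pure-Python sketch.

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# Scalar style: one element pair handled per "instruction".
scalar = []
for x, y in zip(a, b):
    scalar.append(x + y)

# SIMD style: conceptually a single vector operation over all elements,
# as if each processing element handled one lane in parallel.
simd = [x + y for x, y in zip(a, b)]

print(simd)  # [11, 22, 33, 44]
```

Both produce the same result; the difference in real hardware is that the SIMD version issues one instruction for all lanes at once.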

Fig:- Schematic Diagram of an Array Processor or SIMD Processor


What is Inter Process Communication?
Inter-process communication (IPC) is used for exchanging data between
multiple threads in one or more processes or programs. The processes may be
running on a single computer or on multiple computers connected by a network.

It is a set of programming interfaces that allow a programmer to coordinate
activities among various program processes that can run concurrently in an
operating system. This allows a specific program to handle many user requests
at the same time.
Since every single user request may result in multiple processes running in the
operating system, those processes may need to communicate with each other.
Each IPC approach has its own advantages and limitations, so it is not
unusual for a single program to use several of the IPC methods.

Approaches for Inter-Process Communication


Here are a few important methods for inter-process communication:

Pipes
A pipe is widely used for communication between two related processes. It is a
half-duplex method: data flows in one direction, from the first process to the
second. To achieve full-duplex communication, a second pipe is needed.
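A minimal sketch of the half-duplex pipe described above, using the POSIX-style pipe from the Python standard library (shown within a single process for simplicity):

```python
# Half-duplex pipe: bytes written at one end are read at the other.
# In real IPC the two ends would be held by two related processes
# (e.g. parent and child after a fork); one process is enough to
# illustrate the one-way data flow.

import os

read_fd, write_fd = os.pipe()

os.write(write_fd, b"hello from the first process")
os.close(write_fd)  # close the write end: signals end-of-data

data = os.read(read_fd, 1024)
os.close(read_fd)

print(data.decode())  # hello from the first process
```

For full-duplex communication, a second `os.pipe()` would carry data in the opposite direction.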

Message Passing:
It is a mechanism for processes to communicate and synchronize. Using
message passing, processes communicate with each other without resorting
to shared variables.

The IPC mechanism provides two operations:

• Send (message) - the message size may be fixed or variable
• Receive (message)

Message Queues:
A message queue is a linked list of messages stored within the kernel. It is
identified by a message queue identifier. This method offers communication
between single or multiple processes with full-duplex capacity.
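The send/receive pattern of a message queue can be sketched with a thread-safe queue from the standard library. A kernel message queue (e.g. System V or POSIX) works between processes, so this thread-based version only illustrates the pattern:

```python
# Message-queue sketch: a producer sends messages, a consumer receives
# them in FIFO order. queue.Queue stands in for the kernel's linked
# list of messages.

import queue
import threading

mq = queue.Queue()

def producer():
    for i in range(3):
        mq.put(f"message {i}")   # send

t = threading.Thread(target=producer)
t.start()
t.join()

received = []
while not mq.empty():
    received.append(mq.get())    # receive

print(received)  # ['message 0', 'message 1', 'message 2']
```

Because the queue buffers messages in the kernel (here, in the queue object), the sender and receiver need not run in lock-step.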

Direct Communication:
In this type of inter-process communication, processes should name each other
explicitly. In this method, a link is established between one pair of
communicating processes, and between each pair only one link exists.

Indirect Communication:
Indirect communication is established through a shared mailbox (port):
processes can communicate only if they share a common mailbox, and each pair
of processes may share several communication links. A link can be associated
with many processes and may be bidirectional or unidirectional.

Shared Memory:
Shared memory is a region of memory shared between two or more processes,
established so that all of them can read and write it. Access to this type of
memory must be synchronized so that the processes are protected from
interfering with each other.
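A sketch of synchronized access to shared state, using threads as a stand-in for separate processes; the lock provides the synchronization the text says shared memory requires:

```python
# Four workers update one shared counter. The lock serialises access to
# the shared region; without it, concurrent increments could be lost.
# (Real IPC shared memory, e.g. multiprocessing.shared_memory, spans
# separate processes; threads are used here for a compact illustration.)

import threading

counter = 0          # the "shared memory"
lock = threading.Lock()

def worker():
    global counter
    for _ in range(10_000):
        with lock:   # protect the shared region
            counter += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 -- every increment preserved
```

Shared memory is the fastest IPC method because no data is copied between the communicating parties; the cost is that synchronization becomes the programmer's responsibility.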
