Lecture 3 - Memory Interface I
MAXIMUM MODE (MN/MX' = GND)    MINIMUM MODE (MN/MX' = +5 V)
GND 1 40 Vcc
AD14 AD15 A14 A15
1 0 1 IOW (8088)
RD’
MEMW (8086)
0 0 X NO USE
Maximum Mode
S2', S1', S0' (pins 26-28): identify the status of the current bus cycle (decoded by the 8288 bus controller).
QS0, QS1 (pins 24-25): instruction queue code. Tells the external circuit what type of info was removed from the queue during the previous clock cycle. In the IBM PC, these pins are connected to the 8087 to synchronize it with the 8088.
[Figure: 8086 40-pin pinout. GND on pins 1 and 20, RESET on pin 21, Vcc on pin 40; AD0-AD15, A16/S3-A19/S6, /BHE/S7, MN//MX, /RD, NMI, INTR, CLK, /TEST, READY.]
Pins whose function depends on the mode:
Maximum mode    Minimum mode
/RQ,/GT0        HOLD
/RQ,/GT1        HLDA
/LOCK           /WR
/S2             IO/M
/S1             DT/R
/S0             /DEN
QS0             ALE
QS1             /INTA
8088/86 pins
8088/86 Hardware Organization of the Memory Space
8088: byte-wide addressing. One address space, 00000 to FFFFF, on an 8-bit data bus.
8086: two byte-wide banks, both addressed by A19..A1.
ODD addresses (00001, 00003, 00005, ..., FFFFD, FFFFF): data on D15:D8, selected by BHE'.
EVEN addresses (00000, 00002, 00004, ..., FFFFC, FFFFE): data on D7:D0, selected by BLE' (A0).

Memory Bank Selection in 8086
BHE'  A0/BLE'  Selection
0     0        Whole word (16 bits)
0     1        High byte to/from odd address
1     0        Low byte to/from even address
1     1        None
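The bank-selection table can be checked with a short Python sketch. The function names `bank_select` and `pins_for_access` are hypothetical, chosen only for illustration. The sketch assumes one bus cycle per access, so it handles a single byte or a word at an even address.

```python
def bank_select(bhe_n, a0):
    """Decode the 8086 BHE' and A0 (BLE') levels into the selected bank(s)."""
    table = {
        (0, 0): "whole word (16 bits)",
        (0, 1): "high byte, odd address (D15:D8)",
        (1, 0): "low byte, even address (D7:D0)",
        (1, 1): "none",
    }
    return table[(bhe_n, a0)]


def pins_for_access(address, size_bytes):
    """Return (BHE', A0) for one bus cycle (assumption: byte or aligned word)."""
    a0 = address & 1
    if size_bytes == 2:
        if a0:
            # A word at an odd address is split into two byte transfers.
            raise ValueError("odd-addressed word needs two bus cycles")
        return (0, 0)
    # Byte access: odd address enables the high bank, even address the low bank.
    return (0, 1) if a0 else (1, 0)


print(bank_select(*pins_for_access(0x00003, 1)))
print(bank_select(*pins_for_access(0xFFFFE, 2)))
```

An odd byte address drives BHE' low and A0 high, so only the D15:D8 bank responds.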
8088/86 pins
READY (pin 22)
• Input signal
• Used to insert wait states for slower memories and I/O.
• As long as READY is held at '0', wait states are inserted.
TEST' (pin 23)
• Input to the µp
• Related to the external interrupt interface
• Used to synchronize the operation of the µp to an event in external hardware. Ex: serves as the input signal from the 8087 coprocessor.

8088/86 pins
S3, S4, S5, S6, S7 (pins 34-38): STATUS
• Output signal
• S7: Logic 1, S6: Logic 0.
• S5: Indicates the condition of the IF flag bit.
• S4-S3: Indicate which segment is accessed during the current bus cycle.
Example 1
A given memory chip has 12 address pins and 4 data pins. Find
(a) the organization
(b) the capacity
Solution:
(a) This memory chip has 4096 locations (2^12 = 4096), and each location can hold 4 bits of data. This gives an organization of 4096 x 4, often represented as 4Kx4.
(b) The capacity is 4K x 4 = 16K bits.

Example 2
A 512K capacity memory chip has 8 pins for data. Find:
(a) the organization
(b) the number of address pins for this memory chip
Solution:
(a) A memory chip with 8 data pins means that each location within the chip can hold 8 bits of data. To find the number of locations within this memory chip, divide the capacity by the number of data pins. 512K/8 = 64K; therefore, the organization for this memory chip is 64Kx8.
(b) The chip has 16 address pins, since 64K = 65536 = 2^16.

From the book The 80x86 IBM PC by Muhammad Ali Mazidi and Janice Gillispie Mazidi, pg. 267
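Both examples reduce to powers of two, which the following Python sketch reproduces. The helper names `organization` and `address_pins_for` are hypothetical. Capacities are taken in bits, with K = 1024.

```python
K = 1024


def organization(address_pins, data_pins):
    """Return (locations, bits per location, capacity in bits)."""
    locations = 2 ** address_pins
    return locations, data_pins, locations * data_pins


def address_pins_for(capacity_bits, data_pins):
    """Return (locations, address pins) for a chip of the given capacity."""
    locations = capacity_bits // data_pins
    pins = locations.bit_length() - 1
    # Assumption: the number of locations is an exact power of two.
    if 2 ** pins != locations:
        raise ValueError("locations must be a power of two")
    return locations, pins


# Example 1: 12 address pins, 4 data pins
print(organization(12, 4))
# Example 2: 512K bits, 8 data pins
print(address_pins_for(512 * K, 8))
```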
RAM – volatile
Faster than ROM
RAM is often used to shadow the BIOS
ROM
RAM
Static & Dynamic
Volatile & Non-volatile RAM
EPROM - 2764
SIZE?

Access Time
The amount of time that it takes for the memory to produce the data required, from the start of the access until the valid data is available for use.
Ranges from 2 ns to 70 ns, depending on the IC technology used in the design and fabrication.
EEPROM
EEPROM X2816C
EEPROM allows its contents to be programmed
and erased while it is still in the system board.
It does not require physical removal of the
memory chip from its socket.
Flash memory EPROM
The process of erasure of the entire contents takes less than a second.
The erasure method is electrical.
The difference between EEPROM and flash memory is that when flash memory's contents are erased, the entire device is erased, in contrast to EEPROM, where one can erase a desired section or byte.

Static RAM (1)
Used as cache memory, L1 & L2.
Storage cells in static RAM are made of flip-flops and therefore do not require refreshing in order to keep their data. This is in contrast to DRAM.
An SRAM bit consists of 4 to 6 transistors; a DRAM bit needs just 1 transistor plus a capacitor.
SRAM chips consist of thousands or millions of identical cells.
In comparison, a CPU occupies a large area of die with a non-repetitive structure.
DRAM (1)
Holds its data only if it is continuously accessed by special logic called a refresh circuit.
Refresh continues whether or not the memory is being used by the CPU.
The use of capacitors as storage cells in DRAM results in a much smaller net memory cell size.
Cheaper and takes up much less space, typically 1/4 (or less) of the silicon area of SRAMs.
The transistor is used to read the contents of the capacitor, which must still be kept refreshed.
The refreshing action is what makes the memory "dynamic"; its timing is expressed in nanoseconds (ns).

DRAM (2)
High density (capacity),
Cheaper cost per bit,
Lower power consumption per bit.
Must be refreshed periodically, because the capacitor cell loses its charge.
While it is being refreshed, the data cannot be accessed. This is in contrast to SRAM's flip-flops, which retain data as long as the power is on, do not need to be refreshed, and whose contents can be accessed at any time.
DRAM requires more supporting circuitry.
SIZE?
Typical memory IC
DRAM (5)
DRAM is usually placed on small circuit boards called SIMMs (Single In-line Memory Modules).
4Mx9
SIZE?

Example 1
Show possible organizations and number of address pins for a 256K DRAM chip.
Solution:
For 256K chips, possible organizations are 256Kx1 or 64Kx4.
In the case of 256Kx1, there are 256K locations and each location inside the DRAM provides 1 bit.
The 256K locations are accessed through the 18-bit address A0-A17, since 2^18 = 256K.
The chip has only A0-A8 physical pins, because the address is multiplexed: the 9-bit row address is latched with RAS and the 9-bit column address with CAS. It also has RAS and CAS and one pin for data, in addition to Vcc, ground, and the R/W pin that every DRAM chip must have.
For the 64Kx4 organization, it requires 16 address bits to access each location (2^16 = 64K), and each location inside the DRAM has 4 cells.
That means that it must have 4 data pins, D0-D3, and 8 address pins, A0-A7, plus RAS and CAS.
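The pin counts in this example follow from address multiplexing. Here is a minimal Python sketch, assuming the row and column halves share the same pins. The function name `dram_pins` is hypothetical.

```python
K = 1024


def dram_pins(locations, data_bits):
    """Return (address bits, physical address pins, data pins) for a DRAM."""
    address_bits = locations.bit_length() - 1
    if 2 ** address_bits != locations:
        raise ValueError("locations must be a power of two")
    # Row and column addresses are sent in turn over the same pins
    # (latched by RAS and CAS), so only half the address bits need pins.
    address_pins = (address_bits + 1) // 2
    return address_bits, address_pins, data_bits


print(dram_pins(256 * K, 1))  # 256Kx1
print(dram_pins(64 * K, 4))   # 64Kx4
```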
8284 clock generator (2)
8284 clock generator (3)
/8088
• Correct reset timing requires that the RESET input to the microprocessor become a logic 1 no later than 4 clocks after power-up and stay high for at least 50 µs.
Important pins in 8284 (3)
RDY1 and AEN1 (3,4) (input)
RDY1 is active high and AEN1 (address enable) is active low.
They are used together to provide a ready signal to the microprocessor, which will insert a WAIT state into the CPU read/write cycle.
In the IBM PC, RDY1 is connected to DMAWAIT and AEN1 is connected to RDY/WAIT.

OSC (oscillator) (12) (output)
Provides a clock frequency equal to that of the crystal oscillator, and it is TTL compatible.
The IBM crystal oscillator is 14.31818 MHz; OSC provides this frequency to the expansion slot of the IBM PC.

Important pins in 8284 (4)
CLK (clock) (8) (output)
Output clock frequency equal to one-third of the crystal oscillator, or EFI input frequency, with a duty cycle of 33%.
This is connected to the clock input of the 8088/86 and all other devices that must be synchronized with the CPU.
In the IBM PC it is connected to pin 19 of the 8088 microprocessor and other circuitry under the CLK88 label.
This frequency is 4.772727 MHz (14.31818 MHz divided by 3).

PCLK (peripheral clock) (2) (output)
This frequency is one-half of CLK (or one-sixth of the crystal), with a duty cycle of 50%, and is TTL compatible.
In the IBM PC this 2.386363 MHz is provided to the 8253 timer to be used to generate speaker tones, and other functions.

READY (5) (output)
Connected to READY of the CPU.
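The three 8284 output frequencies follow from the crystal by fixed division. A short Python sketch checks the IBM PC values. The function name `clocks_8284` is hypothetical.

```python
def clocks_8284(crystal_hz):
    """Return the OSC, CLK and PCLK output frequencies in Hz."""
    clk = crystal_hz / 3   # CLK is one-third of the crystal
    pclk = clk / 2         # PCLK is one-half of CLK (one-sixth of the crystal)
    return {"OSC": crystal_hz, "CLK": clk, "PCLK": pclk}


ibm_pc = clocks_8284(14_318_180)  # 14.31818 MHz crystal
for name, hz in ibm_pc.items():
    print(f"{name}: {hz / 1e6:.6f} MHz")
```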
Clock cycles
All µp actions are synchronized by a continuous train of regularly timed pulses known as the clock.
These clock pulses are obtained from an oscillator circuit.
Each clock period is generally called a T-state or time state.
If the clock frequency is 5 MHz then each T-state lasts 200 ns. Each basic operation is called a machine cycle.
The number of T-states in a machine cycle depends on the instruction; it takes at least 3 or 4 T-states.
[Figure: example of a clock cycle, showing T-state and instruction cycle]

Why Clock Cycle Calculations?
Delays
Analog to digital sampling
Sound
Real-Time Applications
Anti-lock braking system
Industrial vision inspection of items on a conveyor belt
Playback of stored video clips
Clock Cycle Calculation (1)
The speed of two processors can be compared by calculating the number of clock cycles each requires for a series of operations.
E.g. in the 80286 the LOOP instruction takes 8 clocks instead of 17 clocks in the 8086/88.
The Intel datasheet lists the number of clocks each instruction needs to execute.
For each instruction the clock count might be fixed or variable, depending on operand type, operand size, instruction outcome, and addressing mode.

Clock Cycle Calculation (2)
If we design a delay loop, how many clock cycles are used in the loop, and how long does a single clock cycle last?
We can adjust the number of iterations of the delay loop so that the time consumed, t, is equal to the desired value.
t = total clocks used per iteration * duration of one clock cycle * number of iterations
Clock Cycle Calculation (3)
Assume the Pentium µp (clock frequency = 50 MHz) has the following information:
Instruction      Number of clocks
mov reg,immed    1
loop             6/5 (6 if branch, 5 if not)
out immed,al     12
How many iterations will be needed to achieve the 1 ms delay for the following code?
mov cx,N ; N unknown - number of iterations
mov al,01
out 23,al
ag: loop ag ; cx=cx-1 with each loop; loop until cx=0
mov al,00h ; 1 clock
out 23,al ; 12 clocks

Answer
mov cx,N
mov al,01
out 23,al
ag: loop ag ; 6*(N-1) + 5 clocks
mov al,00h ; 1 clock
out 23,al ; 12 clocks
Total no. of clocks = 6*(N-1) + 5 + 1 + 12 = 6N + 12
Total time = (6*(N-1) + 5 + 1 + 12) * 1/(50 MHz)
To achieve 1 ms delay:
(6N + 12)/(50 MHz) = 1 ms, so 6N + 12 = 50000
N = 8331 iterations
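The same calculation as a Python sketch, using the clock counts from the table above. The names `total_clocks` and `loop_iterations` are hypothetical. As in the slide, only the loop and the two instructions after it are counted.

```python
def total_clocks(n, branch=6, fall_through=5, after_loop=1 + 12):
    """Clocks for N loop passes plus the trailing mov and out."""
    return branch * (n - 1) + fall_through + after_loop


def loop_iterations(delay_s, clock_hz):
    """Solve 6N + 12 = delay * f for N and round to the nearest integer."""
    clocks_needed = delay_s * clock_hz
    return round((clocks_needed - 12) / 6)


n = loop_iterations(1e-3, 50e6)
print(n)
print(total_clocks(n) / 50e6 * 1e6, "microseconds")
```

With N = 8331 the measured span is 49998 clocks, slightly under 1 ms, because N must be a whole number.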
Example
Write a program to generate a delay of 100 ms using an 8086 system that runs at 10 MHz.
Instruction         Execution time (clocks)
mov cx, count       ;4
again: dec cx       ;2
nop                 ;3
jnz again           ;16
Number of clock cycles for execution of the loop = 2 + 3 + 16 = 21
Time required for execution of the loop = 21 * 1/10 MHz = 2.1 µs
Count = 100 ms / 2.1 µs = 47619 = BA03H
Complete code:
delay proc far
mov cx, 0BA03H      ;4
again: dec cx       ;2
nop                 ;3
jnz again           ;16 if branch, 4 if not
ret                 ;8
delay endp
Exact time = 0.1*4 + (2+3)*47619*0.1 + 16*47618*0.1 + 4*0.1 + 8*0.1 = 99999.9 µs = 99.9999 ms
From the book The 80x86 IBM PC by Muhammad Ali Mazidi and Janice Gillispie Mazidi, pg. 291

Assume that the instructions have already been fetched by the BIU and stored in the instruction queue.
NOTE:
8086 has a 6-byte queue
8088 has a 4-byte queue
The wider data bus of the 8086 means the instruction queue is filled up faster.
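The count and the exact delay can be reproduced with a short Python sketch. The names `loop_count` and `exact_delay_us` are hypothetical. The clock counts are the ones listed in the example, and all instructions are assumed to be already in the queue.

```python
T_US = 0.1  # duration of one clock at 10 MHz, in microseconds


def loop_count(delay_us, clocks_per_pass=2 + 3 + 16):
    """Number of loop passes for the requested delay (dec + nop + jnz taken)."""
    return round(delay_us / (clocks_per_pass * T_US))


def exact_delay_us(count):
    """Exact delay: mov, the loop body, taken and final jnz, then ret."""
    clocks = (
        4                     # mov cx, count
        + (2 + 3) * count     # dec cx + nop on every pass
        + 16 * (count - 1)    # jnz taken on all passes but the last
        + 4                   # jnz not taken on the last pass
        + 8                   # ret
    )
    return clocks * T_US


count = loop_count(100_000)  # 100 ms expressed in microseconds
print(count, hex(count))
print(exact_delay_us(count))
```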
Inserting Wait States (2)
Memory access time is not the only factor slowing down the CPU, although it is the largest.
Path Delay:
Time taken for address signals to go from CPU pins to memory pins, going through decoders and buffers. This includes the time it takes for the data to travel from memory to the CPU.
E.g. a 20 MHz 80386-based system is using ROM of speed 150 ns. Calculate the number of wait states needed if the path delay is 25 ns.
Answer:
Total time to get data into CPU = 150 + 25 = 175 ns
Memory cycle time of 20 MHz 80386 with 0 WS = 1/20 MHz * 2 = 100 ns.
Each wait state adds one clock period, 1/20 MHz = 50 ns.
2 WS are needed to make the memory cycle time > 175 ns.
=> 100 + 50 + 50 = 200 ns

Notes on Clock Cycles Calculations
• 8088 needs 4 extra clock cycles for each 16-bit memory access.
• 8086 needs 4 extra clock cycles for each 16-bit memory access only if the address is odd.
• Transfer of control instructions require more time if they jump (to flush the queue and fetch new machine instructions).
• JNZ ; 16 clock cycles if jump, else 4
• Segment overrides add 2 clock cycles.
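The wait-state count in this example can be computed as below. This is a minimal Python sketch. The function name `wait_states` is hypothetical, and the two-clock zero-wait memory cycle is the 80386 figure used in the example.

```python
import math


def wait_states(clock_hz, access_ns, path_delay_ns, clocks_per_cycle=2):
    """Wait states needed so the memory cycle covers access + path delay."""
    t_ns = 1e9 / clock_hz                 # one clock period in ns
    needed_ns = access_ns + path_delay_ns
    base_ns = clocks_per_cycle * t_ns     # memory cycle time with 0 WS
    if needed_ns <= base_ns:
        return 0
    # Each wait state stretches the cycle by one clock period.
    return math.ceil((needed_ns - base_ns) / t_ns)


ws = wait_states(20e6, access_ns=150, path_delay_ns=25)
print(ws, "wait states, cycle time", 100 + ws * 50, "ns")
```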
Example
Find how many unit loads can be driven by the output of the LS logic family with the following information:
IIL = 1.6 mA    IOL = 8 mA    IIH = 40 µA    IOH = 400 µA
Fan-out (low) = IOL / IIL = 8 mA / 1.6 mA = 5
Fan-out (high) = IOH / IIH = 400 µA / 40 µA = 10
The smaller value governs, so the fan-out is 5. In other words, the LS output must not be connected to more than 5 inputs with unit load characteristics.

74HCT244/245 Buffers/drivers
When the receiver current requirements exceed the driver capability, bus buffering is used to boost the signals traveling on the buses.
For a unidirectional bus, 74XX244s are used.
For a bidirectional bus, 74XX245s are used.
They can sink and source a much larger current than other gates, so their fan-out is high compared to others.
74244 and 74245: IOH = 3 mA, IOL = 12 mA
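The fan-out rule is to take the smaller of the low-state and high-state ratios. A minimal Python sketch follows. The function name `fan_out` is hypothetical, and currents are given as integers in µA to avoid rounding issues.

```python
def fan_out(i_ol_ua, i_il_ua, i_oh_ua, i_ih_ua):
    """Return (low, high, usable) fan-out; all currents in microamps."""
    low = i_ol_ua // i_il_ua    # loads the output can sink in the low state
    high = i_oh_ua // i_ih_ua   # loads the output can source in the high state
    return low, high, min(low, high)


# LS output driving unit loads: IOL = 8 mA, IIL = 1.6 mA, IOH = 400 uA, IIH = 40 uA
print(fan_out(8000, 1600, 400, 40))
```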
74HC244 and 74HC245
H X Isolation