CPU Design PDF
6 CPU Design
As we saw in Chapter 4, a CPU contains three main sections: the register
section, the arithmetic/logic unit (ALU), and the control unit. These sections
work together to perform the sequences of micro-operations needed to perform
the fetch, decode, and execute cycles of every instruction in the CPU’s
instruction set. In this chapter we examine the process of designing a CPU in
detail.
To demonstrate this design process, we present the designs of two CPUs,
each implemented using hardwired control. (A different type of control, which
uses a microsequencer, is examined in Chapter 7.) We start by analyzing the
applications for the CPU. For instance, will it be used to control a microwave oven
or a personal computer? Once we know its application, we can determine the
types of programs it will run, and from there we can develop the instruction set
architecture (ISA) for the CPU. Next, we determine the other registers we need to
include within the CPU that are not a part of its ISA. We then design the state dia-
gram for the CPU, along with the micro-operations needed to fetch, decode, and
execute each instruction. Once this is done, we define the internal data paths and
the necessary control signals. Finally, we design the control unit, the logic that
generates the control signals and causes the operations to occur.
In this chapter we present the complete design of two simple CPUs, along
with an analysis of their shortcomings. We also look at the internal architecture
of the Intel 8085 microprocessor, whose instruction set architecture was intro-
duced in Chapter 3.
Figure 6.1
Generic CPU state diagram
(States: Fetch, Decode, Execute.)
00-173 C06 pp3 10/25/00 11:10 AM Page 216
Table 6.1
Instruction set for the Very Simple CPU
Figure 6.2
Fetch cycle for the Very Simple CPU
(States: FETCH1, FETCH2, FETCH3.)
Figure 6.3
Fetch and decode cycles for the Very Simple CPU
(States FETCH1, FETCH2, and FETCH3, followed by a four-way branch on IR = 00, 01, 10, and 11.)
To fetch the operand from memory, the CPU must first make its
address available via A[5..0], just as it did to fetch the instruction
from memory. This is done by moving the address into AR. However,
this was already done in FETCH3, so the CPU can simply read the value
in immediately. (This is the time savings mentioned earlier.) Thus,
ADD1: DR←M
Now that both operands are within the CPU, it can perform the
actual addition in one state.
ADD2: AC←AC + DR
These two operations comprise the entire execute cycle for the ADD
instruction. At this point, the ADD execute cycle would branch back to
the fetch cycle to begin fetching the next instruction.
6.2.4.2 AND Instruction
The execute cycle for the AND instruction is virtually the same as that
for the ADD instruction. It must fetch an operand from memory, mak-
ing use of the address copied to AR during FETCH3. However, instead
of adding the two values, it must logically AND the two values. The
states that comprise this execute cycle are
AND1: DR←M
AND2: AC←AC ∧ DR
6.2.4.3 JMP Instruction
Any JMP instruction is implemented in basically the same way. The ad-
dress to which the CPU must jump is copied into the program counter.
Then, when the CPU fetches the next instruction, it uses this new ad-
dress, thus realizing the JMP.
The execute cycle for the JMP instruction for this CPU is quite
trivial. Since the address is already stored in DR[5..0], we simply copy
that value into PC and go to the fetch routine. The single state which
comprises this execute cycle is
JMP1: PC←DR[5..0]
In this case, we actually had a second choice. Since this value was
copied into AR during FETCH3, we could have performed the opera-
tion PC←AR instead. Either is acceptable.
6.2.4.4 INC Instruction
The INC instruction can also be executed using a single state. The CPU
simply adds 1 to the contents of AC and goes to the fetch routine. The
state for this execute cycle is
INC1: AC←AC + 1
The state diagram for this CPU, including the fetch, decode, and exe-
cute cycles, is shown in Figure 6.4 on page 222.
FETCH1: AR←PC
FETCH2: DR←M, PC←PC + 1
FETCH3: IR←DR[7..6], AR←DR[5..0]
ADD1: DR←M
ADD2: AC←AC + DR
AND1: DR←M
AND2: AC←AC ∧ DR
JMP1: PC←DR[5..0]
INC1: AC←AC + 1
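The RTL listing above can be exercised with a short simulation. The following Python sketch is only an illustration, not part of the original design: the function name run, the 64-byte memory model, and the 8-bit wraparound of AC are assumptions. The opcode assignment (00 = ADD, 01 = AND, 10 = JMP, 11 = INC) follows the branch labels of the state diagram.

```python
# Minimal sketch of the Very Simple CPU, one RTL state per comment.
# run() and its arguments are hypothetical names used for illustration.

def run(memory, steps):
    """Execute `steps` instructions; return the final (AC, PC)."""
    mem = list(memory) + [0] * (64 - len(memory))  # 6-bit address space
    ac = pc = 0
    for _ in range(steps):
        ar = pc                      # FETCH1: AR <- PC
        dr = mem[ar]                 # FETCH2: DR <- M, PC <- PC + 1
        pc = (pc + 1) & 0x3F
        ir = dr >> 6                 # FETCH3: IR <- DR[7..6], AR <- DR[5..0]
        ar = dr & 0x3F
        if ir == 0b00:
            dr = mem[ar]             # ADD1: DR <- M
            ac = (ac + dr) & 0xFF    # ADD2: AC <- AC + DR (carry dropped: assumption)
        elif ir == 0b01:
            dr = mem[ar]             # AND1: DR <- M
            ac = ac & dr             # AND2: AC <- AC AND DR
        elif ir == 0b10:
            pc = dr & 0x3F           # JMP1: PC <- DR[5..0]
        else:
            ac = (ac + 1) & 0xFF     # INC1: AC <- AC + 1
    return ac, pc


# INC; ADD 5; JMP 0; location 5 holds the operand 7.
program = [0xC0, 0x05, 0x80, 0, 0, 7]
print(run(program, 3))
```

Each loop iteration performs one complete fetch, decode, and execute sequence, so the number of clock states per instruction is not modeled here.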
Figure 6.4
Complete state diagram for the Very Simple CPU
(States FETCH1, FETCH2, and FETCH3, followed by a branch on IR = 00, 01, 10, and 11 to the execute states; the ADD and AND routines end in ADD2 and AND2.)
(If this looks like RTL code, you’re headed in the right direction!) Note
that memory supplies its data to the CPU via pins D[7..0]. Also recall
that the address pins A[5..0] receive data from the address register, so
the CPU must include a data path from the outputs of AR to A.
To design the data paths, we can take one of two approaches. The
first is to create direct paths between each pair of components that
transfer data. We can use multiplexers or buffers to select one of sev-
eral possible data inputs for registers that can receive data from more
than one source. For example, in this CPU, AR can receive data from PC
or DR[5..0], so the CPU would need a mechanism to select which one
is to supply data to AR at a given time. This approach could work for
this CPU because it is so small. However, as CPU complexity increases,
this becomes impractical. A more sensible approach is to create a bus
within the CPU and route data between components via the bus.
Figure 6.5
Preliminary register section for the Very Simple CPU
(Registers AR, PC, DR, AC, and IR, each clocked by CLK, are all connected to an 8-bit bus; memory M attaches through A[5..0] and D[7..0].)
Now we look at the actual transfers that must take place and
modify the design accordingly. After reviewing the list of possible op-
erations, we note several things:
1. AR only supplies its data to memory, not to other components. It
is not necessary to connect its outputs to the internal bus.
2. IR does not supply data to any other component via the internal
bus, so its output connection can be removed. (The output of IR
will be routed directly to the control unit, as shown later.)
3. AC does not supply its data to any component; its connection to
the internal bus can also be removed.
4. The bus is 8 bits wide, but not all data transfers are 8 bits; some
are only 6 bits and one is 2 bits. We must specify which registers
send data to and receive data from which bits of the bus.
5. AC must be able to load the sum of AC and DR, and the logical
AND of AC and DR. The CPU needs to include an ALU that can
generate these results.
The first three changes are easy to make; we simply remove the
unused connections. The fourth item is more of a bookkeeping matter
than anything else. In most cases, we simply connect registers to the
lowest order bits of the bus. For example, AR and PC are connected to
bits 5..0 of the bus, since they are only 6-bit registers. The lone excep-
tion is IR. Since it receives data only from DR[7..6], it should be con-
nected to the high-order 2 bits of the bus.
Now comes the tricky part. Since AC can load in one of two values,
either AC + DR or AC ∧ DR, the CPU must incorporate some arithmetic
and logic circuitry to generate these values. (Most CPUs contain
an arithmetic/logic unit to do just that.) In terms of the data paths, the
ALU must receive AC and DR as inputs, and send its output to AC.
There are a couple of ways to route the data to accomplish this. In this
CPU we hardwire AC as an input to and output from the ALU, and route
DR as an input to the ALU via the internal bus.
At this point the CPU is capable of performing all of the required
data transfers. Before proceeding, we must make sure transfers that
are to occur during the same state can in fact occur simultaneously.
For example, if two transfers that occur in the same state both require
that data be placed on the internal bus, they could not be performed
simultaneously, since only one piece of data may occupy the bus at a
given time. (This is another reason for implementing PC←PC + 1 by
using a counter for PC; if that value was routed via the bus, both oper-
ations during FETCH2 would have required the bus.) As it is, no state
of the state diagram for this CPU would require more than one value
to be placed on the bus, so this design is OK in that respect.
The modified version of the internal organization of the CPU is
shown in Figure 6.6. The control signals shown will be generated by
the control unit.
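The bus check described above can be automated. The sketch below is a hedged illustration: the table of bus sources is read off the RTL for this CPU, while the names BUS_SOURCES and bus_conflicts are hypothetical.

```python
# Which component places data on the internal bus in each state.
# The table layout and function name are illustrative assumptions.
BUS_SOURCES = {
    "FETCH1": ["PC"],  # AR <- PC
    "FETCH2": ["M"],   # DR <- M; PC <- PC + 1 uses PC's counter, not the bus
    "FETCH3": ["DR"],  # IR <- DR[7..6], AR <- DR[5..0]: one value, two readers
    "ADD1": ["M"],     # DR <- M
    "ADD2": ["DR"],    # AC is hardwired to the ALU; only DR uses the bus
    "AND1": ["M"],     # DR <- M
    "AND2": ["DR"],
    "JMP1": ["DR"],    # PC <- DR[5..0]
    "INC1": [],        # AC <- AC + 1 uses AC's increment input
}


def bus_conflicts(sources):
    """Return the states that need more than one value on the bus."""
    return sorted(s for s, drivers in sources.items() if len(set(drivers)) > 1)


print(bus_conflicts(BUS_SOURCES))  # no state drives the bus twice
```

If PC←PC + 1 were routed through the bus instead of a counter, FETCH2 would list both M and PC and the check would report it.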
Figure 6.6
Final register section for the Very Simple CPU
(Data paths with the control signals READ, MEMBUS, ARLOAD, PCBUS, PCLOAD, PCINC, DRBUS, DRLOAD, ALUSEL, ACLOAD, ACINC, and IRLOAD. AR and PC use bus bits [5..0]; IR uses bus bits [7..6].)
Figure 6.7
A Very Simple ALU
(A parallel adder and an AND of AC and DR, which arrives from the bus, feed a multiplexer. The control signal from the control unit selects which result goes to AC.)
These signals cause the control unit to traverse the states in the
proper order. A generic version of this type of hardwired control unit
is shown in Figure 6.8.
Figure 6.8
Generic hardwired control unit
(Blocks in order: input, counter with CLK, LD, INC, and CLR inputs, decoder, and logic that produces the control signals to registers, ALU, buffers, and output pins.)
Table 6.2
Instructions, first states, and opcodes for the Very Simple CPU
Table 6.3
Counter values for the proposed mapping function
the execute routines have counter values at least two apart, it is possi-
ble to store the execute routines in sequential locations. This is ac-
complished by using the mapping function 1IR[1..0]0, which results in
counter values of 8, 10, 12, and 14 for ADD1, AND1, JMP1, and INC1,
respectively. To assign the execute routines to consecutive values, we
assign ADD2 to counter value 9 and AND2 to counter value 11.
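The mapping function can be checked with a few lines of arithmetic. In this Python sketch the function name map_opcode is an assumption; the bit pattern 1 IR[1..0] 0 comes from the text.

```python
def map_opcode(ir):
    """Build the 4-bit counter value 1 IR[1..0] 0 from a 2-bit opcode."""
    return 0b1000 | ((ir & 0b11) << 1)


FIRST_STATES = {0b00: "ADD1", 0b01: "AND1", 0b10: "JMP1", 0b11: "INC1"}
for ir, state in FIRST_STATES.items():
    print(state, map_opcode(ir))
```

Because the low-order bit is always 0, each first state is followed by a free odd counter value, which is where ADD2 and AND2 are placed.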
Now that we have decided which decoder output is assigned to
each state, we can use these signals to generate the control signals
for the counter of the control unit and for the components of the rest
of the CPU. For the counter, we must generate the INC, CLR, and LD
signals. INC is asserted when the control unit is traversing sequential
states, during FETCH1, FETCH2, ADD1, and AND1. CLR is asserted at
the end of each execute cycle to return to the fetch cycle; this happens
during ADD2, AND2, JMP1, and INC1. Finally, as noted earlier, LD is as-
serted at the end of the fetch cycle during state FETCH3. Note that
each state of the CPU’s state diagram drives exactly one of these three
control signals. The circuit diagram for the control unit at this point is
shown in Figure 6.9.
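As a rough illustration of how INC, CLR, and LD steer the counter, the following Python sketch walks the control unit through one instruction. The state sets come from the paragraph above; the names STATE_AT, next_count, and trace are hypothetical.

```python
# Counter value -> state, using the assignments described in the text.
STATE_AT = {0: "FETCH1", 1: "FETCH2", 2: "FETCH3", 8: "ADD1", 9: "ADD2",
            10: "AND1", 11: "AND2", 12: "JMP1", 14: "INC1"}
INC = {"FETCH1", "FETCH2", "ADD1", "AND1"}
CLR = {"ADD2", "AND2", "JMP1", "INC1"}
LD = {"FETCH3"}


def next_count(count, ir):
    """Apply whichever of LD, CLR, or INC the current state asserts."""
    state = STATE_AT[count]
    if state in LD:
        return 0b1000 | (ir << 1)   # load the mapped value 1 IR[1..0] 0
    if state in CLR:
        return 0                    # return to FETCH1
    return count + 1                # INC: next sequential state


def trace(ir):
    """List the states visited for one instruction with 2-bit opcode ir."""
    count, states = 0, []
    while True:
        states.append(STATE_AT[count])
        count = next_count(count, ir)
        if count == 0:
            return states


print(trace(0b00))  # states visited by an ADD instruction
```

The sketch also makes it easy to confirm that every state belongs to exactly one of the three sets.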
Figure 6.9
Hardwired control unit for the Very Simple CPU
(Decoder outputs 0, 1, 2, and 14 are labeled FETCH1, FETCH2, FETCH3, and INC1. LD = FETCH3; INC = FETCH1 ∨ FETCH2 ∨ ADD1 ∨ AND1; CLR = ADD2 ∨ AND2 ∨ JMP1 ∨ INC1.)
These state signals are also combined to create the control signals
for AR, PC, DR, IR, M, the ALU, and the buffers. First consider register AR.
It is loaded during states FETCH1 (AR←PC) and FETCH3 (AR←DR[5..0]).
By logically ORing these two state signals together, the CPU generates
the LD signal for AR. It doesn’t matter which value is to be loaded into
AR, at least as far as the LD signal is concerned. When the designers cre-
ate the control signals for the buffers, they will ensure that the proper
data is placed on the bus and made available to AR. Following this pro-
cedure, we create the following control signals for PC, DR, AC, and IR:
PCLOAD = JMP1
PCINC = FETCH2
DRLOAD = FETCH2 ∨ ADD1 ∨ AND1
ACLOAD = ADD2 ∨ AND2
ACINC = INC1
IRLOAD = FETCH3
The ALU has one control input, ALUSEL. When ALUSEL = 0, the
output of the ALU is the arithmetic sum of its two inputs; if ALUSEL = 1,
the output is the logical AND of its inputs. Setting ALUSEL = AND2
routes the correct data from the ALU to AC when the CPU is executing
an ADD or AND instruction. At other times, during the fetch cycle and
the other execute cycles, the ALU is still outputting a value to AC.
However, since AC does not load this value, the value output by the
ALU does not cause any problems.
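The behavior of this ALU fits in one function. The Python sketch below is an illustration only; the function name alu is an assumption, and so is the truncation of the sum to 8 bits, since the text does not say what happens to the carry.

```python
def alu(ac, dr, alusel):
    """ALUSEL = 0 selects AC + DR; ALUSEL = 1 selects AC AND DR."""
    if alusel:
        return ac & dr
    return (ac + dr) & 0xFF  # 8-bit result; dropping the carry is an assumption


print(alu(0b1100, 0b1010, 1))  # logical AND
print(alu(3, 4, 0))            # arithmetic sum
```

The function always returns a value, just as the hardware ALU always drives its output; whether AC captures that value is decided separately by ACLOAD.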
Many of the operations use data from the internal system bus.
The CPU must enable the buffers so the correct data is placed on the
bus at the proper time. Again, looking at the operations that oc-
cur during each state, we can generate the enable signals for the buf-
fers. For example, DR must be placed onto the bus during FETCH3
(IR←DR[7..6], AR←DR[5..0]), ADD2 (AC←AC + DR), AND2 (AC←AC ∧ DR)
and JMP1 (PC←DR[5..0]). (Recall that the ALU receives DR input via the
internal bus.) Logically ORing these state values produces the DRBUS
signal. This procedure is used to generate the enable signals for the
other buffers as well:
MEMBUS = FETCH2 ∨ ADD1 ∨ AND1
PCBUS = FETCH1
Finally, the control unit must generate a READ signal, which is
output from the CPU. This signal causes memory to output its data
value. This occurs when memory is read during states FETCH2, ADD1,
and AND1, so READ can be set as follows:
READ = FETCH2 ∨ ADD1 ∨ AND1
The circuit diagram for the portion of the control unit that gener-
ates these signals is shown in Figure 6.10. This completes the design
of the Very Simple CPU.
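Since every control signal is a logical OR of state signals, the whole signal generator can be sketched as a lookup table. In this Python illustration the names SIGNALS and asserted are hypothetical. DRLOAD is taken as FETCH2 ∨ ADD1 ∨ AND1, because those are the three states whose RTL performs DR←M.

```python
# Each control signal is the OR of the states listed for it.
SIGNALS = {
    "ARLOAD": {"FETCH1", "FETCH3"},
    "PCLOAD": {"JMP1"},
    "PCINC": {"FETCH2"},
    "DRLOAD": {"FETCH2", "ADD1", "AND1"},
    "ACLOAD": {"ADD2", "AND2"},
    "ACINC": {"INC1"},
    "IRLOAD": {"FETCH3"},
    "ALUSEL": {"AND2"},
    "MEMBUS": {"FETCH2", "ADD1", "AND1"},
    "PCBUS": {"FETCH1"},
    "DRBUS": {"FETCH3", "ADD2", "AND2", "JMP1"},
    "READ": {"FETCH2", "ADD1", "AND1"},
}


def asserted(state):
    """Return the control signals that are high during the given state."""
    return sorted(name for name, states in SIGNALS.items() if state in states)


print(asserted("FETCH2"))
print(asserted("AND2"))
```

Reading the table by state rather than by signal is a quick way to check that each state asserts what its micro-operations need, for example that FETCH2 both reads memory and loads DR.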
Figure 6.10
Control signal generation for the Very Simple CPU
(OR gates combine the state signals into ARLOAD, PCLOAD, PCINC, DRLOAD, ACLOAD, ACINC, IRLOAD, ALUSEL, MEMBUS, PCBUS, DRBUS, and READ, following the equations given in the text.)
Table 6.4
Execution trace
Table 6.5
Instruction set for a Relatively Simple CPU
Instruction   Instruction Code   Operation
NOP           0000 0000          No operation
LDAC          0000 0001          AC←M[Γ]
STAC          0000 0010          M[Γ]←AC
MVAC          0000 0011          R←AC
MOVR          0000 0100          AC←R
JUMP          0000 0101          GOTO Γ
JMPZ          0000 0110          IF (Z = 1) THEN GOTO Γ
JPNZ          0000 0111          IF (Z = 0) THEN GOTO Γ
ADD           0000 1000          AC←AC + R, IF (AC + R = 0) THEN Z←1 ELSE Z←0
SUB           0000 1001          AC←AC − R, IF (AC − R = 0) THEN Z←1 ELSE Z←0
INAC          0000 1010          AC←AC + 1, IF (AC + 1 = 0) THEN Z←1 ELSE Z←0
CLAC          0000 1011          AC←0, Z←1
AND           0000 1100          AC←AC ∧ R, IF (AC ∧ R = 0) THEN Z←1 ELSE Z←0
OR            0000 1101          AC←AC ∨ R, IF (AC ∨ R = 0) THEN Z←1 ELSE Z←0
XOR           0000 1110          AC←AC ⊕ R, IF (AC ⊕ R = 0) THEN Z←1 ELSE Z←0
NOT           0000 1111          AC←AC′, IF (AC′ = 0) THEN Z←1 ELSE Z←0
Most CPUs have several general purpose registers; this CPU has only
one to illustrate the use of general purpose registers while still keep-
ing the design relatively simple.
Most CPUs contain internal data registers that cannot be accessed
by the programmer. This CPU contains temporary register TR, which it
uses to store data during the execution of instructions. As we will see,
the CPU can use this register to save data while it fetches the address
for memory reference instructions. Unlike the contents of AC or R,
which are directly modified by the user, no instruction causes a per-
manent change in the contents of TR.
Finally, most CPUs contain flag registers, or flags, which show
the result of a previous operation. Typical flags indicate whether or
not an operation generated a carry, the sign of the result, or the parity
of the result. The Relatively Simple CPU contains a zero flag, Z, which
is set to 1 if the last arithmetic or logical operation produced a result
equal to 0. Not every instruction changes the contents of Z in this and
other CPUs. For example, an ADD instruction sets Z, but a MOVR (move
data from R into AC ) instruction does not. Most CPUs contain condi-
tional instructions that perform different operations, depending on
the value of a given flag. The JMPZ and JPNZ instructions for this CPU
fall into this category.
Figure 6.11
Fetch and decode cycles for the Relatively Simple CPU
(States FETCH1, FETCH2, and FETCH3, followed by a branch on the opcode in IR; opcodes 06 and 07 also test Z.)
Having fetched the low-order half of the address, the CPU now
must fetch the high-order half. It must also save the low-order half
somewhere other than DR; otherwise it will be overwritten by the
high-order half of address Γ. Here we make use of the temporary
register TR. Again, the CPU must increment PC or it will not have the
correct address for the next fetch routine. The second state is
LDAC2: TR←DR, DR←M, PC←PC + 1
Now that the CPU contains the address, it can read the data from
memory. To do this, the CPU first copies the address into AR, then
reads data from memory into DR. Finally, it copies that data into the
accumulator and branches back to the fetch routine. The states to per-
form these operations are
LDAC3: AR←DR,TR
LDAC4: DR←M
LDAC5: AC←DR
memory address in exactly the same way as LDAC; states STAC1, STAC2,
and STAC3 are identical to LDAC1, LDAC2, and LDAC3, respectively.
Once AR contains the address, this routine must copy the data
from AC to DR, then write it to memory. The states that comprise this
execute routine are
STAC4: DR←AC
STAC5: M←DR
At first glance, it may appear that STAC3 and STAC4 can be combined
into a single state. However, when constructing the data paths later in
the design process, we decided to route both transfers via an internal
bus. Since both values cannot occupy the bus simultaneously, we
chose to split the state in two rather than create a separate data path.
This process is not uncommon, and the designer should not be con-
cerned about needing to modify the state diagram because of data
path conflicts. Consider it one of the tradeoffs inherent to engineering
design.
6.3.3.4 MVAC and MOVR Instructions
The MVAC and MOVR instructions are both fairly straightforward. The
CPU simply performs the necessary data transfer in one state and goes
back to the fetch routine. The states that comprise these routines are
MVAC1: R←AC
and
MOVR1: AC←R
6.3.3.5 JUMP Instruction
To execute the JUMP instruction, the CPU fetches the address just as it
did for the LDAC and STAC instructions, except it does not increment
PC. Instead of loading the address into AR, it copies the address into
PC, so any incremented value of PC would be overwritten anyway. This
instruction can be implemented using three states.
JUMP1: DR←M, AR←AR + 1
JUMP2: TR←DR, DR←M
JUMP3: PC←DR,TR
6.3.3.6 JMPZ and JPNZ Instructions
The JMPZ and JPNZ instructions each have two possible outcomes, de-
pending on the value of the Z flag. If the jump is to be taken, the CPU
follows execution states exactly the same as those used by the JUMP
instruction. However, if the jump is not taken, the CPU cannot simply
return to the fetch routine. After the fetch routine, the PC contains the
address of the low-order half of the jump address. If the jump is not
taken, the CPU must increment the PC twice so that it points to the next
instruction in memory, not to either byte of Γ. The states to perform the
JMPZ instruction are as follows. Note that the JMPZY states are executed
if Z = 1 and the JMPZN states are executed if Z = 0.
JMPZY1: DR←M, AR←AR + 1
JMPZY2: TR←DR, DR←M
JMPZY3: PC←DR,TR
JMPZN1: PC←PC + 1
JMPZN2: PC←PC + 1
The states for JPNZ are identical but are accessed under opposite
conditions; that is, JPNZY states are executed when Z = 0 and JPNZN states
are traversed when Z = 1.
JPNZY1: DR←M, AR←AR + 1
JPNZY2: TR←DR, DR←M
JPNZY3: PC←DR,TR
JPNZN1: PC←PC + 1
JPNZN2: PC←PC + 1
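The two outcomes of a conditional jump can be traced with a small Python sketch. It is an illustration under stated assumptions: the function name execute_jmpz and the dictionary memory model are hypothetical, and AR is assumed to equal PC when the execute routine begins, with both pointing at the low-order address byte.

```python
def execute_jmpz(mem, pc, z):
    """Return the new PC after the JMPZ execute routine.

    Assumes pc (and AR) hold the address of the low-order address byte.
    """
    if z == 1:
        ar = pc
        dr = mem[ar]                 # JMPZY1: DR <- M, AR <- AR + 1
        ar = (ar + 1) & 0xFFFF
        tr = dr                      # JMPZY2: TR <- DR, DR <- M
        dr = mem[ar]
        return (dr << 8) | tr        # JMPZY3: PC <- DR,TR (DR is the high byte)
    pc = (pc + 1) & 0xFFFF           # JMPZN1: PC <- PC + 1
    pc = (pc + 1) & 0xFFFF           # JMPZN2: PC <- PC + 1
    return pc


# Address bytes 0x34 (low) and 0x12 (high) stored after the opcode.
memory = {0x0101: 0x34, 0x0102: 0x12}
print(hex(execute_jmpz(memory, 0x0101, 1)))
print(hex(execute_jmpz(memory, 0x0101, 0)))
```

Swapping the test on z gives the JPNZ routine, since its states are identical and only the condition differs.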
Figure 6.12
Complete state diagram for the Relatively Simple CPU
(After FETCH1, FETCH2, and FETCH3, the diagram branches on the opcode in IR, and on Z for opcodes 06 and 07, to the first state of each execute routine, such as LDAC1, MVAC1, JUMP1, JMPZN1, JPNZN1, INAC1, and XOR1.)
Figure 6.13
Preliminary register section for the Relatively Simple CPU
(Registers AR and PC, 16 bits each, and DR, TR, IR, R, and AC, 8 bits each, are connected with the ALU to a 16-bit bus; memory M attaches through A[15..0] and D[7..0].)
3. The 16-bit bus is not fully used by all registers. We must specify
which bits of the data bus are connected to which bits of the
registers.
4. Register Z is not connected to anything.
To address the first point, we simply remove the unused connec-
tions. The second point is also straightforward: A standard way to im-
plement bidirectional pins is to use a pair of buffers, one in each di-
rection. One buffer is used to input data from the pins and the other
outputs data to the pins. The two buffers must never be enabled si-
multaneously. This configuration is shown in Figure 6.14.
Figure 6.14
Generic bidirectional data pin
(Pin Di with one input buffer and one output buffer.)
Figure 6.15
Final register section for the Relatively Simple CPU
(Data paths on a 16-bit bus with the control signals READ, WRITE, MEMBUS, BUSMEM, ARLOAD, ARINC, PCBUS, PCLOAD, PCINC, DRHBUS, DRLBUS, DRLOAD, TRBUS, TRLOAD, IRLOAD, RBUS, RLOAD, ACBUS, ACLOAD, ALUS[1..7], and ZLOAD. DRHBUS drives bus bits [15..8]; DRLBUS, TRBUS, RBUS, and ACBUS drive bits [7..0].)
LDAC5: AC←BUS
MOVR1: AC←BUS
ADD1: AC←AC + BUS
SUB1: AC←AC − BUS
INAC1: AC←AC + 1
CLAC1: AC←0
Figure 6.16
A Relatively Simple ALU
(A network of multiplexers, selected by ALUS1 through ALUS7, chooses among 0, AC, BUS, and BUS′ as operands; the result goes to AC.)
generate the state value. One value is the opcode of the instruction.
The other is a counter to keep track of which state in the fetch or exe-
cute routine should be active.
The opcode value is relatively easy to design. The opcode is
stored in IR, so the control unit can use that register’s outputs as in-
puts to a decoder. Since the instruction codes are all of the form 0000
XXXX, we only need to decode the four low-order bits. We NOR together
the four high-order bits to enable the decoder. Then the counter can be
set up so that it only has to be incremented and cleared, and never
loaded; this greatly simplifies the design. These components, and the
labels assigned to their outputs, are shown in Figure 6.17.
The fetch routine is the only routine that does not use a value
from the instruction decoder. Since the instruction is still being
fetched during these states, this decoder could have any value during
the instruction fetch. Just as with the Very Simple CPU, this control
unit assigns T0 to FETCH1, since it can be reached by clearing the time
counter. We assign T1 and T2 to FETCH2 and FETCH3, respectively.
The states of the execute routines depend on both the opcode
and time counter values. T3 is the first time state of each execute
routine, T4 is the second, and so on. The control unit logically ANDs the
correct time value with the output of the instruction decoder
corresponding to the proper instruction. For example, the states of the
LDAC execute routine are
LDAC1 = ILDAC ∧ T3
LDAC2 = ILDAC ∧ T4
LDAC3 = ILDAC ∧ T5
LDAC4 = ILDAC ∧ T6
LDAC5 = ILDAC ∧ T7
Figure 6.17
Hardwired control unit for the Relatively Simple CPU
(A 4-to-16 decoder on IR[3..0], enabled from IR[7..4], generates INOP, ILDAC, ISTAC, IMVAC, IMOVR, IJUMP, IJMPZ, IJPNZ, IADD, ISUB, IINAC, ICLAC, IAND, IOR, IXOR, and INOT. A time counter with INC and CLR inputs drives a second decoder that generates T0 through T7.)
The complete list of states is given in Table 6.6 on page 250.
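State generation of this kind can be sketched in a few lines. In the Python illustration below, the names OPCODES, decode, and ldac_state are hypothetical; the opcode order follows the instruction set table, and the enable condition models the NOR of the four high-order bits of IR.

```python
OPCODES = ["NOP", "LDAC", "STAC", "MVAC", "MOVR", "JUMP", "JMPZ", "JPNZ",
           "ADD", "SUB", "INAC", "CLAC", "AND", "OR", "XOR", "NOT"]


def decode(ir, t):
    """Return the active instruction-decoder and time-decoder outputs."""
    enabled = (ir >> 4) == 0                     # NOR of IR[7..4]
    instr = "I" + OPCODES[ir & 0xF] if enabled else None
    return instr, "T%d" % t


def ldac_state(ir, t):
    """LDAC1..LDAC5 = ILDAC AND T3..T7; None for any other combination."""
    instr, _ = decode(ir, t)
    if instr == "ILDAC" and 3 <= t <= 7:
        return "LDAC%d" % (t - 2)
    return None


print(ldac_state(0x01, 3))
print(ldac_state(0x01, 7))
```

The same pattern, one decoder output ANDed with one time value, yields the states of every other execute routine.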
Having generated the states, we must generate the signals to
supply the CLR and INC inputs of the time counter. The counter is
cleared only at the end of each execute routine. To do this, we logi-
cally OR the last state of each execute routine to generate the CLR in-
put. The INC input should be asserted at all other times, so it can be
implemented by logically ORing the remaining states together. As an
alternative, the INC input can be the complement of the CLR input,
since, if the control unit is not clearing the counter, it is incrementing
the counter.
Following the same procedure we used for the Very Simple CPU,
we generate the register and buffer control signals. Table 6.7 on page
251 shows the values for the buffers and AR. The remaining control
signals are left as design problems for the reader.
Table 6.6
State generation for a Relatively Simple CPU
Table 6.7
Control signal values for a Relatively Simple CPU
Signal    Value
PCBUS     FETCH1 ∨ FETCH3
DRHBUS    LDAC3 ∨ STAC3 ∨ JUMP3 ∨ JMPZY3 ∨ JPNZY3
DRLBUS    LDAC5 ∨ STAC5
TRBUS     LDAC3 ∨ STAC3 ∨ JUMP3 ∨ JMPZY3 ∨ JPNZY3
RBUS      MOVR1 ∨ ADD1 ∨ SUB1 ∨ AND1 ∨ OR1 ∨ XOR1
ACBUS     STAC4 ∨ MVAC1
MEMBUS    FETCH2 ∨ LDAC1 ∨ LDAC2 ∨ LDAC4 ∨ STAC1 ∨ STAC2 ∨
          JUMP1 ∨ JUMP2 ∨ JMPZY1 ∨ JMPZY2 ∨ JPNZY1 ∨ JPNZY2
BUSMEM    STAC5
ARLOAD    FETCH1 ∨ FETCH3 ∨ LDAC3 ∨ STAC3
ARINC     LDAC1 ∨ STAC1 ∨ JUMP1 ∨ JMPZY1 ∨ JPNZY1
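The buffer enables in Table 6.7 can be checked for bus contention mechanically. The Python sketch below is a hedged illustration: the names BUS_ENABLES, BUS_BITS, and contended_states are hypothetical, and the bit ranges are taken from the labels on the final register section figure. BUSMEM is omitted because it passes data from the bus to the memory pins rather than driving the bus.

```python
JUMP3S = {"LDAC3", "STAC3", "JUMP3", "JMPZY3", "JPNZY3"}
BUS_ENABLES = {
    "PCBUS": {"FETCH1", "FETCH3"},
    "DRHBUS": set(JUMP3S),
    "DRLBUS": {"LDAC5", "STAC5"},
    "TRBUS": set(JUMP3S),
    "RBUS": {"MOVR1", "ADD1", "SUB1", "AND1", "OR1", "XOR1"},
    "ACBUS": {"STAC4", "MVAC1"},
    "MEMBUS": {"FETCH2", "LDAC1", "LDAC2", "LDAC4", "STAC1", "STAC2",
               "JUMP1", "JUMP2", "JMPZY1", "JMPZY2", "JPNZY1", "JPNZY2"},
}
# Bits of the 16-bit bus that each buffer drives (assumed from the figure).
BUS_BITS = {
    "PCBUS": range(0, 16), "DRHBUS": range(8, 16), "DRLBUS": range(0, 8),
    "TRBUS": range(0, 8), "RBUS": range(0, 8), "ACBUS": range(0, 8),
    "MEMBUS": range(0, 8),
}


def contended_states(enables=BUS_ENABLES, bits=BUS_BITS):
    """Return states in which two enabled buffers drive the same bus bit."""
    bad = []
    for state in sorted(set().union(*enables.values())):
        driven = set()
        for name, active in enables.items():
            if state not in active:
                continue
            lines = set(bits[name])
            if driven & lines:
                bad.append(state)
                break
            driven |= lines
    return bad


print(contended_states())
```

DRHBUS and TRBUS are enabled in the same states without conflict because they drive different halves of the bus, which is how the 16-bit address DR,TR is assembled in one transfer.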
Figure 6.18
Register section for the Relatively Simple CPU using multiple buses
(Figure in two parts. The first part shows memory M, AR, PC, TR, DR, and IR with their buffers and load signals, and connections to and from data bus #2. The continuation shows R, the ALU, AC, and Z on data bus #2.)
CPU. This is the basis of the complex versus reduced instruction set
computing debate, which is examined more closely in Chapter 11.
Figure 6.19
(Block diagram of the 8085 microprocessor. Recoverable labels: accumulator (A register, 8 bits), temporary register (8), instruction register (8), flag flip-flops (5), arithmetic logic unit (ALU, 8), instruction decoder and machine cycle encoding, register array with B, C, D, E, H, and L registers (8 bits each), stack pointer (16), program counter (16), incrementer/decrementer address latch (16), and power supply.)
times, it must tri-state its connections to these buses; this is the func-
tion of these buffers. This happens when the computer is performing
a DMA transfer, described in detail in Chapter 10. In addition, the
data/address buffers determine whether data is input to or output
from the CPU, just as was done with the Relatively Simple CPU.
The interrupt control block contains the interrupt mask register.
The user can read the value from this register or store a value into
that register, so it is included in the microprocessor’s instruction set
architecture and its register section. The serial I/O control block also
contains a register to latch serial output data.
The registers communicate within the CPU via the 8-bit internal
data bus. Although it is not very clear in Figure 6.19, the connection
from the register array (the block containing registers B, C, D, E, H, L,
SP, and PC ) is wide enough for one register to place data onto the bus
while another register reads the data from the bus, as when the in-
struction MOV B,C is executed. When data is read from memory, such
as during an instruction fetch, or from an I/O device, the data is
passed through the data/address buffer on to the internal data bus.
From there, it is read in by the appropriate register.
The control section consists of several parts. The timing and con-
trol block is equivalent to almost the entire control unit of the Rela-
tively Simple CPU. It sequences through the states of the microproces-
sor and generates external control signals, such as those used to read
from and write to memory. Although not shown, it also generates all
of the internal control signals used to load, increment and clear regis-
ters; to enable buffers; and to specify the function to be performed by
the ALU.
The instruction decoder and machine cycle encoding block takes
the current instruction (stored in the instruction register) as its input
and generates state signals that are input to the timing and control
block. This is similar to the function performed by the 4-to-16 decoder
in the control unit of the Relatively Simple CPU, as shown in Figure
6.17. Essentially, it decodes the instruction. The decoded signals are
then combined with the timing signals in the timing and control block
to generate the internal control signals of the microprocessor.
Finally, the interrupt control and serial I/O control blocks are
partially elements of the control unit. The interrupt control block ac-
cepts external interrupt requests, checks whether the requested inter-
rupts are enabled, and passes valid requests to the rest of the control
unit. (As with the internal control signals, the path followed by these
requests is not shown in Figure 6.19 but it is present nonetheless.)
The serial I/O control block contains logic to coordinate the serial
transfer of data into and out of the microprocessor.
The 8085 microprocessor addresses several but not all of the
shortcomings of the Relatively Simple CPU. First of all, it contains
more general purpose registers than the Relatively Simple CPU. This
allows the 8085 to use fewer memory accesses than the Relatively
Simple CPU to perform the same task. The 8085 microprocessor also
has a larger instruction set, and has the ability to handle subroutines
and interrupts. However, it still uses only one internal bus to transfer
data, which limits the number of data transfers that can occur at any
given time. The 8085 also does not use an instruction pipeline. Like
the Relatively Simple CPU, it processes instructions sequentially—it
fetches, decodes, and executes one instruction before fetching the
next instruction.
6.6 Summary
In previous chapters, we looked at the CPU from the point of view of
the programmer (instruction set architecture) and the system designer
(computer organization). In this chapter, we examined the CPU from
the perspective of the computer architect.
To design a CPU, we first develop its instruction set architecture,
including its instruction set and its internal registers. We then create a
finite state machine model of the micro-operations needed to fetch,
decode, and execute every instruction in its instruction set. Then we
develop an RTL specification for this state machine.
A CPU contains three primary sections: the register section, con-
sisting of the registers in the CPU’s ISA as well as other registers not
directly available to the programmer, the ALU, and the control unit.
The micro-operations in its RTL code specify the functions to be per-
formed by the register section and the ALU. These micro-operations
are used to design the data paths within the register section, includ-
ing direct connections and buses, and the functions of each register.
The micro-operations also specify the functions of the ALU. Since the
ALU must perform all of its calculations in a single clock cycle, it is
constructed using only combinatorial logic.
The conditions under which each micro-operation occurs dictate
the design of the control unit. The control unit generates the control
signals that load, increment, and clear the registers in the register sec-
tion. The control unit also enables the buffers used to control the
CPU’s internal buses. The function to be performed by the ALU is spec-
ified by the control unit. By outputting the control signals in the
proper sequence, the control unit causes the CPU to properly fetch,
decode, and execute every instruction in its instruction set.
Problems
1 A CPU with the same registers as the Very Simple CPU, connected as
shown in Figure 6.6, has the following instruction set and state dia-
gram. Show the RTL code for the execute cycles for each instruction.
Assume the RTL code for the fetch routine is the same as that of the
Very Simple CPU.
(State diagram for Problem 1: FETCH1, FETCH2, and FETCH3, followed by a four-way branch on IR = 00, 01, 10, and 11.)
2 A CPU with the same registers as the Very Simple CPU, connected as
shown in Figure 6.6, has the state diagram on the next page and fol-
lowing RTL code. Show the instruction set for this CPU.
FETCH1: AR←PC
FETCH2: DR←M, PC←PC + 1
FETCH3: IR←DR[7..6], AR←DR[5..0]
001: DR←M, AR←AR + 1
002: AC←AC + DR
003: DR←M
004: AC←AC + DR
011: DR←M, PC←PC + 1
012: AC←AC ∧ DR
1X1: AC←AC + 1, DR←M
1X2: AC←AC ∧ DR
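(State diagram for Problem 2: FETCH1, FETCH2, and FETCH3, followed by branches on IR = 00, IR = 01, and IR = 1X; states 003 and 004 appear on the diagram.)
(State diagram: FETCH1, FETCH2, FETCH3, and FETCH4, followed by a four-way branch on XY = 00, 01, 10, and 11; the state IA3 appears on the diagram.)
(Control unit diagram: a 4-bit counter with LD, INC, and CLR inputs feeds a decoder whose outputs 0 through 11 are FETCH1, FETCH2, FETCH3, FETCH4, IA1, IA2, IA3, IB1, IB2, IC1, IC2, and ID1.)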
5 Modify the control unit of Problem 4 so that it realizes the state dia-
gram properly.
7 For the CPU of Problem 6, show the modifications necessary for the
register section.
8 For the CPU of Problem 6, show the modifications necessary for the
control unit. Include the hardware needed to generate any new or
modified control signals.
9 Verify the functioning of the CPU of Problems 6, 7, and 8 for the new
instruction.
10 We wish to modify the Very Simple CPU to incorporate a new 8-bit reg-
ister, R, and two new instructions. MVAC performs the transfer R←AC
and has the instruction code 1110 XXXX; MOVR performs the opera-
tion AC←R and has the instruction code 1111 XXXX. The new instruc-
tion code for INC is 110X XXXX; all other instruction codes remain un-
changed. Show the new state diagram and RTL code for this CPU.
11 For the CPU of Problem 10, show the modifications necessary for the
register section.
12 For the CPU of Problem 10, show the modifications necessary for the
control unit. Include the hardware needed to generate any new or
modified control signals.
13 Verify the functioning of the CPU of Problems 10, 11, and 12 for the
new instructions.
15 Show the logic needed to generate the control signals for registers PC,
DR, TR, and IR of the Relatively Simple CPU.
16 Show the logic needed to generate the control signals for registers R,
AC, and Z of the Relatively Simple CPU.
17 Show the logic needed to generate the control signals for the ALU of
the Relatively Simple CPU.
18 Verify the functioning of the Relatively Simple CPU for all instructions,
either manually or using the CPU simulator.
20 For the CPU of Problem 19, show the modifications necessary for the
register section.
21 For the CPU of Problem 19, show the modifications necessary for the
control unit. Include the hardware needed to generate any new or
modified control signals.
22 Verify the functioning of the CPU of Problems 19, 20, and 21 for the
new instruction.
23 Modify the Relatively Simple CPU to include a new 8-bit register, B, and
five new instructions as follows. Show the modified state diagram and
RTL code for this CPU.
24 For the CPU of Problem 23, show the modifications necessary for the
register section and the ALU.
25 For the CPU of Problem 23, show the modifications necessary for the
control unit. Include the hardware needed to generate any new or
modified control signals.
26 Verify the functioning of the CPU of Problems 23, 24, and 25 for the
new instructions.
27 For the Relatively Simple CPU, assume the CLAC and INAC instructions
are implemented via the CLR and INC signals of the AC register, in-
stead of through the ALU. Modify the input and control signals of Z so
it is set properly for all instructions.
30 Modify the Relatively Simple CPU so that it can use a stack. The
changes required to do this are as follows.
• Include a 16-bit stack pointer (SP) register that holds the address of
the top of the stack.
• The CPU must realize the following additional instructions. Note
that operations separated by semicolons occur sequentially, and
operations separated by commas occur simultaneously. Also note that
the value of PC used by the CALL instruction is the value of PC after
Γ has been fetched from memory.