HP-67 Simulators
Munroe
---------------
- CONTENTS -
H67-806
Elementary concepts. . . . . . . . . . . . . . . 1
An appraisal of the H67-806 . . . . . . . . . . . 4
H67-809A
Miscellaneous. . . . . . . . . . . . . . . . . .17
Elementary Concepts
[Figure: block diagram of the 806 showing the registers, the control unit, and memory]
The 806 has five registers called A, B, C, PC, and CC. The A-register
(accumulator) is used mostly for arithmetic operations; registers B and C
are simple work registers, but B can also be used as an index register
(indexing was described in the PPC article and further examples are given
in the next section of this document). Although values in memory should
range from 000 to 999, the value in a register can range from 0 to
±9.999999999 E±99. PC is the program counter; the contents of the PC
is always the address of the next instruction to be executed. For example,
when the instruction in location 20 is being executed, PC will contain the
value 21 -- which is the address of the next instruction to be executed.
For all practical purposes, the PC can be ignored by the programmer since
it is updated automatically by the 806 computer. However, before a program
begins execution the programmer must put the starting address into the PC.
Also, when a program halts, the contents of the PC will point one location
beyond where the program halted (as is typical of all computers). CC is the
condition code register. Some instructions, such as Add, cause the contents
of a register to be changed; others, such as Branch, do not affect any
registers. Whenever an instruction is executed that affects the contents
of a register, the CC is set to reflect whether the new contents are greater,
equal, or less than zero. This is then used in conjunction with the Branch-
on-Condition instruction, as you will see in the examples. If we were to
execute an instruction that puts the value 16 into A, the CC would also be
set to 16 automatically. Next, if we executed an instruction putting -20
into B, CC would be set to -20. Now, if we execute an instruction putting
the same value 16 into A again, CC would be changed from -20 to 16. The
contents of A was affected even though the new value was the same as the
old. Most microcomputers and many minicomputers use condition codes just like
this -- although they just use single bits to indicate greater, equal, or less
than zero, rather than saving the complete value.
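
The condition code rule above can be sketched in a few lines of Python. This is only an illustration of the behaviour described in the text, not the simulator's code; the class name `Registers` and its methods are invented for the example.

```python
class Registers:
    """Toy model of the 806 registers; only the CC rule is sketched here."""

    def __init__(self):
        self.values = {"A": 0, "B": 0, "C": 0}
        self.cc = 0  # the 806 keeps the whole value, not just a sign bit

    def put(self, name, value):
        # Any instruction that changes a register also refreshes CC,
        # even when the new value happens to equal the old one.
        self.values[name] = value
        self.cc = value

    def condition(self):
        # What a hardware condition code would keep: only the sign
        # (+1 greater than zero, 0 equal, -1 less than zero).
        return (self.cc > 0) - (self.cc < 0)


regs = Registers()
regs.put("A", 16)    # CC becomes 16
regs.put("B", -20)   # CC becomes -20
regs.put("A", 16)    # same value as before, yet CC changes back to 16
```

The `condition` method shows the single-bit style used by real machines; the 806 itself simply keeps the full value in CC.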
register:   A    B    C    PC   CC
contents:  xxx  xxx  xxx   00  xxx
location:
contents:
When execution halts, registers A and CC will contain the sum, 40; the PC will
have the value 04 (since the halt is in location 03); and location 29 will
contain the sum, 40:
register:   A    B    C    PC   CC
contents:   40  xxx  xxx   04   40

location:   00   01   02   03   04  ...   26   27   28   29
contents:  127  328  229  000  xxx  ...  xxx  016  024  040
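
The run shown above can be reproduced with a very small interpreter. The sketch below models only the instructions that appear in this document (L 1aa, ST 2aa, A 3aa, B 5aa, and HLT 000) and ignores indexing; the function name `run_806` is invented for the example, and the real simulator is an HP-67 program, not Python.

```python
def run_806(memory, pc=0):
    """Run until HLT; return (A, PC, CC). Only L, ST, A, B and HLT are modelled."""
    a = cc = 0
    while True:
        instr = memory[pc]
        pc += 1                      # PC now holds the address of the next instruction
        op, aa = divmod(instr, 100)  # iaa -> opcode digit i and address aa
        if instr == 0:               # 000 = HLT
            return a, pc, cc
        if op == 1:                  # L  1aa: load rA from location aa
            a = cc = memory[aa]
        elif op == 3:                # A  3aa: add location aa to rA
            a = cc = a + memory[aa]
        elif op == 2:                # ST 2aa: store rA; no register changes, CC untouched
            memory[aa] = a
        elif op == 5:                # B  5aa: branch to aa
            pc = aa
        else:
            raise ValueError(f"opcode {instr:03d} is not modelled in this sketch")


memory = [0] * 30
memory[0:4] = [127, 328, 229, 0]   # L 27, A 28, ST 29, HLT
memory[27], memory[28] = 16, 24    # the two values to be added
a, pc, cc = run_806(memory)
```

After the call, `a`, `pc`, `cc`, and `memory[29]` hold the same values as the register and memory tables above.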
microcode and the simulator
---------------------------
The H67-806 simulator is an HP-67 (or 41C) program that simulates the operations
of this hypothetical 806 computer. This again raises the question of why anyone
would simulate a computer on another computing device. Among the reasons given
in the PPC article, there is an even more important one: the widespread use of
microprogramming to replace hardwired logic. In the 1950's and early 60's, if
the instruction "1aa" meant "Load A from location aa", then, to decode that,
the computer's central processing unit (CPU) had to have the necessary hardwired
logic gates to perform precisely that operation in sequenced time intervals.
Not only did this involve designing and testing a lot of tedious circuitry, it
was also inflexible: coming out with the next model computer required designing
and testing a new CPU. Today, the instruction sets of IBM, DEC, and many other
Before describing the simulator, let's briefly introduce the H67-809A, which
part 2 of this document describes in detail.
First, the similarities: the 809A has the same registers A, B, C, PC, CC; and
also has 30 words of memory. However, the 809A has two new instructions (one
allows reading additional instructions from cards) and its memory organization
is a bit different. Everything you've read in these sections so far, including
the example "16+24" program, applies equally well to the 809A.
HP-41C users: the 41C's card reader will translate the HP-67 program card
automatically; the simulator program requires SIZE 026. Registers R00 to R09 and
register R25 are used exclusively by the simulator program itself. During
simulation, registers R20, R21, R22, R23, R24 correspond to A, B, C, PC, and CC.
Registers R10 to R19 correspond to memory locations 00 to 29 (see below). Note:
registers R10 to R19 are the same as RS0 to RS9 on the HP-67, so to simplify
terminology for both 67 and 41 users, I'll use RS0 to RS9 when discussing memory
organization.
Memory organization: Each register contains three memory locations. On the 806,
RS0 contains locations 00, 01, 02; RS1 contains locations 03, 04, 05; and so on.
RS9 contains locations 27, 28, 29. Values are put into 806 memory as a 9-digit
fractional number. So, if locations 06, 07, 08 are to contain the values 116,
720, 020, then the value .116720020 would be put into RS2. Note that this is
only an external representation; from the simulator's viewpoint, location 07
would contain the value 720 and not a fraction. On the 809A however, RS0 contains
locations 00, 10, 20; RS1 contains locations 01, 11, 21; RS2 contains 02, 12, 22;
and so on. RS9 contains 09, 19, 29. Values are put into 809A memory as a number
with three digits preceding the decimal point and six digits following it, e.g.
to put 123, 456, 789 into locations 02, 12, 22, then 123.456789 would be put
into RS2.
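
The two layouts can be compared side by side with a short Python sketch. It only restates the mapping described above; the function names are invented for the illustration, and `Decimal` is used so the packed values print exactly as they would be keyed in.

```python
from decimal import Decimal


def slot_806(loc):
    """806 layout: RS0 holds 00, 01, 02; RS1 holds 03, 04, 05; and so on."""
    return divmod(loc, 3)            # (register number, slot 0..2)


def slot_809a(loc):
    """809A layout: RS0 holds 00, 10, 20; RS1 holds 01, 11, 21; and so on."""
    slot, reg = divmod(loc, 10)
    return reg, slot                 # (register number, slot 0..2)


def pack_806(v0, v1, v2):
    """Three 3-digit values as a 9-digit fraction, e.g. .116720020"""
    return Decimal(f"0.{v0:03d}{v1:03d}{v2:03d}")


def pack_809a(v0, v1, v2):
    """Three 3-digit values in the form iii.jjjkkk, e.g. 123.456789"""
    return Decimal(f"{v0:03d}.{v1:03d}{v2:03d}")


# Location 07 on the 806 is the middle slot of RS2;
# location 12 on the 809A is the middle slot of RS2 as well.
where_806 = slot_806(7)
where_809a = slot_809a(12)
rs2_806 = pack_806(116, 720, 20)
rs2_809a = pack_809a(123, 456, 789)
```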
Execution of a simple program
-----------------------------
We'll end this section by actually executing the 16+24 program on the 809A:
HP-67 users:
5. Press f P<>S    Memory locations 00 to 29 now contain the program and data.
6. Put the starting address into the PC: key in 0 STO D (normally this
is not done -- the 809A automatically begins execution at memory location
10).
7. Begin execution: press E
8. After about 22 seconds, the 809A will halt. Examine registers and memory:
RCL A shows 40 (accumulator)
RCL D shows 4 (PC)
RCL E shows 40 (CC)
DSP 6, f P<>S, RCL 9, f P<>S shows 0.000040 (040 in location 29).
HP-41C users:
0. Set USER, SIZE 026, GTO.. (remove any assignments to keys A, B, C, D, E)
1. Load both sides of the 809A simulator program card
2. Press BOOT (the 'B' key)
3. Key in the program:
127.000000 STO 10
328.000000 STO 11
229.000000 STO 12
000.000000 STO 13
000.000016 STO 17
000.000024 STO 18    Memory locations 00 to 29 now contain
the program and data.
4. Put the starting address into the PC: key in 0 STO 23 (normally this
is not done -- the 809A automatically begins execution at memory location
10).
5. Begin execution: press E
6. After about 19 seconds, the 809A will halt. Examine registers and memory:
RCL 20 shows 40 (accumulator)
RCL 23 shows 4 (PC)
RCL 24 shows 40 (CC)
*: An easy way to see a single memory location is to use the ADDR ('A') key; first key
in the address and then press A, e.g. 29 A shows 40; 1 A shows 328. An indexed
address can also be keyed in.
The important thing to note here is that all data specific to each
subroutine call is obtained relative to rB -- the subroutine never
requires its data to be put in the same fixed locations for each call.
The program begins with the instruction 802, which puts the value 2 in
rB. The values we want to multiply, 3 and 4, are in locations 02 and
03, and so rB is pointing to the address of the first value.
The next instruction executed is 150, which - since rB=2 - loads the
contents of location 00+02 into rA. So, rA now contains the value 3
from location 02.
Next, 020 is used simply to put the 3 into rC (we could have used a
902 instruction, but 020 is faster).
At location 21, the instruction 252 puts the product in rA into location
02+02.
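
The address arithmetic in this walk-through can be checked with a small Python sketch. It assumes the convention described for the simulator's addressing routine (addresses 00 to 29 are direct, 50 to 79 are relative to rB); the function name `effective_address` is invented for the example.

```python
def effective_address(aa, rb):
    """Resolve the aa field of an instruction against index register rB."""
    if 0 <= aa <= 29:
        return aa                 # direct address
    if 50 <= aa <= 79:
        return (aa - 50) + rb     # indexed: offset plus contents of rB
    raise ValueError(f"address {aa:02d} is outside both ranges")


rb = 2                                   # set by the 802 instruction
load_from = effective_address(50, rb)    # instruction 150 -> location 00+02
store_to = effective_address(52, rb)     # instruction 252 -> location 02+02
```

Because every operand is found relative to rB, calling the subroutine with a different rB moves its whole data area without changing a single instruction.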
The H67-809A
The H67 model 809A has two new and powerful instructions, while
retaining the ten high-speed memory locations introduced in the
807C.
RS0   00  10  20
RS1   01  11  21
RS2   02  12  22
RS3   03  13  23
RS4   04  14  24
RS5   05  15  25
RS6   06  16  26
RS7   07  17  27
RS8   08  18  28
RS9   09  19  29
Note that this is completely different from the H67-806.
Explanation of timing
Although the ST instruction accesses memory twice, only the first access
(of the instruction itself) affects the overall execution time of the
instruction. The second access (of memory where rA is to be stored)
always requires a fixed amount of time regardless of the operand's location.
For BC, when CC=0 the H67 must do a second memory fetch in order to get
the branch address, bb, in the second word of the instruction. The timed
value represents the unusual case when the first word of the instruction,
6aa, is in low core location 09 and the second word is in high core
location 10.
L    1aa   LOAD rA
ST   2aa   STORE rA
A    3aa   ADD TO rA
B    5aa   BRANCH TO aa
m =  i   increment by i-1
     2   increment by 1
     1   unmodified operation
     0   decrement by 1

p =  pair: r1, r2
     9   PC,A
     8   C,C
     7   C,B
     6   C,A
     5   B,C
     4   B,B
     3   B,A
     2   A,C
     1   A,B
     0   A,A
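
As a rough illustration, the two digit tables above can be decoded as follows. The sketch assumes they belong to the T (transfer, 7mp) instruction, that r1 is the source and r2 the destination (as in "T A,C ; rC=rA" in the listing), that the modifier is applied to the source before the transfer, and that CC follows the transferred value. The names `PAIRS` and `transfer` are invented for the example.

```python
PAIRS = {
    0: ("A", "A"), 1: ("A", "B"), 2: ("A", "C"),
    3: ("B", "A"), 4: ("B", "B"), 5: ("B", "C"),
    6: ("C", "A"), 7: ("C", "B"), 8: ("C", "C"),
    9: ("PC", "A"),
}


def transfer(regs, mp):
    """Apply the mp field of a 7mp instruction to a dict of registers."""
    m, p = divmod(mp, 10)
    src, dst = PAIRS[p]
    regs[src] += m - 1        # m=0: -1, m=1: unchanged, m=2: +1, m=i: +(i-1)
    regs[dst] = regs[src]     # the transfer happens after the modification
    regs["CC"] = regs[dst]    # assumption: CC reflects the value transferred
    return regs


# 712 = T A,C with no modification: rC receives rA.
regs = transfer({"A": 48, "B": 0, "C": 0, "PC": 21, "CC": 0}, 12)

# 701 = T A,B with decrement: rA is decremented first, then copied to rB.
regs2 = transfer({"A": 5, "B": 0, "C": 0, "PC": 0, "CC": 0}, 1)
```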
Below is the multiplication program that was given for the 806. It has
been optimized for the 809A by putting the multiplication routine in
high speed locations 00 to 09, since that is where the program spends
most of its time. Note that execution begins at location 10 on the 809A
unless explicitly altered beforehand (i.e. on initializing the 809A, the
PC is always set to 10).
;
08 252 DONE: ST 2,B ; store product
09 553 B 3,B ; return to caller
;
; *** start of program *** the 809A always begins at location 10
;
.
18 008 CON 8 ;
19 000 ANS2: RES 1 ; will be (6*8)
20 712 T A,C ; rC=rA= (6*8)
21 414 S ANS1 ; rA= (6*8)-(3*4)
22 000 HLT ; display rA
LOC 29
29 000 ERR: HLT ; error halt
010 END PGM ; execution begins at PGM (10)
To load this program, the secondary registers would be loaded as shown below;
this program takes about 3m 56s to execute.
H67-809A Operations
A. Recording a program
1. Load both sides of the 809A simulator program card, unless already done.
2. Press BOOT (the 'B' key). This resets the simulator's flags and sets up
certain constants. 809A memory (the secondary registers) is preserved
(unlike the 806). It is never necessary to reload the program card
between 809A programs, just use BOOT.
3. Press f P<>S
4. Key the 809A codes into the appropriate registers (e.g. 108.720309 STO 2
will place the code into R2 (RS2 after step 6) ).
5. (optional): to record the program,* press f W/DATA and insert only track 1
of a card. Clear the "Crd" message using CLX. For 41C, see below.
6. Press f P<>S
B. Running a program
0. (optional): BOOT, f P<>S, read in track 1 of 809A codes (one side of card
only), CLX, f P<>S. For 41C, see below.
1. Load rA, rB, rC if required.* If execution is to begin at an address
other than 10, then set the PC to the address where execution is to begin.
2. For single step execution, press 'D'.
3. For normal execution, press 'E'.
4. Be patient.
(note that setting the PC is STO D, and 'D' is an executable operation)
C. Examination of registers and memory**
Halted programs: use "RCL" to examine the H67 registers A, B, C, PC, or CC.
Also, f P<>S allows examination of appropriate secondary registers (do f P<>S
when finished). When resuming from a halt, the contents of stack register X
sets rA and the CC.
Single-step operation: in this mode, the H67 will halt before each instruction
fetch and will display the PC value for the instruction about to be executed
next. So, if the value 12 is displayed in single-step mode, this means that
the instruction at location 12 is going to be executed when R/S is pressed.
When halted in this mode, the H67 registers may be examined by 'RCL' and H67
memory may be examined by f P<>S (as long as you switch back by f P<>S).
Execution can be traced easily in single-step mode by pressing R/S and watching
the PC change. Note: it is very important that the current value of the PC is
or has been recalled back into stack register X before pressing R/S.
Single-step can be turned off and normal execution enabled by pressing 'C'.
Recording an overlay
Now the main segment may be entered as described in section A of the previous section.
Note that the main 809A code segment (plus A, B, C, PC) may then be recorded onto track
1 and later read back (see steps A5 and B0 in the previous section).
Using overlays
When the main program is running, put track 2 of the card into the card reader snugly.
When the IOP is executed the card will read through and a "Crd" message will appear.
At this point press only CLX and execution will now resume in the overlay segment.
If the main program contains many pauses or halts, be careful not to insert the card
prior to a halt or the wrong IOP.
41C Usage
Operations are simpler on the 41C; all we need remember is:
* registers 00 through 09 are used by the simulator (SIZE = 026)
* registers 10 through 19 correspond to the HP-67's secondary registers and thus
also to 809A memory locations 00 to 29.
e.g. 108.720309 STO 12 puts the instructions 108, 720,
and 309 into H67 locations 02, 12, and 22.
(the HP-67 can't do this directly, so it has to use P<>S)
* registers 20 through 24 correspond to H67 registers A, B, C, PC, and CC.
In USER mode, the top row keys (A to E) correspond to ADDR, BOOT, CONT, STEP, and EXEC.
Regarding the operations given in the previous section, we have:
step A1: OK
step A2: OK
step A3: omit this step
step A4: the 809A codes can be stored directly into registers 10
through 19, as required
step A5: this step becomes: Enter 10.024 into stack register X, issue WDTAX,
then insert side 1 of the card
step A6: omit this step
step B0: this step becomes: BOOT ('B'), enter 10.024 into X, issue
RDTAX, then insert side 1 of the card
steps B1 to B4 are OK
step 1: OK
step 2: OK
step 3: omit this step
step 4: codes can be stored directly into registers 10 through 19
step 5: omit this step
step 6: this step becomes: enter 10.019 into X, issue WDTAX
step 7: this step becomes: insert only side 2
Modify the translated 41C program as follows: find the LBL 09 routine
and the PSE instruction which follows. Replace PSE with:
TONE 9
10.019
VIEW Y ( Y contains the xx of 9xx)
RDTAX
Usage: when the main program sounds the tone (via IOP), insert the
overlay (side 2 of the card). To execute an IOP without reading a
card, just respond to the tone with ← or R/S.
Avoiding problems
Since programming the H67 is essentially programming in machine language, it is
very easy to make errors. In the past year, no problem has ever been found in the
simulator itself; the usual reasons one of my programs doesn't seem to be executing
properly are:
1. the algorithm doesn't solve the problem (e.g. a few extra logic steps
need to be added).
2. left an instruction out.
3. used wrong op code or wrong address.
4. didn't key data into registers properly.
5. inadvertently changed data during halt or in step mode.
I would recommend "executing" the code by hand before it gets entered into the machine,
even if it looks okay. Lastly, observing the program in step mode will reveal where
any problems are.
The 809A executes its instructions at a rate about one million times slower than a
typical computer today, so 5-minute program times are common. Putting pauses, halts,
or error halts in a program will help discriminate between a program with a long
execution time and one that is a runaway.
H67-809A Internals
R0 = 'pos' value
R1 = the last three digits in secondary register 'reg'
R2 = the middle three digits in secondary register 'reg'
R3 = the first three digits in secondary register 'reg'
R4 = 'reg' value
R5 = constant 20
R6 = most recent effective address referenced
R7 = scratch
R8 = constant 10
R9 = constant 1000
Flags:
Routine Descriptions
The most important routine in the simulator is LBL A, which handles addressing
and memory references. It is such an important routine that it is described in
detail here. Once LBL A is understood, most of the other routines are simple
enough to be followed from the listing comments.
Important points about one other routine are described below.
When this routine is called (GSB A), these conventions are used:
The routine first computes an effective address, ea, from the address aa. If 00 <= aa <= 29,
then ea=aa; otherwise if 50 <= aa <= 79, then ea= (aa-50)+contents(rB). The first digit
of the effective address is called "pos" and the second digit plus 10 is called "reg"
(e.g. if ea=08, then pos=0 and reg=18). The value reg, when put into rI of the HP-67
indicates the secondary register in which to find the H67 memory location; in our
example 18 points to RS8. Pos isolates the correct 3-digit section of the register:
pos=0 for the first three digits, pos=1 for the middle, pos=2 for the last. Pos will
be used as a pointer to the three components of a register. Given register reg in the
format iii.jjjkkk, the LBL A routine separates the components serially into iii, jjj,
and kkk. After this, flag 0 is tested; if flag 0 is clear (fetch), the routine simply
uses pos to retrieve the appropriate component. If flag 0 is set (store), then the
contents of R7 replaces the component pointed to by pos; the components are then packed
into the iii.jjjkkk format and stored back into reg.
So where does the high speed memory come from? Note that if we're doing a fetch
operation (i.e. we don't have to repack and store), then we can return from LBL A
just as soon as the appropriate component has been extracted from the iii.jjjkkk
format. Since locations 00 to 09 all refer to the iii part which is extracted first (and
where pos=0), we can return from LBL A much earlier than normal if we detect flag 0
clear and pos=0. This also explains the special case of timing for the ST instruction.
We could go a step further and detect that if pos=1, we need not extract kkk, but the
overhead offsets any advantage.
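
The fetch and store paths of LBL A, including the early return that produces the high speed locations, can be sketched as follows. This is a reading of the description above, not a translation of the listing: registers are held as 'iii.jjjkkk' text for clarity, whereas the calculator has to separate the components arithmetically, and the names `split_address` and `access` are invented for the example.

```python
def split_address(ea):
    """pos = first digit of ea; reg = second digit plus 10."""
    pos, digit = divmod(ea, 10)
    return pos, digit + 10


def access(registers, ea, store=False, value=None):
    """Fetch or store one 3-digit word; registers maps reg -> 'iii.jjjkkk' text."""
    pos, reg = split_address(ea)
    text = registers[reg]
    iii = int(text[0:3])
    if not store and pos == 0:
        return iii                       # early return: locations 00 to 09
    parts = [iii, int(text[4:7]), int(text[7:10])]
    if not store:
        return parts[pos]                # fetch of a middle or last component
    parts[pos] = value                   # the listing takes this value from R7
    registers[reg] = f"{parts[0]:03d}.{parts[1]:03d}{parts[2]:03d}"
    return value


registers = {12: "108.720309"}           # RS2 as seen through register 12
first = access(registers, 2)             # location 02
middle = access(registers, 12)           # location 12
access(registers, 22, store=True, value=40)   # location 22 now holds 040
```

A store always unpacks and repacks all three components, which is consistent with the fixed store time mentioned in the timing notes.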
This routine simply retrieves the instruction at the address indicated by the PC. Then,
like all computers, increments the PC by one so that while the instruction at location n
is executing, the PC is at n+1. When the value of the PC, n, is presented to the LBL A
routine, we get an instruction from memory in the format iaa (3-digits), e.g. 152. R6
also contains the effective address of n, so R6+1 is really used to increment the PC.
This is being a bit cautious, since the other routines in the simulator guarantee that
any indexed branch will resolve the effective address before setting the new PC. In any
case, the first digit of iaa is used to select the routine to handle the instruction and
stack register X is set to aa. When the instruction-handling routine is entered, it
interprets the value in X according to its own needs, e.g. 7mp, 8ii, etc.
Miscellaneous
For whatever need, the only no-operation, NOP, is B .+1 --i.e. branch to the next
location. Something like T A,A (opcode 710) is not a NOP since the CC is changed.
These last two cases, 701 and 755, are good examples of a common oversight: that the
increment or decrement operation is performed on the source register first and then the
transfer to the destination register is made.
        MAIN PROGRAM:   OVERLAY SEGMENT:
RS0 =   716.800999      405.020000
RS1 =   000.715520      606.000800
RS2 =   615.713000      003.000020
RS3 =   019.208000      724.000205
RS4 =   308.500000      500.000612
RS5 =   208.308000      000.000012
RS6 =   728.208000      305.000020
RS7 =   500.708000      715.000500
RS8 =   000.500000      711.000000
RS9 =   000.108000      105.000000
You may want to code and record the overlay and the main segment
and then run the program. Each time the program halts with the
current count, enter a value and R/S. After you end the series by
0, R/S, there is sufficient time to insert side 2 of the overlay
card -- or you can wait for the IOP 99 to finish and insert the
card in the short time B .-1 is executing.
The 809A is nicer than the 806 in that it has a better instruction
set and allows an arbitrary number of overlays to be used. Is it
possible to simulate an even more powerful hypothetical computer in
the 224-step, 26-register HP-67?
The answer is most certainly yes. Aside from a bit of sloppy coding
in the 809A, I've prepared a few designs radically different from
the 809A which, when implemented, will provide a more diverse and
better designed instruction set. The 809A, however, is the most
recent simulator that I've implemented and fully tested.