Assembly Language I
Introduction
Inside the Central Processing Unit (CPU), all information is represented in binary form.
To use the computer we need to communicate with the CPU, but binary format is not
convenient for human users. Computer engineers therefore developed different kinds of
programming languages so that users can communicate with the CPU more easily by
writing programs.
There are high-level programming languages such as C, C++, Java, C#, and Pascal.
Assembly language is regarded as a low-level programming language because its syntax
is very close to the hardware itself. For example, if you want to do a move/copy/assign
operation, you use an instruction called MOV, whose name comes from the operation
"move". In C++, you simply write X=10 for the same move/assign.
C - Concept: we must learn the basic syntax, such as how a program statement is
written.
L - Logic thinking: programming is a problem-solving process, so we must think
logically in order to derive a solution.
P - Practice: write more programs.
Assembly language programming
The native language of the CPU is machine language, which uses 0s and 1s (binary) to
represent an operation or data. A single machine instruction can take up one or more
bytes of code. Assembly language lets you write the program using alphanumeric symbols
(mnemonics), e.g. ADD, MOV, PUSH.
After you have written your program, it is assembled (similar to being compiled) and
linked into an executable program.
The executable program could be a .com, .exe, .bin, or .hex file.
Assembly program (xxx.asm) -> Assemble -> Object file (XXX.obj) -> Link -> Executable file (XXX.exe, .com, or .bin)
When you write an assembly language program, put only one instruction in each
statement. You cannot write a compound expression such as:
A=(B+C)*100
Instead, each statement performs a single operation, e.g. ADD AX, BX
Usually each line in your assembly language program can have three parts: a label,
the instruction, and a comment.
The label (Start:) is only used to identify a location within your program. You do
not need to include a label on every line.
The word Start is the name of a label. A label's name is chosen by the user, provided
that it is not a reserved word and contains no spaces. After the name, you must include
the colon (:) to indicate that it is a label.
A comment documents the program so that a reader can understand its logic. A comment
is identified by the semicolon (;) placed in front of it.
Memory structure
The memory of a computer is organized in bytes. Each byte occupies one address,
so even- and odd-addressed bytes of data can be accessed independently.
In a program, 8-bit data is declared with the symbol db (b for byte) and 16-bit data
with dw (w for word). These symbols are called directives.
To store 16-bit data, the MSB (Most Significant Byte) is stored at the higher byte
address and the LSB (Least Significant Byte) at the lower byte address.
For example, the 16-bit value ABCD (hex) occupies two address locations, say 12344H
and 12345H: the low byte CD is stored at 12344H and the high byte AB is stored at
12345H.
[Figure: the word ABCD (hex) in memory, with AB at the higher address and CD at the lower address]
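The byte ordering described above can be modelled in a few lines of Python. This is only a sketch: the helper names store_word and load_word and the dictionary used as memory are assumptions made for illustration, not part of any assembler.

```python
def store_word(memory, address, value):
    """Store a 16-bit value at `address`, low byte first (little-endian)."""
    memory[address] = value & 0xFF             # low byte goes to the lower address
    memory[address + 1] = (value >> 8) & 0xFF  # high byte goes to the next address


def load_word(memory, address):
    """Rebuild the 16-bit value from two consecutive bytes."""
    return memory[address] | (memory[address + 1] << 8)


memory = {}  # sparse model of memory: address -> byte (hypothetical)
store_word(memory, 0x12344, 0xABCD)
print(hex(memory[0x12344]))  # byte at the lower address
print(hex(memory[0x12345]))  # byte at the higher address
```

Reading the word back with load_word(memory, 0x12344) reassembles the two bytes in the same order.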
In an 8086, only four 64K-byte segments are active at the same time: code, stack,
data, and extra.
The active segments are accessed through the segment registers: CS (code), SS (stack),
DS (data), and ES (extra). When you write your program, you sometimes need to define
the different segments explicitly.
[Figure: memory with the code, stack, data, and extra segments, pointed to by CS, SS, DS, and ES respectively]
Registers
In assembly language programming for the 8086 and other CISC-type processors, you cannot
operate on two memory locations in the same instruction. You usually need to move the
value of one location into a register and then perform your operation. After the
operation, you put the result back into the memory location. Therefore, one operation
that you will use frequently is the store (move) operation.
There are different categories of registers in a microprocessor. For the 8086, the data
registers are the ones most frequently used in programming.
There are four data registers: AX, BX, CX, and DX. All four are 16-bit, but each can
also be used as two 8-bit registers, named AH, AL, BH, BL, CH, CL, DH, and DL
(H for high, L for low). For example, the 16-bit register AX is divided into two 8-bit
registers AH and AL.
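The relationship between a 16-bit register and its two 8-bit halves can be sketched in Python with shifts and masks. The function names and the sample value 1234H are assumptions chosen for illustration.

```python
def split_register(value16):
    """Return (high, low) halves of a 16-bit value, e.g. AX -> (AH, AL)."""
    high = (value16 >> 8) & 0xFF  # upper 8 bits
    low = value16 & 0xFF          # lower 8 bits
    return high, low


def join_register(high, low):
    """Combine two 8-bit halves back into one 16-bit value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ah, al = split_register(0x1234)  # hypothetical content of AX
print(hex(ah), hex(al))
```

Writing to AL in a real program changes only the low half of AX, which corresponds to calling join_register with the old high byte and the new low byte.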
The AX register is called the accumulator and is usually used for storing the result
of an operation.
Each of the four data registers can be used as the source or destination of an
arithmetic, logic, shift, or rotate operation. In some operations, such as
multiplication, the use of the accumulator (AX) is assumed; details are given later.
In addition to the data registers, there are the pointer and index registers, all 16-bit. Some
pointer and index registers can be used as general-purpose registers, i.e. as an
operand in arithmetic or logic operations. However, most pointer and index registers have
special purposes.
The source index register (SI) and destination index register (DI) hold offset
addresses for use in indexed addressing of operands in memory (similar to a pointer in
C++ programming). In indexed addressing, both SI and DI are offsets into the current data
segment by default; in string instructions, SI refers to the current data segment and DI
refers to the current extra segment. Details can be found in the section on addressing
modes.
The index registers can also be used as source or destination registers in arithmetic and
logical operations, but only as 16-bit registers.
Data types
In 8086 assembly language the data types are simple: only 8-bit, 16-bit, and 32-bit (the
last is called a double word). You cannot declare data as integer, float, or char as in C++.
An integer can be signed or unsigned, in byte-wide or word-wide format.
For a signed integer, the most significant bit gives the sign (0 for positive, 1 for
negative).
For example, the value 1001 0100 is negative if it is treated as a signed value.
The range of a signed byte (8-bit) is from -128 to +127.
The range of a signed word (16-bit) is from -32768 to +32767.
More recent microprocessors also support 64-bit or even 128-bit data.
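The signed ranges and the sign-bit rule above can be checked with a short Python sketch, assuming two's complement representation as used by the 8086. The function names to_signed and signed_range are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    value &= (1 << bits) - 1      # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)    # the most significant bit
    return value - (1 << bits) if value & sign_bit else value


def signed_range(bits):
    """Return (minimum, maximum) of a signed integer of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


print(to_signed(0b10010100, 8))  # the example bit pattern 1001 0100
print(signed_range(8))
print(signed_range(16))
```

The same bit pattern read as an unsigned byte is simply 0b10010100 with no subtraction applied, which is why the assembler needs no separate signed and unsigned data types.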
The above shows a very simple 8086 assembly language program. You can see the basic
syntax used in the program and how the code segment is defined using the .code
keyword.
The flow of the program is top-down, i.e. from start to end, and only one statement is
executed at a time.
Example
RET ; return
START ENDP ; define the end of the START procedure
CSEG ends
End start
Start Proc Far and RET are used to define the start and end of the main program.
Another example
Stacksg segment
. ; define the stack segment
Stacksg ends
Datasg segment
; declare data inside the data segment
Datasg ends
Codesg segment
Main proc far ;
assume ss:stacksg, ds: datasg, cs:codesg
mov ax, datasg
mov ds, ax
.
mov ax, 4c00H
int 21H
Main endp
Codesg ends
end main
PROC defines a procedure inside the code segment. Each procedure (function) must
be identified by a unique name. At the end of the procedure, you must include the
keyword ENDP.
FAR is related to program execution. When you request execution of a program, the
program loader uses this procedure as the entry point for the first instruction to execute.
ASSUME associates the name of a segment with a segment register. It tells the assembler
which segment each register is expected to address; it does not itself load the register.
assume ss:stacksg, ds: datasg, cs:codesg
In the above, the SS register is associated with the stacksg segment (which is the stack).
The register must still hold the address of the segment at run time, so in some
assemblers you need to move the base address of a segment into the segment register
yourself, as the instructions mov ax, datasg and mov ds, ax do for DS in the example.
END ends the entire program and appears as the last statement. Usually the name of the
first or only PROC designated as FAR is put after END.
Fortunately, if you are doing something simple you do not need to include all the
segment declarations in the program. For example:
start:
mov DL, 0H ; move 0H to DL
mov CL, op1 ; move op1 to CL
mov AL, data ; move data to AL
step:
cmp AL, op1 ; compare AL and op1
jc label1 ; if carry =1 jump to label1
sub AL, op1 ; AL = AL - op1
inc DL ; DL = DL+1
jmp step ; jump to step
label1:
mov AH, DL ; move DL to AH
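Read as a whole, the short program above appears to divide data by op1 using repeated subtraction: DL counts how many times op1 fits, and what is left in AL is the remainder. The Python sketch below models that reading under the assumption that data and op1 are unsigned bytes; the function name is hypothetical.

```python
def divide_by_subtraction(data, op1):
    """Model of the loop: returns (AH, AL) = (quotient, remainder)."""
    if not (0 <= data <= 0xFF and 0 < op1 <= 0xFF):
        # With op1 = 0 the carry is never set, so the assembly loop would not end.
        raise ValueError("data must be 0..255 and op1 must be 1..255")
    dl = 0            # mov DL, 0H
    al = data         # mov AL, data
    while al >= op1:  # cmp AL, op1 / jc label1: leave the loop when AL < op1
        al -= op1     # sub AL, op1
        dl += 1       # inc DL
    ah = dl           # mov AH, DL
    return ah, al


print(divide_by_subtraction(17, 5))
```

The carry flag after cmp is set exactly when AL is smaller than op1 as unsigned values, which is the condition the while loop tests in reverse.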
The emu8086 package includes a tutorial and a reference for the complete instruction set.
It is used during the lectures to demonstrate the program examples.
Keil - www.keil.com
An assembly language program can be more efficient: it can take up less memory
space and run faster. In real-time applications, assembly may be required
because a program written in a high-level language might not respond
quickly enough. The syntax differs between microprocessors, but the concepts
are the same, so once you learn assembly programming for one microprocessor you
can more easily program other kinds of systems. For example, programming the 8051
series is very similar to programming the 8086.
[name] Dn expression
FLDC DB 21, 22, 23, 34 ; the data are stored in adjacent bytes
DUP (duplicate)
DUP can be used to define multiple storage locations
DB 10 DUP (?) ; defines 10 uninitialized bytes
DB 5 DUP (12) ; 5 bytes, all initialized to 12
String:
DB 'this is a test'
EQU: this directive does not define a data item; instead, it defines
a value that the assembler substitutes into other instructions
(similar to defining a constant in C++ programming or using #define)
factor EQU 12
mov CX, factor
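To picture what the directives above reserve in memory, the Python sketch below builds the same byte sequences. The helpers db and dup are hypothetical stand-ins for the directives, and uninitialized bytes (the ? form) are modelled as zeros purely as an assumption, since their real content is undefined.

```python
def db(*values):
    """Model of DB: the given values stored in adjacent bytes."""
    return bytes(values)


def dup(count, value=0):
    """Model of `count DUP (value)`; `?` is modelled as 0 (an assumption)."""
    return bytes([value]) * count


FLDC = db(21, 22, 23, 34)    # FLDC DB 21, 22, 23, 34
buffer10 = dup(10)           # DB 10 DUP (?)
filled5 = dup(5, 12)         # DB 5 DUP (12)
message = b"this is a test"  # DB 'this is a test', one byte per character
FACTOR = 12                  # factor EQU 12: a constant, no storage reserved

print(list(FLDC), len(buffer10), list(filled5), len(message))
```

Unlike the DB lines, FACTOR occupies no byte in the data: the assembler simply replaces the name with 12 wherever it appears, as in mov CX, factor.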
Addressing modes
When using different addressing modes, you must clearly understand how the offset
address is being calculated. With different kinds of addressing mode, the offset address
may be evaluated with different components, while the base address could come
from either the data segment or the extra segment.
Example:
In the above, the var1 can be regarded as a variable. In order to get the data represented
by var1, a memory read cycle is needed. Data is assumed to be stored in the data segment
(DS) and content of DS should be used as the segment address.
This addressing mode is for transferring a byte or a word between a register and a
memory location addressed by an index or pointer register.
The effective address (EA) is stored either in a pointer register or an index register
The pointer register can be either the base register BX or base pointer register BP.
The index register can be the source index register SI, or the destination index register DI
The default segment is either DS or ES.
Refer to the following example
01236 19
01235 18
01234 20
01233
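E.g. MOV [BX+SI], AL
Moves the value in AL to the memory location whose offset is BX+SI in the data segment
(physical address = DS * 10H + BX + SI)
If BP is used as the base register, the SS register is used instead of DS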
The base register (BX) often holds the beginning location of a memory array, while the
index register (SI) holds the relative position of an element in the array
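As a rough illustration of how the location in MOV [BX+SI], AL is formed, the Python sketch below adds the offset components and combines them with the segment register. On the 8086 the segment value is shifted left by 4 bits (multiplied by 10H), the offset wraps at 16 bits, and the result is a 20-bit physical address. The function name and the register values are hypothetical.

```python
def physical_address(segment, *offset_parts):
    """Combine a segment register with one or more offset components."""
    offset = sum(offset_parts) & 0xFFFF           # the offset is a 16-bit quantity
    return ((segment << 4) + offset) & 0xFFFFF    # 20-bit physical address


# Hypothetical register contents: DS=1000H, BX=0200H, SI=0034H
ds, bx, si = 0x1000, 0x0200, 0x0034
print(hex(physical_address(ds, bx, si)))
```

The same helper covers the other modes by changing the components: one register for register indirect addressing, or base, index, and displacement together for the three-component case.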
This addressing mode moves a byte or a word between a register and a memory location
addressed by an index or base register plus a displacement.
This addressing mode transfers a byte or a word between a register and the memory
location addressed by a base register and an index register plus a displacement, so the
offset has three components.
It combines the based addressing mode and the indexed addressing mode.
In register indirect addressing mode, such as MOV AL, [SI], the value of SI represents an
address. How do we move the address of a variable into a register?
This is achieved by the instruction LEA (load effective address).
LEA is similar to the following C++ syntax
int* x ;
x = &y ; // assign the address of y to pointer x
Syntax of LEA
LEA SI, ARRAY ; move the address of variable ARRAY to the SI register
MOV AL, [SI] ; value of 12 is moved to AL
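The difference between loading an address and loading the value stored there can be modelled in Python. In this sketch the data segment is a bytearray, and the symbol table, the offset of ARRAY, and the bytes after the first one are all hypothetical; only the first element, 12, comes from the example above.

```python
data_segment = bytearray(16)             # small model of the data segment
symbols = {"ARRAY": 4}                   # hypothetical offset assigned to ARRAY
data_segment[4:7] = bytes([12, 34, 56])  # ARRAY contents; only the 12 is from the text


def lea(name):
    """Model of LEA reg, name: return the offset, not the stored value."""
    return symbols[name]


def load_byte(offset):
    """Model of MOV AL, [reg]: read the byte stored at that offset."""
    return data_segment[offset]


si = lea("ARRAY")   # LEA SI, ARRAY
al = load_byte(si)  # MOV AL, [SI]
print(si, al)
```

Adding 1 to si and calling load_byte again would step to the next element, which is how an index register is used to walk through an array.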
Example
1. Select an instruction for each of the following tasks:
Copy content of BL to CL
Copy content of DS to AX
mov LIST[SI], DX
mov CL, LIST[BX+SI]
mov CH, [BX+SI]
The string instructions of the 8086 instruction set automatically use the source index
register (SI) and the destination index register (DI) to specify the effective addresses
of the source and destination operands, respectively. The source is taken from the data
segment (DS) and the destination from the extra segment (ES).
The instruction is MOVS.
No operand is written after MOVS.
You don't need to specify the registers, but SI and DI are used during program
execution, so you must set the values of SI and DI before you use MOVS.
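What a single byte-sized MOVS does can be sketched in Python as follows. The register dictionary, the memory dictionary, and the sample values are hypothetical; the direction flag DF, which the text above does not discuss, decides whether SI and DI move up or down, and it is assumed to be 0 (upward) unless set.

```python
def movsb(memory, regs):
    """Copy one byte from DS:SI to ES:DI, then advance SI and DI."""
    src = ((regs["DS"] << 4) + regs["SI"]) & 0xFFFFF  # source physical address
    dst = ((regs["ES"] << 4) + regs["DI"]) & 0xFFFFF  # destination physical address
    memory[dst] = memory.get(src, 0)
    step = -1 if regs.get("DF", 0) else 1             # DF=0 means count upward
    regs["SI"] = (regs["SI"] + step) & 0xFFFF
    regs["DI"] = (regs["DI"] + step) & 0xFFFF


# Hypothetical setup: SI and DI must be loaded before the string move.
regs = {"DS": 0x1000, "SI": 0x0010, "ES": 0x2000, "DI": 0x0020}
memory = {0x10010: 0x41}
movsb(memory, regs)
print(hex(memory[0x20020]), hex(regs["SI"]), hex(regs["DI"]))
```

Because the instruction updates SI and DI itself, calling movsb repeatedly copies consecutive bytes without any further address arithmetic in the program.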
Exercises
1. Compute the physical address for the specified operand in each of the following
instructions:
MOV [DI], AX (destination operand)
MOV DI, [SI] (source operand)
MOV XYZ[DI], AH (destination operand)
Given CS=0A00, DS=0B00, SI=0100, DI=0200,
BX=0300, XYZ=0400
2. Express the following decimal numbers as unpacked and packed BCD bytes (BCD:
binary-coded decimal)
a. 29 b. 88
3. How would the BCD numbers be stored in memory starting at address 0B000?
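As background for the BCD exercises, the Python sketch below shows the two encodings on a different number, 47. In unpacked BCD each decimal digit occupies one byte; in packed BCD two digits share one byte, one per 4-bit half. The function names are hypothetical, and the digits are listed most significant first, which says nothing about the order in which they are placed in memory.

```python
def unpacked_bcd(number):
    """One byte per decimal digit, most significant digit first."""
    if number < 0:
        raise ValueError("only non-negative numbers are handled in this sketch")
    return [int(digit) for digit in str(number)]


def packed_bcd(number):
    """Two decimal digits per byte, most significant pair first."""
    digits = unpacked_bcd(number)
    if len(digits) % 2:      # pad to an even number of digits
        digits.insert(0, 0)
    return [(digits[i] << 4) | digits[i + 1] for i in range(0, len(digits), 2)]


print([hex(b) for b in unpacked_bcd(47)])
print([hex(b) for b in packed_bcd(47)])
```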
Example
Determine what is being moved in each MOV statement of the following assembly
program.
dat ends