
SLIDES - ICT444!1!17 (Compatibility Mode)


ICT 444 COMPILERS AND TRANSLATORS

Instructor: R D Appiah
Introduction to Compilers
Simply stated, a compiler is a programme that
reads a programme written in one language –
the source language – and translates it into an
equivalent programme in another language –
the target language.
Illustration:

Source Programme --> COMPILER --> Target Programme
                        |
                        v
                 Error Messages
R D Appiah
A Language Processing System
Skeletal source programme
  -> PREPROCESSOR
Source programme
  -> COMPILER
Target assembly programme
  -> ASSEMBLER
Re-locatable machine code
  -> LOADER / LINK EDITOR (also takes library and re-locatable object files)
Absolute machine code

Analysis of a source programme
In compiling, analysis of the source programme
consists of three phases:
1. Linear Analysis: This is the process in
which the stream of characters making up the
source programme is read from left-to-right and
grouped into tokens – which are sequences of
characters having a collective meaning.
2. Hierarchical Analysis: This is where the
stream of characters or tokens is grouped
hierarchically into nested collections with
collective meaning.
3. Semantic Analysis: This is where certain
checks are performed to ensure that the
components of a programme fit together
meaningfully.
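
As a minimal sketch of linear analysis, the Python code below reads a line of source text from left to right and groups its characters into tokens. The token classes (NUMBER, IDENT, OP) and the function name tokenize are assumptions made for this illustration; they are not taken from the slides.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*/;()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Scan text left to right and group characters into (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # white space separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("b = a + 2;"))
```

Hierarchical analysis would then group these tokens into nested structures, for example an assignment whose right-hand side is an addition.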

Symbol-Table Management
An essential function of a compiler is to record
the identifiers used in the source programme
and collect information about various attributes
of each identifier.
For variables, the attributes may provide
information about the storage allocated for the
identifier, its type, and its scope.
In the case of procedure names, the attributes
provide such things as the number and types of
the procedure's arguments, the method of passing
each argument, and the type returned, if any.
A Symbol Table is a data structure containing a
record for each identifier, with fields for the
attributes of the identifier.
This data structure allows us to find the record
for each identifier quickly and to store or
retrieve data from that record quickly.
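
One possible shape for such a table is sketched below in Python, using a dictionary (hash table) so that records can be found, stored, and retrieved quickly. The class names SymbolRecord and SymbolTable, the field names, and the sample identifiers are all assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SymbolRecord:
    """One record per identifier; the fields hold its attributes."""
    name: str
    kind: str                       # "variable" or "procedure"
    type: str                       # variable type, or return type of a procedure
    scope: str = "global"
    address: Optional[int] = None   # storage location (variables only)
    param_types: List[str] = field(default_factory=list)  # procedures only


class SymbolTable:
    def __init__(self):
        self._records = {}          # hash table keyed by identifier name

    def insert(self, record):
        if record.name in self._records:
            raise KeyError(f"identifier {record.name!r} is already declared")
        self._records[record.name] = record

    def lookup(self, name):
        """Return the record for name, or None if it has not been recorded."""
        return self._records.get(name)


table = SymbolTable()
table.insert(SymbolRecord("a", "variable", "int", address=0))
table.insert(SymbolRecord("largest", "procedure", "int", param_types=["int", "int"]))
print(table.lookup("a"))
print(table.lookup("largest").param_types)
```

A real compiler would usually keep one table per scope, or chain tables together, so that the same name can be declared in nested scopes.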

Preprocessors: Preprocessors produce
input to compilers and they may perform the
following functions:
1. Macro processing: A preprocessor may
allow a user to define macros that are
shorthands for longer constructs.
Macro processors deal with two kinds of
statements: macro definition and macro use.
Macro definition is normally indicated by some
unique character or keyword like define or
macro, with formal parameters in its definition.
The use of a macro, on the other hand, consists
of naming the macro and supplying actual
parameters, i.e. values for its formal parameters.

2. File Inclusion: This involves a
preprocessor that includes header files in a
programme text.

3. “Rational” processors: These are
processors that augment older languages with
more modern flow-of-control and data-
structuring facilities.
4. Language extensions: These are
processors that attempt to add capabilities to
the language by what amounts to built-in
macros.
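
Macro definition and macro use can be illustrated with the small Python sketch below. The one-line syntax "define NAME(formals) body" and the function names instantiate and expand_macros are assumptions chosen for this example; real preprocessors differ in syntax and handle many more cases (nested calls, multi-line bodies, and so on).

```python
import re

# Assumed definition syntax: define NAME(formal1, formal2) body
DEFINITION = re.compile(r"define\s+(\w+)\(([^)]*)\)\s+(.*)")


def instantiate(formals, body, actuals):
    """Replace each formal parameter in body with the matching actual parameter."""
    if len(formals) != len(actuals):
        raise ValueError("wrong number of actual parameters")
    if not formals:
        return body
    bindings = dict(zip(formals, actuals))
    pattern = r"\b(" + "|".join(re.escape(f) for f in formals) + r")\b"
    return re.sub(pattern, lambda m: bindings[m.group(1)], body)


def expand_macros(lines):
    """Record macro definitions and expand every later macro use."""
    macros = {}   # macro name -> (formal parameter names, body text)
    output = []
    for line in lines:
        definition = DEFINITION.fullmatch(line.strip())
        if definition:                      # macro definition: remember it
            name, params, body = definition.groups()
            formals = [p.strip() for p in params.split(",") if p.strip()]
            macros[name] = (formals, body)
            continue
        for name, (formals, body) in macros.items():   # macro use: expand it
            def replace_use(match, formals=formals, body=body):
                text = match.group(1).strip()
                actuals = [a.strip() for a in text.split(",")] if text else []
                return instantiate(formals, body, actuals)
            line = re.sub(rf"\b{re.escape(name)}\(([^()]*)\)", replace_use, line)
        output.append(line)
    return output


source = ["define SQUARE(x) ((x) * (x))", "y = SQUARE(a);"]
print(expand_macros(source))
```

Here x is the formal parameter in the definition, and a is the actual parameter supplied at the point of use.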

Assemblers
Assembly code is a mnemonic version of
machine code, in which names are used
instead of binary codes for operations, and
names are also given to memory addresses.
Some compilers produce assembly code that is
passed to an assembler for further processing.
On the other hand, other compilers perform the
job of the assembler themselves, thus producing
re-locatable machine code that can be passed
directly to the loader/link editor.
A typical sequence of assembly instructions:
MOV a, R1
ADD #2, R1
MOV R1, b
The above code moves the contents of the
address a into register 1, then adds the
constant 2 to it, treating the contents of register
1 as a fixed-point number, and finally stores the
result in the location named by b.
Thus, it computes an expression like
b = a + 2;
in a language like C.
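
To make the effect of the three instructions concrete, here is a small Python sketch that interprets them. The function name run, the use of dictionaries for memory and registers, and the starting value 5 for a are assumptions for illustration; only the MOV and ADD forms shown above are handled.

```python
import re


def run(program, memory):
    """Interpret 'MOV src, dst' and 'ADD src, dst' lines against a memory dict."""
    registers = {}

    def is_register(operand):
        return re.fullmatch(r"R\d+", operand) is not None

    def read(operand):
        if operand.startswith("#"):      # immediate constant such as #2
            return int(operand[1:])
        if is_register(operand):
            return registers[operand]
        return memory[operand]           # named memory location

    def write(operand, value):
        if is_register(operand):
            registers[operand] = value
        else:
            memory[operand] = value

    for line in program:
        opcode, rest = line.split(None, 1)
        source, destination = [part.strip() for part in rest.split(",")]
        if opcode == "MOV":
            write(destination, read(source))
        elif opcode == "ADD":
            write(destination, read(destination) + read(source))
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return memory


memory = run(["MOV a, R1", "ADD #2, R1", "MOV R1, b"], {"a": 5, "b": 0})
print(memory["b"])
```

With a holding 5, location b ends up holding 7, which matches b = a + 2.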
A Pass
A pass refers to the process of reading an input file
once and, in some cases, writing an output file.

Two-Pass Assembly
The simplest form of assembler makes two
passes over the input.
In the first pass, all the identifiers that denote
storage locations are found and stored in a
symbol table, which is usually different from that
of the compiler.
Here, identifiers are assigned storage locations
as they are encountered for the first time.
After reading the illustration above, and on the
assumption that a word consists of four bytes,
the symbol table might contain the following
entries:

IDENTIFIER ADDRESS
a 0
b 4
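
The first pass can be sketched in Python as follows. The function name first_pass and the rule used to tell registers (R followed by digits) and immediates (a leading #) apart from identifiers are assumptions made for this example.

```python
import re

WORD_SIZE = 4  # bytes per word, as assumed in the text


def first_pass(program):
    """Assign a storage address to each identifier the first time it is seen."""
    symbols = {}
    next_address = 0
    for line in program:
        _mnemonic, operands = line.split(None, 1)
        for operand in operands.split(","):
            operand = operand.strip()
            is_register = re.fullmatch(r"R\d+", operand) is not None
            is_immediate = operand.startswith("#")
            if not is_register and not is_immediate and operand not in symbols:
                symbols[operand] = next_address
                next_address += WORD_SIZE
    return symbols


print(first_pass(["MOV a, R1", "ADD #2, R1", "MOV R1, b"]))
```

For the three instructions of the illustration this yields address 0 for a and address 4 for b, as in the table.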

In the second pass, the assembler scans the
input again.
This time, it translates each operation code into
the corresponding sequence of bits
representing that operation in machine
language.
After this, it translates each identifier
representing a location into the address given
for that identifier in the symbol table.
The output of the second pass is usually
re-locatable machine code, which means that it
can be loaded starting at any location L in
memory.
The following is a hypothetical machine code
into which the assembly instructions might be
translated:
0001 01 00 00000000
0011 01 10 00000010
0010 01 00 00000100
Here, we take it that the first four bits are the
instruction code, with 0001, 0010, and 0011
standing for load, store, and add, respectively.
The next two bits designate a register, and 01
refers to register 1 in each of the three
instructions above.
The two bits after that represent a “tag”, with 00
standing for the ordinary address mode, where
the last eight bits refer to a memory address.
The tag 10 stands for the “immediate” mode,
where the last eight bits are taken literally as
the operand (as in the second instruction).
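
The second pass for this hypothetical machine can be sketched in Python as below. It assumes the bit layout just described (4-bit instruction code, 2-bit register, 2-bit tag, 8-bit operand) and takes the symbol table with a at address 0 and b at address 4 as given. The names encode and second_pass are illustrative, and only the three instruction forms used in the example are supported.

```python
import re

OPCODES = {"load": "0001", "store": "0010", "add": "0011"}
ADDRESS_TAG = "00"     # last eight bits are a memory address
IMMEDIATE_TAG = "10"   # last eight bits are the operand itself


def is_register(operand):
    return re.fullmatch(r"R\d+", operand) is not None


def encode(line, symbols):
    """Translate one assembly line into its bit pattern."""
    mnemonic, rest = line.split(None, 1)
    source, destination = [part.strip() for part in rest.split(",")]
    if mnemonic == "MOV" and is_register(destination):
        operation, register, operand = "load", destination, source
    elif mnemonic == "MOV" and is_register(source):
        operation, register, operand = "store", source, destination
    elif mnemonic == "ADD" and is_register(destination):
        operation, register, operand = "add", destination, source
    else:
        raise ValueError(f"unsupported instruction: {line}")
    if operand.startswith("#"):
        tag, value = IMMEDIATE_TAG, int(operand[1:])
    else:
        tag, value = ADDRESS_TAG, symbols[operand]   # look up the identifier's address
    register_number = int(register[1:])
    return f"{OPCODES[operation]} {register_number:02b} {tag} {value:08b}"


def second_pass(program, symbols):
    return [encode(line, symbols) for line in program]


symbols = {"a": 0, "b": 4}
for word in second_pass(["MOV a, R1", "ADD #2, R1", "MOV R1, b"], symbols):
    print(word)
```

The three printed lines reproduce the machine code shown earlier: a load from address 0, an add of the immediate value 2, and a store to address 4.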
