Ch3 - Syntax Analysis
Ch3 - Syntax Analysis
Ch3 - Syntax Analysis
Syntax Analysis
▪ Left Factoring
▪ Context-Free Grammars versus Regular Expressions
▪ RMD for - ( id + id )
( E ) ( E )
E E E + E
- E - E
-(id+E) -(id+id)
( E ) ( E )
E + E E + E
id id id
id * E + E id
E E
id id
id id
▪ Describe the same language, the set of strings of a's and b's
ending in abb. So we can easily describe these languages either by
finite Automata or PDA.
▪ On the other hand, the language L ={anbn | n ≥1} with an equal
number of a's and b's is a prototypical example of a language that
can be described by a grammar but not by a regular expression.
▪ We can say that "finite automata cannot count" meaning that a
finite automaton cannot accept a language like {anbn | n ≥ 1} that
would require it to keep count of the number of a's before it sees
the b’s.
▪ So these kinds of languages (Context-Free Grammars) are
accepted by PDA as PDA uses stack as its memory.
Chapter – 3 : Syntax Analysis 17 Bahir Dar Institute of Technology
Context-Free Grammars versus Regular Expressions
▪ The general comparison of Regular Expressions vs. Context-
Free Grammars:
Recursive descent
Involves Back tracking Operator precedence
predictive parsing
Parsing without LR parsing
backtracking
SLR
Recursive
predictive
CLR
Non-Recursive
predictive
Or LL(1) LALR
E’ E’ → +TE’ E’ → E’ →
T T → FT’ T → FT’
T’ T’ → T’ → *FT’ T’ → T’ →
F F → id F → (E)
2. If X is , then FIRST(X)={}
3. If X is a non-terminal symbol and X → is a
production rule, then add in FIRST(X).
2. If in FIRST()
➔ for each terminal a in FOLLOW(A) add A → to M[A,a]
▪ All other undefined entries of the parsing table are error entries.
▪ Handle: A “handle” of a string is a substring of the string that matches the right
side of a production, and whose reduction to the non terminal of the production
is one step along the reverse of rightmost derivation.
▪ Handle pruning: The process of discovering a handle and reducing it to
appropriate left hand side non terminal is known as handle pruning.
E→E+E
E→E*E String: id1+id2*id3
E→id
Rightmost Derivation Right sentential form Handle Production
▪ NB: If a shift-reduce parser cannot be used for a grammar, that grammar is called
non-LR(k) grammar. An ambiguous grammar can never be an LR grammar
▪ NB: The worst case of computation of this table (on previous slide) is
O(n2). because the table entry is nXn.
▪ To decrease the cost of computation, the operator function table can be
constructed. and the cost/order of operator function table computation
is O(2n)
Chapter – 3 : Syntax Analysis 67 Bahir Dar Institute of Technology
Algorithm for constructing operator precedence functions
1. Create functions fa and ga for each a that is terminal or $.
2. Partition the symbols in as many as groups possible, in such a way
that fa and gb are in the
same group if a = b.
3. Create a directed graph whose nodes are in the groups, next for
each symbol a and b do:
a) if a <· b, place an edge from the group of gb to the group of fa
b) if a ·> b, place an edge from the group of fa to the group of gb
4. If the constructed graph has a cycle then no precedence
functions exist.
When there are no cycles collect the length of the longest paths
from the groups of fa and gb respectively.
▪ SLR, LR and LALR work same, only their parsing tables are
different.
Chapter – 3 : Syntax Analysis 72 Bahir Dar Institute of Technology
LR Parsers
▪ LR parsing is attractive because:
• LR parsers can be constructed to recognize virtually all
programming-language constructs for which context-free
grammars can be written.
• LR parsing is most general non-backtracking shift-reduce
parsing, yet it is still efficient.
• The class of grammars that can be parsed using LR methods
is a proper superset of the class of grammars that can be
parsed with predictive parsers.
• LL(1)-Grammars LR(1)-Grammars
• An LR-parser can detect a syntactic error as soon as it is
possible to do so a left-to-right scan of the input.
▪ Drawback of the LR method is that it is too much work
to construct an LR parser by hand.
• Use tools e.g. yacc
Sm
Xm
LR Parsing Algorithm output
Sm-1
Xm-1
.
.
Action Table Goto Table
S1 terminals and $ non-terminal
X1 s s
t four different t each item is
S0 a actions will be a a state number
t applied t
e e
S s
A Configuration of LR Parsing Algorithm
6) F → id 7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5
Actions of A (S)LR-Parser -- Example
For id*id+id
stack input action output
0 id*id+id$ shift 5
0id5 *id+id$ reduce by F→id F→id
0F3 *id+id$ reduce by T→F T→F b/c goto(0, F) = 3
0T2 *id+id$ shift 7 b/c goto(0, T) = 2
0T2*7 id+id$ shift 5
0T2*7id5 +id$ reduce by F→id F→id
0T2*7F10 (*) +id$ reduce by T→T*F T→T*F b/c goto(7, F) = 10
0T2 +id$ reduce by E→T E→T b/c goto(0, T) = 2
0E1 +id$ shift 6 (*) – T2*7F10 …
0E1+6 id$ shift 5 reduced by T b/c
0E1+6id5 $ reduce by F→id F→id r=3
0E1+6F3 $ reduce by T→F T→F b/c goto(6, F) =3
0E1+6T9 (**) $ reduce by E→E+T E→E+T b/c goto(6, T) = 9
(**) . E1+6T9
0E1 $ accept
reduced to E b/c r
Chapter – 3 : Syntax Analysis 79 =3
Bahir Dar Institute of Technology
Steps for Constructing SLR Parsing Tables
▪ NB: I1, I4, I5, I6 are called final items. They lead to fill the
‘reduce’/ri action in specific row of action part in a
NB: In the LR(0) construction table whenever any state having final item in
that particular row of action part put Ri completely.
egg. in row 4, put R3 , 3 is a leveled number for production in G
Chapter – 3 : Syntax Analysis 86 Bahir Dar Institute of Technology
Example of LR(0) parsing Table
▪ Step 6: check the parser by implementing using stack for string abb$
▪ Reading assignment
• LALR parser and
• CLR parser