Predictive Parsing: Recall The Main Idea of Top-Down Parsing
Predictive Parsing: Recall The Main Idea of Top-Down Parsing
Predictive Parsing: Recall The Main Idea of Top-Down Parsing
1
Predictive parsing
If we have two productions: A , we want a
distinct way of choosing the correct one.
Define:
for G, x FIRST() iff * x
If FIRST() and FIRST() contain no common
symbols, we will know whether we should choose
A or A by looking at the lookahead symbol.
2
Predictive parsing
Compute FIRST(X) as follows:
if X is a terminal, then FIRST(X)={X}
if X is a production, then add to FIRST(X)
if X is a non-terminal and XY1Y2...Yn is a production, add
FIRST(Yi) to FIRST(X) if the preceding Yjs contain in their
FIRSTs
3
Predictive parsing
What if we have a "candidate" production A
where = or *?
We could expand if we knew that there is some
sentential form where the current input symbol
appears after A.
Define:
for AN, xFOLLOW(A) iff S*Ax
4
Predictive parsing
Compute FOLLOW as follows:
FOLLOW(S) contains EOF
For productions AB, everything in FIRST() except
goes into FOLLOW(B)
For productions AB or AB where FIRST() contains
, FOLLOW(B) contains everything that is in FOLLOW(A)
5
Predictive parsing
Armed with
FIRST
FOLLOW
6
Predictive parsing (w/table)
For each production A do:
For each terminal a FIRST() add A to entry M[A,a]
If FIRST(), add A to entry M[A,b] for each terminal
b FOLLOW(A).
If FIRST() and EOFFOLLOW(A), add A to
M[A,EOF]
Use table and stack to simulate recursion.
7
Recursive Descent Parsing
Basic idea:
Write a routine to recognize each lhs
This produces a parser with mutually recursive routines.
Good for hand-coded parsers.
Example:
AaB | b will correspond to
A() {
if (lookahead == 'a')
match('a');
B();
else if (lookahead == 'b')
match ('b');
else error();
} 8
Building a parser
Original grammar:
EE+E
EE*E
E(E)
Eid
This grammar is left-recursive, ambiguous and requires left-
factoring. It needs to be modified before we build a predictive
parser for it:
Remove ambiguity: Remove left recursion:
EE+T ETE'
TT*F E'+TE'|
F(E) TFT'
Fid T'*FT'|
F(E)
Fid
9
Building a parser
ETE'
The grammar:
E'+TE'|
TFT'
T'*FT'|
F(E)
Fid
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E') = {+, }
FIRST(T') = {*, }
FOLLOW(E) = FOLLOW(E') = {$, )}
FOLLOW(T) = FOLLOW(T') = {+, $, )}
FOLLOW(F) = {*, +, $, )}
+ * ( ) id $
E ETE' ETE'
E' E'+TE' E' E'
T TFT' TFT'
T' T' T'*FT' T' T'
F F(E) Fid
+ match
* match
( match
) match
id match
$ accept
11
Parsing table
Parse the input id*id using the parse table and a stack
12
Recursive descend parser
parse() { Eprime() {
token = get_next_token(); if (token == '+')
if (E() and token == '$') then token=get_next_token()
then return true if (T())
else return false
then return Eprime()
}
else return false
E() { else if (token==')' or token=='$')
if (T()) then return true
then return Eprime() else return false
else return false }
}
14
LL(1) parsing
For example, the following grammar will have two
possible ways to expand S' when the lookahead is
else.
S if E then S S' | other
S' else S |
E id
15
LL(1) parsing
Here's an example of a grammar that is NOT LL(k) for
any k:
S Ca | Cb
C cC | c
Why? Suppose the grammar was LL(k) for some k. Consider the
input string ck+1a. With only k lookaheads, the parser would not
be able to decide whether to expand using S Ca or S Cb
Note that the grammar is actually regular: it generates strings
of the form c+(a|b)
16
Error detection in LL(1) parsing
An error is detected whenever an empty table slot is
encountered.
We would like our parser to be able to recover from
an error and continue parsing.
Phase-level recovery
We associate each empty slot with an error handling
procedure.
Panic mode recovery
Modify the stack and/or the input string to try and reach
state from which we can continue.
17
Error recovery in LL(1) parsing
Panic mode recovery
Idea:
Decide on a set of synchronizing tokens.
19