Nothing Special   »   [go: up one dir, main page]

Chapter 3 - Part2

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 26

Chapter 3

Part 2

Describing Syntax
and Semantics
Ambiguity in Grammars

• A grammar is ambiguous if and only if it


generates sentences that have two or more
distinct parse trees

• Ambiguity is undesirable because it can


lead to incorrect execution (since code
generation depends on parse tree)
– If a language structure has more than one parse
tree, then the meaning of the structure cannot
be determined uniquely.
– The meaning of the syntactic structure can be
determined from its parse tree.
1-2
An Ambiguous Expression Grammar

<assign>  <id> = <expression>


<id>  A | B| C
<expression>  < expression > + <expression>
| < expression > * <expression>
| (<expression>)
| <id>

A=B +C *A
Which operation is
A=B +C *A done first?

1-3
<assign>  <id> = <expression>
<id>  A | B| C
<expression>  < expression > + <expression>
| < expression > * <expression>
| (<expression>) | <id>

< assign > < assign >

<id> = <expression> <id> = <expression>

A < expression > + <expression> A < expression > <expression>


*

<id> <id>
< expression > * <expression> < expression > + <expression>

B <id> <id> <id> <id> A

C A B C

A=B+C*A
What gets evaluated first the * or +?
1-4
<assign>  <id> = <expression>
<id>  A | B| C
<expression>  < expression > + <expression>
| < expression > * <expression>
| (<expression>) | <id>

A=3; B=4; C=5;


< assign >
A=B +C *A < assign >

<id> = <expression> <id> = <expression>

A < expression > + <expression> A < expression > <expression>


*

<id> <id>
< expression > * <expression> < expression > + <expression>

B <id> <id> <id> <id> A

C A B C

A=B+C*A
19 = 4 + 5 * 3 1-5
<assign>  <id> = <expression>
<id>  A | B| C
<expression>  < expression > + <expression>
| < expression > * <expression>
| (<expression>) | <id>
A=3; B=4; C=5;

< assign >


A=B +C *A < assign >

<id> = <expression> <id> = <expression>

A < expression > + <expression> A < expression > <expression>


*

<id> <id>
< expression > * <expression> < expression > + <expression>

B <id> <id> <id> <id> A

C A B C

A=B+C*A A=B+C*A
19 = 4 + 15 27 = 9 * 3 1-6
An ambiguous Expression Grammar
(Operator Precedence)
• The problem we just saw relates to Operator
Precedence.
• A grammar can be written to ensure that the
addition and multiplication operators are
consistently in a higher-to-lower ordering,
respectively in the parse tree.
• The correct ordering is specified by using
separate abstractions for the operands of the
operators that have different precedence.
• This requires additional non-terminals and rules

1-7
An ambiguous Expression Grammar
(Operator Precedence)
Ambiguous:

<assign>  <id> = <expression>


<id>  A | B| C
<expression>  < expression > + <expression>
| < expression > * <expression>
| (<expression>) | <id>
Unambiguous (generates the same language but indicates the precedence of *
over +):

<assign>  <id> = <expression>


<id>  A | B| C
<expression>  <expression> + <term> | <term>
<term>  <term> * <factor> | <factor>
<factor>  (<expression>) | <id>
Forces * to be lower than +
1-8
An ambiguous Expression Grammar
(Operator Precedence)
<assign>  <id> = <expression>
<id>  A | B| C
<expression>  <expression> + <term> | <term>
<term>  <term> * <factor> | <factor>
<factor>  (<expression>) | <id>

< assign >


A=B +C *A
<id> = <expression>

A <expression> + <term>

< term >


< term > * < factor >

< factor > < factor > <id>

<id> <id> A

1-9
B C
An Unambiguous Expression Grammar

• If we use the parse tree to indicate


precedence levels of the operators, we
cannot have ambiguity.

• For example: an operator generated lower


in the parse tree can be used to indicate
that it has precedence over an operator
produced higher up in the tree (i.e. it must
be evaluated first)

1-10
Parse trees and derivations

• Parse trees can be constructed from


derivations and vice versa
• Every derivation with an unambiguous
grammar has a unique parse tree, although
that tree can be represented by different
derivations.
• For example the leftmost and rightmost
derivations of the sentence A = B + C * A
are represented by the same parse tree
shown previously
1-11
Leftmost derivation of A = B + C * A
<assign>  <id> = <expression>
<assign> => <id> = <expr>
<id>  A | B| C
<expression>  <expression> + <term> | <term> A = <expr>
<term>  <term> * <factor> | <factor>
A = <expr> + <term>
<factor>  (<expression>) | <id>
A = <term> + <term>
< assign >
A = <factor> + <term>
A = <id> + <term>
<id> = <expression>
A = B + <term>
A <expression> + <term> A = B + <term> * <factor>
A = B + <factor> * <factor>
< term >
< term > * < factor > A = B + <id> * <factor>
< factor > < factor > <id> A = B + C * <factor>

<id>
A = B + C * <id>
<id> A
A = B + C *A
B C
1-12
Rightmost derivation of A = B + C * A
<assign>  <id> = <expression> <assign> => <id> = <expr>
<id>  A | B| C
A = <expr>
<expression>  <expression> + <term> | <term>
<term>  <term> * <factor> | <factor> A = <expr> + <term>
<factor>  (<expression>) | <id>
A = <expr> + <term>*<factor>
< assign >
A = <expr> + <term>*<id>
A = <expr> +<term>*A
<id> = <expression>
A = <expr> + <factor>*A
A <expression> + <term> A = <expr> + <id>*A
A = <expr> + C*A
< term >
< term > * < factor > A = <term> + C*A
< factor > < factor > <id> A = <factor> + C*A

<id>
A = <id> + C*A
<id> A
A = B + C *A
B C

1-13
Associativity of Operators

• When an expression includes two


operators that have the same precedence
(as * and / ), a semantic rule is required to
specify which should have precedence.
This rule is named associativity.
• For example:
A/B*C

1-14
Left Associativity
• When a grammar rule has its LHS also appears at the beginning
of its RHS, the rule is said to be left recursive.
• Left recursion specify operator left associativity.
• Given 2 left associative operators with equal precedence the first
will be lower in the parse tree and hence will be evaluated first.

<assign>  <id> = <expression>


<id>  A | B| C
<expression>  <expression> + <term> | <expression> - <term> | <term>
<term>  <term> * <factor> | <term> / <factor> | <factor>
<factor>  (<expression>) | <id>

Which one is evaluated first? How about?


A=B+C+A A=B/C-A
A=B+C -A
A=B*C*A
A=B/C*A

1-15
Right Associativity
• When a grammar rule has its LHS also appears at the end
of its RHS, the rule is said to be right recursive.
• Right recursion specify operator right associativity.
• Given 2 right associative operators with equal precedence
the second will be lower in the parse tree and hence will
be evaluated first.

<assign>  <id> = <expression>


<id>  A | B| C
<expression>  <id> ** < expression > | <id>

Which one is evaluated first?


A = B ** C ** A
1-16
<assign>  <id> = <expression>
<id>  A | B| C
<expression>  <id> ** < expression > | <id> Tree is skewed
< assign > right
A = B ** C ** A
<id> = <expression>

A
A = 2, B = 3, C = 4 <id> ** <expression>

A = 3**(4**2) = 3**16 = 43046721


< id > ** <expression>
B
<id>
C
A

1-17
<assign>  <id> = <expression> Tree is skewed
<id>  A | B| C
< assign > left
<expression>  < expression >**<id> |
<id>
<id> = <expression>
A = B ** C ** A
A <expression> ** <id>

A = 2, B = 3, C = 4
< expresion > ** <id>
A = (3**4)**2 = 6561 A

C
<id>

1-18
If Then Else: Ambiguous Grammar

<if_stmt>  if <logic_expr> then <stmt>


| if <logic_expr> then <stmt> else <stmt>

In the case when <stmt>  <if_stmt> the above grammar is


ambiguous since the following will have 2 distinct parse trees:
if (x <0)
if (y <0) {y = y-1;}
else y = 0;
if <logic_expr> then if < logic_expr > then <stmt> else <stmt>

1-19
<if_stmt>  if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else <stmt>
<stmt>  <if_stmt>…

if <logic_expr> then if < logic_expr > then <stmt> else <stmt>

x = 1; y = -1;
if (x <0)
<if_stmt> if (y <0) {y = y-1;}
else y = 0;
<if_stmt>
if<logic_expr> then<stmt> else <stmt>

if<logic_expr> then<stmt> <if_stmt> y=0

y = -1 <if_stmt>
if<logic_expr> then <stmt>

if<logic_expr> then <stmt> else <stmt>


1-20
If Then Else: Ambiguous Grammar

• In most languages the else is matched with


its nearest pervious unmatched then.

• An unambiguous grammar must distinguish


between statements that are matched and
those that are unmatched

1-21
Unambiguous Grammar for the
if-then-else

• There are 3 types of statements matched,


unmatched, and non-if-statements. For simplicity
we classify non-if-statements as matched.

<if_stmt>  <stmt>
<stmt>  <matched> | <unmatched>
< matched >  if <logic_expr> then <matched> else
<matched> | non-if-statement
< unmatched >  if <logic_expr> then < stmt >

1-22
<if_stmt>  <stmt>
<stmt>  <matched> | <unmatched>
< matched >  if <logic_expr> then <matched> else
<stmt> <matched> | non-if-statement
< unmatched >  if <logic_expr> then < stmt >

<unmatched>
x = 1; y = -1;
if x < 0 then <stmt> if (x <0)
if (y <0) {y = y-1;}
<matched>

else y = 0;
if y < 0 then <matched> else <macthed>

y = y-1 y=0 1-23


<if_stmt>  <stmt>
<stmt>  <matched> | <unmatched>
< matched >  if <logic_expr> then <matched> else
<stmt> <matched> | non-if-statement
< unmatched >  if <logic_expr> then < stmt >

<matched>
x = 1; y = -1;
if x < 0 then <matched> else <macthed> if (x <0)
???? y=0 if (y <0) {y = y-1;}

if (y <0) {y = y-1;} is not matched


This parse tree is not correct else y = 0;

The previous one is unique


1-24
Extended BNF

Three extensions are commonly included in the


various versions of EBNF:
1.Optional parts are placed in brackets [ ]
<proc_call> -> ident [(<expr_list>)]
<if_stmt>→ if (<expression>) <statement> [else <statement>]

2.Alternative parts of RHSs are placed inside parentheses


and separated via vertical bars
<term> → <term> (+|-) const

3.Repetitions (0 or more) are placed inside braces { }


<ident> → letter {letter|digit}
<ident_list> → <identifier> {, <identifier>}

1-25
BNF and EBNF
• BNF
<expr>  <expr> + <term>
| <expr> - <term>
| <term>
<term>  <term> * <factor>
| <term> / <factor>
| <factor>

• EBNF
<expr>  <term> {(+ | -) <expr>}
<term>  <factor> {(* | /) <term>}
1-26

You might also like