Nothing Special   »   [go: up one dir, main page]

Introduction To: Formal Language Theory

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 28

Introduction to

Formal Language Theory


Overview of languages : natural Vs
formal
 Natural Languages
 rules come after the language
 progress and develop

 highly flexible

 quite powerful

 no special learning effort needed

Disadvantages
 vague/unclear

 inaccurate

 ambiguous

 user and context dependent

 Ex. Amharic, English, French, …

Formal Language 4
Theory
Overview of languages: cont’d

 Formal Languages
 developed with strict rules
predefined syntax and semantics
 accurate
 unambiguous

can be processed by machines!


Disadvantages
 unfamiliar notation

 initial learning effort

 Ex. Programming languages: Pascal, C++, …

Formal Language 5
Theory
Overview of languages: cont’d

 Sentences: the basic building blocks of


languages
 Sentence = Syntax + Semantics
 Grammar: the study of the structure of a
sentence
 Ex:
<simple sentence> ::= <noun phrase><verb><noun phrase>
<noun phrase> ::= <article><noun>

A person entered the room


Formal Language 6
Theory
Overview of languages: cont’d

<simple sentence>

<noun phrase> <verb> <noun phrase>

<article> <noun> <article> <noun>

A person entered the room

Derivation tree for the simple sentence: A person entered the room.

Formal Language 7
Theory
Overview of languages: cont’d

 In Pascal (as well as in many other


languages), for example, an identifier is
specified as follows:

<identifier> ::= <letter> | <letter> {<letter> | <digit>}*


<letter> ::= a | b| c …
<digit> ::= 0 | 1| 2 | … | 9

Ex. a, x1, num, count1, …

Formal Language 8
Theory
Review of set theory and relations

 Sets
 A well defined collection of objects (called or elements)
members
  Notation: a Є S  a is an element of the set S

Operation
Let A and Bon
besets
two sets and U the universal set
 Subset: A C B
 Proper subset: A c B
 Equality: A = B
 Union: A U B
 Intersection: A ∩ B
 Set difference: A \ B or A – B
 Complement: A’ or A bar

 Cartesian product: A X B = {(a,b) | a Є A and b Є B}

Note: (a,b) is called an ordered pair, and is different from (b,a)

Formal Language 9
Theory
Set theory and relations: cont’d

 Properties
Let A, B, C be sets and U the universal set
 Associative property: A U (B U C) = ( A U B) U C

 Commutative property: A U B = B U A

 Demorgan’s laws: (A U B)’ = A’ ∩ B’, ...

 Involution law: (A’)’ = A

Definitions:
 Let A be a set. The cardinality of set A is called the
cardinal number and denoted by |A| or #(A).
 The set of all subsets of a set A is called the power set
of A, denoted by 2A.

Formal Language 10
Theory
Set theory and relations: cont’d

Definition:
Let S be a set. A collection {A1, A2, …, An} of subsets of S is called a partition
if Ai ∩ Aj = Ø, i≠j and S = A1 U A2 U … U An.
Ex. S = {1, 2, …, 10}
Let A1 ={1, 3, 5, 7, 9} and A2 ={2, 4, 6, 8, 10}, then {A1 , A2} = {{1, 3, 5, 7, 9},{2, 4,
6, 8, 10}} is a partition of S.
Q. Find other partitions of S
 Countability
 A finite set is countable
 If the elements of set A can be associated with 1st,2nd, …, ith, … elements of
the set of Natural Numbers, then A is countable.
Note: that in this case A may not be finite.
Ex.
1. N = {1, 2, …, ith, …} is countable
2. Z = {…, -3, -2, -1, 0, 1, 2, 3, …} = {0, 1, -1, 2, -2, 3, -3, …} is countable
3. [0, 3] is uncountable (not countable)

Formal Language 11
Theory
Relations and functions

 Relations
 Definition: A relation R is a set of ordered of elements
pairs in S. (i.e is a subset of S X S)
Notation: (x, y) Є R or x R y
 Properties of relations
 Let R be a relation on a set A, then
a. R is reflexive if for all a Є A, a R a or (a, a) Є R
b. R is symmetric if a R b => b R a
c. R is transitive if a R b and b R c => a R c, for all a, b, c Є R
d. R is an equivalence relation if (a), (b) and (c) above hold.
 Let R be an equivalence relation on set A and let a Є A, then
the equivalence class of a, denoted by [a], is defined as:
[a] = {b ЄA | a R b}

Formal Language 12
Theory
Relations and functions: cont’d

Examples:
Check whether the following relations are
reflexive, symmetric, and transitive

1. Let R be a relation in {1, 2, 3, 4, 5, 6} is given by


{(1,2), (2, 3), (3, 4), (4, 4), (4, 5)}
2. Let R be a relation in {1, 2, 3, …, 10} defined as
a R b if a divides b
3. Let R be defined on a set S such that aRb if a=b
4. Let R be defined on all people in Addis Ababa by
aRb if a and b have the same date of birth.
Formal Language 13
Theory
Relations and functions: cont’d

 Functions
 Definition: A function f from a set X to a set Y is a rule that
associates to every element x in X a unique element in Y,
which is denoted by f(x).
 The element f(x) is called the image of x under f.
 The function is denoted by f: X  Y
 Functions can be defined in the following two ways:
1. By giving the images of all elements of X
Ex. f:{1, 2, 3, 4}  {2, 4, 6} can be defined by
f(1) = 2, f(2) = 4, f(3) = 6, f(4) = 6
2. By a computational rule which computes f(x) x is given
once
Ex. f:R  R can be defined by f(x) = x2 + 2x + 1, x Є R (R =
the set of all real numbers)

Formal Language 14
Theory
Relations and functions: cont’d

 Let f: A  B be a function
1. f is an into(Injective) function if Rf C B
2. f is an onto(surjective) function if Rf = B
3. f is a one-to-one(bijective) function
if for x1 & x2 Є A, x1 ≠ x2 => f(x1) ≠ f(x2)
4. f is bijective (one-to-one correspondence) if it satisfies (2)
and (3) above.
Ex. f:Z  Z is given by f(x) = 2x
Show that f is one-to-one but not onto.

 Definition: A set A is said to be countable iff there exists a


function f:A  N such that f is bijective. (N=the set of
natural numbers)

Formal Language 15
Theory
Mathematical induction

 Let Pn be a proposition that depends on nЄZ+.


Then Pn is true for all +ve n provided that:
i. Pi is true
ii. If Pk is true, so is Pk+1, for some kЄZ+.
Three steps:
1. Base case: verify that P1 holds
2. Inductive hypothesis: assume that Pk holds, for some
kЄZ+
3. Inductive step: show that Pk+1 holds
Ex. Show that 1+2+…+n = n(n+1)/2, for all nЄZ+.

Formal Language 16
Theory
Mathematical induction: cont’d

Solution:
Let Pn: 1+2+…+n = n(n+1)/2
Step1: for n = 1, P1 holds
Step2: for some kЄZ+, assume Pk is true
i.e. Pk: 1+2+…+k = k(k+1)/2
Step3: WTS Pk+1 is true
Pk+1 : 1+2+…+k+(k+1) = (k+1)(k+2)/2
: Pk + (k+1) = (k+1)(k+2)/2
: k(k+1)/2 + (k+1) = (k+1)(k+2)/2
: [k(k+1) + 2(k+2)]/2 = (k+1)(k+2)/2
: (k+1)(k+2)/2 = (k+1)(k+2)/2
Therefore, Pn holds for all n ЄZ+

Ex. Show that Pn = ∑(i=1,n)(i2) = (n+1)(n)(2n+1)/6 for all n

Formal Language 17
Theory
Graphs and trees

 Graphs
 Definition: A graph (undirected graph) consists
of:
a. A non-empty set v called the set of vertices,
b. A set E called the set of edges, and
c. A map Φ (phi) which assigns to every edge a unique
unordered pair of vertices
e1 e6
v2 e1 = {v1, v2}
v1
e2 = {v1, v3}
e2 e3 e5 …
e6 = {v2, v2} (a self loop)
v3 e4 v4

Formal Language 18
Theory
Graphs and trees: cont’d

 Definition: A directed graph (digraph) consists of:


a. A non-empty set v called the set of vertices,
b. A set E called the set of edges, and
c. A map Φ (phi) which assigns to every edge a unique
ordered pair of vertices

e1
v2 e1 = (v1, v2)
v1
e3 v1 : a predecessor of v2
e2 e5
v2 a successor of v1
v4 :
v3 e4

Formal Language 19
Theory
Graphs and trees: cont’d

 Definition: The degree of a vertex v in a graph (directed or


undirected) is the number of edges with v as an end vertex.
Note: that a self loop is counted twice when calculating the
degree of a vertex.
Ex. In the previous graph, deg(v1) = ? deg(v2) = ?

Definition: A path in a graph (directed or undirected) is an



alternating sequence of vertices and edges of the form
v1e1v2e2…en-1vn, beginning and ending with vertices such that
ei has vi and vi+1 as its end vertices and no edge or vertex is
repeated in the sequence.
The path is said to be from v1 to vn.

Ex. In the previous graph, v1e1v2e3v3e4v4 is a path from v1 to v4.


Note: that a path may be directed (if all the edges in the path
have the same direction.)

Formal Language 20
Theory
Graphs and trees: cont’d

 Definition: A graph (directed or undirected) is connected if


there is a path between every pair of vertices.
Q. Are the previous two graphs connected?

 Definition: A circuit in a graph is an alternating


sequence
v1e1v2e2…en-1v1 of vertices and edges starting and ending
with the same vertex such that ei has vi and vi+1 as end
vertices and no edge or vertex other than v1 is repeated.
Ex. V2e3v3e4v4e5v2 is a circuit in the previous graph

Formal Language 21
Theory
Graphs and trees: cont’d

 Trees
 Definition: A graph (directed or undirected) is
called a tree if it is connected and has no circuits.
Q. Are the previous two graphs trees?
 Properties of trees:
 In a tree there is one and only one path between every
pair of vertices (nodes)
 A tree with n vertices has n-1 edges
 A leaf in a tree can be defined as a vertex of degree one
 Vertices other than leaves are called internal vertices

Formal Language 22
Theory
Graphs and trees: cont’d

 Definition: An ordered directed tree is a digraph satisfying the


following conditions:
 There is one vertex called the root of the tree which is distinguished
from all other vertices and the root has no predecessors.
 There is a directed path from the root to every other vertex.
 Every vertex except the root has exactly one predecessor.
(For the sake of simplicity, we refer to ordered directed trees as
simply trees.)
 The number of edges in a path is called the length of the
 path. The height of a tree is the length of the longest path
 from the root. A vertex v in a tree is at level k if there is a
path of length k from
the root to the vertex v.
 Q. what is the maximum possible level in a tree?
There are several types of trees: binary, balanced binary, binary
search tree, heap, general tree, …

Formal Language 23
Theory
Graphs and trees: cont’d
1
Ex. 1. List the leaves.
3 2. List the internal nodes.
2
3. What is the length of the
path from 1 to 9?
10 4 5 6
4. What is the height of the
tree?
7 8

 Note: a path from vertex (node) n1 to node nk can


be simply expressed as the sequence of nodes n i,
i=1,…,k such that ni is the parent (predecessor) of
ni+1 (1<= I <=k)

Formal Language 24
Theory
Strings and languages

 Strings
 An alphabet, ∑, is a set of finite symbols.
 A string over an alphabet ∑ is a sequence of symbols from ∑.
 An empty string is a string without symbols, and is denoted by λ.

 Let w be a string, then its length, denoted by /w/, is the number of


symbols of w.
Ex. Let ∑ = {0, 1}, the following are some strings over ∑
w = λ, /w/ = 0; w = 01, /w/ = 2; w = 010110, /w/ = 6
 Given an alphabet ∑, ∑* denotes the set of all strings (including
λ) over ∑.
 ∑+ = ∑* - {λ}

Ex. ∑ = {0, 1} => ∑* = {λ, 0, 1, 01, 00, 11, 111, 0101, 0000, …}
 ∑i is a set of strings of length i, i = 0, 1, 2, …
*
 Let x Є ∑ and /x/ = n, then x = a1a2…an, ai Є ∑

Formal Language 25
Theory
Strings and languages: cont’d

 Operations on strings
 Concatenation operation
 Let x, y Є ∑* and /x/ = n and /y/ = m. Then xy,
concatenation of x and y, = a1a2…anb1b2…bm, ai, bi Є ∑
 The set ∑* has an identity element λ with respect to the
binary operation of concatenation.
Ex. x Є ∑* , xλ = λx = x
 ∑* has left and right cancellation
For x, y, z Є ∑*,
zx = zy => x = y (left cancellation)
xz = yz => x = y (right cancellation)
 For x, y Є ∑* , we have /xy/ = /x/ + /y/

Formal Language 26
Theory
Strings and languages: cont’d

 Transpose operation
 For any x in ∑* and a in ∑, (xa)T = a(x)T
Ex. (aaabab)T = babaaa
 A palindrome of even length can be obtained by the
concatenation of a string and its transpose.
 A prefix of a string is a substring of leading symbols of
that string.
w is a prefix of y if there exists y’ in ∑* such that y=wy’
Ex. y = 123, list all prefixes of y.
 A suffix of a string is a substring of trailing symbols of that
string.
w is a prefix of y if there exists y’ in ∑* such that y=y’w
Ex. y = 123, list all suffixes of y.

Formal Language 27
Theory
Strings and languages: cont’d

 A terminal symbol is a unique indivisible object


used in the generation of strings.
 A nonterminal symbol is a unique object but
divisible, used in the generation of strings.
Ex. In English, a, b, A, B, etc are terminals and
the words boy, cat, dog, … are nonterminals.
In programming languages, a, A, :, ;, =, if, then, …
are terminals

Formal Language 28
Theory
Strings and languages: cont’d

 Languages
 Definition: A language, L, is a set (collection) of strings over a given
alphabet, ∑.
 A string in L is called a sentence or word.
Ex. ∑ = {0, 1}, ∑* = {λ, 0, 1, 01, 00, 11, …}
L1 = {λ}, L2 = {0, 1, 01} over ∑
L3 = {an | n>= 0} over ∑ = {a}

 Let L1 , L2 be languages over ∑, then


 L1L2 = {xy | xЄL1, yЄL1}
 L{λ} = {λ}L = L, for any language
 L
 L0 = {λ}
 L12 = L
LL ≡ {xx | xЄL}
 …
 Li = LiLi-1, for i>=2
 L* = U(i=0,∞)(Li)

Formal Language 29
Theory
! !
ND ??
E N
IO
ST
E

You might also like