Regular Anguage
Regular Anguage
Regular Anguage
Regular Languages
Regular Languages
If a language is regular there exists a finite acceptor for it.
Therefore every regular language can be described by some Dfa
or Nfa. For every regular languages there should exist
corresponding regular expression.
Theorem:
Let L be a regular language. Then there exists a regular expression r
such that L=L(r)
Proof:
If L is regular, there exist an nfa for it. We can assume without
loss of generality, that this nfa has a single final state, distinct
from its initial state. We convert this nfa to a complete
generalized transition graph and apply the procedure nfa to
regular expression to it. This yields the required regular
expression r.
The set of regular expressions is defined
as follows:
Every symbol of Σ is a regular expression
ε is a regular expression
Ø is a regular expression denoting the empty set.
If r1 and r2 are regular expressions, then
L(r1+r2)=L(r1)UL(r2) ---- union
L(r1.r2)=L(r1)L(r2) ----- concatenation
R* -------- (Kleene star) denoting the smallest superset of set
described by R that contains ε and is closed under string
concatenation. This is the set of all strings that can be made by
concatenating any finite number (including zero) of strings from
set described by R. For example, {"0","1"}* is the set of all finite
binary strings (including the empty string), and {"ab", "c"}* = {ε,
"ab", "c", "abab", "abc", "cab", "cc", "ababab", "abcab", ... }.
Priority
* has the highest priority
. has the next priority
+ has the least priority
Parenthesis is used to override the above priorities
For example:
(a+(b.c))* stands for the star closure of {a}U{bc} that is the
language{ ε,a,bc,aa,abc,bca,bcbc,aaa,aabc,.....}
In other words we can define the regular expressions by using the following terms.
r = epsilon
r=a
r = r1 + r2
r = r1 r2
r = r1*
r = (r1)
The language represented or generated by a regular expression is a Regular
Language, denoted L(r).
Cont’d
Example 1:
ab* specifies the strings starting with a followed by 0 or more number
of b’s,
(ab)* specifies 0 or more repetitions of ab
Example 2:
For Σ={a,b} the expression
r = (a+b)*(a+bb) is regular.
It denotes the language L(r)={a,bb,aa,abb,ba,bbb,.....} So, L(r) is the set
of all strings on {a,b}, terminated by either an a or a bb.
Example 3:
r=(aa)*(bb)*b denotes the set of all strings with an even number of
a’s followed by an odd number of b’s that is
L(r)={a2nb2m+1: n>=0,m>=0}
Example 4:
r=(0+1)*00(0+1)*
Algebra of regular expressions
Identity laws
a. ε. R =R. ε = R
b. Ø + R = R+ Ø = R
Idempotent laws
R+R=R
(R*)*=R*
Distributive laws
A.(B+C)=A.B+A.C
Associative laws
A.(B.C)=(A.B).C
A+(B+C)=(A+B)+C
Regular Grammars
A language is said to be regular if it can be represented with a
regular grammar. Regular languages are equivalent to type 3 grammars.
The Linear Grammars are either left or right:
Right Linear Grammars:
Rules of the forms
A→ε
A→a
A → aB
Left Linear Grammars:
Rules of the forms
A→ε
A→a
A → Ba
Transform the following Right Linear grammar in an
equivalent NFAε.
S → aS | bA
A → cA | ε
Right linear Grammar
A -> aB
1. A is a single symbol (corresponding to a state) called a ‘non-terminal symbol’
2. a corresponds to a lexical item
3. B is a single non-terminal symbol.
Formal definition of Right Linear Grammars
A right linear grammar is a 4-tuple <T, N, S, R>, where:
1. N is a finite set of non-terminals
2. T is a finite set of terminals, including the empty string
3. S is the start symbol
4. R is a finite set of rewriting rules of the form A-> xB or A-> x, where A and B
stand for non-terminals and x stands for a terminal.
Formal example:
G1 = <T, N, S, R>, where T = {a, b}, N = {S, A, B}, and
R=
S -> aA
A -> aA
A -> bB
B -> bB
In a left regular grammar (also called left linear grammar), all
rules obey the forms
A → a - where A is a non-terminal in N and a is a terminal in Σ
A → Ba - where A and B are in N and a is in Σ
A → ε - where A is in N and ε is the empty string.
An example of a left regular grammar G with N = {S, A}, Σ = {a,
b, c}, P consists of the following rules
S → Sa
S → Ab
A→ε
A → cA
and S is the start symbol. This grammar describes the same
language as the regular expression a*bc*.
A regular grammar is a left or right regular grammar.
Relation between regular language and
Regular expression
They are equivalent:
With every regular expression we can associate a regular
language.
Conversely, every regular language can be obtained from a
regular expression.
Examples:
–Regular expression = ab*c
–Regular language = {ac, abc, abbc, ….}
Let Σ be an alphabet. The regular expressions over Σ are:
Ø Represents the empty set { }
ε Represents the set {ε}
a Represents the set {a}, for any symbol a in Σ
Con’t
For Ø:
For ε:
For a:
Types of automata
There are four basic types of automata, distinguished
by the following characteristics:
• An NFA is a five-tuple:
M = (Q, Σ, δ, q0, F)
• F = {q2}
δ: 0 1
qo {q0, q1} {}
q1 {} {q1, q2}
q2 {q2} {q2}
An NFA for the language of all strings over {a,b} that contain ababb
Conversion of NFA to DFA:
Suppose that you want to find an equivalent DFA for an
NFA . The algorithm is the following:
• Start from the start state and see where 0 or 1 takes you.
• For every new subset you find, see where 0 or 1 takes you.
• Repeat until no new subsets are found.
To find an equivalent DFA with the NFA of the figure we should complete the
following table:
The NFA below has 3 states q0,q1,q2 with ∑={0,1}
Cont…
Example 2
Transition table A B
0 0,1 0
0,1 0,1,2 0
0,1,2 0,1,2,3 0
0,1,2,3 0,1,2,3 0,3
0,3 0,1,3 0,3
0,1,3 0,1,2,3 0,3
Cont
What does a DFA do on reading an input string?
Input: a word w in ∑*
Steps:
Compute the next state from the current state, given the current input
symbol in w and the transition function
If after all symbols in w are consumed, the current state is one of the final
states (F) then accept w;
Otherwise, reject w.
Minimization of DFA:
For storage efficiency, it is desirable to reduce the number of states as far as possible.
We now describe an algorithm that accomplishes this.
Recall that a DFA M=(Q, Σ, δ, q0, F)
Two states are either indistinguishable or distinguishable. Indistinguishability has
the properties of an equivalence relation. If p and q are indistinguishable and if q
and r are also indistinguishable then all the 3 states are indistinguishable.
One method of reducing the states of a dfa is based on finding and combining
indistinguishable states.
Example
State A B
A B C
B B D
C B C
D B E
E B C
Transition Table:
Cont..
Minimize the number of states of DFA:
Thus, in new (ABCD) must be split into two groups(ABC) and (D).