State Complexity of Proportional Removals
State Complexity of Proportional Removals
State Complexity of Proportional Removals
Michael Domaratzki School of Computing, Queens University Kingston, ON K7L 3N6 Canada e-mail: domaratz@cs.queensu.ca
ABSTRACT We examine the state complexity of proportional removals such as 1 (L). For 1 (L), 2 2 we show a bound which is tight in the case that L is a unary language, and an nearly optimal bound for arbitrary languages. We also compute the average state complexity for 1 (L) if L is unary. We study other proportional removals and give bounds for 2 certain reset automata. Keywords: State complexity, reset automata, proportional removals
1. Introduction and Motivation The state complexity of a regular language L is the number of states in the minimal deterministic nite automaton (DFA) recognizing L. There has been much interest lately in the study of state complexity of operations which preserve regularity (e. g. [2, 9]). These papers are generally interested in proving upper bounds on the state complexity of operations on regular languages, including the particular case of unary regular languages. In this paper, we examine proportional removals, that is, languages of the form {x | y such that r(|x|, |y|) xy L} (1)
for some language L , and a binary relation r. There is a complete characterization, due to Seiferas and McNaughton [6], of the relations r which ensure that (1) is a regular language, if L is regular. We obtain bounds for the equality relation, for both unary and general languages, and show this bound is tight for unary languages. 2. Notation and Denitions We assume the reader is familiar with basic concepts in automata theory and formal languages. For any unfamiliar terminology, see Hopcroft and Ullman [4] or S. Yu [8].
1 Full version of a submission presented at the Third International Workshop on Descriptional Complexity of Automata, Grammars and Related Structures (Vienna, Austria, July 20 22, 2001).
M. Domaratzki
A deterministic nite automaton (DFA) is a quintuple M = (Q, , , q0 , F ) where Q is the set of states, is the alphabet, and is the transition function : Q Q. The start state is q0 Q and the nal states are F Q. We extend to a function from Q to Q in the obvious way. A DFA is said to be complete if (q, a) is dened for all q Q, a . In what follows, we will assume that all DFAs are complete. A DFA is said to be initially connected if, for all q Q, there exists a word w such that (q0 , w) = q, that is, every state q is reachable from the initial state by reading some word w. A string w is accepted by a DFA M = (Q, , , q0 , F ) if (q0 , w) F . Dene the language accepted by a DFA M by L(M ) = {w | (q0 , w) F }. A DFA M = (Q, , , q0 , F ) is minimal for a regular language L if L(M ) = L and for all DFAs M = (Q , , , q0 , F ) with L(M ) = L, we have |Q| |Q |. The state complexity of a regular language L is denoted by sc(L) and is dened as the number of states in the minimal DFA for L. If is an alphabet, we use the notation k = {x | |x| = k}. We adopt the notation of Seiferas and McNaughton [6]: For any binary relation r N N and any language L , let the language P (r, L) be dened as P (r, L) = {x | y such that r(|x|, |y|) xy L}. For any set A N, we say that A is ultimately periodic (u.p.) if there exist integers n0 0 and p > 0 such that for all n > n0 , n A n + p A. If n0 and p are chosen to be minimal among all integers satisfying those respective conditions, we say that n0 is the preperiod of A, and p is the period of A. For any relation r N N and any set A, dene r1 (A) = {i | j A such that r(i, j)}. We call r u.p.-preserving if A u.p. implies r1 (A) u.p. The following is the complete characterization of relations r which preserve regularity, due to Seiferas and McNaughton [6]. Theorem 1 For all languages L, the following are equivalent 1. L regular P (r, L) is regular, and 2. r is u.p.-preserving. 2
We say that a DFA M = (Q, , , q0 , F ) is unary if || = 1. We observe that for any unary DFA, there is exactly one transition going out of any state. Thus, the transition diagram of the DFA consists of a chain of states (the tail) and a simple loop of states (the loop). Let H(n) be dened by H(n) = e n log n . This function is useful in the following result of Chrobak [3]. Theorem 2 For any n state unary NFA M there is a unary DFA M with a tail of O(n2 ) states and a loop of size at most O(H(n)) such that L(M ) = L(M ). 2
We begin with a particular removal, namely that given by the binary relation r = 1 {(n, n) | n 0}. We denote this special case by 2 (L): 1 (L) = {x | y with |x| = |y| and xy L}. 2 Our main theorem is a general bound on the state complexity of 1 (L). 2 Theorem 3 Let L be a regular language with sc(L) = n. O(nH(n)). Then sc( 1 (L)) = 2
Proof. Let L be recognized by the DFA M with M = (Q, , , q0 , F ) and Q = {q0 , q1 , . . . , qn1 }. Dene the NFA
M = (Q {q0 }, {a}, , q0 , F )
with
(q0 , ) = {q | q F }
and (q, a) = {q | b such that (q , b) = q} for each q Q. For each i 0, we dene sets Qi Q as follows: Q0 = F and for i > 0, Qi = (q0 , ai ). Note that (Qi , a) = Qi+1 and Qj = {q | w such that |w| = j and (q, w) F }. We may now dene F as F = Q2n . We choose Q2n since there are 2n subsets of Q. Thus, if we consider the sequence {Qi }i0 , there are indices 0 j < k 2n 1 such that Qj = Qk . Thus all possible distinct sets Qi will be encountered by the time we consider F . Thus, in determinizing M , all distinct subsets of Q which are states in M will appear as a state in the deterministic equivalent. By Theorem 2, there is a unary DFA M = (Q , {a}, , q0 , F ) such that L(M ) = L(M ) and M has at most O(H(n)) states. Further, for all i 0 we can associate Qi with the state q Q such that (q0 , ai ) = q. Thus, we number the states in Q i so that (q0 , a ) = qi is associated with Qi . Let M have r states in its tail and s states in its loop. We now dene a DFA Mh , which we claim will recognize 1 (L). Let Mh = 2 (Qh , , h , q0,h , Fh ), with Qh = Q {i | 0 i < r + s}, h ((q, i), b) = ((q, b), i + 1), if i < r + s 1; ((q, b), r), if i = r + s 1; (2)
M. Domaratzki
Thus we compute the action of M in the rst component, and implement a periodic counter in the second component, with preperiod r and period s. Finally, Fh = {(qi , j) | qi Qj }.
1 We show that L(Mh ) = 2 (L). Let x L(Mh ), so that h (q0,h , x) = (qi , j) F . Then qi Qj = Q|x|. Thus, by (2), there exists z with |z| = |x| and (qi , z) F . Thus, since (q0 , x) = qi , we may conclude that x 1 (L), since (q0 , xz) F . 2 Conversely, let x 1 (L). Then xy L for some y with |x| = |y|. Let (q0 , x) = qj 2 for some j. So (qj , y) F . Thus qj Q|y| = Q|x| . Let i be the integer with i |x| (mod r + s) with 0 i < r + s. Then qj Qi by our construction, and by denition of h , h (q0,h , x) = (qj , i). Further, (qj , i) Fh since qj Qi , so x L(Mh ). We conclude that sc( 1 (L)) |Q | = n(r + s) = O(nH(n)). 2 2
Note: Theorem 3 was known to S. Yu, but was not published. Let L be a unary language, with sc(L) = p + r, where p is the length of the tail of the unary DFA recognizing L, and r is the length of the loop. Then we note that M in the proof of Theorem 3 actually has tail of size p and loop of size r, since we can take M = M . So, simulating the proof of Theorem 3, we get the following corollary. Corollary 4 Let L be a unary language with sc(L) = n. Then sc( 1 (L)) n. 2 2
We can dene a unary language Ln for each n 1 to show that this bound is tight. Lemma 5 For all n 1 there exists a unary language Ln such that sc(Ln ) = 2 sc( 1 (Ln )) = n. 2 It is easy to see that for n 3, L = an2 (an ) will work for n odd, and L = (an1 )+ will work for n even. For n = 2, we can take L = a+ and for n = 1, we can take L = a . Note that for both these languages, we have sc(L) = L. Theorem 6 For innitely many k, there exists a language Lk such that sc(Lk ) = 1 O(n) but sc( 2 (Lk )) > cH(n) for some constant c. Proof. Let pi denote the ith prime with p1 = 2. We dene ni = mi =
i j=2 i j=2
pj and
Lemma 7 [1, p. 29] For all i 1, ni ( (i2log i) ). log i Lemma 8 [1, Cor. 8.2.7] For all i 1, log(mi ) = i log i(1 + o(1)).
2 2
Thus, we may conclude that mi = e ni log ni (1+o(1)) for all i. Let k 3. We write p for pk in what follows. Dene the language Lk as follows.
k1
Lk =
i=2
1 We show that for all k 3, sc(Lk ) = nk + 1 while sc( 2 (Lk )) mk . This will show the result. To show that sc(Lk ) = nk + 1 we will construct a DFA Mk with nk + 1 states such that L(Mk ) = Lk . This will establish sc(Lk ) nk + 1, which will actually suce for our argument (though it not hard to establish that Mk is minimal and thus sc(Lk ) = nk + 1). For each prime pi with i 2, let Ci be a cycle of pi states, {qi,0 , qi,1 , . . . , qi,pi 1 }, with transitions on 0 as follows: (qi,j , 0) = qi,j+1 k where j + 1 is taken modulo pi . For each k dene Qk = i=2 Ci . Finally, let Mk = (Qk {qd }, {0, 1}, , qk,0, F ) where includes all the cycle transitions on 0 dened above, as well as the transitions (qk,ppi , 1) = qi,0 for all 2 i < k. The remaining undened transitions go to qd . Finally,
F = {qi,pi 1 | 2 i < k}. We can verify that L(Mk ) = Lk . Figure 1 shows a DFA M4 recognizing L4 .
q3,3
q 3,2
q 2,2
0 0
q2,1
0 0
q 3,4 q 3,1
0
q2,0
0
q 3,0
1 0
q4,3 q
4,4
0 1
q
4,2
0
q 4,5
0 0
q 4,1 q 4,6
0 0
q4,0
M. Domaratzki
1 To show that sc( 2 (Lk )) mk , we use the Myhill-Nerode Theorem. Consider the i strings wi = 0 for 0 i < mk . The following lemma will be useful.
Lemma 9 For each prime pi with 2 i < k, the following hold: (a) For each integer r 0 there is exactly one s with 0 s < pi such that 0pr+ppi 10s 1 (Lk ). 2 1 1 (Lk ) 0pr2 +ppi 10s (Lk ) 2 2
(b) For any integer s with 0 s < pi , the condition 0pr1 +ppi 10s
pi (t + 2) = pr + p + 2 + 2s. This implies p + 2 + 2s + rp 0 (mod pi ). Now, we prove the converse also holds. Assume that p + 2 + 2s + rp 0 (mod pi ). (3)
Note that p > 0 and s 0. Thus p + 2 + 2s + rp 0. Then (3) implies that there exists some t 0 such that p + 2 + 2s + rp = t pi . But now t pi = p(r + 1) + 2 + 2s p(r + 1) > pi (r + 1). Since r 0, pi (r + 1) pi . Thus t > 1. Let t = t 2 0. Then we have p + 2 + 2s + rp = (t + 2)pi . Thus we have established the following: 1 (Lk ) 2 This establishes part (a) of the lemma: Choose s such that p + 2 + 2s + rp 0 (mod pi ) 0pr+ppi 10s p + 2 + 2s + rp 0 (mod pi ) (4)
and 0 s < pi . This s exists since the integers modulo pi form a eld, and pi = 2. Further, we can establish the statement (b) of the lemma using (4). Fix an integer s with 0 s < pi . Then r1 r2 (mod pi ) (p + 2 + 2s r1 p (mod pi ) p + 2 + 2s r2 p (mod pi )) 1 1 2 0pr1 +ppi 10s (Lk ) 0pr2 +ppi 10s (Lk ) . 2 2 We may now prove that each of the strings 0i for 0 i < mk are in dierent equivalence classes of the Myhill-Nerode relation. Consider two strings wi = 0i , wj = 0j for some 0 i < j < mk . Write i = pi + i j = pj + j where 0 i , j < mk1 = mk /p and 0 i , j < p. We have two cases: Case (a): i = j . We will construct a string zi,j 0 10 such that exactly one of 1 wi zi,j and wj zi,j is a member of 2 (Lk ). We have three subcases. Case (a-i): There exist primes p and p (0 , < k) such that i = p p and j = p p . Since i = j , = . Assume without loss of generality that p < p . According to Lemma 9, let s1 be the unique integer (depending on i ) with 0 s1 < p and 1 wi 10s1 = 0i 10s1 = 0pi +pp 10s1 (Lk ). 2 Similarly, let s2 be the unique integer depending on j with 0 s2 < p and 1 (Lk ). 2 If s1 = s2 we are done, since then we can choose zi,j = 10s1 . Since p < p and thus 1 s1 < p < p , s1 is not equivalent to s2 modulo p . Thus wj zi,j = wj 10s1 2 (Lk ). s1 +p . It is simple to show that If s1 = s2 , then we choose zi,j = 10 wj 10s2 = 0j 10s2 = 0pj
+pp
10s2
1 (Lk ). 2
1 (Lk ). 2 Case (a-ii): There exists a prime p such that i = p p but for all primes p with 2 < k, j = p p . The same case with the roles of i and j reversed holds similarly to what follows. In this case, we choose the unique si with 0 si < p such that wi 10si = 1 pi +pp 10si 2 (Lk ). Since j = p p for all primes p in the range 2 < k, 0 1 s wj 10 2 (Lk ) for any s, since we cannot complete wj 1 with any string of zeroes to ensure it is in Lk . This completes our case, as we can separate wi and wj with zi,j = 10si . wj 10s1 +p = 0pj
+pp
10s1 +p
M. Domaratzki
Case (a-iii): For both i and j there is no prime p with 2 < k such that i = p p or j = p p . We can reduce this to case (a-i) or (a-ii): Let r be the smallest integer such that i + r p (mod p) for some prime p < p; certainly r must exist. Then wi 0r and wj 0r must reduce to one of the above cases, since if we let u be the integer with 0 u < p such that u i + r (mod p), then p 1 satises u = p p . Thus, let zi,j be the string such that wi 0r zi,j 2 (Lk ) and 1 r r wj 0 zi,j 2 (Lk ). Then clearly we may choose zi,j = 0 zi,j .
Case (b): i = j . Then we must have that i = j . Since 0 i , j < mk1 , we must have that there exists an with 1 < k such that i j (mod p ), since if this were not the case then i j (mod p ) for all 1 < k and so i j (mod mk1 ), which would imply i = j , as we have noted that 0 i , j < mk1 . Thus, let be chosen so that i j (mod p ). Let ji be chosen such that 0 < ji < p and 0pi +pp 10ji
1 (Lk ). 2 1 (Lk ). 2
This is possible by Lemma 9. Also by Lemma 9, we know that since i j (mod p ), 0pj
+pp
10ji
Thus, if i (= j ) p p , we may choose zi,j = 0(pp )i 10ji , as wi zi,j 1 (Lk ), 2 while wj zi,j 1 (Lk ). If p > i (= j ) > p p , then note that i + 1 j + 1 (mod p ). 2 Thus, in a similar argument to that presented above, we may choose ji such that 0p(i +1)+pp 10ji and 0p(j
1 (Lk ). 2 1 (Lk ). 2
+1)+pp
10ji
1 Then we choose zi,j = 0pi 0pp 10ji . Thus wi zi,j = 0p(i +1)+pp 10ji 2 (Lk ) and 1 wj zi,j = 0p(j +1)+pp 10ji 2 (Lk ). This completes the proof. 2
4. State Complexity of Polynomial Removals Let f Z[x] be a polynomial with integral coecients. If f (n) N for all n N, we denote this by f (N) N. Recall that a function f is strictly monotonic if i > j implies f (i) > f (j). Our main theorem is: Theorem 10 Let f Z[x] be a strictly monotonic polynomial such that f (N) N. Then the relation rf = {(n, f (n)) | n 0} preserves regularity, and sc(P (rf , L)) O(sc(L)H(sc(L))). The following is an easy fact about polynomials with integer coecients:
Fact 11 Let f Z[x] with f (N) f (N). Then for all n1 , n2 N with n2 > 0, f (n1 ) f (n1 + n2 ) (mod n2 ). 2 Note that the previous fact is not true if we replace f in the statement by f Q[x] and f integer-valued (i. e., f (k) Z for all k Z). For example, let f (n) = n(n1) . 2 Then f is integer valued, and f (4 + 2) = 15 3 (mod 4) but f (2) = 1 1 (mod 4). We have the following denitions, which will allow us to characterize polynomials which preserve regularity. They are from Seiferas and McNaughton [6] and Siefkes [7]: A function f : N N is u.p.-reducible if, for every modulus m, there is a period rm such that the following congruence holds for all but nitely many n N: f (n) f (n + rm ) (mod m). A function f : N N is essentially increasing if, for every k, f (n) k for all but nitely many n N. Theorem 12 [6, Thm. 3, p. 151] If f is essentially increasing and u.p.-reducible, then f is u.p.-preserving. 2 The following corollary is immediate, considering Fact 11. Corollary 13 If f Z[x] with f (N) N and f strictly monotonic, then f is u.p.-preserving. 2 The following fact is simply a consequence of dening the period of a set as the smallest of all possible periods. Fact 14 Let A be a u.p. set with period p, and preperiod n0 . Let p be any integer satisfying a + p A a A for all a n0 . Then p|p . 2
Fact 15 Let f Z[x] be a strictly monotonic polynomial with f (N) N. Then f (n) n for all n N. 2 We are ready for the proof of Theorem 10: Proof. Let A be the ultimately periodic set constructed in the proof of Theorem 3 by the unary portion of the DFA for 1 (L). 2 1 By Corollary 13, f is u.p.-preserving. Thus rf (A) is u.p. Let pf be the period of 1 rf (A). We will show that if p is the period of A, then pf |p, and thus pf p. This will give the result. 1 Let i rf (A). Assume that i n0 , where n0 is the preperiod of A. By denition 1 of rf (A), there is a ji A such that ji = f (i) i. By Fact 11, f (i) f (i + p) (mod p). Stated another way, this tells us that f (i + p) = f (i) + p. for some integer . Since f (i + p) > f (i), we can further deduce that > 0. Thus, 1 f (i + p) A, because p is the period of A. We may conclude that i + p rf (A) by
10
M. Domaratzki
1 1 denition, and so i rf (A) i + p rf (A). We can similarly prove the reverse implication. Thus, by Fact 14, pf |p. Since we assume that both p and pf are positive integers, this gives us that pf p. Thus, we can construct a DFA for P (rf , L) in the same manner of Theorem 3, replacing a loop of period p by a loop of period pf . This gives the result. 2
5. Reset Automata A reset automaton is a DFA M = (Q, , , q0 , F ) such that there exists a word w and a state qs Q such that for all q Q, (q, w) = qs . We call any such w a reset word, and qs the reset state. We consider the state complexity of proportional removals on reset automata with accepting reset state as an demonstration of a class of automata for which the state complexity of proportional removals is exponentially more ecient than the general case. Recall our denition of the sets Qi : Qi = {q Q | w i with (q, w) F }. (5)
We will relate these Qi to reset automata. First we note that if the Qi eventually reach the entire set Q of states, then the state complexity of 1 (L) is low. 2 Lemma 16 Let M = (Q, , , q0 , F ) be a DFA. If Qr = Q for some r then Qr+1 = Q. 2 Lemma 17 If Q = Q for some 0, then sc( 1 (L)) O(sc(L)3 ). 2 Proof. Let sc(L) = n. Let M be the deterministic unary component of the construction given in Theorem 3. Then M has a tail of r = O(n2 ) states, and a loop of length s = O(H(n)). Consider t = r + s. By Lemma 16, since Q = Q for some 0, we have Qt = Q. By another application of Lemma 16, Qs = Q, since Qt+1 = Qs by denition of the unary component given in Theorem 3, and the loop structure of the unary DFA. Then we can conclude that Qi = Q for all s i t = r + s. Thus, we may replace all s such states with a single state st which loops to itself on all transitions. This reduces the number of states in M to t and thus an upper bound 1 on the number of states in a DFA recognizing 2 (L) is n O(n2 ) = O(n3 ). 2 The following lemma follows directly from the denition of reset automata. Lemma 18 Let M = (Q, , , q0 , F ) be a reset automaton. If qs F then Qr = Q for some r. 2 Thus, combining Lemmas 18 and 17, we have the following theorem. Theorem 19 Let M = (Q, , , q0 , F ) be a reset automaton with reset state qs F , 1 2 and let L = L(M ). Let n = |Q|. Then sc( 2 (L)) O(n3 ).
11
Recently, Nicaud [5] has examined the average state complexity of operations on regular languages. This turns out to be much harder than worst-case complexity, as it is currently unknown how many non-isomorphic automata there are over a twoletter alphabet. However, for unary automata, this is known. In this section, we 1 follow the work of Nicaud and give an average state complexity for the 2 () operation on unary languages. For any unary DFA M , let loop(M ) denote the automaton formed by removing the tail of M . An n-loop is a unary automaton which is a loop of n states. Let Un denote the set of complete, deterministic and initially connected unary DFAs with n states. We may enumerate such unary automata using the following notation. Let M (n, k, F ) denote the automaton ({0, . . . , n 1}, {a}, 0, , F ) with (i, a) = i + 1 if 0 i < n 1 and (n 1, a) = k. Theorem 20 The average complexity of the 1 () operation on an n-state unary au2 5 tomaton is equal to ( 8 n + c)(1 + (n)) for some function satisfying (n) 0 as n and some constant c. This tells us that in the average case we dier from the worst case by only a constant factor, but we are closer to the intuition that 1 (L) should only take half as many states 2 as L. We rst state a few preliminary lemmas which are direct interpretations of the action of 1 () on unary languages. 2 Lemma 21 Let M = (Q, {a}, , q0, F ) be the minimal unary DFA for L, and suppose 1 that M has k 0 states in its tail. Then any automaton recognizing 2 (L) needs at most k states in its tail. 2 2 Lemma 22 Let M = (Q, {a}, , q0, F ) be the minimal unary DFA for L, and suppose 1 that M has r 0 states in its loop. Then any automaton recognizing 2 (L) needs at most r states in its loop if r is odd, and r/2 states in its loop if r is even. 2 These lemmas are easily proved by direct observation. For each n-loop L, dene k(L) as k(L) = min{k | (F, ak ) = F } k(L) exists since n is an integer satisfying the above condition. Consider the following lemma of Nicaud [5, Lemma 2]. Lemma 23 For each n-loop L, the minimal automaton of L has k(L) states and k(L)|n. In particular, an n-loop is minimal i k(L) = n. 2 From this we may observe the following lemma. Lemma 24 Let M be an automaton with n = |loop(M )|. If n is even (resp. odd ) and loop(M ) is minimal, then for all M with L(M ) = 1 (L(M )), |loop(M )| n 2 2 (resp. n). 2
12 We also have the following lemma due to Nicaud [5, Thm. 1].
M. Domaratzki
d Lemma 25 There are exactly L(n) = d|n (n/d)2 minimal n-loops. Further, n there exists a function (n) such that L(n) = 2 (1 + (n)) and (n) 0 as n .2
We may now prove the statement of Theorem 20: Proof. We are interested in examining 1 |Un | sc
MUn
1 (L(M )) . 2
(6)
By [5] or by direct observation, we know |Un | = n2n . We rst show an upper bound for (6). Let [n] denote the set {0, 1, . . . , n 1}. Consider that 1 sc( (L(M ))) = 2 =
k=0 F [n] kn odd n1 n1
sc
k=0 F [n] n1
MUn
+
k=0 F [n] kn even n1
sc
n +1 . 2
Since there are 2n possible choices for F [n], we may replace this accordingly: 1 sc (L(M )) 2
n1
MUn
k=0 kn odd
k n + 1 2n + 2
n1
k=0 kn even
n + 1 2n . 2
which establishes the upper bound. For the lower bound, we follow the idea of Nicaud [5]. First, we will construct a set G() of -loops which are minimal. Then we note that by Lemma 24 1 sc (L(M )) 2
n
h(),
=1 LG() MUn loop(M)=L
MUn
13
where h : N N is the function dened by h(r) = r if r is odd and h(r) = r/2 if r is even. Further, for every -loop L with 1 n, there are exactly 2n n-state automata whose loop is L. Thus sc
MUn
1 (L(M )) 2n 2
|G()|2 h().
=1
Thus, we must simply construct a set G() which is large enough so that
n
2n
=1
|G()|2 h() =
5 2 n + cn (1 + (n)) 8
for some with (n) 0 as n and some constant c. But from Lemma 25, we know that we can choose G() to be the set of all minimal -loops, whereby we will get |G()| = 2 (1 + ()) for a function satisfying those constraints. Thus,
n n
2n
=1
|G()|2 h() = 2n
=1
h()(1 + ()).
h() =
=1
5 2 n + O(n) 8 2
In this paper we have considered the state complexity of the operation 1 (). We have 2 shown an upper bound of ne n log n for arbitrary languages, and a lower bound which is matching up to a factor of n. In the unary case, the matching upper and lower bounds are n. In section 5, we have given a condition on automata such that the 1 upper bound for 2 () is O(n3 ). In section 4, we considered the state complexity of polynomial removals. However, as we noted above, the result of Seiferas and McNaughton [6] gives a much broader classication of relations which preserve regularity. It remains open whether or not one can give an expression for the worst case state complexity of a relation {(n, f (n)) | n 0} in terms of an arbitrary function f . Cmpeanu et al. have considered the state complexity of operations on nite lana guages [2]. The state complexity of proportional removals on nite languages remains open. Acknowledgements Thanks to Je Shallit for proposing this problem and for carefully reading this paper. The author would also like to thank both the referees of DCAGRS 2001 and the referees of J.ALC for their many helpful suggestions which improved the presentation of the results in this paper.
14 References
M. Domaratzki
[1] E. Bach, J. Shallit, Algorithmic Number Theory, Volume I: Ecient Algorithms. MIT Press, Cambridge, Mass., 1996. [2] C. Campeanu, K. Culik II, K. Salomaa, S. Yu, State complexity of basic operations on nite languages. In: Proc. 4rd Workshop on Implementing Automata, 1999. LNCS 2214, Springer-Verlag, 2001, 6070. [3] M. Chrobak, Finite automata and unary languages. Theoretical Computer Science 47 (1986), 149158. [4] J. E. Hopcroft, J. D. Ullman, Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, Mass., 1979. [5] C. Nicaud, Average state complexity of operations on unary automata. In: M. Kutylowski, L. Pacholski, T. Wierzbicki (eds.), Proc. 24th Int. Symp. on Mathematical Foundations of Computer Science, 1999. LNCS 1672, SpringerVerlag, 1999, 230240. [6] J. I. Seiferas, R. McNaughton, Regularity-preserving relations. Theoretical Computer Science 2 (1976) 147154. [7] D. Siefkes, Decidable extensions of monadic second order successor arithmetic. In: Automatentheorie und formale Sprachen. Bibliographisches Institut, Mannheim, 1970, 441472. [8] S. Yu, Regular languages. In: G. Rozenberg, A. Salomaa (eds.), Handbook of Formal Language, Vol. 1. Springer-Verlag, Berlin, 1997, 46110. [9] S. Yu, Q. Zhuang, K. Salomaa, The state complexities of some basic operations on regular languages. Theoretical Computer Science 125 (1994), 315328. (Received: November 6, 2001; revised: September 12, 2002)