Tight bounds for the space complexity of nonregular language recognition by real-time machines
Abuzer Yakaryılmaz ⋆ and A.C. Cem Say ⋆⋆
University of Latvia, Faculty of Computing, Raina bulv. 19, Rīga, LV-1586, Latvia
Boğaziçi University, Department of Computer Engineering, Bebek 34342 İstanbul, Turkey
August 8, 2018
Abstract.
We examine the minimum amount of memory for real-time, as opposed to one-way, computation accepting nonregular languages. We consider deterministic, nondeterministic and alternating machines working within strong, middle and weak space, and processing general or unary inputs. In most cases, we are able to show that the lower bounds for one-way machines remain tight in the real-time case. Memory lower bounds for nonregular acceptance on other devices are also addressed. It is shown that increasing the number of stacks of real-time pushdown automata can result in exponential improvement in the total amount of space usage for nonregular language recognition.
⋆ Yakaryılmaz was partially supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) with grant 108E142 and FP7 FET-Open project QCS.
⋆⋆ Say's work was partially supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) with grant 108E142.

The effects of restricting a Turing machine (TM) to a single left-to-right pass of its input string have been well studied since the early years of computational complexity theory. An important distinction in this realm is the one between real-time computation [Rab63b], in which the machine is required to spend a single computational step on each input symbol, and one-way computation [SHI65], where the input head is allowed to pause for some steps on the tape. These two modes are known to be equivalent in language recognition power in many setups. An important exception is quantum automata theory, where it has been shown that quantum versions of real-time finite automata [AI99,YS11] and pushdown automata [YFSA12] are strictly less powerful than their more general one-way versions in some modes of language recognition.

In this paper, we compare the real-time and one-way modes of computation from the point of view of the minimum amount of memory required by computers to recognize nonregular languages. A recent trend in the study of the lower limits of computation is the consideration of minimal amounts of combinations of various computational resources to perform certain tasks. Simultaneous lower bounds linking working space with input head reversals [BMP94c,BMP94a,BMP95,GMP98], ambiguity degree [BMP94b], and runtime [Pig09] have been established in the context of nonregular language recognition by many TM variants.
Since real-time computation naturally involves the strictest possible time bound, our work can be seen as part of this literature.

Several variants of the TM model [Sip06] will be used in this study. All machines are equipped with a read-only input tape and a single work tape with a two-way read/write head. We will distinguish between the strong, middle, and weak modes of space complexity, whose definitions are reviewed below. We assume without loss of generality that none of the TMs we consider ever moves the work tape head to the left of its initial position.
– A TM is said to be strongly $s(n)$ space-bounded if all reachable configurations on any input of length $n$ have the work tape head no more than $s(n)$ cells away from its initial position.
– A TM is said to be middle $s(n)$ space-bounded if all reachable configurations on any accepted input of length $n$ have the work tape head no more than $s(n)$ cells away from its initial position.
– A TM is said to be weakly $s(n)$ space-bounded if, for any accepted input of length $n$, there exists at least one accepting computation containing no configurations that have the work tape head more than $s(n)$ cells away from its initial position.

(We use the head position, rather than the number of nonblank cells [Sze94], for measuring space usage, since real-time quantum TMs are known [YFSA12] to be able to recognize nonregular languages by a technique that involves just moving the work head ($O(n)$ steps) to the right, without writing any nonblank symbols.)

In the rest of the paper, Section 2 considers several specializations of the alternating TM model for both tally languages (i.e. those with unary input alphabets) and general ones, establishing that most minimum space bounds for nonregular language recognition coming from previous work on one-way versions of these machines remain tight in the real-time case. Similar questions about some other computational models like pushdown and counter automata are examined in Section 3.
Section 4 contains a list of open problems.

Table 1 depicts the known tight bounds for the minimum space complexity of nonregular language recognition by various specializations of the one-way alternating TM model for both general and unary input alphabets. (We refer the reader to [Sze94,MP95] for the proofs of the facts in the table.) These lower bounds clearly carry over to real-time TMs. We will now proceed to prove that most of these bounds remain tight, by exhibiting real-time machines recognizing nonregular languages within the corresponding space bounds.
Table 1. Minimum space used by one-way TMs for recognizing nonregular languages. All bounds are tight.

                        General input alphabet          Unary input alphabet
                        Strong   Middle     Weak        Strong   Middle   Weak
Deterministic TM        log n    log n      log n       log n    log n    log n
Nondeterministic TM     log n    log n      log log n   log n    log n    log log n
Alternating TM          log n    log log n  log log n   log n    log n    log log n

Theorem 1. All the logarithmic bounds in Table 1 remain tight for the real-time versions of the corresponding machines.

Proof.
We construct a real-time deterministic TM $D$ that recognizes the nonregular tally language

$L_D = \{ a^{k_i} \mid i \geq 0 \}$, where $k_0 = 8$ and $k_i = k_{i-1} + 2^i(i+1) + 2$ for $i \geq 1$,

using only logarithmic space.

The input string is assumed to be followed by the endmarker \$ on the tape. The work tape symbols are ⟨, ⟩, 0, 1, and the blank symbol. $D$ will maintain a counter in reverse binary representation between the delimiters ⟨ and ⟩ on its work tape. At the beginning of the computation, $D$ expects to read four $a$'s from the input (rejecting otherwise), and moves the work tape head right and then left during these four steps to write the string ⟨0⟩ and position the head on the symbol ⟨. Having initialized its counter, $D$ enters the following loop:
– the work tape head goes from symbol ⟨ to symbol ⟩ while incrementing the counter by one, and then,
– the work tape head goes from symbol ⟩ to symbol ⟨, while checking whether the counter value is a power of two or not.
Notice that the work tape head never remains stationary on any symbol. Moreover, if required, the symbol ⟩ on the work tape is shifted one square to the right. If $D$ reads \$ while the work tape head is not on the symbol ⟨, the input is rejected. When the work tape head is on the symbol ⟨ and the currently scanned input symbol is \$, then $D$ accepts if the counter is a power of two, and rejects otherwise.

The shortest member of $L_D$ is $a^8$, since the work tape of $D$ contains ⟨1⟩··· and the work tape head is placed on the ⟨ after reading the input $a^8$.

For any $i > 0$, suppose that the work tape contains ⟨$0^{i-1}1$⟩··· with the head scanning the ⟨ after reading the string $a^{k_{i-1}}$ from the input. If the input head scans a \$ at this point, $D$ will accept. If the input is longer, the next opportunity for acceptance will come when ⟨$0^{i}1$⟩··· is written on the work tape, and the head is back on the ⟨. To reach that configuration, the counter will have to be incremented $2^i - 2^{i-1} = 2^{i-1}$ times. Each such increment involves the work tape head going all the way to the ⟩ and then back to the ⟨. Each of these round trips except the last one takes $(i+1) + (i+1)$ steps, whereas the last one takes $(i+2) + (i+2)$ steps, since it involves extending the length of the counter by 1. The total number of steps between the two consecutive potentially accepting configurations is therefore $(2^{i-1} - 1)(2i + 2) + 2i + 4 = 2^i(i+1) + 2$.

The fact that $k_i - k_{i-1}$ increases as $i$ increases leads to a trivial proof, using the pumping lemma for regular languages, of the nonregularity of $L_D$.

Since $D$ belongs to the most specialized machine class under consideration, and obviously uses $O(\log n)$ space to recognize a unary nonregular language (the counter value never exceeds the number of input symbols read), we conclude that all the logarithmic lower bounds in Table 1 are tight for real-time machines as well.

We note that Freivalds and Karpinski use a real-time logarithmic-size counter in order to demonstrate a lower time bound for probabilistic TMs in [FK95]. Such a counter could also be used to prove at least some of the bounds handled by Theorem 1 to be tight. To the best of our knowledge, the minimum space requirements for nonregular language recognition by these real-time machines have not been studied before.

We will now turn our attention to the double logarithmic bounds for machines with general input alphabets in Table 1. Our strategy is to modify the one-way machines which are used in demonstrating these results to obtain real-time machines with the same space complexity.
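As a quick sanity check on the step count in the proof of Theorem 1, the following Python sketch replays the counter of $D$ abstractly rather than cell by cell. It assumes, as the proof's arithmetic suggests, that a round trip over a counter of $w$ digits costs $2(w+1)$ steps and that initialization costs four steps; the function name `accepted_lengths` is ours and purely illustrative.

```python
def accepted_lengths(count):
    """Return the first `count` input lengths at which D could accept.

    Assumption: a round trip over a counter that has `width` digits
    after the increment costs 2 * (width + 1) steps.
    """
    lengths = []
    steps = 4        # four initial steps set up a one-digit counter holding 0
    counter = 0
    while len(lengths) < count:
        counter += 1
        width = counter.bit_length()       # digits between the delimiters
        steps += 2 * (width + 1)           # sweep right, then sweep left
        if counter & (counter - 1) == 0:   # counter is a power of two
            lengths.append(steps)
    return lengths


ks = accepted_lengths(6)
# Gaps between consecutive accepted lengths, indexed from i = 1.
gaps = [later - earlier for earlier, later in zip(ks, ks[1:])]
assert all(gap == 2**i * (i + 1) + 2 for i, gap in enumerate(gaps, start=1))
print(ks[:4])  # [8, 14, 28, 62]
```

The gaps 6, 14, 34, ... grow without bound, which is the property the pumping-lemma argument relies on.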
The languages recognized by these new real-time machines will be "padded" versions of their counterparts in the one-way setup. Essentially, the real-time machine will consume a padding symbol ($\kappa$) for each step of the one-way machine in which the input head pauses on the tape.

Definition 1.
Given a one-way deterministic, nondeterministic or alternating TM $D$, the real-time TM $D_\kappa$ of the same model variant is constructed as follows:
– The input alphabet of $D_\kappa$ is $\Sigma_\kappa = \Sigma \cup \{\kappa\}$, where $\Sigma$ is the input alphabet of $D$, and $\kappa \notin \Sigma$;
– The work tape alphabets of the two machines are identical;
– $D_\kappa$ skips over any prefix of $\kappa$'s in its input until it sees the first non-$\kappa$ symbol, at which point it starts its emulation of $D$;
– On each input symbol other than $\kappa$, $D_\kappa$ emulates the state transition and work tape action of $D$ for that symbol;
– After implementing a stationary transition (that is, one which does not move the input head) of $D$ in the manner described above, $D_\kappa$ checks whether the next input symbol is $\kappa$. If this is the case, $D_\kappa$ implements the transition that $D$ would be making while scanning the symbol that is the last non-$\kappa$ symbol scanned by $D_\kappa$; otherwise, $D_\kappa$ rejects the input;
– After implementing a transition of $D$ that moves its input head to the right, $D_\kappa$ skips over any $\kappa$'s in the input without any state or work tape change, until it sees a non-$\kappa$ symbol.
At the end, $D_\kappa$ accepts or rejects as $D$ would, unless it has already rejected in the eventualities described above.

Let $h_\kappa : \Sigma_\kappa^* \to \Sigma^*$ be a homomorphism such that
– $h_\kappa(x) = x$ if $x \neq \kappa$, and,
– $h_\kappa(\kappa) = \varepsilon$.

Definition 2. $L\text{-}\kappa \subset \Sigma_\kappa^*$ is the language recognized by $D_\kappa$, where $D$ is a one-way TM recognizing the language $L \subseteq \Sigma^*$.

Lemma 1. $h_\kappa(L\text{-}\kappa) = L$.

Proof. It is obvious that $h_\kappa(L\text{-}\kappa) \subseteq L$. Assume that a string $w \in \Sigma^*$ is accepted by the one-way alternating TM $D$. $D$ may spend different amounts of time, pausing on different symbols, in different branches of its accepting computation tree. Let $t$ be the maximum number of steps that $D$ pauses for on any particular symbol in this tree. The string $w'$, obtained by inserting $t$ $\kappa$'s after every symbol of $w$, is accepted by $D_\kappa$, and $h_\kappa(w') = w$, so $w \in h_\kappa(L\text{-}\kappa)$.

Remark 1.
If $L$ is not a member of a class $C$ that is closed under homomorphism, then $L\text{-}\kappa$ is also not a member of $C$. In particular, if $L$ is nonregular, then $L\text{-}\kappa$ is also nonregular.

Theorem 2. If $L$ is recognized by a strongly (resp., middle) space $s(n)$-bounded one-way TM $D$, then for any nondecreasing function $t(n)$ such that $s(n) \in O(t(n))$, the real-time TM $D_\kappa$ recognizing $L\text{-}\kappa$ is strongly (resp., middle) $O(t(n))$-space bounded.

Proof. For any $w \in \Sigma^*$, the space used by $D_\kappa$ on any input which is a member of $\{ w_\kappa \in \Sigma_\kappa^* \mid h_\kappa(w_\kappa) = w \}$ is at most equal to the space used by $D$ on $w$.

Consider the nonregular language

$L_{gcm} = \{ a^m b^M \mid M \text{ is a common multiple of all } i \leq m \}$.

Szepietowski [Sze88] showed that $L_{gcm}$ is recognized by a middle space $\log\log(n)$-bounded one-way alternating TM that we will call $A$.

Corollary 1.
The real-time alternating TM $A_\kappa$ recognizing the nonregular language $L_{gcm}\text{-}\kappa$ is middle $O(\log\log(n))$-space bounded.

We are therefore able to state that the middle and strong space bounds for alternating TMs with general input alphabets in Table 1 are also tight for the real-time versions of these machines.
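The padding construction can be made concrete with a small Python sketch. It models only the strings involved, namely the homomorphism $h_\kappa$, the padded witness $w'$ from the proof of Lemma 1, and a brute-force membership test for $L_{gcm}$. It does not model the machines $D_\kappa$ or $A$, and the names `h_kappa`, `pad` and `in_L_gcm` are illustrative assumptions rather than notation from the paper.

```python
KAPPA = "κ"  # the padding symbol, assumed not to occur in the base alphabet


def h_kappa(padded):
    """Erase every padding symbol, as the homomorphism h_κ does."""
    return padded.replace(KAPPA, "")


def pad(word, t):
    """Insert t padding symbols after every symbol of `word` (the w' of Lemma 1)."""
    return "".join(symbol + KAPPA * t for symbol in word)


def in_L_gcm(word):
    """Brute-force test for a^m b^M with M a common multiple of all i <= m.

    This is a direct check of the definition, not the log log n machine A.
    """
    m = len(word) - len(word.lstrip("a"))   # length of the leading run of a's
    M = len(word) - m
    if word != "a" * m + "b" * M:
        return False
    return all(M % i == 0 for i in range(1, m + 1))


w = "aaa" + "b" * 6          # 6 is a common multiple of 1, 2 and 3
assert in_L_gcm(w)
assert h_kappa(pad(w, 2)) == w
```

Any padded string maps back to the same $w$ under `h_kappa`, which is why the space used on the padded input can be compared directly with the space used on $w$ in Theorem 2.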
Remark 2.
There is no analogue of Theorem 2 for weak space, because some accepting path may use more space but less time (i.e. perform fewer stationary moves) than the accepting path with the best space usage in the one-way machine, and the padding process described above may then yield some strings whose only accepting paths use unacceptably large amounts of space in the real-time machine. The cases covered in Theorem 2 avoid this problem by requiring all computations under consideration to remain within the space bound.

To examine the weak space bound case for nondeterministic machines with general input alphabets, we focus on the one-way machine $N$, which recognizes $L_{j \neq k} = \{ a^j b^k \mid j \neq k \}$ and is used in [Sze94] (Lemma 4.1.3 on page 23) to establish the corresponding lower bound in Table 1: The work tape of $N$ is divided into three tracks. At the beginning of the computation, $N$ nondeterministically selects a number $l > 1$, and writes it in binary on the top track. Any string containing an $a$ after a $b$ is rejected. For strings of the form $a^r b^s$, $N$ calculates and stores the value of $r \bmod (l)$ (resp., $s \bmod (l)$) on the middle track (resp., bottom track). When the endmarker is read, the numbers in the middle and bottom tracks are compared. If they are not equal, the input is accepted. The nondeterministic path corresponding to $l$, namely $npath_l$, needs only $\Theta(\log(l))$ space.

If $r = s$, then $r \equiv s \bmod (l)$ for all $l > 1$, and the input would be rejected. Using a number-theoretical fact ([Sze94], page 22) which states that, whenever $r \neq s$, there exists a number $l \in O(\log(r+s))$ such that $r \not\equiv s \bmod (l)$, one concludes that $N$ is weakly $\log\log(n)$-space bounded, and recognizes the nonregular language $L_{j \neq k}$.

We will modify $N$ to obtain a real-time TM recognizing a padded version of $L_{j \neq k}$, taking extra care to handle the additional complications of weak space bounds mentioned in the above remark. We do this by making sure that nondeterministic paths with larger values of $l$ have greater runtimes than those with small values of $l$.

We rewrite the program of $N$ such that $d_l$, that is, the number of stationary steps performed during the processing of a single input symbol by the nondeterministic path containing $l$ on the top track, depends solely on $l$. For example, we can use the following strategy:
– The work tape head is always placed on the leftmost nonblank symbol before reading the next input symbol, and,
– The operations on the work tape are performed by sweeping the work tape head, which operates at the speed of one cell per step, between the leftmost and rightmost nonblank symbols $c > 0$ times.
For a suitable value of $c$, $d_l$ can be set to $c \lceil \log l \rceil + k$ for nondeterministic path $npath_l$, where the exact value of $k \in \mathbb{Z}$ depends on the way the numbers are stored on the work tape. Thus, we can guarantee that for any $l' > l$, the number of stationary steps (on the input tape) of $npath_{l'}$ cannot be less than that of $npath_l$. Let $N'$ be the one-way nondeterministic TM implementing this modified algorithm, and $L'_{(j \neq k)}\text{-}\kappa$ be the language recognized by $N'_\kappa$.

Lemma 2. The real-time nondeterministic TM $N'_\kappa$ recognizing the nonregular language $L'_{(j \neq k)}\text{-}\kappa$ is weakly $O(\log\log(n))$-space bounded.

Proof. Let $w = a^r b^s$ be a member of $L_{j \neq k}$, and $l$ be the smallest number satisfying $r \not\equiv s \bmod (l)$.
Then, for all $w_\kappa \in L'_{(j \neq k)}\text{-}\kappa$ satisfying $h_\kappa(w_\kappa) = w$, the nondeterministic path of $N'_\kappa$ corresponding to $l$ accepts the input.

Table 2 summarizes the results presented in this section.

Table 2. Lower space bounds for nonregular language recognition by real-time TMs. The bounds we have shown to be tight are marked with an asterisk.

                        General input alphabet            Unary input alphabet
                        Strong   Middle      Weak         Strong   Middle   Weak
Deterministic TM        log n*   log n*      log n*       log n*   log n*   log n*
Nondeterministic TM     log n*   log n*      log log n*   log n*   log n*   log log n
Alternating TM          log n*   log log n*  log log n*   log n*   log n*   log log n

The question of the minimum amount of useful space is also interesting for more restricted models, like pushdown or counter automata. The following facts are known about (one-way) pushdown automata (PDAs) with unrestricted pushdown alphabets:
Fact 1.
All deterministic PDAs which recognize nonregular languages are weakly $\Theta(n)$-space bounded [Gab84].

Fact 2.
There exists a weakly $\log n$-space bounded nondeterministic PDA that recognizes a nonregular language [Rei07].

Counter automata [FMR67] (essentially, PDAs with a unary pushdown alphabet) do not seem to have been investigated deeply. As an easy corollary of Fact 1, we have
Fact 3. $o(n)$-space bounded real-time deterministic counter automata cannot recognize nonregular languages.

We have considered only a single work tape in our real-time models until now. It is known [FMR67] that increasing the number of work tapes does increase the computational power of these machines. We show that deterministic PDAs and counter automata with two work tapes can recognize nonregular languages using sublinear space, doing better than the bound of Fact 1 for single-stack machines:

Let $(i)_2$ denote the binary representation of $i \in \mathbb{N}$. Let $(i)_2^r$ denote the reverse of $(i)_2$. Consider the language

$L_{even-rev-bins} = \{ a (0)_2 a (1)_2^r a (2)_2 a (3)_2^r a \cdots a (2k)_2 a (2k+1)_2^r \mid k > 0 \}$. (1)

In the following, substrings of the form $\{0,1\}^*$ delimited by $a$'s will be called "blocks," and the term $block_i$ will denote the $i$'th block to be encountered by the one-way input head.

Theorem 3. $L_{even-rev-bins}$ can be recognized by a deterministic real-time PDA with two stacks, and the total amount of space used on the stacks for accepted strings is $O(\log n)$.

Proof. The machine uses one stack for checking that the members of the pairs in $\{ (block_1, block_2), (block_3, block_4), \ldots \}$ are related according to the language definition, whereas the other stack is used for the set $\{ (block_2, block_3), (block_4, block_5), \ldots \}$.

Inputs that cause the machine to compare blocks all the way to the last one, say, $block_i$, incur the greatest space cost. The space used in the stacks in that case is $O(\log i)$, and the length of the input prefix is clearly more than $i$. We conclude that the machine uses $O(\log n)$ space.

Theorem 4.
For any $j > 1$, there exists a nonregular language that can be recognized by a deterministic real-time automaton with $j$ (unary) counters, such that the total amount of space used on the counters is $O(n^{1/j})$ for all input strings.

Proof. We start by considering the language

$L_2 = \{ a_2 a_1 a_2 a_1^2 a_2 a_1^3 a_2 \cdots a_2 a_1^k \mid k \in \mathbb{N} \}$

(inspired by [ABB80,Gab84]) on the alphabet $\{a_1, a_2\}$. $L_2$ is recognized by a deterministic real-time automaton with two counters. The algorithm, say, $A$, is similar to, in fact simpler than, that of Theorem 3. As for the space usage, assume that the machine arrives at a decision to accept or reject after scanning the $i$'th block of $a_1$'s. Since the length of the scanned input prefix is more than $\sum_{l=1}^{i-1} l + 1 = \Theta(i^2)$, and the counters have used $O(i)$ space up to that point, the machine is strongly $O(\sqrt{n})$-space bounded. Let $w_{j,k}$ denote the $k$'th shortest element of $L_j$. Note that $A$ accepts $w_{2,k}$ with one counter containing zero, and the other counter containing $k$.

We define

$L_3 = \{ a_3 w_{2,1} a_3 w_{2,2} a_3 w_{2,3} a_3 \cdots a_3 w_{2,k} \mid k \in \mathbb{N} \}$

on the alphabet $\{a_1, a_2, a_3\}$. A deterministic real-time machine with three counters can recognize $L_3$ as follows. The automaton processes each block delimited by $a_3$'s by using two of the counters for implementing $A$. If the input head arrives at the $i$'th $a_3$ ($i > 1$), two of the counters contain zero and the third contains $i - 1$. The two counters containing zero are now used to run $A$ on the $i$'th block, while the remaining counter is used to check that the number of $a_2$'s in the $i$'th block is indeed $i$. Two counters have value zero, and the third one contains $k$ when this machine accepts $w_{3,k}$. By a reasoning similar to the one about $L_2$ above, the space usage is $O(n^{1/3})$. The generalization to

$L_{j+1} = \{ a_{j+1} w_{j,1} a_{j+1} w_{j,2} a_{j+1} w_{j,3} a_{j+1} \cdots a_{j+1} w_{j,k} \mid k \in \mathbb{N} \}$

is now simple.

Even with one work tape, the probabilistic versions of these machines [HS10,Fre79] also require less space than the deterministic versions for recognizing some languages. The language $L_{even-rev-bins}$ of Equation 1 can be recognized by a real-time probabilistic PDA with error bound $\frac{1}{3}$ as follows: Reject the input at the start with probability $\frac{1}{3}$. With the remaining probability, toss a fair coin to split into two computational paths, each using its single stack to mimic the corresponding stack of Theorem 3. If the input is in $L_{even-rev-bins}$, both paths accept, yielding an overall acceptance probability of $\frac{2}{3}$. Otherwise, at least one path rejects, resulting in an acceptance probability of at most $\frac{1}{3}$. The space requirement for members of the language is $O(\log n)$, as in Theorem 3.

With a similar approach, real-time probabilistic automata with one counter that can recognize any language of the $L_j$ family from Theorem 4 using $O(n^{1/j})$ space for member strings, albeit with error bounds that increase with $j$, can be constructed.

Open questions
We conclude with a list of problems for future work.

An obvious question left open in Section 2 is whether the double logarithmic lower bounds for the recognition of nonregular tally languages by real-time nondeterministic and alternating TMs are tight.

How does alternation affect the amount of useful space for (both one-way and real-time) PDAs and counter automata?

What are the lower total space bounds for nonregular language recognition for machines with multiple work tapes, as exemplified in Section 3?

Real-time "nondeterministic" quantum Turing machines (i.e., those that accept with nonzero probability if and only if the input is a member of the language) are known [YS10] to be able to recognize nonregular languages even with constant space, whereas no such language can be recognized by small-space probabilistic Turing machines in this mode, since these are synonymous with nondeterministic TMs (Table 2). When two-sided unbounded error is allowed, probabilistic finite automata can recognize nonregular languages [Rab63a]. What are the minimum space requirements for real-time nonregular language recognition of probabilistic TMs in the bounded-error regime [Fre85]? Can one improve on the space usage of the probabilistic machines mentioned at the end of Section 3?

What is the minimum amount of space required for a real-time quantum Turing machine to recognize a nonregular language with bounded error?
Acknowledgements.
We are grateful for the constructive comments of the two anonymous reviewers. We also thank Stefan D. Bruda, Klaus Reinhardt, Giovanni Pighizzini, Juraj Hromkovič, and Rūsiņš Freivalds for their helpful answers to our questions.
References
ABB80. Jean-Michel Autebert, Joffroy Beauquier, and Luc Boasson. Formal Language Theory: Perspectives and Open Problems, chapter Very small families of algebraic nonrational languages. Academic Press, 1980.
AI99. Masami Amano and Kazuo Iwama. Undecidability on quantum finite automata. In STOC'99: Proceedings of the thirty-first annual ACM symposium on Theory of computing, pages 368–375. ACM, 1999.
BMP94a. Alberto Bertoni, Carlo Mereghetti, and Giovanni Pighizzini. Corrigendum: An optimal lower bound for nonregular languages. Information Processing Letters, 52(6):339, 1994.
BMP94b. Alberto Bertoni, Carlo Mereghetti, and Giovanni Pighizzini. On languages accepted with simultaneous complexity bounds and their ranking problem. In MFCS, volume 841 of LNCS, pages 245–255, 1994.
BMP94c. Alberto Bertoni, Carlo Mereghetti, and Giovanni Pighizzini. An optimal lower bound for nonregular languages. Information Processing Letters, 50(6):289–292, 1994.
BMP95. Alberto Bertoni, Carlo Mereghetti, and Giovanni Pighizzini. Strong optimal lower bounds for Turing machines that accept nonregular languages. In MFCS, volume 969 of LNCS, pages 309–318, 1995.
FK95. Rūsiņš Freivalds and Marek Karpinski. Lower time bounds for randomized computation. In Automata, Languages and Programming, 22nd International Colloquium, ICALP95 Szeged, Hungary, July 10–14, 1995, Proceedings, pages 183–195. Springer-Verlag, 1995.
FMR67. Patrick C. Fischer, Albert R. Meyer, and Arnold L. Rosenberg. Real time counter machines (preliminary version). In FOCS'67, pages 148–154, 1967.
Fre79. Rūsiņš Freivalds. Fast probabilistic algorithms. In Mathematical Foundations of Computer Science 1979, volume 74 of LNCS, pages 57–69, 1979.
Fre85. Rūsiņš Freivalds. Space and reversal complexity of probabilistic one-way Turing machines. Annals of Discrete Mathematics, 24:39–50, 1985.
Gab84. Joaquim Gabarró. Pushdown space complexity and related full-A.F.L.s. In STACS'84: Proceedings of the Symposium on Theoretical Aspects of Computer Science, pages 250–259, 1984.
GMP98. Viliam Geffert, Carlo Mereghetti, and Giovanni Pighizzini. Sublogarithmic bounds on space and reversals. SIAM Journal on Computing, 28(1):325–340, 1998.
HS10. Juraj Hromkovič and Georg Schnitger. On probabilistic pushdown automata. Information and Computation, 208:982–995, 2010.
MP95. Carlo Mereghetti and Giovanni Pighizzini. A remark on middle space bounded alternating Turing machines. Information Processing Letters, 56:229–232, 1995.
Pig09. Giovanni Pighizzini. Nondeterministic one-tape off-line Turing machines. Journal of Automata, Languages and Combinatorics, 14(1):107–124, 2009.
Rab63a. Michael O. Rabin. Probabilistic automata. Information and Control, 6:230–243, 1963.
Rab63b. Michael O. Rabin. Real-time computation. Israel Journal of Mathematics, 1(4), 1963.
Rei07. Klaus Reinhardt. A tree-height hierarchy of context-free languages. International Journal of Foundations of Computer Science, 18(6):1383–1394, 2007.
SHI65. Richard Edwin Stearns, Juris Hartmanis, and Philip M. Lewis II. Hierarchies of memory limited computations. In IEEE Conference Record on Switching Circuit Theory and Logical Design, pages 179–190, 1965.
Sip06. Michael Sipser. Introduction to the Theory of Computation, 2nd edition. Thomson Course Technology, United States of America, 2006.
Sze88. Andrzej Szepietowski. Remarks on languages acceptable in log n space. Information Processing Letters, 27:201–203, 1988.
Sze94. Andrzej Szepietowski. Turing Machines with Sublogarithmic Space. Springer-Verlag, 1994.
YFSA12. Abuzer Yakaryılmaz, Rūsiņš Freivalds, A. C. Cem Say, and Ruben Agadzanyan. Quantum computation with write-only memory. Natural Computing, 11(1):81–94, 2012.
YS10. Abuzer Yakaryılmaz and A. C. Cem Say. Languages recognized by nondeterministic quantum finite automata. Quantum Information and Computation, 10(9&10):747–770, 2010.
YS11. Abuzer Yakaryılmaz and A. C. Cem Say. Unbounded-error quantum computation with small space bounds.