SMALL OVERLAP MONOIDS: THE WORD PROBLEM
Mark Kambites
School of Mathematics, University of Manchester, Manchester M13 9PL, England.
Abstract.
We develop a combinatorial approach to the study of semigroups and monoids with finite presentations satisfying small overlap conditions. In contrast to existing geometric methods, our approach facilitates a sequential left-right analysis of words which lends itself to the development of practical, efficient computational algorithms. In particular, we obtain a highly practical linear time solution to the word problem for monoids and semigroups with finite presentations satisfying the condition C(4), and a polynomial time solution to the uniform word problem for presentations satisfying the same condition.

Small overlap conditions are simple and natural combinatorial conditions on semigroup and monoid presentations, which serve to limit the complexity of derivation sequences between equivalent words in the generators. They form a natural semigroup-theoretic analogue of the small cancellation conditions which are extensively used in combinatorial and computational group theory [5]. It is well known that every group admitting a finite presentation satisfying suitable small cancellation conditions is word hyperbolic in the sense of Gromov [2], and in particular has word problem solvable in linear time.

In the 1970s, Remmers [6, 7] developed an elegant geometric theory of small overlap semigroups, using the natural semigroup-theoretic analogue of the van Kampen diagrams extensively employed in combinatorial group theory (see for example [5]). He applied his methods to show that semigroups satisfying sufficiently small overlap conditions have what would now be called linear Dehn function, that is, that the minimum length of a derivation sequence between any two equivalent words is bounded above by a linear function of the word lengths.
In theory, it follows immediately that one can test if two words in the generators for such a semigroup are equivalent, by exhaustively searching the (finite) space of all applicable derivation sequences of the given length, to see if any of them transforms one word to the other. However, the number of possible derivation sequences, and hence the time complexity of this algorithm, is exponential in the word length. More sophisticated techniques (such as applications of graph reachability algorithms) are of course applicable, but the problem remains one of searching a space of exponential size, and so we cannot really hope that this approach will lead to a tractable solution for the word problem. The question naturally arises, then, of how hard the word problem really is in these semigroups.
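The exhaustive search described above is easy to sketch directly. The following naive breadth-first search over derivation sequences is our own illustration (the representation of a presentation as a list of string pairs is an assumption, not taken from the paper), and it makes the exponential behaviour concrete:

```python
from collections import deque

def equivalent_bruteforce(u, v, relations, max_steps):
    """Search all derivation sequences of length at most max_steps,
    applying defining relations in both directions. The set of reachable
    words grows exponentially with max_steps, so this is usable only for
    tiny presentations and very short words."""
    rules = [(l, r) for l, r in relations] + [(r, l) for l, r in relations]
    seen = {u}
    frontier = deque([u])
    for _ in range(max_steps):
        nxt = deque()
        while frontier:
            w = frontier.popleft()
            if w == v:
                return True
            for lhs, rhs in rules:
                # rewrite every occurrence of lhs in w by rhs
                start = w.find(lhs)
                while start != -1:
                    w2 = w[:start] + rhs + w[start + len(lhs):]
                    if w2 not in seen:
                        seen.add(w2)
                        nxt.append(w2)
                    start = w.find(lhs, start + 1)
        frontier = nxt
    return v in seen
```

For a semigroup with linear Dehn function, taking max_steps linear in the lengths of u and v makes this a correct, but exponential-time, decision procedure.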
In this paper, we develop a new approach to the study of this important class of semigroups and monoids, along purely combinatorial lines. While our work lacks some of the mathematical elegance of Remmers’ approach — indeed our foundational results are of a rather technical nature and our proofs mainly by case analysis — it has the advantage of permitting a sequential (left-right) analysis of elements, which for computational purposes seems more relevant than a geometric viewpoint. Two computational consequences of the theory we develop are of particular interest. The first is a linear time (on a two-tape Turing machine) algorithm to solve the word problem in any semigroup with a presentation satisfying Remmers’ condition C(4). The second is a polynomial time (more precisely, in the RAM model, quadratic in the presentation length and linear in the word length) solution to the uniform word problem for presentations satisfying the same condition. While the proofs of correctness and of the time complexity bounds for these algorithms are rather technical, the algorithms themselves are quite straightforward to describe and eminently suitable for practical implementation; the author is currently working on an implementation for the GAP computer algebra system [1].

In addition to this introduction, this paper comprises five sections. In Section 1 we briefly recall the definitions of small overlap semigroups and monoids, together with some of their properties, and introduce some notation and terminology which will be used in the rest of the paper. Section 2 establishes some technical, but nonetheless important, combinatorial properties of small overlap monoids, which are then used in Section 3 to give a sequential characterisation of equivalence for two words in the generators of a C(4) presentation. Section 4 shows how this characterisation can be used to develop a linear time algorithm for the solution of the word problem of a fixed small overlap presentation.
Finally, in Section 5 we apply our techniques to the solution of the uniform word problem for C(4) presentations; we also observe that one can test efficiently whether an arbitrary presentation satisfies the condition C(4).

The relationship of this work to the geometric approach developed by Remmers [6] perhaps deserves a further comment. As already mentioned, our approach to small overlap semigroups is entirely combinatorial and, in its finished state, makes no direct use of Remmers’ geometric machinery. However, the author would most likely never have arrived at this viewpoint without the insight and intuition afforded by Remmers’ approach, and the reader interested in fully understanding the present paper may find it helpful to study also Remmers’ work in parallel. Some of his results have been given a very accessible treatment by Higgins [3], but unfortunately the only complete source still seems to be his thesis [6].

1. Preliminaries
We assume familiarity with basic notions of combinatorial semigroup theory, including free semigroups and monoids, and semigroup and monoid presentations. In all but Section 5 of the paper, which is devoted to uniform decision problems, we assume we have a fixed finite presentation for a
monoid (or semigroup — we shall see shortly that the difference is unimportant). Words are assumed to be drawn from the free monoid on the generating alphabet unless otherwise stated. We write u = v to indicate that two words are equal in the free monoid, and u ≡ v to indicate that they represent the same element of the semigroup presented. We say that a word p is a possible prefix of u if there exists a (possibly empty) word w with pw ≡ u, that is, if the element represented by u lies in the right ideal generated by the element represented by p. The empty word is denoted ε.

A relation word is a word which occurs as one side of a relation in the presentation. A piece is a word in the generators which occurs as a factor in sides of two different relations, or as a factor of both sides of a relation, or in two different (possibly overlapping) places within one side of a relation. To ensure a uniform treatment for free semigroups and monoids, we make the convention that the empty word ε is always a piece, even if the presentation has no relations.

The presentation is said to satisfy the condition C(n), where n is a positive integer, if no relation word can be written as the product of strictly fewer than n pieces. Thus for each n, C(n + 1) is a strictly stronger condition than C(n). We briefly mention another related condition. The presentation satisfies the condition OL(x), where 0 ≤ x ≤ 1, if whenever a piece p occurs as a factor of a relation word R we have |p| < x|R|. Notice that if n is a positive integer, then a semigroup satisfying OL(1/n) will certainly satisfy C(n + 1).

The weakest meaningful small overlap condition, C(1), says that no relation word is a product of zero pieces, that is, that ε is not a relation word. From this we see that in a small overlap monoid presentation, no non-empty word can be equivalent to the empty word, that is, no non-empty word can represent the identity.
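These definitions can be checked mechanically. The sketch below is our own naive implementation (the efficient uniform test is deferred to Section 5): it computes the set of pieces of a presentation, given as a list of pairs of relation words, and tests the condition C(n) by dynamic programming over factorisations into pieces:

```python
def factors(w):
    """All non-empty factors (contiguous substrings) of w."""
    return {w[i:j] for i in range(len(w)) for j in range(i + 1, len(w) + 1)}

def pieces(relations):
    """A piece occurs as a factor of relation words in two distinct places:
    in two different relation word occurrences (including the two sides of
    one relation), or at two different, possibly overlapping, positions
    within one relation word. By convention the empty word is a piece."""
    rel_words = [w for rel in relations for w in rel]
    ps = {""}
    for i, u in enumerate(rel_words):
        for f in factors(u):
            if sum(1 for k in range(len(u)) if u.startswith(f, k)) >= 2:
                ps.add(f)  # repeated factor within one relation word
        for v in rel_words[i + 1:]:
            ps |= factors(u) & factors(v)  # shared with another occurrence
    return ps

def min_piece_count(w, ps):
    """Minimum number of pieces whose product is w, or None if w is not a
    product of pieces at all (dynamic programming over prefixes of w)."""
    INF = float("inf")
    best = [0] + [INF] * len(w)
    for j in range(1, len(w) + 1):
        for i in range(j):
            if w[i:j] in ps:
                best[j] = min(best[j], best[i] + 1)
    return None if best[len(w)] == INF else best[len(w)]

def satisfies_C(relations, n):
    """C(n): no relation word is a product of strictly fewer than n pieces."""
    ps = pieces(relations)
    return all(
        (m := min_piece_count(w, ps)) is None or m >= n
        for rel in relations for w in rel
    )
```

For instance, with the single relation (abab, c) the pieces are ε, a, b and ab, so abab is a product of two pieces: such a presentation satisfies C(2) but not C(3).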
It follows that every small overlap monoid presentation is also interpretable as a semigroup presentation, and that the monoid presented is isomorphic to the semigroup presented with an adjoined identity element. For simplicity in what follows we shall focus upon small overlap monoids, but from each of our results one can immediately deduce a corresponding result for small overlap semigroups.

For each relation word R, let X_R and Z_R denote respectively the longest prefix of R which is a piece, and the longest suffix of R which is a piece. If the presentation satisfies C(3) then R cannot be written as a product of two pieces, so this prefix and suffix cannot meet; thus, R admits a factorisation X_R Y_R Z_R for some non-empty word Y_R. If moreover the presentation satisfies the stronger condition C(4) then R cannot be written as a product of three pieces, so Y_R is not a piece. The converse also holds: a C(3) presentation such that no Y_R is a piece is a C(4) presentation. We call X_R, Y_R and Z_R the maximal piece prefix, the middle word and the maximal piece suffix respectively of R.

Assuming now that the presentation satisfies at least the condition C(3), we shall use the letters X, Y and Z (sometimes with adornments or subscripts) exclusively to represent maximal piece prefixes, middle words and maximal piece suffixes respectively of relation words; two such letters with the same subscript or adornment (or with none) will be assumed to stand for the appropriate factors of the same relation word.

If R is a relation word we write R̄ for the (necessarily unique, as a result of the small overlap condition) word such that (R, R̄) or (
R̄, R) is a relation in the presentation. We write X̄_R, Ȳ_R and Z̄_R for X_{R̄}, Y_{R̄} and Z_{R̄} respectively. (This is an abuse of notation since, for example, the word X_R may be a maximal piece prefix of two distinct relation words, but we shall be careful to ensure that the meaning is clear from the context.)

2. Weak Cancellation Properties
To perform efficient computations with words, it is very helpful to be able to process them in a sequential, left-right manner. To facilitate this in the case of the word problem for small overlap monoids, we need to know what can be deduced about the equivalence (or non-equivalence) of two words from prefixes of those words. This section develops a theory with this end in mind, including a number of results which can be viewed as weak cancellativity conditions satisfied by small overlap monoids. We assume throughout a fixed monoid presentation satisfying the small overlap condition C(4).

We first introduce some terminology. A relation prefix of a word is a prefix which admits a (necessarily unique, as a consequence of the small overlap condition) factorisation of the form aXY where X and Y are the maximal piece prefix and middle word respectively of some relation word XYZ. An overlap prefix (of length n) of a word u is a relation prefix which admits an (again necessarily unique) factorisation of the form bX_1Y′_1X_2Y′_2 · · · X_{n-1}Y′_{n-1}X_nY_n where

• n ≥ 1;
• no factor of the form XY, with X the maximal piece prefix and Y the middle word of some relation word, begins before the end of the prefix b;
• for each 1 ≤ i ≤ n, R_i = X_iY_iZ_i is a relation word with X_i and Z_i the maximal piece prefix and suffix respectively; and
• for each 1 ≤ i < n, Y′_i is a proper, non-empty prefix of Y_i.

Notice that if a word has a relation prefix, then the shortest such must be an overlap prefix. A relation prefix aXY of a word u is called clean if u does not have a prefix aXY′X̂Ŷ where X̂ and Ŷ are the maximal piece prefix and middle word respectively of some relation word, and Y′ is a proper, non-empty prefix of Y. Clean overlap prefixes, in particular, will play a crucial role in what follows.

Proposition 1.
Let aX_1Y′_1X_2Y′_2 · · · X_{n-1}Y′_{n-1}X_nY_n be an overlap prefix of some word. Then this prefix contains no relation word as a factor (except possibly X_nY_n in the case that Z_n = ε).

Proof. Suppose that the given overlap prefix contains a relation word R as a factor. By the definition of an overlap prefix, no occurrence of R can begin before the end of the prefix a, so we may assume that R is a factor of X_1Y′_1X_2Y′_2 · · · X_nY_n. It follows that either R contains X_iY′_i as a factor for some i, or else R is a factor of X_iY′_iX_{i+1}Y′_{i+1} for some i (where Y′_{i+1} = Y_n if i + 1 = n) and we may assume without loss of generality that the occurrence of R overlaps non-trivially with the prefix X_iY′_i.

In the former case, since X_i is a maximal piece prefix of X_iY_iZ_i and Y′_i is non-empty, X_iY′_i cannot be a piece; it follows then that we must have R = X_iY_iZ_i with the occurrence in the obvious place. In the latter case, R is the product of a non-empty factor of X_iY_iZ_i with a factor of X_{i+1}Y_{i+1}Z_{i+1}; but by the small overlap assumption, R cannot be written as a product of two pieces, so it must again be that R = X_iY_iZ_i with the occurrence in the obvious place.

Now if i = n then, since R is a factor of the given relation prefix, we must clearly have R = X_iY_iZ_i = X_iY_i so that Z_i = ε. On the other hand, if i < n then either X_iY_iZ_i contains X_{i+1}Y′_{i+1} as a factor, which contradicts the fact that X_{i+1} is a maximal piece prefix of X_{i+1}Y_{i+1}Z_{i+1}, or else (recalling that Y′_i is a proper prefix of Y_i) we see that X_{i+1}Y′_{i+1} contains a non-empty suffix of Y_i followed by Z_i, which contradicts the fact that Z_i is a maximal piece suffix of X_iY_iZ_i. ∎

Proposition 2.
Let u be a word. Every overlap prefix of u is contained in a clean overlap prefix of u.

Proof. We fix u and prove the result by induction on the difference between the length of u and the length of the given overlap prefix, that is, on the length of that part of u not contained in the given overlap prefix. For the base case, observe that an overlap prefix constituting the whole of u is necessarily clean. Now suppose aX_1Y′_1 · · · X_nY_n is an overlap prefix, and that the result holds for longer overlap prefixes of u. If the given prefix is clean then there is nothing to prove. Otherwise, by the definition of a clean overlap prefix, there exist words X and Y, being the maximal piece prefix and the middle word respectively of some relation word, and a proper non-empty prefix Y′_n of Y_n such that

aX_1Y′_1 · · · X_nY′_nXY

is a prefix of u. Clearly this is an overlap prefix of u which is strictly longer than the original one, and so by induction is contained in a clean overlap prefix of u. But now the original overlap prefix of u is contained in a clean overlap prefix, as required. ∎

Corollary 1.
If a word u has no clean overlap prefix, then it contains no relation word as a factor, and so if u ≡ v then u = v.

Proof. Suppose u has no clean overlap prefix. If u contained a relation word as a factor then clearly it would have a relation prefix, that is, a prefix of the form aX_RY_R for some relation word R. But by our observations above, the shortest relation prefix of u would be an overlap prefix, and so by Proposition 2 would be contained in a clean overlap prefix of u, contradicting our assumption. Thus, u contains no relation word as a factor. It follows easily that no relations can be applied to u, so the only word equivalent to u is u itself. ∎

Lemma 1. If u = wXYZu′ with wXY a clean overlap prefix of u, then wX̄Ȳ is a clean overlap prefix of wX̄ȲZ̄u′.

Proof. Let

wXY = aX_1Y′_1 · · · X_nY′_nXY (1)

be the factorisation given by the definition of a clean overlap prefix. Then wX̄ȲZ̄u′ has a prefix

wX̄Ȳ = aX_1Y′_1 · · · X_nY′_nX̄Ȳ. (2)

If n ≥ 1, it follows immediately that wX̄Ȳ is an overlap prefix of wX̄ȲZ̄u′. In the case n = 0, however, we must consider the possibility that the prefix aX̄Ȳ = wX̄Ȳ contains a factor of the form X̂Ŷ overlapping the initial segment a. Suppose it does. Then recalling that Ŷ is not a piece, and so cannot be a factor of X̄Ȳ, we see that aX̄Ȳ admits a factorisation

aX̄Ȳ = bX̂Ŷ′X̄Ȳ (3)

for some non-empty prefix Ŷ′ of Ŷ. Moreover, Ŷ′ must be a proper prefix of Ŷ, or else a would have a factor X̂Ŷ, contradicting the fact that wXY was a clean overlap prefix of u. This shows that wX̄Ȳ is an overlap prefix of wX̄ȲZ̄u′.

It remains to show that the given overlap prefix is clean. Suppose for a contradiction that it is not. Then by definition, there is a factor of the form X̂Ŷ overlapping the end of the prefix aX̄Ȳ; but this factor is either contained in X̄ȲZ̄ (contradicting the supposition that X̂ is a maximal piece prefix of a relation word X̂ŶẐ) or contains a non-empty suffix of Ȳ followed by Z̄ (contradicting the assumption that Z̄ is a maximal piece suffix of X̄ȲZ̄).
∎

The following lemma is fundamental to our approach to C(4) monoids. With careful application it seems to permit a comparable understanding to that resulting from Remmers’ geometric theory, but in a purely combinatorial (and hence more computationally orientated) way.

Lemma 2.
Suppose a word u has clean overlap prefix wXY. If u ≡ v then v has overlap prefix either wXY or wX̄Ȳ, and no relation word occurring as a factor of v overlaps this prefix, unless it is XYZ or X̄ȲZ̄ as appropriate.

Proof.
Since wXY is an overlap prefix of u, it has by definition a factorisation wXY = aX_1Y′_1 · · · X_nY′_nXY for some n ≥
0. We use this fact to prove the claim by induction on the length r of a rewrite sequence (using the defining relations) from u to v.

In the case r = 0, we have u = v, so v certainly has (clean) overlap prefix wXY. By Proposition 1, no relation word factor can occur entirely within this prefix (unless it is XY and Z = ε). If a relation word factor of v overlaps the end of the given overlap prefix and entirely contains XY then, since XY is not a piece, that relation word must clearly be XYZ. Finally, a relation word cannot overlap the end of the given overlap prefix but not contain the suffix XY, since this would clearly contradict the fact that the given overlap prefix is clean.

Suppose now for induction that the lemma holds for all values less than r, and that there is a rewrite sequence from u to v of length r. Let u_1 be the second term in the sequence, so that u_1 is obtained from u by a single rewrite using the defining relations, and v from u_1 by r − 1 further rewrites.
Consider the relation word in u which is to be rewritten in order to obtain u_1, and in particular its position in u. By Proposition 1, this relation word cannot be contained in the clean overlap prefix wXY, unless it is XY where Z = ε.

Suppose first that the relation word to be rewritten contains the final factor Y of the given clean overlap prefix. (Note that this covers in particular the case that the relation word is XY and Z = ε.) From the C(4) assumption we know that Y is not a piece, so we may deduce that the relation word is XYZ contained in the obvious place. In this case, applying the rewrite clearly leaves u_1 with a prefix wX̄Ȳ, and by Lemma 1, this is a clean overlap prefix. Now v can be obtained from u_1 by r − 1 rewrites, so by the inductive hypothesis, v has overlap prefix either wX̄Ȳ or wXY, and no relation word occurring as a factor of v overlaps this prefix, unless it is X̄ȲZ̄ or XYZ as appropriate; this completes the proof in this case.

Next, we consider the case in which the relation word factor in u to be rewritten does not contain the final factor Y of the clean overlap prefix, but does overlap with the end of the clean overlap prefix. Then u has a factor of the form X̂Ŷ, where X̂ is the maximal piece prefix and Ŷ the middle word of a relation word, which overlaps the final factor XY, beginning after the start of Y. This clearly contradicts the assumption that the overlap prefix is clean.

Finally, we consider the case in which the relation word factor in u which is to be rewritten does not overlap the given clean overlap prefix at all. Then obviously, the given clean overlap prefix of u remains an overlap prefix of u_1. If this overlap prefix is clean, then a simple application of the inductive hypothesis again suffices to prove that v has the required property.

There remains, then, only the case in which the given overlap prefix is no longer clean in u_1.
Then by definition there exist words X̂ and Ŷ, being a maximal piece prefix and middle word respectively of some relation word, such that u_1 has the prefix

aX_1Y′_1 · · · X_{n-1}Y′_{n-1}X_nY′_nX̂Ŷ

for some proper, non-empty prefix Y′_n of Y_n (where X_nY_n denotes the final factor XY of the clean overlap prefix). Now certainly this is not a prefix of u, since this would contradict the assumption that aX_1Y′_1 · · · X_nY_n is a clean overlap prefix of u. So we deduce that u must contain a relation word overlapping the final X̂Ŷ. This relation word cannot contain the final factor X̂Ŷ, since this would again contradict the assumption that aX_1Y′_1 · · · X_nY_n is a clean overlap prefix of u. Nor can the relation word contain the final factor Ŷ, since Ŷ is not a piece. Hence, u must have a prefix

aX_1Y′_1 · · · X_{n-1}Y′_{n-1}X_nY′_nX̂Ŷ′R

for some proper, non-empty prefix Ŷ′ of Ŷ and some relation word R. Suppose R = X_RY_RZ_R where X_R and Z_R are the maximal piece prefix and suffix respectively. Then it is readily verified that

aX_1Y′_1 · · · X_{n-1}Y′_{n-1}X_nY′_nX̂Ŷ′X_RY_R

is a clean overlap prefix of u. But now by the inductive hypothesis, v has prefix either

aX_1Y′_1 · · · X_{n-1}Y′_{n-1}X_nY′_nX̂Ŷ′X_RY_R (4)

or

aX_1Y′_1 · · · X_{n-1}Y′_{n-1}X_nY′_nX̂Ŷ′X̄_RȲ_R (5)

and so in particular it certainly has prefix

aX_1Y′_1 · · · X_{n-1}Y′_{n-1}X_nY′_nX̂Ŷ′

which in turn is easily seen to have prefix

aX_1Y′_1 · · · X_{n-1}Y′_{n-1}X_nY_n. (6)

Moreover, by Proposition 1, the prefix (4) or (5) of v contains no relation word as a factor (unless it is the final factor X_RY_R and Z_R = ε), and it follows easily that no relation word factor of v overlaps the prefix (6) of v. ∎

The lemma has the following easy corollary.
Corollary 2.
Suppose a word u has (not necessarily clean) overlap prefix wXY. If u ≡ v then v has a prefix w and contains no relation word overlapping this prefix.

Proof. By Proposition 2 the overlap prefix wXY of u is contained in a clean overlap prefix w′X′Y′ of u. Now by Lemma 2, v has a prefix w′ and contains no relation word overlapping this prefix. But it is easily seen that w′ must be at least as long as w, so that v has a prefix w and contains no relation word overlapping this prefix, as required. ∎

The following proposition describes a very weak left cancellation property of small overlap monoids; it will allow us to restrict attention to words with a prefix of the form XY where X and Y are the maximal piece prefix and middle word respectively of some relation word.

Proposition 3.
Suppose a word u has an overlap prefix aXY and that u = aXYu′′. Then u ≡ v if and only if v = av′ where v′ ≡ XYu′′.

Proof. Clearly if v = av′ with v′ ≡ XYu′′ then it is immediate that v = av′ ≡ aXYu′′ = u.

Conversely, suppose u ≡ v. Since aXY is an overlap prefix, by Proposition 1 it cannot contain a relation word starting before the end of a. By Corollary 2, v has prefix a, say v = av′. Now consider a rewrite sequence, using the defining relations, from u to v. Again using Corollary 2, every term in this sequence will have prefix a, and contain no relation word overlapping this prefix. It follows that the same sequence of rewrites can be applied to take XYu′′ to v′, so that v′ ≡ XYu′′ as required. ∎

We now introduce some more terminology. Let u be a word with shortest relation prefix aXY, and let p be a piece. We say that u is p-inactive if pu has shortest relation prefix paXY, and p-active otherwise. The following proposition describes another weak cancellation property of small overlap monoids.

Proposition 4.
Let u be a word and p a piece. If u is p-inactive then pu ≡ v if and only if v = pw for some w with u ≡ w.
Proof.
Suppose u has shortest relation prefix aXY, so that pu has shortest relation prefix paXY. Suppose u = aXYu′′. If pu ≡ v then by Proposition 3 (since the shortest relation prefix is clearly an overlap prefix), we have v = pav′ where v′ ≡ XYu′′. Now setting w = av′ we have v = pw and u = aXYu′′ ≡ av′ = w. The converse implication is obvious. ∎
Let Z_1 and Z_2 be maximal piece suffixes of relation words and suppose u is Z_1-active and Z_2-active. Then Z_1 and Z_2 have a common non-empty suffix, and if z is the maximal common suffix then

(i) u is z-active;
(ii) Z_1u ≡ v if and only if v = z_1v′ where z_1z = Z_1 and v′ ≡ zu; and
(iii) Z_2u ≡ v if and only if v = z_2v′ where z_2z = Z_2 and v′ ≡ zu.

Proof. Let bX_1Y_1 and cX_2Y_2 be the shortest relation prefixes of Z_1u and Z_2u respectively, where X_i and Y_i are the maximal piece prefix and middle word of a relation word R_i. Since u is Z_1-active and Z_2-active, we must have |b| < |Z_1| and |c| < |Z_2|. Moreover, since Z_1 is a piece and X_1 is a maximal piece prefix of the relation word R_1, we must have |Z_1| ≤ |bX_1|, and similarly |Z_2| ≤ |cX_2|.

It follows that u has prefixes X′_1Y_1 and X′_2Y_2 where X′_1 and X′_2 are proper (perhaps empty) suffixes of X_1 and X_2 respectively. Thus, one of X′_1Y_1 and X′_2Y_2 is a prefix of the other, and so either Y_1 is a factor of X′_2Y_2 and hence of R_2, or Y_2 is a factor of X′_1Y_1 and hence of R_1. But by the C(4) assumption, neither Y_1 nor Y_2 is a piece, so the only possible explanation is that R_1 and R_2 are the same relation word, and moreover X′_1 = X′_2.

Now let p be such that pX′_1 = X_1. We have already observed that X′_1 is a proper suffix of X_1, so p is non-empty. Now Z_1 = bp, and also pX′_2 = pX′_1 = X_1 = X_2, so by symmetry we have Z_2 = cp. Hence, p is a common non-empty suffix of Z_1 and Z_2.

Now let z be the maximal common suffix of Z_1 and Z_2. Let y, z_1 and z_2 be such that z = yp, Z_1 = z_1z and Z_2 = z_2z. Then clearly b = z_1y and c = z_2y. Now zu = ypu has a relation prefix yX_1Y_1, from which it is immediate that u is z-active, so that (i) holds.

To show that (ii) holds, let u′ be such that u = X′_1Y_1u′, and suppose Z_1u ≡ v. Now

Z_1u = z_1zX′_1Y_1u′ = z_1ypX′_1Y_1u′ = z_1yX_1Y_1u′

where z_1yX_1Y_1 is the shortest relation prefix, and hence is an overlap prefix. Hence, by Proposition 3 we have v = z_1yv′′ where v′′ ≡ X_1Y_1u′. But now setting v′ = yv′′ we have v = z_1v′, z_1z = Z_1 and

v′ = yv′′ ≡ yX_1Y_1u′ = ypX′_1Y_1u′ = zX′_1Y_1u′ = zu

as required.
Conversely, if v = z_1v′ where z_1z = Z_1 and v′ ≡ zu then we have

Z_1u = z_1zu ≡ z_1v′ = v.

This completes the proof that (ii) holds, and an entirely symmetric argument shows that (iii) holds. ∎
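The decomposition at the heart of Proposition 5 (the maximal common suffix z of Z_1 and Z_2 together with the complementary prefixes z_1 and z_2) is elementary string manipulation; a minimal sketch, with a function name of our own choosing:

```python
def suffix_decomposition(Z1, Z2):
    """Return (z, z1, z2) where z is the maximal common suffix of Z1 and
    Z2, and Z1 = z1 + z, Z2 = z2 + z, as in Proposition 5."""
    k = 0
    # walk backwards from the ends of Z1 and Z2 while the letters agree
    while k < min(len(Z1), len(Z2)) and Z1[len(Z1) - 1 - k] == Z2[len(Z2) - 1 - k]:
        k += 1
    return Z1[len(Z1) - k:], Z1[:len(Z1) - k], Z2[:len(Z2) - k]
```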
Corollary 3.
Let Z_1 and Z_2 be maximal piece suffixes of relation words. Suppose u is Z_1-active and Z_2u ≡ Z_2v. Then Z_1u ≡ Z_1v.

Proof. If u is Z_2-inactive then by Proposition 4 we have u ≡ v, and so certainly Z_1u ≡ Z_1v.

On the other hand, if u is Z_2-active then let z be the maximal common suffix of Z_1 and Z_2 and let z_1 and z_2 be such that z_1z = Z_1 and z_2z = Z_2. Then by Proposition 5(iii), since Z_2u ≡ Z_2v we have Z_2v = z_2v′ where v′ ≡ zu. But from z_2zv = Z_2v = z_2v′ we deduce that v′ = zv, so now we have

Z_1u = z_1zu ≡ z_1v′ = z_1zv = Z_1v. ∎
Let u and v be words and Z_1 and Z_2 be maximal piece suffixes of relation words. Suppose there exist words u = u_0, u_1, ..., u_n = v such that

Z_1u_0 ≡ Z_1u_1, Z_2u_1 ≡ Z_2u_2, Z_1u_2 ≡ Z_1u_3, ...,
(Z_2u_{n-1} ≡ Z_2u_n if n is even; Z_1u_{n-1} ≡ Z_1u_n if n is odd).

Then either Z_1u ≡ Z_1v or Z_2u ≡ Z_2v or both.

Proof. Fix u and v, and suppose n is minimal (allowing exchanging Z_1 and Z_2 if necessary) such that a sequence of equivalences as above exists. Suppose further for a contradiction that n ≥
2. If u_1 were Z_1-inactive then by Proposition 4 we would have u_0 ≡ u_1, so that Z_2u_0 ≡ Z_2u_1 ≡ Z_2u_2, contradicting the minimality assumption on n. Similarly, if u_1 were Z_2-inactive then we would have u_1 ≡ u_2, so that Z_1u_0 ≡ Z_1u_1 ≡ Z_1u_2, again contradicting the minimality assumption on n.

Thus, u_1 is both Z_1-active and Z_2-active. But now since Z_1u_0 ≡ Z_1u_1, we apply Corollary 3 to see that Z_2u_0 ≡ Z_2u_1 ≡ Z_2u_2, again providing the required contradiction. ∎

3. Sequential Characterisation of Equality
In this section we use the theory developed in Section 2 to provide a new characterisation of when two words in the generators of a small overlap presentation represent the same element of the monoid presented. In Section 4 we shall use this characterisation to develop an efficient algorithm to solve the word problem.

We first present a lemma which gives a set of mutually exclusive combinatorial conditions, the disjunction of which is necessary and sufficient for two words of a certain form to represent the same element.
Lemma 3.
Suppose u = XYu′ where XY is a clean overlap prefix of u. Then u ≡ v if and only if one of the following mutually exclusive conditions holds:

(1) u = XYZu′′ and v = XYZv′′ and either Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′ or both;
(2) u = XYu′, v = XYv′, Z fails to be a prefix of at least one of u′ and v′, and u′ ≡ v′;
(3) u = XYZu′′, v = X̄ȲZ̄v′′ and either Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′ or both;
(4) u = XYu′, v = X̄ȲZ̄v′′ but Z is not a prefix of u′, and u′ ≡ Zv′′;
(5) u = XYZu′′, v = X̄Ȳv′ but Z̄ is not a prefix of v′, and Z̄u′′ ≡ v′;
(6) u = XYu′, v = X̄Ȳv′, Z is not a prefix of u′ and Z̄ is not a prefix of v′, but Z = z_1z, Z̄ = z_2z, u′ = z_1u′′, v′ = z_2v′′ where u′′ ≡ v′′, z is the maximal common suffix of Z and Z̄, z is non-empty, and z is a possible prefix of u′′.

Proof. First we treat the claim that the conditions (1)–(6) are mutually exclusive. Since X is a maximal piece prefix of XYZ and Y is non-empty, XY is not a piece. An entirely similar argument shows that X̄Ȳ is not a piece. In particular, neither of XY and X̄Ȳ is a prefix of the other, and so v can have at most one of them as a prefix. Thus, conditions (1)–(2) are not consistent with conditions (3)–(6). The mutual exclusivity of (1) and (2) is self-evident from the definitions, and likewise that of (3)–(6).

It is easily verified that each of the conditions (1)–(5) implies that u ≡ v. We show next that (6) implies that u ≡ v. Since z is a possible prefix of u′′ and u′′ ≡ v′′, we may write u′′ ≡ zx ≡ v′′ for some word x. Now we have

u = XYu′ = XYz_1u′′ ≡ XYz_1zx = XYZx ≡ X̄ȲZ̄x = X̄Ȳz_2zx ≡ X̄Ȳz_2v′′ = X̄Ȳv′ = v.

What remains, which is the main burden of the proof, is to show that u ≡ v implies that at least one of the conditions (1)–(6) holds. To this end, then, suppose u ≡ v; then there is a rewriting sequence taking u to v.
By Lemma 2, every term in this sequence will have prefix either XY or X̄Ȳ, and this prefix can only be modified by the application of the relation (XYZ, X̄ȲZ̄) in the obvious place. We now prove the claim by case analysis.

By Lemma 2, v begins either with XY or with X̄Ȳ. Consider first the case in which v begins with XY; we split this into two further cases depending on whether u and v both begin with the full relation word XYZ; these will correspond respectively to conditions (1) and (2) in the statement of the lemma.
Case (1).
Suppose u = XYZu′′ and v = XYZv′′. Then clearly there is a rewriting sequence taking u to v which by Lemma 2 can be broken up as:

u = XYZu′′ →* XYZu_1 → X̄ȲZ̄u_1 →* X̄ȲZ̄u_2 → XYZu_2 →* · · · → XYZu_n →* XYZv′′ = v

where none of the steps in the sequences indicated by →* involves rewriting a relation word overlapping with the prefix XY or X̄Ȳ as appropriate. It follows that there are rewriting sequences

Zu′′ →* Zu_1, Z̄u_1 →* Z̄u_2, Zu_2 →* Zu_3, ..., Zu_n →* Zv′′.

Now by Corollary 4, either Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′, as required to show that condition (1) holds.

Case (2).
Suppose now that u = XYu′, v = XYv′ and Z fails to be a prefix of at least one of u′ and v′. We must show that u′ ≡ v′. We consider only the case that Z is not a prefix of u′; the case that Z is not a prefix of v′ is symmetric. We consider rewriting sequences from u = XYu′ to v = XYv′. Again using Lemma 2, we see that there is either (i) such a sequence taking u to v containing no rewrites of relation words overlapping the prefix XY, or (ii) such a sequence taking u to v which can be broken up as:

u = XYu′ →* XYZu_1 → X̄ȲZ̄u_1 →* X̄ȲZ̄u_2 → XYZu_2 →* · · · → XYZu_n →* XYv′ = v

where none of the intermediate words in the sequences indicated by →* contains a relation word overlapping with the prefix XY or X̄Ȳ as appropriate. In case (i) there is clearly a rewrite sequence taking u′ to v′, so that u′ ≡ v′ as required. In case (ii), there are rewriting sequences

u′ →* Zu_1, Z̄u_1 →* Z̄u_2, Zu_2 →* Zu_3, ..., Zu_n →* v′.

Notice that, since u′ does not begin with Z, we can deduce from Proposition 4 that u_1 is Z-active. By Corollary 4, either Zu_1 ≡ Zu_n or Z̄u_1 ≡ Z̄u_n. In the latter case, since u_1 is Z-active, Corollary 3 tells us that we also have Zu_1 ≡ Zu_n in any case. But now u′ ≡ Zu_1 ≡ Zu_n ≡ v′, so condition (2) holds and we are done.

We have now shown that if v begins with XY then either condition (1) or condition (2) holds. It remains to consider the case in which v begins with X̄Ȳ, and show that one of conditions (3)–(6) must be satisfied. We split the analysis here into four cases depending on whether u begins with the full relation word XYZ, and whether v begins with the full relation word X̄ȲZ̄; these four cases will correspond respectively to conditions (3)–(6) in the statement of the lemma.
Case (3).
Suppose u = XY Zu′′ and v = X̄Ȳ Z̄v′′. Then u = XY Zu′′ ≡ v = X̄Ȳ Z̄v′′ ≡ XY Zv′′, so by the same argument as in case (1) we have either Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′, as required to show that condition (3) holds.

Case (4).
Suppose u = XY u′ and v = X̄Ȳ Z̄v′′ but Z is not a prefix of u′. Then u = XY u′ ≡ v = X̄Ȳ Z̄v′′ ≡ XY Zv′′. Now applying the same argument as in case (2) (with XY Zv′′ in place of v and setting v′ = Zv′′) we have u′ ≡ v′ = Zv′′, so that condition (4) holds.

Case (5).
Suppose u = XY Zu′′, v = X̄Ȳ v′ but Z̄ is not a prefix of v′. Then we have X̄Ȳ Z̄u′′ ≡ u ≡ v = X̄Ȳ v′. Now applying the same argument as in case (2) (but with X̄Ȳ Z̄u′′ in place of u and setting u′ = Z̄u′′) we obtain v′ ≡ u′ = Z̄u′′, so that condition (5) holds.

Case (6).
Suppose u = XY u′, v = X̄Ȳ v′ and that Z is not a prefix of u′ and Z̄ is not a prefix of v′. It follows this time that there is a rewriting sequence taking u to v of the form

u = XY u′ →∗ XY Zu₁ → X̄Ȳ Z̄u₁ →∗ X̄Ȳ Z̄u₂ → XY Zu₂ →∗ · · · → X̄Ȳ Z̄uₙ →∗ X̄Ȳ v′ = v

where once more none of the intermediate words in the sequences indicated by →∗ contains a relation word overlapping with the prefix XY or X̄Ȳ as appropriate. Now there are rewriting sequences:

u′ →∗ Zu₁, Z̄u₁ →∗ Z̄u₂, Zu₂ →∗ Zu₃, . . . , Zuₙ₋₁ →∗ Zuₙ, Z̄uₙ →∗ v′.
Notice that, since u′ does not begin with Z, we may deduce from Proposition 4 that u₁ is Z-active. By Corollary 4, either Zu₁ ≡ Zuₙ or Z̄u₁ ≡ Z̄uₙ. In the latter case, since u₁ is Z-active, Corollary 3 tells us that we also have Zu₁ ≡ Zuₙ; so Zu₁ ≡ Zuₙ in any case. But now u′ ≡ Zu₁ ≡ Zuₙ, where u′ does not begin with Z, and also v′ ≡ Z̄uₙ, where v′ does not begin with Z̄. By applying Proposition 4 twice, we deduce that uₙ is both Z-active and Z̄-active.

Let z be the maximal common suffix of Z and Z̄. Then applying Proposition 5 (once with Z and once with Z̄), we see that z is non-empty and
• u′ = z₁u′′ where Z = z₁z and u′′ ≡ zuₙ; and
• v′ = z₂v′′ where Z̄ = z₂z and v′′ ≡ zuₙ.
But then we have u′′ ≡ zuₙ ≡ v′′ and also z is a possible prefix of u′′, as required to show that condition (6) holds. □

Lemma 3 gives a first clue as to how one might solve the word problem for a small overlap monoid by analysing words sequentially from left to right. The natural strategy is as follows. First, use Proposition 3 to reduce to the case in which the words both have clean relation prefixes of the form XY or X̄Ȳ. Now by examining short prefixes, one can clearly always rule out at least five of the six mutually exclusive conditions of the lemma. The remaining condition will involve equivalence of words derived from suffixes of u and v, so apply the same approach recursively to test whether this condition is satisfied.

This approach meets with several apparent obstacles. Firstly, it is not clear that the words derived from the suffixes of u and v, which must be tested for equivalence in the recursive call, are shorter than the original words u and v; for example, a relation word XY Z may be shorter than the maximal piece suffix Z̄ of the word on the other side of the relation. In fact the recursive call will not always involve shorter words, but it will involve words which are simpler in a more subtle sense, so that the algorithm still terminates rapidly.
Secondly, some of the conditions involve a disjunction of equivalence of two pairs of words derived from the suffixes; testing both would require two recursive calls, potentially leading to exponential time complexity. It transpires, though, that the theory of activity and inactivity developed in Section 2 means that one recursive call will always suffice. Finally, some of the conditions require us to check the possible prefixes of words derived from suffixes; this problem is solved by the following development of Lemma 3, which gives simultaneous conditions for two words to be equivalent, and to admit a given piece as a possible prefix.

Lemma 4.
Suppose u = XY u′ where XY is a clean overlap prefix, let v be a word, and suppose p is a piece. Then u ≡ v and p is a possible prefix of u if and only if one of the following mutually exclusive conditions holds:

(1′) u = XY Zu′′ and v = XY Zv′′, either Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′, and also p is a prefix of either X or X̄ or both;
(2′) u = XY u′, v = XY v′, and Z fails to be a prefix of at least one of u′ and v′, and u′ ≡ v′, and also either
– p is a prefix of X, or
– p is a prefix of X̄ and Z is a possible prefix of u′,
or both;
(3′) u = XY Zu′′, v = X̄Ȳ Z̄v′′ and either Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′ or both, and also p is a prefix of X or X̄ or both;
(4′) u = XY u′, v = X̄Ȳ Z̄v′′ but Z is not a prefix of u′, and u′ ≡ Zv′′, and also p is a prefix of X or X̄ or both;
(5′) u = XY Zu′′, v = X̄Ȳ v′ but Z̄ is not a prefix of v′, and Z̄u′′ ≡ v′, and also p is a prefix of X or X̄ or both;
(6′) u = XY u′, v = X̄Ȳ v′, Z is not a prefix of u′ and Z̄ is not a prefix of v′, but Z = z₁z, Z̄ = z₂z, u′ = z₁u′′, v′ = z₂v′′ where u′′ ≡ v′′, z is the maximal common suffix of Z and Z̄, z is non-empty, z is a possible prefix of u′′, and also p is a prefix of X or X̄ or both.

Proof. Mutual exclusivity of the six conditions is proved exactly as for Lemma 3.

Suppose now that one of the six conditions above applies. Each condition clearly implies the corresponding condition from Lemma 3, so we deduce immediately that u ≡ v. We must show, using the fact that p is a prefix of X or of X̄, that p is a possible prefix of u, or equivalently of v. In case (1′), if p is a prefix of X then it is a prefix of u, while if p is a prefix of X̄ then it is a prefix of X̄Ȳ Z̄u′′, which is clearly equivalent to u. In case (2′), if p is a prefix of X then it is again a prefix of u, while if p is a prefix of X̄ and Z is a possible prefix of u′, say u′ ≡ Zw, then

u = XY u′ ≡ XY Zw ≡ X̄Ȳ Z̄w

where the latter has p as a prefix.
In the remaining cases u begins with X and v begins with X̄, so p is a prefix of either u or v, and hence a possible prefix of u.

Conversely, suppose u ≡ v and p is a possible prefix of u. Then exactly one of the six conditions in Lemma 3 applies. By Lemma 2, every word equivalent to u begins with either XY or X̄Ȳ. Since p is a piece, X is the maximal piece prefix of XY Z, and X̄ is the maximal piece prefix of X̄Ȳ Z̄, it follows that p is a prefix of either X or X̄. If any condition but condition (2) of Lemma 3 is satisfied, this suffices to show that the corresponding condition from the statement of Lemma 4 holds.

If condition (2) from Lemma 3 applies, we must show additionally that either p is a prefix of X, or p is a prefix of X̄ and Z is a possible prefix of u′. Suppose p is not a prefix of X. Then by the above, p is a prefix of X̄. It follows from Lemma 2 that the only way the prefix XY of the word u can be changed using the defining relations is by application of the relation (XY Z, X̄Ȳ Z̄). In order for this to happen, one must clearly be able to rewrite u = XY u′ to a word of the form XY Zw; consider the shortest possible rewriting sequence which achieves this. By Lemma 2, no term in the sequence except for the last term will contain a relation word overlapping the initial XY. It follows that the same rewriting steps rewrite u′ to Zw, so that Z is a possible prefix of u′, as required. □

4. The Algorithm
In this section we present an algorithm, for a fixed monoid presentation satisfying C(4), which takes as input arbitrary words u and v and a piece p, and decides whether u ≡ v and p is a possible prefix of u. It will transpire that this algorithm can be implemented to run in time linear in the length of the shorter of u and v. In particular, by setting p = ǫ we obtain an algorithm to solve the word problem in time linear in the length of the shorter of the two input words. The algorithm is shown (in recursive/functional pseudocode) in Figure 1. Our first objective is to prove the correctness of the algorithm, that is, that whenever the algorithm terminates, the output it gives is correct.

Lemma 5.
Suppose u and v are words and p a piece. Then the algorithm WP-Prefix(u, v, p)
• outputs YES only if u ≡ v and p is a possible prefix of u; and
• outputs NO only if u ≢ v or p is not a possible prefix of u.

Proof. We prove correctness using induction on the number n of recursive calls.

Consider first the base case n = 0, that is, where the algorithm terminates without a recursive call. Suppose u, v and p are such that this happens. We consider each of the possible lines at which termination may occur, establishing in each case that the output produced is correct.

Line 3. If u = ǫ, v = ǫ and p = ǫ then clearly u ≡ v and p is a possible prefix of u, so the output YES is correct.
Line 4. If u = ǫ [respectively, v = ǫ] then it follows easily from the small overlap condition C(4) that no relations can be applied to u [v]; indeed, a relation which could be applied to u [v] would have to have ǫ as one side, but ǫ is a piece and hence cannot be a relation word. Hence, we can have u ≡ v and p a possible prefix of u only if u = v = p = ǫ. In this case that condition is not satisfied, so the output NO is correct.

Line 7.
In this case, u does not begin with a clean overlap prefix of the form XY. So by Proposition 3, every word equivalent to u must begin with the same letter as u. Hence, if u and v do not begin with the same letter then we cannot have u ≡ v, so the output NO is correct.

Line 9.
Again, u does not begin with a clean overlap prefix. If p is non-empty and begins with a different letter to u, then again by Proposition 3, p cannot be a possible prefix of u, so the output NO is correct.

Line 19.
We are now in the case that u has a clean overlap prefix XY. If p is a prefix of neither X nor X̄ then by Lemma 4 we see that p is not a possible prefix of u, so the output NO is correct.

Line 21.
Once again, we are in the case that u has a clean overlap prefix XY. If v does not begin with either XY or X̄Ȳ then by Lemma 3 we cannot have u ≡ v, so the output NO is correct.

Line 43.
We are now in the case that u = XY u′ and v = X̄Ȳ v′ where Z is not a prefix of u′ and Z̄ is not a prefix of v′. We know also that z is the maximal common suffix of Z and Z̄, and that z₁ and z₂ are such that Z = z₁z and Z̄ = z₂z. By Lemma 4 we cannot have u ≡ v unless u′

WP-Prefix(u, v, p)
 1  if u = ǫ or v = ǫ
 2    then if u = ǫ and v = ǫ and p = ǫ
 3      then return Yes
 4      else return No
 5  elseif u does not have the form XY u′ with XY a clean overlap prefix
 6    then if u and v begin with different letters
 7      then return No
 8    elseif p ≠ ǫ and u and p begin with different letters
 9      then return No
10    else
11      u ← u with first letter deleted
12      v ← v with first letter deleted
13      if p ≠ ǫ
14        then p ← p with first letter deleted
15      return WP-Prefix(u, v, p)
16  else
17    let X, Y, u′ be such that u = XY u′
18    if p is a prefix of neither X nor X̄
19      then return No
20    elseif v does not begin either with XY or with X̄Ȳ
21      then return No
22    elseif u = XY Zu′′ and v = XY Zv′′
23      then if u′′ is Z-active
24        then return WP-Prefix(Zu′′, Zv′′, ǫ)
25        else return WP-Prefix(Z̄u′′, Z̄v′′, ǫ)
26    elseif u = XY u′ and v = XY v′
27      then if p is a prefix of X
28        then return WP-Prefix(u′, v′, ǫ)
29        else return WP-Prefix(u′, v′, Z)
30    elseif u = XY Zu′′ and v = X̄Ȳ Z̄v′′
31      then if u′′ is Z-active
32        then return WP-Prefix(Zu′′, Zv′′, ǫ)
33        else return WP-Prefix(Z̄u′′, Z̄v′′, ǫ)
34    elseif u = XY u′ and v = X̄Ȳ Z̄v′′
35      then return WP-Prefix(u′, Zv′′, ǫ)
36    elseif u = XY Zu′′ and v = X̄Ȳ v′
37      then return WP-Prefix(Z̄u′′, v′, ǫ)
38    elseif u = XY u′ and v = X̄Ȳ v′
39      then let z be the maximal common suffix of Z and Z̄
40        let z₁ be such that Z = z₁z
41        let z₂ be such that Z̄ = z₂z
42        if u′ does not begin with z₁ or v′ does not begin with z₂
43          then return No
44        else let u′′ be such that u′ = z₁u′′
45          let v′′ be such that v′ = z₂v′′
46          return WP-Prefix(u′′, v′′, z)

Figure 1.
Algorithm for the Word Problem
and v′ have the form z₁u′′ and z₂v′′ respectively, so if this is not the case, the output NO is correct.

Now let n > 0, and suppose that the algorithm produces correct output whenever it terminates after fewer than n recursive calls. Let u, v, p be such that the algorithm terminates after n recursive calls. This time, we consider each of the possible places at which the first recursive call can be made, establishing in each case that the output produced is correct.

Line 15.
In this case u does not begin with a clean overlap prefix of the form XY, and we have u = au′ and v = av′ for some letter a. It follows by Proposition 3 that every word equivalent to u has the form aw where w ≡ u′. In particular, u ≡ v = av′ if and only if u′ ≡ v′, and p is a possible prefix of u exactly if either p = ǫ or p = ap′ where p′ is a possible prefix of u′. By the inductive hypothesis, the recursive call correctly establishes whether these conditions hold.

Line 24.
We know that u = XY Zu′′, that v = XY Zv′′ and that p is a prefix of X or X̄. By Lemma 4, it follows that u ≡ v and p is a possible prefix of u if and only if Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′. We also know that u′′ is Z-active, so by Corollary 3, this is true if and only if Zu′′ ≡ Zv′′.

Line 25.
This is the same as the previous case, except that u′′ is not Z-active. In this case, by Proposition 4 we have that Zu′′ ≡ Zv′′ implies u′′ ≡ v′′, which in turn implies Z̄u′′ ≡ Z̄v′′, so it suffices to test the latter.

Line 28.
Here we know that u = XY u′, v = XY v′, that Z fails to be a prefix of at least one of u′ and v′, and that p is a prefix of X. It follows by Lemma 4 that u ≡ v and p is a possible prefix of u if and only if u′ ≡ v′.

Line 29.
This time we know that u = XY u′, v = XY v′ and that p is a prefix of X̄ but not of X. It follows by Lemma 4 that u ≡ v and p is a possible prefix of u if and only if u′ ≡ v′ and Z is a possible prefix of u′.

Line 32.
Here we have u = XY Zu′′ and v = X̄Ȳ Z̄v′′, and p is a prefix of X or X̄. It follows by Lemma 4 that u ≡ v and p is a possible prefix of u if and only if either Zu′′ ≡ Zv′′ or Z̄u′′ ≡ Z̄v′′. We also know that u′′ is Z-active, so by Corollary 3, this is true if and only if Zu′′ ≡ Zv′′.

Line 33.
This is the same as the previous case, except that u′′ is not Z-active. In this case, by Proposition 4 we have that Zu′′ ≡ Zv′′ implies u′′ ≡ v′′, which in turn implies Z̄u′′ ≡ Z̄v′′, so it suffices to test the latter.

Line 35.
If we get here, we know that u = XY u′, that v = X̄Ȳ Z̄v′′, that Z is not a prefix of u′ and that p is a prefix of X or X̄; it follows that u ≡ v and p is a possible prefix of u if and only if condition (4′) of Lemma 4 holds, that is, if and only if u′ ≡ Zv′′. By the inductive hypothesis, the recursive call will correctly establish if this is the case.

Line 37.
The argument here is symmetric to that for termination at line 35.
Line 46.
Having got here, we know that p is a prefix of X or X̄, and that u = XY u′ and v = X̄Ȳ v′ where Z is not a prefix of u′ and Z̄ is not a prefix of v′. We know also that z is the maximal common suffix of Z and Z̄, and that z₁ and z₂ are such that Z = z₁z and Z̄ = z₂z. Finally, we know that u′ = z₁u′′ and v′ = z₂v′′. It follows by Lemma 4 that u ≡ v and p is a possible prefix of u if and only if u′′ ≡ v′′ and z is a possible prefix of u′′. By the inductive hypothesis, the recursive call correctly establishes whether this holds. □

We have now shown that our algorithm produces the correct output whenever it terminates, but we have not yet shown that it always terminates. In fact, the following lemma shows that it does so after only a linear number of recursive calls.
Lemma 6.
Let k be the length of the longest maximal piece suffix of a relation word. Then the number of recursive calls during execution of a call to WP-Prefix(u, v, p) is bounded above by (k + 2)|u| + 1.

Proof. For clarity in our analysis, we let uᵢ, vᵢ and pᵢ denote the parameters to the i-th recursive call in the execution (with in particular u₀ = u, v₀ = v and p₀ = p). Each call to the function involves executing exactly one of the sections 1–4, 6–15 and 17–46; we call these calls of type A, B and C respectively. We shall show that the number of calls of each of these types is bounded above by a linear function of |u|, so that the total number of recursive calls is also bounded above by a linear function of |u|.

First, notice that a call of type A cannot make a recursive call, so there is at most one type A call in the execution.

Now for a word x we let r(x) = 0 if x does not have a clean overlap prefix, and otherwise we let r(x) be the length of the part of x which follows the shortest clean overlap prefix, that is, |x′| where x = aXY x′ with aXY the shortest clean overlap prefix.

It is readily verified that if the i-th recursive call is of type B and itself makes a recursive call then we have r(uᵢ₊₁) = r(uᵢ) and |uᵢ₊₁| = |uᵢ| − 1, while if the i-th recursive call is of type C and itself makes a recursive call then we have r(uᵢ₊₁) < r(uᵢ) and |uᵢ₊₁| ≤ |uᵢ| + k. Since r(u₀) ≤ |u| and r never increases, the entire execution cannot feature more than |u| calls of type C; and we have seen that it features at most one call of type A. Hence, if the execution involves i recursive calls, it must include at most |u| calls of type C, and hence at least i − |u| − 1 calls of type B. Since each call of type C increases the length of the first parameter by at most k, while each call of type B decreases it by one, we must have

|uᵢ| ≤ |u| + |u|k − (i − |u| − 1) = (k + 2)|u| − i + 1.

Since the length of uᵢ cannot be negative, it follows that execution must terminate after at most (k + 2)|u| + 1 recursive calls.
□

It remains to justify our claim that this algorithm can be implemented in linear time. Since the concept of linear time is highly dependent upon the model of computation, it is necessary to be precise about the model under consideration. We consider a Turing machine with two two-way-infinite read-write storage tapes, using a tape alphabet including the generators for our monoid and a separator symbol #; the words u, v and p are initially encoded on one of the tapes in the form u#v#p. Since the presentation is fixed, there are only finitely many pieces, so p can be kept track of in the finite state control. For a given word u, one can check whether u has a clean overlap prefix of the form XY, and if so find X, Y and the corresponding Z, by analysing a prefix of u of bounded length. Similarly, for a given maximal piece suffix Z, we can check whether u is Z-active by analysing a prefix of u of bounded length. It follows that each recursive step of our algorithm involves analysing prefixes of u and v of bounded length, before possibly making a recursive call, with u and v modified only by changing prefixes of bounded length. Clearly any analysis of a bounded length prefix can be performed in constant time; moreover, if a recursive call is required then the tape contents can be modified to contain the parameters for that call, again in constant time. It follows that the algorithm can be implemented with execution time bounded above by a linear function of the number of recursive calls in the execution, which by Lemma 6 is bounded above by a linear function of the length of u.

Moreover, by swapping u and v at the start of the computation if necessary, we may assume without loss of generality that u is shorter than v. Thus we obtain the following.

Theorem 1. For every monoid presentation satisfying C(4), there exists a two-tape Turing machine which solves the corresponding word problem in time linear in the length of the shorter of the input words.
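Stripped of the small overlap machinery, the control flow that delivers this bound is easiest to see in the degenerate case of a presentation with no relations at all: then no word has a clean overlap prefix, ≡ is simply equality of words, and the algorithm only ever executes lines 1–15 of Figure 1. The following Python sketch (our own illustration, not part of the paper; the function name is ours) shows that special case:

```python
def wp_prefix_trivial(u: str, v: str, p: str) -> bool:
    """Lines 1-15 of WP-Prefix, specialised to an empty relation set.

    With no relations, no word has a clean overlap prefix, so each call
    just compares and strips first letters; the function returns YES
    exactly when u = v and p is a prefix of u.
    """
    if u == "" or v == "":                     # lines 1-4
        return u == "" and v == "" and p == ""
    if u[0] != v[0]:                           # lines 6-7
        return False
    if p != "" and u[0] != p[0]:               # lines 8-9
        return False
    # lines 11-15: delete the first letters and recurse
    return wp_prefix_trivial(u[1:], v[1:], p[1:] if p else "")
```

Each call does constant work and shortens u by one letter, so the running time is linear in the shorter input, mirroring (in this trivial setting) the behaviour guaranteed by Theorem 1.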
The reader may initially be surprised by the fact that one can test equivalence of two words in time bounded by a function of the shorter word – indeed, this bound potentially does not even afford time to fully read the longer word! However, Remmers showed that, for a fixed C(3) presentation, the length of the longer of two equivalent words is bounded by a linear function of the length of the shorter [3, Theorem 5.2.14]. Thus, if the difference in lengths of two words is too great, one may conclude without further analysis that the words are not equivalent. In fact Remmers' result is the only possible explanation for this phenomenon, so the fact that this property holds for C(4) presentations can also be deduced from Theorem 1.

5. Uniform Decision Problems

In Section 4 we developed a linear time algorithm to solve the word problem for a fixed small overlap presentation. Since our method of describing the algorithm was entirely constructive, one might reasonably expect that it also gives rise to a solution for the uniform word problem for C(4) presentations, that is, the algorithmic problem of, given a C(4) presentation and two words, deciding whether the words represent the same element of the monoid presented. In this section, we shall see that this is indeed the case, and show that the resulting algorithm remains fast.

To avoid unnecessary technicalities, we describe and analyse the algorithms using the RAM model of computation; in particular this allows us to assume that elementary operations involving generators from the presentation (such as comparing two generators) are single steps performable in constant time. The exact time complexity of a Turing machine implementation would depend upon the number of tapes and the precise encoding of the input, but would certainly remain polynomial of low degree in the input size.

We begin with some simple results describing the complexity of some elementary computations with a finite monoid presentation.
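As a concrete illustration of the kind of elementary computation in question, the scan underlying Proposition 6 below (compare w against every suffix of every relation word, and keep the largest common-prefix length attained at least twice) is easy to sketch in Python. The function names are ours, and we take a piece to be a word occurring as a factor of the relation words in at least two distinct positions:

```python
def common_prefix_len(w, s):
    """Length of the longest common prefix of the words w and s."""
    n = 0
    while n < len(w) and n < len(s) and w[n] == s[n]:
        n += 1
    return n

def max_piece_prefix_len(w, relation_words):
    """Length of the longest prefix of w that is a piece: compare w
    against every suffix R_i...R_|R| of every relation word and return
    the largest length attained by at least two of the comparisons."""
    lengths = sorted(
        (common_prefix_len(w, R[i:])
         for R in relation_words for i in range(len(R))),
        reverse=True)
    return lengths[1] if len(lengths) >= 2 else 0

def is_piece(w, relation_words):
    return max_piece_prefix_len(w, relation_words) == len(w)

def satisfies_c4(relation_words):
    """C(4) test along the lines of Corollary 5: writing each relation
    word R = X Y Z with X its maximal piece prefix and Z its maximal
    piece suffix, require |X| + |Z| < |R| and that the middle word Y
    is not a piece."""
    reversed_words = [R[::-1] for R in relation_words]
    for R in relation_words:
        x = max_piece_prefix_len(R, relation_words)
        # maximal piece suffix, computed by reversing everything
        z = max_piece_prefix_len(R[::-1], reversed_words)
        if x + z >= len(R):
            return False   # R is a product of two pieces: fails even C(3)
        if is_piece(R[x:len(R) - z], relation_words):
            return False   # the middle word Y is a piece
    return True
```

For example, with relation words "abab" and "ba" the word "ab" occurs in two distinct positions and so is a piece, while "aabb" and "ccdd" have one-letter piece prefixes and suffixes and non-piece middles, so that presentation passes the C(4) test.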
If ⟨A | R⟩ is a finite presentation, we denote by |A| the cardinality of the alphabet A, and by |R| the sum of the lengths of the relation words in R. Where the meaning is clear, we shall abuse notation by using R also to denote the set of relation words in the presentation.

Proposition 6. There is a RAM algorithm which, given a presentation ⟨A | R⟩ and a word w, computes the maximum piece prefix (and/or maximum piece suffix) of w in time O(|w||R|). In particular, there is a RAM algorithm which, given the same input, decides whether the word w is a piece in time O(|w||R|).

Proof. For each relation word R ∈ R and position 1 ≤ i ≤ |R| in that word, we can compute in time O(|w|) the length n of the longest common prefix of w and R_i . . . R_{|R|} (where R_j denotes the j-th letter of R). Our machine does this for each relation word and each position in that relation word in turn, recording as it goes along (i) the maximum value of n attained so far, and (ii) the maximum value of n which has been attained or exceeded at least twice. The latter, upon completion, is clearly the length of the longest piece prefix of w, and the total time taken for execution is

O( Σ_{R∈R} Σ_{i=1}^{|R|} |w| ) = O(|w||R|)

as claimed. An obvious dual algorithm can be used to find the longest piece suffix of w. □

Corollary 5. There is a RAM algorithm which, given as input a presentation ⟨A | R⟩, decides in time O(|R|²) whether the presentation satisfies the condition C(4).

Proof. Our machine begins by computing the maximum piece prefix X_R and maximum piece suffix Z_R of each relation word R ∈ R; by Proposition 6 this can be done in time

O( Σ_{R∈R} |R||R| ) = O(|R|²).

It then tests, in time O(|R|), whether for any of the relation words R we have |X_R| + |Z_R| ≥ |R|.
If so then some relation word is a product of two pieces, so the presentation does not even satisfy the weaker condition C(3), and we are done.

Otherwise, the machine computes, again in time O(|R|), the middle word Y_R of each relation word. By our remarks in Section 1, the presentation satisfies C(4) if and only if none of the words Y_R is a piece. Using Proposition 6 again, this condition can be tested in time

O( Σ_{R∈R} |Y_R||R| ) = O(|R|²).

Thus, we have described a RAM algorithm to test a presentation ⟨A | R⟩ for the C(4) condition in time O(|R|²). □

Theorem 2. There is a RAM algorithm which, given as input a C(4) presentation ⟨A | R⟩ and two words u, v ∈ A∗, decides whether u and v represent the same element of the semigroup presented in time O(|R|² min(|u|, |v|)).

Proof. Suppose we are given a C(4) presentation ⟨A | R⟩ and two words u, v ∈ A∗. Just as in the proof of Proposition 6, the machine begins by finding for every relation word R the maximum piece prefix X_R, the maximum piece suffix Z_R and the middle word Y_R, in time O(|R|²). It now has the information required to apply the algorithm WP-Prefix given above. A simple line-by-line analysis shows that each line, and hence each recursive call, can be executed in time O(|R|). By Lemma 6, the number of recursive calls is bounded above by (k + 2)|u| + 1, where k, being the length of the longest maximum piece suffix of a relation word, is less than |R|. Thus, this part of the algorithm terminates in time O(|R|²|u|).

As above we may assume, by exchanging u and v at the start of the computation if necessary, that |u| ≤ |v|, so that min(|u|, |v|) = |u|. It follows that the uniform word problem can be solved in time O(|R|² min(|u|, |v|)) as claimed. □

Acknowledgements

This research was supported by an RCUK Academic Fellowship. The author is grateful to V. N.
Remeslennikov, whose questions prompted this line of research and who shared many helpful ideas. He would also like to thank A. V. Borovik for some helpful conversations, J. B. Fountain and V. A. R. Gould for facilitating access to some of the relevant literature, and Kirsty for all her support and encouragement.

References

[1] The GAP Group. GAP – Groups, Algorithms, and Programming, Version 4.4, 2005.
[2] M. Gromov. Hyperbolic groups. In Essays in Group Theory, volume 8 of Math. Sci. Res. Inst. Publ., pages 75–263. Springer, New York, 1987.
[3] P. M. Higgins. Techniques of Semigroup Theory. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 1992. With a foreword by G. B. Preston.
[4] J. E. Hopcroft and J. D. Ullman. Formal Languages and their Relation to Automata. Addison-Wesley, 1969.
[5] R. C. Lyndon and P. E. Schupp. Combinatorial Group Theory. Springer-Verlag, 1977.
[6] J. H. Remmers. Some algorithmic problems for semigroups: a geometric approach. PhD thesis, University of Michigan, 1971.
[7] J. H. Remmers. On the geometry of semigroup presentations.