Investigating the discrepancy property of de Bruijn sequences
Daniel Gabric Joe Sawada * May 5, 2020
Abstract
The discrepancy of a binary string refers to the maximum (absolute) difference between the number of ones and the number of zeroes over all possible substrings of the given binary string. We provide an investigation of the discrepancy of known simple constructions of de Bruijn sequences. Furthermore, we demonstrate constructions that attain the lower bound of $\Theta(n)$ and a new construction that attains the previously known upper bound of $\Theta(2^n/\sqrt{n})$. This extends the work of Cooper and Heitsch [Discrete Mathematics, 310 (2010)].
* Research supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) grant RGPIN-2018-04211.

Let $B(n)$ denote the set of binary strings of length $n$. A de Bruijn sequence is a circular string of length $2^n$ that contains every string in $B(n)$ as a substring. By this definition, each such substring must occur exactly once. As an example, is a de Bruijn sequence of order n = ; it contains each length binary string as a substring when viewed circularly. There is an extensive literature on de Bruijn sequences motivated in part by their random-like properties. As articulated by Golomb [18], de Bruijn sequences:
• are balanced: they contain the same number of 0s and 1s;
• satisfy a run property: there are an equal number of contiguous runs of 0s and 1s of the same length in the sequence;
• satisfy a span-n property: they contain every distinct length n binary string as a substring.
From our example above for n = , note that there are exactly $2^{n-1}$ 0s and 1s respectively; there are $2^{n-2}$ contiguous runs of 0s and 1s respectively; and by definition, it contains every distinct length n binary string as a substring.

Despite these properties, many de Bruijn sequences display other properties that are far from random. For instance, consider the greedy prefer-1 construction [22]. After starting with an initial seed, successive bits are appended by always trying a 1 first. Only if adding a 1 results in repeating a length n substring will a 0 be appended instead. As one would expect, the resulting de Bruijn sequence (illustrated above for n = ) has a much higher ratio of 1s to 0s at the start of the sequence. One measure that accounts for this is the discrepancy, the maximum absolute difference between the number of 1s and the number of 0s over all substrings of the sequence. The discrepancy of the example above for n = is ∣ − ∣ = as witnessed by the underlined substring. The sequences generated by this prefer-1 approach are known to have discrepancy $\Theta(2^n \log n / n)$ [5] with an exact formulation based on the Fibonacci and Lucas numbers [6]. In contrast, the expected discrepancy of a random sequence of length $2^n$ is $\Theta(2^{n/2}\sqrt{\log n})$ [5]. Some applications in pseudo-random bit generation require de Bruijn sequences that do not have large discrepancy. For example, when used as a carrier signal, a de Bruijn sequence with a large discrepancy causes spectral peaks that could interfere with devices operating at these frequencies [23]. Similar measures described as “balance” and “uniformity” are discussed in [19]. However, they focus only on n = 2 and instead vary the size of the alphabet. They explain that de Bruijn sequences with good balance and uniformity are useful in the planning of reaction time experiments [10, 28]. De Bruijn sequences with high discrepancy necessarily have bad balance and uniformity.

In this paper, we extend the work initiated by Cooper and Heitsch [5], providing a more complete analysis of discrepancy for a wide variety of de Bruijn sequence constructions. In particular, we:
1. evaluate the discrepancies of an additional 12 efficient/interesting de Bruijn sequence constructions up to n = 30,
2. demonstrate de Bruijn sequence constructions that attain the minimum possible discrepancy of $\Theta(n)$, and
3. present a new de Bruijn sequence construction which has discrepancy meeting the asymptotic upper bound of $\Theta(2^n/\sqrt{n})$.
The second result formalizes preliminary work presented in [15]. The asymptotic upper bound achieved in the third result was previously known [4, 11]; however, no specific construction was known to attain this bound.

The remainder of this paper is presented as follows.
We begin with an overview of our experimental results for 13 de Bruijn sequence constructions, including the prefer-1. They are partitioned into four groups which are further analyzed in Sections 2, 3, 4, and 5. We conclude in Section 6 with open problems and future avenues of research.

In Table 1 we present exact discrepancies for 13 de Bruijn sequence constructions for values of n between 10 and 25. The results are partitioned into the following four groups based on increasing discrepancy. A larger table up to n = 30 is provided in the appendix.

Group 1: Constructions based on the Complementing Cycling Register (CCR), which has feedback function $f(a_1 a_2 \cdots a_n) = a_1 + 1 \pmod 2$.
Group 2: The greedy prefer-same and prefer-opposite sequences along with a lexicographic composition construction.
Group 3: Constructions based on the Pure Cycling Register (PCR), which has feedback function $f(a_1 a_2 \cdots a_n) = a_1$. Table 1 also shows a Random entry based on taking the average discrepancy of 10000 randomly generated sequences of length $2^n$. (These sequences were generated in C using the srand and rand functions.)
Group 4: Two constructions based on joining smaller weight-range cycles.

See http://debruijnsequence.org. Each construction can generate each symbol in $O(n)$ time per bit (or better) using only $O(n)$ space, except for the Pref-same and Pref-opposite algorithms, which require $O(2^n)$ space using their greedy construction.

(Group 1: Huang, CCR2, CCR3, CCR1. Group 2: Pref-same, Lex-comp, Pref-opposite.)
 n  Huang  CCR2  CCR3  CCR1  Pref-same  Lex-comp  Pref-opposite
10     12    13    13    16         24        24             27
11     13    14    15    18         29        29             34
12     15    16    16    22         35        35             43
13     16    17    18    23         43        43             52
14     18    19    20    30         48        48             63
15     19    21    21    29         59        59             74
16     21    22    23    36         68        68             87
17     22    24    25    37         79        79            100
18     24    26    26    43         88        88            115
19     25    27    28    43        103       103            130
20     27    29    30    52        114       114            147
21     28    31    31    50        127       127            164
22     30    32    33    59        142       142            183
23     31    34    35    59        155       155            202
24     33    36    36    67        172       172            223
25     35    37    38    66        187       187            244

(Group 3: PCR4, Random, PCR3, PCR2, PCR1. Group 4: Cool-lex, Weight-range.)
 n  PCR4  Random      PCR3      PCR2      PCR1  Cool-lex  Weight-range
10    29      50        75       101       120       131           131
11    41      71       141       180       222       257           257
12    51     101       248       321       416       468           468
13    70     143       468       587       784       801           930
14    85     203       850      1065      1488      1723          1723
15   110     288      1604      1974      2824      3439          3439
16   175     407      2965      3632      5376      6443          6443
17   246     575      5594      6785     10229     11452         12878
18   326     815     10461     12635     19484     24319         24319
19   462    1157     19765     23746     37107     48629         48629
20   730    1634     37243     44585     71250     92388         92388
21   954    2311     70575     84270    138332    167975        184766
22  1327    3264    133737    159281    268582    352727        352727
23  1820    4565    254322    302449    521553    705443        705443
24  2684    6252    484172    574819   1012795   1352090       1352090
25  3183    9192    924071   1096009   1966813   2496163       2704168
Table 1: Discrepancies of de Bruijn sequence constructions of order n, ordered by increasing discrepancy and partitioned into four groups.

Since de Bruijn sequences have the same number of 0s as 1s, the discrepancy for each of the $2^n$ linear versions of a given (circular) de Bruijn sequence will be the same. Given a linear version D of a de Bruijn sequence, the discrepancy of D can be computed in linear time by keeping track of two values while scanning D one bit at a time from left to right:
• the maximum value $d_1$ of the number of 1s minus the number of 0s in any prefix of D, and
• the maximum value $d_0$ of the number of 0s minus the number of 1s in any prefix of D.
The discrepancy of D is $d_0 + d_1$.

Group 1: CCR-based constructions
In this section we consider the four de Bruijn sequence constructions in Group 1 based on the CCR. The sequences generated by the constructions CCR1, CCR2, and CCR3 are based on shift-rules presented in [17]. The sequences generated by the CCR2 and CCR3 constructions can also be constructed by concatenation approaches [16] described later in this section; the equivalence of the shift-rules to their respective concatenation constructions has been confirmed computationally, though no formal proof has been given. The Huang construction is a shift-rule based construction in [20]. Since every de Bruijn sequence of order n contains the substring $0^n$, a lower bound on discrepancy is clearly n. In this section we prove that the two aforementioned concatenation-based constructions have discrepancy at most 2n, and thus attain the smallest possible asymptotic discrepancy of $\Theta(n)$.

To get a better feel for these four de Bruijn sequence constructions, the following graphs illustrate the running difference between the number of 1s and the number of 0s in each prefix of the given de Bruijn sequence. All four examples use the same order n, so each sequence has length $2^n$.

[Figure: four panels, one each for the CCR1, CCR2, CCR3 and Huang sequences; vertical axis: 1s − 0s in prefix.]

Recall that the CCR is a feedback shift register with feedback function $f(a_1 a_2 \cdots a_n) = a_1 + 1 \pmod 2$. The CCR partitions $B(n)$ into equivalence classes of strings called co-necklaces. For example, the following are the four co-necklace equivalence classes for n = 5, each listed starting from its lexicographically smallest string:
00000, 00001, 00011, 00111, 01111, 11111, 11110, 11100, 11000, 10000
00010, 00101, 01011, 10111, 01110, 11101, 11010, 10100, 01000, 10001
00100, 01001, 10011, 00110, 01101, 11011, 10110, 01100, 11001, 10010
01010, 10101

The periodic reduction of a string α, denoted pr(α), is the smallest prefix β of α such that $\alpha = \beta^t$ for some $t \geq 1$. In [16], the following two de Bruijn sequence constructions CCR2 and CCR3 concatenate the periodic reductions of $\alpha\bar{\alpha}$ for given representatives α of each co-necklace equivalence class, where $\bar{\alpha}$ denotes the bitwise complement of α.

Algo CCR2
1. Let the representative for each co-necklace equivalence class of order n be its lexicographically smallest string.
2. Let $\alpha_1, \alpha_2, \ldots, \alpha_m$ denote these representatives in colex order.
3. Output: $pr(\alpha_1\bar{\alpha}_1) \cdot pr(\alpha_2\bar{\alpha}_2) \cdots pr(\alpha_m\bar{\alpha}_m)$.
For n = 5, the representatives for this algorithm are the first strings of the equivalence classes above, which in colex order are 00000, 00100, 00010, 01010, and Algo CCR2 produces: 0000011111 ⋅ 0010011011 ⋅ 0001011101 ⋅ 01.

Algo CCR3
1. Let the representative for each co-necklace equivalence class of order n be the string obtained by taking the lexicographically smallest string, removing its largest prefix of the form $0^j$, and then appending $1^j$ to the end.
2. Let $\alpha_1, \alpha_2, \ldots, \alpha_m$ denote these representatives in lexicographic order.
3. Output: $pr(\alpha_1\bar{\alpha}_1) \cdot pr(\alpha_2\bar{\alpha}_2) \cdots pr(\alpha_m\bar{\alpha}_m)$.
For n = 5, the representatives for this algorithm are 11111, 10111, 10011, 10101 (one from each equivalence class above), which in lexicographic order are 10011, 10101, 10111, 11111, and Algo CCR3 produces: 1001101100 ⋅ 10 ⋅ 1011101000 ⋅ 1111100000.

We now prove that the discrepancy resulting from these two de Bruijn sequence constructions is at most 2n.

Lemma 2.1 Consider a sequence of binary strings $\alpha_1, \alpha_2, \ldots, \alpha_m$ where each $\alpha_i$ has the same number of 0s as 1s and has discrepancy at most n. Then $S = \alpha_1\alpha_2\cdots\alpha_m$ has discrepancy at most 2n.
Proof. Consider a shortest substring of S of the form $\alpha_i\alpha_{i+1}\cdots\alpha_j$ that has the same discrepancy as S. Since each intermediate string $\alpha_{i+1}, \ldots, \alpha_{j-1}$ is balanced, its discrepancy will be the same as that of $\alpha_i\alpha_j$, which gives an upper bound of 2n. ◻

Theorem 2.2 The de Bruijn sequences constructed by Algo CCR2 and Algo CCR3 have discrepancy at most 2n.
Proof. Given a length n binary string α, $\alpha\bar{\alpha}$ has the same number of 0s and 1s and has discrepancy at most n. These properties also hold for $pr(\alpha\bar{\alpha})$ by definition of the periodic reduction. Thus, by Lemma 2.1, the sequences constructed by Algo CCR2 and Algo CCR3 have discrepancy at most 2n. ◻

Interestingly, from Table 1, these two concatenation-based constructions do not demonstrate the smallest discrepancy for the orders tested. The construction by Huang [20], which is based on a cycle-joining approach, demonstrates slightly smaller discrepancy. In particular the author states: “It seems clear that the sequences produced by our algorithm have a relatively good characteristic of local 0-1 balance in comparison with the ones produced by the ‘prefer one’ algorithm.” So the author indicates that their construction may have small discrepancy; however, no analysis is provided.
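As a concrete check of the concatenation approach, the following Python sketch builds the Algo CCR2 sequence by brute force and measures its discrepancy with the two-value prefix scan described after Table 1. It is a minimal illustration, not the efficient construction of [16]: the function names are ours, representatives are found by enumerating all $2^n$ strings, and colex order is assumed to mean lexicographic order on the reversed strings.

```python
def complement(s):
    """Bitwise complement of a binary string."""
    return s.translate(str.maketrans("01", "10"))

def ccr_class(s):
    """Co-necklace class of s: its orbit under a1 a2..an -> a2..an (a1 + 1 mod 2)."""
    orbit = set()
    while s not in orbit:
        orbit.add(s)
        s = s[1:] + complement(s[0])
    return orbit

def periodic_reduction(s):
    """Shortest prefix b of s such that s = b^t for some t >= 1."""
    for p in range(1, len(s) + 1):
        if len(s) % p == 0 and s[:p] * (len(s) // p) == s:
            return s[:p]

def ccr2(n):
    """Brute-force Algo CCR2 (exponential time; for small n only)."""
    # Step 1: lexicographically smallest string of every co-necklace class.
    reps = {min(ccr_class(format(v, f"0{n}b"))) for v in range(2 ** n)}
    # Step 2: colex order, assumed here to be lex order on reversed strings.
    ordered = sorted(reps, key=lambda a: a[::-1])
    # Step 3: concatenate the periodic reductions of a followed by its complement.
    return "".join(periodic_reduction(a + complement(a)) for a in ordered)

def discrepancy(seq):
    """Linear scan tracking d1 = max(#1s - #0s) and d0 = max(#0s - #1s) over prefixes."""
    run = d1 = d0 = 0
    for bit in seq:
        run += 1 if bit == "1" else -1
        d1 = max(d1, run)
        d0 = max(d0, -run)
    return d0 + d1

def is_de_bruijn(seq, n):
    """True if the circular string seq contains every length-n string exactly once."""
    ext = seq + seq[:n - 1]
    windows = {ext[i:i + n] for i in range(len(seq))}
    return len(seq) == 2 ** n and len(windows) == 2 ** n

for n in range(3, 9):
    s = ccr2(n)
    print(n, is_de_bruijn(s, n), discrepancy(s), "<=", 2 * n)
```

The brute-force enumeration keeps the sketch close to the three steps of Algo CCR2 at the cost of exponential time, so it is only suitable for small n.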
In this section we consider the three de Bruijn sequence constructions in Group 2. The Pref-same [3, 9, 12] and the Pref-opposite [2] are greedy constructions based on the last bit of the sequence as it is constructed. They have the downside of requiring an exponential amount of memory. The Lex-comp construction [13] is obtained by concatenating lexicographic compositions. Its construction was an attempt to efficiently generate the sequence generated by the Pref-same approach; it was conjectured to be the same for a very long prefix. Observe that it attains the same discrepancy as the Pref-same for all values of n tested.

To get a better feel for the two greedy de Bruijn sequence constructions, the following graphs illustrate the running difference between the number of 1s and the number of 0s in each prefix of the given de Bruijn sequence. Both examples use the same order n, so each sequence has length $2^n$.

[Figure: two panels, one each for the Pref-same and Pref-opp sequences; vertical axis: 1s − 0s in prefix.]
In the following table we study some experimental results for the Pref-same construction. In particular, for 10 ≤ n ≤ 25 we compute the maximum difference between the number of 1s and the number of 0s along with the maximum difference between the number of 0s and the number of 1s, over all prefixes of each Pref-same de Bruijn sequence of order n. Adding these two values together, we get the discrepancies shown in Table 1.

n  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
max ( − max ( −

Interestingly, the values in the row max ( − x ( + x )/(( − x ) ( − x )) ” and the provided formula demonstrates that each value is $\Theta(n^2)$. More specifically, the values match the sequence for the values of n in the table. This leads to the following conjecture.

Conjecture 3.1 The de Bruijn sequences constructed by the Pref-same and Lex-comp algorithms have discrepancy $\Theta(n^2)$.

A similar analysis was performed for sequences generated by the Pref-opposite construction.

n  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
max ( − max ( −

Remarkably, observe that the two middle rows are a shift from each other by two positions. Just as interesting, the sequences also correspond to a known sequence in OEIS [1], namely A033638. Specifically, the row max ( − n < , but we have verified it matches for the values of n in the table. The sequence corresponds to “quarter squares plus 1”, and by applying the appropriate shifts, the discrepancy for the Pref-opposite sequence of order n, for 10 ≤ n ≤ 25, is given by:
$$\left\lfloor \frac{(n-2)^2}{4} \right\rfloor + \left\lfloor \frac{(n-4)^2}{4} \right\rfloor + 2,$$
which agrees with the Pref-opposite column of Table 1. This leads to the following conjecture.

Conjecture 3.2 The de Bruijn sequence constructed by the Pref-opposite algorithm has discrepancy $\Theta(n^2)$.

We conclude this section with an observation regarding the Pref-opposite de Bruijn sequence: for the values of n tested, each sequence has the following suffix where $j = \lceil n/2 \rceil$: j n − j ⋅ j − n − j + ⋯ n − ⋅ n − . For example, when n = 10, the Pref-opposite de Bruijn sequence has suffix ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ , and the underlined section has + + + + ones and + + + zeros. A slight rearrangement gives a lower bound of ( − ) + ( − ) + ( − ) + ( − ) + = ⋅ + = for the discrepancy of the sequence. The actual discrepancy is 27. More generally, if this suffix is indeed a suffix for each Pref-opposite de Bruijn sequence, then a lower bound on its discrepancy will be $(\lceil n/2 \rceil - 1)(\lfloor n/2 \rfloor - 1) + n = \Omega(n^2)$.

PCR-based constructions
In this section we consider the four de Bruijn sequence constructions in Group 3 based on the PCR. The constructions PCR1, PCR2, PCR3, and PCR4 are based on shift-rules presented in [17]. The sequences generated by PCR1 are the same as the ones generated by the prefer-0 greedy construction (the complement of the prefer-1) and a very efficient necklace concatenation construction based on lexicographic order [14]. The sequences generated by PCR2 are the same as the ones generated by a more efficient necklace concatenation construction based on colex order [7, 8]. The PCR3 is based on a general approach in [21] and revisited in [27].

To get a better feel for these four de Bruijn sequence constructions, the following graphs illustrate the running difference between the number of 1s and the number of 0s in each prefix of the given de Bruijn sequence. All four examples use the same order n, so each sequence has length $2^n$.

[Figure: four panels, one each for the PCR1, PCR2, PCR3 and PCR4 sequences; vertical axis: 1s − 0s in prefix.]

The discrepancy for the sequence generated by the PCR1 construction has already been studied in [5], where they show that the discrepancy is $\Theta(2^n \log n / n)$. The sequences generated by the PCR2 and PCR3 constructions appear to have similar growth trajectories. More interesting are the sequences generated by the PCR4 construction that, from Table 1, appear to have discrepancy that is closest to that of a random string. It would be interesting to do a more detailed investigation of this construction, which is based on a very simple successor rule.
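The greedy prefer-1 rule from the introduction (whose complement, the prefer-0, coincides with PCR1) is short enough to sketch directly, together with the running difference plotted in the graphs above. The Python below is illustrative only: it assumes the seed $0^n$, it stores every length-n window seen so far (and therefore uses exponential memory, unlike the shift-rules of [17]), and the function names are ours.

```python
def prefer_one(n):
    """Greedy prefer-1 de Bruijn sequence of order n, seeded with n zeros (assumed seed)."""
    seq = ["0"] * n
    seen = {"0" * n}
    while True:
        tail = "".join(seq[len(seq) - (n - 1):])   # last n-1 bits
        for bit in "10":                           # always try a 1 first
            if tail + bit not in seen:
                seen.add(tail + bit)
                seq.append(bit)
                break
        else:
            break                                  # neither bit yields a new window
    return "".join(seq[:2 ** n])                   # drop the n-1 wrap-around bits

def running_difference(seq):
    """Number of 1s minus number of 0s in each non-empty prefix of seq."""
    out, run = [], 0
    for bit in seq:
        run += 1 if bit == "1" else -1
        out.append(run)
    return out

profile = running_difference(prefer_one(10))
# The sequence is balanced, so the profile ends at 0 and max - min is the discrepancy.
print(max(profile) - min(profile))
```

Complementing every bit of the output gives the prefer-0 sequence, which negates the profile and leaves the discrepancy unchanged.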
Weight range constructions
In this section we consider two de Bruijn sequence constructions which are based on joining smaller cycles based on weight (number of 1s). The Cool-lex construction [24] is a concatenation approach which is based on creating underlying cycles which contain all strings with weights d and d + 1 given 0 ≤ d < n. Then, appropriate such cycles can be joined together to obtain a de Bruijn sequence [25]. By the nature of how the cycles are joined, the first half of the resulting de Bruijn sequence contains mostly length n substrings of weight less than or equal to n/2. Similarly, the latter half mostly contains length n substrings with weight greater than or equal to n/2. Thus, as one would expect, the resulting de Bruijn sequence has a very large discrepancy. The Weight-range construction is a new construction presented in this section which we prove attains the maximal possible asymptotic discrepancy of $\Theta(2^n/\sqrt{n})$ [4, 11].

To get a better feel for these two de Bruijn sequence constructions, the following graphs illustrate the running difference between the number of 1s and the number of 0s in each prefix of the given de Bruijn sequence. Both examples use the same order n, so each sequence has length $2^n$.

[Figure: two panels, one each for the Cool-lex and Weight-range sequences; vertical axis: 1s − 0s in prefix.]

Notice that if we had shifted the starting position of the Cool-lex sequence, the profile of the graph would be very similar to that of the Weight-range sequence. In fact, the discrepancies of the two sequences are the same except when $n \equiv 1 \pmod 4$ (see Table 1). This will be discussed more after we present the Weight-range construction.

A minimum weight de Bruijn sequence is a cyclic sequence that contains each binary string of length n with weight at least w exactly once. A maximum weight de Bruijn sequence is defined similarly where the weight of each string is at most w. A construction for the former sequence is given in [26]; it is constructed by concatenating the periodic reduction of each necklace of weight ≥ w when the necklaces are listed in lexicographic order. Let the resulting sequence be denoted by $D_w(n)$.

Remark 5.1 For any w < n, $D_w(n)$ begins with $0^{n-w}1^w$ and ends with $1^n$.

By complementing the bits in $D_w(n)$, we obtain a maximum weight de Bruijn sequence with weight at most n − w. Denote this sequence by $\bar{D}_w(n)$. From the previous remark, it begins with $1^{n-w}0^w$ and ends with $0^n$.

Example 1 The necklaces of length 6 with weight w ≥ 3 in lexicographic order are: 000111, 001011, 001101, 001111, 010101, 010111, 011011, 011111, 111111. Concatenating their periodic reductions gives
$D_3(6)$ = 000111 ⋅ 001011 ⋅ 001101 ⋅ 001111 ⋅ 01 ⋅ 010111 ⋅ 011 ⋅ 011111 ⋅ 1.
As further examples, $D_4(6)$ = 001111 ⋅ 010111 ⋅ 011 ⋅ 011111 ⋅ 1 and $\bar{D}_4(6)$ = 110000 ⋅ 101000 ⋅ 100 ⋅ 100000 ⋅ 0.

From the above example observe that:
• $D_3(6)$ contains all binary strings of length 6 with weight greater than or equal to 3,
• $\bar{D}_4(6)$ contains all binary strings of length 6 with weight less than or equal to 2,
• the length n − 1 prefix of $\bar{D}_4(6)$, namely 11000, appears in the wraparound of $D_3(6)$.
Let $D^r_w(n)$ denote the sequence $D_w(n)$ with the suffix $1^{w-1}$ rotated to the front. Then by applying the Gluing Lemma [25], the following is a de Bruijn sequence of order 6:
$$\underbrace{110000 \cdot 101000 \cdot 100 \cdot 100000 \cdot 0}_{\bar{D}_4(6)} \;\cdot\; \underbrace{11 \cdot 000111 \cdot 001011 \cdot 001101 \cdot 001111 \cdot 01 \cdot 010111 \cdot 011 \cdot 01111}_{D^r_3(6)}.$$
Applying this strategy more generally, let $DB_{max}(n)$ denote the de Bruijn sequence obtained by joining two such smaller cycles.

Weight-range construction
$DB_{max}(n) = \bar{D}_w(n) \cdot D^r_{w'}(n)$, where $w = \lfloor n/2 \rfloor + 1$ and $w' = \lceil n/2 \rceil$.

A complete C implementation to construct $DB_{max}(n)$ is given in the Appendix. It is also available at http://debruijnsequence.org.

The following technical lemma leads to a lower bound for the discrepancy of $DB_{max}(n)$.

Lemma 5.2 A maximum weight de Bruijn sequence of order n and maximum weight w has $\binom{n-1}{w}$ more 0s than 1s.
Proof. By definition, a maximum weight de Bruijn sequence of order n and maximum weight w contains every binary string of length n with weight at most w as a substring exactly once. Since each bit in this sequence belongs to n different strings, the total number of 1s in the sequence is
$$\text{ones} = \frac{1}{n}\sum_{d=0}^{w} d\binom{n}{d} = \frac{0}{n}\binom{n}{0} + \frac{1}{n}\binom{n}{1} + \frac{2}{n}\binom{n}{2} + \cdots + \frac{w}{n}\binom{n}{w} = 0 + \binom{n-1}{0} + \binom{n-1}{1} + \cdots + \binom{n-1}{w-1},$$
and the total number of 0s is
$$\text{zeros} = \frac{1}{n}\sum_{d=0}^{w} (n-d)\binom{n}{d} = \frac{n}{n}\binom{n}{0} + \frac{n-1}{n}\binom{n}{1} + \frac{n-2}{n}\binom{n}{2} + \cdots + \frac{n-w}{n}\binom{n}{w} = \binom{n-1}{0} + \binom{n-1}{1} + \binom{n-1}{2} + \cdots + \binom{n-1}{w}.$$
Thus zeros − ones = $\binom{n-1}{w}$. ◻

Theorem 5.3 The de Bruijn sequence $DB_{max}(n)$ has discrepancy at least $\binom{n-1}{\lfloor n/2 \rfloor} + \lfloor n/2 \rfloor$.
Proof. Let $w = \lfloor n/2 \rfloor + 1$ and $w' = \lceil n/2 \rceil$. Recall that $\bar{D}_w(n)$ is a maximum weight de Bruijn sequence with maximum weight n − w. Thus, by Lemma 5.2, it has $\binom{n-1}{n-w} = \binom{n-1}{n-(\lfloor n/2 \rfloor + 1)} = \binom{n-1}{\lfloor n/2 \rfloor}$ more 0s than 1s. Consider $\bar{D}_w(n)$ with its prefix of $1^{n-w}$ removed. The resulting string, which is a substring of $DB_{max}(n)$, has $\binom{n-1}{\lfloor n/2 \rfloor} + (n-w)$ more 0s than 1s. When n is odd we have $n - w = n - \lfloor n/2 \rfloor - 1 = \lfloor n/2 \rfloor$ and thus $DB_{max}(n)$ has discrepancy at least $\binom{n-1}{\lfloor n/2 \rfloor} + \lfloor n/2 \rfloor$. When n is even, we additionally add the length n − 1 prefix of $D^r_{w'}(n)$, which has more 0s than 1s (exactly one more). Since $n - w + 1 = n - (\lfloor n/2 \rfloor + 1) + 1 = \lfloor n/2 \rfloor$ (when n is even), this again means that $DB_{max}(n)$ has discrepancy at least $\binom{n-1}{\lfloor n/2 \rfloor} + \lfloor n/2 \rfloor$. ◻

By applying Stirling’s approximation to $\binom{n-1}{\lfloor n/2 \rfloor}$ we obtain the following corollary.

Corollary 5.4 The discrepancy of the de Bruijn sequence $DB_{max}(n)$ attains the asymptotic upper bound of $\Theta(2^n/\sqrt{n})$.

Observe from Table 1 that the discrepancy of $DB_{max}(n)$ is exactly $\binom{n-1}{\lfloor n/2 \rfloor} + \lfloor n/2 \rfloor$ for every n listed. This leads to the following conjecture.

Conjecture 5.5
The de Bruijn sequence $DB_{max}(n)$ has discrepancy equal to $\binom{n-1}{\lfloor n/2 \rfloor} + \lfloor n/2 \rfloor$, and moreover, it is the maximal possible discrepancy over all de Bruijn sequences of order n.

As noted earlier, the discrepancy of the cool-lex construction matches the discrepancy for the weight-range construction for the values of n tested, except when $n \equiv 1 \pmod 4$ (see Table 1). As illustration, the cool-lex construction first constructs cycles of the following weights for n = 6, 7, 8, 9:
• n = 6: (0,1,2), (3,4), (5,6)
• n = 7: (0,1), (2,3), (4,5), (6,7)
• n = 8: (0,1,2), (3,4), (5,6), (7,8)
• n = 9: (0,1), (2,3), (4,5), (6,7), (8,9)
before joining them together one at a time. Note when n = 9, strings with weights 4 and 5 are grouped together before the smaller cycles are joined together. This causes a reduction in the discrepancy compared to the weight-range construction. It is possible, however, to tweak the cool-lex implementation so the discrepancies are equivalent. For instance for n = 9, the smaller cycles with weights (0,1,2), (3,4), (5,6), (7,8,9) could be joined together instead.

In this paper, we investigated the discrepancies of 13 de Bruijn sequence constructions. We proved that two constructions attain the lower bound of $\Theta(n)$ and presented one new construction that attains the upper bound of $\Theta(2^n/\sqrt{n})$. It remains an interesting problem to demonstrate a construction with discrepancy that is close to that of a random stream of bits of the same length. Some avenues of future research include the following.
1. Simplify the description of the Huang construction [20]. Does it have the smallest discrepancy over all de Bruijn sequences?
2. Answer the conjectures regarding the discrepancies for the greedy Pref-same and Pref-opposite constructions (Conjecture 3.1 and Conjecture 3.2).
3. Analyze the discrepancy of PCR4, which had discrepancy closest to what we might expect from a random stream of bits.
4. Determine whether or not the maximal discrepancy of any de Bruijn sequence is $\binom{n-1}{\lfloor n/2 \rfloor} + \lfloor n/2 \rfloor$ (Conjecture 5.5).
5. Generalize the investigation of discrepancy to de Bruijn sequences over an arbitrary alphabet size k.
6. Study the distribution of discrepancy over all possible de Bruijn sequences.

References
[1] OEIS Foundation Inc. (2020), The On-Line Encyclopedia of Integer Sequences, http://oeis.org.
[2] A. Alhakim. A simple combinatorial algorithm for de Bruijn sequences.
The American Mathematical Monthly, 117(8):728–732, 2010.
[3] A. Alhakim, E. Sala, and J. Sawada. Revisiting the prefer-same and prefer-opposite de Bruijn sequence constructions. Theoretical Computer Science (to appear), 2020.
[4] S. R. Blackburn and I. E. Shparlinski. Character sums and nonlinear recurrence sequences. Discrete Math., 306(12):1126–1131, June 2006.
[5] J. Cooper and C. Heitsch. The discrepancy of the lex-least de Bruijn sequence. Discrete Mathematics, 310:1152–1159, 2010.
[6] J. Cooper and C. E. Heitsch. Generalized Fibonacci recurrences and the lex-least de Bruijn sequence. Advances in Applied Mathematics, 50:465–473, 2010.
[7] P. B. Dragon, O. I. Hernandez, J. Sawada, A. Williams, and D. Wong. Constructing de Bruijn sequences with co-lexicographic order: The k-ary Grandmama sequence. Submitted manuscript, 2017.
[8] P. B. Dragon, O. I. Hernandez, and A. Williams. The grandmama de Bruijn sequence for binary strings. In Proceedings of LATIN 2016: Theoretical Informatics: 12th Latin American Symposium, Ensenada, Mexico, pages 347–361. Springer Berlin Heidelberg, 2016.
[9] C. Eldert, H. Gray, H. Gurk, and M. Rubinoff. Shifting counters. AIEE Trans., 77:70–74, 1958.
[10] P. L. Emerson and R. D. Tobias. Computer program for quasi-random stimulus sequences with equal transition frequencies. Behavior Research Methods, Instruments, & Computers, 27(1):88–98, Mar 1995.
[11] G. Everest, A. J. Van Der Poorten, I. E. Shparlinski, and T. Ward. Recurrence sequences, volume 104. AMS Mathematical Surveys and Monographs, 2003.
[12] H. Fredricksen. A survey of full length nonlinear shift register cycle algorithms. SIAM Review, 24(2):195–221, 1982.
[13] H. Fredricksen and I. Kessler. Lexicographic compositions and de Bruijn sequences. J. Combin. Theory Ser. A, 22(1):17–30, 1977.
[14] H. Fredricksen and J. Maiorana. Necklaces of beads in k colors and k-ary de Bruijn sequences. Discrete Math., 23:207–210, 1978.
[15] D. Gabric and J. Sawada. A de Bruijn sequence construction by concatenating cycles of the complemented cycling register. In Combinatorics on Words - 11th International Conference, WORDS 2017, Montréal, QC, Canada, September 11-15, 2017, Proceedings, pages 49–58, 2017.
[16] D. Gabric and J. Sawada. Constructing de Bruijn sequences by concatenating smaller universal cycles. Theoretical Computer Science, 743:12–22, 2018.
[17] D. Gabric, J. Sawada, A. Williams, and D. Wong. A framework for constructing de Bruijn sequences via simple successor rules. Discrete Mathematics, 341(11):2977–2987, 2018.
[18] S. Golomb. On the classification of balanced binary sequences of period $2^n - 1$ (corresp.). IEEE Transactions on Information Theory, 26(6):730–732, November 1980.
[19] Y. Hsieh, H. Sohn, and D. Bricker. Generating (n,2) de Bruijn sequences with some balance and uniformity properties. Ars Combinatoria, 72:277–286, 07 2004.
[20] Y. Huang. A new algorithm for the generation of binary de Bruijn sequences. J. Algorithms, 11(1):44–51, 1990.
[21] C. J. A. Jansen, W. G. Franx, and D. E. Boekee. An efficient algorithm for the generation of DeBruijn cycles. IEEE Transactions on Information Theory, 37(5):1475–1478, Sep 1991.
[22] M. H. Martin. A problem in arrangements. Bull. Amer. Math. Soc., 40(12):859–864, 1934.
[23] A. A. Philippakis, A. M. Qureshi, M. F. Berger, and M. L. Bulyk. Design of compact, universal DNA microarrays for protein binding microarray experiments. In T. Speed and H. Huang, editors, Research in Computational Molecular Biology, pages 430–443, Berlin, Heidelberg, 2007. Springer Berlin Heidelberg.
[24] F. Ruskey, J. Sawada, and A. Williams. De Bruijn sequences for fixed-weight binary strings. SIAM Journal on Discrete Mathematics, 26(2):605–617, 2012.
[25] J. Sawada, A. Williams, and D. Wong. Universal cycles for weight-range binary strings. In Combinatorial Algorithms - 24th International Workshop, IWOCA 2013, Rouen, France, July 10-12, 2013, Revised Selected Papers, pages 388–401, 2013.
[26] J. Sawada, A. Williams, and D. Wong. The lexicographically smallest universal cycle for binary strings with minimum specified weight. Journal of Discrete Algorithms, 28:31–40, 2014.
[27] J. Sawada, A. Williams, and D. Wong. A surprisingly simple de Bruijn sequence construction. Discrete Math., 339:127–131, 2016.
[28] H.-S. Sohn, D. L. Bricker, J. R. Simon, and Y. Hsieh. Optimal sequences of trials for balancing practice and repetition effects. Behavior Research Methods, Instruments, & Computers, 29(4):574–581, Dec 1997.
Table of discrepancies

(Group 1: Huang, CCR2, CCR3, CCR1. Group 2: Pref-same, Lex-comp, Pref-opposite.)
 n  Huang  CCR2  CCR3  CCR1  Pref-same  Lex-comp  Pref-opposite
10     12    13    13    16         24        24             27
11     13    14    15    18         29        29             34
12     15    16    16    22         35        35             43
13     16    17    18    23         43        43             52
14     18    19    20    30         48        48             63
15     19    21    21    29         59        59             74
16     21    22    23    36         68        68             87
17     22    24    25    37         79        79            100
18     24    26    26    43         88        88            115
19     25    27    28    43        103       103            130
20     27    29    30    52        114       114            147
21     28    31    31    50        127       127            164
22     30    32    33    59        142       142            183
23     31    34    35    59        155       155            202
24     33    36    36    67        172       172            223
25     35    37    38    66        187       187            244
26     36    39    40    77        208       208            267
27     38    41    42    74        224       224            290
28     40    43    43    85        246       246            315
29     41    44    45    84        264       264            340
30     43    46    47    94        286       286            367

(Group 3: PCR4, Random, PCR3, PCR2, PCR1. Group 4: Cool-lex, Weight-range.)
 n  PCR4  Random      PCR3      PCR2      PCR1  Cool-lex  Weight-range
10    29      50        75       101       120       131           131
11    41      71       141       180       222       257           257
12    51     101       248       321       416       468           468
13    70     143       468       587       784       801           930
14    85     203       850      1065      1488      1723          1723
15   110     288      1604      1974      2824      3439          3439
16   175     407      2965      3632      5376      6443          6443
17   246     575      5594      6785     10229     11452         12878
18   326     815     10461     12635     19484     24319         24319
19   462    1157     19765     23746     37107     48629         48629
20   730    1634     37243     44585     71250     92388         92388
21   954    2311     70575     84270    138332    167975        184766
22  1327    3264    133737    159281    268582    352727        352727
23  1820    4565    254322    302449    521553    705443        705443
24  2684    6252    484172    574819   1012795   1352090       1352090
25  3183    9192    924071   1096009   1966813   2496163       2704168
26  4108   13074   1766284   2092284   3819605   5200313       5200313
27  5604   17933   3382851   4004050   7453523  10400613      10400613
28  7629   22672   6488970   7672443  14544826  20058314      20058314
29 10433   34591  12468181  14730243  28382864  37442182      40116614
30 13637   57357  23991972  28316271  55421919  77558775      77558775
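For readers who want to experiment with the Weight-range construction of Section 5 without the C code, the following Python sketch follows its definition literally: build $D_w(n)$ from necklaces in lexicographic order, complement one copy, rotate the other, and concatenate. It is a brute-force illustration under our own naming, not the authors' implementation; it enumerates all $2^n$ strings and so is only practical for small n.

```python
from math import comb

def is_necklace(s):
    """True if s is the lexicographically smallest of its rotations."""
    return all(s <= s[i:] + s[:i] for i in range(1, len(s)))

def periodic_reduction(s):
    """Shortest prefix b of s such that s = b^t for some t >= 1."""
    for p in range(1, len(s) + 1):
        if len(s) % p == 0 and s[:p] * (len(s) // p) == s:
            return s[:p]

def min_weight_cycle(n, w):
    """D_w(n): periodic reductions of the necklaces of weight >= w, in lex order."""
    strings = (format(v, f"0{n}b") for v in range(2 ** n))   # generated in lex order
    return "".join(periodic_reduction(s) for s in strings
                   if s.count("1") >= w and is_necklace(s))

def weight_range(n):
    """DB_max(n): complement of D_w(n) followed by the rotated D_{w'}(n)."""
    w, w_prime = n // 2 + 1, (n + 1) // 2
    low = min_weight_cycle(n, w).translate(str.maketrans("01", "10"))  # weights <= n - w
    high = min_weight_cycle(n, w_prime)
    k = w_prime - 1                                  # rotate the suffix 1^(w'-1) to the front
    return low + high[len(high) - k:] + high[:len(high) - k]

def discrepancy(seq):
    """max - min of the running (#1s - #0s), with the empty prefix counted as 0."""
    run = hi = lo = 0
    for bit in seq:
        run += 1 if bit == "1" else -1
        hi, lo = max(hi, run), min(lo, run)
    return hi - lo

for n in range(2, 11):
    bound = comb(n - 1, n // 2) + n // 2             # lower bound of Theorem 5.3
    print(n, discrepancy(weight_range(n)), ">=", bound)
```

For n = 4 the sketch returns 1000010011010111, whose discrepancy of 5 equals the bound $\binom{3}{2} + 2$.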