Constructive Relationships Between Algebraic Thickness and Normality

Abstract

We study the relationship between two measures of Boolean functions: \emph{algebraic thickness} and \emph{normality}. For a function $f$, the algebraic thickness is a variant of the \emph{sparsity}, the number of nonzero coefficients in the unique GF(2) polynomial representing $f$, and the normality is the largest dimension of an affine subspace on which $f$ is constant. We show that for $0 < \epsilon < 2$, any function with algebraic thickness $n^{3-\epsilon}$ is constant on some affine subspace of dimension $\Omega(n^{\epsilon/2})$. Furthermore, we give an algorithm for finding such a subspace. We show that this is at most a factor of $\Theta(\sqrt{n})$ from the best guaranteed, and when restricted to the technique used, is at most a factor of $\Theta(\sqrt{\log n})$ from the best guaranteed. We also show that a concrete function, majority, has algebraic thickness $\Omega(2^{n^{1/6}})$.


Joan Boyar (Department of Mathematics and Computer Science, University of Southern Denmark) and Magnus Gausdal Find (Information Technology Laboratory, National Institute of Standards and Technology, USA)

This work was partially supported by the Danish Council for Independent Research, Natural Sciences. Most of this work was done while the second author was at the University of Southern Denmark.

Boolean functions play an important role in many areas of computer science. In cryptology, Boolean functions are sometimes classified according to some measure of complexity (also called cryptographic complexity [7], nonlinearity criteria [17] or nonlinearity measures [1]). Examples of such measures are nonlinearity, algebraic degree, normality, algebraic thickness and multiplicative complexity, and there are a number of results showing that functions that are simple according to a certain measure are vulnerable to a certain attack (see [8] for a good survey).

A significant amount of work in this area presents explicit functions that achieve high (or low) values according to some measure. For the nonlinearity measure this was settled by showing the existence of bent functions [21], for algebraic degree the problem is trivial, for multiplicative complexity this is a well studied problem in circuit complexity [3], and for normality this is exactly the problem of finding good affine dispersers [22]. The first result in this paper is that the majority function has exponential algebraic thickness.

Another line of work has been to establish relationships between these measures, e.g. considering questions of the form "if a function $f$ is simple (or complex) according to one measure, what does that say about $f$ according to some other measure", see e.g. [4,8,1] and the references therein. In this paper we focus on the relationship between algebraic thickness and normality. Intuitively, these measures capture, each in their own way, how "far" functions are from being linear [6,7]. In fact, these two measures have been studied together previously (see e.g. [5,6]). The relationship between these measures was considered in the work of Cohen and Tal in [10], where they show that functions with a certain algebraic thickness have a certain normality. For relatively small values of algebraic thickness, we tighten their bounds and present an algorithm to witness this normality.

The question of giving a constructive proof of normality is not just a theoretical one. Recently a generic attack on stream ciphers with high normality was successfully mounted in the work [19]. If it is possible to constructively compute a witness of normality given a function with low algebraic thickness, this implies that any function with low algebraic thickness is likely to be vulnerable to the attack in [19], as well as any other attack based on normality. This work suggests that this is indeed possible for functions with small algebraic thickness.

Let $\mathbb{F}_2$ be the field of order 2, $\mathbb{F}_2^n$ the $n$-dimensional vector space over $\mathbb{F}_2$, and $[n] = \{1, \ldots, n\}$. A mapping from $\mathbb{F}_2^n$ to $\mathbb{F}_2$ is called a \emph{Boolean function}. It is a well known fact that any Boolean function $f$ in the variables $x_1, \ldots, x_n$ can be expressed uniquely as a multilinear polynomial over $\mathbb{F}_2$ called the \emph{algebraic normal form} (ANF) or the \emph{Zhegalkin polynomial}. That is, there exist unique constants $c_\emptyset, \ldots, c_{\{1,\ldots,n\}}$ over $\{0, 1\}$, such that

\[ f(x_1, \ldots, x_n) = \sum_{S \subseteq [n]} c_S \prod_{j \in S} x_j, \]

where arithmetic is in $\mathbb{F}_2$. In the rest of this paper, most arithmetic will be in $\mathbb{F}_2$, although we still need arithmetic in $\mathbb{R}$. If nothing is mentioned it should be clear from the context what field is referred to.
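To make the ANF concrete, the following is a minimal sketch (not part of the original paper) that computes the coefficients $c_S$ from a truth table using the standard Möbius transform over $\mathbb{F}_2$. The function names and the bitmask encoding of sets $S$ are assumptions chosen for illustration.

```python
def anf_coefficients(truth_table, n):
    """Moebius transform over F_2: truth table -> ANF coefficients.

    truth_table[x] holds f(x) for x in range(2**n), where bit j of x is the
    value of variable x_{j+1}. The returned list c has c[S] = c_S, with the
    set S encoded as a bitmask in the same way (illustrative encoding).
    """
    c = list(truth_table)
    for j in range(n):
        bit = 1 << j
        for x in range(1 << n):
            if x & bit:
                # Fold in the coefficient of the subset without element j.
                c[x] ^= c[x ^ bit]
    return c


def monomials(c, n):
    """List the sets S with c_S = 1, each as a sorted tuple of 1-based indices."""
    return [tuple(j + 1 for j in range(n) if S >> j & 1)
            for S, coeff in enumerate(c) if coeff]


# Example: majority on 3 bits (1 iff at least two inputs are 1).
maj3 = [1 if bin(x).count("1") >= 2 else 0 for x in range(8)]
maj3_anf = anf_coefficients(maj3, 3)
print(monomials(maj3_anf, 3))  # [(1, 2), (1, 3), (2, 3)]
```

For the 3-bit majority function this recovers $x_1x_2 + x_1x_3 + x_2x_3$, so its ANF has three nonzero coefficients. The transform takes time $O(n2^n)$, so it is only practical for small $n$.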
The largest $|S|$ such that $c_S = 1$ is called the (algebraic) \emph{degree} of $f$, and functions with degree 2 are called \emph{quadratic} functions. We let $\log(\cdot)$ be the logarithm base two, $\ln(\cdot)$ the natural logarithm, and $\exp(\cdot)$ the natural exponential function with base $e$.

Algebraic Thickness

For a Boolean function $f$, let $\|f\| = \sum_{S \subseteq [n]} c_S$, with arithmetic in $\mathbb{R}$. This measure is sometimes called the \emph{sparsity} of $f$ (e.g. [10]). The \emph{algebraic thickness} [4,6] of $f$, denoted $T(f)$, is defined as the smallest sparsity after any affine bijection has been applied to the inputs of $f$. More precisely, letting $\mathcal{A}_n$ denote the set of affine, bijective operators on $\mathbb{F}_2^n$,

\[ T(f) = \min_{A \in \mathcal{A}_n} \|f \circ A\|. \tag{1} \]

Algebraic thickness was introduced and first studied by Carlet in [4,5,6]. Affine functions have algebraic thickness at most 1, and Carlet showed that for any constant $c > \sqrt{\ln 2}$, for sufficiently large $n$ there exist functions with algebraic thickness $2^{n-1} - cn2^{(n-1)/2}$, and that a random Boolean function will have such high algebraic thickness with high probability. Furthermore no function has algebraic thickness larger than $\frac{2}{3}2^n$. Carlet observes that algebraic thickness was also implicitly mentioned in [18, Page 208] and related to the so called "higher order differential attack" due to Knudsen [15] and Lai [16] in that they are dependent on the degree as well as the number of terms in the ANF of the function used.

Normality

A $k$-dimensional \emph{flat} is an affine (sub)space of $\mathbb{F}_2^n$ with dimension $k$. A function is \emph{$k$-normal} if there exists a $k$-dimensional flat $E$ such that $f$ is constant on $E$ [9,4]. For simplicity define the \emph{normality} of a function $f$, which we denote $N(f)$, as the largest $k$ such that $f$ is $k$-normal. We recall that affine functions have normality at least $n - 1$, while for any $c > 1$, a random Boolean function has normality less than $c \log n$ with high probability.

Functions with normality smaller than $k$ are often called \emph{affine dispersers} of dimension $k$, and a great deal of work has been put into explicit constructions of functions with low normality. Currently the asymptotically best known deterministic function, due to Shaltiel, has normality less than $2^{\log^{0.9} n}$ [22].

Notice the asymmetry in the definitions: linear functions have very low algebraic thickness (0 or 1) but very high normality ($n$ or $n - 1$), whereas there exist functions with very high algebraic thickness (at least $2^{n-1} - c \cdot n \cdot 2^{(n-1)/2}$ for a constant $c > \sqrt{\ln 2}$) but low normality (less than $1.01 \log n$) [5].

Remark on Computational Efficiency

In this paper, we say that something is efficiently computable if it is computable in time polynomially bounded in the size of the input. Algorithms in this paper will have a Boolean function with a certain algebraic thickness as input. We assume that the function is represented by the ANF of the function witnessing this small algebraic thickness along with the bijection. That is, if a function $f$ with algebraic thickness $T(f) = T$ is the input to the algorithm, we assume that it is represented by a function $g$ and an affine bijection $A$ such that $g = f \circ A$ and $\|g\| = T$. In this setting, representing a function $f$ uses $\mathrm{poly}(T(f) + n)$ bits.

Quadratic Functions

The normality and algebraic thickness of quadratic functions are well understood due to the following theorem due to Dickson [11] (see also [8] for a proof).

Theorem 1 (Dickson).

Let $f : \mathbb{F}_2^n \to \mathbb{F}_2$ be quadratic. Then there exist an invertible $n \times n$ matrix $A$, a vector $b \in \mathbb{F}_2^n$, $t \le n$, and $c \in \mathbb{F}_2$ such that for $y = Ax + b$ one of the following two equations holds:

\[ f(x) = y_1 y_2 + y_3 y_4 + \cdots + y_{t-1} y_t + c, \quad \text{or} \quad f(x) = y_1 y_2 + y_3 y_4 + \cdots + y_{t-1} y_t + y_{t+1}. \]

Furthermore $A$, $b$ and $c$ can be found efficiently.

That is, any quadratic function is affine equivalent to some inner product function. We highlight a simple but useful consequence of Theorem 1. Simply by setting one variable in each of the degree two terms to zero, one gets:

Proposition 1.

Let $f : \mathbb{F}_2^n \to \mathbb{F}_2$ be quadratic. Then $N(f) \ge \lfloor n/2 \rfloor$. Furthermore a flat witnessing the normality of $f$ can be found efficiently.

Some Relationships

It was shown in [6] that normality and algebraic thickness are logically independent of (that is, not subsumed by) each other. Several other results relating algebraic thickness and normality to other cryptographic measures are given in [6]. We mention a few relations to other measures.

Clearly, functions with degree $d$ have algebraic thickness $O(n^d)$, so having superpolynomial algebraic thickness requires superconstant degree. The fact that there exist functions with low degree and low normality has been established in [4] and [10] independently. In the following, by a \emph{random degree three polynomial}, we mean a function where each term of degree three is included in the ANF independently with probability $\frac{1}{2}$. No other terms are included in the ANF.

Theorem 2 ([4,10]). Let $f$ on $n$ variables be a random degree three polynomial. Then with high probability, $f$ remains nonconstant on any subspace of dimension $6.12\sqrt{n}$.

In fact, as mentioned in [10] it is not hard to generalize this to the fact that for any constant $d$, a random degree $d$ polynomial has normality $O(n^{1/(d-1)})$. Perhaps surprisingly, this is tight. More precisely the authors give an elegant proof showing that any function with degree $d$ has $N(f) \in \Omega(n^{1/(d-1)})$. This result implies the following relation between algebraic thickness and normality.

Theorem 3 (Cohen and Tal [10]).

Let $c$ be an integer and let $f$ have $T(f) \le n^c$. Then $N(f) \in \Omega(n^{1/(4c)})$.

The proof of this has two steps: First they show by probabilistic methods that $f$ has a restriction with a certain number of free variables and a certain degree, and after this they appeal to a relation between degree and normality. Although the authors do study the algorithmic question of finding such a subspace, they do not propose an efficient algorithm for finding a subspace of such dimension. We will pay special attention to the following type of restrictions of Boolean functions.

Footnote to Theorem 2: The constant 6.12 does not appear explicitly in these articles, however it can be derived using similar calculations as in the cited papers. This also follows from Theorem 6 later in this paper. We remark that 6.12 is not optimal.

Definition 1. Let $f : \mathbb{F}_2^n \to \mathbb{F}_2$. Setting $k < n$ of the bits to 0 results in a new function $f'$ on $n - k$ variables. We say that $f'$ is a \emph{0-restriction} of $f$.

By inspecting the proof in the next section and the proof of Theorem 3, one can see that most of the restrictions performed are in fact setting variables to 0. Furthermore, by inspecting the flat used for the attack performed in [19] (section 5.3), one can see that it is of this form as well. Determining whether a given function represented by its ANF admits a 0-restriction $f'$ on $n - k$ variables with $f'$ constant corresponds exactly to the hitting set problem, and this is well known to be NP-complete [12]. Furthermore it remains NP-complete even when restricted to quadratic functions (corresponding to the vertex cover problem). This stands in contrast to Proposition 1; for quadratic functions and general flats (as opposed to just 0-restrictions) the problem is polynomial time solvable. To the best of our knowledge, the computational complexity of the following problem is open (see also [10]): Given a function, represented by its ANF, find a large(st) flat on which the function is constant.

For many functions, it is trivial to see that the ANF contains many terms, e.g. the function

\[ f(x) = (1 + x_1)(1 + x_2) \cdots (1 + x_n), \]

which is 1 if and only if all the inputs are 0, contains all the possible $2^n$ terms in its ANF. However, we are not aware of any explicit function along with a proof of a strong (e.g. exponential) lower bound on the algebraic thickness. Using a result from circuit complexity [20], it is straightforward to show that the \emph{majority function}, $\mathrm{MAJ}_n$, has exponential algebraic thickness. $\mathrm{MAJ}_n$ is 1 if and only if at least half of the $n$ inputs are 1. In the following, an $AC^0[\oplus]$ circuit of depth $d$ is a circuit with inputs $x_1, x_2, \ldots, x_n, (1 \oplus x_1), (1 \oplus x_2), \ldots, (1 \oplus x_n)$. The circuit contains $\wedge, \vee, \oplus$ (AND, OR, XOR) gates of unbounded fan-in, and every directed path contains at most $d$ edges.
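The correspondence between constant 0-restrictions and hitting sets noted above can be illustrated with a small sketch. This is not an algorithm from the paper: it is a brute-force, exponential-time search, and the representation of an ANF as a set of `frozenset` monomials (with the empty set standing for the constant 1) is an assumption made for illustration.

```python
from itertools import combinations


def zero_restrict(anf, zeroed):
    """Set every variable in `zeroed` to 0: monomials containing one vanish."""
    return {m for m in anf if not (m & zeroed)}


def is_constant(anf):
    """A function is constant iff its ANF has no nonempty monomial."""
    return all(len(m) == 0 for m in anf)


def smallest_constant_zero_restriction(anf, n):
    """Brute-force search for a smallest set of k < n variables whose zeroing
    makes the function constant, i.e. a minimum hitting set of the nonempty
    monomials. Returns None if no such set of fewer than n variables exists."""
    for k in range(n):
        for zeroed in combinations(range(1, n + 1), k):
            if is_constant(zero_restrict(anf, frozenset(zeroed))):
                return frozenset(zeroed)
    return None


# Example: the quadratic function x1x2 + x2x3 + x3x4 + 1. Its monomials are
# the edges of a path, so a minimum hitting set is a minimum vertex cover.
path = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4}), frozenset()}
cover = smallest_constant_zero_restriction(path, 4)
print(sorted(cover))
```

In this example two variables must be set to 0, matching the minimum vertex cover of a path on four vertices. The search is only meant to show the correspondence, since the problem is NP-complete in general.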
First we need the following simple proposition: Proposition 2.

Let $f : \mathbb{F}_2^n \to \mathbb{F}_2$ have $T(f) \le T$. Then $f$ can be computed by an $AC^0[\oplus]$ circuit of depth 3 with at most $n + T + 1$ gates.

Proof. Suppose $f = g \circ A$ for some affine bijective mapping $A$. In the first layer (the layer closest to the inputs) one can compute $A$ using $n$ XOR gates of fan-in at most $n$. Then by computing all the monomials independently, $g$ can be computed by an $AC^0[\oplus]$ circuit of depth 2 using $T$ AND gates with fan-in at most $n$ and 1 XOR gate of fan-in $T$. □

Now we recall a result due to Razborov [20], see also [14, 12.24].

Theorem 4 (Razborov).

Every unbounded fan-in depth-$d$ circuit over $\{\wedge, \vee, \oplus\}$ computing $\mathrm{MAJ}_n$ requires $2^{\Omega(n^{1/(2d)})}$ gates.

Combining Proposition 2 and Theorem 4 shows that $\mathrm{MAJ}_n$ has high algebraic thickness.

Proposition 3. $T(\mathrm{MAJ}_n) \ge 2^{\Omega(n^{1/6})}$.

This section is devoted to showing that functions with algebraic thickness at most $n^{3-\epsilon}$ are constant on flats of somewhat large dimensions. Furthermore our proof reveals a polynomial time algorithm to find such a subspace. In the following, a term of degree at least 3 will be called a \emph{crucial term}, and for a function $f$, the number of crucial terms will be denoted $T_{\ge 3}(f)$.

Our approach can be divided into two steps: First it uses 0-restrictions to obtain a quadratic function, and after this we can use Proposition 1. As implied by the relation between 0-restrictions and the hitting set problem, finding the optimal 0-restrictions is indeed a computationally hard task. Nevertheless, as we shall show in this section, the following greedy algorithm gives reasonable guarantees.

The greedy algorithm simply works by continually finding the variable that is contained in the most crucial terms, and sets this variable to 0. It finishes when there are no crucial terms. We show that when the greedy algorithm finishes, the number of variables left, $n'$, is relatively large as a function of $n$ (for a more precise statement, see Theorem 5). Notice that we are only interested in the behavior of $n'$ as a function of $n$, and that this is not necessarily related to the approximation ratio of the greedy algorithm, which is known to be $\Theta(\log n)$ [13].

We begin with a simple proposition about the greedy algorithm that will be useful throughout the section, and it gives a tight bound.

Proposition 4.

Let $g : \mathbb{F}_2^n \to \mathbb{F}_2$ have $T_{\ge 3}(g) \ge m$. Then some variable $x_j$ is contained in at least $\lceil 3m/n \rceil$ crucial terms.

Proof. We can assume that no variable occurs twice in the same term. Hence the total number of variable occurrences in crucial terms is at least $3m$. By the pigeonhole principle, some variable is contained in at least $\lceil 3m/n \rceil$ terms. □

The following lemma is a special case where a tight result can be obtained. It is included here because the result is tight, and it gives a better constant in Theorem 5 than one would get by simply removing terms one at a time. The result applies to functions with relatively small thickness, and a later lemma reduces functions with somewhat larger thickness to this case.
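The greedy algorithm described above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the ANF is assumed to be given as a set of `frozenset` monomials over 1-based variable indices, ties are broken towards the smallest index (the paper does not specify a tie-breaking rule), and the six-variable example is a hypothetical instance chosen so that every variable occurs in exactly two crucial terms.

```python
from collections import Counter
from math import ceil


def crucial_terms(anf):
    """Monomials of degree at least 3."""
    return [m for m in anf if len(m) >= 3]


def greedy_zero_restriction(anf):
    """Repeatedly set to 0 a variable occurring in the most crucial terms,
    until no crucial term is left.

    Returns the list of zeroed variables (in order) and the remaining ANF,
    which has degree at most 2.
    """
    anf = set(anf)
    zeroed = []
    while True:
        crucial = crucial_terms(anf)
        if not crucial:
            return zeroed, anf
        counts = Counter(v for m in crucial for v in m)
        # Ties are broken towards the smallest index (an arbitrary choice).
        best = max(sorted(counts), key=counts.get)
        zeroed.append(best)
        anf = {m for m in anf if best not in m}


# Hypothetical example on n = 6 variables:
# x1x2x3 + x1x4x5 + x2x4x6 + x3x5x6 + x2x3
example = {frozenset(s) for s in
           [{1, 2, 3}, {1, 4, 5}, {2, 4, 6}, {3, 5, 6}, {2, 3}]}
zeroed_vars, remaining = greedy_zero_restriction(example)
print(zeroed_vars, remaining)
```

On this example the first step removes two crucial terms, in line with the bound $\lceil 3m/n \rceil = \lceil 12/6 \rceil = 2$ from Proposition 4, and after two steps only the quadratic term $x_2x_3$ remains.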

Lemma 1.

Let $c \le \frac{2}{3}$ and let $f : \mathbb{F}_2^n \to \mathbb{F}_2$ have $T_{\ge 3}(f) \le cn$. Then $f$ has a 0-restriction $f'$ on $n' = n - \left\lceil \frac{3c-1}{5} n \right\rceil$ variables with $T_{\ge 3}(f') \le \frac{n'}{3}$.

Proof. Let the greedy algorithm run until a function $f'$ on $n'$ variables with $T_{\ge 3}(f') \le \frac{n'}{3}$ is obtained. By Proposition 4 we eliminate at least 2 terms in each step. The number of algorithm iterations is at most $\left\lceil \frac{3c-1}{5} n \right\rceil$. Indeed, let $\left\lceil \frac{3c-1}{5} n \right\rceil = \frac{3c-1}{5} n + \delta$ for some $0 \le \delta < 1$. After this number of iterations the number of variables left is

\[ n' = n - \frac{3c-1}{5} n - \delta = \frac{6-3c}{5} n - \delta \]

and the number of crucial terms is at most

\[ cn - 2\left(\frac{3c-1}{5} n + \delta\right) = \frac{2-c}{5} n - 2\delta. \]

In particular $\frac{n'}{3} = \frac{2-c}{5} n - \frac{\delta}{3} \ge \frac{2-c}{5} n - 2\delta$. □

Lemma 1 is essentially tight.

Proposition 5.

Let $\frac{1}{3} < c \le \frac{2}{3}$ be arbitrary but rational. Then for infinitely many values of $n$, there exists a function $f$ on $n$ variables with $T_{\ge 3}(f) = cn$ such that every 0-restriction $f'$ on $n' > n - \left\lceil \frac{3c-1}{5} n \right\rceil$ variables has $T_{\ge 3}(f') > \frac{n'}{3}$.

Proof. Let $\frac{1}{3} < c \le \frac{2}{3}$ be fixed and consider the following function on 6 variables, in which every variable occurs in exactly two terms:

\[ f(x) = x_1 x_2 x_3 + x_1 x_4 x_5 + x_2 x_4 x_6 + x_3 x_5 x_6. \]

The greedy algorithm sets this function to 0 by killing two variables, and this is optimal. Furthermore setting any one variable to 0 kills exactly two terms. Now consider the following function defined on $n = 30m$ variables and having $20m$ terms. For convenience we index the variables by $x_{i,j}$ for $1 \le i \le 5m$, $1 \le j \le 6$:

\[ g(x) = \sum_{i=1}^{5m} f(x_{i,1}, x_{i,2}, x_{i,3}, x_{i,4}, x_{i,5}, x_{i,6}). \]

Again here the greedy algorithm is optimal, and setting $6m$ variables to zero leaves $n' = 24m$ variables and $8m$ terms remaining. Thus, the bound from Lemma 1 is met with equality for $c = \frac{2}{3}$.

To see that it is tight for $c < \frac{2}{3}$, consider the same function, now denoted $\tilde{f}$, on $n$ variables, where $n$ is a multiple of 30 such that $\frac{4n}{6-3c}$ is an integer. Run the greedy algorithm until the number of variables is $\tilde{n}$ and $T_{\ge 3}(\tilde{f}) = c\tilde{n}$ (assuming $c\tilde{n}$ is an integer). At this point $\tilde{n} = \frac{4n}{6-3c}$ and the number of terms left is $c\tilde{n}$. Again, by the structure of the function, setting any number, $t$, of the variables to 0 results in a function with $\tilde{n} - t$ variables and at least $c\tilde{n} - 2t$ terms. When $t < \frac{3c-1}{5}\tilde{n}$, we have $c\tilde{n} - 2t > \frac{\tilde{n} - t}{3}$. □

An immediate corollary to Lemma 1 is the following.

Corollary 1.

Let $f : \mathbb{F}_2^n \to \mathbb{F}_2$ have $T_{\ge 3}(f) \le \frac{2}{3}n$. Then it is constant on a flat of dimension $n' \ge \left\lfloor \frac{\lfloor 4n/5 \rfloor}{3} \right\rfloor \ge \frac{4n}{15} - 1$. Furthermore, such a flat can be found efficiently.

Proof. First apply Lemma 1 to obtain a function on $n' = \left\lfloor \frac{4n}{5} \right\rfloor$ variables with at most $\frac{n'}{3}$ crucial terms. Now set one variable in each crucial term to 0, so after this we have at least $\left\lfloor \frac{8n}{15} \right\rfloor$ variables left and the remaining function is quadratic. Applying Theorem 1 gives the result. □

The following lemma generalizes the lemma above to the case with more terms. The analysis of the greedy algorithm uses ideas similar to those used in certain formula lower bound proofs, see e.g. [23] or [14, Section 6.3].

Lemma 2.

Let $f : \mathbb{F}_2^n \to \mathbb{F}_2$ with $T_{\ge 3}(f) \le n^{3-\epsilon}$, for $0 < \epsilon < 2$. Then there exists a 0-restriction $f'$ on $n' = \left\lfloor \sqrt{\tfrac{2}{3}}\, n^{\epsilon/2} \right\rfloor$ variables with $T_{\ge 3}(f') \le \frac{2}{3}n'$.

Proof. Let $T_{\ge 3}(f) = T$. By Proposition 4, after setting the variable contained in the largest number of crucial terms to 0, the number of crucial terms left is at most

\[ T - \frac{3T}{n} = T \cdot \left(1 - \frac{3}{n}\right) \le T \cdot \left(\frac{n-1}{n}\right)^3. \]

Applying this inequality $n - n'$ times yields that after $n - n'$ iterations the number of crucial terms left is at most

\[ T \cdot \left(\frac{n-1}{n}\right)^3 \left(\frac{n-2}{n-1}\right)^3 \cdots \left(\frac{n'}{n'+1}\right)^3 = T \cdot \left(\frac{n'}{n}\right)^3. \]

When $n' \le \sqrt{\tfrac{2}{3}}\, n^{\epsilon/2}$ and $T \le n^{3-\epsilon}$, this is at most $\frac{2}{3}n'$. □

Remark:

A previous version of this paper [2] contained a version of the lemma with a substantially more complicated proof. We thank anonymous reviewers for suggesting this simpler proof.

It should be noted that Lemma 2 cannot be improved to the case where $\epsilon = 0$, no matter what algorithm is used to choose the 0-restriction. To see this consider the function containing all degree three terms. For this function, any 0-restriction leaving $n'$ variables will have at least $\binom{n'}{3}$ crucial terms. On the other hand, restricting with $x_1 + x_2 = 0$ results in all crucial terms with both $x_1$ and $x_2$ having lower degree and all crucial terms with just one of them cancelling out. This suggests that for handling functions with larger algebraic thickness, one should use restrictions other than just 0-restrictions.

Combining Lemma 2 with Corollary 1, we get the following theorem.

Theorem 5.

Let $T(f) = n^{3-\epsilon}$ for $0 < \epsilon < 2$. Then there exists a flat of dimension at least $\frac{4}{15}\sqrt{\tfrac{2}{3}}\, n^{\epsilon/2} - 2$, such that when restricted to this flat, $f$ is constant. Furthermore this flat can be found efficiently.

This improves on Theorem 3 for functions with algebraic thickness $n^s$ for $1 \le s \le 2.82$, and the smaller $s$, the bigger the improvement, e.g. for $T(f) \le n^2$, our bound guarantees $N(f) \in \Omega(n^{1/2})$, compared to $\Omega(n^{1/8})$.

Normal Functions with low sparsity

How good are the guarantees given in the previous section? The purpose of this section is first to show that the result from Theorem 5 is at most a factor of $\Theta(\sqrt{n})$ from being tight. More precisely, we show that for any $2 < s \le 3$, there exist functions with algebraic thickness at most $n^s$ that are nonconstant on flats of dimension $O(n^{2 - s/2})$. Notice that this contains Theorem 2 as a special case where $s = 3$.

Theorem 6.

For any $2 < s \le 3$ and sufficiently large $n$, there exist functions with degree 3 and algebraic thickness at most $n^s$ that remain nonconstant on all flats of dimension $6.12\, n^{2 - s/2}$.

Proof. The proof uses the probabilistic method. We endow the set of all Boolean functions of degree 3 with a probability distribution $D$, and show that under this distribution a function has the promised normality with high probability.

The proof is divided into the following steps: First we describe the probability distribution $D$. Then, we fix an arbitrary $k$-dimensional flat $E$, and bound the probability that a random $f$ chosen according to $D$ is constant on $E$. We show that for $k = Cn^{2 - s/2}$, where the constant $C$ is determined later, this probability is sufficiently small that a union bound over all possible choices of $E$ gives the desired result.

We define $D$ by describing the probability distribution on the ANF. We let each possible degree 3 term be included with probability $\frac{1}{2n^{3-s}}$. The expected number of terms is thus $\frac{1}{2} n^{s-3} \binom{n}{3} \le n^s/12$, and the probability of having more than $n^s$ terms is less than 0.001 for large $n$. Now let $E$ be an arbitrary but fixed $k$-dimensional flat.

One way to think of a function restricted to a $k$-dimensional flat is that it can be obtained by a sequence of $n - k$ affine variable substitutions of the form $x_i := \sum_{j \in S} x_j + c$. This changes the ANF of the function since $x_i$ is no longer a "free" variable. Assume without loss of generality that we substitute for the variables $x_n, \ldots, x_{k+1}$ in that order. Initially we start with the function $f$ given by

\[ f(x) = \sum_{\{a,b,c\} \subseteq [n]} I_{abc}\, x_a x_b x_c, \]

where $I_{abc}$ is the indicator random variable, indicating whether the term $x_a x_b x_c$ is contained in the ANF. Suppose we perform the $n - k$ restrictions and obtain the function $\tilde{f}$. Omitting terms of degree less than three, the ANF of $\tilde{f}$ is given by

\[ \tilde{f}(x) = \sum_{\{a,b,c\} \subseteq [k]} \left( I_{abc} + \sum_{s \in S_{abc}} I_s \right) x_a x_b x_c, \]

where $S_{abc}$ is some set of indicator random variables depending on the restrictions performed. It is important that $I_{abc}$, the indicator random variable corresponding to $x_a x_b x_c$, for $\{a,b,c\} \subseteq [k]$ is only occurring at $x_a x_b x_c$. Hence we conclude that independently of the outcome of all the indicator random variables $I_{a'b'c'}$ with $\{a',b',c'\} \not\subseteq [k]$, we have that the marginal probability for any term $x_a x_b x_c$ with $\{a,b,c\} \subseteq [k]$ occurring remains at least $\frac{1}{2n^{3-s}}$.

Define $t = \binom{k}{3}$ random variables, $Z_1, \ldots, Z_t$, one for each potential term in the ANF of $\tilde{f}$, such that $Z_j = 1$ if and only if the corresponding term is present in the ANF, and 0 otherwise. The obtained function is only constant if there are no degree 3 terms, so the probability of $\tilde{f}$ being constant is thus at most

\[
\begin{aligned}
P[Z_1 = \cdots = Z_t = 0]
&\le \left(1 - \frac{1}{2n^{3-s}}\right)^{\binom{k}{3}}
\le \left(1 - \frac{1}{2n^{3-s}}\right)^{C^3 \left(n^{2-s/2}\right)^3 / 27} \\
&= \left(\left(1 - \frac{1}{2n^{3-s}}\right)^{2n^{3-s}}\right)^{C^3 n^{3-s/2} / 54}
\le \exp\left(-\frac{C^3}{54}\, n^{3-s/2}\right).
\end{aligned}
\]
The number of choices for $E$ is at most $2^{n(k+1)}$, so the probability that $f$ becomes constant on some affine flat of dimension $k$ is at most

\[ \exp\left(-\frac{C^3}{54}\, n^{3-s/2} + C \ln(2)\, n^{3-s/2} + n\right). \]

Now if $C > \sqrt{54 \ln(2)} \approx 6.118\ldots$, this quantity tends to 0. We conclude that with high probability the function obtained has algebraic thickness at most $n^s$ and normality at most $6.12\, n^{2-s/2}$. □

There is a factor of $\Theta(\sqrt{n})$ between the existence guaranteed by Theorem 5 and Theorem 6 and we leave it as an interesting problem to close this gap.

The algorithm studied in this paper works by setting variables to 0 until all remaining terms have degree at most 2, and after that appealing to Theorem 1. A proof similar to the previous shows that among such algorithms, the bound from Theorem 5 is very close to being asymptotically tight.

Theorem 7.

For any $2 < s < 3$, there exist functions with degree 3 and algebraic thickness at most $n^s$ that have degree 3 on any 0-restriction of dimension $3\sqrt{\ln n}\, n^{(3-s)/2}$.

Proof. We use the same proof strategy as in the proof of Theorem 6. Endow the set of all Boolean functions of degree 3 with the same probability distribution $D$. For large $n$, the number of terms is larger than $n^s$ with probability at most 0.001. Write $k = C\sqrt{\ln n}\, n^{(3-s)/2}$. Now fix all but $k$ of the variables to 0, and consider the probability of the function being constant under this fixed 0-restriction. We will show that this probability is so small that a union bound over all such choices gives that with high probability the function is nonconstant under any such restriction. We will see that setting $C = 3$ will suffice. There are $\binom{k}{3}$ possible degree 3 terms on these remaining variables, and we let each one be included with probability $\frac{1}{2n^{3-s}}$. The probability that none of these degree three terms are included is

\[
\begin{aligned}
\left(1 - \frac{1}{2n^{3-s}}\right)^{\binom{k}{3}}
&\le \left(1 - \frac{1}{2n^{3-s}}\right)^{C^3 (\sqrt{\ln n})^3 n^{(9-3s)/2} / 27} \\
&= \left(\left(1 - \frac{1}{2n^{3-s}}\right)^{2n^{3-s}}\right)^{n^{(3-s)/2} (\ln n)^{3/2} C^3 / 54}
\le \exp\left(-\frac{C^3}{54}\, n^{(3-s)/2} (\ln n)^{3/2}\right),
\end{aligned}
\]

and the number of 0-restrictions with all but $k$ variables fixed is

\[
\begin{aligned}
\binom{n}{k} \le \frac{n^k}{k!}
&= \exp\left(k \ln n - \ln(k!)\right) \\
&\le \exp\left((\ln n)^{3/2}\, C n^{(3-s)/2} - 0.99\, k \ln k\right) \\
&\le \exp\left((\ln n)^{3/2}\, C n^{(3-s)/2} - \frac{3-s}{2} \cdot 0.98\, C (\ln n)^{3/2} n^{(3-s)/2}\right) \\
&= \exp\left((\ln n)^{3/2}\, C n^{(3-s)/2} \left(1 - 0.98\, \frac{3-s}{2}\right)\right),
\end{aligned}
\]

where the last two inequalities hold for sufficiently large $n$. Again, by the union bound, the probability that there exists such a choice on which there are no terms of degree three left is at most

\[ \exp\left(-\frac{C^3}{54}\, n^{(3-s)/2} (\ln n)^{3/2}\right) \exp\left((\ln n)^{3/2}\, C n^{(3-s)/2} \left(1 - 0.98\, \frac{3-s}{2}\right)\right). \]

For the chosen $C$ this product tends to 0, so with high probability the function has no 0-restriction on $C\sqrt{\ln n}\, n^{(3-s)/2}$ variables of degree smaller than 3. □

References

1. Boyar, J., Find, M., Peralta, R.: Four measures of nonlinearity. In: Spirakis, P.G., Serna, M.J. (eds.) CIAC. LNCS, vol. 7878, pp. 61–72. Springer, Heidelberg (2013), eprint with correction available at the Cryptology ePrint Archive, Report 2013/633
2. Boyar, J., Find, M.G.: Constructive relationships between algebraic thickness and normality. CoRR abs/1410.1318 (2014)
3. Boyar, J., Peralta, R., Pochuev, D.: On the multiplicative complexity of Boolean functions over the basis (∧, ⊕, 1). Theor. Comput. Sci. 235(1), 43–57 (2000)
4. Carlet, C.: On cryptographic complexity of Boolean functions. In: Mullen, G., Stichtenoth, H., Tapia-Recillas, H. (eds.) Finite Fields with Applications to Coding Theory, Cryptography and Related Areas. pp. 53–69. Springer, Berlin (2002)
5. Carlet, C.: On the algebraic thickness and non-normality of Boolean functions. In: Information Theory Workshop. pp. 147–150. IEEE (2003)
6. Carlet, C.: On the degree, nonlinearity, algebraic thickness, and nonnormality of Boolean functions, with developments on symmetric functions. IEEE Transactions on Information Theory 50(9), 2178–2185 (2004)
7. Carlet, C.: The complexity of Boolean functions from cryptographic viewpoint. In: Krause, M., Pudlák, P., Reischuk, R., van Melkebeek, D. (eds.) Complexity of Boolean Functions. Dagstuhl Seminar Proceedings, vol. 06111. IBFI, Schloss Dagstuhl (2006)
8. Carlet, C.: Boolean functions for cryptography and error correcting codes. In: Crama, Y., Hammer, P.L. (eds.) Boolean Models and Methods in Mathematics, Computer Science, and Engineering, chap. 8, pp. 257–397. Cambridge, UK: Cambridge Univ. Press (2010)
9. Charpin, P.: Normal Boolean functions. J. Complexity 20(2-3), 245–265 (2004)
10. Cohen, G., Tal, A.: Two structural results for low degree polynomials and applications. In: Rao, A. (ed.) RANDOM (2015), preprint available at http://arxiv.org/abs/1404.0654
11. Dickson, L.E.: Linear Groups with an Exposition of the Galois Field Theory. Teubner's Sammlung von Lehrbüchern auf dem Gebiete der mathematischen Wissenschaften VI, x+312 (1901)
12. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman (1979)
13. Johnson, D.S.: Approximation algorithms for combinatorial problems. J. Comput. Syst. Sci. 9(3), 256–278 (1974)
14. Jukna, S.: Boolean Function Complexity - Advances and Frontiers, Algorithms and Combinatorics, vol. 27. Springer (2012)
15. Knudsen, L.R.: Truncated and higher order differentials. In: Preneel, B. (ed.) Fast Software Encryption. LNCS, vol. 1008, pp. 196–211. Springer, Heidelberg (1995)
16. Lai, X.: Higher order derivatives and differential cryptanalysis. In: Communications and Cryptography, pp. 227–233. Kluwer Academic Publishers (1994)
17. Meier, W., Staffelbach, O.: Nonlinearity criteria for cryptographic functions. In: Quisquater, J., Vandewalle, J. (eds.) EUROCRYPT '89. LNCS, vol. 434, pp. 549–562. Springer, Heidelberg (1990)
18. Menezes, A., van Oorschot, P.C., Vanstone, S.A.: Handbook of Applied Cryptography. CRC Press (1996)
19. Mihaljevic, M.J., Gangopadhyay, S., Paul, G., Imai, H.: Generic cryptographic weakness of k-normal Boolean functions in certain stream ciphers and cryptanalysis of Grain-128. Periodica Mathematica Hungarica 65(2), 205–227 (2012)
20. Razborov, A.A.: Lower bounds on the size of bounded depth circuits over a complete basis with logical addition. Mathematical Notes 41(4), 333–338 (1987)
21. Rothaus, O.S.: On "bent" functions. J. Comb. Theory, Ser. A 20(3), 300–305 (1976)
22. Shaltiel, R.: Dispersers for affine sources with sub-polynomial entropy. In: Ostrovsky, R. (ed.) FOCS. pp. 247–256. IEEE (2011)
23. Subbotovskaya, B.A.: Realizations of linear functions by formulas using +,.,-. Math. Dokl. 2(3), 110–112 (1961)