Isoperimetric Inequalities for Real-Valued Functions with Applications to Monotonicity Testing
Hadley Black ∗ Iden Kalemaj † Sofya Raskhodnikova ‡ Abstract
We generalize the celebrated isoperimetric inequality of Khot, Minzer, and Safra (SICOMP 2018) for Boolean functions to the case of real-valued functions $f \colon \{0,1\}^d \to \mathbb{R}$. Our main tool in the proof of the generalized inequality is a new Boolean decomposition that represents every real-valued function $f$ over an arbitrary partially ordered domain as a collection of Boolean functions over the same domain, roughly capturing the distance of $f$ to monotonicity and the structure of violations of $f$ to monotonicity.

We apply our generalized isoperimetric inequality to improve algorithms for testing monotonicity and approximating the distance to monotonicity for real-valued functions. Our tester for monotonicity has query complexity $\widetilde{O}(\min(r\sqrt{d}, d))$, where $r$ is the size of the image of the input function. (The best previously known tester, by Chakrabarty and Seshadhri (STOC 2013), makes $O(d)$ queries.) Our tester is nonadaptive and has 1-sided error. We show a matching lower bound for nonadaptive, 1-sided error testers for monotonicity. We also show that the distance to monotonicity of real-valued functions that are $\alpha$-far from monotone can be approximated nonadaptively within a factor of $O(\sqrt{d \log d})$ with query complexity polynomial in $1/\alpha$ and the dimension $d$. This query complexity is known to be nearly optimal for nonadaptive algorithms even for the special case of Boolean functions. (The best previously known distance approximation algorithm for real-valued functions, by Fattal and Ron (TALG 2010), achieves $O(d \log r)$-approximation.)

We investigate the structure of real-valued functions over the domain $\{0,1\}^d$, the $d$-dimensional hypercube. Our main contribution is a generalization of a powerful tool from the analysis of Boolean functions, specifically, isoperimetric inequalities, to the case of real-valued functions. Isoperimetric inequalities for the undirected hypercube were studied by Margulis [Mar74] and Talagrand [Tal93].
Chakrabarty and Seshadhri [CS16] had a remarkable insight to develop a directed analogue of the Margulis inequality. This beautiful line of work culminated in the directed analogue of the Talagrand inequality proved by Khot, Minzer, and Safra [KMS18]. We refer to this as the KMS inequality. As Khot, Minzer, and Safra explain in their celebrated work, the Margulis-type inequalities follow from the Talagrand-type inequalities and, more generally, the directed analogue of the Talagrand inequality implies all the other inequalities we mentioned. We generalize all these inequalities to the case of real-valued functions.

∗ Department of Computer Science, University of California, Los Angeles. Email: [email protected] . This work was supported by NSF Grant CCF-1553605 and Boston University's Data Science Initiative.
† Department of Computer Science, Boston University. Email: [email protected] . This work was supported by NSF award CCF-1909612 and Boston University's Dean's Fellowship.
‡ Department of Computer Science, Boston University. Email: [email protected] . This work was supported by NSF award CCF-1909612.
We discuss isoperimetric inequalities that study the size of the "boundary" between the points on which the function takes value 0 and the points on which it takes value 1. The boundary size is defined in terms of the edges of the $d$-dimensional hypercube with vertices labeled by the values of the function. The edges of the hypercube might be directed or undirected, depending on the type of the inequality.

For the directed case, we prove a generalization of the KMS inequality for functions $f \colon \{0,1\}^d \to \mathbb{R}$. To generalize the undirected isoperimetric inequalities, we give a property testing interpretation of the Talagrand inequality. With this interpretation, the inequality easily generalizes to the case of real-valued functions. Our proofs of the new isoperimetric inequalities reduce the general case to the Boolean case.
Our main tool for generalizing the KMS inequality is a new Boolean decomposition theorem that represents every real-valued function $f$ over an arbitrary partially ordered domain as a collection of Boolean functions over the same domain, roughly capturing the distance of $f$ to monotonicity and the structure of violations of $f$ to monotonicity.

We apply our generalized isoperimetric inequality to improve algorithms for testing monotonicity and approximating the distance to monotonicity for real-valued functions. Our algorithm for testing monotonicity is nonadaptive and has 1-sided error. An algorithm is nonadaptive if its input queries do not depend on answers to previous queries. A property testing algorithm has 1-sided error if it always accepts all inputs with the property it is testing. We show that our algorithm for testing monotonicity is optimal among nonadaptive, 1-sided error testers. Our distance approximation algorithm is nonadaptive. Its query complexity is known to be nearly optimal for nonadaptive algorithms, even for the special case of Boolean functions.

We view the domain of functions $f \colon \{0,1\}^d \to \mathbb{R}$ as a hypercube. For the directed isoperimetric inequalities, the edges of the hypercube are ordered pairs $(x, y)$, where $x, y \in \{0,1\}^d$ and there is a unique $i \in [d]$ such that $x_i = 0$, $y_i = 1$, and $x_j = y_j$ for all coordinates $j \in [d] \setminus \{i\}$. This defines a natural partial order on the domain: $x \preceq y$ if $x_i \le y_i$ for all coordinates $i \in [d]$ or, equivalently, if there is a directed path from $x$ to $y$ in the hypercube. A function $f \colon \{0,1\}^d \to \mathbb{R}$ is monotone if $f(x) \le f(y)$ whenever $x \preceq y$. The distance to monotonicity of a function $f \colon \{0,1\}^d \to \mathbb{R}$, denoted $\varepsilon(f)$, is the minimum of $\Pr_{x \in \{0,1\}^d}[f(x) \ne g(x)]$ over all monotone functions $g \colon \{0,1\}^d \to \mathbb{R}$. An edge $(x, y)$ is violated by $f$ if $f(x) > f(y)$. Let $\mathcal{S}_f^-$ be the set of violated edges.
For $x \in \{0,1\}^d$, let $I_f^-(x)$ be the number of outgoing violated edges incident on $x$, specifically,
$$I_f^-(x) = \left|\left\{ y : (x, y) \in \mathcal{S}_f^- \right\}\right|.$$
Our main result is the following isoperimetric inequality.

Theorem 1.1 (Isoperimetric Inequality). There exists a constant $C > 0$, such that for all functions $f \colon \{0,1\}^d \to \mathbb{R}$,
$$\mathbb{E}_{x \sim \{0,1\}^d}\left[\sqrt{I_f^-(x)}\right] \ge C \cdot \varepsilon(f). \tag{1}$$

Theorem 1.1 is a generalization of the celebrated inequality of Khot, Minzer, and Safra [KMS18], that was subsequently strengthened by Pallavoor et al. [PRW20], who proved (1) for the special case of Boolean functions $f \colon \{0,1\}^d \to \{0,1\}$. We show that the same inequality holds for real-valued functions without any dependence on the size of the image of the function. In addition, the constant $C$ is only a factor of 2 smaller than the constant in the inequality of Pallavoor et al.

Applications to monotonicity testing and distance approximation rely on a stronger, "robust" version of Theorem 1.1. The robust version considers an arbitrary 2-coloring $\mathrm{col} \colon \mathcal{S}_f^- \to \{\mathrm{red}, \mathrm{blue}\}$ of the violated edges. The color of an edge is used to specify whether the edge is counted towards the lower or the upper endpoint. Let $I_{f,\mathrm{red}}^-(x)$ be the number of outgoing red violated edges incident on $x$, and $I_{f,\mathrm{blue}}^-(y)$ be the number of incoming blue violated edges incident on $y$, specifically,
$$I_{f,\mathrm{red}}^-(x) = \left|\left\{ y : (x, y) \in \mathcal{S}_f^-,\ \mathrm{col}(x, y) = \mathrm{red} \right\}\right|; \qquad I_{f,\mathrm{blue}}^-(y) = \left|\left\{ x : (x, y) \in \mathcal{S}_f^-,\ \mathrm{col}(x, y) = \mathrm{blue} \right\}\right|.$$

Given a positive integer $\ell \in \mathbb{Z}^+$, we let $[\ell]$ denote the set $\{1, 2, \dots, \ell\}$.
$C$ is only a factor of 2 smaller for the real-valued case than for the Boolean case.

Theorem 1.2 (Robust Isoperimetric Inequality). There exists a constant $C > 0$, such that for all functions $f \colon \{0,1\}^d \to \mathbb{R}$ and colorings $\mathrm{col} \colon \mathcal{S}_f^- \to \{\mathrm{red}, \mathrm{blue}\}$,
$$\mathbb{E}_{x \sim \{0,1\}^d}\left[\sqrt{I_{f,\mathrm{red}}^-(x)}\right] + \mathbb{E}_{y \sim \{0,1\}^d}\left[\sqrt{I_{f,\mathrm{blue}}^-(y)}\right] \ge C \cdot \varepsilon(f).$$

Note that Theorem 1.2 implies Theorem 1.1 by considering the coloring where all violated edges are red. Therefore, we only present a proof of Theorem 1.2.
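To make the quantities in Theorems 1.1 and 1.2 concrete, the following is a minimal Python sketch that computes the violated edges, the left-hand side of (1), and the distance to monotonicity by brute force on a tiny hypercube. All function names are hypothetical. The distance computation assumes the standard observation that a function with no violated pairs on a set of kept points can be extended to a monotone function on the whole domain, so the distance is the smallest fraction of points whose removal leaves no violated pair. The search is exponential in $2^d$ and is only meant for $d \le 3$ or so.

```python
from itertools import combinations, product
from math import sqrt


def hypercube(d):
    """All points of {0,1}^d as tuples."""
    return list(product((0, 1), repeat=d))


def precedes(x, y):
    """Coordinate-wise partial order: x <= y in every coordinate."""
    return all(a <= b for a, b in zip(x, y))


def violated_edges(f, d):
    """Directed hypercube edges (x, y) with x_i = 0, y_i = 1 and f(x) > f(y)."""
    edges = []
    for x in hypercube(d):
        for i in range(d):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if f(x) > f(y):
                    edges.append((x, y))
    return edges


def lhs_of_inequality_1(f, d):
    """E_x[sqrt(I_f^-(x))], counting each violated edge at its lower endpoint."""
    outgoing = {x: 0 for x in hypercube(d)}
    for x, _ in violated_edges(f, d):
        outgoing[x] += 1
    return sum(sqrt(c) for c in outgoing.values()) / 2 ** d


def distance_to_monotonicity(f, d):
    """Brute force: fewest points to remove so that no violated pair remains."""
    pts = hypercube(d)
    bad = [(x, y) for x in pts for y in pts
           if x != y and precedes(x, y) and f(x) > f(y)]
    for k in range(len(pts) + 1):
        for removed in combinations(pts, k):
            r = set(removed)
            if all(x in r or y in r for x, y in bad):
                return k / len(pts)


def anti_dictator(x):
    """f(x) = 1 - x_1, which violates every edge in the first direction."""
    return 1 - x[0]


print(lhs_of_inequality_1(anti_dictator, 3), distance_to_monotonicity(anti_dictator, 3))
```

For the anti-dictator function on three bits, half of the points have exactly one outgoing violated edge, so both printed quantities equal 0.5.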
Boolean decomposition.
Our main technical contribution is the Boolean decomposition (Theorem 1.3). It allows us to prove Theorem 1.2 by reducing the general case of real-valued functions to the special case of Boolean functions. Theorem 1.3 states that every non-monotone function $f$ can be decomposed into Boolean functions $f_1, f_2, \dots, f_k$ that collectively preserve the distance to monotonicity of $f$ and violate a subset of the edges violated by $f$. Crucially, they violate edges in disjoint subgraphs of the hypercube.

Our Boolean decomposition works for functions over any partially ordered domain. We represent such a domain by a directed acyclic graph (DAG). For a DAG $G$, we denote its vertex set by $V(G)$ and its edge set by $E(G)$. A DAG $G$ determines a natural partial order on its vertex set: for all $x, y \in V(G)$, we have $x \preceq y$ if and only if $G$ contains a path from $x$ to $y$. A function $f \colon V(G) \to \mathbb{R}$ is monotone if $f(x) \le f(y)$ whenever $x \preceq y$. An edge $(x, y)$ of $G$ is violated by $f$ if $f(x) > f(y)$. The definitions of $\varepsilon(f)$, the distance of $f$ to monotonicity, and $\mathcal{S}_f^-$, the set of violated edges, are the same as for the special case of the hypercube.

Theorem 1.3 (Boolean Decomposition). Suppose $G$ is a DAG and $f \colon V(G) \to \mathbb{R}$ is a non-monotone function over the vertices of $G$. Then, for some $k \ge 1$, there exist Boolean functions $f_1, \dots, f_k \colon V(G) \to \{0,1\}$ and disjoint (induced) subgraphs $H_1, \dots, H_k$ of $G$ for which the following hold:
1. $\sum_{i=1}^{k} \varepsilon(f_i) \ge \varepsilon(f)/2$.
2. $\mathcal{S}_{f_i}^- \subseteq \mathcal{S}_f^- \cap E(H_i)$ for all $i \in [k]$.

We derive Theorem 1.2 from Theorem 1.3 in Section 2 and prove Theorem 1.3 in Section 3.

A natural first attempt at proving Theorem 1.1 is to try reducing to the special case of Boolean functions (the KMS inequality) via a thresholding argument. Given $f \colon \{0,1\}^d \to \mathbb{R}$ and $t \in \mathbb{R}$, define $h_t \colon \{0,1\}^d \to \{0,1\}$ to be $h_t(x) = 1$ iff $f(x) > t$.
Clearly, this can only reduce the left-hand side of (1) since the influential edges of $h_t$ are a subset of the influential edges of $f$. Thus, if there exists some $t \in \mathbb{R}$ such that $\varepsilon(h_t) = \Omega(\varepsilon(f))$, then applying the KMS inequality to $h_t$ would show that the inequality also holds for $f$. In fact, as we show in Section 7, this technique easily allows us to reduce the undirected inequality for the real-valued case to the Boolean case, without any significant additional ideas. However, in the directed setting, a simple argument shows that there exists $f$ for which $\varepsilon(h_t) \le \varepsilon(f)/r$ for all $t \in \mathbb{R}$, where $r$ is the size of the image of $f$. Thus, additional ideas are required to prove Theorem 1.1 by a reduction to the KMS inequality. The highly structured decomposition of Theorem 1.3 gives a collection of disjoint subgraphs $H_1, \dots, H_k$ of the directed hypercube where, in each $H_i$, an independent "variable thresholding rule" can be applied, yielding the Boolean function $f_i$. The "threshold" for each vertex $x$ in $H_i$ depends on the values of the function at a particular set of vertices reachable from $x$.

The Boolean decomposition is quite powerful: in addition to enabling us to prove the new isoperimetric inequality, it can be used to easily derive a lower bound on the number of edges violated by a real-valued function directly from the bound for the Boolean case, without relying on Theorem 1.2. This bound is used to analyze the edge tester for monotonicity whose significance is described in Section 1.2. The early works on monotonicity testing [GGL+00, DGL+99, Ras99] have shown that $|\mathcal{S}_f^-| \ge \varepsilon(f) \cdot 2^d$ for every Boolean function $f$ on the domain $\{0,1\}^d$. In other words, the number of edges violated by $f$ is at least the number of points on which the value of the function has to change to make it monotone. This bound was generalized to the case of real-valued functions by [DGL+99, Ras99], who showed that $|\mathcal{S}_f^-| \ge (\varepsilon(f)/\lceil \log r \rceil) \cdot 2^d$ for every real-valued function $f$ on the domain $\{0,1\}^d$ with image size $r$. (The size of the image of $f$ is the number of distinct values it takes.) Chakrabarty and Seshadhri [CS13] improved this bound by a factor of $\Theta(\log r)$, thus removing the dependence on the size of the image of the function. Our Boolean decomposition of a real-valued function $f$ in terms of Boolean functions $f_1, \dots, f_k$, given by Theorem 1.3, yields this result of [CS13] as an immediate corollary of the special case for Boolean functions:
$$|\mathcal{S}_f^-| \ge \sum_{i=1}^{k} |\mathcal{S}_{f_i}^-| \ge \sum_{i=1}^{k} \varepsilon(f_i) \cdot 2^d \ge \varepsilon(f) \cdot 2^{d-1},$$
where the inequalities follow by first applying Item 2 of Theorem 1.3, then applying the bound for the Boolean case, and, finally, applying Item 1 of Theorem 1.3.

Undirected isoperimetric inequality for real-valued functions.
The original isoperimetric inequality of Talagrand [Tal93] treats the domain $\{0,1\}^d$ as an undirected hypercube. An undirected edge $\{x, y\}$ is influential if $f(x) \ne f(y)$. Let $I_f(x)$ be the number of influential edges $\{x, y\}$ incident on $x \in \{0,1\}^d$ for which $f(x) > f(y)$. This definition ensures that each influential edge is counted towards $I_f(x)$ for exactly one vertex $x$. The variance $\mathrm{var}(f)$ of a Boolean function is defined as $p_0(1 - p_0)$, where $p_0$ is the probability that $f(x) = 0$ for a uniformly random point $x$ in the domain. Talagrand [Tal93] proved the following.

Theorem 1.4 (Talagrand Inequality [Tal93]). For all functions $f \colon \{0,1\}^d \to \{0,1\}$,
$$\mathbb{E}_{x \sim \{0,1\}^d}\left[\sqrt{I_f(x)}\right] \ge \sqrt{2} \cdot \mathrm{var}(f). \tag{2}$$

Before generalizing Theorem 1.4 to real-valued functions, we reinterpret it using a property testing notion. Observe that the natural definition of the variance of a real-valued function results in a quantity that depends on specific values of the function, whereas whether an edge is influential depends only on whether the values on its endpoints are different and not on the specific values themselves. So, variance is not a suitable notion for generalizing this inequality. We replace the variance of $f$ with the distance of $f$ to constant, denoted $\mathrm{dist}(f, \mathrm{const})$, i.e., the minimum of $\Pr_{x \sim \{0,1\}^d}[f(x) \ne g(x)]$ over all constant functions $g \colon \{0,1\}^d \to \mathbb{R}$. For a Boolean function $f$, the distance to constant is $\min\{p_0, 1 - p_0\}$, so $\mathrm{var}(f) \ge \mathrm{dist}(f, \mathrm{const})/2$ and, therefore, the left-hand side of (2) is at least $\mathrm{dist}(f, \mathrm{const})/\sqrt{2}$. Next, we state our generalization of Talagrand's inequality, proved in Section 7.

Theorem 1.5 (Undirected Isoperimetric Inequality). For all functions $f \colon \{0,1\}^d \to \mathbb{R}$,
$$\mathbb{E}_{x \sim \{0,1\}^d}\left[\sqrt{I_f(x)}\right] \ge \frac{\mathrm{dist}(f, \mathrm{const})}{2\sqrt{2}}.$$

We do not discuss Margulis-type isoperimetric inequalities here, but note that their natural generalizations to the real range follow from our Talagrand-type inequalities (for the same reasons as for the special case of Boolean functions, as discussed in [KMS18]).
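The two sides of the undirected inequality in Theorem 1.5 can be evaluated numerically on small examples. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the Hamming-weight function is an arbitrary example, not one taken from the paper.

```python
from collections import Counter
from itertools import product
from math import sqrt


def undirected_lhs(f, d):
    """E_x[sqrt(I_f(x))], where I_f(x) counts neighbors y with f(x) > f(y)."""
    total = 0.0
    for x in product((0, 1), repeat=d):
        count = 0
        for i in range(d):
            y = x[:i] + (1 - x[i],) + x[i + 1:]  # flip coordinate i
            if f(x) > f(y):
                count += 1
        total += sqrt(count)
    return total / 2 ** d


def dist_to_constant(f, d):
    """Fraction of points that must change to make f constant."""
    values = Counter(f(x) for x in product((0, 1), repeat=d))
    return 1 - max(values.values()) / 2 ** d


def hamming_weight(x):
    return sum(x)


d = 3
lhs = undirected_lhs(hamming_weight, d)
bound = dist_to_constant(hamming_weight, d) / (2 * sqrt(2))
print(lhs >= bound)
```

Each influential edge is charged to the endpoint with the larger value, matching the definition of $I_f(x)$, so every edge contributes to exactly one vertex.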
We apply our generalized isoperimetric inequality (Theorem 1.2) to improve algorithms for testing monotonicity and approximating the distance to monotonicity for real-valued functions.
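As background for the testing results, the edge tester mentioned in Section 1.1 can be sketched in a few lines of Python. This is a hedged illustration, not the tester of Theorem 1.6: the function name, the `num_edges` parameter, and the explicit random generator argument are assumptions made for the example. The sketch samples uniformly random hypercube edges, queries both endpoints, and rejects only when it sees a violated edge, so it never rejects a monotone function.

```python
import random


def edge_tester(f, d, num_edges, rng):
    """Return True (accept) unless a sampled edge (x, y) has f(x) > f(y)."""
    for _ in range(num_edges):
        # A uniformly random edge: random direction i, random other coordinates.
        x = [rng.randint(0, 1) for _ in range(d)]
        i = rng.randrange(d)
        x[i] = 0
        y = list(x)
        y[i] = 1
        if f(tuple(x)) > f(tuple(y)):
            return False  # found a violated edge, so f is not monotone
    return True


rng = random.Random(0)
print(edge_tester(lambda x: sum(x), 3, 200, rng))
```

The queries do not depend on earlier answers, which is what makes this kind of tester nonadaptive, and the only way to reject is to exhibit a violated edge, which is what gives 1-sided error.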
Monotonicity testing.
Monotonicity of functions, first studied in the context of property testing by Goldreich et al. [GGL+00], has been investigated extensively [DGL+99, FLN+02, AC06, Fis04, HK08, BRW05, PRR06, ACCL07, BGJ+12, BCSM12, BBM12, CS13, CS14, CS16, BRY14, CDJS17, CDST15, BB16, CWX17, PRV18, BCS18, KMS18, CS19, BCS20]. A function is $\varepsilon$-far from monotone if its distance to monotonicity is at least $\varepsilon$; otherwise, it is $\varepsilon$-close to monotone. An $\varepsilon$-tester for monotonicity is a randomized algorithm that, given a parameter $\varepsilon \in (0, 1)$ and oracle access to a function $f$, accepts with probability at least 2/3 if $f$ is monotone and rejects with probability at least 2/3 if $f$ is $\varepsilon$-far from monotone.

Prior to our work, the best monotonicity tester for real-valued functions was the edge tester. The edge tester, introduced by [GGL+00], queries the values of $f$ on the endpoints of uniformly random edges of the hypercube and rejects if it finds a violated edge. As we discussed in Section 1.1, a series of works [GGL+00, DGL+99, Ras99, CS13] proved lower bounds on $|\mathcal{S}_f^-|$, the number of violated edges, resulting in the tight analysis of the edge tester for both Boolean and real-valued functions: $O(d/\varepsilon)$ queries are sufficient (and also necessary, e.g., for $f(x) = 1 - x_1$, the anti-dictator function). For many years, it remained open whether an $o(d)$-query tester for monotonicity existed, until a sequence of breakthroughs [CS16, CST14, KMS18] designed testers for Boolean functions with query complexity $\widetilde{O}(d^{7/8})$, $\widetilde{O}(d^{5/6})$, and finally $\widetilde{O}(\sqrt{d})$. Prior to our work, the same question remained open for functions with image size, $r$, greater than 2.

We show that when $r$ is small compared to $d$, monotonicity can be tested with $o(d)$ queries. (Note that $r \le 2^d$.)

Theorem 1.6. There exists a nonadaptive, 1-sided error $\varepsilon$-tester for monotonicity of functions $f \colon \{0,1\}^d \to \mathbb{R}$ that makes $\widetilde{O}\left(\min\left(\frac{r\sqrt{d}}{\varepsilon^2}, \frac{d}{\varepsilon}\right)\right)$ queries and works for all functions $f$ with image size $r$.

The proof of Theorem 1.6 (in Section 4) heavily relies on the generalized isoperimetric inequality of Theorem 1.2. We extend several other combinatorial properties of Boolean functions to real-valued functions. In particular, the persistence of a vertex $x \in \{0,1\}^d$ is a key combinatorial concept in the analysis. A vertex $x \in \{0,1\}^d$ is $\tau$-persistent if, with high probability, a random walk that starts at $x$ and takes $\tau$ steps in the $d$-dimensional directed hypercube ends at a vertex $y$ for which $f(y) \le f(x)$. As we show, the upper bound on the number of vertices which are not $\tau$-persistent grows linearly with the distance $\tau$ and the image size $r$. For the tester analysis, one needs to carefully choose the distance parameter $\tau$ for which many vertices are $\tau$-persistent. In particular, this value of $\tau$ also depends on the image size $r$, resulting in the linear dependence on $r$ in the query complexity of the tester.

Our lower bound for testing monotonicity.
We show that our monotonicity tester is optimal among nonadaptive, 1-sided error testers.
Theorem 1.7. There exists a constant $\varepsilon > 0$, such that for all $d, r \in \mathbb{N}$, every nonadaptive, 1-sided error $\varepsilon$-tester for monotonicity of functions $f \colon \{0,1\}^d \to [r]$ requires $\Omega(\min(r\sqrt{d}, d))$ queries.

We prove Theorem 1.7 (in Section 6) by generalizing a construction of Fischer et al. [FLN+02] that showed that nonadaptive, 1-sided error monotonicity testers of Boolean functions must make $\Omega(\sqrt{d})$ queries. Blais et al. [BBM12] demonstrated that every tester for monotonicity over the $d$-dimensional hypercube domain requires $\Omega(\min(d, r^2))$ queries. Our lower bound is stronger when $r \in [2, \sqrt{d}]$, although it applies only to nonadaptive, 1-sided error algorithms.

Approximating the distance to monotonicity.
Motivated by the desire to handle noisy inputs, Parnas et al. [PRR06] generalized the property testing model to tolerant testing. There is a direct connection between tolerant testing of a property and approximating the distance to the property with additive and multiplicative error in the sense that these problems can be reduced to each other with the right setting of parameters and have the same query complexity up to logarithmic factors (see, e.g., [PRR06, Claim 2] and [PRW20, Theorem A.1]). One clean way to state distance approximation guarantees is to replace the additive error $\alpha$ with the promise that the input function is $\alpha$-far from the property, as specified in the following definition. A randomized $c$-approximation algorithm for the distance to monotonicity, where $c > 1$, is given a parameter $\alpha \in (0, 1)$ and oracle access to a function $f \colon \{0,1\}^d \to \mathbb{R}$ that is $\alpha$-far from monotone. It outputs an estimate $\hat{\varepsilon}$ that, with probability at least 2/3, satisfies $\varepsilon(f) \le \hat{\varepsilon} \le c \cdot \varepsilon(f)$.

Fattal and Ron [FR10] studied the problem of approximating the distance to monotonicity for real-valued functions over the hypergrid domain $[n]^d$. For the special case of the hypercube domain, they give an $O(d \log r)$-approximation algorithm for functions with image size $r$ that makes $\mathrm{poly}(d, 1/\alpha)$ queries. Theorem 1.2 allows us to improve on their result, by showing that the algorithm of Pallavoor et al. [PRW20] for approximating the distance to monotonicity of Boolean functions also works for real-valued functions, without any loss in the approximation guarantee.

Theorem 1.8.
There exists a nonadaptive $O(\sqrt{d \log d})$-approximation algorithm for the distance to monotonicity that, given a parameter $\alpha \in (0, 1)$ and oracle access to a function $f \colon \{0,1\}^d \to \mathbb{R}$ that is $\alpha$-far from monotone, makes $\mathrm{poly}(d, 1/\alpha)$ queries.

Pallavoor et al. prove that this approximation ratio is nearly optimal for nonadaptive algorithms, even for the special case of Boolean functions. We also note that, by the connection between tolerant testing and erasure-resilient testing observed by Dixit et al. [DRTV18], our Theorem 1.8 implies the existence of an erasure-resilient $\varepsilon$-tester for monotonicity of functions $f \colon \{0,1\}^d \to \mathbb{R}$ that can handle up to $\Theta(\varepsilon/\sqrt{d \log d})$ erasures with query complexity $\mathrm{poly}(d, 1/\varepsilon)$. The tester of Dixit et al. could handle only $O(\varepsilon/d)$ erasures. We prove Theorem 1.8 in Section 5.

The query complexity of monotonicity testing of Boolean functions over the hypercube has been resolved for nonadaptive testers by Chen et al. [CDST15, CWX17] who proved a lower bound of $\widetilde{\Omega}(\sqrt{d})$. For adaptive testers, the best lower bound known to date is $\widetilde{\Omega}(d^{1/3})$, also shown by [CWX17]. It is an open question whether adaptive algorithms can do better than nonadaptive ones for functions over the hypercube domain, both in the case of Boolean functions and, more generally, for functions with small image size. As we mentioned before, there is a lower bound of $\Omega(d)$ for functions with image size $\Omega(\sqrt{d})$ [BBM12].

Monotonicity testing has also been studied for functions on other types of domains, including general partially ordered domains [FLN+02] and the hypergrid domain $[n]^d$. (It has also been investigated in the context where the distance to monotonicity is the normalized $L_p$ distance instead of the Hamming distance, but we focus our attention here on the Hamming distance.) When $d = 1$, monotonicity testing on the hypergrid $[n]$ is equivalent to testing sortedness of $n$-element arrays. This problem was introduced by Ergun et al. [EKK+00]. Its query complexity has been determined in terms of $n$ and $\varepsilon$ by [EKK+00, Fis04, CS14, Bel18]: it is $\Theta\left(\frac{\log(\varepsilon n)}{\varepsilon}\right)$. Pallavoor et al. [PRV18, Pal20] considered the setting when the tester is given an additional parameter $r$, the number of distinct elements in the array, and obtained an $O((\log r)/\varepsilon)$-query algorithm. There are also lower bounds for this setting: $\Omega(\log r)$ for nonadaptive algorithms by [BRY14] and $\Omega\left(\frac{\log r}{\log \log r}\right)$ for all testers for the case when $r = n^{1/3}$ by [Bel18].

For general $d$, Black et al. [BCS18, BCS20] gave an $\widetilde{O}(d^{5/6})$-query tester for Boolean functions $f \colon [n]^d \to \{0,1\}$. For real-valued functions, Chakrabarty and Seshadhri [CS13, CS14] proved basically matching upper and lower bounds of $O((d \log n)/\varepsilon)$ and $\Omega((d \log n - \log \varepsilon^{-1})/\varepsilon)$. However, their lower bound only applies for functions with a large image. Pallavoor et al. [PRV18] gave an $O\left(\frac{d}{\varepsilon} \cdot \log \frac{d}{\varepsilon} \cdot \log r\right)$-query tester, where $r$, the size of the image, is given to the tester as a parameter. It remains open whether there is an $O(\sqrt{d})$-query tester for Boolean functions on the hypergrid domain and, in particular, whether the isoperimetric inequality of [KMS18] can be extended to hypergrids. Note that since the Boolean Decomposition (Theorem 1.3) holds for all partially ordered domains, an isoperimetric inequality for Boolean functions on hypergrids would also generalize to real-valued functions.

In this section, we use our Boolean decomposition Theorem 1.3 to prove Theorem 1.2, which easily implies the non-robust version (Theorem 1.1) as we point out in the introduction.
Let $f \colon \{0,1\}^d \to \mathbb{R}$ be a non-monotone function over the $d$-dimensional hypercube and let $\mathrm{col} \colon \mathcal{S}_f^- \to \{\mathrm{red}, \mathrm{blue}\}$ be an arbitrary 2-coloring of $\mathcal{S}_f^-$. Given $x \in \{0,1\}^d$ and a subgraph $H$ of the $d$-dimensional hypercube, we define the quantities
$$I_{f,\mathrm{red},H}^-(x) = \left|\left\{ y : (x, y) \in \mathcal{S}_f^- \cap E(H),\ \mathrm{col}(x, y) = \mathrm{red} \right\}\right|; \qquad I_{f,\mathrm{blue},H}^-(y) = \left|\left\{ x : (x, y) \in \mathcal{S}_f^- \cap E(H),\ \mathrm{col}(x, y) = \mathrm{blue} \right\}\right|.$$
Let $f_1, \dots, f_k \colon \{0,1\}^d \to \{0,1\}$ be the Boolean functions and $H_1, \dots, H_k$ be the disjoint subgraphs of the $d$-dimensional hypercube that are guaranteed by Theorem 1.3. Let $C'$ denote the constant from the robust Boolean isoperimetric inequality (Theorem 2.7 of [PRW20]) that is hidden by $\Omega$. We have
$$
\begin{aligned}
\mathbb{E}_{x \sim \{0,1\}^d}\left[\sqrt{I_{f,\mathrm{red}}^-(x)}\right] + \mathbb{E}_{y \sim \{0,1\}^d}\left[\sqrt{I_{f,\mathrm{blue}}^-(y)}\right]
&\ge \mathbb{E}_{x}\left[\sqrt{I_{f,\mathrm{red},\bigcup_{i=1}^{k} H_i}^-(x)}\right] + \mathbb{E}_{y}\left[\sqrt{I_{f,\mathrm{blue},\bigcup_{i=1}^{k} H_i}^-(y)}\right] && (3)\\
&= \sum_{i=1}^{k} \left( \mathbb{E}_{x}\left[\sqrt{I_{f,\mathrm{red},H_i}^-(x)}\right] + \mathbb{E}_{y}\left[\sqrt{I_{f,\mathrm{blue},H_i}^-(y)}\right] \right) && (4)\\
&\ge \sum_{i=1}^{k} \left( \mathbb{E}_{x}\left[\sqrt{I_{f_i,\mathrm{red},H_i}^-(x)}\right] + \mathbb{E}_{y}\left[\sqrt{I_{f_i,\mathrm{blue},H_i}^-(y)}\right] \right) && (5)\\
&= \sum_{i=1}^{k} \left( \mathbb{E}_{x}\left[\sqrt{I_{f_i,\mathrm{red}}^-(x)}\right] + \mathbb{E}_{y}\left[\sqrt{I_{f_i,\mathrm{blue}}^-(y)}\right] \right) && (6)\\
&\ge \sum_{i=1}^{k} C' \cdot \varepsilon(f_i) && (7)\\
&\ge \frac{C' \cdot \varepsilon(f)}{2}. && (8)
\end{aligned}
$$
The inequality (3) holds simply because $\bigcup_{i=1}^{k} H_i$ is a subgraph of the $d$-dimensional hypercube, while the equality (4) holds because the $H_i$'s are disjoint. The inequality (5) holds since $\mathcal{S}_{f_i}^- \subseteq \mathcal{S}_f^-$ and the equality (6) holds since $\mathcal{S}_{f_i}^- \subseteq E(H_i)$ (these are both by item 2 of Theorem 1.3). Finally, (7) is due to Theorem 2.7 of [PRW20] and (8) is due to item 1 of Theorem 1.3.

In this section, we prove the Boolean Decomposition Theorem 1.3. Our results consider any partially ordered domain, which we represent by a DAG $G$. The transitive closure of $G$, denoted $\mathrm{TC}(G)$, is the graph $(V(G), E)$, where $E = \{(x, y) : x \prec y\}$. The violation graph of $f$ is the graph $(V(G), E')$, where $E'$ is the set of edges in $E$ violated by $f$.

In Section 3.1, we define the key notion of sweeping graphs and identify some of their important properties. In Section 3.2, we prove a general lemma that shows how to use a matching $M$ in $\mathrm{TC}(G)$ to find disjoint sweeping graphs in $G$ satisfying a "matching rearrangement" property. The techniques in Section 3.1 and Section 3.2 are inspired by the techniques of [BCS18] used to analyze Boolean functions on the hypergrid domain, $[n]^d$. In Section 3.3, we apply our matching decomposition lemma to a carefully chosen matching to obtain the subgraphs $H_1, \dots, H_k$. Finally, in Section 3.4, we define the Boolean functions $f_1, \dots, f_k$ and complete the proof of Theorem 1.3.

Definition 3.1 ($(S, T)$-Sweeping Graphs). Given a DAG $G$ and $s, t \in V(G)$, define $H(s, t)$ to be the subgraph of $G$ formed by the union of all directed paths in $G$ from $s$ to $t$. Given two disjoint subsets $S, T \subseteq V(G)$, define the $(S, T)$-sweeping graph, denoted $H(S, T)$, to be the union of directed paths in $G$ that start from some $s \in S$ and end at some $t \in T$. That is,
$$H(S, T) = \bigcup_{(s, t) \in S \times T} H(s, t).$$
Note that if $s \not\preceq t$ then $H(s, t) = \emptyset$.

We now prove three properties of sweeping graphs which we use in Section 3.4 to analyze our functions $f_1, \dots, f_k$. Given disjoint sets $S, T \subseteq V(G)$ and $z \in V(H(S, T))$, define the sets $S(z) = \{s \in S : s \preceq z\}$ and $T(z) = \{t \in T : z \preceq t\}$.

Claim 3.2 (Properties of Sweeping Graphs). Let $G$ be a DAG and $S, T \subseteq V(G)$ be disjoint sets.
1. (Property of Nodes in a Sweeping Graph): If $z \in V(H(S, T))$ then $S(z) \ne \emptyset$ and $T(z) \ne \emptyset$.
2. (Property of Nodes Outside of a Sweeping Graph): If $z \in V(G) \setminus V(H(S, T))$ then at most one of the following is true: (a) $\exists y \in V(H(S, T))$ such that $z \prec y$, (b) $\exists x \in V(H(S, T))$ such that $x \prec z$.
3. (Sweeping Graphs are Induced): If $x, y \in V(H(S, T))$ and $(x, y) \in E(G)$ then $(x, y) \in E(H(S, T))$.

Proof. Property 1 holds by definition of the sweeping graph $H(S, T)$. If $z \in V(H(S, T))$, then, by definition of $H(S, T)$, there exist $s \in S$ and $t \in T$ for which $z$ belongs to some directed path from $s$ to $t$. That is, $z \in V(H(s, t))$. Thus $s \in S(z)$ and $t \in T(z)$, and property 1 holds.

We now prove property 2. Suppose, for the sake of contradiction, that there exist $x, y, z \in V(G)$ for which $x, y \in V(H(S, T))$, $z \notin V(H(S, T))$, and $x \prec z \prec y$. By property 1, there exist some $s \in S(x)$ and some $t \in T(y)$. Then $s \preceq x \prec z \prec y \preceq t$ and, consequently, $z$ belongs to some directed path from $s$ to $t$. Thus $z \in V(H(s, t))$, and so $z \in V(H(S, T))$. This is a contradiction.

We now prove property 3. Suppose $x, y \in V(H(S, T))$ and $(x, y) \in E(G)$. By property 1, there exist $s \in S$ and $t \in T$ for which $s \preceq x$ and $y \preceq t$. Since $(x, y) \in E(G)$, we have $x \prec y$ and so $s \preceq x \prec y \preceq t$. Thus, the edge $(x, y)$ belongs to a directed path from $s$ to $t$. That is, $(x, y) \in E(H(s, t))$ and so $(x, y) \in E(H(S, T))$. $\blacksquare$
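Definition 3.1 and Claim 3.2 suggest a direct way to compute a sweeping graph: a vertex lies in $H(S, T)$ exactly when it is reachable from some $s \in S$ and reaches some $t \in T$, and by property 3 the edge set is induced by those vertices. The Python sketch below illustrates this under stated assumptions: the adjacency-dictionary representation (every vertex appears as a key), the function names, and the small example DAG are all hypothetical choices made for the illustration.

```python
def reachable(adj, start):
    """All vertices reachable from start (including start) by directed paths."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def sweeping_graph(adj, S, T):
    """Vertices and edges of H(S, T) for a DAG given as {vertex: [out-neighbors]}."""
    reverse = {v: [] for v in adj}
    for u, outs in adj.items():
        for v in outs:
            reverse[v].append(u)
    above_S = set().union(*(reachable(adj, s) for s in S))      # z with s <= z
    below_T = set().union(*(reachable(reverse, t) for t in T))  # z with z <= t
    vertices = above_S & below_T
    # Property 3 of Claim 3.2: H(S, T) is the subgraph induced on its vertices.
    edges = {(u, v) for u in vertices for v in adj[u] if v in vertices}
    return vertices, edges


# Example DAG with edges a->b, b->c, a->d, e->c.
dag = {"a": ["b", "d"], "b": ["c"], "c": [], "d": [], "e": ["c"]}
print(sweeping_graph(dag, {"a"}, {"c"}))
```

In the example, the vertices `d` and `e` lie outside $H(\{a\}, \{c\})$, and each is comparable to the sweeping graph from one side only, in line with property 2.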
In this section, we prove the following matching decomposition lemma. Recall that $\mathrm{TC}(G) = (V(G), E)$ denotes the transitive closure of the DAG $G$, where $E = \{(x, y) : x \prec y\}$. Consider a matching $M$ in $\mathrm{TC}(G)$. We represent $M \colon S \to T$ as a bijection between two disjoint sets $S, T \subseteq V(G)$ of the same size for which $s \prec M(s)$ for all $s \in S$. For a set $S' \subseteq S$, define $M(S') = \{M(s) : s \in S'\}$. Note that for convenience we will sometimes abuse notation and represent $M$ as the set of pairs, $\{(s, M(s)) : s \in S\}$, instead of as a bijection.

Lemma 3.3 (Matching Decomposition Lemma for DAGs). For every DAG $G$ and every matching $M \colon S \to T$ in $\mathrm{TC}(G)$, there exist partitions $(S_i : i \in [k])$ of $S$ and $(T_i : i \in [k])$ of $T$, where $M(S_i) = T_i$ for all $i \in [k]$, and the following hold.
1. (Sweeping Graph Disjointness): $V(H(S_i, T_i)) \cap V(H(S_j, T_j)) = \emptyset$ for all $i \ne j$, where $i, j \in [k]$.
2. (Matching Rearrangement Property): For all $i \in [k]$ and $(x, y) \in S_i \times T_i$, if $x \prec y$ then there exists a matching $\widehat{M} \colon S_i \to T_i$ in $\mathrm{TC}(G)$ for which $(x, y) \in \widehat{M}$.

Proof. In Algorithm 1, we show how to construct partitions $(S_i : i \in [k])$ for $S$ and $(T_i : i \in [k])$ for $T$ from a matching $M$ in $\mathrm{TC}(G)$. Algorithm 1 uses the following notion of conflicting pairs.

Definition 3.4 (Conflicting Pairs). Given a DAG $G$ and four disjoint sets $X, Y, X', Y' \subset V(G)$, we say that the two pairs $(X, Y)$ and $(X', Y')$ conflict if $V(H(X, Y)) \cap V(H(X', Y')) \ne \emptyset$.

The following observation is apparent and by design of Algorithm 1.
Observation 3.5 (Loop Invariants of Algorithm 1). For all $s \in \{0, 1, \ldots, s^*\}$, (a) $M(X) = Y$ for all $(X, Y) \in \mathcal{Q}_s$, (b) $(X : (X, \cdot) \in \mathcal{Q}_s)$ is a partition of $S$, and (c) $(Y : (\cdot, Y) \in \mathcal{Q}_s)$ is a partition of $T$.

Given a matching $M : S \to T$ in $\mathrm{TC}(G)$, we run Algorithm 1 to obtain the set $\mathcal{Q}_{s^*}$. See Fig. 1 for an illustration. Define $k = |\mathcal{Q}_{s^*}|$ and let $\{(S_i, T_i) : i \in [k]\}$ be the set of pairs in $\mathcal{Q}_{s^*}$. By Observation 3.5, $(S_i : i \in [k])$ is a partition of $S$, $(T_i : i \in [k])$ is a partition of $T$, and $M(S_i) = T_i$ for all $i \in [k]$. Item 1 of Lemma 3.3 holds since Algorithm 1 terminates at step $s$ only when all pairs in $\mathcal{Q}_s$ are non-conflicting (recall Definition 3.4). Thus, to prove Lemma 3.3 it only remains to prove item 2. To do so, we prove the following Claim 3.6, which easily implies item 2. Note that while we only require Claim 3.6 to hold for the special case of $s = s^*$, using an inductive argument on $s$ allows us to give a proof for all $s \in \{0, 1, \ldots, s^*\}$.

Algorithm 1 Algorithm for constructing conflict-free pairs from a matching $M$
Input: A DAG $G$ and a matching $M : S \to T$ in $\mathrm{TC}(G)$.
  $\mathcal{Q}_0 \leftarrow \{(\{x\}, \{y\}) : (x, y) \in M\}$    ▷ Initialize pairs using $M$
  for $s \geq 0$ do
    if two pairs $(X, Y) \neq (X', Y') \in \mathcal{Q}_s$ conflict then
      $\mathcal{Q}_{s+1} \leftarrow (\mathcal{Q}_s \setminus \{(X, Y), (X', Y')\}) \cup \{(X \cup X', Y \cup Y')\}$    ▷ Merge conflicting pairs
    else
      $s^* \leftarrow s$ and return $\mathcal{Q}_{s^*}$    ▷ Terminate when there are no conflicts

Figure 1: An illustration for Algorithm 1 with input matching $M = \{(a, x), (b, y), (c, z)\}$. We initialize $\mathcal{Q}_0 = \{(\{a\}, \{x\}), (\{b\}, \{y\}), (\{c\}, \{z\})\}$. The pairs $(\{a\}, \{x\})$ and $(\{b\}, \{y\})$ conflict, so we merge them to obtain a new and final collection $\mathcal{Q}_1 = \{(\{a, b\}, \{x, y\}), (\{c\}, \{z\})\}$.

Claim 3.6 (Rematching Claim). For all $s \in \{0, 1, \ldots, s^*\}$, pairs $(X, Y) \in \mathcal{Q}_s$, and $(x, y) \in X \times Y$, there exists a matching $\widehat{M} : X \setminus \{x\} \to Y \setminus \{y\}$ in $\mathrm{TC}(G)$.

Proof. The proof is by induction on $s$. For the base case, if $s = 0$, then, by inspection of Algorithm 1, for $(X, Y) \in \mathcal{Q}_0$, we must have $X = \{x\}$ and $Y = \{y\}$. Thus, setting $\widehat{M} = \emptyset$ trivially proves the claim.

Now let $s > 0$. Fix some $(X, Y) \in \mathcal{Q}_s$ and $(x, y) \in X \times Y$. Let $(X_1, Y_1), (X_2, Y_2) \in \mathcal{Q}_{s-1}$ be the pairs of sets in $\mathcal{Q}_{s-1}$ for which $x \in X_1$ and $y \in Y_2$. First, if $(X_1, Y_1) = (X_2, Y_2)$, then by induction there exists a matching $\widehat{M}' : X_1 \setminus \{x\} \to Y_1 \setminus \{y\}$ in $\mathrm{TC}(G)$. Note that by definition of Algorithm 1, we must have $X_1 \subseteq X$ and $Y_1 \subseteq Y$. Then the required matching is $\widehat{M} = \widehat{M}' \cup M|_{X \setminus X_1}$, where $M|_{(\cdot)}$ denotes the restriction of the original matching $M$ to the set $(\cdot)$.

Suppose $(X_1, Y_1) \neq (X_2, Y_2)$. This is the interesting case, and we give an accompanying illustration in Fig. 2. By definition of Algorithm 1, it must be that $(X_1, Y_1)$ and $(X_2, Y_2)$ conflict (recall Definition 3.4) and were merged to form $X = X_1 \cup X_2$ and $Y = Y_1 \cup Y_2$. Thus, there exist some vertex $z \in V(H(X_1, Y_1)) \cap V(H(X_2, Y_2))$ and $x_1 \in X_1$, $y_1 \in Y_1$, $x_2 \in X_2$, $y_2 \in Y_2$ for which $x_1 \preceq z \preceq y_1$ and $x_2 \preceq z \preceq y_2$.

We now invoke the inductive hypothesis to get matchings $\widehat{M}_1 : X_1 \setminus \{x\} \to Y_1 \setminus \{y_1\}$ and $\widehat{M}_2 : X_2 \setminus \{x_2\} \to Y_2 \setminus \{y\}$ in $\mathrm{TC}(G)$. Observe that $x_2 \preceq z \preceq y_1$ and thus we can match $x_2$ and $y_1$. The required matching in $\mathrm{TC}(G)$ is $\widehat{M} = \widehat{M}_1 \cup \widehat{M}_2 \cup \{(x_2, y_1)\}$. ∎

We conclude the proof of Lemma 3.3 by showing that Claim 3.6 implies item 2. We are given $(S_i, T_i) \in \mathcal{Q}_{s^*}$ for some $i \in [k]$ and $(x, y) \in S_i \times T_i$ where $x \prec y$. By Claim 3.6 there exists a matching $\widehat{M}' : S_i \setminus \{x\} \to T_i \setminus \{y\}$ in $\mathrm{TC}(G)$. We then set $\widehat{M} = \widehat{M}' \cup \{(x, y)\}$. Since $x \prec y$, the final matching $\widehat{M} : S_i \to T_i$ is a matching in $\mathrm{TC}(G)$ which contains the pair $(x, y)$. ∎

Figure 2: An illustration for the case $(X_1, Y_1) \neq (X_2, Y_2)$ in the proof of Claim 3.6. The solid lines represent directed paths. The dotted line represents the pair $(x_2, y_1)$ added to obtain the final matching $\widehat{M}$. The only vertices of $X \cup Y$ not participating in $\widehat{M}$ are $x$ and $y$.

$H_1, \ldots, H_k$

In this section, we apply Lemma 3.3 to a carefully chosen matching $M$ in order to construct our disjoint subgraphs $H_1, \ldots, H_k$.

Definition 3.7 (Max-weight, Min-cardinality Matching). A matching $M$ in $\mathrm{TC}(G)$ is a max-weight, min-cardinality matching if $M$ maximizes $\sum_{(x,y) \in M} (f(x) - f(y))$ and among such matchings minimizes $|M|$.

Henceforth, let $M$ denote a max-weight, min-cardinality matching. Let $S$ and $T$ denote the sets of lower and upper endpoints, respectively, of $M$. We use the following well-known fact on matchings in the violation graph.

Fact 3.8 (Corollary 2 [FLN+02]). For a DAG $G$ and function $f : V(G) \to \mathbb{R}$, the distance to monotonicity $\varepsilon(f)$ is equal to the size of the minimum vertex cover of the violation graph of $f$ divided by $|V(G)|$.

Fact 3.9. $M$ is a matching in the violation graph of $f$ that is also maximal. That is, (a) $f(x) > f(y)$ for all $(x, y) \in M$ and (b) $|M| \geq (\varepsilon(f) \cdot |V(G)|)/2$.

Proof. First, for the sake of contradiction, suppose $f(x) \leq f(y)$ for some pair $(x, y) \in M$. Then we can set $M' = M \setminus \{(x, y)\}$, which can only increase $\sum_{(x,y) \in M} (f(x) - f(y))$ and decreases the cardinality by 1. This contradicts the definition of $M$. Thus, $f(x) > f(y)$ for all $(x, y) \in M$ and so $M$ is a matching in the violation graph of $f$. Second, since $M$ maximizes $\sum_{(x,y) \in M} (f(x) - f(y))$, it must also be a maximal matching in the violation graph of $f$. Thus, (b) follows from Fact 3.8 and the fact that the size of any maximal matching is at least half the size of the minimum vertex cover. ∎

We now apply Lemma 3.3 to $M$, obtaining the partitions $(S_i : i \in [k])$ and $(T_i : i \in [k])$ for $S$ and $T$, respectively, for which $M(S_i) = T_i$ for all $i \in [k]$. For each $i \in [k]$, let $H_i = H(S_i, T_i)$. We use the collection of sweeping graphs $H_1, \ldots, H_k$ to prove Theorem 1.3. Note that these subgraphs are all vertex disjoint by item 1 of Lemma 3.3. We use item 2 of Lemma 3.3 to prove the following lemma regarding the $(S_i, T_i)$ pairs. The proof crucially relies on the fact that $M$ is a max-weight, min-cardinality matching.

Lemma 3.10 (Property of the Pairs $(S_i, T_i)$). For all $i \in [k]$ and $(x, y) \in S_i \times T_i$, if $x \prec y$ then $f(x) > f(y)$.

Proof. Suppose there exists $i \in [k]$, $x \in S_i$, and $y \in T_i$ for which $x \prec y$ and $f(x) \leq f(y)$. By item 2 of Lemma 3.3 there exists a matching $\widehat{M} : S \to T$ in $\mathrm{TC}(G)$ for which $(x, y) \in \widehat{M}$.
In particular, since $M$ and $\widehat{M}$ have identical sets of lower and upper endpoints,
\[ \sum_{(s,t) \in \widehat{M}} (f(s) - f(t)) = \sum_{(s,t) \in M} (f(s) - f(t)) \quad \text{and} \quad |\widehat{M}| = |M|. \]
Now let $\widehat{M}' = \widehat{M} \setminus \{(x, y)\}$ and observe that since $f(x) \leq f(y)$,
\[ \sum_{(s,t) \in \widehat{M}'} (f(s) - f(t)) \geq \sum_{(s,t) \in M} (f(s) - f(t)) \quad \text{and} \quad |\widehat{M}'| < |M|. \]
Therefore, $M$ is not a max-weight, min-cardinality matching, which is a contradiction. ∎

$f_1, \ldots, f_k$

We are now equipped to define the functions $f_1, \ldots, f_k : V(G) \to \{0, 1\}$ and complete the proof of Theorem 1.3. First, given $i \in [k]$ and $z \in V(G) \setminus V(H_i)$, we say that $z$ is below $H_i$ if there exists $y \in V(H_i)$ for which $z \prec y$, and $z$ is above $H_i$ if there exists $x \in V(H_i)$ for which $x \prec z$. Since $H_i$ is the $(S_i, T_i)$-sweeping graph, by item 2 of Claim 3.2, vertex $z$ cannot be both below and above $H_i$ simultaneously. Second, given $z \in V(H_i)$, we define the set $T_i(z) = \{t \in T_i : z \preceq t\}$. Note that by item 1 of Claim 3.2, $T_i(z) \neq \emptyset$ for all $z \in V(H_i)$, and so the quantity $\max_{t \in T_i(z)} f(t)$ is always well-defined.

Definition 3.11. For each $i \in [k]$, define the function $f_i : V(G) \to \{0, 1\}$ as follows. For every $z \in V(G)$,
\[ f_i(z) = \begin{cases} 1, & \text{if } z \in V(H_i) \text{ and } f(z) > \max_{t \in T_i(z)} f(t), \\ 0, & \text{if } z \in V(H_i) \text{ and } f(z) \leq \max_{t \in T_i(z)} f(t), \\ 1, & \text{if } z \notin V(H_i) \text{ and } z \text{ is above } H_i, \\ 0, & \text{if } z \notin V(H_i) \text{ and } z \text{ is not above } H_i. \end{cases} \]
See Fig. 3 for an illustration of the values of $f_i$.

We first prove item 1 of Theorem 1.3. Recall that $M(S_i) = T_i$ for all $i \in [k]$. Let $M_i = M|_{S_i}$ denote the matching $M$ restricted to $S_i$. Consider $x \in S_i$. By Lemma 3.10, $f(x) > f(y)$ for all $y \in T_i$ such that $x \prec y$. Thus $f(x) > \max_{t \in T_i(x)} f(t)$ and so $f_i(x) = 1$. Now consider $y \in T_i$. Observe that $y \in T_i(y)$. Thus, clearly, $f(y) \leq \max_{t \in T_i(y)} f(t)$, and so $f_i(y) = 0$. Therefore, $f_i(x) = 1$ for all $x \in S_i$ and $f_i(y) = 0$ for all $y \in T_i$. In particular, $f_i(x) = 1 > f_i(M(x))$ for all $x \in S_i$ and so $M_i$ is a matching in the violation graph of $f_i$. Thus, $\varepsilon(f_i) \geq \frac{|M_i|}{|V(G)|}$ for all $i \in [k]$. It follows that
\[ \sum_{i=1}^{k} \varepsilon(f_i) \geq |V(G)|^{-1} \sum_{i=1}^{k} |M_i| = |V(G)|^{-1} \cdot |M| \geq |V(G)|^{-1} \cdot \frac{\varepsilon(f) \cdot |V(G)|}{2} = \frac{\varepsilon(f)}{2} \]
by the above argument and Fact 3.9. Thus, item 1 of Theorem 1.3 holds.

To prove item 2 of Theorem 1.3, we need to show that for all $i \in [k]$ the following hold: $\mathcal{S}^-_{f_i} \subseteq E(H_i)$ and $\mathcal{S}^-_{f_i} \subseteq \mathcal{S}^-_f$.

We first prove that $\mathcal{S}^-_{f_i} \subseteq E(H_i)$. Consider an edge $(x, y) \in E(G) \setminus E(H_i)$. We need to show that $f_i(x) \leq f_i(y)$. First, observe that if both $x, y \in V(H_i)$, then by item 3 of Claim 3.2, we have $(x, y) \in E(H_i)$. Thus, we only need to consider the following three cases. Recall that $f_i(x), f_i(y) \in \{0, 1\}$.
1. $x \in V(H_i)$, $y \notin V(H_i)$: In this case, $y$ is above $H_i$, and so $f_i(y) = 1$. Thus, $f_i(x) \leq f_i(y)$.
2. $x \notin V(H_i)$, $y \in V(H_i)$: In this case, $x$ is below $H_i$, and so $x$ is not above $H_i$ by item 2 of Claim 3.2. Thus, $f_i(x) = 0$, and so $f_i(x) \leq f_i(y)$.
3. $x \notin V(H_i)$, $y \notin V(H_i)$: If $x$ is above $H_i$, then $y$ is above $H_i$ as well, and so $f_i(x) = f_i(y) = 1$. Otherwise, $x$ is not above $H_i$ and so $f_i(x) = 0$. Thus, $f_i(x) \leq f_i(y)$.
Therefore, $\mathcal{S}^-_{f_i} \subseteq E(H_i)$.

We now prove that $\mathcal{S}^-_{f_i} \subseteq \mathcal{S}^-_f$. Consider an edge $(x, y) \in \mathcal{S}^-_{f_i}$. Then $f_i(x) = 1$ and $f_i(y) = 0$. Since $\mathcal{S}^-_{f_i} \subseteq E(H_i)$, we have $(x, y) \in E(H_i)$ and so $x, y \in V(H_i)$. By definition of the functions $f_i$, it holds that $f(x) > \max_{t \in T_i(x)} f(t)$ and $f(y) \leq \max_{t \in T_i(y)} f(t)$. Since $x \prec y$, we have $T_i(y) \subseteq T_i(x)$, because all vertices reachable from $y$ are also reachable from $x$. Therefore, $f(x) > \max_{t \in T_i(x)} f(t) \geq \max_{t \in T_i(y)} f(t) \geq f(y)$. Thus $f(x) > f(y)$, and so $(x, y) \in \mathcal{S}^-_f$. As a result, $\mathcal{S}^-_{f_i} \subseteq \mathcal{S}^-_f$ and item 2 of Theorem 1.3 holds. This concludes the proof of Theorem 1.3.

Figure 3: An illustration for the Boolean function $f_i$ of Definition 3.11. The diamond represents the DAG $G$ whose paths are directed from bottom to top. The hexagon represents the sweeping graph $H_i = H(S_i, T_i)$. The value of $f_i$ is 1 for the vertices in $S_i$ and 0 for the vertices in $T_i$. For vertices outside of $H_i$, its value is 1 for those vertices which are above $H_i$ and 0 for vertices which are not above $H_i$.

In this section, we prove Theorem 1.6. We show that the tester of [KMS18] for Boolean functions can be employed to test monotonicity of real-valued functions. The tester is simple: it queries two comparable vertices $x$ and $y$ and rejects if the pair exhibits a violation to monotonicity for $f$. The tester tries different values $\tau$ for the distance between $x$ and $y$, that is, the number of coordinates on which they differ.
The key step in the analysis of [KMS18] (and in our analysis) is to show that for some choice of $\tau$, the tester will detect a violation to monotonicity with high enough probability. The extra factor of $r$ in the query complexity of our tester arises because we are forced to choose $\tau$ which is a factor of $(r - 1)$ smaller than for the Boolean case. Intuitively, the reason for this is that as the walk length $\tau$ increases, the probability that the function value stays below a certain threshold decreases. We make this precise in Section 4.2.

We first define the distribution from which the tester samples $x$ and $y$. Following this, we present the tester as Algorithm 2. Let $p$ denote the largest integer for which $2^p \leq \sqrt{d / \log d}$. In Algorithm 2, we sample pairs of vertices at distance $\tau$, where $\tau$ ranges over the powers of two up to $2^p$.

Definition 4.1 (Pair Test Distribution). Given parameters $b \in \{0, 1\}$ and a positive integer $\tau$, define the following distribution $\mathcal{D}_{\mathrm{pair}}(b, \tau)$ over pairs $(x, y) \in (\{0, 1\}^d)^2$. Sample $x$ uniformly from $\{0, 1\}^d$. Let $S = \{i \in [d] : x_i = b\}$. If $\tau > |S|$, then set $y = x$. Otherwise, sample a uniformly random set $T \subseteq S$ of size $|T| = \tau$. Obtain $y$ by setting $y_i = 1 - x_i$ if $i \in T$ and $y_i = x_i$ otherwise.

Our tester only uses comparisons between function values, not the values themselves. Thus, for the purposes of our analysis we can consider functions with the range $[r]$ w.l.o.g.

When $\tau = 1$, the algorithm is simply sampling edges from the $d$-dimensional hypercube. The distribution from which we sample is not the uniform distribution on edges, but following an argument from [KMS18], we can assume that for $\tau = 1$, our tester has the same guarantees as the edge tester.

Algorithm 2 Monotonicity Tester for $f : \{0, 1\}^d \to \mathbb{R}$
Input:
Parameters ε ∈ (0 , d , and image size r ; oracle access to function f : { , } d → R . for all b ∈ { , } and τ ∈ { , , , . . . , p } do repeat (cid:101) O (cid:16) min (cid:0) r √ dε , dε (cid:1)(cid:17) times: Sample ( x , y ) ∼ D pair ( b, τ ). if b = 0 and f ( x ) > f ( y ) then reject . (cid:46) if b = 0 then x (cid:22) y if b = 1 and f ( x ) < f ( y ) then reject . (cid:46) if b = 1 then x (cid:23) y accept .The choice of the distance parameter τ for which the rejection probability of the tester is high depends onthe existence of a certain “good” bipartite subgraph of violated edges. Our analysis differs from the analysisof [KMS18] both in how we obtain the “good” subgraph of violated edges and in the choice of the optimaldistance parameter τ .We extend the following definitions from [KMS18]. Let G ( A, B, E AB ) denote a directed bipartite graphwith vertex sets A and B and all edges in E AB directed from A to B . Definition 4.2 (( K, ∆)-Good Graphs) . A directed bipartite graph G ( A, B, E AB ) is ( K, ∆) -good if for X, Y such that either X = A , Y = B or X = B , Y = A , we have: (a) | X | = K . (b) Vertices in X havedegree exactly ∆ . (c) Vertices in Y have degree at most . The graph G is ( K, ∆) -left-good if X = A and ( K, ∆) -right-good if X = B . The weight of x ∈ { , } d , denoted by | x | , is the number of coordinates of x with value 1. Definition 4.3 (Persistence) . Given a function f : { , } d → [ r ] and an integer τ ∈ (cid:104) , (cid:113) d log d (cid:105) , a vertex x ∈ { , } d of weight in the range d ± O ( √ d log d ) is τ -right-persistent for f if Pr y [ f ( y ) ≤ f ( x )] > , where y is obtained by choosing a uniformly random set T ⊂ { i ∈ [ d ] : x i = 0 } of size τ and setting y i = 1 − x i if i ∈ T and y i = x i otherwise . We define τ -left-persistence symmetrically. We use the following technical claim implicitly shown in the analysis of the tester of [KMS18].
Claim 4.4 ([KMS18]) . Suppose there exists a ( K, ∆) -right-good subgraph G ( A, B, E AB ) of the directed d -dimensional hypercube, such that (a) E AB ⊆ S − f , (b) K √ ∆ = Θ( ε ( f ) · d log d ) , and (c) at least | B | of thevertices in B are ( τ (cid:48) − -right-persistent for some τ (cid:48) such that τ (cid:48) · ∆ (cid:28) d . Then there exists a constant C (cid:48) > , such that for ( x , y ) ∼ D pair (0 , τ (cid:48) ) , Pr x , y [ f ( x ) > f ( y )] ≥ C (cid:48) · τ (cid:48) d · K d · ∆ . The analogous claim holds given a ( K, ∆)-left-good subgraph with many ( τ (cid:48) − A and ( x , y ) drawn from D pair (1 , τ (cid:48) ).In Section 4.1, we prove Lemma 4.6 which obtains a good subgraph for f satisfying conditions (a) and (b)of Claim 4.4. In Section 4.2, we prove Lemma 4.8 which gives an upper bound on the fraction of non-persistentvertices, enabling us to satisfy condition (c). Finally, in Section 4.3, we use Lemma 4.6 and Lemma 4.8 toshow that the conditions of Claim 4.4 are satisfied. Finally, we use it to prove Theorem 1.6. Note that τ ≥ |{ i ∈ [ d ]: x i = 0 }| by our assumption on x and τ . .1 Existence of a Good Bipartite Subgraph In this section, we prove Lemma 4.6 on the existence of good bipartite subgraphs for real-valued functions,which was proved in [KMS18] for the special case of Boolean functions. This lemma crucially relies on ourisoperimetric inequality for real-valued functions (Theorem 1.2). We first state (without proof) a combina-torial result of [KMS18], which we need for our lemma.
Lemma 4.5 (Lemma 6.5 of [KMS18]). Let $G(A, B, E_{AB})$ be a directed bipartite graph whose vertices have degree at most $2^s$. Suppose in addition that for any 2-coloring of its edges $\mathrm{col} : E_{AB} \to \{\mathrm{red}, \mathrm{blue}\}$ we have
\[ \sum_{x \in A} \sqrt{\deg_{\mathrm{red}}(x)} + \sum_{y \in B} \sqrt{\deg_{\mathrm{blue}}(y)} \geq L, \qquad (9) \]
where $\deg_{\mathrm{red}}(x)$ denotes the number of red edges incident on $x$ and $\deg_{\mathrm{blue}}(y)$ denotes the number of blue edges incident on $y$. Then $G(A, B, E_{AB})$ contains a subgraph that is $(K, \Delta)$-good with $K\sqrt{\Delta} \geq \frac{L}{8s}$.

We can now generalize Lemma 7.1 of [KMS18].
Lemma 4.6. For all functions $f : \{0, 1\}^d \to \mathbb{R}$, there exists a subgraph $G(A, B, E_{AB})$ of the directed, $d$-dimensional hypercube which is $(K, \Delta)$-good, where $K\sqrt{\Delta} = \Theta\left(\frac{\varepsilon(f) \cdot 2^d}{\log d}\right)$ and $E_{AB} \subseteq \mathcal{S}^-_f$.

Proof. Our proof relies on Lemma 4.5. Condition (9) is clearly reminiscent of the isoperimetric inequality in Theorem 1.2. We want to partition the vertices in $\{0, 1\}^d$ into sets $A$ and $B$ such that all the violated edges are directed from $A$ to $B$ and apply Theorem 1.2 to the resulting graph. In addition, we want (9) to hold for a big enough value of $L$. In the Boolean case, we can simply partition the vertices by function values. In contrast, for real-valued functions, a vertex $x \in \{0, 1\}^d$ can be incident on both incoming and outgoing violated edges. To overcome this challenge we resort to the bipartiteness of the directed hypercube, where each edge is between a vertex with an odd weight and a vertex with an even weight. Partition $\mathcal{S}^-_f$ into two sets:
\[ E_0 = \{(x, y) \in \mathcal{S}^-_f : |x| \text{ is even}\}; \qquad E_1 = \{(x, y) \in \mathcal{S}^-_f : |x| \text{ is odd}\}. \]
For $j \in \{0, 1\}$, let $V_j$ and $W_j$ denote the set of lower and upper endpoints, respectively, of the edges in $E_j$. We consider the two subgraphs $G_j(V_j, W_j, E_j)$ for $j \in \{0, 1\}$. Notice that the vertices in $V_0 \cup W_1$ have even weight and the vertices in $V_1 \cup W_0$ have odd weight. Obviously, $V_0$ and $W_1$ may not be disjoint, and similarly $V_1$ and $W_0$ may not be disjoint, and thus $G_0$ and $G_1$ may not be vertex disjoint.

We quickly explain why we cannot simply use Lemma 4.5 with either $G_0$ or $G_1$. Fix a 2-coloring of the edges $E_0 \cup E_1$. By averaging, one of the graphs will have a high enough contribution to the left-hand side of the isoperimetric inequality of Theorem 1.2. Assume this graph is $G_0$. As a result, condition (9) will hold for $G_0$ with $L = \Omega(\varepsilon \cdot 2^d)$. However, one cannot guarantee that condition (9) holds for all possible colorings of the edges of $G_0$.
Our construction below describes how to combine $G_0$ and $G_1$ so that we can jointly "feed" them into Lemma 4.5.

We construct copies $\widehat{G}_0$ and $\widehat{G}_1$ of $G_0$ and $G_1$, so that $\widehat{G}_0$ contains a vertex labelled $(x, 0)$ for each vertex $x$ of $G_0$, and $\widehat{G}_1$ contains a vertex $(x, 1)$ for each vertex $x$ of $G_1$. For each edge $(x, y)$ in $G_0$ we add an edge from $(x, 0)$ to $(y, 0)$ in $\widehat{G}_0$. We do the same for the edges of $G_1$. Note that each edge of $\mathcal{S}^-_f$ has exactly one copy, either in $\widehat{G}_0$ or $\widehat{G}_1$.

Let $\widehat{G}(\widehat{V}, \widehat{W}, \mathcal{S}^-_f)$ denote the union of the two disjoint graphs $\widehat{G}_0$ and $\widehat{G}_1$. That is,
\[ \widehat{V} = \{(x, 0) \mid x \in V_0\} \cup \{(x, 1) \mid x \in V_1\}, \qquad \widehat{W} = \{(y, 0) \mid y \in W_0\} \cup \{(y, 1) \mid y \in W_1\}. \]
All the edges of $\widehat{G}$ are directed from $\widehat{V}$ to $\widehat{W}$. Although imprecise, we think of the edges of $\widehat{G}$ as $\mathcal{S}^-_f$, since each edge in $\mathcal{S}^-_f$ has exactly one copy in $\widehat{G}$.

Consider a 2-coloring $\mathrm{col} : \mathcal{S}^-_f \to \{\mathrm{red}, \mathrm{blue}\}$. Observe that
\begin{align*}
\sum_{(x, \cdot) \in \widehat{V}} \sqrt{I^-_{f,\mathrm{red}}(x)} + \sum_{(y, \cdot) \in \widehat{W}} \sqrt{I^-_{f,\mathrm{blue}}(y)}
&= \sum_{x \in V_0 \cup V_1} \sqrt{I^-_{f,\mathrm{red}}(x)} + \sum_{y \in W_0 \cup W_1} \sqrt{I^-_{f,\mathrm{blue}}(y)} \\
&= \sum_{\substack{x \in \{0,1\}^d \\ |x| \text{ is even}}} \left( \sqrt{I^-_{f,\mathrm{red}}(x)} + \sqrt{I^-_{f,\mathrm{blue}}(x)} \right) + \sum_{\substack{x \in \{0,1\}^d \\ |x| \text{ is odd}}} \left( \sqrt{I^-_{f,\mathrm{red}}(x)} + \sqrt{I^-_{f,\mathrm{blue}}(x)} \right) \\
&= \sum_{x \in \{0,1\}^d} \sqrt{I^-_{f,\mathrm{red}}(x)} + \sum_{y \in \{0,1\}^d} \sqrt{I^-_{f,\mathrm{blue}}(y)} \geq C \cdot \varepsilon(f) \cdot 2^d,
\end{align*}
where the inequality holds by Theorem 1.2.

By construction, $I^-_{f,\mathrm{red}}(x) = \deg_{\mathrm{red}}((x, \cdot))$ for all $(x, \cdot) \in \widehat{V}$ and $I^-_{f,\mathrm{blue}}(y) = \deg_{\mathrm{blue}}((y, \cdot))$ for all $(y, \cdot) \in \widehat{W}$. We have that condition (9) of Lemma 4.5 holds with $L = C \cdot \varepsilon(f) \cdot 2^d$. Thus, $\widehat{G}$ contains a subgraph $G_{\mathrm{good}}(A, B, E_{AB})$ that is $(K, \Delta)$-good with $K\sqrt{\Delta} \geq \frac{L}{8 \log d}$. Without loss of generality, assume $G_{\mathrm{good}}(A, B, E_{AB})$ is $(K, \Delta)$-right-good.

Let $G_{\mathrm{good},0} = (A_0, B_0, E_{A_0 B_0})$ denote the subgraph of $G_{\mathrm{good}}$ lying in $\widehat{G}_0$ and let $G_{\mathrm{good},1} = (A_1, B_1, E_{A_1 B_1})$ denote the subgraph of $G_{\mathrm{good}}$ lying in $\widehat{G}_1$. Since $B_0 \cap B_1 = \emptyset$, we know that either $|B_0| \geq K/2$ or $|B_1| \geq K/2$. Without loss of generality, suppose $|B_0| \geq K/2$. Moreover, since $\widehat{G}_0$ and $\widehat{G}_1$ are vertex and edge disjoint subgraphs, the degree of a vertex of $A_0 \cup B_0$ in $G_{\mathrm{good},0}$ is the same as its degree in $G_{\mathrm{good}}$. Thus, $G_{\mathrm{good},0}$ is a $(K/2, \Delta)$-right-good subgraph of the $d$-dimensional directed hypercube for which $\frac{K}{2}\sqrt{\Delta} \geq \frac{L}{16 \log d}$.

By removing some vertices from $B_0$, and redefining $K$ if necessary, we may assume that $K\sqrt{\Delta} = \Theta\left(\frac{\varepsilon(f) \cdot 2^d}{\log d}\right)$. This completes the proof of Lemma 4.6. ∎

We prove Lemma 4.8 that bounds the number of non-persistent vertices for a function $f$ and a given distance parameter $\tau$. All results in this section also hold for $\tau$-left-persistence.

For a function $f : \{0, 1\}^d \to \mathbb{R}$, we define $I^-_f$ as $\frac{|\mathcal{S}^-_f|}{2^d}$.

Corollary 4.7 (Corollary of Theorem 6.6, Lemma 6.8 of [KMS18]). Consider a function $h : \{0, 1\}^d \to \{0, 1\}$ and an integer $\tau \in \left[1, \sqrt{\frac{d}{\log d}}\right]$. If $I^-_h \leq \sqrt{d}$ then
\[ \Pr_{x \sim \{0,1\}^d} \left[ x \text{ is not } \tau\text{-right-persistent for } h \right] = O\left(\frac{\tau}{\sqrt{d}}\right). \qquad (10) \]
We generalize the above result to functions with image size $r \geq 2$.

Lemma 4.8.
Consider a function $f : \{0, 1\}^d \to [r]$ and an integer $\tau \in \left[1, \sqrt{\frac{d}{\log d}}\right]$. If $I^-_f \leq \sqrt{d}$, then
\[ \Pr_{x \sim \{0,1\}^d} \left[ x \text{ is not } \tau\text{-right-persistent for } f \right] = (r - 1) \cdot O\left(\frac{\tau}{\sqrt{d}}\right). \]

Proof. For all $t \in [r]$, define the threshold function $h_t : \{0, 1\}^d \to \{0, 1\}$ as:
\[ h_t(x) = \begin{cases} 1 & \text{if } f(x) > t, \\ 0 & \text{otherwise.} \end{cases} \]
Observe that for all $t \in [r]$, we have $\mathcal{S}^-_{h_t} \subseteq \mathcal{S}^-_f$, and thus $I^-_{h_t} \leq I^-_f \leq \sqrt{d}$. By Corollary 4.7, we have that (10) holds for $h = h_t$ for all $t \in [r]$. Next, we point out that a vertex $x \in \{0, 1\}^d$ is $\tau$-right-persistent for $f$ if and only if $x$ is $\tau$-right-persistent for the Boolean function $h_{f(x)}$. To see this, consider a vertex $z$ such that $x \prec z$. First, note that $h_{f(x)}(x) = 0$. Second, note that $h_{f(x)}(z) = 1$ if and only if $f(z) > f(x)$ by definition of $h_{f(x)}$. Therefore, $f(z) \leq f(x)$ if and only if $h_{f(x)}(z) \leq h_{f(x)}(x)$. Finally, note that all vertices are persistent for $h_r$ since $h_r(x) = 0$ for all $x \in \{0, 1\}^d$. Using these observations, we have
\begin{align*}
\Pr_{x \sim \{0,1\}^d} [x \text{ is not } \tau\text{-right-persistent for } f]
&= \Pr_{x \sim \{0,1\}^d} \left[ x \text{ is not } \tau\text{-right-persistent for } h_{f(x)} \right] \\
&\leq \Pr_{x \sim \{0,1\}^d} \left[ \exists t \in [r - 1] : x \text{ is not } \tau\text{-right-persistent for } h_t \right] \\
&\leq \sum_{t=1}^{r-1} \Pr_{x \sim \{0,1\}^d} [x \text{ is not } \tau\text{-right-persistent for } h_t] \\
&= \sum_{t=1}^{r-1} O\left(\frac{\tau}{\sqrt{d}}\right) = (r - 1) \cdot O\left(\frac{\tau}{\sqrt{d}}\right),
\end{align*}
where the second inequality is by the union bound and the last equality is due to the fact that (10) holds for all $h_t$, $t \in [r]$. ∎

In this section, we show how to use Lemma 4.6 and Lemma 4.8 to ensure that the conditions of Claim 4.4 hold. Once the conditions are met, we prove Theorem 1.6.
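Before the proof, it may help to see the objects being analyzed in executable form. The following is a minimal sketch of the pair test distribution of Definition 4.1 and the loop structure of Algorithm 2, not the authors' implementation. Several things here are assumptions made for illustration: the function names, the base-2 logarithm in the bound on $\tau$, the requirement $d \geq 2$, and the `repetitions` argument, which stands in for the $\widetilde{O}(\cdot)$ repetition count whose hidden factors the text does not specify.

```python
import math
import random
from typing import Callable, List, Tuple

Point = Tuple[int, ...]


def sample_pair(d: int, b: int, tau: int, rng: random.Random) -> Tuple[Point, Point]:
    """Draw (x, y) from D_pair(b, tau), following Definition 4.1."""
    x = tuple(rng.randint(0, 1) for _ in range(d))
    candidates = [i for i in range(d) if x[i] == b]  # the set S
    if tau > len(candidates):
        return x, x
    flipped = set(rng.sample(candidates, tau))  # the set T, |T| = tau
    y = tuple(1 - x[i] if i in flipped else x[i] for i in range(d))
    return x, y


def walk_lengths(d: int) -> List[int]:
    """Powers of two 1, 2, 4, ..., 2^p with 2^p <= sqrt(d / log d).

    Assumption: log is base 2 and d >= 2.
    """
    bound = math.sqrt(d / math.log2(d))
    taus, tau = [1], 2
    while tau <= bound:
        taus.append(tau)
        tau *= 2
    return taus


def monotonicity_tester(
    f: Callable[[Point], float], d: int, repetitions: int, rng: random.Random
) -> bool:
    """Return True to accept and False to reject (1-sided error).

    `repetitions` is a hypothetical stand-in for the O~(.) count in Algorithm 2.
    """
    for b in (0, 1):
        for tau in walk_lengths(d):
            for _ in range(repetitions):
                x, y = sample_pair(d, b, tau, rng)
                if b == 0 and f(x) > f(y):  # here x is below y
                    return False
                if b == 1 and f(x) < f(y):  # here x is above y
                    return False
    return True
```

Because the tester only compares function values, a monotone `f` is never rejected in this sketch, which mirrors the 1-sided error guarantee.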
Proof of Theorem 1.6.
Let G ( A, B, E AB ) be the ( K, ∆)-good subgraph for f which we obtain from Lemma 4.6.Then K √ ∆ = Θ( ε ( f ) · d log d ) and E AB ⊆ S − f . Without loss of generality, suppose that G ( A, B, E AB ) is a ( K, ∆)-right-good subgraph. Note that G ( A, B, E AB ) satisfies the conditions (a) and (b) of Claim 4.4. We define σ = K/ d , so that σ √ ∆ = Θ( ε ( f )log d ). Before proceeding with the main analysis, we rule out some simplecases with the following claim. Claim 4.9.
Suppose any of the following hold: (a) I − f ≥ √ d . (b) r ≥ √ d log d . (c) σ ≤ r · log d √ d . Then, for ( x , y ) ∼ D pair (0 , , we have Pr x , y [ f ( x ) > f ( y )] ≥ (cid:101) Ω( ε ( f ) r √ d ) .Proof. As we remarked, for τ = 1, Algorithm 2 has the same guarantees as the edge tester. By definition,the edge tester rejects with probability at least I − f d . Therefore, (a) implies the conclusion, since if I − f ≥ √ d ,then the edge tester succeeds with probability Ω( √ d ). In addition, the edge tester rejects with probabilityΩ( ε ( f ) d ) for all real-valued functions. Thus, (b) implies the conclusion, since if r ≥ √ d log d , then ε ( f ) d ≥ ε ( f ) r √ d log d .To see that (c) implies the conclusion, suppose σ ≤ r · log d √ d . Recall that σ √ ∆ = Θ( ε ( f )log d ). Thus, σ · ∆ = ( σ √ ∆) σ = σ − · Θ (cid:18)(cid:16) ε ( f )log d (cid:17) (cid:19) = Ω (cid:32) ε ( f ) √ dr (log d ) (cid:33) .Next, recall that E AB ⊆ S − f and since G is ( K, ∆)-right-good, we have | E AB | = | B | · ∆ = K · ∆. Thus, I − f ≥ K ∆2 d = σ · ∆. Therefore, the edge tester rejects with probability I − f d ≥ σ ∆ d ≥ Ω (cid:16) ε ( f ) r √ d · (log d ) (cid:17) . (cid:4) In light of Claim 4.9, we henceforth assume that I − f ≤ √ d , r ≤ √ d log d , and σ ≥ r · log d √ d . Note that thisimplies r · log d √ d ≤ r · log d √ d ≤ σ ≤
1. Since the tester iterates through all values of τ that are powers of 2and at most (cid:113) d log d , we can fix the unique value τ (cid:48) satisfying τ (cid:48) ≤ σr − (cid:115) d log d ≤ τ (cid:48) . τ (cid:48) ≥ · √ log d . Moreover, since I − f ≤ √ d , we can apply Lemma 4.8 toconclude that the fraction of vertices in { , } d which are not ( τ (cid:48) − f is at most c · τ (cid:48) · ( r − √ d for some constant c >
0. Using our upper bound on τ (cid:48) , this value is at most c · σ √ log d ≤ σ for sufficiently large d . Since | B | = σ · d , we conclude that at least | B | vertices in B are ( τ (cid:48) − · τ (cid:48) (cid:28) d .∆ · τ (cid:48) ≤ ∆ · σr − (cid:115) d log d = 1 r − · σ √ ∆ (cid:115) d ∆log d ≤ r − · Θ (cid:18) ε ( f )log d (cid:19) d √ log d (cid:28) d, and therefore condition (c) of Claim 4.4 holds. We have shown that all conditions, (a), (b), and (c) ofClaim 4.4 hold. Therefore, for ( x , y ) ∼ D pair (0 , τ (cid:48) ), we havePr x , y [ f ( x ) > f ( y )] ≥ C (cid:48) · τ (cid:48) d · σ · ∆ for some constant C (cid:48) > . Using our lower bound on τ (cid:48) , it follows thatPr x , y [ f ( x ) > f ( y )] ≥ C (cid:48) · τ (cid:48) · σ · ∆ d ≥ · σr − (cid:115) d log d · C (cid:48) · σ · ∆ d = C (cid:48) · σ · ∆2( r − √ d log d . Since ( σ √ ∆) = Θ (cid:16)(cid:0) ε ( f )log d (cid:1) (cid:17) , then:Pr ( x , y ) ∼D pair (0 ,τ (cid:48) ) [ f ( x ) > f ( y )] ≥ C (cid:48) ε ( f ) r − √ d (log d ) / = (cid:101) O (cid:18) ε ( f ) r √ d (cid:19) . Therefore, (cid:101) O ( r √ dε ( f ) ) iterations of the tester with ( x , y ) ∼ D pair (0 , τ (cid:48) ) will suffice for the tester to detect aviolation to monotonicity and reject with high probability. This concludes the proof of Theorem 1.6. (cid:4) In this section, we prove Theorem 1.8 by showing that the algorithm of Pallavoor et al. [PRW20] can beemployed to approximate distance to monotonicity of real-valued functions.To prove Theorem 1.8, it is sufficient to give a tolerant tester for monotonicity of functions f : { , } d → R .A tolerant tester for monotonicity gets two parameters ε , ε ∈ (0 , ε < ε , and oracle access toa function f . It has to accept with probability at least 2/3 if f is ε -close to monotone and reject withprobability at least 2/3 if f is ε -far from monotone. Our tester distinguishes functions that are (cid:101) O ( ε/ √ d )-close to monotone from those that are ε -far. 
Suppose this tolerant tester has query complexity $q(\varepsilon, d)$. Then, by [PRW20, Theorem A.1], it can be converted to a distance approximation algorithm with the required approximation guarantee and query complexity $O(q(\alpha, d) \log \log(1/\alpha))$. The following lemma, proved by Pallavoor et al. for the special case of Boolean functions, states our result on tolerant testing of monotonicity. Together with the conversion procedure from tolerant testing to distance approximation discussed above, it implies Theorem 1.8.
Lemma 5.1. There exists a fixed universal constant $c \in (0, 1)$ and a nonadaptive algorithm, ApproxMono, that gets a parameter $\varepsilon \in (0, 1/2)$ and oracle access to a function $f : \{0, 1\}^d \to \mathbb{R}$, makes $\mathrm{poly}(d, 1/\varepsilon)$ queries and returns close or far as follows:
1. If $\varepsilon(f) \leq c \cdot \frac{\varepsilon}{\sqrt{d} \log d}$ it outputs close with probability at least 2/3.
2. If $\varepsilon(f) \geq \varepsilon$ it outputs far with probability at least 2/3.

Proof. We show that Algorithm ApproxMono of Pallavoor et al. [PRW20], presented as Algorithm 3, works for real-valued functions. At a high level, the algorithm uses the fact that a function that is far from monotone violates many edges or has a large matching of violated edges of a special type. The first subroutine estimates the number of edges violated by the function by sampling edges uniformly at random and checking if they violate monotonicity. The second subroutine estimates the size of the special type of matching of violated edges. If either of these estimates is large enough, the algorithm outputs far. Otherwise, it outputs close.

The class of matchings sought by the algorithm is parametrized by a subset of the coordinates $S \subseteq [d]$. The special property of these matchings is that one can verify locally whether a given point is matched by querying its neighbors and their neighbors.

To estimate the size of the matching parametrized by $S$, the algorithm estimates the probability of the following event $\mathrm{Capture}(x, S, f)$. We denote by $x^{(i)}$ the point in $\{0, 1\}^d$ whose $i$-th coordinate is equal to $1 - x_i$ and whose remaining coordinates are the same as in $x$.

Definition 5.2 (Capture Event). For a function $f : \{0, 1\}^d \to \mathbb{R}$, a set $S \subseteq [d]$, and a point $x \in \{0, 1\}^d$, let $\mathrm{Capture}(x, S, f)$ be the following event:
1. There exists an index $i \in S$ such that, for $y = x^{(i)}$, the edge between $x$ and $y$ is violated by $f$. (Note that the edge between $x$ and $y$ is $(x, y)$ if $x_i = 0$; otherwise, it is $(y, x)$.)
2. Neither the edges of the form $(y, y^{(j)})$ nor the edges of the form $(y^{(j)}, y)$, where $j \in S \setminus \{i\}$, are violated by $f$.
Denote $\Pr_{x \sim \{0,1\}^d}[\mathrm{Capture}(x, S, f)]$ by $\mu_f(S)$.

Observe that $\mu_f(S)$ can be estimated nonadaptively, by sampling vertices $x$ uniformly and independently at random and querying $f$ on $x$ and all points that differ from $x$ in at most two coordinates.

Figure 4: An illustration to Definition 5.2.
Two cases are depicted: when $x \prec y$ and when $y \prec x$.

The first component of the analysis is the observation that both the fraction of violated edges and $\mu_f(S)$, for every $S \subseteq [d]$, provide a good lower bound on the distance to monotonicity. We state this observation without proof because the proof for the Boolean case from [PRW20] extends to the general case verbatim. Intuitively, it tells us that, assuming that the two estimates computed by Algorithm 3 are accurate, if one of the estimates is large enough then the input function is far from monotone.

Observation 5.3 ([PRW20]). For every function $f : \{0,1\}^d \to \mathbb{R}$, the distance $\varepsilon(f)$ is at least half the fraction of the hypercube edges that are violated by $f$, and $\varepsilon(f) \ge \mu_f(S)/2$ for all $S \subseteq [d]$.

The second (and the main) component of the analysis for the Boolean case is [PRW20, Lemma 2.8], which relies on the robust isoperimetric inequality of [KMS18]. We generalize this lemma to real-valued functions in Lemma 5.4 below. Intuitively, it states that, if function $f$ violates few edges then, for one of the $O(\log d)$ choices of the parameter tried in Step 3 of Algorithm 3 for sampling set $S$, the expectation of $\mu_f(S)$ is large in terms of $\varepsilon(f)$. That is, again assuming that the estimates computed by Algorithm 3 are accurate, if none of the estimates is large enough then the input function is close to monotone.

Algorithm 3: Algorithm ApproxMono
Input: Parameters $\varepsilon \in (0, 1/2)$ and dimension $d$; oracle access to function $f : \{0,1\}^d \to \mathbb{R}$.
1. Calculate $\hat{\nu}$, an estimate of the fraction of the hypercube edges that are violated by $f$, up to an additive error $\varepsilon/\sqrt{d \log d}$.
2. if $\hat{\nu} \ge \varepsilon/(4\sqrt{d \log d})$ then return far.
3. for $t \in \{1, 2, 4, \dots, 2^{\lfloor \log d \rfloor}\}$ do
4.     Sample $S \subseteq [d]$ by including each coordinate $i \in [d]$ independently with probability $1/t$.
5.     Calculate $\hat{\mu}$, an estimate of $\mu_f(S) = \Pr_{x \sim \{0,1\}^d}[\mathrm{Capture}(x, S, f)]$, up to an additive error $c' \cdot \varepsilon/\sqrt{d \log d}$ for some constant $c' > 0$.
6.     if $\hat{\mu} \ge c' \cdot \varepsilon/\sqrt{d \log d}$ then return far.
7. Return close.

Equipped with Observation 5.3 and Lemma 5.4, it is easy to convert the intuition above into the formal proof that the algorithm satisfies the guarantees of Lemma 5.1. This part of the proof uses standard techniques and is the same as for the case of Boolean functions described in [PRW20], so we omit it. This completes the proof of Lemma 5.1. ∎
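The local check behind the Capture event of Definition 5.2, and the sampling estimate of $\mu_f(S)$, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation analyzed in [PRW20]: the helper names `flip`, `edge_violated`, `capture`, and `estimate_mu` are hypothetical, points are tuples of bits, coordinates are 0-indexed, and the number of samples is left to the caller rather than derived from the additive error required by Algorithm 3.

```python
import random
from typing import Callable, Iterable, Tuple

Point = Tuple[int, ...]
Func = Callable[[Point], float]


def flip(x: Point, i: int) -> Point:
    """Return x^(i): the point x with coordinate i flipped (0-indexed)."""
    return x[:i] + (1 - x[i],) + x[i + 1:]


def edge_violated(f: Func, u: Point, i: int) -> bool:
    """True if the hypercube edge between u and u^(i) is violated by f,
    i.e. the lower endpoint has a strictly larger value than the upper one."""
    v = flip(u, i)
    lower, upper = (u, v) if u[i] == 0 else (v, u)
    return f(lower) > f(upper)


def capture(f: Func, x: Point, S: Iterable[int]) -> bool:
    """Check the event Capture(x, S, f) of Definition 5.2."""
    S = list(S)
    for i in S:
        # Condition 1: the edge between x and y = x^(i) is violated.
        if not edge_violated(f, x, i):
            continue
        y = flip(x, i)
        # Condition 2: no edge at y along a direction j in S \ {i} is violated.
        if not any(edge_violated(f, y, j) for j in S if j != i):
            return True
    return False


def estimate_mu(f: Func, d: int, S: Iterable[int], samples: int,
                rng: random.Random) -> float:
    """Empirical estimate of mu_f(S) from uniform, independent samples of x."""
    S = list(S)
    hits = 0
    for _ in range(samples):
        x = tuple(rng.randrange(2) for _ in range(d))
        hits += capture(f, x, S)
    return hits / samples
```

Each call to `capture` queries `f` only on `x` and on points that differ from `x` in at most two coordinates, which matches the nonadaptivity remark after Definition 5.2. For instance, for the anti-dictator `f = lambda x: 1 - x[0]` every point is captured whenever coordinate 0 belongs to `S`, whereas for `f = lambda x: -sum(x)` and `S = [0, 1]` no point of $\{0,1\}^2$ is captured, because condition 2 fails.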
It remains to prove the following lemma, which crucially relies on our robust isoperimetric inequality for real-valued functions. We generalize the quantities used by Pallavoor et al. so that the proof is syntactically similar to that for the case of Boolean functions. One subtlety that arises in the case of real-valued functions is that a vertex can be incident to violated edges of both colors. In contrast, in the case of Boolean functions, each vertex can be adjacent either to violated edges going to higher-weight vertices or to violated edges going to lower-weight vertices; that is, it cannot be incident on both blue and red violated edges.
Lemma 5.4 (Generalized Lemma 2.8 of [PRW20]). Let $f : \{0,1\}^d \to \mathbb{R}$ be $\varepsilon$-far from monotone, with fraction of violated edges smaller than $\varepsilon/\sqrt{d \log d}$. Then, for some $t \in \{1, 2, 4, \dots, 2^{\lfloor \log d \rfloor}\}$, it holds

$$\mathbb{E}_{S \subseteq [d],\; i \in S \text{ w.p. } 1/t}\bigl[\mu_f(S)\bigr] = \Omega\!\left(\frac{\varepsilon}{\sqrt{d \log d}}\right).$$

Proof.
For $x \in \{0,1\}^d$, let $U^-_f(x)$ denote the number of violated edges incident on $x$ (both incoming and outgoing). Consider the following 2-coloring of the edges in $S^-_f$:

$$\mathrm{col}((x, y)) = \begin{cases} \text{red} & \text{if } U^-_f(x) \ge U^-_f(y); \\ \text{blue} & \text{if } U^-_f(x) < U^-_f(y). \end{cases}$$

This coloring ensures that, in the isoperimetric inequality, each edge is counted towards the endpoint incident on the largest number of violated edges (and, in case of a tie, towards the lower endpoint).

The proof of [PRW20, Lemma 2.8] relies on the existence of a set $B \subseteq \{0,1\}^d$ and a color $b \in \{\text{red}, \text{blue}\}$ that satisfy the following two properties:
1. no edge violated by $f$ has both endpoints in the set $B$;
2. $\frac{1}{2^d} \sum_{x \in B} \sqrt{I^-_{f,b}(x)} = \Omega(\varepsilon)$.

To obtain the set $B$ and the color $b$, we partition $\{0,1\}^d$ into two sets:

$$B_{\mathrm{even}} = \{x \in \{0,1\}^d : |x| \text{ is even}\}, \qquad B_{\mathrm{odd}} = \{x \in \{0,1\}^d : |x| \text{ is odd}\}.$$

$B_{\mathrm{even}}$ and $B_{\mathrm{odd}}$ clearly satisfy property 1. Note that, for the case of Boolean functions, Pallavoor et al. partition the domain points according to their function values instead of the parity of their weight to guarantee property 1.

By Theorem 1.2,

$$\sum_{x \in B_{\mathrm{even}}} \left( \sqrt{I^-_{f,\mathrm{red}}(x)} + \sqrt{I^-_{f,\mathrm{blue}}(x)} \right) + \sum_{x \in B_{\mathrm{odd}}} \left( \sqrt{I^-_{f,\mathrm{red}}(x)} + \sqrt{I^-_{f,\mathrm{blue}}(x)} \right) \ge C \cdot \varepsilon \cdot 2^d.$$

By averaging, there exist a color $b \in \{\text{red}, \text{blue}\}$ and a set $B \in \{B_{\mathrm{even}}, B_{\mathrm{odd}}\}$ that satisfy

$$\sum_{x \in B} \sqrt{I^-_{f,b}(x)} \ge \frac{C}{4} \cdot \varepsilon \cdot 2^d. \tag{11}$$

Therefore, property 2 also holds. Note that due to the partition into even-weight and odd-weight points we lose an extra factor of 2 as compared to Pallavoor et al. in the contribution of the set $B$ and the color $b$ to the isoperimetric inequality. This results in a loss by a factor of 2 (hidden in the $\Omega$-notation) in the lower bound in Lemma 5.4.

The rest of the proof is the same as in [PRW20], so we only summarize the key steps. We proceed by partitioning the points $x \in B$ into buckets $B_{t,s}$ for $t, s \in \{1, 2, 4, \dots, 2^{\lfloor \log d \rfloor}\}$, where $t \ge s$, as follows:

$$B_{t,s} = \{x \in B : t \le U^-_f(x) < 2t \text{ and } s \le I^-_{f,b}(x) < 2s\}.$$

Each vertex $x \in B_{t,s}$ is incident on between $t$ and $2t$ violated edges and between $s$ and $2s$ edges colored $b$, which are counted towards $x$ in property 2.

When the set $S$ is chosen so that each coordinate is included with probability $1/t$, it holds for all $x \in B_{t,s}$ that the event $\mathrm{Capture}(x, S, f)$ occurs with probability $\Omega(s/t)$. Using this claim, one can lower bound the contribution of each bucket towards $\mathbb{E}_{S \subseteq [d]}[\mu_f(S)]$. By combining the contributions of the buckets with the same value $s$ and applying the Cauchy-Schwarz inequality, one obtains

$$\sum_{t \in \{1, 2, 4, \dots, 2^{\lfloor \log d \rfloor}\}} \mathbb{E}_{S \subseteq [d],\; i \in S \text{ w.p. } 1/t}\bigl[\mu_f(S)\bigr] = \Omega\!\left( \frac{1}{2^d} \cdot \frac{\bigl(\sum_{t,s :\, t \ge s} |B_{t,s}| \sqrt{s}\bigr)^2}{\sum_{t,s :\, t \ge s} |B_{t,s}|\, t} \right). \tag{12}$$

We lower bound the sum in the numerator using (11) and upper bound the sum in the denominator using the assumed upper bound on the number of violated edges. As a result, we get that the left-hand side of (12) is $\Omega(\varepsilon \sqrt{\log d}/\sqrt{d})$. Averaging over the $O(\log d)$ possible values of $t$ yields Lemma 5.4. ∎

In this section, we prove Theorem 1.7, which gives a lower bound on the query complexity of testing monotonicity of real-valued functions with 1-sided error nonadaptive testers. Fischer et al. proved Theorem 1.7 for the special case of $r = 2$ [FLN+02, Theorem 19]. Our proof of Theorem 1.7 is a natural extension of their construction to the more general case of $r \in [2, \sqrt{d}]$.

Proof.
Fix $r \in [2, \sqrt{d}]$. We show that every nonadaptive, 1-sided error tester for functions over $\{0,1\}^d$ with image size $r$ must make $\Omega(r\sqrt{d})$ queries. This implies Theorem 1.7, since Blais et al. [BBM12, Theorem 1.6] proved an $\Omega(\min(d, r^2))$ lower bound for all testers.

For convenience, assume $d$ is an odd perfect square and $r$ divides $2\sqrt{d} + 1$. We partition the points $z \in \{0,1\}^{d-1}$ into levels, according to their weight $|z|$. We group levels from the middle of the $(d-1)$-dimensional hypercube into $r$ blocks of width $w$, where $w = \frac{2\sqrt{d}+1}{r}$. Specifically, for each $j \in [r]$, we define the set

$$Z_j = \left\{ z \in \{0,1\}^{d-1} : (j-1)w \le |z| - \left( \frac{d-1}{2} - \sqrt{d} \right) \le jw \right\}.$$

Observe that

$$\bigcup_{j=1}^{r} Z_j = \left\{ z \in \{0,1\}^{d-1} : -\sqrt{d} \le \Bigl|\, |z| - \frac{d-1}{2} \Bigr| \le \sqrt{d} \right\}$$

and $Z_j$ is a block of $w$ consecutive levels from the middle of the $(d-1)$-dimensional hypercube.

For each $i \in [d]$, we define function $f_i : \{0,1\}^d \to [r]$ as follows. For $x \in \{0,1\}^d$ and $i \in [d]$, let $x_{-i}$ be the point in $\{0,1\}^{d-1}$ obtained by removing the $i$-th coordinate from $x$. Given $x \in \{0,1\}^d$, we define

$$f_i(x) = \begin{cases} r & \text{if } |x_{-i}| > \frac{d-1}{2} + \sqrt{d}, \\ 1 & \text{if } |x_{-i}| < \frac{d-1}{2} - \sqrt{d}, \\ j + (1 - x_i) & \text{if } x_{-i} \in Z_j. \end{cases}$$

Claim 6.1.
For all $i \in [d]$, $\varepsilon(f_i) = \Omega(1)$.

Proof. Consider the matching of edges $M = \bigl\{ (x, y) : x_i = 0,\ y_i = 1, \text{ and } x_{-i} = y_{-i} \in \bigcup_{j=1}^{r} Z_j \bigr\}$. Observe that all pairs in $M$ are edges violated by $f_i$ and $|M| = \Omega(1) \cdot 2^d$. ∎

Every 1-sided error tester must accept if the function values on the points it queried are consistent with a monotone function. We say that a set $Q \subseteq \{0,1\}^d$ of queries contains a violation for a function $f$ if there exist $x, y \in Q$ such that $x \prec y$ and $f(x) > f(y)$. If $Q$ does not contain a violation, then the function values on $Q$ are consistent with a monotone function.

Claim 6.2.
For all sets $Q \subseteq \{0,1\}^d$ of queries, $\bigl| \{ i \in [d] : Q \text{ contains a violation for } f_i \} \bigr| < w \cdot |Q|$.

Proof. We use the following claim due to [BCP+20].

Claim 6.3 (Lemma 3.18 of [BCP+20]). Let $c, d \in \mathbb{N}$ and $Q \subseteq \{0,1\}^d$. Given $x, y \in Q$, define $\mathrm{cap}_c(x, y)$ as follows. If $x$ and $y$ differ on at least $c$ coordinates, then let $\mathrm{cap}_c(x, y)$ be the set of the first $c$ coordinates on which $x$ and $y$ differ. Otherwise, let $\mathrm{cap}_c(x, y)$ be the set of all coordinates on which $x$ and $y$ differ. Define $\mathrm{cap}_c(Q) = \bigcup_{x, y \in Q} \mathrm{cap}_c(x, y)$. Then $|\mathrm{cap}_c(Q)| \le c(|Q| - 1)$.

By design of $f_i$, if $Q$ contains a violation for $f_i$, then there exist $x, y \in Q$ that differ in at most $w$ coordinates, one of which is $i$. Then $i \in \mathrm{cap}_w(x, y)$ and thus $i \in \mathrm{cap}_w(Q)$. Therefore, by Claim 6.3,

$$\bigl| \{ i \in [d] : Q \text{ contains a violation for } f_i \} \bigr| \le |\mathrm{cap}_w(Q)| \le w(|Q| - 1) < w \cdot |Q|.$$

This completes the proof of Claim 6.2. ∎

Now, consider a nonadaptive tester $T$ with 1-sided error that makes $q = q(\varepsilon, d, r)$ queries. Let $Q \subseteq \{0,1\}^d$ denote the random set of queries of size $q$ made by $T$. Using linearity of expectation and Claim 6.2,

$$\sum_{i=1}^{d} \Pr[T \text{ finds a violation for } f_i] = \mathbb{E}_Q \Bigl[ \bigl| \{ i \in [d] : Q \text{ contains a violation for } f_i \} \bigr| \Bigr] < w \cdot q,$$

and therefore there exists $i \in [d]$ such that

$$\Pr[T \text{ finds a violation for } f_i] < \frac{w \cdot q}{d} = \frac{(2\sqrt{d} + 1) \cdot q}{rd} < \frac{3q}{r\sqrt{d}},$$

whereas, if $T$ is a valid monotonicity tester, then we must have $\Pr[T \text{ finds a violation for } f_i] \ge 2/3$. Therefore, for $T$ to be a valid monotonicity tester, we require that it makes $q \ge \frac{2}{9} r\sqrt{d} = \Omega(r\sqrt{d})$ queries. ∎

Undirected Talagrand Inequality for Real-Valued Functions
In this section we prove Theorem 1.5 through a simple reduction to Talagrand’s (undirected) isoperimetric inequality for Boolean functions, which we state as Theorem 1.4.

Proof. Given $t \in \mathbb{R}$, let $p_t = \frac{1}{2^d} |\{x : f(x) = t\}|$ denote the fraction of points $x$ in $\{0,1\}^d$ with $f(x) = t$. Note that $\sum_{t \in \mathbb{R}} p_t = 1$ and that $p_t > 0$ for at most $2^d$ values of $t$. Choose $m \in \mathbb{R}$ to be the smallest real number such that $\sum_{t \le m} p_t \ge 1/2$. Then we also have $\sum_{t}$
We thank Ramesh Krishnan Pallavoor Suresh for useful discussions.
References

[AC06] Nir Ailon and Bernard Chazelle. Information theory in property testing and monotonicity testing in higher dimension. Information and Computation, 204(11):1704–1717, 2006.
[ACCL07] Nir Ailon, Bernard Chazelle, Seshadhri Comandur, and Ding Liu. Estimating the distance to a monotone function. Random Structures Algorithms, 31(3):371–383, 2007.
[BB16] Aleksandrs Belovs and Eric Blais. A polynomial lower bound for testing monotonicity. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 1021–1032, 2016.
[BBM12] Eric Blais, Joshua Brody, and Kevin Matulef. Property testing lower bounds via communication complexity. Computational Complexity, 21(2):311–358, 2012.
[BCP+20] Roksana Baleshzar, Deeparnab Chakrabarty, Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and C. Seshadhri. Optimal unateness testers for real-valued functions: Adaptivity helps. Theory of Computing, 16(3):1–36, 2020.
[BCS18] Hadley Black, Deeparnab Chakrabarty, and C. Seshadhri. A $o(d) \cdot \mathrm{polylog}\, n$ monotonicity tester for Boolean functions over the hypergrid $[n]^d$. In Artur Czumaj, editor, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2133–2151. SIAM, 2018.
[BCS20] Hadley Black, Deeparnab Chakrabarty, and C. Seshadhri. Domain reduction for monotonicity testing: A $o(d)$ tester for Boolean functions in $d$-dimensions. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1975–1994. SIAM, 2020.
[BCSM12] Jop Briët, Sourav Chakraborty, David García Soriano, and Ari Matsliah. Monotonicity testing and shortest-path routing on the cube. Combinatorica, 32(1):35–53, 2012.
[Bel18] Aleksandrs Belovs. Adaptive lower bound for testing monotonicity on the line. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 31:1–31:10, 2018.
[BGJ+12] Arnab Bhattacharyya, Elena Grigorescu, Kyomin Jung, Sofya Raskhodnikova, and David P. Woodruff. Transitive-closure spanners. SIAM J. Comput., 41(6):1380–1425, 2012.
[BRW05] T. Batu, R. Rubinfeld, and P. White. Fast approximate PCPs for multidimensional bin-packing problems. Information and Computation, 196(1):42–56, 2005.
[BRY14] Eric Blais, Sofya Raskhodnikova, and Grigory Yaroslavtsev. Lower bounds for testing properties of functions over hypergrid domains. In IEEE 29th Conference on Computational Complexity (CCC), pages 309–320, 2014.
[CDJS17] Deeparnab Chakrabarty, Kashyap Dixit, Madhav Jha, and C. Seshadhri. Property testing on product distributions: Optimal testers for bounded derivative properties. ACM Trans. on Algorithms (TALG), 13(2):20:1–20:30, 2017.
[CDST15] Xi Chen, Anindya De, Rocco A. Servedio, and Li-Yang Tan. Boolean function monotonicity testing requires (almost) $n^{1/2}$ non-adaptive queries. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 519–528, 2015.
[CS13] Deeparnab Chakrabarty and C. Seshadhri. Optimal bounds for monotonicity and Lipschitz testing over hypercubes and hypergrids. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 419–428, 2013.
[CS14] Deeparnab Chakrabarty and C. Seshadhri. An optimal lower bound for monotonicity testing over hypergrids. Theory of Computing, 10:453–464, 2014.
[CS16] Deeparnab Chakrabarty and C. Seshadhri. An $o(n)$ monotonicity tester for Boolean functions over the hypercube. SIAM Journal on Computing (SICOMP), 45(2):461–472, 2016.
[CS19] Deeparnab Chakrabarty and C. Seshadhri. Adaptive Boolean monotonicity testing in total influence time. In Proceedings, Innovations in Theoretical Computer Science (ITCS), pages 20:1–20:7, 2019.
[CST14] Xi Chen, Rocco A. Servedio, and Li-Yang Tan. New algorithms and lower bounds for monotonicity testing. In Proceedings, IEEE Symposium on Foundations of Computer Science (FOCS), pages 286–295, 2014.
[CWX17] Xi Chen, Erik Waingarten, and Jinyu Xie. Beyond Talagrand functions: new lower bounds for testing monotonicity and unateness. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 523–536, 2017.
[DGL+99] Yevgeniy Dodis, Oded Goldreich, Eric Lehman, Sofya Raskhodnikova, Dana Ron, and Alex Samorodnitsky. Improved testing algorithms for monotonicity. In Proceedings of Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM, pages 97–108, 1999.
[DRTV18] Kashyap Dixit, Sofya Raskhodnikova, Abhradeep Thakurta, and Nithin Varma. Erasure-resilient property testing. SIAM J. Comput., 47(2):295–329, 2018.
[EKK+00] Funda Ergun, Sampath Kannan, Ravi Kumar, Ronitt Rubinfeld, and Mahesh Viswanathan. Spot-checkers. J. Comput. System Sci., 60(3):717–751, 2000.
[Fis04] Eldar Fischer. On the strength of comparisons in property testing. Information and Computation, 189(1):107–116, 2004.
[FLN+02] Eldar Fischer, Eric Lehman, Ilan Newman, Sofya Raskhodnikova, Ronitt Rubinfeld, and Alex Samorodnitsky. Monotonicity testing over general poset domains. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 474–483, 2002.
[FR10] Shahar Fattal and Dana Ron. Approximating the distance to monotonicity in high dimensions. ACM Trans. on Algorithms (TALG), 6(3):52:1–52:37, 2010.
[GGL+00] Oded Goldreich, Shafi Goldwasser, Eric Lehman, Dana Ron, and Alex Samorodnitsky. Testing monotonicity. Combinatorica, 20:301–337, 2000.
[HK08] Shirley Halevy and Eyal Kushilevitz. Testing monotonicity over graph products. Random Structures Algorithms, 33(1):44–67, 2008.
[KMS18] Subhash Khot, Dor Minzer, and Muli Safra. On monotonicity testing and Boolean isoperimetric-type theorems. SIAM J. Comput., 47(6):2238–2276, 2018.
[LR01] Eric Lehman and Dana Ron. On disjoint chains of subsets. Journal of Combinatorial Theory, Series A, 94(2):399–404, 2001.
[Mar74] Grigory A. Margulis. Probabilistic characteristics of graphs with large connectivity. Problemy Peredachi Informatsii, 10(2):101–108, 1974.
[Pal20] Ramesh Krishnan Pallavoor Suresh. Improved Algorithms and New Models in Property Testing. PhD thesis, Boston University, 2020.
[PRR06] Michal Parnas, Dana Ron, and Ronitt Rubinfeld. Tolerant property testing and distance approximation. Journal of Computer and System Sciences, 72(6):1012–1042, 2006.
[PRV18] Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and Nithin Varma. Parameterized property testing of functions. ACM Trans. Comput. Theory, 9(4):17:1–17:19, 2018.
[PRW20] Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and Erik Waingarten. Approximating the distance to monotonicity of Boolean functions. Random Structures and Algorithms (to appear), 2020. Preliminary version appeared in SODA 2020.
[Ras99] Sofya Raskhodnikova. Monotonicity testing. Masters Thesis, MIT, 1999.
[Tal93] Michel Talagrand. Isoperimetry, logarithmic Sobolev inequalities on the discrete cube, and Margulis’ graph connectivity theorem.