Asymptotics of Lower Dimensional Zero-Density Regions
By Hengrui Luo, Steve N. MacEachern and Mario Peruggia
Department of Statistics, The Ohio State University

Abstract
Topological data analysis (TDA) allows us to explore the topological features of a dataset. Among topological features, lower dimensional ones have recently drawn the attention of practitioners in mathematics and statistics due to their potential to aid the discovery of low dimensional structure in a data set. However, lower dimensional features are usually challenging to detect from a probabilistic perspective.

In this paper, lower dimensional topological features occurring as zero-density regions of density functions are introduced and thoroughly investigated. Specifically, we consider sequences of coverings for the support of a density function in which the coverings are comprised of balls with shrinking radii. We show that, when these coverings satisfy certain sufficient conditions as the sample size goes to infinity, we can detect lower dimensional, zero-density regions with increasingly higher probability while guarding against false detection. We supplement the theoretical developments with the discussion of simulated experiments that elucidate the behavior of the methodology for different choices of the tuning parameters that govern the construction of the covering sequences and characterize the asymptotic results.
1. Introduction.
Topological data analysis (TDA) has been developed in the past several decades to explore and understand the topological structure of the space where the data arise (for example, the support of a probability distribution) using a finite amount of observed data. We refer readers who are interested in a detailed treatment of these data analysis techniques to the introductory texts by Carlsson [5] and Edelsbrunner and Harer [9].

TDA operates on a data set, usually a set of $n$ points in $\mathbb{R}^d$ or a more general metric space. The data and the metric determine simplices leading to simplicial complexes, and the filtration of the complexes constructed at different scales is analyzed to reveal topological features. TDA makes use of the arrangement of the points in $\mathbb{R}^d$ and extracts all features for every dimension simultaneously, but does not take the data generating mechanism into consideration. Our goal is to investigate how knowledge of the distributional properties of the data can help detection of lower dimensional features that are typically difficult to identify based on the data arrangement alone.

Having the data in hand, it is quite natural to consider them as a random sample from some probability distribution. Interest then focuses on the limiting behavior of geometric complexes based on a point set of growing size. In this regard, Kahle [12] and Kahle and Meckes [13] have developed results about the limiting behavior of Betti numbers of complexes, based on the probabilistic results for random geometric graphs given by Penrose [19]. Subsequent work in this direction characterizes other types of topological features [8, 2, 14]. These results provide a picture of how the probability mechanism informs us about the underlying topology. Bobrowski and Kahle [3] provide a comprehensive review of results along this line. This body of literature suggests that asymptotic regimes provide a convenient framework for analyzing the topology of data.

∗ This material is based upon work supported by the National Science Foundation under Grants No. DMS-1613110 and No. SES-1921523.
MSC 2010 subject classifications: Primary 62G20; secondary 62H12
Keywords and phrases: Topological Data Analysis, covering, zero-density regions

In this paper, we first discuss situations where independent and identically distributed (i.i.d.) data points are drawn from a distribution having continuous density function $f$ (with respect to Lebesgue measure on $\mathbb{R}^d$) on its support $\operatorname{supp}(f) = M \subset \mathbb{R}^d$. Our results are first formally stated for $\operatorname{supp}(f) = M = [0,1]^d$, $d \in \mathbb{N}$, and then extended to more general situations. We work with a well-behaved version of the density $f$ for which the notion of a zero-density region $S \subset M$ (to be formally defined later) is meaningful. If $S$ is of dimension lower than $d$, $\operatorname{supp}(f) = \operatorname{supp}(f) \cup S$ by definition, and the support has no topological features of interest. Such a zero-density region $S$ is difficult to identify with traditional constructions of simplices or density estimators.

Example.
Let $S = \{0.5\} \times [0.25, 0.75]$ and define $d(\mathbf{x}, S) = \inf_{\mathbf{y} \in S} d(\mathbf{x}, \mathbf{y})$, where $d$ denotes the $L_2$ metric. Consider the density $f(\mathbf{x}) \propto d(\mathbf{x}, S) \cdot \mathbf{1}_{[0,1]^2}(\mathbf{x})$ shown in Figure 1.1, for which $S$ is a zero-density region of lower dimension. $S$ does not contain any probability mass and, being a segment, is a "lower dimensional object" (a concept to be made more precise later).

The volume of $S$ as a subset of the support of the density, $[0,1]^2$, is zero and the density assigns positive probability to any neighborhood of any point in $S$. Hence, given a single sample of points drawn from this density, no matter how large, it will be impossible to identify the topological structure of $S$ with accuracy. The intuition is that identification of $S$ can only be attained by evaluating the rate at which points accumulate in its vicinity as the sample size grows. The situation would be different if the density were zero on a region of positive volume, such as a disk, and nonzero elsewhere. The disk could then be identified with a large enough sample as a "hole" not containing any points.

Our upcoming results allow us to detect the lower dimensional object $S$ with probability one as the sample size $n$ goes to infinity, and help us to avoid detection of false holes.

Our approach to detecting zero-density regions is to conduct an analysis as we vary the radius of a collection of covering balls of $\operatorname{supp}(f)$. By choosing the shared radius of the covering balls appropriately relative to the increasing sample size $n$, we wish that $S$ be covered only by balls having no observations inside. For each point in the non-zero density region we wish for the point to eventually be covered by a ball with at least one observation inside. If our wishes come true, then we can simply collect the empty covering balls and recover an approximation to the region $S$.
In Theorem 10 we present a set of sufficient conditions for our detection method to work asymptotically.

This notion of varying the radius of the covering balls can be related to the construction of complexes in TDA. For the Čech complex, balls are centered at the observed points. Balls of a fixed radius $r$ lead to a Čech complex $C(X, r)$. Varying the radius $r$, the Čech filtration, a collection of Čech complexes, is produced. For a small radius $r$, the balls will not overlap and no holes will be discovered. For a large radius $r$, lower dimensional zero-density regions will be covered and these holes will not be found.

Figure 1.1. A lower dimensional region $S = \{0.5\} \times [0.25, 0.75]$ (red segment) for the density function $f(\mathbf{x}) \propto d(\mathbf{x}, S) \cdot \mathbf{1}_{[0,1]^2}(\mathbf{x})$, where $d(\mathbf{x}, S) = \inf_{\mathbf{y} \in S} d(\mathbf{x}, \mathbf{y})$ and $d$ denotes Euclidean distance. The density $f$ evaluates to 0 over $S$, but nowhere else over $[0,1]^2$.

The rest of the paper is organized as follows. We first illustrate our observations above with a simple example in Section 2. Then, in Section 3, we discuss our approach for the construction of coverings of a compact support along with a set of sufficient conditions that ensure asymptotic consistency of the procedure for the detection of zero-density regions. Generalizations of the results to the case of a non-compact support are provided in Section 4. We present some experimental results and connections to other areas in Section 5 and end with a brief discussion of our findings in Section 6.
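The detection idea described in this introduction (cover the support with balls of a common radius and collect the empty ones) can be sketched in a few lines. This is a minimal illustration, not the procedure analyzed in the paper: the exponent `power`, the sample size, the radius, the grid of centers with spacing r/2, and all function names are assumptions made only for this sketch, and 2/r is assumed to be an integer.

```python
import math
import random


def dist_to_s(x, y):
    """Euclidean distance from (x, y) to the segment S = {0.5} x [0.25, 0.75]."""
    dy = max(0.25 - y, 0.0, y - 0.75)
    return math.hypot(x - 0.5, dy)


def sample_density(n, rng, power=2.0):
    """Rejection-sample n points from a density proportional to
    d(x, S)**power on the unit square (power is an assumed value)."""
    bound = dist_to_s(0.0, 0.0) ** power  # d(., S) is largest at the corners
    points = []
    while len(points) < n:
        x, y = rng.random(), rng.random()
        if rng.random() * bound <= dist_to_s(x, y) ** power:
            points.append((x, y))
    return points


def empty_ball_centers(points, r):
    """Centers, on a grid of spacing r/2, of open radius-r balls
    that contain no sample point."""
    steps = int(round(2.0 / r))  # grid coordinates 0, r/2, ..., 1
    centers = [(i * r / 2.0, j * r / 2.0)
               for i in range(steps + 1) for j in range(steps + 1)]
    return [c for c in centers
            if all(math.hypot(px - c[0], py - c[1]) >= r for px, py in points)]


rng = random.Random(0)
pts = sample_density(2000, rng)
empty = empty_ball_centers(pts, r=0.05)
print(len(empty), "empty balls")
if empty:
    print("largest distance from an empty center to S:",
          max(dist_to_s(cx, cy) for cx, cy in empty))
```

Rerunning the sketch with different pairs of sample size and radius gives a feel for why the radius has to shrink at a rate tied to the sample size: too large a radius fills the balls that touch S, too small a radius leaves spurious empty balls far from S.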
2. An Illustrative Example.
The central problem we investigate in this paper is the detection of a lower dimensional zero-density region. Before formally addressing the problem, we present an illustrative example to show how dimensionality plays an important role.

We consider three different densities on the interval $M = [-1, 1]$, $\dim M = 1$, and the problem of detecting interesting topological features by partitioning $M$ into the union of disjoint, equally sized bins. The density $g(x) = \left(\mathbf{1}_{[-1,-1/2]}(x) + \mathbf{1}_{[1/2,1]}(x)\right)$ has a genuine hole, $(-1/2, 1/2)$. As the sample size $n$ goes to infinity, one need only consider bins whose widths decrease at a rate no faster than $n^{-1+\varepsilon}$, $\varepsilon > 0$, to ensure that the bins away from the hole are eventually filled. When $S = \emptyset$, as is the case for the density $h(x) = \mathbf{1}_{[-1,1]}(x) \cdot \frac{3}{8}\left(x^2 + 1\right)$, and the width of the bins decreases at a rate no faster than $n^{-1+\varepsilon}$, we cannot detect any hole as all bins will eventually be filled.

The interesting case has $S \neq \emptyset$ and $\dim S < \dim M = 1$, as is the case for $f(x) = \frac{3}{2} x^2 \mathbf{1}_{[-1,1]}(x)$. Here, $S = \{0\}$, and the question is whether we can detect this topological feature of dimension zero with positive probability. The hole is difficult to detect because samples will accumulate around $S = \{0\}$. If the bins are too large, they will be filled as $n$ goes to infinity and the binary heatmaps of $f$, constructed by drawing a vertical segment at the center of every bin in which there is at least one observation, will look more and more like the ones for $h$.

In Figure 2.1, the binary heatmaps show the denseness of the non-empty bins. The first column represents histograms and heatmaps for the density $f$ with a zero-dimensional hole $\{0\}$. The second column represents histograms and heatmaps for the density $g$ with a one-dimensional hole $(-1/2, 1/2)$. The third column represents histograms and heatmaps for the density $h$ with no holes.

Figure 2.1. Binned histograms and binary heatmaps based on random samples of size $n = 100$ and $n = 10{,}000$ from three distributions with densities $f$, $g$, and $h$ (one column per density). All graphical summaries are constructed using 100 equal-width bins over the interval $[-1, 1] \subset \mathbb{R}$.

The figure suggests that, as the sample size $n$ goes to infinity, we should be able to distinguish between $f$ and $h$ by using binary heatmaps with an appropriate scaling scheme. This is perhaps surprising, because $f$ and $h$ have the same support.
Formally, our Theorem 10 shows that, if we shrink the common width of the bins (which corresponds to the radius of the covering balls in higher dimensions) at appropriate rates as a function of sample size $n$, the lower dimensional holes will be characterized in terms of empty bins with probability tending to one. This result can be extended to more general situations.

The objective of this article is to show that, without appropriate scaling, authentic topological holes of strictly lower dimension cannot be detected, while, with appropriate scaling, they can. The sufficient conditions that we will impose on the scaling schemes to attain these results depend on the dimension of the zero-density region $S$ and also on the local smoothness of the density that generates the data.
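The binning experiment of this section is easy to reproduce in outline. The sketch below is only an illustration: it assumes the three densities take the forms f(x) = (3/2)x^2, g uniform on [-1, -1/2) and [1/2, 1), and h(x) = (3/8)(x^2 + 1) on [-1, 1], and the sampler names, the seed, and the helper `empty_bins` are hypothetical choices made here rather than anything prescribed by the paper.

```python
import math
import random


def sample_f(rng):
    """Inverse-CDF draw from f(x) = (3/2) x^2 on [-1, 1]; F(x) = (x^3 + 1) / 2."""
    t = 2.0 * rng.random() - 1.0
    return math.copysign(abs(t) ** (1.0 / 3.0), t)


def sample_g(rng):
    """Uniform draw on [-1, -1/2) U [1/2, 1): a genuine hole (-1/2, 1/2)."""
    u = rng.random()
    return u - 1.0 if u < 0.5 else u


def sample_h(rng):
    """Rejection draw from h(x) = (3/8)(x^2 + 1) on [-1, 1]: no zero-density region."""
    while True:
        x = 2.0 * rng.random() - 1.0
        if 2.0 * rng.random() <= x * x + 1.0:
            return x


def empty_bins(sampler, n, n_bins, seed=0):
    """Indices of the equal-width bins of [-1, 1] that receive no observation."""
    rng = random.Random(seed)
    filled = [False] * n_bins
    for _ in range(n):
        x = sampler(rng)
        filled[min(int((x + 1.0) / 2.0 * n_bins), n_bins - 1)] = True
    return [i for i, hit in enumerate(filled) if not hit]


# Number of empty bins (out of 100) for each density and sample size.
for name, sampler in (("f", sample_f), ("g", sample_g), ("h", sample_h)):
    for n in (100, 10_000):
        print(name, n, len(empty_bins(sampler, n, n_bins=100)))
```

With the number of bins held fixed, the empty bins of f shrink toward the center as n grows, which is the behavior that makes f look like h; letting `n_bins` grow with `n` is the scaling the main result makes precise.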
3. Statement of the Main Result.
Consider a random vector $X$ having density $f$ with respect to Lebesgue measure on $\mathbb{R}^d$. For any Borel set $A \subset \mathbb{R}^d$, let $P(X \in A) = \int_A f(x)\,dx$.

Definition. (Support) The support of $X$ is defined to be
\[
\operatorname{supp}(X) := \bigcap_{\{R_X \text{ closed in } \mathbb{R}^d \,\mid\, P(X \in R_X) = 1\}} R_X. \tag{3.1}
\]
With an abuse of notation, we write $\operatorname{supp}(f)$ to represent the support of $X$.

In most applications where TDA is employed, a compact support $M$ can be reasonably assumed. For technical convenience, we will also assume the existence of a continuous version of the density $f$ on its support $M$. The latter is a mild condition satisfied for most theoretical questions and applied scenarios. In the following discussion, as will be stated in Assumption A.1, we consider the case of $M = [0,1]^d \subset \mathbb{R}^d$, $d > 0$. With this support $M$ in mind, we turn to a special region called the zero-density region lying in its interior.

Definition. (Zero-density region) For the continuous version of a density $f$ on $M$, we call the inverse image of $\{0\}$, i.e. $f^{-1}(\{0\})$, the zero-density region of $f$ and denote it by $S$.

$S$ may consist of more than one connected component. Our main results are easy to extend to the case of a finite number of connected components. Topological restrictions on $S$ are summarized in Assumption A.2.

The local behavior of the density $f$ around the zero-density region $S \subset M$ is a crucial aspect of our investigation. The following concepts will help us to describe the behavior of $f$ near $S$. We consider the $L_2$ metric and the associated norm, $\|\cdot\|$, in the following discussion, but remark that our methods can be generalized to other metrics and norms.

A ball of radius $r$ centered at $\mathbf{x}$ in $\mathbb{R}^d$ is the open set
\[
B_r(\mathbf{x}) := \left\{ \mathbf{y} \in \mathbb{R}^d \;\middle|\; \|\mathbf{x} - \mathbf{y}\| < r \right\}.
\]
An $\epsilon$-neighborhood ($\epsilon > 0$) of $S$ is the open set
\[
B_\epsilon(S) := \left\{ \mathbf{x} \in \mathbb{R}^d \;\middle|\; \inf_{\mathbf{y} \in S} \|\mathbf{x} - \mathbf{y}\| < \epsilon \right\}.
\]
(Note that the definition of an $\epsilon$-neighborhood makes sense for an arbitrary set, not just a zero-density region. In particular, the $\epsilon$-neighborhood of a point $\mathbf{y}$, $B_\epsilon(\mathbf{y})$, is just the open ball of radius $\epsilon$ centered at $\mathbf{y}$.)

Definition. (Upper and lower $\epsilon$-order of smoothness) We consider densities $f$ for which the following quantities, called the upper and lower $\epsilon$-order of smoothness, respectively, exist and are well-defined for every sufficiently small $\epsilon > 0$:
\[
\overline{K}_f(\epsilon) := \inf\left\{ \alpha > 0 \;\middle|\; 0 < \inf_{\mathbf{x} \in B_\epsilon(S) \setminus S} \frac{f(\mathbf{x})}{d(\mathbf{x}, S)^{\alpha}} \right\},
\]
and
\[
\underline{K}_f(\epsilon) := \sup\left\{ \alpha > 0 \;\middle|\; \infty > \sup_{\mathbf{x} \in B_\epsilon(S) \setminus S} \frac{f(\mathbf{x})}{d(\mathbf{x}, S)^{\alpha}} \right\}.
\]
Because our result concerns the limiting case as $\epsilon$ goes to zero, we give the following definition.
Definition. (Upper and lower order of smoothness) The upper and lower order of smoothness of the density $f$ w.r.t. $S$ are
\[
\overline{K}_f = \lim_{\epsilon \to 0^+} \overline{K}_f(\epsilon) \quad \text{and} \quad \underline{K}_f = \lim_{\epsilon \to 0^+} \underline{K}_f(\epsilon),
\]
respectively, provided they exist.

Figure 3.1. Lower dimensional region $S = \{0.5\} \times [0.25, 0.75]$ of a density function $f(\mathbf{x})$ with $\overline{K}_f \neq \underline{K}_f$. The density $f$ evaluates to 0 over $S$ but nowhere else over $[0,1]^2$.

It can be shown that $\overline{K}_f$ and $\underline{K}_f$ will exist if $\overline{K}_f(\epsilon)$ and $\underline{K}_f(\epsilon)$ exist for some $\epsilon$ and that there are densities for which $\overline{K}_f(\epsilon)$ and $\underline{K}_f(\epsilon)$ are undefined for all $\epsilon$. The limits $\overline{K}_f$ and $\underline{K}_f > 0$, if they exist, need not coincide. If they do exist and coincide, we will use the notation $K_f = \overline{K}_f = \underline{K}_f$.

The following example, depicted in Figure 3.1, shows that the upper and lower orders of smoothness $\overline{K}_f$ and $\underline{K}_f$ need not coincide.

Example.
For $\mathbf{x} = (x, y) \in \mathbb{R}^2$, define $h(\mathbf{x})$ as
\[
h(\mathbf{x}) =
\begin{cases}
d(\mathbf{x}, S)^{\ldots}, & \mathbf{x} \in [0.5, 1] \times [0.25, 0.75],\\
d(\mathbf{x}, S)^{\ldots}, & \mathbf{x} \in [0, 0.5) \times [0.25, 0.75],\\
d(\mathbf{x}, S)^{\ldots - \frac{\ldots}{\pi}\theta_1}, & \mathbf{x} \in [0, 1] \times (0.75, 1],\\
d(\mathbf{x}, S)^{\frac{\ldots}{\pi}\theta_2}, & \mathbf{x} \in [0, 1] \times [0, 0.25),
\end{cases}
\]
where $d$ is the $L_2$ metric, $\theta_1 = \arctan\{(y - 0.75)/(x - 0.5)\}$, and $\theta_2 = \arctan\{(y - 0.25)/(x - 0.5)\}$. Consider the density function $f$ satisfying the relationship $f(\mathbf{x}) \propto h(\mathbf{x}) \times \mathbf{1}_{[0,1]^2}(\mathbf{x})$. For $\mathbf{x} \in M$ with $d(\mathbf{x}, S) = \delta$, $\delta > 0$, $\sup_{\mathbf{x}} f(\mathbf{x}) = \delta^2$ and $\inf_{\mathbf{x}} f(\mathbf{x}) = \delta^4$. Thus $\overline{K}_f(\epsilon) = 4$ and $\underline{K}_f(\epsilon) = 2$.

3.1. Covering balls and dimension of $S$.
We cover the support $M$ of the density $f$ with a collection of balls of equal radius and, letting the radius shrink to zero at an appropriate rate, we attempt to detect the zero-density region $S$ with a certain limiting probability guarantee. In what follows we will make use of the following piece of notation.

Notation. (Covering) Let $E$ be a given subset of $\mathbb{R}^d$. We denote by $\mathcal{B}^d_r(E)$ a collection of $d$-dimensional balls of radius $r$ whose union contains $E$. Note that a covering $\mathcal{B}^d_r(E)$ may also depend on the sample size $n$ through the radius $r = r(n)$ in subsequent developments. We also use $|\mathcal{B}^d_r(E)|$ to denote the cardinality of $\mathcal{B}^d_r(E)$ and write $\mathcal{B}^d_r(E) = \mathcal{B}$ when the meaning is unambiguous.

We distinguish between three types of covering balls based on how far they are from the zero-density region $S$.

Figure 3.2. Illustration of the types of balls with the density in Figure 1.1. The blue region is an $\epsilon$-neighborhood surrounding the zero-density region $S$. The red ball is an $\epsilon$-inside ball, the orange balls are $\epsilon$-neighboring balls and the green balls are $\epsilon$-outside balls.

Definition. (Types of covering balls) Consider a density $f$ with respect to Lebesgue measure on $\mathbb{R}^d$ that is continuous on $M$ and zero on $\mathbb{R}^d \setminus M$. Let $S$ be a zero-density region for $f$. Denote by $\mathcal{B}$ a covering of $M$ with balls of radius $r$ and let $B_r(\mathbf{x}) \in \mathcal{B}$. We classify $B_r(\mathbf{x})$ into one of these three types:
1. an $\epsilon$-outside ball: a ball $B_r(\mathbf{x})$ such that $\mathbf{x} \notin B_\epsilon(S)$;
2. an $\epsilon$-neighboring ball: a ball $B_r(\mathbf{x})$ such that $\mathbf{x} \in B_\epsilon(S)$ and $B_r(\mathbf{x}) \cap S = \emptyset$;
3. an $\epsilon$-inside ball: a ball $B_r(\mathbf{x})$ such that $\mathbf{x} \in B_\epsilon(S)$ and $B_r(\mathbf{x}) \cap S \neq \emptyset$.

The main result will rely on the condition $r \leq \epsilon/2$, under which every $\epsilon$-inside ball is contained in $B_\epsilon(S)$. The various types of covering balls are illustrated in Figure 3.2.

Definition. (Big O notation [4]) Given two positive real sequences $f(n)$ and $g(n)$, we write $f \sim O(g)$ and say that $f$ is big O of $g$ if there exist constants $L_1$ and $L_2 \in (0, \infty)$ and an $n_0 \in \mathbb{N}$, such that
\[
L_1 \cdot g(n) \leq f(n) \leq L_2 \cdot g(n), \quad \text{for all } n > n_0.
\]
This condition means that the asymptotic behaviors of $f$ and $g$ are comparable.

Next, we introduce the notion of Minkowski dimension which we will later use to characterize the dimension of $S$.

Definition. (Minkowski dimension, or box-counting dimension. Definition 3.1 in [10]) The upper and lower Minkowski dimensions of a bounded subset $E \subset M$ of $\mathbb{R}^d$ are defined respectively as
\[
\overline{\dim}_M(E) := \limsup_{\Delta \to 0} \frac{\log N_\Delta(E)}{-\log \Delta}, \qquad
\underline{\dim}_M(E) := \liminf_{\Delta \to 0} \frac{\log N_\Delta(E)}{-\log \Delta},
\]
where $N_\Delta(E)$ is the $\Delta$-covering number of $E \subset \mathbb{R}^d$,
\[
N_\Delta(E) := \min\left\{ k \in \mathbb{N} \;\middle|\; E \subset \cup_{i=1}^{k} B_\Delta(\mathbf{x}_i) \text{ for some } \mathbf{x}_i \in \mathbb{R}^d \right\},
\]
i.e., the smallest number of balls of radius $\Delta > 0$ needed to cover $E$. We call this covering of minimal cardinality a minimal $\Delta$-covering for $E$. When $\overline{\dim}_M(E) = \underline{\dim}_M(E) = d_M$, we define the Minkowski dimension (or box-counting dimension) $\dim_M(E)$ of $E$ to be $d_M$.

By Proposition 3.4 in Falconer [10], in the case of $M = [0,1]^d$, we always have $\dim_M(M) = d$, matching the usual definition of dimension.

We now state the formal assumptions A.1-A.6 which we will use to prove our main results.
A.1. (Compact support) The support of $f$ is $\operatorname{supp}(f) = M = [0,1]^d$, $d > 0$, and we consider a version of the density $f$ that is zero on $\mathbb{R}^d \setminus M$ and continuous on $M$.

A.2. (Single component) The zero-density region $S$ is contained in the interior of $M$ and has one connected component.

A.3. (Order of smoothness) There exists an $\epsilon_0 > 0$ such that $\overline{K}_f(\epsilon_0) > 0$ and $\underline{K}_f(\epsilon_0) > 0$; moreover, $\overline{K}_f > 0$ and $\underline{K}_f > 0$.

A.4. Let $\epsilon_0 > 0$ be as in A.3. There exist positive constants $L_f$ and $U_f$ such that, for all $\epsilon$ with $0 < \epsilon < \min(1, \epsilon_0)$,
\[
L_f \cdot d(\mathbf{x}, S)^{\overline{K}_f(\epsilon)} \leq f(\mathbf{x}) \leq U_f \cdot d(\mathbf{x}, S)^{\underline{K}_f(\epsilon)} \quad \text{for all } \mathbf{x} \in B_\epsilon(S) \cap M.
\]

A.5. (Regular covering) There is an $\eta > 0$ such that the coverings $\mathcal{B}^d_{r(n)}(M)$ are comprised of balls whose centers lie in $M$ and whose radii satisfy $r(n) \sim O(n^{-\eta})$. The sequence of coverings is regular, i.e., the cardinalities $|\mathcal{B}^d_{r(n)}(M)|$ of the coverings in the sequence satisfy the condition $|\mathcal{B}^d_{r(n)}(M)| \sim O(n^{d\eta})$.

A.6. (Restriction of covering to $S$) Let $\mathcal{B}^d_{r(n)}(M)$ be the covering considered in A.5. If $d_0$ is the Minkowski dimension of $S$ and $d_0 < d$, then the number of balls in $\mathcal{B}^d_{r(n)}(M)$ intersecting $S$ is bounded from above by $H_\varepsilon(n) \sim O(n^{d_0 \eta (1+\varepsilon)})$, for each $\varepsilon > 0$, where $H_\varepsilon(n)$ is a function of sample size $n$ that depends on the parameter $\varepsilon$.

This set of assumptions can be roughly categorized into three groups. Assumptions A.1 and A.2 can be regarded as restrictions on the topological properties of the support of the density function and we will see later that both assumptions can be relaxed under suitable conditions. Assumptions A.3 and A.4 describe the local behavior of $f$ in the vicinity of $S$. Assumptions A.5 and A.6, however, are tied intrinsically to the Minkowski dimension of $S$ and the covering scheme we devise for detection of $S$.

Next we prove that under Assumptions A.1 and A.2 there exists one covering that satisfies Assumptions A.5 and A.6 for $S$ of Minkowski dimension $d_0$.

Lemma. Suppose that A.1 and A.2 hold and that $S$ is a lower dimensional zero-density region of Minkowski dimension $d_0 < d$. Then there exists a sequence of coverings $\mathcal{B}^d_{r(n)}(M)$ that satisfies A.5 and A.6.

Proof.
Step 1. We prove first that A.6 can be satisfied.

Let $r(n) = c n^{-\eta}$, $c > 0$, and let $\mathcal{B}^d_{r(n)}(S)$ be a minimal $r(n)$-covering of $S$ of cardinality $|\mathcal{B}^d_{r(n)}(S)|$. Since we assume that $\dim_M(S) = d_0$,
\[
\lim_{r(n) \to 0} \frac{\log N_{r(n)}(S)}{-\log r(n)} = \lim_{n \to \infty} \frac{\log N_{r(n)}(S)}{\log 1/r(n)} = d_0
\;\Rightarrow\;
\lim_{n \to \infty} \frac{\log N_{r(n)}(S)}{\log \left(1/r(n)\right)^{d_0}} = 1. \tag{3.2}
\]
Thus, for each $\varepsilon > 0$, there exists an $N^*(\varepsilon)$ such that, for all $n > N^*(\varepsilon)$,
\[
\frac{\log N_{r(n)}(S)}{\log \left(1/r(n)\right)^{d_0}} \leq 1 + \varepsilon, \qquad
\log N_{r(n)}(S) \leq (1 + \varepsilon) \cdot \log \left(1/r(n)\right)^{d_0},
\]
\[
N_{r(n)}(S) \leq \left( \frac{1}{r(n)^{d_0}} \right)^{1+\varepsilon} = c^{-d_0(1+\varepsilon)} n^{d_0 \eta (1+\varepsilon)} = H_\varepsilon(n). \tag{3.3}
\]
This establishes that each term in the tail of the sequence $\{|\mathcal{B}^d_{r(n)}(S)|\}$ is bounded by $H_\varepsilon(n)$ as in A.6, provided $\mathcal{B}^d_{r(n)}(S) \cap \mathcal{B}^d_{r(n)}(M) = \mathcal{B}^d_{r(n)}(S)$, where $\mathcal{B}^d_{r(n)}(M)$ is as in A.5.

Step 2.
Now we turn to the proof of A.5.

First, we take $n$ sufficiently large so that there exists a hypercube $C(n)$ of dimension $d$ and side length $r(n)$ such that $B_{r(n)}(C(n)) \subset \operatorname{int} M \setminus B_{2r(n)}(S)$. For such a fixed $n$ and the $r(n)$-covering $\mathcal{B}^d_{r(n)}(S)$ we derived above, we now construct a covering $\mathcal{B}^{d,i}_{r(n)}(M)$ where the centers of the covering balls in $\mathcal{B}^{d,i}_{r(n)}(M)$ are on the grid set
\[
G_0(n) := M \cap \left\{ \mathbf{x} = (x_1, \cdots, x_d) \in \mathbb{R}^d \;\middle|\; x_i \in \mathbb{N} \cdot \frac{r(n)}{d}, \ \forall i = 1, \cdots, d \right\}.
\]
For this grid, the maximal distance from an arbitrary point in $M$ to a point in $G_0(n)$ is at most $\sqrt{d \cdot \left(\frac{r(n)}{d}\right)^2} = \frac{r(n)}{\sqrt{d}} \leq r(n)$ for $d \geq 1$. This maximal distance is attained by a pair of points with coordinates in the form $(x_1, \cdots, x_d), \left(x_1 + \frac{r(n)}{d}, \cdots, x_d + \frac{r(n)}{d}\right) \in G_0(n)$. So an arbitrary point in $M$ is contained in some ball of radius $r(n)$ with its center in $G_0(n)$. The cardinality of this covering is the same as the cardinality of the set $G_0(n)$ of centers, which is of order $O\left(\left(1 \big/ \frac{r(n)}{d}\right)^d\right) \sim O(n^{\eta d})$.

Second, we delete all covering balls in $\mathcal{B}^{d,i}_{r(n)}(M)$ that intersect $S$. That is, we obtain another covering $\mathcal{B}^{d,ii}_{r(n)}(M) = \{B \in \mathcal{B}^{d,i}_{r(n)}(M) \mid B \cap S = \emptyset\}$, which ensures that there are no additional covering balls intersecting $S$; A.6 is still guaranteed by the covering $\mathcal{B}^d_{r(n)}(S)$ we chose in Step 1 because no additional covering balls are touching $S$. This operation only deletes covering balls that are completely contained in $B_{2r(n)}(S)$, so no covering ball intersecting $C(n)$ will be deleted. As argued in the first step, $\mathcal{B}^{d,ii}_{r(n)}(M)$ is also a covering of $C(n)$, which has Minkowski dimension $d$, so it must contain covering balls of cardinality $O(n^{\eta d})$.

Third, we recognize that after deletion and obtaining the covering $\mathcal{B}^{d,ii}_{r(n)}(M)$, the union $\cup_{B \in \mathcal{B}^{d,ii}_{r(n)}(M) \cup \mathcal{B}^d_{r(n)}(S)} B$ may not cover all points of $M$. In the previous deletion operation, we delete every covering ball such that its center $p$ satisfies $d(p, S) \leq r(n)$. Therefore any point $q$ in $M \setminus \left( \cup_{B \in \mathcal{B}^{d,ii}_{r(n)}(M) \cup \mathcal{B}^d_{r(n)}(S)} B \right)$ will satisfy $d(q, S) \leq 2r(n)$. Now we add finitely many translated copies of $\mathcal{B}^d_{r(n)}(S)$ to $\mathcal{B}^{d,ii}_{r(n)}(M) \cup \mathcal{B}^d_{r(n)}(S)$ to obtain our final covering. Denote by $\mathcal{B}^d_{r(n)}(S) + \mathbf{v}$ the covering obtained by translating each covering ball in $\mathcal{B}^d_{r(n)}(S)$ by vector $\mathbf{v}$. We let the translation vector $\mathbf{v}$ vary in the following set of vectors
\[
G_1(n) := \left\{ \mathbf{v} = (v_1, \cdots, v_d) \in \mathbb{R}^d \;\middle|\; v_i = \kappa \cdot \frac{r(n)}{d}, \ \kappa = 0, \pm 1, \cdots, \pm 4d, \ \forall i = 1, \cdots, d \right\},
\]
which is a set consisting of $(4d \times 2 + 1)^d = (8d + 1)^d$ vectors. This construction forms an (irregular) grid $G_2(n)$ by translating the centers of balls in $\mathcal{B}^d_{r(n)}(S)$. This grid covers all points in $B_{2r(n)}(S)$ since it translates $4r(n)$ in each coordinate direction. The maximal distance from an arbitrary point in $B_{2r(n)}(S)$ to a point in $G_2(n)$ is no more than $\sqrt{d \cdot \left(\frac{r(n)}{d}\right)^2} = \frac{r(n)}{\sqrt{d}} \leq r(n)$ for $d \geq 1$. Therefore $\mathcal{B}^{d,iii}_{r(n)} := \{B \in \mathcal{B}^d_{r(n)}(S) + \mathbf{v} \mid \mathbf{v} \in G_1(n)\}$ will cover the area $B_{2r(n)}(S)$. This will add at most $(8d + 1)^d \cdot \left|\mathcal{B}^d_{r(n)}(S)\right| \sim O\left(\left|\mathcal{B}^d_{r(n)}(S)\right|\right)$ additional covering balls that intersect $S$. This addition will not affect the order of magnitude of the cardinality of $\mathcal{B}^{d,ii}_{r(n)}(M) \cup \mathcal{B}^d_{r(n)}(S)$ satisfying A.5. We simply add finitely many, $(8d + 1)^d$, translated copies of $\mathcal{B}^d_{r(n)}(S)$ to cover the "gaps" created by deletion in the previous step.

The covering $\mathcal{B}^{d,\dagger}_{r(n)} := \mathcal{B}^{d,ii}_{r(n)}(M) \cup \mathcal{B}^d_{r(n)}(S) \cup \mathcal{B}^{d,iii}_{r(n)}$ satisfies both A.5 and A.6 by construction as we saw above.

The constructive proof of the previous lemma makes explicit use of minimal coverings of $S$. There are many situations in which sequences of coverings satisfying A.5 and A.6 can be produced without resorting to the use of a minimal covering of $S$. The following proposition gives an explicit example of such a situation.

Figure 3.3. Illustration of the construction in Proposition 9 with the density in Figure 1.1. With $r$ fixed, the grid centers in $G_0(n)$ are displayed. The range of $G_S(n)$ is displayed in red; the ranges of $G_+(n)$ and $G_-(n)$ are displayed in green and blue, respectively.

Proposition. For the example in Figure 1.1, for a fixed $\eta > 0$, there exists a sequence of coverings $\mathcal{B}_{r(n)}(M)$ satisfying A.5 and A.6.

Proof.
We consider the covering where the centers of the balls in $\mathcal{B}_{r(n)}(M)$ are exactly the grid set
\[
G_0(n) = M \cap \left\{ \mathbf{y} = (y_1, y_2) \in \mathbb{R}^2 \;\middle|\; y_1, y_2 \in \mathbb{N} \cdot \frac{r(n)}{2} \right\}. \tag{3.4}
\]
(See Figure 3.3 for an illustration.) The cardinality of this covering is the same as the cardinality of the grid set $G_0(n)$. As $n$ increases, we have a sequence of coverings whose cardinalities are of order $O\left(\frac{1}{(r(n)/2)^2}\right) \sim O(n^{2\eta})$ as required in A.5.

Next, we prove that the covering balls of $\mathcal{B}_{r(n)}(M)$ intersecting $S$ also have cardinalities that satisfy A.6. Let $n$ be large enough that $r(n) < 1/10$ and the $r(n)$-neighborhood of $S$ satisfies $B_{r(n)}(S) \subset \operatorname{int} M$. Consider the set of grid points $G_S(n) := \{\mathbf{y} = (y_1, y_2) \in G_0(n) \mid 0 \leq d(\mathbf{y}, S) < r(n)\}$. A covering ball in $\mathcal{B}_{r(n)}(M)$ intersects $S$ if and only if its center is in $G_S(n)$. We construct the rectangular grids defined below to bound the cardinality of $G_S(n)$:
\[
G_+(n) := \left\{ \mathbf{y} = (y_1, y_2) \in G_0(n) \;\middle|\; \tfrac{1}{2} - 2r(n) \leq y_1 \leq \tfrac{1}{2} + 2r(n), \ \tfrac{1}{4} - 2r(n) \leq y_2 \leq \tfrac{3}{4} + 2r(n) \right\},
\]
\[
G_-(n) := \left\{ \mathbf{y} = (y_1, y_2) \in G_0(n) \;\middle|\; \tfrac{1}{2} - \tfrac{r(n)}{2} \leq y_1 \leq \tfrac{1}{2} + \tfrac{r(n)}{2}, \ \tfrac{1}{4} + \tfrac{r(n)}{2} \leq y_2 \leq \tfrac{3}{4} - \tfrac{r(n)}{2} \right\}.
\]
It is clear from their definitions that $G_-(n) \subset G_S(n) \subset G_+(n)$.

In the grid set $G_-(n)$, there are at least $\frac{r(n)}{r(n)/2} = 2$ grid points with the same value of $y_2$, and at least $\left\lfloor \frac{1/2 - r(n)}{r(n)/2} \right\rfloor = \left\lfloor \frac{1}{r(n)} - 2 \right\rfloor$ points with the same value of $y_1$. Thus there are at least $2 \cdot \left\lfloor \frac{1}{r(n)} - 2 \right\rfloor$ grid points in $G_-(n)$. Similarly, in the grid set $G_+(n)$ there are at most $\frac{4r(n)}{r(n)/2} + 1 = 9$ grid points with the same value of $y_2$, and at most $\left\lceil \frac{1/2 + 4r(n)}{r(n)/2} + 1 \right\rceil = \left\lceil \frac{1}{r(n)} + 9 \right\rceil$ points with the same value of $y_1$. Thus there are at most $9 \cdot \left\lceil \frac{1}{r(n)} + 9 \right\rceil$ grid points in $G_+(n)$.

Since $G_-(n) \subset G_S(n) \subset G_+(n)$, the number of balls intersecting $S$ has cardinality $|G_S(n)|$ bounded by the inequality
\[
2 \cdot \left\lfloor \frac{1}{r(n)} - 2 \right\rfloor \leq |G_-(n)| \leq |G_S(n)| \leq |G_+(n)| \leq 9 \cdot \left\lceil \frac{1}{r(n)} + 9 \right\rceil. \tag{3.5}
\]
Both sides of (3.5) have order of magnitude $O\left(\frac{1}{r(n)}\right) \sim O(n^{\eta})$ and this verifies A.6.

Although we have shown by example that our Assumptions A.1-A.6 can be verified for many densities, we point out that in general it might not be easy to verify these assumptions when $S$ or $M$ possesses a complicated topological structure.

3.2. Main result.
Theorem. Suppose we have a set of i.i.d. data $X_1, \cdots, X_n$ drawn from a continuous density $f(\mathbf{x})$ w.r.t. the $d$-dimensional Lebesgue measure $\nu_d$ defined on a compact subset $M = [0,1]^d$ of $\mathbb{R}^d$ and assume that Assumptions A.1-A.6 hold. Assume also that the radius $r$ and the separation distance $\epsilon$ satisfy the following growth rates:
\[
r(n) \sim O(n^{-\eta}), \quad 0 < \eta < \tfrac{1}{d},
\]
\[
\epsilon(n) \sim O(n^{-\psi}), \quad 0 < \psi \leq \eta,
\]
\[
2r(n) \leq \epsilon(n) < 1.
\]
Finally, assume the validity of the following bounding condition for the density $f$ outside the $\epsilon$-neighborhood of $S$:
\[
m(f, n) := \min_{\mathbf{w} \in M \setminus B_\epsilon(S)} f(\mathbf{w}) \in (0, \infty) \sim O(n^{-\xi}), \quad 0 < \xi < 1 - \eta d.
\]
Then:
(A) If $\eta$ and $\psi$ satisfy $1 - \eta d - \overline{K}_f \psi > 0$, we have $\lim_{n \to \infty} P(\text{no empty } \epsilon(n)\text{-outside balls}) = 1$.
(B) If $\eta$ and $\psi$ satisfy $1 + d_0 \eta - \underline{K}_f \eta - d\eta < 0$, we have $\lim_{n \to \infty} P(\text{all } \epsilon(n)\text{-inside balls are empty}) = 1$.

Proof.
See Appendix.

Similarly, we have a result dealing with the situation where $S$ has more than one (but a finite number of) connected components.

Corollary. Under the same assumptions as in Theorem 10, suppose that the zero-density region can be decomposed into $K$ disjoint connected components, $S = \sqcup_{k=1}^{K} S_k$. For $n$ large enough (so that $\epsilon$ is small enough), an $\epsilon$-neighborhood of $S$ will also be comprised of $K$ disjoint connected components, the $\epsilon$-neighborhoods of each component $S_k$.

Assume that the following bounding conditions for the density $f$ outside the $\epsilon$-neighborhood of each component $S_k$ hold:
\[
m_k(f, n) := \min_{\mathbf{w} \in M \setminus B_{\epsilon(n)}(S_k)} f(\mathbf{w}) \in (0, \infty) \sim O(n^{-\xi}), \quad 0 < \xi < 1 - \eta d.
\]
Assume also that the orders of smoothness of $f$ w.r.t. $S_k$ are $\overline{K}_{f,k}$ and $\underline{K}_{f,k} > 0$ and that, for $S_k$ of dimension $d_k$, the cardinality of the set of covering balls intersecting $S_k$ is of order $O(n^{\eta d_k})$.
Then:
(A) If $\eta$ and $\psi$ satisfy $\min_{k=1,\cdots,K} \left\{ 1 - \eta d - \overline{K}_{f,k} \psi \right\} > 0$, we have $\lim_{n \to \infty} P(\text{no empty } \epsilon(n)\text{-outside balls}) = 1$ for all $k$.
(B) If $\eta$ and $\psi$ satisfy $\max_{k=1,\cdots,K} \left\{ 1 + d_k \eta - \underline{K}_{f,k} \eta - d\eta \right\} < 0$, we have $\lim_{n \to \infty} P(\text{all } \epsilon(n)\text{-inside balls are empty}) = 1$ for all $k$.
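As a quick arithmetic aid, the rate conditions of Theorem 10 can be checked for candidate exponents. The sketch below reads condition (A) as 1 - ηd - K̄ψ > 0 (with the upper order of smoothness) and condition (B) as 1 + d₀η - K̲η - dη < 0 (with the lower order of smoothness and d₀ the Minkowski dimension of S); the function names and the numerical values in the usage lines are hypothetical and chosen only for illustration.

```python
def rates_admissible(eta, psi, d):
    """Growth-rate side conditions: 0 < eta < 1/d and 0 < psi <= eta."""
    return 0.0 < eta < 1.0 / d and 0.0 < psi <= eta


def condition_a(eta, psi, d, k_upper):
    """Condition (A): 1 - eta*d - k_upper*psi > 0 (outside balls get filled)."""
    return 1.0 - eta * d - k_upper * psi > 0.0


def condition_b(eta, d, d0, k_lower):
    """Condition (B): 1 + d0*eta - k_lower*eta - d*eta < 0 (inside balls stay empty)."""
    return 1.0 + d0 * eta - k_lower * eta - d * eta < 0.0


# Hypothetical setting: a segment (d0 = 1) in the plane (d = 2) with both
# orders of smoothness equal to 2.
d, d0, k = 2, 1, 2.0
for eta, psi in ((0.4, 0.05), (0.3, 0.05)):
    print(eta, psi,
          rates_admissible(eta, psi, d),
          condition_a(eta, psi, d, k),
          condition_b(eta, d, d0, k))
```

In this hypothetical setting (B) reduces to η > 1/3 while the side condition requires η < 1/2, so only a narrow band of shrinkage rates satisfies both parts at once.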
4. Non-compact support.
The main result of the previous section relies on the assumption of a compact support for the density $f$. However, in practice, many distributions have unbounded supports. In this section we consider the case of a density function $f$ with a non-compact support contained in $\mathbb{R}^d$ and, assuming that the tails of the density decay at certain rates, we derive results similar to those of the previous section.

Our strategy restricts our consideration to a region within the support that contains most of the probability mass of the density $f$. The original motivation for these ideas comes from the concentration inequalities that describe the fact that observations usually "concentrate" around the "center" of a probability density with high probability (for example, Talagrand [23]).

Suppose that the support of the probability density function $f$ is a non-compact set $M \subset \mathbb{R}^d$. We consider a truncation of the non-compact support $M$ to a compact subset that contains most of the probability mass of $f$ and whose interior contains (all components of) $S$. This allows us to examine the topological features of $M$ over a compact truncated region that contains the bulk of the data and to establish our results using arguments similar to those used to prove Theorem 10. For ease of exposition, we focus parts of our discussion on the situation where $M = \mathbb{R}^d$, although most of our results can be extended to more general situations. We will state the restrictions on the tail behavior of the density functions in Proposition 18.

Definition. ($(1-\delta)$-support containing $S$) A $(1-\delta)$-support containing $S$ for some $0 < \delta < 1$, denoted by $S^{1-\delta}_f \subset \mathbb{R}^d$, is a subset of the form $S^{1-\delta}_f := [-B, B]^d \subset \mathbb{R}^d$ for some $B \in (0, \infty)$, such that $S \subset \operatorname{int} S^{1-\delta}_f = (-B, B)^d$ and $P\{X \in S^{1-\delta}_f\} = 1 - \delta$.

The hypercube $[-B, B]^d$ containing $1 - \delta$ of the probability expands as $\delta$ shrinks. If $S$ is compact then $S^{1-\delta}_f$ exists for a small enough $\delta \in (0, 1)$. If $S$ is non-compact, then it is unbounded and $S^{1-\delta}_f$ does not exist for any $\delta \in (0, 1)$. The $(1-\delta)$-support is completely contained in the interior of the support $M$.

Passing from $M$ to $S^{1-\delta}_f$ allows us to work with a compact cubical subset and removes the technical difficulties associated with densities whose tails decay to zero. We replicate the results of Section 3 with modifications to Assumptions A.1-A.6. The modifications stem from replacing the compact support $[0,1]^d$ with a compact set $S^{1-\delta}_f$ that covers $S$ and contains $1 - \delta$ mass of the probability measure.

We first consider the case where $\delta > 0$ and $S^{1-\delta}_f$ are both fixed and state Assumptions A.1'-A.6' for the non-compact support case.

A.1'. (Non-compact support) The support of $f$ is $\operatorname{supp}(f) = M \subset \mathbb{R}^d$, $d > 0$, and we consider a version of the density $f$ that is zero on $\mathbb{R}^d \setminus M$ and continuous on $M$.

A.2'. (Single component) The zero-density region $S$ is contained in the interior of $M$ and has one connected component.

A.3'. (Order of smoothness) There exists an $\epsilon_0 > 0$ such that $\overline{K}_f(\epsilon_0) > 0$ and $\underline{K}_f(\epsilon_0) > 0$.

A.4'. Let $\epsilon_0 > 0$ be as in A.3'. There exist positive constants $L_f$ and $U_f$ such that, for all $\epsilon$ with $0 < \epsilon < \min(1, \epsilon_0)$,
\[
L_f \cdot d(\mathbf{x}, S)^{\overline{K}_f(\epsilon)} \leq f(\mathbf{x}) \leq U_f \cdot d(\mathbf{x}, S)^{\underline{K}_f(\epsilon)} \quad \text{for all } \mathbf{x} \in B_\epsilon(S) \cap S^{1-\delta}_f.
\]

A.5'. (Regular covering) There is an $\eta > 0$ such that the coverings $\mathcal{B}^d_{r(n)}(S^{1-\delta}_f)$ are comprised of balls whose centers lie in $\operatorname{int} S^{1-\delta}_f$ and whose radii satisfy $r(n) \sim O(n^{-\eta})$. The sequence of coverings is regular, i.e., the cardinalities $|\mathcal{B}^d_{r(n)}(S^{1-\delta}_f)|$ of the coverings in the sequence satisfy the condition $|\mathcal{B}^d_{r(n)}(S^{1-\delta}_f)| \sim O(n^{d\eta})$.

A.6'. (Restriction of covering to $S$) Let $\mathcal{B}^d_{r(n)}(S^{1-\delta}_f)$ be the covering considered in A.5'. If $d_0$ is the Minkowski dimension of $S$ and $d_0 < d$, then the number of balls in $\mathcal{B}^d_{r(n)}(S^{1-\delta}_f)$ intersecting $S$ is bounded from above by $H_\varepsilon(n) \sim O(n^{d_0 \eta (1+\varepsilon)})$, for each $\varepsilon > 0$.

As for the compact case, these assumptions can be divided into three groups. A.1' and A.2' are restrictions on the topology of the support; A.3' and A.4' are descriptions of the local behavior of the density function; and A.5' and A.6' stipulate the existence of a sequence of coverings with good properties.
Theorem 13. Consider a sequence of i.i.d. data $X_1, X_2, \cdots$ drawn from a distribution having a continuous density $f(x)$ w.r.t. the $d$-dimensional Lebesgue measure $\nu_d$ on $\mathbb{R}^d$. Assume that A.1'-A.6' hold and that $S^{1-\delta}_f \subset \operatorname{int} M$. Assume also that the radius $r$ and the separation distance $\epsilon$ satisfy the following growth rates:

$$r(n) \sim O(n^{-\eta}), \quad 0 < \eta < \tfrac{1}{d},$$
$$\epsilon(n) \sim O(n^{-\psi}), \quad 0 < \psi \le \eta, \quad 2r(n) \le \epsilon(n) < 1.$$

Finally, assume that the density $f$ outside the $\epsilon$-neighborhood of $S$ satisfies:

$$m(f,n) := \min_{w \in S^{1-\delta}_f \setminus B_{\epsilon}(S)} f(w) \in (0,\infty) \sim O(n^{-\xi}), \quad 0 < \xi < 1 - \eta d.$$

Then:

(A) If $\eta$ and $\psi$ satisfy $1 - \eta d - \overline{K}_f\,\psi > 0$, we have $\lim_{n\to\infty} P(\text{no empty } \epsilon(n)\text{-outside balls}) = 1$.

(B) If $\eta$ and $\psi$ satisfy $1 + d_0\eta - \underline{K}_f\,\eta - d\eta < 0$, we have $\lim_{n\to\infty} P(\text{all } \epsilon(n)\text{-inside balls are empty}) = 1$.

Proof.
The idea of the proof is that the number of samples falling inside $S^{1-\delta}_f$ is close to the total number of samples, $n$, by the definition of the $(1-\delta)$-support and the law of large numbers. By assumption, $S^{1-\delta}_f$ is the hypercube $[-B,B]^d$ contained in $\operatorname{int} M$ and A.1'-A.6' hold. The number $n_\delta$ of observations that fall in $S^{1-\delta}_f$ has a $\mathrm{Binom}(n, 1-\delta)$ distribution. Using the two-sided Hoeffding inequality, (A.6) of the appendix, with $\gamma = \delta$ and $p = 1-\delta$, we have:

$$P\big(n_\delta \ge n(1-\delta-\delta)\big) = P\big(n_\delta \ge n(1-2\delta)\big) \ge P\big((1-2\delta)n \le n_\delta \le n\big) \ge 1 - 2\exp\left(-2\delta^2 n\right).$$

Conditioning on the event that $n_\delta \ge n(1-2\delta)$ merely adds a nonzero multiplicative factor to the probabilities of the events appearing in cases (A) and (B) of the statement of the theorem. With this adjustment the proof of this theorem parallels that of Theorem 10. The conditional probabilities of the events in (A) and (B) tend to 1 as $n \to \infty$. We also know that, as $n \to \infty$, the probability of the conditioning event tends to 1, completing the proof.

The result above is stated for a fixed $(1-\delta)$-support $S^{1-\delta}_f$. As the sample size $n$ tends to infinity, we do not want to constrain ourselves to a fixed $S^{1-\delta}_f$, since this would prevent us from exploring the entire support of $f$. Consider a sequence of regions $S^{1-\delta(n)}_f$ with a decreasing sequence $\delta = \delta(n)$, expanding the region under consideration as the sample accumulates. To retain the conclusions of the theorem, we must take care to expand the region slowly enough to control the decay of the density near the edges of $S^{1-\delta(n)}_f$.

In what follows, we denote by $\mathcal{B}^d_{r(n)}$ a covering of $S^{1-\delta(n)}_f$ and assume that A.5' and A.6' hold for this sequence of coverings. (Note that, here, $\mathcal{B}^d_{r(n)}$ denotes a covering of the $(1-\delta(n))$-support, not of $\mathbb{R}^d$, and that we dropped the explicit dependence on $S^{1-\delta(n)}_f$ that appears in A.5' and A.6' to simplify notation.)

Theorem 14.
Under the assumptions of Theorem 13, consider a positive decreasing sequence $\delta(n)$ satisfying $\lim_{n\to\infty}\delta(n) = 0$ and the associated sequence of $(1-\delta)$-supports $S^{1-\delta(n)}_f$. For each $(1-\delta)$-support $S^{1-\delta(n)}_f$ we consider its corresponding covering $\mathcal{B}^d_{r(n)}$ consisting of open balls of radius $r(n)$. Assume that

$$M(f) := \sup_{w \in \mathbb{R}^d} f(w) \in (0,\infty),$$
$$m(f,n) := \min_{w \in S^{1-\delta(n)}_f \setminus B_{\epsilon(n)}(S)} f(w) \in (0,\infty) \sim O(n^{-\xi}), \quad 0 < \xi < 1 - \eta d,$$

and that the cardinality $|\mathcal{B}^d_{r(n)}| \sim o(n^{\Omega})$, $\Omega \ge \eta > 0$. Then, if $\eta$ and $\psi$ satisfy $1 - \eta d - \overline{K}_f \cdot \psi > 0$, we have $\lim_{n\to\infty} P(\text{no empty } \epsilon(n)\text{-outside balls}) = 1$.

Proof.
We denote the total number of samples by $n$ and the number of samples falling inside $S^{1-\delta(n)}_f$ by $N_{\delta(n)}$. To start, consider $n \ge N$ sufficiently large that the decreasing sequence satisfies $\delta(n) \le 0.1$, so that $1 - \delta(n) \ge 0.9$. Then, let us decompose the event $C_n = \{\text{no empty } \epsilon(n)\text{-outside balls}\}$ using the event $D_n = \{0.8n \le N_{\delta(n)} \le n\}$ and its complement. The law of total probability yields:

$$P(C_n) = P(C_n \mid D_n)\cdot P(D_n) + P(C_n \mid D_n^c)\cdot P(D_n^c) \ge P(C_n \mid D_n)\cdot P(D_n). \tag{4.1}$$

The event $D_n$ involves a binomial random variable $N_{\delta(n)}$, so $P(D_n)$ can be bounded by the Hoeffding inequality, (A.6) of the appendix. Since $1 - \delta(n) \ge 0.9$,

$$P(D_n) = P(0.8n \le N_{\delta(n)} \le n) \ge 1 - 2\exp(-0.02\,n). \tag{4.2}$$

The other term, $P(C_n \mid D_n)$, can be bounded by the following argument. (Note that $r(n)$ and $\epsilon(n)$ depend only on the total sample size $n$, not on $N_{\delta(n)}$.)

$$P(C_n \mid D_n) = 1 - P(C_n^c \mid D_n) \ge 1 - \sum_{B \in \{\epsilon(n)\text{-outside balls}\}} P(B \text{ empty} \mid D_n) \ge 1 - |\mathcal{B}^d_{r(n)}| \cdot \left(1 - \inf_{B \in \{\epsilon(n)\text{-outside balls}\}} P(B \text{ not empty} \mid D_n)\right). \tag{4.3}$$

By the bound in (A.8) in the proof of Theorem 10 in the appendix and the assumption that $|\mathcal{B}^d_{r(n)}| \sim o(n^{\Omega})$, we have

$$|\mathcal{B}^d_{r(n)}| \cdot \left(1 - \inf_{B \in \{\epsilon(n)\text{-outside balls}\}} P(B \text{ not empty} \mid D_n)\right) \sim o(n^{\Omega}) \cdot O\left(\exp\left(-n^{\min(1 - \eta d - \overline{K}_f\cdot\psi,\; 1 - \eta d - \xi)}\right)\right),$$

which is of order $o(n^{\Omega}) \cdot O\left(\exp\left(-n^{1 - \eta d - \overline{K}_f\cdot\psi}\right)\right)$, since $0 < \xi < 1 - \eta d$. Using the definition of big/small $O$ notation, we can find a constant $L \in (0,\infty)$ such that

$$P(C_n \mid D_n) \ge 1 - L \cdot n^{\Omega} \cdot \exp\left(-n^{1 - \eta d - \overline{K}_f\cdot\psi}\right) \quad \text{for sufficiently large } n. \tag{4.4}$$

The bounds (4.2) and (4.4) can be substituted back into (4.1) to obtain

$$P(C_n) \ge P(C_n \mid D_n)\cdot P(D_n) \ge \left(1 - L \cdot n^{\Omega} \cdot \exp\left(-n^{1 - \eta d - \overline{K}_f\cdot\psi}\right)\right)\left(1 - 2\exp(-0.02\,n)\right). \tag{4.5}$$

As long as $1 - \eta d - \overline{K}_f\cdot\psi > 0$, this lower bound tends to 1 as $n$ goes to infinity.

It is important to notice that it is the decay rate of $m(f,n)$ and the growth rate $|\mathcal{B}^d_{r(n)}| \sim o(n^{\Omega})$ that determine the sufficient conditions of Theorem 14, not the rate at which the volume of $S^{1-\delta(n)}_f$ increases.

The following theorem deals with the $\epsilon(n)$-inside balls.

Theorem 15. Under the assumptions of Theorem 13, consider a positive decreasing sequence $\delta(n)$ satisfying $\lim_{n\to\infty}\delta(n) = 0$ and the associated sequence of $(1-\delta)$-supports $S^{1-\delta(n)}_f$. For each $(1-\delta)$-support $S^{1-\delta(n)}_f$ we consider its corresponding covering $\mathcal{B}^d_{r(n)}$ consisting of open balls of radius $r(n)$. Assume that

$$m(f,n) := \min_{w \in S^{1-\delta(n)}_f \setminus B_{\epsilon(n)}(S)} f(w) \in (0,\infty) \sim O(n^{-\xi}), \quad 0 < \xi < 1 - \eta d.$$

Then, if $\eta$ and $\psi$ satisfy $1 + d_0\eta - \underline{K}_f\,\eta - d\eta < 0$, we have $\lim_{n\to\infty} P(\text{all } \epsilon(n)\text{-inside balls are empty}) = 1$.

Proof.
This result can be established by following exactly the same proof as that for case (B) in Theorem 13. Since $S \subset \operatorname{int} M$ we can choose $N_1$ sufficiently large so that $B_{\epsilon(n)}(S) \subset \operatorname{int} S^{1-\delta(n)}_f$ for $n \ge N_1$. The probability of the event that an individual $\epsilon(n)$-inside ball is empty does not depend on the varying sequence of regions $S^{1-\delta(n)}_f$. For an appropriate $N_2$ and every $n \ge N_2$, our assumption A.6' restricts the number of $\epsilon$-inside balls, and this cardinality does not depend on the varying $(1-\delta(n))$-supports. The rest of the proof parallels that of Theorem 13 (B), considering $n \ge \max(N_1, N_2) + 1$.

Figure 4.1. Density functions with polynomial tails (blue) and exponential tails (orange).
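The Hoeffding step used in the proofs of this section can be checked numerically. The following is a minimal sketch, assuming the standard two-sided Hoeffding bound $1 - 2\exp(-2\gamma^2 n)$ for a binomial count and the illustrative values $p = 0.9$, $\gamma = 0.1$; the function names are hypothetical and not part of the paper.

```python
import math


def binom_prob_between(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p), with integer lo and hi."""
    lo = max(0, lo)
    hi = min(n, hi)
    return sum(
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(lo, hi + 1)
    )


def hoeffding_lower_bound(n, gamma):
    """Standard two-sided Hoeffding lower bound 1 - 2 exp(-2 gamma^2 n)."""
    return 1.0 - 2.0 * math.exp(-2.0 * gamma**2 * n)


if __name__ == "__main__":
    p, gamma = 0.9, 0.1  # assumed illustrative values
    for n in (100, 500, 1000):
        # Event {0.8 n <= N <= n}; 4 * n // 5 keeps the threshold an integer.
        exact = binom_prob_between(n, p, 4 * n // 5, n)
        bound = hoeffding_lower_bound(n, gamma)
        print(n, exact, bound)
```

The exact probability should dominate the bound for every $n$, and both approach 1 as $n$ grows, which is all the proofs require of the conditioning event.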
The existence of a sequence of $\delta(n)$ values satisfying the above conditions can be verified for densities exhibiting specific tail behaviors on their supports. Two such densities are given in the examples following Definitions 16 and 17 and are depicted in Figure 4.1.

Definition 16. (Polynomial tail) We say that a continuous density $f$ supported on $\mathbb{R}^d$ has a polynomial tail if it has the form

$$f(x) = \begin{cases} f_1(d(x,S)) = C_1 \cdot d(x,S)^{\gamma}, & d(x,S) < \epsilon_0, \\ f_2(d(x,S)) = C_2 \cdot d(x,S)^{\chi}, & d(x,S) \ge \epsilon_0, \end{cases} \tag{4.6}$$

for some $C_1, C_2 \in (0,\infty)$, $\gamma > 0$, and $\chi < -d$. Note that the continuity assumption on the density $f$ requires $C_1 \cdot \epsilon_0^{\gamma} = C_2 \cdot \epsilon_0^{\chi}$.

Example.
Consider a continuous density f of the form (4.6) with d = 1 , S = { } , (cid:15) = 1 , γ = , and χ = −
2, leading to the density f ( x ) = ( · | x | / , | x | < , · | x | − , | x | ≥ , . Thisdensity yields R − | x | / dx = . We can take n sufficiently large so that δ ( n ) < . Fromthe definition of (1 − δ ( n ))-support we have B ( n ) > (cid:15) = 1 for δ ( n ) < . The definingequation of B ( n ) is R − B ( n ) −∞ f ( x ) dx + R ∞ B ( n ) f ( x ) dx = 2 R ∞ B ( n ) f ( x ) dx = δ ( n ) . Therefore B ( n ) = δ ( n ) ∼ O ( δ ( n ) − ).The maximum of this density function is attained at x = ± M ( f ) = .Let ξ and ψ be as defined in Theorem 13. Then, if we choose sequences δ ( n ) ∼ O ( n ξ/χ ) ∼ O ( n − ξ/ ) and (cid:15) ( n ) ∼ O ( n − ψ ) ∼ O ( n − ξ/γ ) ∼ O ( n − ξ ), we will ensure that theminimal value m ( f, n ) := min w ∈ S − δ ( n ) f \ B (cid:15) ( n ) ( S ) f ( w ) = min ( C · (cid:15) ( n ) γ , C · B ( n ) χ ) =min (cid:16) (cid:15) ( n ) / , B ( n ) − (cid:17) . We have (cid:15) ( n ) / ∼ O ( n − ξ ) and B ( n ) − ∼ O ( n − ξ ), so that m ( f, n ) ∼ O ( n − ξ ). Furthermore, δ ( n ) ∼ O ( n − ξ/ ) →
0. The assumptions on the densityfunction in Theorem 13 are therefore satisfied.
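The rates in this example can be reproduced with a few lines of code. The sketch below is illustrative only: it takes $d = 1$, $S = \{0\}$, $\epsilon_0 = 1$ and $\chi = -2$ as in the example, and uses $\gamma = 1/2$ as an assumed value. The helper names are hypothetical, and the closed forms are derived directly from the polynomial-tail form (4.6).

```python
GAMMA = 0.5  # assumed exponent near S (illustrative)
CHI = -2.0   # tail exponent, chi < -d with d = 1


def normalizing_constant(gamma=GAMMA, chi=CHI):
    """C with f(x) = C|x|^gamma for |x| < 1 and C|x|^chi for |x| >= 1.

    With eps_0 = 1, continuity forces C_1 = C_2 = C.
    """
    inner = 1.0 / (gamma + 1.0)  # integral of x^gamma over [0, 1]
    outer = 1.0 / (-chi - 1.0)   # integral of x^chi over [1, inf)
    return 1.0 / (2.0 * (inner + outer))


def half_width(delta, gamma=GAMMA, chi=CHI):
    """B with P(|X| > B) = delta; valid when the resulting B is at least 1."""
    c = normalizing_constant(gamma, chi)
    return (delta * (-chi - 1.0) / (2.0 * c)) ** (1.0 / (chi + 1.0))


def min_density(n, xi, gamma=GAMMA, chi=CHI):
    """min(C eps(n)^gamma, C B(n)^chi), delta(n) = n^(xi/chi), eps(n) = n^(-xi/gamma)."""
    c = normalizing_constant(gamma, chi)
    delta = n ** (xi / chi)
    eps = n ** (-xi / gamma)
    b = half_width(delta, gamma, chi)
    return min(c * eps**gamma, c * b**chi)


if __name__ == "__main__":
    xi = 0.2
    for n in (10**3, 10**4, 10**6):
        # The rescaled minimum should stay bounded, i.e. m(f, n) ~ O(n^{-xi}).
        print(n, min_density(n, xi) * n**xi)
```

Under these assumptions the rescaled quantity $m(f,n)\cdot n^{\xi}$ stays constant in $n$, which is the $O(n^{-\xi})$ behavior claimed in the example.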
Definition 17. (Exponential tail) We say that a continuous density $f$ supported on $\mathbb{R}^d$ has an exponential tail if it has the form

$$f(x) = \begin{cases} f_1(d(x,S)) = C_1 \cdot d(x,S)^{\gamma}, & d(x,S) < \epsilon_0, \\ f_2(d(x,S)) = C_2 \cdot \exp(\beta\, d(x,S)), & d(x,S) \ge \epsilon_0, \end{cases} \tag{4.7}$$

for some $C_1, C_2 \in (0,\infty)$, $\gamma > 0$ and $\beta < 0$. Note that the continuity assumption on the density $f$ requires $C_1 \cdot \epsilon_0^{\gamma} = C_2 \cdot \exp(\beta \epsilon_0)$.

Example.
Consider a continuous density f of the form (4.7) with d = 1 , S = { } , (cid:15) = 1 , γ = , and β = −
2, leading to the density f ( x ) = ( · | x | / , | x | < , · e · e − | x | , | x | ≥ , .This density yields R − · | x | / dx = . We can take n sufficiently large so that δ ( n ) < .From the definition of (1 − δ ( n ))-support we have B ( n ) > (cid:15) = 1 for δ ( n ) < . The defin-ing equation of B ( n ) is R − B ( n ) −∞ f ( x ) dx + R ∞ B ( n ) f ( x ) dx = 2 R ∞ B ( n ) f ( x ) dx = δ ( n ) . Therefore B ( n ) = 1 − log (cid:0) δ ( n ) (cid:1) ∼ O ( − log δ ( n )).The maximum of this density function is attained at x = ± M ( f ) = .Let ξ and ψ be as defined in Theorem 13. Then if we choose sequences δ ( n ) ∼ O ( n ξ/β ) ∼ O ( n − ξ/ ) and (cid:15) ( n ) ∼ O ( n − ψ ) ∼ O ( n − ξ/γ ) ∼ O ( n − ξ ), we will ensure that the minimalvalue m ( f, n ) := min w ∈ S − δ ( n ) f \ B (cid:15) ( n ) ( S ) f ( w ) = min ( C · (cid:15) ( n ) γ , C · exp( βB ( n )))= min (cid:16) (cid:15) ( n ) / , e · exp ( − B ( n )) (cid:17) . We have (cid:15) ( n ) / ∼ O ( n − ξ ) and e · exp ( − B ( n )) ∼ O ( n − ξ ), so that m ( f, n ) ∼ O ( n − ξ ). Furthermore, δ ( n ) ∼ O ( n − ξ/ ) →
0. The assumptionson the density function in Theorem 13 are therefore satisfied.The essence of the examples above is embodied in the following general result whoseproof follows along the path suggested by the examples.
Proposition 18. Assume that a continuous density supported on $\mathbb{R}^d$ is of the form (4.6) or (4.7) with a compact zero-density region $S$. For fixed $0 < \eta < \tfrac{1}{d}$ and $0 < \xi < 1 - \eta d$, we can find a sequence $\epsilon(n) \sim O(n^{-\psi})$, $0 < \psi \le \eta$, and a decreasing sequence $\delta(n)$, with $\lim_{n\to\infty}\delta(n) = 0$, such that

$$M(f) := \sup_{w \in \mathbb{R}^d} f(w) \in (0,\infty),$$
$$m(f,n) := \min_{w \in S^{1-\delta(n)}_f \setminus B_{\epsilon(n)}(S)} f(w) \in (0,\infty) \sim O(n^{-\xi}), \quad 0 < \xi < 1 - \eta d.$$

Proof.
That M ( f ) := sup w ∈ R d f ( w ) ∈ (0 , ∞ ) is a direct consequence of the assump-tions about the form of (4.6) or (4.7).We want to show that we can construct a decreasing sequence δ ( n ) , with lim n →∞ δ ( n ) =0, such that the corresponding 1 − δ ( n ) supports S − δ ( n ) f := [ − B ( n ) , B ( n )] d have min-imal values m ( f, n ) ∼ O ( n − ξ ). First, we choose the sequences B ( n ) and (cid:15) ( n ), whichjointly determine the desired decay rate of m ( f, n ). Then, if B ( n ) is an increasingsequence, by the definition of S − δ ( n ) f as a cube, δ ( n ) is automatically a decreasingsequence.To determine B ( n ), we use a sandwich argument on the minimal value m ( f, n )attained on S − δ ( n ) f . We will consider two sequences of balls, one where each ball iscontained in S − δ ( n ) f , called the sequence of inner tangential balls B ∗ ( n ), and the otherwhere each ball contains S − δ ( n ) f , called the sequence of outer inclusive balls B ∗ ( n ): B ∗ ( n ) := B B ( n ) ( ) ≡ { x ∈ R d | d ( x , ) < B ( n ) } , (4.8) B ∗ ( n ) := B √ dB ( n ) ( ) ≡ { x ∈ R d | d ( x , ) < √ dB ( n ) } . (4.9) SYMPTOTICS OF LOWER DIMENSIONAL ZERO-DENSITY REGIONS Consider the following minimal values of f on B ∗ ( n ) \ B (cid:15) ( n ) ( S ) and B ∗ ( n ) \ B (cid:15) ( n ) ( S ), m ∗ ( f, n ) = inf w ∈B ∗ ( n ) \ B (cid:15) ( n ) ( S ) f ( w ) ,m ∗ ( f, n ) = inf w ∈B ∗ ( n ) \ B (cid:15) ( n ) ( S ) f ( w ) . Since B ∗ ( n ) ⊂ S − δ ( n ) f ⊂ B ∗ ( n ) by the above construction, we have m ∗ ( f, n ) ≥ m ( f, n ) ≥ m ∗ ( f, n ). When f is of the form (4.6) or (4.7), we can choose the sequence (cid:15) ( n ) = (cid:16) C · n − ξ (cid:17) /γ where γ ≥ γ > < ξγ ≤ η . We use this choice of (cid:15) ( n ) to en-sure that (cid:15) ( n ) ∼ O ( n − ψ ) ∼ O ( n − ξγ ), and 0 < ψ ≤ η hold. Note that (cid:15) ( n ) is a decreasingsequence by definition. 
(Polynomial tail) When f is of the form (4.6), we use the (cid:15) ( n ) above and chooseanother sequence B ( n ) = (cid:16) C · n ξ (cid:17) − /χ and take n sufficiently large so that (cid:15) ( n ) < B ( n ) and (cid:15) < B ( n ). Then the minimal values for B ∗ ( n ) satisfy m ∗ ( f, n ) ≤ min ( C · (cid:15) ( n ) γ , C · ( B ( n ) − (cid:15) ( n )) χ ) ∼ O (min ( C · (cid:15) ( n ) γ , C · B ( n ) χ )) ∼ O (cid:16) min (cid:16) C · n − γγ ξ , n − ξ (cid:17)(cid:17) . Since γ/γ ≤ n > m ∗ ( f, n ) ∼ O ( n − ξ ). Similarly, theminimal values for B ∗ ( n ) satisfy m ∗ ( f, n ) ≥ min (cid:16) C · (cid:15) ( n ) γ , C · (4 √ dB ( n ) + (cid:15) ( n )) χ (cid:17) ∼ O (min ( C · (cid:15) ( n ) γ , C · B ( n ) χ )) ∼ O ( n − ξ ) . (Exponential tail) When f is of the form (4.7), we use the (cid:15) ( n ) above and chooseanother sequence B ( n ) = β log (cid:16) C · n − ξ (cid:17) and take n sufficiently large so that (cid:15) ( n ) < B ( n ) and (cid:15) < B ( n ). Then the minimal values for B ∗ ( n ) satisfy m ∗ ( f, n ) ≤ min ( C · (cid:15) ( n ) γ , C · exp( β ( B ( n ) − (cid:15) ( n )))) ∼ O (min ( C · (cid:15) ( n ) γ , C · exp( βB ( n )))) ∼ O (cid:16) min (cid:16) C · n − γγ ξ , n − ξ (cid:17)(cid:17) . Since γ/γ ≤ n > m ∗ ( f, n ) ∼ O ( n − ξ ). Similarly, theminimal values for B ∗ ( n ) satisfy m ∗ ( f, n ) ≥ min (cid:16) C · (cid:15) ( n ) γ , C · exp( β (4 √ dB ( n ) + (cid:15) ( n ))) (cid:17) ∼ O (min ( C · (cid:15) ( n ) γ , C · exp ( βB ( n )))) ∼ O ( n − ξ ) . By a sandwich argument with B ∗ ( n ) ⊂ S − δ ( n ) f ⊂ B ∗ ( n ) and m ∗ ( f, n ) ≥ m ( f, n ) ≥ m ∗ ( f, n ), we know that we can find some √ d · (cid:16) C · n ξ (cid:17) − /χ ≤ B ( n ) ≤ (cid:16) C · n ξ (cid:17) − /χ for a density of the form (4.6) or some √ d · β log (cid:16) C · n − ξ (cid:17) ≤ B ( n ) ≤ β log (cid:16) C · n − ξ (cid:17) for adensity of the form (4.7). 
This choice of an increasing sequence $B(n)$ proves our claim that we can find a sequence $\delta(n)$, with $\lim_{n\to\infty}\delta(n) = 0$, such that $M(f) \in (0,\infty)$ and $m(f,n) \sim O(n^{-\xi})$.

Corollary 19. Consider a continuous density supported on $\mathbb{R}^d$ of the form (4.6) or (4.7). We can find a sequence $\epsilon(n) \sim O(n^{-\psi})$, $0 < \psi \le \eta$, and a decreasing sequence of $\delta(n)$ values, with $\lim_{n\to\infty}\delta(n) = 0$, such that the associated sequence of $(1-\delta(n))$-supports $S^{1-\delta(n)}_f$ have corresponding coverings $\mathcal{B}^d_{r(n)}$ satisfying A.5' and A.6'. Then:

(A) If $\eta$ and $\psi$ satisfy $1 - \eta d - \overline{K}_f \cdot \psi > 0$, we have $\lim_{n\to\infty} P(\text{no empty } \epsilon(n)\text{-outside balls}) = 1$.

(B) If $\eta$ and $\psi$ satisfy $1 + d_0\eta - \underline{K}_f\,\eta - d\eta < 0$, we have $\lim_{n\to\infty} P(\text{all } \epsilon(n)\text{-inside balls are empty}) = 1$.

Proof.
By the result in Proposition 18, it suffices to consider the grid construction in Lemma 8. The cardinality of the resulting coverings $\mathcal{B}^d_{r(n)}$ of $S^{1-\delta(n)}_f$ can be guaranteed to be $O\left(\frac{B(n)^d}{r(n)^d}\right)$. So, for case (4.6) we can have $\left|\mathcal{B}^d_{r(n)}\right| \sim O\left(\left(C \cdot n^{-\xi}\right)^{d/\chi} \cdot n^{\eta d}\right)$ and for case (4.7) we can have $\left|\mathcal{B}^d_{r(n)}\right| \sim O\left(\beta^{-d}\log^{d}\left(C \cdot n^{-\xi}\right) \cdot n^{\eta d}\right)$. In either case, $\left|\mathcal{B}^d_{r(n)}\right|$ is $o(n^{\Omega})$ for some sufficiently large $\Omega > 0$. Therefore all assumptions in Theorems 14 and 15 are met.
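For a concrete choice of rates, the sufficient conditions (A) and (B) reduce to simple arithmetic. The helper below is a hypothetical sketch, not part of the paper: it assumes that the larger smoothness order (`k_upper`) enters condition (A) and the smaller one (`k_lower`) enters condition (B), and it writes `d0` for the dimension of $S$.

```python
def check_sufficient_conditions(eta, psi, d, d0, k_upper, k_lower):
    """Evaluate the rate constraints and conditions (A) and (B).

    eta, psi : decay exponents of r(n) ~ n^(-eta) and eps(n) ~ n^(-psi)
    d, d0    : ambient dimension and dimension of the zero-density region S
    k_upper  : smoothness order assumed to enter condition (A)
    k_lower  : smoothness order assumed to enter condition (B)
    """
    rates_ok = (0 < eta < 1.0 / d) and (0 < psi <= eta)
    cond_a = 1 - eta * d - k_upper * psi > 0              # no empty outside balls
    cond_b = 1 + d0 * eta - k_lower * eta - d * eta < 0   # inside balls all empty
    return {"rates_ok": rates_ok, "A": cond_a, "B": cond_b}


if __name__ == "__main__":
    # One-dimensional illustration: S = {0}, so d = 1 and d0 = 0,
    # with an assumed common smoothness order of 1/2.
    print(check_sufficient_conditions(0.7, 0.3, d=1, d0=0, k_upper=0.5, k_lower=0.5))
```

In this illustration condition (B) forces $\eta > 2/3$ while the rate constraint keeps $\eta < 1$, so the admissible window for the radius exponent is narrow; the helper makes it easy to see when a proposed pair $(\eta, \psi)$ falls outside it.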
5. Simulations and Connections to Other Areas.
Simulation Studies for the choice of r, (cid:15) . Our theoretical results establish therate of decay for the radius r ( n ) and the neighborhood width (cid:15) ( n ). That is, for anypositive constants M r and M (cid:15) , r ( n ) = M r · O ( n − η ) ,(cid:15) ( n ) = M (cid:15) · O ( n − ψ ) . (with η and ψ following the same notation used in Theorem 10) the asymptoticguarantees of filled and empty balls hold. These guarantees allow one to identify lowerdimensional zero-density regions in the limit. However, for a fixed sample size, theactual values of r ( n ) and (cid:15) ( n ) matter. This subsection reports simulations investigatingchoice of the constants M r and M (cid:15) .We consider the density f ( x ) ∝ d ( x , S ) ◦ [0 , .The zero-density region S = { } × [0 . , .
75] is of strictly lower dimension. For thisexample, m ( f, n ) decays as a polynomial with ξ = 4 · ψ and M ( f ) <
2. The conditions ofTheorem 10, parts (A) and (B), hold with, for example, η = 0 .
21 and ψ = 0 .
01 (therefore0 < ξ ≤ − η ).In the simulation, random samples of various sizes are generated from the density.A grid of balls with centers on a lattice is used to cover the unit square. The centers ofthe balls depend on M r and n and follow the rule we described in Proposition 9. We SYMPTOTICS OF LOWER DIMENSIONAL ZERO-DENSITY REGIONS Figure 5.1 . The percentage of various types of non-empty covering balls under different multipliers M r and M (cid:15) for ball radius r ( n ) of the covering and neighborhood width of S , (cid:15) ( n ) , for varioussample sizes n . note that the fraction of filled balls of a particular kind serves as an empirical estimateof the mean probability that a particular kind of ball is non-empty.Figure 5.1 presents the results of the simulation. The color scheme is the same asin Figure 3.2. Proceeding down a column, the sample size changes from 100 to 10,000.Tracking the red lines, we see that the percentage of filled (cid:15) -inside balls tends to 0 as n increases. The orange lines represent the (cid:15) -neighboring balls. The percentage of thesethat are filled does not always tend to 0. The green lines represent the (cid:15) -outside balls.Here, the fraction filled increases toward the eventual limit of 1.As we step across rows of the figure, the value of M (cid:15) increases from 0 .
05 to 0 . M r increases from 0 .
05 to 0 .
40. As $r(n)$ increases, the balls are larger and a greater fraction are filled, as evidenced by the lines within each plot. As $M_\epsilon$ increases, balls in the covering are moved from $\epsilon$-outside to $\epsilon$-neighboring to $\epsilon$-inside. The changing values of $\epsilon(n)$ and $r(n)$ shuffle the group membership of the balls and change the resulting fractions. In general, the larger values of $M_r$ that we have investigated lead to greater separation of the red and green lines, and hence to greater differentiation between $\epsilon$-inside and $\epsilon$-outside balls. Larger values of $M_\epsilon$ have a similar, though weaker, effect.

The simulation also provides a caution. The plots toward the bottom of the figure show much greater separation between the red and green lines. This separation (near 0, near 1) is needed in order to reliably detect a zero-density region. For this density, the larger sample sizes prove much more effective than do the smaller sample sizes.

The simulation results suggest an interesting possibility: the use of multiple coverings with balls of different sizes. Doing so could allow one to examine the set of $(M_r, M_\epsilon)$ pairs that suggest the presence of a zero-density region. In practice, when $n$ is finite, the choice of multipliers $M_r$ and $M_\epsilon$ impacts the results.

5.2. Connections to Existing Research.
Our object of interest in this paper is the inverse image of $\{0\}$ under the continuous density function $f$. In the existing literature, Cuevas [6] and Rigollet and Vert [21] study the related concept of the $\lambda$-level set for a density $f$, that is, the set $S_\lambda := \{x \in \mathbb{R}^d \mid f(x) > \lambda\}$, $\lambda > 0$. The special case where $\lambda = 0$ is studied in Devroye and Wise [7]. The difference between the regions is that $S_\lambda$ is the inverse image of an open set and contains positive mass if non-empty, while $S := \{x \in \mathbb{R}^d \mid f(x) = 0\}$ is the inverse image of a closed set and contains no mass. To the best of our knowledge, regions like $S$ have not previously been studied.

In the level set estimation literature, there are two major approaches to the problem of estimating level sets. Suppose we are interested in estimating the level sets of a density $f$ (corresponding to a probability measure $F$). One approach is to construct plug-in estimators $\hat{S}_\lambda := \{x \in \mathbb{R}^d \mid \hat{f}(x) > \lambda\}$ for a level set, based on a kernel density estimator $\hat{f}$ computed from the data and appropriate choices of the bandwidth parameters for the kernel [21, 17, 22].

The other approach is based on the empirical excess-mass functional. The (empirical) excess-mass functional $E(\lambda) := F(S_\lambda) - \lambda\,\mathrm{Leb}(S_\lambda)$ measures how the "excess probability mass" of the probability measure $F$ distributes over the region $S_\lambda$ when compared to Lebesgue measure [11]. If we replace the set $S_\lambda$ with a set estimator $\hat{S}_\lambda$ for the $\lambda$-level set $S_\lambda$, we can similarly consider the functional $H(\lambda) := F(\hat{S}_\lambda) - \lambda\,\mathrm{Leb}(\hat{S}_\lambda)$, which can be used to evaluate the estimator $\hat{S}_\lambda$. The functional $H$ is maximized over a class of sets to obtain the level set estimator $\hat{S}_\lambda$ [20, 24]. The consistency and asymptotic behavior of both approaches have been derived under regularity assumptions.

In this paper, we study the lower dimensional object that arises as the inverse image, under the density function $f$, of a single-point set. When we restrict our concern to the manifold $\mathrm{supp}(f)$ and the inverse of the density function $f^{-1}$ is sufficiently smooth, $f^{-1}(\{a\})$ and $f^{-1}(\{b\})$ encode the boundary of the manifold defined by the inverse image $f^{-1}((a,b))$, as a sub-manifold of $\mathrm{supp}(f)$.
When $a = \lambda$ and $b = \infty$, this inverse image defines $\lambda$-level sets. The object $S = f^{-1}(\{0\})$ is the boundary in the particular case where $a = 0$ and $b = \infty$.

Our method detects this specific kind of lower dimensional topological feature, which could arise as a manifold boundary. We construct a sequence of covering families to detect $S$ and we derive a set of sufficient conditions that ensure consistent detection. As is to be expected, when more sample points are available, our covering scheme locates the zero-density region $S$ more accurately. In applied scenarios where the boundaries arise as zero-density regions of certain density functions, our method could help in detection [15].

The main results in this paper also exhibit the relation between topological features that arise as $S$ and their dimensionality. In Adler, Bobrowski and Weinberger [1] and Owada and Adler [18], the authors observe that higher dimensional topological features are generally smaller in scale. As shown by the sufficient conditions in Theorems 10 and 13, when the dimension $d_0$ of $S$ is higher, we need to specify a faster decay rate for the radii $r(n)$ of the covering sequence in order to meet the sufficient conditions. This interplay between the dimension of the support, the dimension of the zero-density region $S$ and the sufficient growth rates supports the observation of Adler, Bobrowski and Weinberger [1] from a different angle.
6. Conclusions.
In this paper, we consider the question of detecting $S = f^{-1}(\{0\})$ lying in the support $M$ of a continuous density $f$ when $S$ has Minkowski dimension strictly lower than the dimension of the data, $d$. This type of topological feature is difficult to identify and has not been studied before. The main contribution of this paper is to provide a novel approach, based on a sequence of coverings tied to growing sample sizes, to study a specific kind of lower dimensional object $S$ in the support of a density function. This approach works under both compact and non-compact support and its construction has geometric intuition. Being of lower dimension, $S$ is a delicate object. It is in the closure of $M \setminus S$ and, in a sense, disappears when one looks at it at too coarse a resolution.

Our strategy is to construct a sequence of covering balls (e.g., Lemma 8) and shrink their radius as the sample size $n$ goes to infinity. We derive a set of sufficient conditions, using the local behavior of $f$, captured by $\overline{K}_f$ and $\underline{K}_f$, that ensure that a particular covering scheme leads to the probability-one consistency results in Theorem 10 (compact support) and Theorem 13 (non-compact support). The theorems in Sections 3 and 4 can be generalized to the case where $S$ has multiple disconnected components. As the sample size $n$ tends to $\infty$, a shrinking $\epsilon$-neighborhood of $S$ can be identified by empty covering balls asymptotically with probability one, while the rest of $M$ will be covered by non-empty balls.

Our result provides a range of asymptotic schemes for the radius of covering balls under which the lower dimensional topological feature can be detected. We provide experimental evidence to support our claim in Section 5. Our approach supports the insight that topological features of different dimensions occur at different scales.
The novelty of our result is the role of the ambient dimension of the data in addition to the local behavior of $f$ near $S$.

Our approach and results focus on the connection between i.i.d. samples and the near-topological features of the support of the density from which they are drawn. As future work, it is of great interest to generalize the results to dependent draws from the density, with the eventual goal of understanding how probabilistic dependence in the sample can be useful in the construction of complexes based on sub-samples (e.g. witness complexes) used in TDA [16].

APPENDIX A: PROOF OF MAIN THEOREM

Theorem 10 is established by providing a bound on the probability mass of a covering ball $B_r(x)$, counting the number of each type of covering ball under consideration and taking a limit as the sample size $n \to \infty$. A sequence of upper bounds is needed for $\epsilon$-inside balls and a sequence of lower bounds is needed for $\epsilon$-outside balls. In the statement of the theorem, we assume that

$$r(n) \sim O(n^{-\eta}), \quad 0 < \eta < \tfrac{1}{d}, \qquad \epsilon(n) \sim O(n^{-\psi}), \quad 0 < \psi \le \eta, \qquad \text{and} \quad 2r(n) \le \epsilon(n) < 1.$$

To establish the bounds on the probability mass $P(B_r(x)) = \int_{B_r(x)} f(w)\,dw$ of a ball $B_r(x)$, it suffices to consider the volume of the ball and a bound on the density function over the ball. The volume of a $d$-dimensional ball of radius $r$ is

$$V_d(r) = \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2} + 1\right)}\, r^d, \tag{A.1}$$

which for our sequence of radii $r(n)$ is $V_d(r(n)) \sim O(n^{-d\eta})$. Recall the definitions of $\epsilon$-inside and $\epsilon$-outside covering balls: $\epsilon$-inside balls are those balls $B_r(x)$ such that $x \in B_\epsilon(S)$ and $B_r(x) \cap S \ne \emptyset$; $\epsilon$-outside balls are those balls $B_r(x)$ such that $x \notin B_\epsilon(S)$.

Outside $B_{\epsilon(n)}(S)$ the density is bounded below by $m(f,n) \sim O(n^{-\xi})$.
Inside B (cid:15) ( n ) ( S ) the density is bounded below by the inequality in the Assumption A.4: L f · d ( x , S ) K f ( (cid:15) ( n )) ≤ f ( x ) ≤ U f · d ( x , S ) K f ( (cid:15) ( n )) . (A.2)• Upper bound for an (cid:15) ( n ) -inside covering ball. Since the (cid:15) ( n )-inside ball intersects S , i.e. B r ( n ) ( x ) ∩ S = ∅ , we know that any point y ∈ B r ( n ) ( x ) will be at most 2 r ( n ) ≤ (cid:15) ( n ) away from the zero-density region S . Theprobability mass contained in an (cid:15) ( n )-inside covering ball B r ( n ) ( x ) is bounded fromabove by P ( B r ( n ) ( x )) ≤ U f · V d ( r ( n )) · (2 r ( n )) K f ( (cid:15) ( n )) (A.3)There are two types of (cid:15) ( n )-outside balls: those that are entirely contained in M = [0 , d and those that are only partially contained in M . Probability mass P ( B r ( n ) ( x )) = R B r ( n ) ( x ) f ( w ) d w is bounded from below by volume of the ball timesthe minimum of the density f ( x ) in the ball.• Lower bound for an (cid:15) ( n ) -outside covering ball that lies within M . From the assumption on density f , m ( f, n ) := min w ∈ M \ B (cid:15) ( S ) f ( w ) ∈ (0 , ∞ ) ∼ O ( n − ξ ) , < ξ < − ηd . The probability mass contained in the (cid:15) ( n )-outside covering ball is bounded frombelow by P ( B r ( n ) ( x )) ≥ V d ( r ( n )) · min h L f · ( (cid:15) ( n ) − r ( n )) K f ( (cid:15) ( n )) , m ( f, n ) i . (A.4)• Lower bound for an (cid:15) ( n ) -outside covering ball that does not lie entirely within M . Assumption A.5 states that the center of the covering ball B r ( n ) ( x ) is in M . We knowthat the volume of such an (cid:15) ( n )-outside ball B r ( n ) ( x ) will be at least ( ) d times thevolume V d ( r ( n )) of a full d -dimensional ball. 
With the same $m(f,n) > 0$,
\[
P(B_{r(n)}(x)) \ge \left(\frac{1}{2}\right)^d V_d(r(n)) \cdot \min\left[ L_f \cdot (\epsilon(n) - r(n))^{K_f(\epsilon(n))},\; m(f,n) \right]. \tag{A.5}
\]
This lower bound also holds for $\epsilon(n)$-outside balls that are entirely contained in $M$.

In the following discussion, we use the notation $p_{B_{r(n)}(x)} := P(B_{r(n)}(x))$ to emphasize that we use it as a parameter.

Part (A) (Outside balls).
Consider regular families of covering balls $\mathcal{B}^d_{r(n)}$ of $\mathrm{supp}(f)$ and a sequence of $\epsilon(n)$-neighborhoods of the zero-density region $S$ where $\dim S < d$. The Hoeffding concentration inequality for $X \sim \mathrm{Binom}(n,p)$ can be written, for arbitrary $\gamma > 0$, as
\[
P\left\{ (p-\gamma)n \le X \le (p+\gamma)n \right\} \ge 1 - 2\exp\left(-2\gamma^2 n\right). \tag{A.6}
\]
The inequality ensures that the number of observations falling into a single ball $B_{r(n)}(x)$ will be close to its expectation. In a single ball, the number $N_{B_{r(n)}(x)}$ of observations falling in the $d$-dimensional ball of radius $r(n)$ follows a binomial distribution $\mathrm{Binom}(n, p_{B_{r(n)}(x)})$. Also,
\[
P(B_{r(n)}(x) \text{ not empty}) = P\left(N_{B_{r(n)}(x)} \ge 1\right).
\]

By the assumption on the range of $\eta$, the lower bounds (A.4) and (A.5) under the different situations are determined by the factor $r(n)^d$. But note that $r(n)^d \sim O(n^{-d\eta})$ with $-d\eta > -1$, so we can choose $N(x) \in \mathbb{N}$ large enough such that $N(x) \cdot p_{B_{r(n)}(x)} \ge 1$, ensuring $n \cdot p_{B_{r(n)}(x)} \ge 1$ for $n \ge N(x)$. We take $N(x) = 1/p_{B_{r(n)}(x)}$ and observe that $N(x) \sim O(n^{d\eta})$ with $d\eta < 1$, rendering such a choice of $n \ge N(x)$ possible. Assume in the arguments below that $n \ge N(x)$.

For an $\epsilon(n)$-outside ball $B_{r(n)}(x)$, we can rewrite the probability using (A.6) with $X = N_{B_{r(n)}(x)}$, $p = p_{B_{r(n)}(x)}$ and $\gamma = \frac{1}{2} p_{B_{r(n)}(x)}$:
\[
P(B_{r(n)}(x) \text{ not empty}) \ge P\left( \left| N_{B_{r(n)}(x)} - n p_{B_{r(n)}(x)} \right| < \tfrac{1}{2}\, n \cdot p_{B_{r(n)}(x)} \right) \ge \left[ 1 - 2\exp\left( -\tfrac{1}{2}\, p_{B_{r(n)}(x)}^2 \cdot n \right) \right]. \tag{A.7}
\]
The first inequality holds because, on this event, $N_{B_{r(n)}(x)} > \frac{1}{2} n \cdot p_{B_{r(n)}(x)} \ge \frac{1}{2}$ and $N_{B_{r(n)}(x)}$ is an integer.

From the discussion of the probability mass contained in each type of covering ball above, if $B_{r(n)}(x)$ is an $\epsilon(n)$-outside covering ball, then from formulas (A.4) and (A.5) we have
\[
P(B_{r(n)}(x) \text{ not empty}) \ge \left[ 1 - 2\exp\left( -\tfrac{1}{2} \left[ V_d(r(n)) \cdot \min\left[ L_f \cdot (\epsilon(n) - r(n))^{K_f(\epsilon(n))},\; m(f,n) \right] \right]^2 \cdot n \right) \right]. \tag{A.8}
\]
If $\min\left[ L_f \cdot (\epsilon(n) - r(n))^{K_f(\epsilon(n))},\; m(f,n) \right] = m(f,n)$, then
\[
P(B_{r(n)}(x) \text{ not empty}) \ge \left[ 1 - 2\exp\left( -\tfrac{1}{2} \left[ V_d(r(n)) \cdot m(f,n) \right]^2 \cdot n \right) \right].
\]
If $\min\left[ L_f \cdot (\epsilon(n) - r(n))^{K_f(\epsilon(n))},\; m(f,n) \right] = L_f \cdot (\epsilon(n) - r(n))^{K_f(\epsilon(n))}$, then
\[
P(B_{r(n)}(x) \text{ not empty}) \ge \left[ 1 - 2\exp\left( -\tfrac{1}{2} \left[ V_d(r(n)) \cdot L_f \cdot (\epsilon(n) - r(n))^{K_f(\epsilon(n))} \right]^2 \cdot n \right) \right].
\]

Consider the right hand side of (A.8), taking $n \to \infty$. If it holds that
\[
1 - \eta d - \xi > 0, \qquad 1 - \eta d - K_f \cdot \psi > 0, \tag{A.9}
\]
then the limit $\lim_{n \to \infty} P(B_{r(n)}(x) \text{ not empty})$ is 1. The first inequality is simply $\xi < 1 - \eta d \le 1$ (the second equality holds iff $d = 0$), as we assumed in the statement of the theorem, so it suffices to consider the second inequality.

We want to let $n > 1/p_{B_{r(n)}(x)}$ for all $x \in M \setminus B_{\epsilon(n)}(S)$. But $1/p_{B_{r(n)}(x)}$ is largest when the density over the ball is at its minimum $m(f,n) := \min_{w \in M \setminus B_{\epsilon(n)}(S)} f(w)$, i.e.
\[
\frac{1}{p_{B_{r(n)}(x)}} \le \frac{1}{V_d(r(n)) \cdot m(f,n)} \sim O(n^{\eta d + \xi}).
\]
As long as $\eta d + \xi < 1$, which is guaranteed by (A.9), we can ensure that $n$ is greater than the maximum value $\max_{w \in M \setminus B_{\epsilon(n)}(S)} 1/p_{B_{r(n)}(w)}$. Therefore, such an $n$ is greater than all the quantities $1/p_{B_{r(n)}(x)}$, $\forall x \in M \setminus B_{\epsilon(n)}(S)$, and so $n \cdot p_{B_{r(n)}(x)} \ge 1$ for all $x \in M \setminus B_{\epsilon(n)}(S)$.

By the union bound,
\[
\begin{aligned}
P(\text{no empty } \epsilon(n)\text{-outside balls})
&= P\left( \bigcap_{B \in \{\epsilon(n)\text{-outside balls}\}} \{B \text{ not empty}\} \right) \\
&= 1 - P\left( \bigcup_{B \in \{\epsilon(n)\text{-outside balls}\}} \{B \text{ empty}\} \right) \\
&\ge 1 - \sum_{B \in \{\epsilon(n)\text{-outside balls}\}} P(B \text{ empty}) \\
&\ge 1 - \left| \mathcal{B}^d_{r(n)} \right| \cdot \sup_{B \in \{\epsilon(n)\text{-outside balls}\}} P(B \text{ empty}).
\end{aligned} \tag{A.10}
\]
By equation (A.7) and the subsequent bound (A.8) we obtain that
\[
\begin{aligned}
P(\text{no empty } \epsilon(n)\text{-outside balls})
&\ge 1 - \left| \mathcal{B}^d_{r(n)} \right| \cdot \left( 1 - \inf_{B \in \{\epsilon(n)\text{-outside balls}\}} P(B \text{ not empty}) \right) \\
&\ge 1 - \left( L_1 \cdot n^{d\eta} \right) \cdot \left( L_2 \cdot \exp\left( -n^{1 - \eta d - K_f \cdot \psi} \right) \right)
\end{aligned} \tag{A.11}
\]
for sufficiently large $n$. The constants $L_1, L_2 \in (0,\infty)$ exist by the definition of the big $O$ notation for sufficiently large $n$. The last expression follows from $|\mathcal{B}^d_{r(n)}| \sim O(n^{d\eta})$ and (A.8). It is dominated by the second term in the product. Therefore, if $1 - \eta d - K_f \cdot \psi > 0$,
\[
\lim_{n \to \infty} P(\text{no empty } \epsilon(n)\text{-outside balls}) = 1.
\]

Part (B) (Inside balls).
We observe that the event “all $\epsilon(n)$-inside balls are empty” can be regarded as all observations falling into the other two types of covering balls. Note that, as stated in the theorem, under the assumptions in part (B) we only ensure that $\epsilon(n)$-inside covering balls are empty, not that every $\epsilon(n)$-outside ball contains at least one observation. We investigate the upper bound on the probability mass in each of these covering balls.

Since the observations are i.i.d. we have, using the notation $p_{\cup B} := P\left( \bigcup_{B \in \{\epsilon(n)\text{-inside balls}\}} B \right)$,
\[
P(\text{all } \epsilon(n)\text{-inside balls are empty}) = \left( 1 - p_{\cup B} \right)^n \ge \left( 1 - \sum_{B \in \{\epsilon(n)\text{-inside balls}\}} p_B \right)^n, \tag{A.12}
\]
where the inequality holds because the covering balls may overlap. Each $p_B$ is controlled by the upper bound above.

Assumption 3 asserts that the limit $K_f = \lim_{\epsilon \to 0^+} K_f(\epsilon)$ exists and that we can choose a bound $K_U \ge \lim_{\epsilon \to 0^+} K_f(\epsilon)$ uniformly. By the bounds (A.2) for $\epsilon(n)$-inside covering balls, there is a positive constant
\[
D_{f,d} := U_f \cdot \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2}+1\right)} \cdot 2^{K_U} \ge U_f \cdot \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2}+1\right)} \cdot 2^{K_f(\epsilon(n))}
\]
depending only on the density $f$ and the dimension $d$. The factor $\pi^{d/2} / \Gamma\left(\frac{d}{2}+1\right)$ comes from the multiplier of the volume of a $d$-dimensional ball (A.1). We denote by $\mathcal{B}^{d'}_{r(n)}$ the sub-collection of covering balls from $\mathcal{B}^d_{r(n)}$ that intersect $S$. Then
\[
\begin{aligned}
P(\text{all } \epsilon(n)\text{-inside balls are empty})
&\ge \left( 1 - \left| \mathcal{B}^{d'}_{r(n)} \right| \cdot \left[ U_f \cdot V_d(r(n)) \cdot (2r(n))^{K_f(\epsilon(n))} \right] \right)^n \\
&\ge \left( 1 - H_\varepsilon(n) \cdot \left[ D_{f,d} \cdot (r(n))^{K_f(\epsilon(n)) + d} \right] \right)^n.
\end{aligned} \tag{A.13}
\]
By Assumption A.6, the collection of covering balls $\mathcal{B}^{d'}_{r(n)}$ that intersect $S$ satisfies $|\mathcal{B}^{d'}_{r(n)}| \le H_\varepsilon(n) \sim O(n^{d'\eta(1+\varepsilon)})$ for any $\varepsilon > 0$. By the assumption for part (B), we have
\[
1 + d'\eta - K_f \eta - d\eta < 0, \tag{A.14}
\]
so there exists $\varepsilon > 0$ such that
\[
1 + d'\eta(1+\varepsilon) - K_f \eta - d\eta < 0. \tag{A.15}
\]

Lemma. (Bernoulli inequality) If $x > -1$, then $(1+x)^n \ge 1 + nx$ for all $n \in \mathbb{N}$.

Proof. Let us prove the lemma by induction. For $n = 0, 1$ the inequality holds trivially. Assume that $(1+x)^n \ge 1 + nx$ holds for every $n \le k \in \mathbb{N}$ and proceed by induction for $n = k+1$:
\[
\begin{aligned}
(1+x)^n &= (1+x) \cdot (1+x)^{n-1} = (1+x) \cdot (1+x)^k \\
&\ge (1+x) \cdot (1+kx) && \text{by the inductive hypothesis and the fact that } (1+x) > 0 \\
&= 1 + kx + x + kx^2 = 1 + (k+1) \cdot x + kx^2 \\
&\ge 1 + (k+1) \cdot x && \text{since } kx^2 \ge 0.
\end{aligned}
\]

We apply the lemma with
\[
x = -H_\varepsilon(n) \cdot \left[ D_{f,d} \cdot (r(n))^{K_f(\epsilon(n)) + d} \right],
\]
where $H_\varepsilon(n) \sim O(n^{d'\eta(1+\varepsilon)})$, so that $|x|$ is of order $O(n^{d'\eta(1+\varepsilon) - K_f \cdot \eta - d\eta})$ as $n \to \infty$. We emphasize again that $\varepsilon > 0$ is the constant in the bound $H_\varepsilon(n)$, while $\epsilon = \epsilon(n)$ is the quantity in Theorem 10 that defines the neighborhood size of $S$. These two are different quantities. Under the assumption
\[
1 + d'\eta(1+\varepsilon) - K_f \eta - d\eta < 0 \iff d'\eta(1+\varepsilon) - K_f \eta - d\eta < -1, \tag{A.16}
\]
we can take $n$ sufficiently large, say $n \ge N$, to ensure that $|x| < 1$ and hence $1 + x > 0$. Then we can apply the Bernoulli inequality to the right hand side of (A.13). With $H_\varepsilon(n) \sim O(n^{d'\eta(1+\varepsilon)})$ and (A.13), as $n \to \infty$,
\[
\begin{aligned}
P(\text{all } \epsilon(n)\text{-inside balls are empty})
&\ge \left( 1 - \left( L_3 \cdot n^{d'\eta(1+\varepsilon)} \right) \cdot \left[ D_{f,d} \cdot (r(n))^{K_f + d} \right] \right)^n \\
&\ge \left( 1 - n \cdot \left( L_3 \cdot n^{d'\eta(1+\varepsilon)} \right) \cdot \left[ D_{f,d} \cdot (r(n))^{K_f + d} \right] \right) \\
&\ge \left( 1 - L_{f,d} \cdot n^{1 + d'\eta(1+\varepsilon) - K_f \eta - d\eta} \right)
\end{aligned} \tag{A.17}
\]
for sufficiently large $n$. The constant $L_3 \in (0,\infty)$ exists by the definition of the big $O$ notation for sufficiently large $n$. If we take the limit $n \ge N$, $n \to \infty$ on both sides of the inequality, the quantity $\left( 1 - L_{f,d} \cdot n^{1 + d'\eta(1+\varepsilon) - K_f \eta - d\eta} \right)$ converges to 1 by (A.15). Therefore, if $1 + d'\eta - K_f \eta - d\eta < 0$,
\[
\lim_{n \to \infty} P(\text{all } \epsilon(n)\text{-inside balls are empty}) = 1.
\]

Acknowledgements.
This material is based upon work supported by the National Science Foundation under Grants No. DMS-1613110 and No. SES-1921523.

REFERENCES

[1] Adler, R. J., Bobrowski, O. and Weinberger, S. (2013). Crackle: The persistent homology of noise. arXiv:1301.1466.
[2] Bauer, U. and Pausinger, F. (2018). Persistent Betti numbers of random Čech complexes. arXiv:1801.08376.
[3] Bobrowski, O. and Kahle, M. (2017). Topology of random geometric complexes: a survey. arXiv:1409.4734.
[4] Bürgisser, P. and Cucker, F. (2013). Condition: The Geometry of Numerical Algorithms. Berlin: Springer.
[5] Carlsson, G. (2009). Topology and data. Bulletin of the American Mathematical Society.
[6] Cuevas, A. (2009). Set estimation: Another bridge between statistics and geometry. Boletín de Estadística e Investigación Operativa (BEIO).
[7] Devroye, L. and Wise, G. L. (1980). Detection of abnormal behavior via nonparametric estimation of the support. SIAM Journal on Applied Mathematics.
[8] Duy, T. K., Hiraoka, Y. and Shirai, T. (2016). Limit theorems for persistence diagrams. arXiv:1612.08371.
[9] Edelsbrunner, H. and Harer, J. (2010). Computational Topology: an Introduction. American Mathematical Society.
[10] Falconer, K. (2004). Fractal Geometry: Mathematical Foundations and Applications. New York: John Wiley & Sons.
[11] Hartigan, J. A. (1987). Estimation of a convex density contour in two dimensions. Journal of the American Statistical Association.
[12] Kahle, M. (2009). Topology of random clique complexes. Discrete Mathematics.
[13] Kahle, M. and Meckes, E. (2013). Limit theorems for Betti numbers of random simplicial complexes. Homology, Homotopy and Applications.
[14] Kalisnik, S., Lehn, C. and Limic, V. (2019). Geometric and probabilistic limit theorems in topological data analysis. arXiv:1903.00470.
[15] Luo, H. and Strait, J. (2019). Combining geometric and topological information in image segmentation. arXiv:1910.04778.
[16] Luo, H., Patania, A., Kim, J. and Vejdemo-Johansson, M. (2020+). Generalized penalty for circular coordinate representation. Submitted.
[17] Mason, D. M. and Polonik, W. (2009). Asymptotic normality of plug-in level set estimates. The Annals of Applied Probability.
[18] Owada, T. and Adler, R. J. (2017). Limit theorems for point processes under geometric constraints (and topological crackle). The Annals of Probability.
[19] Penrose, M. (2003). Random Geometric Graphs. Oxford: Oxford University Press.
[20] Polonik, W. (1995). Measuring mass concentrations and estimating density contour clusters – an excess mass approach. The Annals of Statistics.
[21] Rigollet, P. and Vert, R. (2009). Optimal rates for plug-in estimators of density level sets. Bernoulli.
[22] Rinaldo, A. and Wasserman, L. (2010). Generalized density clustering. The Annals of Statistics.
[23] Talagrand, M. (2014). Upper and Lower Bounds for Stochastic Processes: Modern Methods and Classical Problems. Heidelberg: Springer.
[24] Walther, G. (1997). Granulometric smoothing. The Annals of Statistics 25(6).