2D Fractional Cascading on Axis-aligned Planar Subdivisions
Peyman Afshani∗ and Pingan Cheng∗
Department of Computer Science, Aarhus University, Denmark
{peyman, pingancheng}@cs.au.dk
November 30, 2020
Abstract
Fractional cascading is one of the influential and important techniques in data structures, as it provides a general framework for solving a common important problem: the iterative search problem. In the problem, the input is a graph $G$ with constant degree. Also as input, we are given a set of values for every vertex of $G$. The goal is to preprocess $G$ such that when we are given a query value $q$ and a connected subgraph $\pi$ of $G$, we can find the predecessor of $q$ in all the sets associated with the vertices of $\pi$. The fundamental result of fractional cascading, by Chazelle and Guibas, is that there exists a data structure that uses linear space and it can answer queries in $O(\log n + |\pi|)$ time, at essentially constant time per predecessor [15]. While this technique has received plenty of attention in the past decades, an almost quadratic space lower bound for “two-dimensional fractional cascading” by Chazelle and Liu in STOC 2001 [17] has convinced the researchers that fractional cascading is fundamentally a one-dimensional technique.

In two-dimensional fractional cascading, the input includes a planar subdivision for every vertex of $G$ and the query is a point $q$ and a subgraph $\pi$ and the goal is to locate the cell containing $q$ in all the subdivisions associated with the vertices of $\pi$. In this paper, we show that it is actually possible to circumvent the lower bound of Chazelle and Liu for axis-aligned planar subdivisions. We present a number of upper and lower bounds which reveal that in two dimensions, the problem has a much richer structure. When $G$ is a tree and $\pi$ is a path, then queries can be answered in $O(\log n + |\pi| + \min\{|\pi|\sqrt{\log n}, \alpha(n)\sqrt{|\pi|}\log n\})$ time using linear space, where $\alpha$ is an inverse Ackermann function; surprisingly, we show both branches of this bound are tight, up to the inverse Ackermann factor.
When $G$ is a general graph or when $\pi$ is a general subgraph, then the query bound becomes $O(\log n + |\pi|\sqrt{\log n})$ and this bound is once again tight in both cases.

∗ Supported by DFF (Det Frie Forskningsråd) of Danish Council for Independent Research under grant ID DFF − −.

Fractional cascading [15] is one of the widely used tools in data structures as it provides a general framework for solving a common important problem: the iterative search problem, i.e., the problem of finding the predecessor of a single value $q$ in multiple data sets. In the problem, we are to preprocess a degree-bounded “catalog” graph $G$ where each vertex represents an input set of values from a totally ordered universe $U$; the input sets of different vertices of $G$ are completely unrelated. Then, at the query time, given a value $q \in U$ and a connected subgraph $\pi$ of $G$, the goal is to find the predecessor of $q$ in the sets that correspond to the vertices of $\pi$. The fundamental theorem of fractional cascading is that one can build a data structure of linear size such that the queries can be answered in $O(\log n + |\pi|)$ time, essentially giving us constant search time per predecessor after investing an initial $O(\log n)$ search time [15]. Many problems benefit from this technique [16] since they need to solve the iterative search problem as a base problem.

Given its importance, it is not surprising that many have attempted to generalize this technique: The first obvious direction is to consider the dynamic version of the problem by allowing insertions or deletions into the sets of the vertices of $G$. In fact, Chazelle and Guibas themselves consider this [15] and they show that with $O(\log n)$ amortized time per update, one can obtain $O(\log n + |\pi|\log\log n)$ query time. Later, Mehlhorn and Näher improve the update time to $O(\log\log n)$ amortized time [26] and then Dietz and Raman [22] remove the amortization.
There is also some attention given to optimize the dependency of the query time on the maximum degree of graph $G$ [23].

The next obvious generalization is to consider the higher dimensional versions of the problem. Here, each vertex of $G$ is associated with an input subdivision and the goal is to locate a given query point $q$ on every subdivision associated with the vertices of $\pi$. Unfortunately, here we run into an immediate roadblock already in two dimensions: After listing a number of potential applications of two-dimensional fractional cascading, Chazelle and Liu [17] “dash all such hopes” by showing an $\tilde{\Omega}(n^2)$ space lower bound in the pointer-machine model for any data structure that can answer queries in $O(\log^{O(1)} n + |\pi|)$ time. Note that this lower bound can be generalized to also give an $\Omega(n^{2-\varepsilon})$ space lower bound for data structures with $O(\log^{O(1)} n) + o(|\pi|\log n)$ query time. As far as we can tell, progress in this direction was halted due to this negative result since the trivial solution already gives the $O(|\pi|\log n)$ query time, by just building individual point location data structures for each subdivision.

We observe that the lower bound of Chazelle and Liu does not apply to orthogonal subdivisions, a very important special case of the planar point location problem. Many geometric problems need to solve this base problem, e.g., 4D orthogonal dominance range reporting [3, 4], 3D point location in orthogonal subdivisions [27], some 3D vertical ray-shooting problems [21]. In geographic information systems, it is very common to overlay planar subdivisions describing different features of a region to generate a complete map. Performing point location queries on such maps corresponds to iterative point locations on a series of subdivisions.

Motivated by this observation, we systematically study the generalization of fractional cascading to two dimensions, when restricted to orthogonal subdivisions.
We obtain a number of interesting results, including both upper and lower bounds which show most of our results are tight except for the general path queries of trees where the bound is tight up to a tiny inverse Ackermann factor [18].

The problem definition
The formal definition of the problem is as follows. The input is a degree-bounded connected graph $G = (V, E)$ where each vertex $v \in V$ is associated with an axis-aligned planar subdivision. Let $n$ be the total number of vertices, edges, and faces in the subdivisions, which we call the graph subdivision complexity. We would like to build a data structure such that given a query $(q, \pi)$, where $q$ is a query point and $\pi$ is a connected subgraph of $G$, we can locate $q$ in all the subdivisions induced by vertices of $\pi$ efficiently. We call this problem 2D Orthogonal Fractional Cascading (2D OFC).

All logs are base 2 unless otherwise specified. The $\tilde{\Omega}$ notation hides polylogarithmic factors.

While the negative result of Chazelle and Liu [17] stops any progress on the general problem of two-dimensional fractional cascading, there have been other results that can be seen as special cases of two-dimensional fractional cascading. For example, Chazelle et al. [13] improved the … $n$ factor. In a “geodesically triangulated” subdivision of $n$ vertices, they showed it is possible to locate all the triangles crossed by a ray in $O(\log n)$ time instead of $O(\log^2 n)$, which resembles 2D fractional cascading. However, their solution relies heavily on the characteristics of geodesic triangulations and cannot be generalized to other problems. Chazelle's data structure for the rectangle stabbing problem [11] can also be viewed as a restricted form of two-dimensional fractional cascading where $\pi = G$.

In recent years, interestingly, a technique similar to 2D fractional cascading has been used to improve many classical computational geometry data structures. While working on the 4D dominance range reporting problem, Afshani et al. [3] are implicitly performing iterative point location queries along a path of a balanced binary tree on somewhat specialized subdivisions in $O(\log^{3/2} n)$ total time. Later Afshani et al.
[4] studied an offline variant of this problem, and they presented a linear-sized data structure that achieves optimal query time. The same idea is used to improve the result of 3D point location in orthogonal subdivisions. In that problem, Rahul [27] obtained another data structure with $O(\log^{3/2} n)$ query time.

Another related problem is the “unrestricted” version of fractional cascading where essentially $\pi$ can be an arbitrary subgraph of $G$, instead of a connected subgraph. In one variant, we are given a set $L$ of categories and a set $S$ of $n$ points in $d$-dimensional space where each point belongs to one of the categories. The query is given by a $d$-dimensional rectangle $r$ and a subset $Q \subset L$ of the categories. We are asked to report the points in $S$ contained in $r$ and belonging to the categories in $Q$. In 1D, Chazelle and Guibas [16] provided an $O(|Q|\log\frac{|L|}{|Q|} + \log n + k)$ query time and linear size data structure, where $k$ is the output size, together with a restricted lower bound. Afshani et al. [5] strengthened the lower bound and presented several data structures for three-sided queries in two dimensions. Their data structures match the lower bound within an inverse Ackermann factor for the general case.

We study 2D OFC in a pointer machine model of computation. Some of our bounds involve inverse Ackermann functions. The particular definition that we use is the following. We start the hierarchy with the function $\log n$ and then we define $\alpha_i(n) = \alpha^*_{i-1}(n)$, meaning it is the number of times we need to apply the $\alpha_{i-1}(\cdot)$ function to $n$ until we reach a fixed constant. $\alpha(n)$ corresponds to the value of $i$ such that $\alpha_i(n)$ is at most a fixed constant. Our results are summarized in Table 1.

Table 1: Our Results

Graph | Query | Space | Query Time | Tight?
Tree | Path | $O(n\alpha_c(n))$ | $O(\min\{|\pi|\sqrt{\log n}, c\sqrt{|\pi|}\log n\} + \log n + |\pi|)$ | Up to $\alpha_c(n)$ factor
Tree | Path | $O(n)$ | $O(\min\{|\pi|\sqrt{\log n}, \alpha(n)\sqrt{|\pi|}\log n\} + \log n + |\pi|)$ | Up to $\alpha(n)$ factor
Tree | Subtree | $O(n)$ | $O(\log n + |\pi|\sqrt{\log n})$ | yes
Graph | Path / Subgraph | $O(n)$ | $O(\log n + |\pi|\sqrt{\log n})$ | yes

Our results show some very interesting behavior. First, by looking at the last two rows of Table 1, we can see that we can always do better than the naïve solution by a $\sqrt{\log n}$ factor. Furthermore, this is tight. We show matching query lower bounds both when $G$ can be an arbitrary graph but with $\pi$ being restricted to a path and also when $G$ is a tree but $\pi$ is allowed to be any subtree of $G$. Second, when $G$ is a tree and $\pi$ is a path we get some variation depending on the length of the query path. When $\pi$ is of length at most $\log n$, then we can answer queries in $O(|\pi|\sqrt{\log n})$ time, but when $\pi$ is longer than $\log n$, we obtain the query bound of $O(\sqrt{|\pi|}\log n)$ (ignoring some inverse Ackermann factors). Furthermore, we give two lower bounds that show both of these branches are tight! When $\pi$ is very long, longer than $\log^2 n$, then the query bound becomes $O(|\pi|)$ which is also clearly optimal.

In this section, we introduce some geometric preliminaries and present the tools we will use to build the data structures and to prove the lower bounds.
First we review the definition of planar subdivisions.
Definition 2.1.
A graph is said to be a planar graph if it can be embedded in the plane without crossings. A planar subdivision is a planar embedding of a planar graph where all the edges are straight line segments. The complexity of a planar subdivision is the sum of the number of vertices, edges, and faces of the subdivision.
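As a small illustration of Definition 2.1 in the axis-aligned setting, the following sketch counts the vertices, edges, and faces of a subdivision given as a list of rectangles that tile a bounding box. The function name `subdivision_complexity`, the `(x1, y1, x2, y2)` tuple layout, and the example tiling are assumptions made for illustration only; they do not come from the paper. The example also checks Euler's formula $V - E + F = 2$, which holds for any connected planar graph.

```python
def subdivision_complexity(rects):
    """Count (V, E, F) for an axis-aligned subdivision.

    rects: list of (x1, y1, x2, y2) with x1 < x2 and y1 < y2, assumed
    (hypothetically) to tile their bounding box without overlaps.
    """
    # Vertices: all distinct rectangle corners.
    vertices = set()
    for x1, y1, x2, y2 in rects:
        vertices.update([(x1, y1), (x1, y2), (x2, y1), (x2, y2)])

    # Edges: every rectangle side is split at the vertices lying on it;
    # sides shared by neighbouring rectangles yield identical pieces,
    # so the set removes duplicates.
    edges = set()
    for x1, y1, x2, y2 in rects:
        sides = [((x1, y1), (x2, y1)), ((x1, y2), (x2, y2)),
                 ((x1, y1), (x1, y2)), ((x2, y1), (x2, y2))]
        for a, b in sides:
            on_side = sorted(p for p in vertices
                             if a[0] <= p[0] <= b[0] and a[1] <= p[1] <= b[1])
            edges.update(zip(on_side, on_side[1:]))

    # Faces: one per rectangle plus the unbounded outer face.
    faces = len(rects) + 1
    return len(vertices), len(edges), faces


# A 2x2 box split into a left half and a right half cut into two squares.
example = [(0, 0, 1, 2), (1, 0, 2, 1), (1, 1, 2, 2)]
V, E, F = subdivision_complexity(example)
complexity = V + E + F  # the quantity called "complexity" in Definition 2.1
```

For this tiling the sketch finds 8 vertices, 10 edges, and 4 faces, so the complexity is 22. The quadratic scan over vertices is chosen for brevity, not efficiency.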
Planar point location, defined below, is one classical problem related to planar subdivisions:
Definition 2.2.
Given a planar subdivision $S$ of complexity $n$, in the planar point location problem, we are asked to preprocess $S$ such that given any query point $q$ in the plane, we can find the face $f$ in $S$ containing $q$ efficiently.

Note that we can assume that the subdivision is enclosed by a bounding box. There are several different ways to solve the planar point location problem optimally in $O(\log n)$ query time and $O(n)$ space, see [29] for details. One simple solution uses trapezoidal decomposition, see [19] for a detailed introduction. Roughly speaking, given a planar subdivision $S$ enclosed by a bounding box $R$, we construct a trapezoidal decomposition of the subdivision by extending two rays from every vertex of $S$, one upwards and one downwards. The rays stop when they hit an edge in $S$ or the boundary of $R$. The faces of the subdivision we obtain after this transform will be only trapezoids. Figure 1 gives an example of trapezoidal decomposition. A crucial property of trapezoidal decomposition is that it increases the complexity of the subdivision by only a constant factor.

[Figure 1: Example of Trapezoidal Decomposition. (a) A Planar Subdivision. (b) After Trapezoidal Decomposition.]

We also review some concepts related to cuttings.
Definition 2.3.
Given a set $H$ of $n$ hyperplanes in the plane, a $(1/r)$-cutting, $1 \le r \le n$, is a set of (possibly open) disjoint simplices that together cover the entire plane such that each simplex intersects $O(n/r)$ hyperplanes of $H$. For each simplex in the cutting, the set of all hyperplanes of $H$ intersecting it is called the conflict list of that simplex.

$(1/r)$-cuttings are important in computational geometry as they enable us to apply the divide-and-conquer paradigm in higher dimensions. The following theorem by Chazelle [12], after a series of work in the computational geometry community [24, 25, 6, 7, 14], shows the existence of $(1/r)$-cuttings of small size and an efficient deterministic algorithm computing $(1/r)$-cuttings.

Theorem 2.1 (Chazelle [12]). Given a set $H$ of $n$ hyperplanes in the plane, there exists a $(1/r)$-cutting, $1 \le r \le n$, of size $O(r^2)$, which is optimal. We can find the cutting and the corresponding conflict lists in $O(nr)$ time.

In this paper, we will use intersection sensitive $(1/r)$-cuttings, which are a generalization of $(1/r)$-cuttings. The following theorem is given by de Berg and Schwarzkopf [20].

Theorem 2.2 (de Berg and Schwarzkopf [20]). Given a set $H$ of $n$ line segments in the plane with $A$ intersections, we can construct a $(1/r)$-cutting, $1 \le r \le n$, of size $O(r + Ar^2/n^2)$. We can find the cutting and the corresponding conflict lists in time $O(n\log r + Ar/n)$ using a randomized algorithm.

Note that by the construction of generalized cuttings, see [20] for details, the following corollary follows directly from Theorem 2.2.
Corollary 2.1.
Given an axis-aligned planar subdivision of complexity $n$, we can construct a $(1/r)$-cutting, $1 \le r \le n$, of size $O(r)$. More specifically, each cell of the cutting is an axis-aligned rectangle and the size of the conflict list of every cell is bounded by $O(n/r)$. We can find the cutting and the corresponding conflict lists in time $O(n\log n)$ using a randomized algorithm.

In the $d$-dimensional rectangle stabbing problem, we are given a set of $n$ $d$-dimensional axis-parallel rectangles, and our task is to build a data structure such that given a query point $q$, we can report the rectangles containing the query point efficiently. As noted earlier, Chazelle [11] provides an optimal solution in two dimensions, a linear-sized data structure that can answer queries in $O(\log n + t)$ time where $t$ is the output size. The following lemma by Afshani et al. [3] establishes an upper bound for this problem and it is obtained by a basic application of range trees [8] with large fan-out and Chazelle's data structure.

Lemma 2.1 (Afshani et al. [3]). We can answer $d$-dimensional rectangle stabbing queries in time $O(\log n \cdot (\log n/\log H)^{d-2} + t)$ using space $O(nH\log^{d-2} n)$, where $n$ is the number of rectangles, $t$ is the output size, and $H \ge 2$ is any parameter.

We will use the pointer machine lower bound framework of Afshani [1]. The framework deals with an abstract “geometric stabbing problem” which is defined by a set $\mathcal{R}$ of “ranges” and a set $U$ of queries. An instance of the geometric stabbing problem is given by a set $R \subset \mathcal{R}$ of $n$ “ranges” and the goal is to preprocess $R$ to answer queries $q$. Given $R$, an element $q \in U$ (implicitly) defines a subset $R_q \subset R$ and the data structure is to output the elements of $R_q$. However, the data structure is restricted to operate in the (strengthened) pointer machine model of computation where the memory is a directed graph $M$ consisting of “cells” where each cell can store an element of $R$ as well as two pointers to other memory cells.
At the query time, the algorithm must find a connected subgraph $M_q$ of $M$ where each element of $R_q$ is stored in at least one memory cell of $M_q$. The size of $M$ is a lower bound on the space complexity of the data structure and the size of $R_q$ is a lower bound on the query time. However, the lower bound model allows for unlimited computation and allows the data structure to have complete information about the problem instance; the only bottleneck is being able to navigate to the cells storing the output elements. In addition, the framework assumes that we have a measure $\mu$ such that $\mu(U) = 1$. We need a slightly more precise version of the lower bound framework where the dependency on a certain “constant” is made explicit.

Theorem 2.3.
Assume we have an algorithm that, given any input instance $R \subset \mathcal{R}$ of $n$ ranges, can store $R$ in a data structure of size $S(n)$ such that given any query $q \in U$, it can answer the query in $Q(n) + \gamma|R_q|$ time.

Then, suppose we can construct an input set $R \subset \mathcal{R}$ of $n$ ranges such that the following two conditions are satisfied: (i) every query point $q \in U$ is contained in exactly $|R_q| = t$ ranges and $\gamma t \ge Q(n)$; (ii) there exists a value $v$ such that for any two ranges $r_1, r_2 \in R$, $\mu(\{q \in U \mid r_1, r_2 \in R_q\})$ is well-defined and is upper bounded by $v$. Then, we must have $S(n) = \Omega(tv^{-1}/2^{O(\gamma)}) = \Omega(Q(n)v^{-1}/2^{O(\gamma)})$.

For the proof of this theorem, we refer the readers to Appendix A. In our applications, $\mu$ will basically be the Lebesgue measure and $U$ will be the unit cube.

In this section, we give a simple solution for when the catalog graph is a path. It will be used as a building block for later data structures.
Theorem 3.1.
Consider a catalog path $G$, in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions. We can construct a data structure using $O(n)$ space such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a subpath, all regions containing $q$ along $\pi$ can be reported in time $O(\log n + |\pi|)$.

Proof. We can convert each subdivision into a set of disjoint rectangles of total size $O(n)$ using trapezoidal decomposition [19]. Then, we partition $G$ into $m = \lceil |G|/\log n \rceil$ paths $G_1, \cdots, G_m$, where each path except potentially $G_m$ has size $\log n$ and $G_m$ has size at most $\log n$.

Now we use an observation that was also made in previous papers [2, 3, 27]: when $\pi = G$, two-dimensional fractional cascading can be reduced to rectangle stabbing. As a result, for each $G_i$, $1 \le i \le m$, we collect all the rectangles of its subdivisions and build a 2D rectangle stabbing data structure on them. By Lemma 2.1 this requires $O(n)$ space. Now given a query subpath of length $|\pi|$, we use the rectangle stabbing data structures on the subdivisions of each $G_i$ as long as $|G_i \cap \pi| > 0$. Since $\pi$ is a path, for at most two indices $i$ we will have $0 < |G_i \cap \pi| < \log n$ and for the rest $|G_i \cap \pi| = |G_i| = \log n$. This gives us $O(\log n + |\pi|)$ query time.

Now we consider answering path queries on catalog trees. We first show optimal data structures for trees of different heights. It turns out we need different data structures to achieve optimality when heights differ. We then present a data structure using $O(n\alpha_c(n))$ space that can answer path queries in $O(\log n + |\pi| + \min\{|\pi|\sqrt{\log n}, \sqrt{|\pi|}\log n\})$ time and a data structure using $O(n)$ space answering path queries in $O(\log n + |\pi| + \min\{|\pi|\sqrt{\log n}, \alpha(n)\sqrt{|\pi|}\log n\})$ time, where $c$ is a parameter, $\alpha_k(n)$ is the $k$-th function in the inverse Ackermann hierarchy [18] and $\alpha(n)$ is the inverse Ackermann function [18]. We also present lower bounds for our data structures. Without loss of generality, we assume the tree is a binary tree.

4.1 Trees of height $h \le \log n$

For trees of this height, we present the following upper bound. The main idea is to use the sampling idea that was employed previously [9, 3]; however, there are some main differences. Instead of random samples or shallow cuttings, we use intersection sensitive cuttings [20] and, more notably, fractional cascading on an arbitrary tree cannot be reduced to a geometric problem such as 3D rectangle stabbing, so instead we do something else.
Lemma 4.1.
Consider a catalog tree of height $h \le \log n$ in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions. We can build a data structure using $O(n)$ space such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a path, all regions containing $q$ along $\pi$ can be reported in time $O(\log n + |\pi|\sqrt{\log n})$.

Proof. Let $r$ be a parameter to be determined later. Consider a planar subdivision $A_i$ and let $n_i$ be the number of rectangles in $A_i$. We create an intersection sensitive $(r^2/n_i)$-cutting $C_i$ on $A_i$. By Corollary 2.1, $C_i$ contains $O(n_i/r^2)$ cells and each cell of $C_i$ is an axis-aligned rectangle. Furthermore, the conflict list size of each rectangle is $O(r^2)$. For each cell in $C_i$, we build an optimal point location data structure on its conflict list. The total space usage is linear, since the total size of the conflict lists is linear.

Then, we consider every path of length at most $\log r$ in the catalog graph, and we call them subpaths. For every subpath, we collect all the cells of the cuttings belonging to the vertices of the subpath and build a 2D rectangle stabbing data structure on them. Since the degree of any vertex is bounded by 3, each vertex is contained in at most

$$\sum_{j=0}^{\log r} \sum_{i=0}^{j} 3^i \cdot 3^{j-i} = \Theta(r^{\log 3}\log r)$$

many subpaths. Then the total space usage of the 2D rectangle stabbing data structures is bounded by $O(n\log r/r^{2-\log 3}) = O(n)$. Given any query path $\pi$, it can be covered by $|\pi|/\log r$ subpaths. For each subpath, we can find all the cells of the cuttings containing the query point in $O(\log n)$ time and then perform an additional point location query on each conflict list, for a total of $O(\log n + (\log r)^2)$ query time per subpath. Thus, the query time of this data structure is bounded by

$$\frac{|\pi|}{\log r}\,(\log n + \log r \cdot \log r).$$

We pick $r = 2^{\sqrt{\log n}}$, then we obtain the desired $O(\log n + |\pi|\sqrt{\log n})$ query time.

We now present a matching lower bound.
We show the following:
Lemma 4.2.
Assume, given any catalog tree of height $\sqrt{\log n} \le h \le \log n$ in which each vertex is associated with a planar subdivision with $n$ being the total complexity of the subdivisions, we can build a data structure that satisfies the following: it uses at most $n 2^{\varepsilon\sqrt{\log n}}$ space, for a small enough constant $\varepsilon$, and it can answer 2D OFC queries $(q, \pi)$. Then, its query time must be $\Omega(|\pi|\sqrt{\log n})$.

Proof. We will use the following idea: We consider a special 3D rectangle stabbing problem and show a lower bound using Theorem 2.3. We will use the 3D Lebesgue measure, denoted by $V(\cdot)$. Then we show a reduction from this problem to a 2D OFC problem on trees to obtain the desired lower bound.

We consider the following instance of the 3D rectangle stabbing problem. The $n$ input rectangles are partitioned into $h$ sets of size $n/h$ each. The rectangles in each set are pairwise disjoint and they together tile the unit cube in 3D. The depth (i.e., the length of the side parallel to the $z$-axis) of rectangles in set $i = 0, 1, \cdots, h-1$ is $1/2^i$. In fact, the rectangles in set $i$ can be partitioned into $2^i$ subsets where the projection of the rectangles in the $j$-th subset, $j = 0, 1, \cdots, 2^i - 1$, onto the $z$-axis is the interval $[\frac{j}{2^i}, \frac{j+1}{2^i}]$. See Figure 2 for an example.

[Figure 2: An example of rectangles in set 2.]

We first show the reduction: assume we are given an instance of the special 3D rectangle stabbing problem described above. We build a balanced binary tree of height $h$ on the $z$-axis as the catalog graph. Note that the number of vertices at layer $i$ of the tree is the same as the number of subsets in set $i$. We project the rectangles in each subset to the $xy$-plane and obtain a 2D axis-aligned planar subdivision. We attach each of the subdivisions to the corresponding vertices. Consider a 2D OFC query $(q, \pi)$ in which $\pi$ is a path that connects the root to a leaf. We lift $q$ to a point $q'$ in 3D appropriately: W.l.o.g., assume the leaf is the $j$-th leaf. To obtain $q'$, we assign the $z$ coordinate $\frac{j+0.5}{2^{h-1}}$ to $q$. By our construction, the $z$-axis projection of any rectangle in the nodes from the root to the leaf contains the $z$ coordinate of $q'$. This construction ensures that finding the rectangles that contain $q'$ is equivalent to performing the 2D OFC query $(q, \pi)$.

Now we describe a hard instance of the special rectangle stabbing problem to establish a lower bound. It will have rectangles of $h$ different shapes. For each shape, we tile (disjointly cover) the unit cube using isometric copies of the shape to obtain a set of rectangles. We collect every $r$ different shapes into a class and obtain $h/r$ classes, where $r$ is a parameter to be determined later. We say that the $j$-th rectangle in a class has group number $j$, $j = 0, 1, \cdots, r-1$. Now we specify the dimensions (i.e., side lengths) of the rectangles. For a rectangle in class $i = 0, 1, \cdots, h/r - 1$, with group number $j = 0, 1, \cdots, r-1$, its dimensions are

$$\left[\frac{1}{K^j} \times K^j \cdot 2^{ir+j} \cdot V \times \frac{1}{2^{ir+j}}\right],$$

where $K, V$ are parameters to be determined later and the $[W \times H \times D]$ notation denotes an axis-aligned rectangle with width $W$, height $H$, and depth $D$. Observe that every rectangle has volume $V$ and thus we need $1/V$ copies to tile the unit cube. By setting $V = h/n$, the total number of rectangles we generate is $n$. Also note that all the rectangles in the same group are pairwise disjoint and they together cover the whole unit cube. This implies that any query point $q$ in the unit cube is contained in exactly $h = |\pi|$ rectangles.

Now we analyze the intersection of any two rectangles. First, observe that given two axis-aligned rectangles with dimensions $[W_1 \times H_1 \times D_1]$ and $[W_2 \times H_2 \times D_2]$, their intersection is an axis-aligned rectangle with dimensions at most $[\min\{W_1, W_2\} \times \min\{H_1, H_2\} \times \min\{D_1, D_2\}]$. Second, by our construction, the rectangles that have identical width, depth, and height are disjoint. As a result, either the width of the two rectangles will differ by a factor $K$ or their depth will differ by a factor $2^r$. This means that the maximum intersection volume of any two rectangles $R_1, R_2$ in class $i_1, i_2$, group $j_1, j_2$ can be achieved only in one of the following two cases:

$$V(R_1 \cap R_2) = \begin{cases} \frac{V}{2K} & i_1 = i_2 \text{ and } j_2 = j_1 + 1, \\ \frac{V}{2^r} & i_2 = i_1 + 1 \text{ and } j_1 = j_2. \end{cases}$$

We set $K = 2^r$, then the intersection volume of any two rectangles is bounded by $v = V/2^r$. However, for the construction to be well-defined, the side length of the rectangles cannot exceed 1 as otherwise they do not fit in the unit cube. The largest height of the rectangles is obtained for $j = r-1$ and $i = \frac{h}{r} - 1$. Thus, we must have

$$K^{r-1}\, 2^{r(\frac{h}{r}-1)+r-1}\, V \le 1.$$

By plugging in the values $V = h/n$ and $K = 2^r$ we get that we must have

$$2^{r^2 - r + h - 1}\, h \le n. \qquad (1)$$

Since by our assumptions $h \le \log n$, it follows that by setting $r = \sqrt{\log n}$, the inequality (1) holds.

If $\gamma h \ge Q(n)$ holds, then we satisfy the first condition of Theorem 2.3 and thus we obtain the space lower bound of

$$S(n) = \Omega\left(\frac{Q(n)\,v^{-1}}{2^{O(\gamma)}}\right) = \Omega\left(\frac{n 2^r}{\log n\, 2^{O(\gamma)}}\right). \qquad (2)$$

Now observe that if we set $\gamma = \delta\sqrt{\log n}$, for a sufficiently small $\delta > 0$, then it follows that the data structure must use $\Omega(n 2^{\Omega(\sqrt{\log n})})$ space. However, by the statement of our lemma, we are not considering such data structures. As a result, when $\gamma = \delta\sqrt{\log n}$, the query time must be large enough that the first condition of the framework does not hold, meaning we must have $Q(n) \ge \gamma h = \delta\sqrt{\log n}\, h = \Omega(|\pi|\sqrt{\log n})$.

4.2 Trees of height $\log n < h \le \log^2 n$

We start with the following lemma which gives us a data structure that can only answer query paths that start from the root and finish at a leaf. The main idea here was used previously in the context of four-dimensional dominance queries [3, 9] and it uses the observation that such “root to leaf” queries can be turned into a geometric problem, the 3D rectangle stabbing problem.
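The observation that “root to leaf” queries become 3D stabbing queries can be sketched in a few lines, following the $z$ range assignment used in the proof of Lemma 4.3: the $i$-th leaf from the left gets $[i-1, i)$, an internal vertex gets the union of its children's ranges, and each 2D rectangle is extended by the $z$ range of its vertex. This is a minimal sketch, not the authors' implementation: the `Node` class, the function names, and the brute-force scan `vertices_stabbed` (standing in for a real 3D rectangle stabbing structure) are hypothetical, and the toy tree below is not balanced.

```python
class Node:
    """A vertex of a binary catalog tree (hypothetical minimal representation)."""

    def __init__(self, name, left=None, right=None):
        self.name = name
        self.left = left
        self.right = right
        self.z_range = None  # pair (lo, hi) standing for the interval [lo, hi)


def assign_z_ranges(root):
    """Leaves get unit intervals from left to right; an internal vertex gets
    the union of its children's intervals."""
    next_leaf = 0

    def visit(v):
        nonlocal next_leaf
        children = [c for c in (v.left, v.right) if c is not None]
        if not children:
            v.z_range = (next_leaf, next_leaf + 1)
            next_leaf += 1
        else:
            for c in children:
                visit(c)
            v.z_range = (children[0].z_range[0], children[-1].z_range[1])

    visit(root)


def lift_rectangle(rect, v):
    """Cartesian product of a 2D rectangle (x1, y1, x2, y2) with v's z range."""
    x1, y1, x2, y2 = rect
    lo, hi = v.z_range
    return (x1, y1, lo, x2, y2, hi)


def vertices_stabbed(root, z):
    """Brute-force stand-in for the stabbing query: names of all vertices
    whose z range contains z."""
    found, stack = [], [root]
    while stack:
        v = stack.pop()
        lo, hi = v.z_range
        if lo <= z < hi:
            found.append(v.name)
        stack.extend(c for c in (v.left, v.right) if c is not None)
    return sorted(found)


# Toy tree: root -> (a, b), b -> (c, d); leaves in left-to-right order a, c, d.
a, c, d = Node("a"), Node("c"), Node("d")
b = Node("b", c, d)
root = Node("root", a, b)
assign_z_ranges(root)
on_path = vertices_stabbed(root, 2.5)  # any z inside d's range [2, 3)
```

Picking any $z$ value inside the range of the deepest vertex of the query path selects exactly the vertices on the path from the root to that vertex, which is why a single 3D stabbing query suffices.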
Lemma 4.3.
Consider a balanced catalog tree of height $h$, $\log n < h \le \log^2 n$, in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions. We can build a data structure using $O(n)$ space such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a path starting from the root to a leaf, all regions containing $q$ along $\pi$ can be reported in time $O(\sqrt{|\pi|}\log n)$.

Proof. Let $r$ be a parameter to be determined later. For each subdivision $A_i$, we create an intersection sensitive $(r/n_i)$-cutting $C_i$ on $A_i$. By the same argument as Lemma 4.1, all the cells in the cuttings are axis-aligned rectangles satisfying (i) the conflict set size of any cell in $C_i$ is bounded by $O(r)$ and (ii) the total number of cells in $C_i$ is $O(n_i/r)$.

Now we lift each cell in the cuttings to 3D rectangles and collect all the 3D rectangles to construct a 3D rectangle stabbing data structure for them. This is done as follows. We assign a $z$ range to each vertex in the catalog tree. Let $m$ be the number of leaves. Order the leaves of the catalog tree from left to right and for the $i$-th leaf $l_i$, $i \in \{1, 2, \cdots, m\}$, we assign the range $[i-1, i)$ as its $z$ range. For any internal vertex, its $z$ range is the union of the $z$ ranges of its children. Then, we lift the 2D rectangles induced by the subdivision of a vertex to 3D rectangles using the $z$ range (i.e., by forming the Cartesian product of the rectangle and the $z$ range). We store the 3D rectangles in a rectangle stabbing data structure. Given a query point $q = (x_q, y_q)$ and a query path $\pi$, we first lift $q$ to be $(x_q, y_q, z_q)$, where $z_q$ is any $z$ value in the $z$ range of the deepest vertex in $\pi$, and then query the 3D rectangle stabbing data structure.

In addition, for each cell in a cutting, we build an optimal point location data structure on its conflict set.
All these point location data structures take space $O(\sum_i n_i) = O(n)$ in total and each of them can answer a point location query in time $O(\log r)$.

To achieve space bound $O(n)$ for the 3D rectangle stabbing data structure, it suffices to choose $H = \frac{r}{\log(n/r)}$. We then balance the query time for 3D rectangle stabbing and 2D point locations to achieve the optimal query time

$$\log n \cdot \frac{\log n}{\log\frac{r}{\log(n/r)}} = h \cdot \log r.$$

We pick $r = 2^{\log n/\sqrt{h}}$ and the query time is bounded by $O(\sqrt{h}\log n) = O(\sqrt{|\pi|}\log n)$.

The above data structure is not a true fractional cascading data structure because it can only support restricted queries. To be able to answer query paths of arbitrary lengths $> \log n$ and $\le \log^2 n$, we need the following result.

Lemma 4.4.
Consider a catalog tree in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions and let $h_1$ and $h_2$, $h_1 < h_2$, be two fixed parameters. We can build a data structure using $O(n\log(h_2/h_1))$ space such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a path whose length obeys $h_1 \le |\pi| \le h_2$, all regions containing $q$ along $\pi$ can be reported in time $O(\sqrt{|\pi|}\log n)$.

Proof. First, observe that w.l.o.g. we can assume that the height of the catalog tree is at most $h_2$: we can partition the catalog tree into a forest by cutting off vertices whose depth is a multiple of $h_2$. Since the length of $\pi$ is at most $h_2$, it follows that $\pi$ can only contain vertices from at most two of the trees in the resulting forest, meaning answering $\pi$ can be reduced to answering at most two queries on trees of height at most $h_2$.

Thus, w.l.o.g., assume $v$ is the root of the catalog tree of height $h_2$ and $\pi$ is a path of length at least $h_1$ in this catalog tree. We build the following data structures. Let $v_1, \cdots, v_m$ be the vertices at height $h_2/2$. Let $T_0$ be the tree rooted at $v$ and cut off at height $h_2/2$ with $v_1, \cdots, v_m$ being leaves, and let $T_i$ be the tree rooted at $v_i$, $1 \le i \le m$. We build $m+1$ data structures of Lemma 4.3 on $T_0, \cdots, T_m$ and then we recurse on each of the $m+1$ trees. The recursion stops once we reach subproblems on trees of height at most $h_1$.

Since the data structure of Lemma 4.3 uses $O(n)$ space, at each recursive level the total space usage of the data structures we construct is $O(n)$. Over the $O(\log(h_2/h_1))$ recursion levels, this sums up to $O(n\log(h_2/h_1))$ space.

Now we analyze the query time. Given a query $(q, \pi)$, we may query several data structures that together cover the whole path $\pi$. Let $w$ be the highest vertex on $\pi$. We can decompose $\pi$ into two disjoint parts $\pi_1$ and $\pi_2$ that start from $w$ and end at vertices $u_1$ and $u_2$ respectively, with $u_1$ and $u_2$ being descendants of $w$. It thus suffices to only answer $\pi_1$, as the other path can be answered similarly. The first observation is that we can find a series of data structures that can be used to answer disjoint parts of $\pi_1$. The second observation is that we can afford to make the path a bit longer to truncate the recursion. We now describe the details.

Consider the trees $T_0, \cdots, T_m$ defined at the top level of the recursion. If $\pi_1$ is entirely contained in one of the trees, then we recurse on that tree. Otherwise, $w$ is contained in $T_0$ and $u_1$ is contained in some subtree $T_i$. Now, $\pi_1$ can be further subdivided into two smaller “anchored” paths: one from $w$ to $v_i$ (“anchored” at $w$) and another from $v_i$ to $u_1$ (“anchored” at $v_i$), and each smaller path can be answered recursively in the corresponding tree. Thus, it suffices to consider answering the query $q$ along an anchored path.

Thus, consider the case of answering an anchored path $\pi'$ in the data structure. To reduce the notation and clutter, assume $\pi'$ is an anchored path, starting from the root of $T_0$ and ending at a vertex $u$.
Assume the vertices v , · · · , v m and trees T , · · · , T m are defined as above. First,consider the case when the height of T is at most h ; in this case, we have built an instance ofthe data structure of Lemma 4.3 on T but not on the trees T , · · · , T m . In this case, we simplyanswer q on a root of leaf path in T that includes π (cid:48) , e.g., by picking a leaf in the subtree of u .In this case, we will be performing a number of “useless” point location queries, in particularthose on the descendants of u . However, as the height of T is at most h , it follows that thequery bound stays asymptotically the same: O ( √ h log n ). Furthermore, there is no recursionin this case and thus this cost is paid only once per anchored path. The second case is when theheight of T is greater than h . In this case, if u lies in T we simply recurse on T but if u liesin a tree T i , we first query the data structure of Lemma 4.3 using the path from the root of T until v i , and then we recurse on T i . As a result, answering the anchored path query reduces toanswering at most one query on an instance of data structure Lemma 4.3 and another recursive“anchored” on a tree of half the height. Thus, the i -th instance of the data structure Lemma 4.3that we query covers at most 1 / i fraction of the anchored path. Thus, if k is the length ofthe anchored path, it follows that the total query time of all the data structures we query isbounded by ∞ (cid:88) i =1 √ k log n i = O ( √ k log n ) = O ( (cid:112) | π | log n ) . We now reduce the space of the above lemma dramatically. We will repeatedly use a “boot-strapped” data structure. The following lemma establishes how we can bootstrap a base datastructure to obtain a more efficient one.
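The geometric sum that bounds the anchored-path query time in the proof of Lemma 4.4 can be written out as follows. This is a sketch under the assumption, read from the text, that the $i$-th queried instance of Lemma 4.3 covers at most a $1/2^{i}$ fraction of an anchored path of length $k$.

```latex
\sum_{i=1}^{\infty} \sqrt{\frac{k \log n}{2^{i}}}
  = \sqrt{k \log n}\,\sum_{i=1}^{\infty} 2^{-i/2}
  = \frac{\sqrt{k \log n}}{\sqrt{2}-1}
  = O\!\left(\sqrt{k \log n}\right)
```

The constant $1/(\sqrt{2}-1) \approx 2.41$ is independent of $k$ and $n$, so the recursion adds only a constant factor over a single query to Lemma 4.3.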
Lemma 4.5.
Consider a catalog tree of height h , log n < h ≤ log n , in which each vertex isassociated with a planar subdivision. Let n be the total complexity of the subdivisions. Assume,for any fixed value ∆ , ω (1) ≤ ∆ ≤ log n , we can build a “base” data structure that can answera 2D OFC query ( q, π ) in Q b ( n ) = O ( (cid:112) | π | log n ) time as long as π is path of length between log n and log n . Furthermore, assume it uses S b (∆ , n ) = O ( nf (∆)) space, for some function f which is monotone increasing in ∆ and for ∆ = ω (1) we have f (∆) = ω (1) .Then, for any given fixed value ∆ , ω (1) ≤ ∆ ≤ log n , we can build a “bootstrapped” datastructure that can answer a 2D OFC query ( q, π ) in Q b ( n ) + O ( (cid:112) | π | log n ) time as long as π ispath of length between log n and log n . Furthermore, it uses O ( nf ∗ (∆)) space, where f ∗ ( · ) isthe iterative f ( · ) function which denotes how many times we need to apply f ( · ) function to ∆ to reach a constant value.Proof. We construct an intersection sensitive ( f (∆) /n i )-cutting C i for each planar subdivi-sion A i attached to the tree. Call these the “first level” cuttings. Similar to the analysis inLemma 4.1, we obtain O ( n i /f (∆)) cells, which are disjoint axis-aligned rectangles, for each C i and thus n (cid:48) = O ( n/f (∆)) cells in total. Each cell in the cutting has a conflict list of size O ( f (∆)) and on that we build a point location data structure. This takes O ( n ) space in total.We store the cells of the cutting in an instance of the base data structure with parameter ∆.Call this data structure A . The space usage of A is S b (∆ , n (cid:48) ) = O ( n (cid:48) f (∆)) = O ( n ) . q, π ). Let δ = log( f (∆)). Consider the case when · (cid:16) log n ∆ (cid:17) ≤| π | ≤ · (cid:16) log nδ (cid:17) . In this case, as A is built with parameter ∆, we can query it with ( q, π ).Thus, in Q b ( n ) time, for every subdivision on path π , we find the cell of the cutting thatcontains q . 
Then, we use the point location data structure on the conflict lists of the cells tofind the original rectangle containing q . This takes an additional O (log( f (∆))) as the size ofeach conflict is O ( f (∆)). Thus, the query time in this case is Q b ( n ) + O ( | π | log( f (∆))) = Q b ( n ) + O ( (cid:112) | π | log n )since we have | π | ≤ · (cid:16) log nδ (cid:17) = · (cid:16) log n log( f (∆)) (cid:17) .Thus, the only paths we cannot answer yet are those when · (cid:16) log nδ (cid:17) ≤ | π | ≤ log n . In thiscase, we can bootstrap. First, observe that we can build a data structure A (cid:48) on the the originalrectangles, where A (cid:48) is an instance of the base data structure but this time with parameter ∆set to δ . This will take S b ( δ , n ) = O ( nf ( δ )) space. Thus, the total space consumption is O ( n ) + S b ( δ , n ) = O ( n ) + O ( nf (log ( f (∆)))) = O ( n ) + O ( nf ( f (∆))) (3)where the last inequality follows since f ( · ) is a monotone increasing function and log ( f (∆))
3, where α(n) is the inverse Ackermann function [18]. Lemma 4.6.
Consider a catalog tree of height h , log n < h ≤ log n , in which each vertex isassociated with a planar subdivision. Let n be the total complexity of the subdivisions. We canbuild a data structure using O ( nα c ( n )) space, where c ≥ is any constant and α c ( n ) is the c -th function of the inverse Ackermann hierarchy, such that given any query ( q, π ) , where q isa query point and π is a path of length | π | , log n < | π | ≤ log n , all regions containing q along π can be reported in time O ( (cid:112) | π | log n ) . Furthermore, we can also build a data structure using ( n ) space answering queries in time O ( α ( n ) (cid:112) | π | log n ) , where α ( n ) is the inverse Ackermannfunction.Proof. By Lemma 4.4, if we set h = log n and h = log n , we obtain a data structure using O ( n log log n ) answering queries in time O ( (cid:112) | π | log n ). By picking ∆ = log n , f = log n , wecan apply Lemma 4.5 to reduce the space to O ( n log ∗ (log n )) = O ( n log ∗ n ) while achieving thesame query time. If we again pick ∆ = log n , but f = log ∗ n , by applying Lemma 4.5 again,the space is further reduced to O ( n log ∗∗ (log n )) = O ( n log ∗∗ n ). We continue this process untillog ∗ ( i ) n is less than three. Note that we will need to pay O ( (cid:112) | π | log n ) extra query time eachtime we apply Lemma 4.5. We will end up with a linear-sized data structure with query time O ( τ ( n ) (cid:112) | π | log n ) = O ( α ( n ) (cid:112) | π | log n ). On the other hand, if we stop applying Lemma 4.5after a constant c many rounds, we will end up with a O ( n log ∗ ( c ) n ) = O ( nα c +2 ( n )) sized datastructure with the original O ( (cid:112) | π | log n ) query time. We show an almost matching lower bound in this section.
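The iterated functions used in Lemma 4.5 and in the proof of Lemma 4.6 (log*, log**, and so on) can be illustrated with a small sketch. The helper names and the stopping threshold of 1 are assumptions made for illustration; the paper only requires that the iteration stops at some constant.

```python
import math

def iterate_star(f, x, threshold=1.0):
    """Number of times f must be applied to x before the value drops
    to at most `threshold` (the 'iterated f' function, f*)."""
    count = 0
    while x > threshold:
        x = f(x)
        count += 1
    return count

def log_star(x):
    """Iterated logarithm log*: how often log2 is applied to reach <= 1."""
    return iterate_star(math.log2, x)

def log_star_star(x):
    """log**: how often log* is applied to reach <= 1."""
    return iterate_star(log_star, x)
```

Each further star makes the function grow more slowly, which is why repeating the bootstrapping step shrinks the space overhead from log log n to log* n, then to log** n, and so on.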
Lemma 4.7.
Assume, given any catalog tree of height h , log n < h ≤ log n , in which each vertexis associated with a planar subdivision with n being the total complexity of the subdivisions, wecan build a data structure that satisfies the following: it uses at most n ε log n/ √ h space, for asmall enough constant ε , and it can answer 2D OFC queries ( q, π ) . Then, its query time mustbe Ω( (cid:112) | π | log n ) .Proof. We first describe a hard input instance for a 3D rectangle stabbing problem and laterwe show that this can be embedded as an instance of 2D OFC problem on a tree of height h .Also, we actually describe a tree of height h (cid:48) = h +(log n ) / ≤ h . This is not an issue as we canadd dummy vertices to the root to get the height to exactly h .We begin by describing the set of rectangles. Each rectangle is assigned a “class number”and a “group number”. The number of classes is √ h and the number of groups is log n √ h + √ h .The rectangles with the same class number and group number will be disjoint, isometric andthey would tile the unit cube. Rectangles with class number i = 0 , , · · · , √ h − j = 0 , , · · · , log n √ h − K j × K j · ir + j · V × ir + j ] , where V, K are some parameters to be determined later and r = log n √ h . Similarly, rectangleswith class number i = 0 , , · · · , √ h − j = log n √ h , · · · , log n √ h + √ h − K j × K j · ir + r · V × ir + r ] . The total number of different shapes is √ h · ( log n √ h + √ h ) = h (cid:48) . Note that each rectangle hasvolume V , so the total number of rectangles we use in all the tilings is n by setting V = h (cid:48) /n .By our construction any query point is contained in t = h (cid:48) = | π | rectangles. Now we analyzethe maximal intersection volume of two rectangles. By the same argument as in the proof ofLemma 4.2 the maximal intersection volume can only be achieved by two rectangles when theyare in the same class and adjacent groups or in the same group of adjacent classes. 
For tworectangles R and R in group j , j of class i , i , we have13 ( R ∩ R ) = V K i = i and j = j + 1 ≤ log n √ h , VK i = i and log n √ h ≤ j = j + 1 , V r i = i + 1 and j = j . We set K = 2 r , then the intersection of any two rectangle is no more than v = V / r .We also need to make sure no side length of any rectangle exceeds the side length of the unitcube. The maximum side length can only be obtained when i = √ h − j = log n √ h + √ h − K log n √ h + √ h − · r ( √ h − log n √ h · V ≤ K = 2 r and V = h (cid:48) /n in, we must have2 r log n √ h + r ( √ h − · r ( √ h − log n √ h · h (cid:48) ≤ n (4)Since log n ≤ h ≤ log n and r = log n √ h , (4) holds.Suppose γh ≥ Q ( n ), then the first condition of Theorem 2.3 is satisfied and we get the lowerbound of S ( n ) = Ω (cid:18) tv − O ( γ ) (cid:19) = Ω (cid:18) n r O ( γ ) (cid:19) . Observe that by setting γ = δ log n √ h for a sufficiently small δ >
0, the data structure must useΩ( n Ω(log n/ √ h ) ) space, which contradicts the space usage in our theorem. Therefore, Q ( n ) ≥ γh = δ log n √ h h = Ω( √ h log n ) = Ω( (cid:112) | π | log n ). It remains to show that this set of rectangles can ...... ... ...... log n √ h ... ... ... ... ... √ h Figure 3: A difficult tree for fractional cascading.actually be embedded into an instance of the 2D OFC problem. To do that, we describe the tree T that can be used for this embedding. See Figure 3. We hold the convention that the root of T has depth 0. Starting from the root, until depth log n √ h , every vertex will have two children (bluevertices in Figure 3) then we will have √ h vertices with one child (red vertices in Figure 3).Then this pattern continues for √ h steps. The first set of blue and red vertices correspond toclass 0, the next to class 1 and so on. Within each class, the top level corresponds to group 0and so on. To be specific, vertices at depth ( log n √ h + √ h ) i + j of the tree have rectangles of class i and group j . Now, it can be seen that the rectangles can be assigned to the vertices of T ,similar to how it was done in Lemma 4.2. The notable difference here is that the depth (lengthof the side parallel to the z -axis) of the rectangles decreases as the group number increases from0 to log n √ h − log n √ h until log n √ h + √ h −
1. This exactly corresponds to the structure of the tree T . h > log n For trees of this height, we have:
Lemma 4.8.
Consider a catalog tree of height $h > \log n$ in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions. We can build a data structure using $O(n)$ space such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a path of length $|\pi| > \log n$, all regions containing $q$ along $\pi$ can be reported in time $O(|\pi|)$.
Proof. We combine the classical heavy path decomposition by Sleator and Tarjan [28] and the data structure for catalog paths to achieve the desired query time. We first apply the heavy path decomposition to the tree and then, for every heavy path created, we build a 2D OFC data structure to answer queries along the path. Clearly, we only spend linear space in total. Then, by the property of the heavy path decomposition, we only need to query $O(\log n)$ heavy paths to answer a query, which leads to a query time of $O(\log n + |\pi|) = O(|\pi|)$.
By combining Lemma 4.1, Lemma 4.6, and Lemma 4.8, we immediately get the following corollary. Corollary 4.1.
Consider a catalog tree in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions. We can build a data structure using $O(n\alpha_c(n))$ space, where c ≥ is any constant and $\alpha_c(n)$ is the $c$-th function of the inverse Ackermann hierarchy, such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a path, all regions containing $q$ along $\pi$ can be reported in time $O(\log n + |\pi| + \min\{|\pi|\sqrt{\log n}, \sqrt{|\pi| \log n}\})$. Furthermore, we can also build a data structure using $O(n)$ space answering queries in time $O(\log n + |\pi| + \min\{|\pi|\sqrt{\log n}, \alpha(n)\sqrt{|\pi| \log n}\})$, where $\alpha(n)$ is the inverse Ackermann function.
In this section, we consider general catalog graphs as well as subgraph queries on catalog trees. Our result shows that it is possible to build a data structure of space $O(n)$ such that we can save a $\sqrt{\log n}$ factor from the naive query time of iterative point locations. We also present a matching lower bound.
We begin by presenting a basic reduction. Lemma 5.1.
Given a catalog graph $G$ of $m$ vertices with graph subdivision complexity $n$ and maximum degree $d$, we can generate a new catalog graph $G'$ with $\Theta(md)$ vertices, graph subdivision complexity $\Theta(dn)$, and bounded degree $O(d)$ such that the following holds: given any connected subgraph $\pi \subseteq G$, in time $O(|\pi|)$, we can find a path $\pi'$ in $G'$ such that the answer to any query $Q = (q, \pi)$ in $G$ equals the answer to query $Q = (q, \pi')$ in $G'$.
Proof. The main idea is that we can add a number of dummy vertices to the graph such that we can turn a subgraph query into a path query.
We can obtain $G'$ in the following way. For every vertex in $G$, place $2d$ copies of the vertex in $G'$. All the copies of a vertex are connected in $G'$. Furthermore, every copy of a vertex $v_i$ is connected to every copy of vertex $v_j$ if and only if $v_i$ and $v_j$ are connected in $G$. The maximum degree of $G'$ is thus $O(d)$.
Now consider a subgraph query $\pi$ in $G$. By definition, $\pi$ is a connected subgraph of $G$ and, w.l.o.g., we can assume $\pi$ is a tree. We can form a walk $W$ from $\pi$ by following a DFS ordering of $\pi$ such that $W$ traverses every edge of $\pi$ at most twice and visits every vertex of $\pi$. We observe that we can realize $W$ as a path in $G'$ by utilizing the dummy vertices; as each vertex has $2d$ dummy vertices, every visit to a vertex in $G$ can be replaced by a visit to a distinct dummy vertex.
5.1 The Upper Bound
Formally, we have the following result.
Lemma 5.2.
Consider a degree-bounded catalog graph in which each vertex is associated witha planar subdivision. Let n be the total complexity of the subdivisions. We can build a datastructure using O ( n ) space such that given any query ( q, π ) , where q is a query point and π isa path, all regions containing q along π can be reported in time O (log n + | π |√ log n ) .Proof. We build the data structure essentially the same way as in the proof of Lemma 4.1. Theonly difference is that the degree of a node is bounded by a d ≥ r log d log r )times and we can obtain a O ( n ) space bounded data structure achieving O (log n + | π |√ log n )query time by creating an intersection sensitive ( r d /n i )-cutting for each planar subdivision A i and balancing the query time of cutting cells and conflict lists.By Lemma 5.1, we can also obtain the following two corollaries. Corollary 5.1.
Consider a catalog graph in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions. We can build a data structure using $O(n)$ space such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a connected subgraph, all regions containing $q$ along $\pi$ can be reported in time $O(\log n + |\pi|\sqrt{\log n})$. Specifically, for catalog trees we have the following:
Corollary 5.2.
Consider a catalog tree in which each vertex is associated with a planar subdivision. Let $n$ be the total complexity of the subdivisions. We can build a data structure using $O(n)$ space such that given any query $(q, \pi)$, where $q$ is a query point and $\pi$ is a subtree, all regions containing $q$ along $\pi$ can be reported in time $O(\log n + |\pi|\sqrt{\log n})$.
In this section, we show that the $\sqrt{\log n}$ factor that exists in Lemma 5.2, Corollary 5.1, and Corollary 5.2 is tight. As in the proof for path queries on catalog trees, we need a reduction from a rectangle stabbing problem to a 2D OFC subtree query problem on catalog trees. But unlike previous proofs, we use an instance of the rectangle stabbing problem in a much higher dimension. We show the lower bound for subtree queries of a catalog tree. By Lemma 5.1, this also gives a lower bound for path queries in general catalog graphs. Lemma 5.3.
Assume, given any catalog tree of height √ log n ≤ h ≤ log n in which each vertexis associated with a planar subdivision with n being the total complexity of the subdivisions, wecan build a data structure that satisfies the following: it uses at most n ε √ log n space, for a smallenough constant ε , and it can answer 2D OFC queries ( q, π ) , where q is a query point and π isa subtree containing b = 2 h/ leaves. Then, its query time must be Ω( | π |√ log n ) .Proof. We define the following special (2 + b )-dimensional rectangle stabbing problem. Theinput consists of n rectangles in (2 + b ) dimensions. According to their shapes, rectangles aredivided into h (cid:48) = h − r sets of size n/h (cid:48) each where r is a parameter to be determined later. Therectangles in each set are further divided into b groups of size n/ ( h (cid:48) b ) each. All the rectanglesin the same group are pairwise disjoint and they together tile the (2 + b )-dimensional unit cube.We put restrictions on the shapes of the input rectangles to make this problem special. Fora rectangle in set i , i = 0 , , · · · , h (cid:48) −
1, group j , j = 0 , , · · · , b −
1, except for the first twoand the (2 + j )-th dimensions, its other side lengths are all set to be 1. The side length of the(2 + j )-th dimension is set to be 1 / i + r . We put restriction on the side lengths of the first twodimensions in set i as follows: For the first group j = 0 in this set, we put no restrictions of16he first two dimensions as long as they tile the unit cube and the total number of rectanglesused for this group is n/ ( h (cid:48) b ). For an arbitrary group j , we cut the range of the unit cube inthe (2 + j )-th dimension into 2 i + r equal length pieces. This partitions the unit cube into 2 i + r parts. Note that each part of the unit cube is also tiled by rectangles since we require the sidelength of the (2 + j )-th dimension of the rectangles to be 1 / i + r . If we project the rectanglesin each part of the unit cube into the first two dimensions, we obtain 2 i + r axis-aligned planarsubdivisions. The planar subdivisions we generated for the first group is used as a blueprint forthe shape of rectangles in other groups. More specifically, for the remaining groups in set i , werequire that the choices of the first two dimensions give the same set of 2 i + r planar subdivisionsas the first group. The problem is as follows: Given a point in (2 + b )-dimensions, find all therectangles containing this query point.We now describe a reduction from this problem to a 2D OFC subtree query problem oncatalog trees. We consider a complete balanced binary tree of height h = h (cid:48) + r . Note that thenumber of nodes at layer i + r of the tree is the same as the number of different subdivisionswe get by projecting a group in set i to the first two dimensions. Since we require that all thegroups in the same set yield the same set of subdivisions, we can simply attach the subdivisionsto the nodes starting from layer r . For nodes in layer smaller than r , we attach them withempty subdivisions. 
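The correspondence between the pieces of one coordinate axis and the nodes of the complete binary tree can be sketched as follows. This is only an illustration under the assumption that the nodes of each layer are indexed from left to right starting at 0 and that piece number p of a layer is attached to node number p; the function names are hypothetical.

```python
def node_at_layer(x, layer):
    """Index of the piece containing x in [0, 1) when [0, 1) is cut
    into 2**layer equal-length pieces."""
    return int(x * (1 << layer))

def root_to_leaf_path(x, height):
    """Nodes (layer, index) selected by coordinate x, one per layer."""
    return [(layer, node_at_layer(x, layer)) for layer in range(height + 1)]

def is_child(parent, child):
    """True if `child` is one of the two children of `parent`."""
    (pl, pi), (cl, ci) = parent, child
    return cl == pl + 1 and ci // 2 == pi

def query_subtree(coords, height):
    """Union of the paths selected by several coordinates (one per group)."""
    nodes = set()
    for x in coords:
        nodes.update(root_to_leaf_path(x, height))
    return nodes
```

Because halving a piece always yields two pieces inside it, consecutive nodes selected by the same coordinate are in a parent-child relation, so a single coordinate selects a root-to-leaf path and b coordinates select a subtree with at most b leaves.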
Now let us analyze a rectangle stabbing query q on the rectangle stabbingproblem. Consider rectangles in set i , we need to find the rectangle containing q in each of the b groups. In this special rectangle stabbing problem, to find the rectangle in group j containing q , we can find the rectangle by first using the (2 + j )-th coordinate of q to find the part of theunit cube where q is in, and then finding the output rectangle by a simple planar point locationon the projection of the part using the first two coordinates of q . By our construction this isequivalent to choosing a node in layer i + r of the binary tree and performing a point locationquery on the subdivision attached to it. Note that the node in layer i + r + 1 we choose mustbe one of the children of the chosen node in layer i + r . So if we only focus on one specificgroup j of all sets, the rectangle stabbing query corresponds to a series of point location queriesfrom the root to a leaf in the binary tree we constructed. Similarly, we obtain b such paths ifwe consider all groups and they together form a subtree of b leaves. The answer to the pointlocation queries along the subtree gives the answer to the rectangle stabbing problem.We describe a hard high dimensional rectangle stabbing problem instance. As before, wecreate rectangles of different shapes to tile the unit cube. But this time, we will consider a(2 + b )-dimensional rectangle stabbing problem. For rectangles in class i = 0 , · · · , h (cid:48) /r − j = 0 , · · · , r −
1, we create the following shapes:[ 1 K j × K j · ir + j + r · V × ir + j + r × × × · · · × K j × K j · ir + j + r · V × × ir + j + r × × · · · × K j × K j · ir + j + r · V × × × · · · × × ir + j + r ] b shapeswhere K, V are parameters to be determined later. Note that all the rectangles are in (2 + b )dimensions.We use each of the shape to tile a unit cube. Since the volume of any rectangle is V , weneed 1 /V rectangles of the same shape to tile the cube. We call it a cluster. Note that therectangles in the same cluster are pairwise disjoint. We generated h (cid:48) b/V rectangles in total. Bysetting V = h (cid:48) b/n , the total number of rectangles is n . Note that any point in the unit cube iscontained in exactly t = h (cid:48) b rectangles.Now we shall analyze the volume of the intersection between any two (2 + b )-dimensional17ectangles. Note that if two rectangles have the same side lengths for b − b dimensions, then it is the case we have analyzed in the proof of, e.g., Lemma 4.2, and thevolume of the intersection of any two rectangles is bounded by V /K if we set K = 2 r . Now weanalyze the other case. By our construction, two rectangles can only have at most two differentside lengths in the last b dimensions. We consider two rectangles in class i supercluster j ,and class i supercluster j respectively. W.l.o.g., we assume j ≥ j . The case for j ≤ j issymmetric. Then there are two possible expressions for the intersection volume depending onthe values of i and i . The first one is1 K j × K j · i r + j + r · V × i r + j + r × i r + j + r = V i r + j + r ≤ VK .
The second possible expression is1 K j × K j · i r + j + r · V × i r + j + r × i r + j + r = VK j − j × i r + j + r ≤ VK .
The last inequality holds because j ≥ j .To make this construction well-defined, no side length of the rectangles can exceed 1. Thelargest side length can only be obtained in the second dimension when i = h (cid:48) /r − j = r − K r − h (cid:48) + r − V ≤ . By plugging in the values V = h (cid:48) b/n and K = 2 r we get that we must have2 r − r h (cid:48) + r − h (cid:48) b ≤ n (5)Since by our assumptions h (cid:48) ≤ log n , b ≤ h/ ≤ log n/ , it follows that by setting r = √ log n ,the inequality (5) holds.If γh (cid:48) b ≥ Q ( n ) holds, then the first condition of Theorem 2.3 is satisfied and we obtain thelower bound of S ( n ) = Ω (cid:18) tv − O ( γ ) (cid:19) = Ω (cid:18) n r O ( γ ) (cid:19) . Now if we set γ = δ √ log n for a sufficiently small δ >
0, the data structure must use Ω( n Ω( √ log n ) )space, which contradicts the space usage stated in our lemma. Note that | π | = Θ(( h (cid:48) + r ) b ) =Θ( h (cid:48) b ). Then Q ( n ) ≥ γh (cid:48) b = Ω( | π |√ log n ). Remark 5.1.
Note that the lower bound holds even when the query path is of length ≥ √ log n and ≤ log n . We have already established this lower bound in Lemma 4.2. Combining Lemma 5.1 and Lemma 5.3, we immediately have the following corollary:
Corollary 5.3.
Assume, given any degree-bounded catalog graph in which each vertex is associated with a planar subdivision with $n$ being the total complexity of the subdivisions, we can build a data structure that satisfies the following: it uses at most $n 2^{\varepsilon\sqrt{\log n}}$ space, for a small enough constant $\varepsilon$, and it can answer 2D OFC queries $(q, \pi)$, where $q$ is a query point and $\pi$ is a path. Then, its query time must be $\Omega(|\pi|\sqrt{\log n})$.
For the linear space data structure we obtained for general path queries of trees (Corollary 4.1), there is a tiny inverse Ackermann gap between the query time we obtain and the lower bound. It is an interesting problem whether we can get rid of that term or improve the lower bound.
The problem we consider is very general in the sense that the only restriction we place on the input instance is that the graph subdivision complexity is $n$. Some special cases admit better solutions. For example, if we require the subdivision complexity of each vertex of the graph to be asymptotically the same, we can obtain an $O(n)$ space and $O(\log n + |\pi| \log\log n)$ query time data structure for path queries on catalog trees of height $h \le \log n$, which is better than the linear space $O(\log n + |\pi|\sqrt{\log n})$ query time data structure we obtained in the general case (Lemma 4.1). This is done by creating an intersection sensitive $(\log n/n_i)$-cutting $C_i$ for each subdivision $A_i$ in the tree and then storing all cutting cells on each path using the data structure in Theorem 3.1 and building point location data structures on the conflict list of each cell.
Higher dimensional generalization of our results is another direction. In 2D, we can transform an axis-aligned planar subdivision to a subdivision consisting of only rectangles by increasing the subdivision complexity by only a constant factor; however, this is not the case in 3D. On the other hand, for 3D point locations on orthogonal subdivisions, we have Rahul's $O(\log^{3/2} n)$ query time and linear space data structure [27] in the standard pointer machine model. Recently, the query time was improved to $O(\log n)$ by Chan et al. [10], but they use a stronger arithmetic pointer machine model. Given that the higher dimensional counterparts of the tools we use for 2D are suboptimal, it is a challenging and interesting problem to see how the results will be in higher dimensions.
Other open problems include considering the dynamization of our results, i.e., to support insertion and deletion, and other computational models, e.g., RAM and I/O model.

References
[1] P. Afshani. Improved pointer machine and I/O lower bounds for simplex range reporting and related problems. In Proceedings of the twenty-eighth annual Symposium on Computational Geometry, pages 339–346, 2012.
[2] P. Afshani, L. Arge, and K. D. Larsen. Orthogonal range reporting: query lower bounds, optimal structures in 3-d, and higher-dimensional improvements. In Proceedings of the twenty-sixth annual symposium on Computational geometry, pages 240–246, 2010.
[3] P. Afshani, L. Arge, and K. G. Larsen. Higher-dimensional orthogonal range reporting and rectangle stabbing in the pointer machine model. In Proceedings of the twenty-eighth annual Symposium on Computational Geometry, pages 323–332, 2012.
[4] P. Afshani, T. M. Chan, and K. Tsakalidis. Deterministic rectangle enclosure and offline dominance reporting on the ram. In International Colloquium on Automata, Languages, and Programming, pages 77–88. Springer, 2014.
[5] P. Afshani, C. Sheng, Y. Tao, and B. T. Wilkinson. Concurrent range reporting in two-dimensional space. In Proceedings of the twenty-fifth annual ACM-SIAM Symposium on Discrete algorithms, pages 983–994. SIAM, 2014.
[6] P. K. Agarwal. Partitioning arrangements of lines II: Applications. Discrete & Computational Geometry, 5(6):533–573, 1990.
[7] P. K. Agarwal. Geometric partitioning and its applications. Duke University, 1991.
[8] J. L. Bentley. Decomposable searching problems. Information Processing Letters, 8(5):244–251, 1979.
[9] T. M. Chan, K. G. Larsen, and M. Pătraşcu. Orthogonal range searching on the RAM, revisited. In Proceedings of the twenty-seventh annual symposium on Computational geometry, pages 1–10, 2011.
[10] [...], page 31. Schloss Dagstuhl-Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing, 2018.
[11] B. Chazelle. Filtering search: A new approach to query-answering. SIAM Journal on Computing, 15(3):703–724, 1986.
[12] B. Chazelle. Cutting hyperplanes for divide-and-conquer. Discrete & Computational Geometry, 9(2):145–158, 1993.
[13] B. Chazelle, H. Edelsbrunner, M. Grigni, L. Guibas, J. Hershberger, M. Sharir, and J. Snoeyink. Ray shooting in polygons using geodesic triangulations. Algorithmica, 12(1):54–68, 1994.
[14] B. Chazelle and J. Friedman. A deterministic view of random sampling and its use in geometry. Combinatorica, 10(3):229–249, 1990.
[15] B. Chazelle and L. J. Guibas. Fractional cascading: I. a data structuring technique. Algorithmica, 1(1-4):133–162, 1986.
[16] B. Chazelle and L. J. Guibas. Fractional cascading: II. applications. Algorithmica, 1(1-4):163–191, 1986.
[17] B. Chazelle and D. Liu. Lower bounds for intersection searching and fractional cascading in higher dimension. Journal of Computer and System Sciences, 68(2):269–284, 2004.
[18] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT press, 2009.
[19] M. de Berg, O. Cheong, M. van Kreveld, and M. Overmars. Computational Geometry: Algorithms and Applications. Springer-Verlag, 3 edition, 2008.
[20] M. de Berg and O. Schwarzkopf. Cuttings and applications. International Journal of Computational Geometry & Applications, 5(04):343–355, 1995.
[21] M. de Berg, M. van Kreveld, and J. Snoeyink. Two-and three-dimensional point location in rectangular subdivisions. In Scandinavian Workshop on Algorithm Theory, pages 352–363. Springer, 1992.
[22] P. F. Dietz and R. Raman. Persistence, amortization and randomization. In Proceedings of the second annual ACM-SIAM symposium on Discrete algorithms, pages 78–88, 1991.
[23] Y. Giyora and H. Kaplan. Optimal dynamic vertical ray shooting in rectilinear planar subdivisions. ACM Transactions on Algorithms, 5(3), 2009.
[24] J. Matoušek. Cutting hyperplane arrangements. Discrete & Computational Geometry, 6(3):385–406, 1991.
[25] J. Matoušek. Approximations and optimal geometric divide-and-conquer. Journal of Computer and System Sciences, 50(2):203–208, 1995.
[26] K. Mehlhorn and S. Näher. Dynamic fractional cascading. Algorithmica, 5(2):215–241, 1990.
[27] S. Rahul. Improved bounds for orthogonal point enclosure query and point location in orthogonal subdivisions in $\mathbb{R}^3$. In Proceedings of the twenty-sixth annual ACM-SIAM Symposium on Discrete Algorithms, pages 200–211. SIAM, 2014.
[28] D. D. Sleator and R. E. Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362–391, 1983.
[29] C. D. Toth, J. O'Rourke, and J. E. Goodman. Handbook of discrete and computational geometry. CRC press, 2017.
Appendices
A Proof of Theorem 2.3
Theorem 2.3.
Assume we have an algorithm that, given any input instance $R \subset R$ of $n$ ranges, can store $R$ in a data structure of size $S(n)$ such that any query $q \in U$ can be answered in $Q(n) + \gamma |R_q|$ time. Then, suppose we can construct an input set $R \subset R$ of $n$ ranges such that the following two conditions are satisfied: (i) every query point $q \in U$ is contained in exactly $|R_q| = t$ ranges and $\gamma t \ge Q(n)$; (ii) there exists a value $v$ such that for any two ranges $r_1, r_2 \in R$, $\mu(\{q \in U \mid r_1, r_2 \in R_q\})$ is well-defined and is upper bounded by $v$. Then, we must have $S(n) = \Omega(t v^{-1} / 2^{O(\gamma)}) = \Omega(Q(n) v^{-1} / 2^{O(\gamma)})$.
To prove this theorem, we first show a special property of the subgraph $M_q$ explored to answer a query $q \in U$. Note that in the pointer machine model, we begin the exploration with a special cell, called the root. If we consider only the first in-edge to any cell in $M_q$, we obtain a tree.
Lemma A.1.
Let $M_q$ be the explored subgraph corresponding to a query $q \in U$. We call the memory cells in $M_q$ containing reported ranges marked cells. Let a fork be a subtree of $M_q$ of size at most $c\gamma$ containing two marked cells, where $c$ is a large enough constant and $\gamma$ is the parameter in Theorem 2.3. Then $M_q$ can be decomposed into $\Omega(|R_q|)$ forks, where $R_q$ is the set of ranges containing $q$.
Proof. The proof is very similar to the one described in [1]. We generate the forks as follows. To every cell in $M_q$ we assign two values, mark and size. Each marked cell starts with mark value one and every other cell with mark value zero; every cell starts with size value one. Without loss of generality, we assume $M_q$ to be a tree. At every step, we choose an arbitrary leaf, add its mark value and size value to the corresponding values of its parent, and remove the leaf. If its parent has another child, we repeat this process until the parent becomes a leaf. If the parent then has mark value two and size value more than $c\gamma$, we remove the parent as well and do nothing else; we call this situation “wasted”. If the parent has mark value two and size value at most $c\gamma$, we have found a fork: we add it to the fork set and remove the subtree. If the parent has mark value less than two, we do nothing. Note that the parent cannot have mark value more than two, because that would mean one of its children had mark value at least two without having been added to a fork or wasted. We go on to the next step until we reach the root.
Let us consider how many marks are wasted. We only waste marks when we find a subtree that contains two marks but has size more than $c\gamma$, and when we reach the root with only one mark. Since $M_q$ contains at most $Q(n) + \gamma |R_q| \le 2\gamma |R_q|$ cells (recall that $\gamma |R_q| = \gamma t \ge Q(n)$ by assumption (i)), the number of marks wasted is bounded by $4\gamma |R_q| / (c\gamma) + 1 = 4|R_q|/c + 1$.
Other marks are all stored in forks, so the number of forks is at least $(|R_q| - 4|R_q|/c - 1)/2 = \Omega(|R_q|)$ for a sufficiently large $c$.
We also need another lemma, which follows directly from Lemma 1 in Afshani [1].
Lemma A.2.
The number of forks of size $O(\gamma)$ is $O(S(n) 2^{O(\gamma)})$.
Proof of Theorem 2.3.
Consider any query point $q \in U$. By definition, it is contained in a set $R_q$ of ranges. Consider the explored subgraph $M_q$ when answering $q$. By Lemma A.1, we can decompose $M_q$ into a set $F_q$ of $\Omega(|R_q|)$ forks such that each fork contains two output ranges. Note that for the two ranges of a fork to be output, $q$ must lie in the intersection of the two ranges. This holds for every fork in $F_q$, so $q$ is covered by these intersections $\Omega(|R_q|)$ times.
Since we can answer queries for all $q \in U$ and, by assumption (i), each $q$ is contained in $t$ ranges, the intersections of two ranges over all possible forks cover $U$ $\Omega(t)$ times. By Lemma A.2, the number of possible forks of size $O(\gamma)$ is $O(S(n) 2^{O(\gamma)})$. Each fork has $\binom{O(\gamma)}{2} = O(\gamma^2)$ ways to choose two ranges. By assumption (ii), the measure of the intersection of any two ranges is bounded by $v$. So by a simple measure argument, $O(\gamma^2) \cdot S(n) 2^{O(\gamma)} \cdot v = \Omega(t)$. This gives us $S(n) = \Omega(t v^{-1} / 2^{O(\gamma)})$. By our assumption (i), $\gamma t \ge Q(n)$, we also obtain $S(n) = \Omega(Q(n) v^{-1} / 2^{O(\gamma)})$.
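The greedy procedure in the proof of Lemma A.1 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the explored tree $M_q$ is assumed to be given as a parent array, and the names `decompose_into_forks`, `parent`, `marked`, and `limit` (standing for $c\gamma$) are hypothetical and do not appear in the paper. The sketch processes cells bottom-up instead of picking arbitrary leaves, and it triggers on mark value at least two instead of exactly two; both are simplifying assumptions.

```python
def decompose_into_forks(parent, marked, limit):
    """Greedy bottom-up fork decomposition of a rooted tree (illustrative).

    parent[v] is the parent of cell v (-1 for the root), marked is the set of
    cells holding a reported range, and limit plays the role of c * gamma.
    Returns (forks, wasted_marks), where each fork is a sorted list of cells.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = -1
    for v, p in enumerate(parent):
        if p == -1:
            root = v
        else:
            children[p].append(v)

    # Breadth-first order: every cell appears after its parent, so the
    # reversed order visits all descendants of a cell before the cell itself.
    order = [root]
    for v in order:
        order.extend(children[v])

    mark = [1 if v in marked else 0 for v in range(n)]
    members = [[v] for v in range(n)]  # cells not yet removed at or below v
    forks, wasted = [], 0

    for v in reversed(order):
        if mark[v] >= 2:  # assumption: "at least two" rather than "exactly two"
            if len(members[v]) <= limit:
                forks.append(sorted(members[v]))  # small enough: a fork
            else:
                wasted += mark[v]  # too large: these marks are wasted
            mark[v], members[v] = 0, []  # remove the whole subtree
        p = parent[v]
        if p != -1:
            # Merge the surviving mark and size values into the parent.
            mark[p] += mark[v]
            members[p].extend(members[v])
        else:
            wasted += mark[v]  # a single mark may be left over at the root
    return forks, wasted


# A complete binary tree on 7 cells whose four leaves are marked.
forks, wasted = decompose_into_forks([-1, 0, 0, 1, 1, 2, 2], {3, 4, 5, 6}, 3)
print(forks, wasted)  # [[2, 5, 6], [1, 3, 4]] 0
```

In this example every fork has size 3 and contains two marked cells, so no mark is wasted; lowering `limit` below 3 makes both subtrees count as wasted instead.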