Improving the dilation of a metric graph by adding edges
Joachim Gudmundsson and Sampson Wong
University of Sydney, [email protected], [email protected]
Abstract
Most of the literature on spanners focuses on building the graph from scratch. This paper instead focuses on adding edges to improve an existing graph. A major open problem in this field is: given a graph embedded in a metric space, and a budget of $k$ edges, which $k$ edges do we add to produce a minimum-dilation graph? The special case where $k = 1$ has been studied in the past, but no major breakthroughs have been made for $k > 1$. We provide the first positive result, an $O(k)$-approximation algorithm that runs in $O(n^3 \log n)$ time.

Let $G = (V, E)$ be a graph embedded in a metric space $M$. For every pair of points $u, v \in V$, the weight of the edge $(u, v)$ is equal to the distance $d_M(u, v)$ between points $u$ and $v$ in the metric space $M$. Let $d_G(u, v)$ be the weight of the shortest path between $u$ and $v$ in the graph $G$. For any real number $t > 1$, we call $G$ a $t$-spanner if $d_G(u, v) \le t \cdot d_M(u, v)$ for every pair of points $u, v \in V$. The stretch, or dilation, of $G$ is the smallest $t$ for which $G$ is a $t$-spanner.

Spanners have been studied extensively in the literature, especially in the geometric setting. Given a fixed $t > 1$, a fixed dimension $d \ge 1$, and a set of $n$ points $V$ in $d$-dimensional Euclidean space, there is a $t$-spanner on the point set $V$ with $O(n)$ edges. For a summary of the considerable research on geometric spanners, see the surveys [5, 9, 19] and the book by Narasimhan and Smid [17]. Spanners in doubling metrics [4, 8, 11] and in general graphs [3, 18, 20] have also received considerable attention.

Most of the literature on spanners focuses on building the graph from scratch. This paper instead focuses on adding edges to improve an existing graph. Networks that tend to become better connected over time include road, rail, electric and communication networks. The overall quality of these networks depends on both the quality of the initial design and the quality of the additions. In this paper, we focus on the latter. In particular, given an initial metric graph, and a budget of $k$ edges, which $k$ edges do we add to produce a minimum-dilation graph?

Figure 1: An example where $k = 2$ edges (red) are added to an initial graph $G$ (black) to produce a minimum-dilation graph.

Problem 1. Given a positive integer $k$ and a metric graph $G = (V, E)$, compute a set $S \subseteq V \times V$ of $k$ edges so that the dilation of the resulting graph $G' = (V, E \cup S)$ is minimised.

The problem stated is a major open problem in the field [6, 15, 21]. It is also one of twelve open problems posed in the final chapter of Narasimhan and Smid's book [17]. As no major breakthroughs have been made, special cases have been studied.

The first special case is when $k = 1$. Let $n$ and $m$ be the number of vertices and edges of the graph $G$, respectively. Farshi et al. [6] provided an $O(n^4)$ time exact algorithm and an $O(mn + n^2 \log n)$ time 3-approximation. Wulff-Nilsen [21] improved the running time of the exact algorithm to $O(n^3 \log n)$, and in a follow-up paper Luo and Wulff-Nilsen [15] provided an $O((n^4 \log n)/\sqrt{m})$ time exact algorithm that uses linear space.
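To make the definitions of $d_G$, $t$-spanner and dilation concrete, the following is a minimal sketch that computes the dilation of a small graph. It assumes points in the Euclidean plane (one particular metric space) and uses Floyd-Warshall for the shortest paths; the function name `dilation` and the input encoding are illustrative choices, not part of the paper.

```python
import math
from itertools import combinations


def dilation(points, edges):
    """Dilation of the graph on `points` (distinct coordinate tuples) with
    undirected `edges` given as index pairs. Assumes the Euclidean metric."""
    n = len(points)

    def d_M(u, v):
        # Metric distance between points u and v.
        return math.dist(points[u], points[v])

    # d[u][v] will hold d_G(u, v); start from the edge weights.
    d = [[0.0 if u == v else math.inf for v in range(n)] for u in range(n)]
    for u, v in edges:
        d[u][v] = d[v][u] = d_M(u, v)

    # Floyd-Warshall all-pairs shortest paths.
    for w in range(n):
        for u in range(n):
            for v in range(n):
                if d[u][w] + d[w][v] < d[u][v]:
                    d[u][v] = d[u][w] + d[w][v]

    # The dilation is the largest ratio d_G(u, v) / d_M(u, v) over all pairs.
    return max(d[u][v] / d_M(u, v) for u, v in combinations(range(n), 2))


# Hypothetical instance: a path around three sides of the unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
path = [(0, 1), (1, 2), (2, 3)]
print(dilation(square, path))             # pair (0, 3): path length 3, distance 1
print(dilation(square, path + [(3, 0)]))  # closing the square lowers the dilation
```

In this toy instance, adding the single edge between the two path endpoints is the $k = 1$ augmentation that reduces the dilation from 3 to $\sqrt{2}$.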
Several of the papers that study the $k = 1$ case mention the $k > 1$ case as an open problem.

The second special case is when $G$ is an empty graph. Giannopoulos et al. [7] and Gudmundsson and Smid [10] independently proved that it is NP-hard to produce the highest quality spanner by adding $k$ edges to an empty graph. This implies that Problem 1 is NP-hard. If we restrict ourselves to polynomial time algorithms, it therefore makes sense to consider approximation algorithms. In Euclidean space, Aronov et al. [2] showed how to add $k = n - 1 + \ell$ edges to an empty graph to produce an $O(n/(\ell + 1))$-spanner in $O(n \log n)$ time. By setting $\ell = \varepsilon n$, this result implies an $O(1/\varepsilon)$-approximation to Problem 1 for all $k \ge (1 + \varepsilon) n$. However, the general case where $G$ is a non-empty (Euclidean or metric) graph and $k \le n - 1$ is not covered by these results.

Farshi et al. [6] conjectured that generalising their algorithm to general $k$ may provide a reasonable approximation algorithm. In the appendix, we show an $\Omega(2^k)$ lower bound for their algorithm.

In this paper we obtain the first positive result for the general case. Our approximation algorithm runs in $O(n^3 \log n)$ time and guarantees an $O(k)$-approximation factor. Although our algorithm may not be optimal, we hope that we provide some insight for further research, or for related graph augmentation problems [1, 12, 13, 14].

We provide a tight analysis of our algorithm. We show that, for any $\varepsilon > 0$, our algorithm yields an approximation factor of $(1 + \varepsilon)(k + 1)$, but the same algorithm cannot yield an approximation factor better than $(1 - \varepsilon)(k + 1)$. We achieve our main result by reducing Problem 1 to the following approximate decision version:

Problem 2.
Given an integer $k$, a real number $t$, and a metric graph $G = (V, E)$, decide whether $t^* \le t$ or $t^* > \frac{t}{k+1}$, where $t^*$ is the minimum dilation of $G' = (V, E \cup S)$ over all sets $S \subseteq V \times V$ with $|S| = k$. In the case where $\frac{t}{k+1} < t^* \le t$, either of the two options may be chosen arbitrarily.

Our algorithm for Problem 2 is a slight modification of the standard greedy $t$-spanner algorithm. We provide details of our algorithm and argue its correctness in Section 2. In Section 3, we show how to use the approximate decision algorithm for Problem 2 to develop an approximation algorithm for Problem 1. We prove that only $O(\log n)$ calls to the greedy algorithm are required to obtain a $(1 + \varepsilon)(k + 1)$-approximation. Finally, in Section 4, we provide a construction to show that the same algorithm cannot yield an approximation factor better than $(1 - \varepsilon)(k + 1)$.

As mentioned in the introduction, our approach to solving Problem 2 is a modified greedy $t$-spanner construction. We introduce some notation for the purposes of stating the algorithm. For an edge $e \in V \times V$, let $d_M(e)$ denote the length of the edge $e$ in the metric space $M$. Given a graph $G$, let $\delta_G(e)$ denote the shortest path between the endpoints of $e$ in the graph $G$. Let $d_G(e)$ be the total length of edges along the path $\delta_G(e)$.

In the original greedy spanner construction, the algorithm begins with an empty graph $G$ and a positive real value $t > 1$, and yields a $t$-spanner as follows: sort all the edges in $V \times V$ by increasing weight and then process them in order. Processing an edge $e$ entails a shortest path query. If $d_G(e) > t \cdot d_M(e)$, then the edge $e$ is added to $G$, otherwise it is discarded. The algorithm terminates when all edges have been processed. The resulting graph is a $t$-spanner.

In our setting we start with an initial graph $G$, a positive real value $t > 1$ and a positive integer $k$. Our modified greedy algorithm sorts the edges in $(V \times V) \setminus E$ by increasing weight and then processes them in order. For each edge $e$, we perform a shortest path query. If $d_G(e) > t \cdot d_M(e)$, then the edge $e$ is added to $G$, otherwise it is discarded. The algorithm terminates if all edges have been processed, or if $k + 1$ edges have been added to $G$ by the algorithm.

Formally, if $a_i$ is the $i$th greedy edge to be added, and $G_i$ is the graph after adding $\{a_1, a_2, \ldots, a_i\}$ to $G$, then we have:

Definition 1.
Let $G_0 = G$, and for $1 \le i \le k + 1$, let $G_i = G_{i-1} \cup \{a_i\}$ where $a_i$ is the shortest edge in $V \times V$ satisfying $d_{G_{i-1}}(a_i) > t \cdot d_M(a_i)$.

If the algorithm terminates after all the edges have been processed, then at most $k$ edges have been added to yield a $t$-spanner. Therefore $t^* \le t$. Otherwise, if at least $k + 1$ edges are added, we will prove in Section 2.1 that $t^* > \frac{t}{k+1}$.

Our approach is to use the edges added by the greedy algorithm to obtain an upper bound on $t$ with respect to $t^*$. Our upper bound comes from the following relationship, which is a straightforward consequence of Definition 1:

Observation 1.
In the graph $G_{i-1}$, if there is a connected path between the endpoints of $a_i$ with total length $L$, then $L > t \cdot d_M(a_i)$.

Our goal is to construct a connected path in $G_{i-1}$ between the endpoints of $a_i$ and to bound its length by $(k + 1) t^* \cdot d_M(a_i)$. If we are able to do this, then Observation 1 would immediately imply that $(k + 1) t^* > t$, as required.

To motivate how we construct a connected path in $G_{i-1}$ between the endpoints of $a_i$, let us consider a special case where $k = 1$. Let $G$ be the initial graph and let the first two greedy edges be $a_1$ and $a_2$. Suppose that an optimal edge to add is $s_1$, and let $G^* = G \cup \{s_1\}$. See Figure 2.

Figure 2: The graph $G$ with optimal edge $s_1$ and greedy edges $a_1$ and $a_2$.

If we set $i = 2$ in Observation 1, then our goal is to construct a connected path in $G_1 = G \cup \{a_1\}$ between the endpoints of $a_2$ and upper bound its length by $2 t^* \cdot d_M(a_2)$.

A naïve connected path between the endpoints of $a_2$ that has length upper bounded by $t^* \cdot d_M(a_2)$ is the path $\delta_{G^*}(a_2)$, which we recall is the shortest path between the endpoints of $a_2$ in the graph $G^*$. The path is shown in Figure 2. The inequality $d_{G^*}(a_2) \le t^* \cdot d_M(a_2)$ holds because the dilation of $G^*$ is $t^*$. Unfortunately, this path uses the edge $s_1$ and therefore is not a path in $G_1$, so Observation 1 does not apply.

We modify $\delta_{G^*}(a_2)$ into a longer path that does not use $s_1$. Our approach is to combine the path with a cycle by using the symmetric difference operation. Recall that the symmetric difference of a collection of sets is the set of all elements that appear in an odd number of those sets.

To remove $s_1$ from the path $\delta_{G^*}(a_2)$, we take its symmetric difference with the cycle $\gamma_1$, which is formed by linking the path $\delta_{G^*}(a_1)$ and the edge $a_1$ end to end. Ideally, the symmetric difference of $\delta_{G^*}(a_2)$ and $\gamma_1$ would form a connected path between the endpoints of $a_2$. Moreover, if both the path $\delta_{G^*}(a_2)$ and the cycle $\gamma_1$ use the edge $s_1$ exactly once, then taking the symmetric difference cancels the two occurrences of $s_1$, leaving a path that is entirely in $G_1$.

In fact, we can show this approach works in general. We begin with the naïve path $\delta_{G^*}(a_i)$, where $G^*$ is the optimal graph defined as follows:

Definition 2.
Let $S \subseteq V \times V$ be a set of $k$ edges so that $G \cup S$ has dilation $t^*$. Then $G^* = G \cup S$.

Similar to the $k = 1$ case, the path $\delta_{G^*}(a_i)$ is in general not a path in the graph $G_{i-1}$. We modify the path $\delta_{G^*}(a_i)$ by taking its symmetric difference with a set of cycles. We prove that for any set of cycles, the symmetric difference of $\delta_{G^*}(a_i)$ and the set of cycles always contains a connected path between the endpoints of $a_i$. Moreover, we show how to select the set of cycles in such a way that all edges in $S$ are cancelled out by the symmetric difference. In this way, we construct a connected path in the graph $G_{i-1}$ between the endpoints of $a_i$.

We first prove that taking the symmetric difference of $\delta_{G^*}(a_i)$ with any set of cycles maintains the invariant that there always exists a connected path between the endpoints of $a_i$.

Lemma 1.
The symmetric difference of a path $P$ with any number of cycles contains a connected path between the endpoints of $P$. See Figure 3.

Figure 3: Given a path (black) and cycles (red, green, blue), the symmetric difference (solid) contains a connected path between the endpoints of the black path.
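As a small sanity check of the statement (not a proof), the sketch below takes the symmetric difference of a path with two cycles and verifies that the only odd-degree vertices are the path's endpoints and that those endpoints remain connected. The helper names and the toy instance are hypothetical; edges are modelled as unordered vertex pairs.

```python
from collections import defaultdict, deque
from functools import reduce


def edge_set(vertices, closed=False):
    """Edges of the walk through `vertices`; a cycle if `closed` is True."""
    pairs = list(zip(vertices, vertices[1:]))
    if closed:
        pairs.append((vertices[-1], vertices[0]))
    return {frozenset(p) for p in pairs}


def symmetric_difference(edge_sets):
    """Edges that appear in an odd number of the given edge sets."""
    return reduce(lambda a, b: a ^ b, edge_sets, set())


def odd_degree_vertices(edges):
    deg = defaultdict(int)
    for e in edges:
        for v in e:
            deg[v] += 1
    return {v for v, k in deg.items() if k % 2 == 1}


def connected(edges, s, t):
    """Breadth-first search: is t reachable from s using only `edges`?"""
    adj = defaultdict(list)
    for e in edges:
        u, v = tuple(e)
        adj[u].append(v)
        adj[v].append(u)
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return t in seen


# Hypothetical instance: path 0-1-2-3 and two triangles sharing edges with it.
P = edge_set([0, 1, 2, 3])
cycles = [edge_set([1, 2, 4], closed=True), edge_set([2, 3, 5], closed=True)]
D = symmetric_difference([P] + cycles)
print(odd_degree_vertices(D), connected(D, 0, 3))
```

Here the shared edges 1-2 and 2-3 cancel, and the remaining edges form the detour 0-1-4-2-5-3 between the original endpoints.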
Proof. Consider the subgraph formed by the symmetric difference of $P$ and a set of cycles, and consider the parity of the degree of each vertex in this subgraph. Taking the symmetric difference preserves the parity of the sum of the degree contributions at each vertex. The contribution of a cycle to the degree of every vertex is even, whereas the contribution of $P$ to the degree of every vertex is even except for the two endpoints of $P$. Hence, the only two vertices with odd degree are the endpoints of $P$. Since every connected component has an even number of odd-degree vertices, both endpoints lie in the same connected component. Applying Euler's theorem to the connected component that contains the endpoints of $P$, we deduce that there is an Eulerian trail between the two vertices of odd degree. Hence, there is a connected path between the endpoints of $P$.

Next, we construct the set of cycles $\Gamma = \{\gamma_j : 1 \le j \le k + 1\}$. We will apply Lemma 1 to our naïve connected path $\delta_{G^*}(a_i)$ and a subset of $\Gamma$. Each cycle $\gamma_j$ is simply a generalisation of $\gamma_1$ from the $k = 1$ case, which we recall is formed by linking the path $\delta_{G^*}(a_1)$ and the edge $a_1$ end to end.

Definition 3.
Let $\gamma_j$ be the cycle formed by linking the path $\delta_{G^*}(a_j)$ and the edge $a_j$ end to end.

A subset of $\Gamma$ is chosen in such a way that the symmetric difference with $\delta_{G^*}(a_i)$ consists only of edges in $G_{i-1}$. To ensure this, we must choose the path $\delta_{G^*}(a_i)$ and the cycles $\gamma_j$ so that all edges in $S$ cancel out in the symmetric difference. We use elementary linear algebra to provide a non-constructive proof that there exists a path $\delta_{G^*}(a_i)$ and a subset of $\Gamma$ where this property holds.

Lemma 2.
Let $\{a_1, a_2, \ldots, a_{k+1}\}$ be the first $k + 1$ edges given in Definition 1. Then there exists a non-empty subset $I \subseteq \{1, 2, \ldots, k + 1\}$ so that the symmetric difference of $\{\delta_{G^*}(a_j) : j \in I\}$ does not contain any edges of $S$.

Proof. Consider $\delta_{G^*}(a_j) \cap S$, which is a subset of $S$. We can represent any subset of $S$ as an element of the vector space $\{0, 1\}^S$ over the field with two elements, as each binary digit simply represents whether an element is in that subset. Take the basis $\{1_s : s \in S\}$ for the vector space $\{0, 1\}^S$, where the basis element $1_s$ records whether the edge $s$ is in the subset. Hence, we can expand $\delta_{G^*}(a_j) \cap S$ into a sum of basis elements by writing $\delta_{G^*}(a_j) \cap S = \sum_{s \in S} \lambda_{js} 1_s$.

As there are $k + 1$ subsets $\delta_{G^*}(a_j) \cap S$ and the vector space has dimension $|S| = k$, their expansions $\sum_{s \in S} \lambda_{js} 1_s$ must be linearly dependent. The linear dependence equation, taken modulo 2, can be rearranged into the form $\sum_{j \in I} \delta_{G^*}(a_j) \cap S = 0$ for some $I \subseteq \{1, 2, \ldots, k + 1\}$ with $I \ne \emptyset$. This equation directly implies that the symmetric difference of $\{\delta_{G^*}(a_j) \cap S : j \in I\}$ is empty.

For the remainder of this section, let $I \subseteq \{1, 2, \ldots, k + 1\}$ be the subset that satisfies the conditions of Lemma 2; in other words, the symmetric difference of $\{\delta_{G^*}(a_j) : j \in I\}$ does not contain any edges of $S$. We select the path $\delta_{G^*}(a_i)$ where $i = \max I$. Let $J = I \setminus \{i\}$ and select the subset $\Gamma' = \{\gamma_j : j \in J\}$. We construct the set of edges that is the symmetric difference of $\delta_{G^*}(a_i)$ and $\Gamma'$. This completes the construction of the required path.

For an illustrated example, see Figure 4. Let $k = 3$ and $S = \{s_1, s_2, s_3\}$, so that $s_1 \in \delta_{G^*}(a_1)$, $s_2 \in \delta_{G^*}(a_2), \delta_{G^*}(a_3)$ and $s_3 \in \delta_{G^*}(a_3), \delta_{G^*}(a_4)$. By Lemma 2, there must be a non-empty subset $I \subseteq \{1, 2, 3, 4\}$ so that the symmetric difference of $\{\delta_{G^*}(a_j) : j \in I\}$ does not contain any of the edges $s_1$, $s_2$ or $s_3$.
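Although the proof of Lemma 2 is non-constructive, a subset $I$ of this kind can be found by Gaussian elimination modulo 2. The sketch below is one way to do this, under the assumption that each intersection $\delta_{G^*}(a_j) \cap S$ is encoded as an integer bitmask (bit $b$ set when the $b$th edge of $S$ lies on the path). The function name `dependent_subset` and the instance are hypothetical and not taken from the paper.

```python
def dependent_subset(masks):
    """Return a non-empty list of indices whose masks XOR to zero,
    or None if the masks are linearly independent over GF(2)."""
    # pivot bit -> (reduced vector, bitmask of input indices XORed to obtain it)
    basis = {}
    for idx, vec in enumerate(masks):
        combo = 1 << idx  # invariant: vec equals the XOR of masks[i] for i in combo
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            basis_vec, basis_combo = basis[pivot]
            vec ^= basis_vec
            combo ^= basis_combo
        else:
            # vec was reduced to zero, so the indices in combo cancel out.
            return [i for i in range(len(masks)) if (combo >> i) & 1]
    return None


# Hypothetical instance with k = 3 (bits 0, 1, 2 stand for three edges of S)
# and k + 1 = 4 paths, indexed from 0:
#   path 0 uses edge 0; path 1 uses edge 1; path 2 uses edges 1 and 2; path 3 uses edge 2.
masks = [0b001, 0b010, 0b110, 0b100]
print(dependent_subset(masks))
```

With $k + 1$ masks over only $k$ bits a dependency always exists, which is the counting argument used in the proof; the largest index returned plays the role of $i = \max I$.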
In particular, the subset $I = \{2, 3, 4\}$ includes $s_1$ zero times, and $s_2$ and $s_3$ both twice. Hence, the symmetric difference of $\delta_{G^*}(a_4)$ with the cycles $\Gamma' = \{\gamma_2, \gamma_3\}$ avoids all three of the edges $s_1$, $s_2$, and $s_3$.

Figure 4: An example where we take the symmetric difference of $\delta_{G^*}(a_4)$, $\gamma_2$ and $\gamma_3$ to avoid all three of the edges $s_1$, $s_2$ and $s_3$ that are not in $G$.

Now we show this symmetric difference indeed satisfies the conditions of Observation 1, so that it can be applied to yield an upper bound on $t$ with respect to $t^*$. Recall that the requirements of Observation 1 are that the set of edges must contain a connected path between the endpoints of $a_i$ that uses only edges in $G_{i-1}$. By Lemma 1, the symmetric difference contains a connected path between the endpoints of $a_i$. By Lemma 2, the symmetric difference of $\{\delta_{G^*}(a_j) \cap S : j \in I\}$ is empty, so the symmetric difference of $\{\delta_{G^*}(a_j) : j \in I\}$ does not contain any edges of $S$. Each cycle $\gamma_j$ differs from the path $\delta_{G^*}(a_j)$ only by the edge $a_j$, which for $j \in J$ satisfies $j < i$ and therefore belongs to $G_{i-1}$. This implies that the symmetric difference of $\{\delta_{G^*}(a_i)\}$ and $\Gamma' = \{\gamma_j : j \in J\}$ also does not contain any edges of $S$. Hence, we have constructed a set of edges that contains a connected path in $G_{i-1}$ between the endpoints of $a_i$, as required.

Observation 1 implies an upper bound on $t$ in terms of the lengths of all the edges in the symmetric difference of $\{\delta_{G^*}(a_i)\}$ and $\Gamma' = \{\gamma_j : j \in J\}$. In Lemma 3 we formalise this upper bound. Then, in Lemma 4, we use the fact that the dilation of $G^*$ is $t^*$ to obtain an upper bound on the sum of the lengths in each cycle $\gamma_j$. In Lemma 5, we strengthen the inequality by giving a lower bound on the length of the edges that are both in $\gamma_j$ and $S$, and therefore cannot be part of the final symmetric difference. In Theorem 1 we put this all together and prove the final bound $(k + 1) t^* > t$.

Let $c_j$ be the total length of edges in the cycle $\gamma_j$. Let $c'_j$ be the total length of edges in the intersection $\delta_{G^*}(a_j) \cap S$.
Then Observation 1 implies:

Lemma 3. $d_{G^*}(a_i) + \sum_{j \in J} c_j - \sum_{j \in I} c'_j > t \cdot d_M(a_i)$.

Proof. The total length of all edges in $\delta_{G^*}(a_i)$ is $d_{G^*}(a_i)$. The total length of all edges in $\gamma_j$ is $c_j$. Taking the sum $d_{G^*}(a_i) + \sum_{j \in J} c_j$ yields an upper bound on the total length of all edges in the symmetric difference of $\{\delta_{G^*}(a_i)\}$ and $\{\gamma_j : j \in J\}$. However, this total length includes edges in $S$; in particular, it includes the total length of all edges in the intersections $\{\delta_{G^*}(a_j) \cap S : j \in I\}$. We know from Lemma 2 that no edge in $S$ appears in the symmetric difference, so we do not need to include any of the edges in $\{\delta_{G^*}(a_j) \cap S : j \in I\}$ in the total length. Hence, $d_{G^*}(a_i) + \sum_{j \in J} c_j - \sum_{j \in I} c'_j$ is an upper bound on the total length of the edges in the symmetric difference. Since the symmetric difference contains a connected path in $G_{i-1}$ between the endpoints of $a_i$, Observation 1 implies the stated inequality.

Next, we use the relationship between $\gamma_j$ and the graph $G^*$ to obtain an upper bound on $c_j$.

Lemma 4. $c_j \le (t^* + 1) \cdot d_M(a_j)$.

Proof.
Recall from Definition 3 that the cycle $\gamma_j$ is the path $\delta_{G^*}(a_j)$ and the edge $a_j$ linked end to end. Since the dilation of $G^*$ is $t^*$, we have $d_{G^*}(a_j) \le t^* \cdot d_M(a_j)$. Therefore, $c_j = d_{G^*}(a_j) + d_M(a_j) \le (t^* + 1) \cdot d_M(a_j)$.

We strengthen the inequality in Lemma 3 by providing a lower bound on the edges that are in $\gamma_j$ but cannot be part of the final symmetric difference.

Lemma 5. If $t \ge (k + 1) t^*$, then $\frac{k}{k+1} \cdot d_M(a_j) \le c'_j$ for all $j$.

Proof. First, we prove the inequality

$$d_{G_{j-1}}(a_j) \le d_{G^*}(a_j) + \sum_{s \in \delta_{G^*}(a_j) \cap S} d_{G_{j-1}}(s).$$

We do so in a similar manner to Lemma 3. We construct a path in $G_{j-1}$ between the endpoints of $a_j$ that has length at most $d_{G^*}(a_j) + \sum_{s \in \delta_{G^*}(a_j) \cap S} d_{G_{j-1}}(s)$. We start with the path $\delta_{G^*}(a_j)$. We modify it by taking the symmetric difference of $\delta_{G^*}(a_j)$ with a set of cycles $\beta = \{\beta_s : s \in \delta_{G^*}(a_j) \cap S\}$. The cycle $\beta_s$ is formed by linking the path $\delta_{G_{j-1}}(s)$ and the edge $s$ end to end. The cycle $\beta_s$ replaces the edge $s \in \delta_{G^*}(a_j) \cap S$ with the path $\delta_{G_{j-1}}(s)$, which lies in $G_{j-1}$. Hence, by Lemma 1, the symmetric difference of $\delta_{G^*}(a_j)$ with the set $\beta$ contains a path in $G_{j-1}$ between the endpoints of $a_j$. Therefore, we have $d_{G_{j-1}}(a_j) \le d_{G^*}(a_j) + \sum_{s \in \delta_{G^*}(a_j) \cap S} d_{G_{j-1}}(s)$.

Suppose for the sake of contradiction that $\frac{k}{k+1} \cdot d_M(a_j) > c'_j$. Consider any $s \in \delta_{G^*}(a_j) \cap S$. Then $s$ is shorter than $a_j$, since $d_M(a_j) > \frac{k}{k+1} \cdot d_M(a_j) > c'_j \ge d_M(s)$. In the graph $G_{j-1}$, the edge $a_j$ is a shortest edge satisfying $d_{G_{j-1}}(a_j) > t \cdot d_M(a_j)$. Since $s$ is shorter than $a_j$, we must have that $d_{G_{j-1}}(s) \le t \cdot d_M(s)$. Now,

$$
\begin{aligned}
d_{G_{j-1}}(a_j) &\le d_{G^*}(a_j) + \sum_{s \in \delta_{G^*}(a_j) \cap S} d_{G_{j-1}}(s) \\
&\le t^* \cdot d_M(a_j) + t \cdot \sum_{s \in \delta_{G^*}(a_j) \cap S} d_M(s) \\
&= t^* \cdot d_M(a_j) + t \cdot c'_j \\
&< t^* \cdot d_M(a_j) + t \cdot \tfrac{k}{k+1}\, d_M(a_j) \\
&\le t \cdot \tfrac{1}{k+1}\, d_M(a_j) + t \cdot \tfrac{k}{k+1}\, d_M(a_j) \\
&= t \cdot d_M(a_j),
\end{aligned}
$$

where the second last line is given by $t \ge (k + 1) t^*$. But we know from Definition 1 that $d_{G_{j-1}}(a_j) > t \cdot d_M(a_j)$, so we obtain a contradiction. Therefore, we must have $\frac{k}{k+1} \cdot d_M(a_j) \le c'_j$.

Using Lemmas 3-5 we are able to prove the main result of this section.

Theorem 1.
Suppose the greedy algorithm adds $k + 1$ edges into the graph. Then $(k + 1) t^* > t$.

Proof. Combining Lemmas 3 and 4 yields:

$$
\begin{aligned}
t \cdot d_M(a_i) &< d_{G^*}(a_i) + \sum_{j \in J} c_j - \sum_{j \in I} c'_j \\
&\le t^* \cdot d_M(a_i) + \sum_{j \in J} (t^* + 1)\, d_M(a_j) - \sum_{j \in I} c'_j \\
&= t^* \cdot \sum_{j \in I} d_M(a_j) + \sum_{j \in J} d_M(a_j) - \sum_{j \in I} c'_j.
\end{aligned}
$$

Suppose for the sake of contradiction that $t \ge (k + 1) t^*$. By Lemma 5 we have $\frac{k}{k+1} d_M(a_j) \le c'_j$. Summing over all $j \in I$ yields:

$$
\begin{aligned}
\sum_{j \in I} c'_j &\ge \sum_{j \in I} \tfrac{k}{k+1}\, d_M(a_j) \\
&= \tfrac{k}{k+1}\, d_M(a_i) + \sum_{j \in J} \tfrac{k}{k+1}\, d_M(a_j) \\
&\ge \sum_{j \in J} \left( \tfrac{k}{k+1}\, d_M(a_j) + \tfrac{1}{k+1}\, d_M(a_i) \right) \\
&\ge \sum_{j \in J} d_M(a_j).
\end{aligned}
$$

The third line uses $|J| \le k$. The final step holds because $j < i$, so $a_j$ is not longer than $a_i$. Therefore,

$$
\begin{aligned}
t \cdot d_M(a_i) &< t^* \sum_{j \in I} d_M(a_j) + \sum_{j \in J} d_M(a_j) - \sum_{j \in I} c'_j \\
&\le t^* \sum_{j \in I} d_M(a_j) \\
&\le t^* \cdot (k + 1) \cdot d_M(a_i),
\end{aligned}
$$

which implies $(k + 1) t^* > t$, as required.

We analyse the running time of the greedy algorithm. Recall that the greedy algorithm sorts the edges in $(V \times V) \setminus E$ by increasing length and then processes them in order. Processing an edge $e$ entails a shortest path query. If $d_G(e) > t \cdot d_M(e)$, then the edge $e$ is added to $G$, otherwise it is discarded.

Our algorithm performs efficient shortest path queries by building and maintaining an all pairs shortest paths (APSP) data structure for each of the graphs $G_i$. When an edge $pq$ is added to the graph, the data structure updates the length of the shortest path between every pair of points $u, v \in V$. We compute the lengths of the paths $u \to v$, $u \to p \to q \to v$, and $u \to q \to p \to v$, and choose the minimum. For a fixed $u, v \in V$, this can be handled in constant time, since all pairwise distances are stored.

Hence, the overall running time of the algorithm is as follows. In preprocessing, we build the APSP data structure in $O(mn + n^2 \log n)$ time.
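The modified greedy algorithm together with the APSP update rule just described can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: points are taken in the Euclidean plane, the initial APSP table is built with Floyd-Warshall (a simpler stand-in for the preprocessing step, with cubic running time), and the name `greedy_decision` and its return convention are hypothetical.

```python
import math
from itertools import combinations


def greedy_decision(points, edges, k, t):
    """Run the modified greedy t-spanner algorithm.

    Returns (True, added) if at most k edges were added (so t* <= t),
    or (False, added) as soon as k + 1 edges were added (so t* > t / (k + 1)).
    """
    n = len(points)

    def d_M(u, v):
        return math.dist(points[u], points[v])

    # All-pairs shortest path table for the initial graph (Floyd-Warshall).
    d = [[0.0 if u == v else math.inf for v in range(n)] for u in range(n)]
    for u, v in edges:
        d[u][v] = d[v][u] = d_M(u, v)
    for w in range(n):
        for u in range(n):
            for v in range(n):
                if d[u][w] + d[w][v] < d[u][v]:
                    d[u][v] = d[u][w] + d[w][v]

    # Candidate edges (V x V) \ E, sorted by increasing metric length.
    present = {frozenset(e) for e in edges}
    candidates = sorted(
        (e for e in combinations(range(n), 2) if frozenset(e) not in present),
        key=lambda e: d_M(*e),
    )

    added = []
    for p, q in candidates:
        w = d_M(p, q)
        if d[p][q] > t * w:  # constant-time shortest path query
            added.append((p, q))
            if len(added) == k + 1:
                return False, added
            # Update every pair: u -> v, u -> p -> q -> v, or u -> q -> p -> v.
            to_p = [d[u][p] for u in range(n)]
            to_q = [d[u][q] for u in range(n)]
            for u in range(n):
                for v in range(n):
                    d[u][v] = min(d[u][v],
                                  to_p[u] + w + to_q[v],
                                  to_q[u] + w + to_p[v])
    return True, added


# Hypothetical instance: a path around three sides of the unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
path = [(0, 1), (1, 2), (2, 3)]
print(greedy_decision(square, path, k=1, t=2.0))
```

Each update touches all $n^2$ entries of the table once, which matches the per-update cost used in the running time analysis.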
Sorting the edges in $(V \times V) \setminus E$ takes $O(n^2 \log n)$ time. Querying whether $d_G(e) > t \cdot d_M(e)$ can be handled in constant time, and there are at most $O(n^2)$ such queries. Updating the APSP data structure takes $O(n^2)$ time, and there are at most $k + 1$ updates. Putting this all together yields:

Theorem 2.
Given an integer $k$, a real number $t$ and a graph $G$ with $n$ vertices and $m$ edges, there is an $O((m + n \log n + kn) \cdot n)$ time algorithm that returns either $t^* \le t$ or $t^* > \frac{t}{k+1}$.

We return to Problem 1, which is to compute a $(1 + \varepsilon)(k + 1)$-approximation for the minimum dilation $t^*$. For any real value $t$, we can use Theorem 2 to decide whether $t^* \le t$ or $t^* > \frac{t}{k+1}$. Hence, it remains only to provide some bounded interval that $t^*$ is guaranteed to be in. Once we have such an interval, we can binary search on an $\varepsilon$-grid of the interval to obtain a $(1 + \varepsilon)(k + 1)$-approximation.

We compute this interval in two steps. Our first step is to identify a set $T$ of $O(n^4)$ real numbers so that at least one of these numbers is an $O(n)$-approximation of $t^*$. Our second step is to use the approximate decision algorithm in Theorem 2 to perform a binary search on the set $T$ and yield an $O(nk)$-approximation for $t^*$. The $O(nk)$-approximation provides the required interval.

We begin by identifying the set $T$ of $O(n^4)$ real numbers.

Lemma 6.
Define $T = \left\{ \frac{d_M(u,v)}{d_M(p,q)} : u, v, p, q \in V,\ u \ne v,\ p \ne q \right\}$. Then there exists an element $t \in T$ such that $t \le t^* \le n \cdot t$.

Proof. Consider the graph $G^* = (V, E \cup S)$. Let the dilation $t^*$ of $G^*$ be attained by the pair of points $u, v \in V$. Let $pq$ be a longest edge along the shortest path from $u$ to $v$ in $G^*$. See Figure 5.

Figure 5: The edge $pq$ is a longest edge on the shortest path from $u$ to $v$.

Recall that $d_{G^*}(u, v)$ is the length of the shortest path from $u$ to $v$ in the graph $G^*$. The dilation $t^*$ is attained by the pair of points $u, v$, which implies $d_{G^*}(u, v) = t^* \cdot d_M(u, v)$. The shortest path from $u$ to $v$ has total length $d_{G^*}(u, v)$ and has at most $n$ edges, where the length of each edge is at most $d_M(p, q)$. This implies $d_M(p, q) \le d_{G^*}(u, v) \le n \cdot d_M(p, q)$. But $d_{G^*}(u, v) = t^* \cdot d_M(u, v)$, so dividing by $d_M(u, v)$ gives

$$\frac{d_M(p, q)}{d_M(u, v)} \le t^* \le n \cdot \frac{d_M(p, q)}{d_M(u, v)}.$$

The ratio $\frac{d_M(p, q)}{d_M(u, v)}$ is an element of $T$, since $T$ ranges over all ordered choices of the two pairs, as required.

Next, we use the approximate decision algorithm in Theorem 2 to binary search the set $T$ in order to yield an $O(nk)$-approximation. A naïve implementation of the binary search would entail computing and sorting the elements in $T$, which would require $O(n^4 \log n)$ time. We avoid this $O(n^4 \log n)$ preprocessing step by using the result of Mirzaian and Arjomandi [16], which states that given two sorted lists $X$ and $Y$ each of size $n$, one can select the $i$th smallest element of the set $X + Y = \{x + y : x \in X, y \in Y\}$ in $O(n)$ time.

Lemma 7.
There is an $O((m + n \log n + kn) \cdot n \log n)$ time algorithm that computes an $O(nk)$-approximation for $t^*$.

Proof. In a preprocessing step, construct and sort the sets $X = \{\log(d_M(u, v)) : u, v \in V, u \ne v\}$ and $Y = \{-\log(d_M(p, q)) : p, q \in V, p \ne q\}$. To perform the binary search, select the $i$th smallest element of $X + Y = \left\{ \log\left(\frac{d_M(u,v)}{d_M(p,q)}\right) : u, v, p, q \in V, u \ne v, p \ne q \right\}$. Reverse the log transformation to obtain the $i$th smallest element of $T$. Call this element $t_i \in T$. Apply Theorem 2 to the two dilation values $\frac{1}{2} \cdot t_i$ and $n(k + 1) \cdot t_i$. This returns one of three possibilities:

1. $t^* \le \frac{1}{2} \cdot t_i$ and $t^* \le n(k + 1) \cdot t_i$, or
2. $t^* > \frac{1}{2} \cdot \frac{t_i}{k+1}$ and $t^* \le n(k + 1) \cdot t_i$, or
3. $t^* > \frac{1}{2} \cdot \frac{t_i}{k+1}$ and $t^* > n \cdot t_i$.

The fourth combination cannot occur as it yields a contradiction. Notice that in case one, we have $t^* < t_i$, so the element $t \in T$ satisfying $t \le t^* \le n \cdot t$ must be less than $t_i$. We can continue the binary search over the elements in $T$ that are less than $t_i$. Similarly, in case three, we have $t^* > n \cdot t_i$, so the element $t \in T$ satisfying $t \le t^* \le n \cdot t$ must be greater than $t_i$. We can continue the binary search over the elements in $T$ that are greater than $t_i$. In case two we halt, since we have an $O(nk)$-approximation for $t^*$.

We analyse the running time of this algorithm. The sets $X$ and $Y$ each have $O(n^2)$ elements, so sorting them takes $O(n^2 \log n)$ time. For each of the $O(\log n)$ binary search steps, selecting the $i$th element of $X + Y$ takes $O(n^2)$ time [16]. For each of the $O(\log n)$ binary search steps, applying Theorem 2 takes $O((m + n \log n + kn) \cdot n)$ time. Putting this all together yields the stated running time.

Finally, we apply a multiplicative $(1 + \varepsilon)$-grid to the $O(nk)$-approximation to yield a $(1 + \varepsilon)(k + 1)$-approximation.

Theorem 3.
For any fixed $\varepsilon > 0$, there is an $O((m + n \log n + kn) \cdot n \log n)$ time algorithm that computes a $(1 + \varepsilon)(k + 1)$-approximation for $t^*$.

To simplify the running time, we note that if $k \ge n - 1$, then adding the minimum spanning tree to any graph makes it an $n$-spanner, which is a $(k + 1)$-approximation for the minimum dilation. Plugging in $k < n - 1$ and $m \le n^2$ into Theorem 3 yields:

Theorem 4. For any fixed $\varepsilon > 0$, there is an $O(n^3 \log n)$ time algorithm that computes a $(1 + \varepsilon)(k + 1)$-approximation for $t^*$.

We provide a construction to show that the algorithms in Theorem 2 and Theorem 3 cannot yield an approximation factor better than $(1 - \varepsilon)(k + 1)$.

Theorem 5.
For any k ≥ and ε > , there exists a graph so that for any t ≤ (1 − ε )( k + 1) · t ∗ ,the greedy algorithm in Definition 1 adds at least k + 1 edges to the graph.Proof. Fix h to be a small positive constant less than t , and fix a constant h (cid:48) to be arbitrarilysmall relative to h . We construct the graph G shown in Figure 6.Let the vertices of G be a = (0 , h ) b i = (1 , ih ) ∀ ≤ i ≤ kc i = (2 , ih ) ∀ ≤ i ≤ kd i = ( k + 3 + i, ih ) ∀ ≤ i ≤ ke i = ( k + 3 + i, (2 i + 1) h ) ∀ ≤ i ≤ kf i = (2 , (2 i + 1) h − h (cid:48) ) ∀ ≤ i ≤ kg i = (1 , (2 i + 1) h ) ∀ ≤ i ≤ ky = (0 , (2 k + 1) h ) z = (0 , h )The graph G is a path between these vertices. The edges of G are between consecutive elementsin the sequence a , b , c , d , e , f , g , b , c , . . . , f k , g k , y , z . See Figure 6. a b c f g b c f g b c f g d e d e d e y z Figure 6: The construction for k = 3.The pairs of points with the largest dilation are ( a , z ), ( b i , g i ) and ( c i , f i ). We can pick a smallenough value of h so that the dilation of all other pairs are relatively insignificant. The optimal k b i , g i ) for all 1 ≤ i ≤ k . After adding these k edges, the pairs of points with thelargest dilation are ( a , z ) and ( c i , f i ). Of these, the pair of points ( a , z ) realises the maximumdilation, which is t ∗ = (2 + (4 k + 1) h ) /h ≈ /h .Now let us run the greedy spanner construction for some t ≤ (1 − ε )( k + 1) · t ∗ . All pairs ofpoints ( a , z ), ( b i , g i ) and ( f i , c i ) start off with dilation greater than 2( k + 4) /h . But 2( k + 4) /h =( k + 4) · /h > ( k + 3) · t ∗ > t , where the second inequality is true for sufficiently small values of h .The pairs of points with highest dilation are ( a , z ), ( b i , g i ) and ( f i , c i ), and the edges connectingthese pairs of points satisfies d G i ( e ) > t · d M ( e ). The shortest of these edges will be added firstby the greedy t -spanner construction. 
The pairs ( c i , f i ) have distance h − h (cid:48) , making the edgebetween them the shortest and first to be considered by the greedy algorithm. Adding an edgebetween ( c i , f i ) does not reduce the dilation of the other pairs of points ( c j , f j ). Therefore, thegreedy spanner construction first adds the edges ( c i , f i ) for all 1 ≤ i ≤ k .After adding ( c i , f i ) for all 1 ≤ i ≤ k , the dilation between the pair of points a and z is now(2 k + 2 + (4 k + 1) h ) /h . But (2 k + 2 + (4 k + 1) h ) /h = (2 k + 2 + (4 k + 1) h ) / (2 + (4 k + 1) h ) · t ∗ > (1 − ε )( k + 1) · t ∗ for sufficiently small values of h relative to ε . Therefore, the greedy t -spannerconstruction must add the edges ( c i , f i ) for all 1 ≤ i ≤ k plus at least one additional edge, so itadds at least k + 1 edges in total.Our construction shows that in Theorem 2 we cannot hope to obtain a bound that is muchbetter than t ∗ > tk +1 . Similarly, in Theorem 3, our construction implies that the algorithm maycontinue searching for higher dilation values up until (1 − ε )( k + 1) · t ∗ . Therefore, we cannot hopeto obtain a much better approximation ratio than (1 + ε )( k + 1) with our algorithm. In Farshi et al . [6] it was conjectured that generalising their algorithm to any positive integer k may provide a reasonable approximation algorithm. In the appendix, we showed an Ω(2 k ) lowerbound for the approximation factor. We obtained the first positive result for the general case. Ourapproximation algorithm runs in O ( n log n ) time and guarantees an O ( k )-approximation factor.Two obvious open problems are to develop an algorithm with a better approximation factor, orto show an inapproximability bound. References [1] Hee-Kap Ahn, Mohammad Farshi, Christian Knauer, Michiel H. M. Smid, and Yajun Wang.Dilation-optimal edge deletion in polygonal cycles.
International Journal of Computational Geometry & Applications, 20(1):69–87, 2010.
[2] Boris Aronov, Mark de Berg, Otfried Cheong, Joachim Gudmundsson, Herman J. Haverkort, Michiel H. M. Smid, and Antoine Vigneron. Sparse geometric graphs with small dilation. Computational Geometry, 40(3):207–219, 2008.
[3] Surender Baswana and Sandeep Sen. A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Structures & Algorithms, 30(4):532–563, 2007.
[4] Hubert T.-H. Chan, Anupam Gupta, Bruce M. Maggs, and Shuheng Zhou. On hierarchical routing in doubling metrics. In Proceedings of the 16th Annual Symposium on Discrete Algorithms, SODA, pages 762–771. SIAM, 2005.
[5] David Eppstein. Spanning trees and spanners. In Jörg-Rüdiger Sack and Jorge Urrutia, editors, Handbook of Computational Geometry, pages 425–461. Elsevier, 2000.
[6] Mohammad Farshi, Panos Giannopoulos, and Joachim Gudmundsson. Improving the stretch factor of a geometric network by edge augmentation. SIAM Journal on Computing, 38(1):226–240, 2008.
[7] Panos Giannopoulos, Rolf Klein, Christian Knauer, Martin Kutz, and Dániel Marx. Computing geometric minimum-dilation graphs is NP-hard. International Journal of Computational Geometry & Applications, 20(2):147–173, 2010.
[8] Lee-Ad Gottlieb. A light metric spanner. In Proceedings of the 56th Symposium on Foundations of Computer Science, FOCS, pages 759–772, 2015.
[9] Joachim Gudmundsson and Christian Knauer. Dilation and detours in geometric networks. In Teofilo F. Gonzalez, editor, Handbook of Approximation Algorithms and Metaheuristics. Chapman and Hall/CRC, 2007.
[10] Joachim Gudmundsson and Michiel H. M. Smid. On spanners of geometric graphs. International Journal of Foundations of Computer Science, 20(1):135–149, 2009.
[11] Sariel Har-Peled and Manor Mendel. Fast construction of nets in low-dimensional metrics and their applications. SIAM Journal on Computing, 35(5):1148–1184, 2006.
[12] Jan-Henrik Haunert and Wouter Meulemans. Partitioning polygons via graph augmentation. In Jennifer A. Miller, David O'Sullivan, and Nancy Wiegand, editors, Proceedings of the 9th Geographic Information Science, GIScience, volume 9927, pages 18–33, 2016.
[13] Ferran Hurtado and Csaba D. Tóth. Plane geometric graph augmentation: a generic perspective. In Thirty Essays on Geometric Graph Theory, pages 327–354. Springer, 2013.
[14] Rolf Klein, Christian Knauer, Giri Narasimhan, and Michiel H. M. Smid. On the dilation spectrum of paths, cycles, and trees. Computational Geometry, 42(9):923–933, 2009.
[15] Jun Luo and Christian Wulff-Nilsen. Computing best and worst shortcuts of graphs embedded in metric spaces. In Proceedings of the 19th International Symposium on Algorithms and Computation, ISAAC, volume 5369, pages 764–775, 2008.
[16] Andranik Mirzaian and Eshrat Arjomandi. Selection in X+Y and matrices with sorted rows and columns. Information Processing Letters, 20(1):13–17, 1985.
[17] Giri Narasimhan and Michiel H. M. Smid. Geometric Spanner Networks. Cambridge University Press, 2007.
[18] David Peleg. Distributed Computing: a Locality-Sensitive Approach. SIAM, 2000.
[19] Michiel H. M. Smid. Closest-point problems in computational geometry. In Jörg-Rüdiger Sack and Jorge Urrutia, editors, Handbook of Computational Geometry, pages 877–935. Elsevier, 2000.
[20] Mikkel Thorup and Uri Zwick. Approximate distance oracles. Journal of the ACM, 52(1):1–24, 2005.
[21] Christian Wulff-Nilsen. Computing the dilation of edge-augmented graphs in metric spaces. Computational Geometry, 43(2):68–72, 2010.
The Bottleneck Algorithm
Farshi et al. [6] studied the special case where $k = 1$. They achieved a 3-approximation by adding the bottleneck edge, which is an edge between a pair of points that achieves the maximum dilation. They also provided a generalisation of their algorithm for $k > 1$. The generalisation consists of $k$ stages. In each stage, the dilation of the graph is computed, and a pair of points that achieves the maximum dilation is identified. Then an edge is added between that pair of points. Formally, given an initial metric graph $G$ and an integer $k$:

Definition 4.
Let $G_0 = G$, and for $1 \le i \le k$, let $G_i = G_{i-1} \cup b_i$, where $b_i$ is an edge between the pair of points that achieves the maximum dilation of $G_{i-1}$.

Farshi et al. [6] conjectured that the dilation of the augmented graph $G_k$ may be a reasonable approximation for the dilation of the optimal graph $G^*$. We provide a negative result: their algorithm cannot yield an approximation factor better than $2^k$.

Theorem 6.
For any $k \ge 1$, there exists an initial graph $G$ where the bottleneck algorithm in Definition 4 yields a graph $G_k$ with dilation $2^k$ times the dilation of the optimal graph $G^*$.

Proof. Fix $h$ to be a small constant. Let the vertices of $G$ be
\[
\begin{aligned}
x_1 &= (-1,\ h) \\
y_i &= (0,\ 2^i h) && \forall\, 1 \le i \le k+1 \\
z_i &= (2^{i-1},\ 3 \cdot 2^{i-1} h) && \forall\, 1 \le i \le k \\
x_2 &= (-1,\ 2^{k+1} h + h)
\end{aligned}
\]
Join the vertices together to form a path $x_1, y_1, z_1, y_2, z_2, \ldots, y_k, z_k, y_{k+1}, x_2$. See Figure 7.

Figure 7: The construction for $k = 3$.

It is straightforward to check that all edges in $G$ have gradient $\pm h$. Since $h$ is a small constant, all edges are almost horizontal. Therefore, the pairs of vertices with maximum dilation are those that are vertically above one another, in other words, the pair $(x_1, x_2)$, or the pairs $(y_i, y_{i+1})$ for $1 \le i \le k$. In particular, all the pairs listed have a dilation value of $\sqrt{1+h^2}/h$.

Since $(x_1, x_2)$ is one of the pairs of vertices with maximum dilation, we can choose the first bottleneck edge $b_1$ to connect these two points. It is easy to check that, since the distance in the graph between $(x_1, x_2)$ is at least twice the distance in the graph of any other pair $(y_i, y_{i+1})$, adding the first bottleneck edge does not reduce the dilation of any of the pairs $(y_i, y_{i+1})$. Inductively, we can show that for $i \ge 2$, $b_i = (y_{k-i+2}, y_{k-i+3})$ is the $i$th bottleneck edge added. This is because it initially had the maximum dilation of $\sqrt{1+h^2}/h$, and adding the bottleneck edges $b_1, b_2, \ldots, b_{i-1}$ did not reduce its dilation factor. Finally, after adding $b_1, \ldots, b_k$, the dilation of the augmented graph $G_k$ is still $\sqrt{1+h^2}/h$ and is attained by $(y_1, y_2)$.

The optimal placement of $k$ edges would be the edges $(y_1, y_2), \ldots, (y_k, y_{k+1})$. Under this placement of $k$ edges, the maximum dilation value is attained by $(x_1, x_2)$, and is $2\sqrt{1+h^2}/(2^{k+1} \cdot h) = \sqrt{1+h^2}/(2^k \cdot h)$.
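The construction can also be checked numerically. The following is a minimal sketch and not part of the original analysis: it assumes Euclidean distances, computes shortest paths with Floyd-Warshall, and all function names (`build_instance`, `dilation`, `adversarial_bottleneck`, `proof_placement`) are hypothetical. Instead of implementing a tie-breaking rule, it adds the edges in the adversarial order named in the proof and asserts at every stage that the chosen pair attains the current maximum dilation.

```python
import itertools
import math


def build_instance(k, h):
    """Vertices and path edges of the lower-bound construction (k >= 1, small h > 0)."""
    pts = {"x1": (-1.0, h), "x2": (-1.0, 2 ** (k + 1) * h + h)}
    for i in range(1, k + 2):
        pts[f"y{i}"] = (0.0, 2 ** i * h)
    for i in range(1, k + 1):
        pts[f"z{i}"] = (2.0 ** (i - 1), 3 * 2 ** (i - 1) * h)
    # Path x1, y1, z1, y2, z2, ..., yk, zk, y(k+1), x2.
    order = ["x1"]
    for i in range(1, k + 1):
        order += [f"y{i}", f"z{i}"]
    order += [f"y{k + 1}", "x2"]
    return pts, list(zip(order, order[1:]))


def dilation(pts, edges):
    """Return (maximum dilation, {pair: dilation}) via Floyd-Warshall."""
    names = list(pts)
    d = {(u, v): (0.0 if u == v else math.inf) for u in names for v in names}
    for u, v in edges:
        w = math.dist(pts[u], pts[v])
        d[u, v] = d[v, u] = min(d[u, v], w)
    for m in names:
        for u in names:
            for v in names:
                if d[u, m] + d[m, v] < d[u, v]:
                    d[u, v] = d[u, m] + d[m, v]
    ratios = {(u, v): d[u, v] / math.dist(pts[u], pts[v])
              for u, v in itertools.combinations(names, 2)}
    return max(ratios.values()), ratios


def adversarial_bottleneck(k, h):
    """Add b_1, ..., b_k in the order used in the proof; return the dilation of G_k."""
    pts, edges = build_instance(k, h)
    picks = [("x1", "x2")]
    picks += [(f"y{k - i + 2}", f"y{k - i + 3}") for i in range(2, k + 1)]
    for pair in picks:
        best, ratios = dilation(pts, edges)
        # The proof's choice must be a bottleneck pair (up to rounding error).
        assert math.isclose(ratios[pair], best, rel_tol=1e-9)
        edges.append(pair)
    return dilation(pts, edges)[0]


def proof_placement(k, h):
    """Dilation after adding (y_1, y_2), ..., (y_k, y_{k+1}) instead."""
    pts, edges = build_instance(k, h)
    edges += [(f"y{i}", f"y{i + 1}") for i in range(1, k + 1)]
    return dilation(pts, edges)[0]


if __name__ == "__main__":
    k, h = 3, 1e-4
    print(adversarial_bottleneck(k, h) / proof_placement(k, h))
```

For $k = 3$ and $h = 10^{-4}$ the printed ratio comes out slightly below $2^k = 8$, and it moves closer to $8$ as $h$ shrinks.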
Hence, the augmented graph $G_k$ has a dilation of $2^k$ times the dilation of the optimal graph $G^*$.

Note that in our construction, ties are broken adversarially when choosing the bottleneck edge to add. If we would like to lift the requirement on the adversarial choice of which bottleneck edge to add, we can perturb $x_1$ and $x_2$ vertically towards each other, which guarantees that ( x , x1