Computer Science Data Structures and Algorithms

Deterministic Tree Embeddings with Copies for Algorithms Against Adaptive Adversaries

Bernhard Haeupler, D Ellis Hershkowitz, Goran Zuzic

Abstract

Embeddings of graphs into distributions of trees that preserve distances in expectation are a cornerstone of many optimization algorithms. Unfortunately, online or dynamic algorithms which use these embeddings seem inherently randomized and ill-suited against adaptive adversaries. In this paper we provide a new tree embedding which addresses these issues by deterministically embedding a graph into a single tree containing O(\log n) copies of each vertex while preserving the connectivity structure of every subgraph and O(\log^2 n)-approximating the cost of every subgraph. Using this embedding we obtain several new algorithmic results: We reduce an open question of Alon et al. [SODA 2004] -- the existence of a deterministic poly-log-competitive algorithm for online group Steiner tree on a general graph -- to its tree case. We give a poly-log-competitive deterministic algorithm for a closely related problem -- online partial group Steiner tree -- which, roughly, is a bicriteria version of online group Steiner tree. Lastly, we give the first poly-log approximations for demand-robust Steiner forest, group Steiner tree and group Steiner forest.


Bernhard Haeupler (Carnegie Mellon University & ETH Zürich), D Ellis Hershkowitz (Carnegie Mellon University), Goran Zuzic (ETH Zürich)

Supported in part by NSF grants CCF-1527110, CCF-1618280, CCF-1814603, CCF-1910588, NSF CAREER award CCF-1750808, a Sloan Research Fellowship, funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (ERC grant agreement 949272), Swiss National Foundation (project grant 200021-184735) and the Air Force Office of Scientific Research under award number FA9550-20-1-0080.

Introduction

Probabilistic embeddings of general metrics into distributions over trees are one of the most versatile tools in combinatorial and network optimization. The beauty and utility of these tree embeddings comes from the fact that their application is often simple, yet extremely powerful. Indeed, when modeling a network with lengths, costs, or capacities as a weighted graph, these embeddings often allow one to pretend that the graph is a tree. A common template for countless network design algorithms is to (1) embed the input weighted graph $G$ into a randomly sampled tree $T$ that approximately preserves the weight structure of $G$; (2) solve the input problem on $T$; and (3) project the solution on $T$ back into $G$.

A long and celebrated line of work [5, 11, 29, 44] culminated in the embedding of Fakcharoenphol, Rao and Talwar [29], henceforth the "FRT embedding", which showed that any weighted graph on $n$ nodes can be embedded into a distribution over weighted trees in a way that $O(\log n)$-approximately preserves distances in expectation. Together with the above template this reduces many graph problems to much easier problems on trees at the cost of an $O(\log n)$ approximation factor. This has led to a myriad of approximation, online, and dynamic algorithms with poly-logarithmic approximations and competitive ratios for NP-hard problems such as $k$-server [10], metrical task systems [12], group Steiner tree and group Steiner forest [7, 34, 53], buy-at-bulk network design [9] and (oblivious) routing [54]. For many of these problems tree embeddings are the only known way of obtaining such algorithms on general graphs.

However, probabilistic tree embeddings have one drawback: algorithms based on them naturally require randomization and their approximation guarantees only hold in expectation. For approximation algorithms (i.e., in the offline setting) there are derandomization tools, such as the FRT derandomizations given in [19, 29], to overcome these issues. These derandomization results are so general that essentially any offline algorithm based on tree embeddings can be transformed into a deterministic algorithm with matching approximation guarantees (with only a moderate increase in running time). Unfortunately, these strategies are not applicable to online or dynamic settings where an adversary progressively reveals the input. Indeed, to our knowledge, all online and dynamic algorithms that use FRT are randomized (e.g. [7, 12, 27, 28, 31, 36, 40, 53]).

This overwhelming evidence in the literature is driven by a well-known and fundamental barrier to the use of probabilistic tree embeddings in deterministic online and dynamic algorithms. More specifically and even worse, this is a barrier which prevents these algorithms from working against all but the weakest type of adversary. In particular, designing an online or dynamic algorithm which is robust to an oblivious adversary (which fixes all requests in advance, independently of the algorithm's randomness) is often much easier than designing an algorithm which is robust to an adaptive adversary (which chooses the next request based on the algorithm's current solution). As the actions of a deterministic algorithm can be fully predicted, this distinction only holds for randomized algorithms: any deterministic algorithm has to always work against an adaptive adversary. For these reasons, many online and dynamic algorithms have exponentially worse competitive ratios in the deterministic or adaptive adversary setting than in the oblivious adversary setting. This is independent of computational complexity considerations.

The above barrier results from a repeatedly recognized and seemingly unavoidable phenomenon which prevents online algorithms built on FRT from working against adaptive adversaries.
Specifically, there are graphs where every tree embedding must have many node pairs with polynomially-stretched distances [11]. There is nothing that prevents an adversary then from learning through the online algorithm's responses which tree was sampled and then tailoring the remainder of the online instance to pairs of nodes that have highly stretched distances. The exact same phenomenon occurs in the dynamic setting; see, for example, Guo et al. [36] and Gupta et al. [40] for dynamic algorithms with expected cost guarantees that only hold against oblivious adversaries because they are based on FRT. In summary, online and dynamic algorithms that use probabilistic tree embeddings seem inherently randomized and seem to necessarily only work against adversaries oblivious to this randomness.

Similar, albeit not identical, issues also arise in other settings, most notably demand-robust optimization. [...] $T$ in the FRT distribution contains an instance for which $T$ is an arbitrarily bad approximation and then always choose the worst-case problem instance. The fact that there do not exist any demand-robust algorithms which use FRT despite this setting having received considerable attention seems at least partially due to the issues pointed out here.

Footnote: We remark that, unlike the online and dynamic setting, the barrier to obtaining demand-robust algorithms which work against the "adaptive adversary" implicit in the setting is merely computational and thus seems potentially less inherent.

Overall it seems fair to say that prior to this work tree embeddings seemed fundamentally incapable of enabling adaptive-adversary-robust and deterministic algorithms in several well-studied settings.

We provide a conceptually new type of metric embedding, the copy tree embedding, which is deterministic and therefore also adaptive-adversary-robust. Specifically, we show that any weighted graph $G$ can be deterministically embedded into a single weighted tree with a small number of copies for each vertex. Any subgraph of $G$ will project onto this tree in a connectivity and approximate-cost preserving way.

To precisely define our embeddings we define a copy mapping $\phi$ which maps a vertex $v$ to its copies.

Definition 1 (Copy Mapping). Given vertex sets $V$ and $V'$ we say $\phi : V \to 2^{V'}$ is a copy mapping if every node has at least one copy (i.e. $|\phi(v)| \ge 1$ for all $v \in V$), copies are disjoint (i.e. $\phi(v) \cap \phi(u) = \emptyset$ for $u \ne v$) and every node in $V'$ is a copy of some node (i.e. for every $v' \in V'$ there is some $v \in V$ where $v' \in \phi(v)$). For $v' \in V'$, we use the shorthand $\phi^{-1}(v')$ to stand for the unique $v \in V$ such that $v' \in \phi(v)$.

A copy tree embedding for a weighted graph $G$ now simply consists of a tree $T$ on copies of vertices of $G$ with one distinguished root and two mappings $\pi_{G \to T}$ and $\pi_{T \to G}$ which map subsets of edges from $G$ to $T$ and from $T$ to $G$ in a way that preserves connectivity and approximately preserves costs. We say that two vertex subsets $U, W$ are connected in a graph if there is a $u \in U$ and $w \in W$ such that $u$ and $w$ are connected. We also say that a mapping $\pi : 2^{E} \to 2^{E'}$ is monotone if for every $A \subseteq B$ we have that $\pi(A) \subseteq \pi(B)$. A rooted tree $T = (V, E, w)$ is well-separated if for all edges $e$, if $e'$ is a child edge of $e$ in $T$ then $w(e') \le \frac{1}{2} w(e)$.

Definition 2 ($\alpha$-Approximate Copy Tree Embedding with Copy Number $\chi$). Let $G = (V, E, w)$ be a weighted graph with some distinguished root $r \in V$. An $\alpha$-approximate copy tree embedding with copy number $\chi$ consists of a weighted rooted tree $T = (V', E', w')$, a copy mapping $\phi : V \to 2^{V'}$ and edge mapping functions $\pi_{G \to T} : 2^{E} \to 2^{E'}$ and $\pi_{T \to G} : 2^{E'} \to 2^{E}$ where $\pi_{T \to G}$ is monotone and:

1. Connectivity Preservation: For all $F \subseteq E$ and $u, v \in V$, if $u, v$ are connected by $F$, then $\phi(u), \phi(v) \subseteq V'$ are connected by $\pi_{G \to T}(F)$. Symmetrically, for all $F' \subseteq E'$ and $u', v' \in V'$, if $u'$ and $v'$ are connected by $F'$ then $\phi^{-1}(u')$ and $\phi^{-1}(v')$ are connected by $\pi_{T \to G}(F')$.
2. $\alpha$-Cost Preservation: For any $F \subseteq E$ we have $w'(\pi_{G \to T}(F)) \le \alpha \cdot w(F)$ and for any $F' \subseteq E'$ we have $w(\pi_{T \to G}(F')) \le w'(F')$.
3. Copy Number: $|\phi(v)| \le \chi$ for all $v \in V$ and $\phi(r) = \{r'\}$ where $r'$ is the root of $T$.

A copy tree embedding is efficient if $T$, $\phi$, and $\pi_{T \to G}$ are deterministically poly-time computable and well-separated if $T$ is well-separated.

Figure 1: Illustration of our first construction where we merge $O(\log n)$ partial tree embeddings. Panels: (a) Graph $G$. (b) Compute partial tree embeddings. (c) Merge trees.

Figure 2: Illustration of our second construction where we merge the $O(n \log n)$ trees in the FRT support. Panels: (a) Graph $G$. (b) Enumerate FRT support. (c) Merge trees.

We emphasize that, whereas standard tree embeddings guarantee costs are preserved in expectation, our copy tree embeddings preserve costs deterministically. Also notice that for efficient copy tree embeddings we do not require that $\pi_{G \to T}$ is efficiently computable; this is because $\pi_{G \to T}$ will be used in our analyses but not in any of our algorithms.

We first give two copy tree embedding constructions which trade off between the number of copies and cost preservation. Both constructions are based on the idea of merging appropriately chosen tree embeddings as pictured in Figure 1 and Figure 2 where we color nodes according to the node whose copy they are.

Construction 1: Merging Partial Tree Embeddings (Section 4). The cornerstone of our first construction is the idea of merging embeddings which give good deterministic distance preservation.
If our goal is to embed the entire input metric into a tree this is impossible. However, it is possible to embed a random constant fraction of nodes in an input metric into a tree in a way that deterministically preserves distances of the embedded nodes; an embedding which we call a "partial tree embedding" (see also Gupta et al. [37], Haeupler et al. [41]). We then use the method of conditional expectation to derandomize a node-weighted version of this random process and apply this derandomization $O(\log n)$ times, down-weighting nodes as they are embedded. The result of this process is $O(\log n)$ partial tree embeddings where a multiplicative-weights-type argument shows that each node appears in a constant fraction of these embeddings. Merging these $O(\log n)$ embeddings gives our copy tree while an Euler-tour-type proof shows that subgraphs of the input graph can be mapped to our copy tree in a cost and connectivity-preserving fashion. The following theorem summarizes our first construction.

Theorem 3. There is a poly-time deterministic algorithm which given any weighted graph $G = (V, E, w)$ and root $r \in V$ computes an efficient and well-separated $O(\log^2 n)$-approximate copy tree embedding with copy number $O(\log n)$.
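The down-weighting step of Construction 1 can be illustrated with a small sketch. It is not the paper's derandomization: `heaviest_half` is a hypothetical stand-in for the oracle that returns a partial tree embedding, and the constants (covering half of the current weight, halving the weight of each embedded node) are assumptions chosen for illustration. The sketch only shows why such a loop forces every node to be picked in a constant fraction of $O(\log n)$ rounds.

```python
import math
from fractions import Fraction


def heaviest_half(weight):
    """Hypothetical stand-in oracle: greedily take the heaviest nodes
    until they carry at least half of the total weight."""
    total = sum(weight.values())
    chosen, covered = [], Fraction(0)
    for v in sorted(weight, key=lambda u: (-weight[u], u)):
        if covered >= total / 2:
            break
        chosen.append(v)
        covered += weight[v]
    return chosen


def cover_rounds(nodes, pick_subset, rounds):
    """Run `rounds` rounds of down-weighting and count how often each
    node is picked (that is, embedded)."""
    weight = {v: Fraction(1) for v in nodes}
    count = {v: 0 for v in nodes}
    for _ in range(rounds):
        chosen = pick_subset(weight)
        total = sum(weight.values())
        # Assumed oracle guarantee: at least half of the current weight.
        assert sum(weight[v] for v in chosen) >= total / 2
        for v in chosen:
            weight[v] /= 2  # down-weight nodes that were just embedded
            count[v] += 1
    return count


n, rounds = 16, 40
counts = cover_rounds(range(n), heaviest_half, rounds)
# The total weight shrinks by a factor of at least 3/4 per round, and a
# node picked c times has weight 2**-c, so c >= rounds*log2(4/3) - log2(n).
assert min(counts.values()) >= rounds * math.log2(4 / 3) - math.log2(n)
```

With `rounds` set to a sufficiently large constant times $\log_2 n$, the bound in the final comment is a constant fraction of `rounds`, which is the shape of the multiplicative-weights-type argument described above.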

Our second construction follows from a known fact that the size of the support of the FRT distribution can be made $O(n \log n)$ and this support can be computed deterministically in poly-time [19]. Merging each tree in this support at the root and some simple probabilistic method arguments give a copy tree embedding that is $O(\log n)$-cost preserving but with an $O(n \log n)$ copy number. The next theorem summarizes this construction.

Theorem 4.

There is a poly-time deterministic algorithm which given any weighted graph $G = (V, E, w)$ and root $r \in V$ computes an efficient and well-separated $O(\log n)$-approximate copy tree embedding with copy number $O(n \log n)$.

[...] (1) $\pi_{G \to T}$ is monotone (in addition to $\pi_{T \to G}$ being monotone as stipulated by Definition 2); (2) if $u$ and $v$ are connected by $F \subseteq E$ then $\Omega(\log n)$ vertices of $\phi(u)$ are connected to $\Omega(\log n)$ vertices of $\phi(v)$ in $\pi_{G \to T}(F)$ (as opposed to just one vertex of $\phi(u)$ and one vertex of $\phi(v)$ as in Definition 2); and (3) if $u$ is connected to $r$ by $F \subseteq E$ then every vertex in $\phi(u)$ is connected to $\phi(r)$ in $\pi_{G \to T}(F)$ (as opposed to just one vertex of $\phi(u)$ as in Definition 2).

We next apply our constructions to obtain new results for several online and demand-robust connectivity problems whose history we briefly summarize now. Group Steiner tree and group Steiner forest are two well-studied generalizations of set cover and Steiner tree. In the group Steiner tree problem, we are given a weighted graph $G = (V, E, w)$ and groups $g_1, \ldots, g_k \subseteq V$ and must return a connected subgraph of $G$ of minimum weight which contains at least one vertex from each group. The group Steiner forest problem generalizes group Steiner tree. Here, we are given pairs $A_i, B_i \subseteq V$ and for each $i$ we must connect some vertex from $A_i$ to some vertex in $B_i$. Alon et al. [7] and Naor et al. [53] each gave a poly-log approximation for online group Steiner tree and forest respectively, but both of these approximation guarantees are randomized and only hold against oblivious adversaries because they rely on FRT. Indeed, Alon et al. [7] posed the existence of a deterministic poly-log approximation for online group Steiner tree as an open question which has since been restated several times [15, 17].
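To make the two problem statements concrete, here is a minimal feasibility checker. It is a sketch only: the function names are hypothetical, and the group Steiner tree check assumes the rooted formulation used later in the text, in which each group must be connected to a root $r$. In that formulation group Steiner tree is the special case of group Steiner forest with $A_i = g_i$ and $B_i = \{r\}$.

```python
def components(vertices, edges):
    """Label each vertex with a representative of its connected
    component in the graph (vertices, edges), using union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return {v: find(v) for v in vertices}


def is_feasible_group_steiner_forest(vertices, chosen_edges, pairs):
    """True if, for every pair (A, B), the chosen edges connect some
    vertex of A to some vertex of B."""
    comp = components(vertices, chosen_edges)
    return all({comp[a] for a in A} & {comp[b] for b in B}
               for A, B in pairs)


def is_feasible_group_steiner_tree(vertices, chosen_edges, root, groups):
    """Rooted group Steiner tree as the special case B_i = {root}."""
    pairs = [(g, {root}) for g in groups]
    return is_feasible_group_steiner_forest(vertices, chosen_edges, pairs)


# Path 1 - 2 - 3 - 4 where only the first two edges were bought.
vertices = [1, 2, 3, 4]
bought = [(1, 2), (2, 3)]
assert is_feasible_group_steiner_tree(vertices, bought, 1, [{3, 4}, {2}])
assert not is_feasible_group_steiner_tree(vertices, bought, 1, [{4}])
```

The checker ignores weights on purpose: feasibility depends only on connectivity, while the objective is the total weight of the bought edges.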
Similarly, while demand-robust minimum spanning tree and special cases of demand-robust Steiner tree have received considerable attention [25, 45, 47], there are no known poly-log approximations for demand-robust Steiner forest, group Steiner tree or group Steiner forest.

Application 1: Reducing Deterministic Online Group Problems to Tree Case (Section 5).

In our first application we demonstrate that our copy tree embeddings reduce solving online group Steiner tree and forest deterministically on a general graph to the case of solving it on a tree. In particular, we show that a deterministic poly-log approximation for online group Steiner tree and forest on a tree graph gives a deterministic poly-log approximation on general graphs, thereby reducing the aforementioned open question of Alon et al. [7] to its tree case.

Theorem 5.

If there exists an $\alpha$-competitive poly-time deterministic algorithm for group Steiner tree (resp. group Steiner forest) on well-separated trees then there exists an $O(\log n \cdot \alpha)$-competitive poly-time deterministic algorithm for group Steiner tree (resp. group Steiner forest) on general graphs.

Group Steiner tree has the notable property that mapping it onto a copy tree embedding simply results in another instance of the group Steiner tree problem, this time on a tree (our application 2 shows that this is not always the case). Therefore, this result is nearly immediate from either of the above constructions. In particular, if we have an instance of group Steiner tree on a general graph with groups $\{g_i\}_i$ then we can solve group Steiner tree on our embedding with groups $\{g'_i\}_i$ where $g'_i := \bigcup_{v \in g_i} \phi(v)$ and our root is the one copy of $r$, say $r'$. The connectivity properties of our mappings guarantee that a feasible solution for one of these problems is a feasible solution for the other when projected: if $g_i$ is connected to $r$ by $F$ then $g'_i$ is connected to $r'$ by $\pi_{G \to T}(F)$, and if $g'_i$ is connected to $r'$ by $F'$ then $g_i$ is connected to $r$ by $\pi_{T \to G}(F')$. Moreover, the cost preservation of $\pi_{G \to T}$ applied to the optimal solution on the input graph shows that our problem on the embedding has a cheap solution, while the cost preservation of $\pi_{T \to G}$ allows us to map our solution on the embedding back to the input graph without increasing its cost. Lastly, the monotonicity of $\pi_{T \to G}$ guarantees that the resulting online algorithm only adds and never attempts to remove edges from its solution in $G$.

Application 2: Deterministic Online Partial Group Steiner Tree (Section 6).

We next introduce a new group connectivity problem: the online partial group Steiner tree problem. Partial group Steiner tree is group Steiner tree but where we must connect at least half of the vertices in each group to the root. As we discuss in Section 6, partial group Steiner tree generalizes group Steiner tree. However, unlike group Steiner tree it admits a natural bicriteria relaxation: instead of connecting $\frac{1}{2}$ of the nodes in each group we could require that our algorithm only connects, say, $\frac{1}{2}(1 - \epsilon)$ of all nodes in each group for some $\epsilon > 0$. Thus, our result for this problem can be seen as showing that there is indeed a deterministic poly-log competitive algorithm for online group Steiner tree, as posed in the above open question of Alon et al. [7], provided the algorithm can be bicriteria in the relevant sense. More formally, we obtain a deterministic poly-log bicriteria approximation for this problem which connects at least $\frac{1 - \epsilon}{2}$ of the nodes in each group (notated "$(1 - \epsilon)$-connection competitive" below) by using our copy tree embeddings and a "water-filling" algorithm to solve the tree case.

Theorem 6.

There is a deterministic poly-time algorithm for online partial group Steiner tree which given any $\epsilon > 0$ is $O\left(\frac{\log n}{\epsilon}\right)$-cost-competitive and $(1 - \epsilon)$-connection competitive.

As we later observe, providing a deterministic poly-log-competitive algorithm for online partial group Steiner tree with any constant bicriteria relaxation is strictly harder than providing a deterministic poly-log-competitive algorithm for online (non-group) Steiner tree. Thus, this result also generalizes the fact that a deterministic poly-log approximation is known for online (non-group) Steiner tree [43]. Additionally, as a corollary we obtain the first non-trivial deterministic approximation algorithm for online group Steiner tree, albeit one with a linear dependence on the maximum group size. As mentioned above, our approach for this problem requires that we use a copy tree with a poly-log copy number, thereby requiring that we use our first rather than our second construction.

We next adapt and apply our embeddings in the demand-robust setting.

Application 3: Demand-Robust Steiner Problems (Section 7).

We begin by generalizing copy tree embeddings to demand-robust copy tree embeddings. Roughly, these are copy tree embeddings which simultaneously work well for every possible demand-robust scenario. We then adapt our analysis from our previous constructions to show that these copy tree embeddings exist. Lastly, we apply demand-robust copy tree embeddings to give poly-log approximations for the demand-robust versions of several Steiner problems (Steiner forest, group Steiner tree and group Steiner forest) for which, prior to this work, nearly nothing was known. In particular, the only non-trivial algorithms known for demand-robust Steiner problems prior to this work are an algorithm for Steiner tree [25] and an algorithm for demand-robust Steiner forest on trees with exponential scenarios [30] (which is, in general, incomparable to the usual demand-robust setting). To show these results, we apply our demand-robust copy tree embeddings to reduce these problems to their tree case. Thus, we also give our results on trees which are themselves non-trivial.

Theorem 7.

There is a randomized poly-time $O(\log n)$-approximation algorithm for the demand-robust group Steiner tree problem on weighted trees.

Theorem 8. There is a randomized poly-time $O(D \cdot \log n)$-approximation algorithm for the demand-robust group Steiner forest problem on weighted trees of depth $D$.

Theorem 9. There is a randomized poly-time $O(\log n)$-approximation algorithm for the demand-robust group Steiner tree problem on weighted graphs.

Theorem 10. There is a randomized poly-time $O(\log n)$-approximation for the demand-robust group Steiner forest problem on weighted graphs with polynomially-bounded aspect ratio.

Footnote (on the bicriteria guarantee of Theorem 6): We explicitly note here that this bicriteria guarantee does not yield a solution to the open problem of [7] of finding a poly-log deterministic approximation to the online group Steiner tree problem.

[...] on trees. Thus, we emphasize that going through the copy tree embedding is crucial for our application: a more direct approach of using online rounding schemes on the general problem does not seem to yield useful results.

Further Applications.

Lastly, we note that copy tree embeddings were integral to a follow-up work of the same set of authors [41], in which we gave the first poly-log approximations for the hop-constrained version of many classic network design problems, including hop-constrained Steiner forest [4], group Steiner tree and buy-at-bulk network design [9].

We survey some additional work before moving on to our results.

The group Steiner tree problem was introduced by Reich and Widmayer [55] as an important problem in VLSI design. Garg et al. [34] gave the first randomized poly-log approximation for offline group Steiner tree using linear program rounding. Charikar et al. [18] derandomized this result and Chekuri et al. [21] showed that a greedy algorithm achieves similar results. Demaine et al. [24] gave improved algorithms for group Steiner tree on planar graphs.

As earlier mentioned, Alon et al. [7] gave the first randomized poly-logarithmic competitive algorithm for online group Steiner tree which works against oblivious adversaries and posed the existence of a deterministic poly-log approximation as an open question. Very recently Bienkowski et al. [15] made exciting progress towards this open question by giving a poly-log deterministic approximation for online non-metric facility location, which is equivalent to online group Steiner tree on trees with depth 2. We complement this result by narrowing the remaining gap on this question "from the other end" by showing that the tree case is all that needs to be considered. Bienkowski et al. also note that they believe that their methods could be used to give a deterministic poly-log-competitive algorithm for group Steiner tree on trees which, when combined with our own results, would settle this open question.

Alon et al. [7] introduced the group Steiner forest problem to study online network formation. Chekuri et al. [23] gave the first poly-log approximation algorithm for offline group Steiner forest and posed the existence of a poly-log-competitive online algorithm as an open question. Naor et al. [53] answered this question in the affirmative by showing that a randomized algorithm which works against oblivious adversaries exists, but presently no adaptive-adversary-robust or deterministic poly-log-competitive online algorithm is known.

We note some nuances regarding necessary assumptions on the power of online algorithms for group Steiner tree and forest with an adaptive adversary. Alon et al. [6] observed that online set cover has no sub-polynomial-competitive algorithm against an adaptive adversary if the set system is not known beforehand. On the other hand, the same work showed how to give a poly-log-competitive algorithm for online set cover if the algorithm knows all possible elements the adaptive adversary might reveal (where the poly-log is poly-logarithmic in the total number of possible revealed elements). Set cover can easily be reduced to group Steiner tree on a tree where edges correspond to sets and elements correspond to leaves of the tree. Consequently, formulating any poly-log-competitive and adaptive-adversary-robust or deterministic algorithm for group Steiner tree requires that the algorithm knows all possible groups the adversary might reveal and that the number of possible groups is polynomially-bounded. As group Steiner tree is a special case of group Steiner forest, an analogous fact holds for group Steiner forest; namely all possible $(A_i, B_i)$ pairs that the adaptive adversary might reveal must be known beforehand to the algorithm for a poly-log competitive ratio and the number of such pairs must be polynomially-bounded.

Our embeddings are similar in spirit to Ramsey trees and Ramsey tree covers [3, 13, 16, 51, 52]. Specifically, it is known that for every metric $(V, d)$ and $k$ there is some subset $S \subseteq V$ of size at least $n^{1 - 1/k}$ which embeds into a tree, a so-called Ramsey tree, with distortion $O(k)$ [51].
Recursively applying (a slight strengthening of) this fact shows that there exist collections of Ramsey trees, so-called Ramsey tree covers, where each vertex $v$ has some "home tree" in which the distances to $v$ are preserved. A concurrent work of Filtser [32] employed this machinery to devise "clan embeddings" where the trees of a Ramsey tree cover are merged and, like in our work, each vertex is mapped to its copies. This line of work has led to many applications in metric-type problems such as compact routing schemes. However, the guarantees of Ramsey tree covers and the embeddings built on them are insufficient for the connectivity problems in which we are interested in a slightly subtle way. We are interested in preserving the costs of entire subgraphs which, roughly speaking, requires that pairwise distances be preserved in every tree that we merge. For this reason our copy tree embedding construction will use much of the machinery of the "well-padded tree covers" of Gupta et al. [37], which (implicitly) give exactly this guarantee, rather than Ramsey-tree-type machinery.

Another recent work of Bartal et al. [14] was also concerned with tree embeddings for (not necessarily deterministic) online algorithms. This work designed tree embeddings to give competitive algorithms for network design problems whose competitive ratios are poly-logarithmic in the number of relevant terminals as opposed to the total number of nodes, $n$.

Lastly, we note that there has been considerable work on extending the power of tree embeddings to a variety of other settings including tree embeddings for planar graphs [49], dynamic tree embeddings [20, 33], distributed tree embeddings [46] and tree embeddings where the resulting tree is a subgraph of the input graph [1, 2, 5, 26, 50].

Throughout this paper we will work with weighted graphs of the form $G = (V, E, w)$ where $V$ and $E$ are the vertex and edge sets of $G$ and $w : E \to \mathbb{R}_{\ge 1}$ gives the weight of edges.
We typically assume that $n := |V|$ is the number of nodes and write $[n] = \{1, 2, \ldots, n\}$. We will also use $V(G)$, $E(G)$ and $w_G$ to stand for the vertex set, edge set and weight function of $G$. Similarly, we will use $w_e$ to stand for $w(e)$ where convenient. For a subset of edges $F \subseteq E$, we use the notation $w(F) := \sum_{e \in F} w_G(e)$. We use $d_G : V \times V \to \mathbb{R}_{\ge 0}$ to give the shortest path metric according to $w$. We will talk about the diameter of a metric $(V, d)$ which is $\max_{u, v \in V} d(u, v)$; we notate the diameter with $D$. We use $B(v, x) := \{u \in V : d(v, u) \le x\}$ to stand for the closed ball of $v$ of radius $x$ in metric $(V, d)$ and $B_G(v, x)$ if $(V, d)$ is the shortest path metric of $G$ and we need to disambiguate which graph we are taking balls with respect to. We will sometimes identify a graph with the metric which it induces.

Notice that we have assumed that edge weights are non-zero and at least 1. This will be without loss of generality as for our purposes any 0 weight edges may be contracted and scaling of edge weights ensures that the minimum edge weight is at least 1.

In this section we give our two constructions of copy tree embeddings. We begin by giving our first copy tree embedding construction based on merging partial tree embeddings.

Theorem 3.

There is a poly-time deterministic algorithm which given any weighted graph $G = (V, E, w)$ and root $r \in V$ computes an efficient and well-separated $O(\log^2 n)$-approximate copy tree embedding with copy number $O(\log n)$.

If it were possible to give a single tree embedding which simultaneously preserved all distances between all nodes then we could simply take such a tree embedding as our copy tree embedding. However, such a tree embedding is, in general, impossible. The key insight we use to overcome this issue is that one can approximately preserve distances in a deterministic way if one only embeds a constant fraction of all nodes in the input metric; we call such an embedding a partial tree embedding. Combining $O(\log n)$ such partial tree embeddings will give our construction.

In more detail, in Section 4.1 we show that an appropriate collection of $O(\log n)$ "padded hierarchical decompositions" gives $O(\log n)$ partial tree embeddings where every node is embedded in a constant fraction of them. Next, we show that such a collection of partial tree embeddings indeed gives us a copy tree embedding as in Theorem 3; the main observation that this reduction relies on is the constant congestion induced by Euler tours, which will allow us to project from our input graph to our partial tree embeddings in a cost and connectivity-preserving fashion. Thus, our goal after this point is to compute an appropriate collection of padded hierarchical decompositions.

In Section 4.2 we proceed to show how to compute the required collection of padded hierarchical decompositions. Our construction of hierarchical decompositions will make use of the FRT cutting scheme and paddedness properties of it previously observed by Gupta et al. [37]. To this end, we provide a novel derandomization of a node-weighted version of the FRT cutting scheme by combining the powerful multiplicative weights methodology [8] with the classic method of conditional expectation and pessimistic estimators.

Gupta et al. [37] introduced the idea of padded hierarchical decompositions which we illustrate in Figure 3.

Definition 11.

A hierarchical decomposition $\mathcal{H}$ of a metric $(V, d)$ of diameter $D$ is a sequence of partitions $\mathcal{P}_0, \ldots, \mathcal{P}_h$ of $V$ where $h = \Theta(\log D)$ and:
1. The partition $\mathcal{P}_h$ is one part containing all of $V$;
2. Each part in $\mathcal{P}_i$ has diameter at most $2^i$;
3. $\mathcal{P}_i$ is a refinement of $\mathcal{P}_{i+1}$; that is, every part in $\mathcal{P}_i$ is contained in some part of $\mathcal{P}_{i+1}$.

Notice that each part of $\mathcal{P}_0$ is a singleton node by our assumption that edge weights are at least 1 (we assume that the constant in the theta notation of $h = \Theta(\log D)$ is sufficiently large).

Definition 12 ($\alpha$-Padded Node). For some $\alpha \le 1$, a node $v$ is $\alpha$-padded in hierarchical decomposition $\mathcal{P}_0, \ldots, \mathcal{P}_h$ if for all $i \in [0, h]$ the ball $B(v, \alpha \cdot 2^i)$ is contained in some part of $\mathcal{P}_i$.

The main result we show in this section is how to use a collection of padded hierarchical decompositions to construct a copy tree embedding.
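To make Definitions 11 and 12 concrete, the following is a minimal sketch of how one might test whether a node is $\alpha$-padded. It assumes the metric is given as a nested dict `dist[u][v]` and the decomposition as a list `partitions` whose entry `i` is the list of parts of level $i$ (levels indexed from 0); the function names and the four-point example are ours and purely illustrative.

```python
def closed_ball(dist, v, radius):
    """Closed ball B(v, radius) = {u : d(v, u) <= radius}."""
    return {u for u in dist[v] if dist[v][u] <= radius}


def is_alpha_padded(dist, partitions, v, alpha):
    """True iff B(v, alpha * 2**i) lies inside a single part of level i, for every i."""
    for i, parts in enumerate(partitions):
        ball = closed_ball(dist, v, alpha * 2 ** i)
        if not any(ball <= part for part in parts):
            return False
    return True


# Hypothetical example: four points on a line with unit spacing.
points = range(4)
dist = {a: {b: abs(a - b) for b in points} for a in points}
partitions = [
    [{0}, {1}, {2}, {3}],   # level 0: singletons
    [{0, 1}, {2, 3}],       # level 1: parts of diameter 1 <= 2
    [{0, 1, 2, 3}],         # level 2: one part of diameter 3 <= 4
]
padded = [v for v in points if is_alpha_padded(dist, partitions, v, alpha=0.5)]
```

In this toy instance the two endpoints are padded for $\alpha = 0.5$, while the two middle points are not because their radius-1 balls straddle the level-1 cut.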

Lemma 13.

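Let $\{\mathcal{H}_i\}_{i=1}^{k}$ be a collection of hierarchical decompositions of weighted graph $G = (V, E, w)$ such that every $v$ is $\alpha$-padded in at least $.9k$ decompositions. Then, there is a poly-time deterministic algorithm which, given $\{\mathcal{H}_i\}_{i=1}^{k}$ and a root $r \in V$, returns an efficient and well-separated $O(k/\alpha)$-approximate copy tree embedding with copy number $k$.

Figure 3: [...] Each part in each $\mathcal{P}_i \in \mathcal{H}$ is colored according to $i$; singleton parts not pictured. We give $\alpha$-padded nodes in green and all other nodes in red, and we illustrate why the node on the far left is $\alpha$-padded and the node on the far right is not by drawing $B(v, \alpha \cdot 2^i)$ at the relevant levels $i$ for these two nodes.

We now formalize the notion of a partial tree embedding.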

Definition 14 (Partial Tree Embedding). A $\gamma$-partial tree embedding of metric $(V, d)$ is a well-separated weighted tree $T = (V', E', w)$ where:
1. Partial Embedding: $V' \subseteq V$;
2. Worst-Case Distance Preservation: For any $u, v \in V'$ we have $d(u, v) \le d_T(u, v) \le \gamma \cdot d(u, v)$.

In the remainder of this section we show how good padded hierarchical decompositions deterministically give good partial tree embeddings.

The reason padded decompositions will be useful for us is that, as we prove in the following lemma, all distances between padded nodes are well-preserved. Given a hierarchical decomposition $\mathcal{H}$ we let $T_{\mathcal{H}}$ be the natural well-separated tree corresponding to $\mathcal{H}$. In particular, a hierarchical decomposition $\mathcal{H}$ naturally corresponds to a well-separated tree which has a node for each part and an edge of weight $2^i$ between a part in $\mathcal{P}_i$ and a part in $\mathcal{P}_{i+1}$ if the latter contains the former. In Figure 4a we illustrate the well-separated tree corresponding to the hierarchical decomposition in Figure 3a. We will slightly abuse notation and identify each singleton set in such a tree with its one constituent vertex.

Lemma 15.

If nodes $u, v$ are $\alpha$-padded in a hierarchical decomposition $\mathcal{H}$ then $d(u, v) \le d_{T_{\mathcal{H}}}(u, v) \le O(\frac{1}{\alpha} \cdot d(u, v))$. (This fact seems to be implicit in Gupta et al. [37] but is never explicitly proven.)

Proof. Let $T_{\mathcal{H}}$ be the well-separated tree corresponding to $\mathcal{H}$. Let $w$ be the least common ancestor of $u$ and $v$ in $T_{\mathcal{H}}$ and let $l$ be the height of $w$ in $T_{\mathcal{H}}$. By the definition of $T_{\mathcal{H}}$, the distance between $u$ and $v$ in $T_{\mathcal{H}}$ is $d_{T_{\mathcal{H}}}(u, v) = 2 \cdot \sum_{i=1}^{l} 2^i$ and so we have

$2^{l+1} \le d_{T_{\mathcal{H}}}(u, v) \le 2^{l+2}$. (1)

We next prove that $d_{T_{\mathcal{H}}}(u, v) \le O(\frac{1}{\alpha} \cdot d(u, v))$. Notice that for $j = \lceil \log(d(u, v)/\alpha) \rceil$ we know that $B(v, \alpha \cdot 2^j)$ contains $u$ since for this $j$ it holds that $\alpha \cdot 2^j \ge d(u, v)$. Since $v$ is $\alpha$-padded in $\mathcal{H}$ it follows that $B(v, \alpha \cdot 2^j)$ is contained in some part of $\mathcal{P}_j$; but it then follows that the least common ancestor of $u$ and $v$ is at height at most $j$ and so $l \le \lceil \log(d(u, v)/\alpha) \rceil$. Combining this with the upper bound in Equation (1) we have

$d_{T_{\mathcal{H}}}(u, v) \le 2^{l+2} \le 2^{\lceil \log(d(u,v)/\alpha) \rceil + 2} \le O(\frac{1}{\alpha} \cdot d(u, v))$.

We now prove that $d(u, v) \le d_{T_{\mathcal{H}}}(u, v)$. Since the diameter of each part in $\mathcal{P}_i$ is at most $2^i$ we know that the least common ancestor of $u$ and $v$ in $T_{\mathcal{H}}$ corresponds to a part with diameter at most $2^l$. However, since the least common ancestor of $u$ and $v$ corresponds to a part which contains both $u$ and $v$, we must have $d(u, v) \le 2^l \le 2^{l+1}$. Combining this with the lower bound in Equation (1) we have $d(u, v) \le d_{T_{\mathcal{H}}}(u, v)$ as desired.

Figure 4: How to turn a hierarchical decomposition into a partial tree embedding. (a) Tree corresponding to the Figure 3a hierarchical decomposition. (b) Contract to ensure $r$ is root of the resulting tree. (c) Multiply weights by 4 and contract non-$\alpha$-padded vertices. We color nodes from the input metric in green if they are padded and red otherwise. Remaining nodes are colored according to their corresponding hierarchical decomposition part. $r$ is the node on the far left of the tree.

We show how to turn a hierarchical decomposition into a partial tree embedding in the next lemma which we illustrate in Figure 4.

Lemma 16.

Given a hierarchical decomposition $\mathcal{H}$ on metric $(V, d)$ and root $r \in V$ which is $\alpha$-padded in $\mathcal{H}$, one can compute in deterministic poly-time an $O(1/\alpha)$-partial tree embedding $T = (V', E', w)$ with root $r$ where $V' := \{v \in V : v \text{ is } \alpha\text{-padded}\}$.

Proof. Let $T_{\mathcal{H}}$ be the well-separated tree which corresponds to $\mathcal{H}$ as described above.

We construct $T$ from $T_{\mathcal{H}}$ using Lemma 15 and a trick of Konjevod et al. [49]. Let $V'$ be all leaves of $T_{\mathcal{H}}$ whose corresponding nodes are $\alpha$-padded in $\mathcal{H}$. Next, contract the path from $r$ to the root of $T_{\mathcal{H}}$ and identify the resulting node with $r$. Then, delete from $T_{\mathcal{H}}$ all sub-trees which do not contain a node in $V'$; in the resulting tree every node is either in $V'$ or the ancestor of a node in $V'$. Next, while there exists a node $v$ such that its parent $u$ is not in $V'$ we contract $\{v, u\}$ into one node and identify the resulting node with $v$. Lastly, we multiply the weight of every edge by 4 and return the result as $T = (V', E', w)$ where, again, $w$ is the weight function of $T_{\mathcal{H}}$ times 4.

Clearly, the vertex set of $T$ will be $V'$. Moreover, $T$ is well-separated since $T_{\mathcal{H}}$ was well-separated and $r$ will be the root of $T$ by construction.

We now use an analysis of Konjevod et al. [49] to show that for any pair of vertices $u, v \in V'$ we have

$d_{T_{\mathcal{H}}}(u, v) \le d_T(u, v) \le 4 \cdot d_{T_{\mathcal{H}}}(u, v)$. (2)

The upper bound is immediate from the fact that we only contract edges and then multiply all edge weights by 4. To see the lower bound, $d_{T_{\mathcal{H}}}(u, v) \le d_T(u, v)$, notice that if $u$ and $v$ have a least common ancestor $a$ at height $l$ in $T_{\mathcal{H}}$, then $d_{T_{\mathcal{H}}}(u, v) = 2^{l+2} - 4$. However, the closest $u$ and $v$ can be in $T$ is if (without loss of generality) $u$ is identified with $a$ and $v$ is a child of $u$ in $T$; the length of this edge is the length of a child edge of $a$ in $T_{\mathcal{H}}$ times four, which is $2^{l+2}$. Thus $d_{T_{\mathcal{H}}}(u, v) = 2^{l+2} - 4 \le 2^{l+2} = d_T(u, v)$.

Finally, we conclude by applying Lemma 15. In particular, it remains to show $d(u, v) \le d_T(u, v) \le O(\frac{1}{\alpha} \cdot d(u, v))$ but this is immediate by combining Lemma 15 and Equation (2).

We now describe how partial tree embeddings satisfy useful connectivity properties and then use these properties to construct a copy tree embedding from a collection of good partial tree embeddings.

The following two lemmas demonstrate how to map to and from partial tree embeddings in a way that preserves cost and connectivity.
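Before moving on, the tree distance used in Lemmas 15 and 16 can be made concrete. The sketch below is illustrative only: it assumes a decomposition is stored as a list of levels (each a list of sets, level 0 being singletons) and it assumes the closed form read off from the proof of Lemma 15, namely that two leaves whose least common ancestor sits at height $l$ are at distance $2 \cdot \sum_{i=1}^{l} 2^i = 2^{l+2} - 4$. The function names are ours.

```python
def lca_height(partitions, u, v):
    """Smallest level l at which u and v lie in the same part."""
    for level, parts in enumerate(partitions):
        if any(u in part and v in part for part in parts):
            return level
    raise ValueError("the top partition must contain u and v in one part")


def tree_distance(partitions, u, v):
    """Distance between leaves u and v: 2 * sum_{i=1}^{l} 2**i, with l the LCA height."""
    l = lca_height(partitions, u, v)
    return 2 * sum(2 ** i for i in range(1, l + 1))


# Hypothetical example: four points on a line with unit spacing.
line_dist = {a: {b: abs(a - b) for b in range(4)} for a in range(4)}
levels = [
    [{0}, {1}, {2}, {3}],
    [{0, 1}, {2, 3}],
    [{0, 1, 2, 3}],
]
# The tree never shrinks a distance, matching the lower bound of Lemma 15.
dominates = all(
    line_dist[a][b] <= tree_distance(levels, a, b)
    for a in range(4) for b in range(4)
)
```

The lower bound holds for every pair here, padded or not; only the upper bound of Lemma 15 needs paddedness.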

Lemma 17 (Graph → Partial Tree Projection). Let $G = (V, E, w_G)$ be a weighted graph and let $T = (V', E', w_T)$ be a $\gamma$-partial tree embedding of (the metric induced by) $G$. There exists a deterministic, poly-time computable function $\pi : 2^E \to 2^{E'}$ such that for all sets of edges $F \subseteq E$ the following holds:
1. Connectivity Preservation: If $u, v \in V'$ are connected by $F$ in $G$, then they are connected by $\pi(F)$ in $T$;
2. Cost Preservation: $w_T(\pi(F)) \le O(\gamma) \cdot w_G(F)$.

Proof. We first simplify $F$ by noticing it is sufficient to prove the claim on every connected component in isolation. Furthermore, we can assume without loss of generality that $F$ is a tree since taking a spanning tree of $F$ can only decrease $w_G(F)$ and appropriately maintains connectivity. Finally, we repeatedly delete every leaf that is not in $V'$, which decreases $w_G(F)$ and maintains connectivity among the nodes of $V'$.

We define $\pi(F)$ to be the unique minimal subtree of $T$ which contains all nodes of $V'$ that are incident to an edge in $F$. By transitivity of connectedness, we know that if $u, v \in V'$ are connected in $F$ then they must also be connected in $\pi(F)$. Also, note that $\pi$ is trivially deterministic poly-time computable.

It remains to argue the cost preservation property. Double the edges of $F$; we call this multigraph $2F$. Since the degree of every vertex in $2F$ is even, we know that $2F$ has an Euler tour. Using this tour we can partition $2F$ into a set $\mathcal{P}$ of paths where each path connects two nodes in $V'$ and the paths in $\mathcal{P}$ are multiedge-disjoint. Therefore, we have that $2 w_G(F) = \sum_{P \in \mathcal{P}} w_G(P)$.

For each path $P \in \mathcal{P}$ in the tour between nodes $u, v \in V'$, we say that $P$ covers all edges in $T$ between $u$ and $v$ and let $P'$ be the path in $T$ between $u$ and $v$. We note that every edge in $\pi(F)$ is covered by at least one path, hence $w_T(\pi(F)) \le \sum_{P \in \mathcal{P}} w_T(P')$.

For every path $P$ in $G$ connecting two nodes $u, v \in V'$ the distance-preservation property of $\gamma$-partial tree embeddings implies that $w_T(P') \le O(\gamma) \cdot w_G(P)$. Hence we have that $w_T(\pi(F)) \le \sum_{P \in \mathcal{P}} w_T(P') \le O(\gamma) \cdot \sum_{P \in \mathcal{P}} w_G(P) \le O(\gamma) \cdot w_G(F)$ as required.

We now show how to project in the reverse direction.

Lemma 18 (Partial Tree → Graph Projection). Let $G = (V, E, w_G)$ be a weighted graph and let $T = (V', E', w_T)$ be a $\gamma$-partial tree embedding of (the metric induced by) $G$. There exists a deterministic, poly-time computable function $\imath : 2^{E'} \to 2^E$ such that for all sets of edges $F' \subseteq E'$ the following holds:
1. Connectivity Preservation: If $u, v \in V'$ are connected by $F'$ in $T$, then they are connected by $\imath(F')$ in $G$;
2. Cost Preservation: $w_G(\imath(F')) \le w_T(F')$.

Proof. For an edge $e' \in E'$ connecting $u, v \in V'$, we define $\imath(\{e'\})$ as some shortest path between $u$ and $v$ in $G$. Note that this implies that $w_G(\imath(\{e'\})) \le w_T(e')$ by the properties of a partial tree embedding.

We extend $\imath$ to $F' \subseteq E'$ by defining $\imath(F') := \bigcup_{e' \in F'} \imath(\{e'\})$. Notice that $\imath$ is indeed deterministic, poly-time computable and is connectivity preserving by the transitivity of connectivity.

We now verify the cost preservation of $\imath$: we have that $w_G(\imath(F')) = w_G(\bigcup_{e' \in F'} \imath(\{e'\})) \le \sum_{e' \in F'} w_G(\imath(\{e'\})) \le \sum_{e' \in F'} w_T(e') = w_T(F')$.

Using these two properties we can conclude our proof of Lemma 13, which we restate here.

Lemma 19.

Let $\{\mathcal{H}_i\}_{i=1}^{k}$ be a collection of hierarchical decompositions of weighted graph $G = (V, E, w)$ such that every $v$ is $\alpha$-padded in at least $.9k$ decompositions. Then, there is a poly-time deterministic algorithm which, given $\{\mathcal{H}_i\}_{i=1}^{k}$ and a root $r \in V$, returns an efficient and well-separated $O(k/\alpha)$-approximate copy tree embedding with copy number $k$.

Proof. Our embedding is obtained by combining the above lemmas in the natural way.

Specifically, we first apply Lemma 16 to all decompositions in $\{\mathcal{H}_i\}_{i=1}^{k}$ in which $r$ is $\alpha$-padded to get back $O(1/\alpha)$-partial tree embeddings $\{T_i\}_i$ where $V(T_i) = \{v : v \text{ is } \alpha\text{-padded in } \mathcal{H}_i\}$. Next we apply Lemma 17 and Lemma 18 to each $T_i$ to get back mapping functions $\pi_i$ and $\imath_i$ respectively.

We now describe our $O(k/\alpha)$-approximate copy tree embedding $(T, \phi, \pi_{G \to T}, \pi_{T \to G})$. We let $T$ be the tree resulting from taking all trees in $\{T_i\}_i$ and then identifying all copies of $r$ as the same vertex. Similarly, we let $\phi(v)$ be the set of all copies of $v$ in $T$ in the natural way. Next we let $\pi_{G \to T}(F)$ be $\bigcup_i \pi_i(F)$ where $\pi_i$ is projected onto $T$ in the natural way. We let $\pi_{T \to G}(F') := \bigcup_i \imath_i(F')$ be defined analogously.

Since each vertex appears in at least a $.9$ fraction of the $T_i$, by the pigeonhole principle we know that any pair connected by $F$ in $G$ must occur in some $\mathcal{H}_i$ together with $r$ and so must be connected in $\pi_i(F)$ for some $i$ where $T_i \in \{T_i\}_i$, and so some pair of corresponding copies are connected by $\pi_{G \to T}$; an analogous result holds for $\pi_{T \to G}$. The remaining properties of our embedding are immediate from the above cited lemmas.

In the previous section we reduced computing good copy tree embeddings to computing good hierarchical decompositions. The existence of good hierarchical decompositions is immediate from prior work of Gupta et al. [37] and FRT.

Lemma 20 (Gupta et al. [37]). Let $\mathcal{H}$ be the hierarchical decomposition resulting from a tree drawn from the Fakcharoenphol et al. [29] cutting scheme. Then, every vertex is $\Omega(1/\log n)$-padded with constant probability in $\mathcal{H}$.

A simple Chernoff and union bound proof then gives that $O(\log n)$ draws give a collection of hierarchical decompositions in which every vertex is $\Omega(1/\log n)$-padded in a constant fraction of the decompositions with high probability, i.e. with probability at least $1 - 1/\mathrm{poly}(n)$.

However, we are ultimately interested in a deterministic algorithm which is robust to adaptive adversaries and so we must derandomize the above with high probability result. We proceed to do so in this section.

To our knowledge, prior derandomizations of this cutting scheme (see, e.g., Chekuri et al. [22] or Fakcharoenphol et al. [29]) do not provide sufficiently strong guarantees for our purposes. We also note that the authors of Gupta et al. [37] claim to give a deterministic algorithm for computing hierarchical decompositions in a forthcoming journal version of their paper but said journal version never seems to have been published.

4.2.1 Derandomization Intuition

The intuition behind our derandomization is as follows. A single draw from the FRT cutting scheme guarantees that each node is $\Omega(1/\log n)$-padded with constant probability. If we could derandomize this result then we could produce one hierarchical decomposition such that at least a $.99$ fraction of all nodes are $\Omega(1/\log n)$-padded. Indeed, as we will see, standard derandomization techniques (the method of pessimistic estimators and conditional expectation) will allow us to do exactly this. However, since we must produce a collection of hierarchical decompositions in which every node is padded in a large fraction of all decompositions, it is not clear how, then, to handle the remaining $.01$ fraction of nodes. One might simply rerun the aforementioned derandomization result on the remaining $.01$ fraction of nodes, then on the remaining $.001$ fraction of nodes and so on logarithmically-many times; however, it is easy to see that in the resulting collection of decompositions, while every node is padded in some decomposition, no node is necessarily padded in a large fraction of all the decompositions.

Rather, we would like to repeatedly run our derandomization on all nodes but in a way that takes into account which nodes are already padded in a large fraction of the decompositions we have already produced. In particular, if a node was already padded in most of the decompositions we have so far produced, we need not worry about producing decompositions in which this node is padded. Thus, we would like to derandomize in a way that would make such a node less likely to be padded in the remaining decompositions we produce while making nodes which have not so far been padded in many decompositions more likely to be padded.

To accomplish this, we will formulate and then derandomize a node-weighted version of Lemma 20; this, in turn, will allow us to down-weight nodes which are padded in a large fraction of the decompositions we have so far produced when we run our derandomization; a multiplicative-weights-type analysis will then allow us to conclude our deterministic construction.
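As a point of reference for the derandomization that follows, here is a minimal sketch of one randomized draw from an FRT-style cutting scheme. It is not the paper's deterministic algorithm. It assumes a metric given as a nested dict with distinct points at distance at least 1, level radii $r_i = 2^{i-1} \cdot \beta$, and $\beta$ drawn from $[1/2, 1)$ (our reading of the parameter range, chosen so that level-$i$ parts have diameter at most $2^i$). The function name, the `seed` parameter and the choice of $h$ are ours.

```python
import math
import random


def frt_decomposition(dist, seed=0):
    """One random hierarchical decomposition; returns levels[0..h], each a list of sets."""
    rng = random.Random(seed)
    nodes = sorted(dist)
    perm = nodes[:]
    rng.shuffle(perm)                      # uniformly random permutation pi
    beta = 0.5 + 0.5 * rng.random()        # assumption: beta uniform in [1/2, 1)
    diameter = max(dist[u][v] for u in nodes for v in nodes)
    h = max(1, math.ceil(math.log2(diameter)))   # so the top part has diameter <= 2**h

    levels = [None] * (h + 1)
    levels[h] = [set(nodes)]               # trivial top partition
    for i in range(h - 1, -1, -1):
        radius = 2 ** (i - 1) * beta       # r_i = 2^(i-1) * beta
        refined = []
        for part in levels[i + 1]:
            clusters = {}
            for v in part:
                # v joins the first node of the permutation whose ball contains it;
                # that node need not lie in the current part.
                center = next(u for u in perm if dist[u][v] <= radius)
                clusters.setdefault(center, set()).add(v)
            refined.extend(clusters.values())
        levels[i] = refined
    return levels


# Hypothetical example: six points on a line with unit spacing.
line = {a: {b: abs(a - b) for b in range(6)} for a in range(6)}
decomposition = frt_decomposition(line, seed=0)
```

The deterministic construction replaces the two random choices here, the permutation and $\beta$, with choices fixed one step at a time.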

In order to give our deterministic construction we must unpack the black box of the FRT cutting scheme.

The Fakcharoenphol et al. [29] cutting scheme, given metric $(V, d)$ where $d(u, v) \ge 1$ for all distinct $u, v \in V$, produces a hierarchical decomposition $\mathcal{H} = \{\mathcal{P}_0, \ldots, \mathcal{P}_h\}$ and is as follows. We first pick a uniformly random permutation $\pi$ on $V$ and a uniformly random value $\beta \in [\frac{1}{2}, 1)$. We let the radius for level $i$ be $r_i := 2^{i-1} \cdot \beta$. We let $\mathcal{P}_h$ be the trivial partition containing all vertices of $V$ with $h = O(\log \max_{u,v} d(u, v))$. Next, we construct $\mathcal{P}_i$ by refining $\mathcal{P}_{i+1}$; in particular we divide each part $P_{i+1} \in \mathcal{P}_{i+1}$ into additional parts as follows. Each $v \in P_{i+1}$ is assigned to the first vertex $u$ in $\pi$ for which $v \in B(u, r_i)$. Notice that $u$ need not be in $P_{i+1}$. Let $C_u$ be all vertices in $P_{i+1}$ which are assigned to $u$ and add to $\mathcal{P}_i$ all $C_u$ which are non-empty. Notice that here $C_u$ really depends on $i$; we suppress this dependence in our notation for cleanliness of presentation.

One can easily verify that the resulting partitions indeed form a hierarchical decomposition.

As discussed above, our goal is to derandomize Lemma 20 while taking node weights into account. Suppose we have a distribution $\{p_v\}_v$ over vertices in $V$; intuitively this distribution indicates how important each vertex is in regards to being $\alpha$-padded. Then by Lemma 20 and linearity of expectation we have

$E_{\pi,\beta}[\sum_v p_v \cdot I(v \text{ is } \Omega(1/\log n)\text{-padded in } \mathcal{H})] = \sum_v p_v \cdot \Pr_{\pi,\beta}(v \text{ is } \Omega(1/\log n)\text{-padded in } \mathcal{H}) \ge .95$,

where $I$ is the indicator function.

Thus, our goal will be to gradually fix the randomness of $\pi$ and $\beta$ until we have found a way to deterministically set $\beta$ and $\pi$ so that at least a $.95$ fraction of nodes (weighted by the $p_v$'s) are $\Omega(1/\log n)$-padded. That is, we aim to use the method of conditional expectation.

We will treat a permutation $\pi$ as an ordering of the elements of $V$. E.g. $(v_2, v_1, v_3)$ is a permutation of $V = \{v_1, v_2, v_3\}$. Now, suppose we have fixed a prefix $\pi_P$ of $\pi$ which orders nodes $P \subseteq V$ and among the remaining $\bar{P} := V \setminus P$ we choose the remaining suffix $\pi_{\bar{P}}$ uniformly at random. That is, $\pi = \pi_P \odot \pi_{\bar{P}}$ where $\pi_P$ is fixed, $\pi_{\bar{P}}$ is a uniformly random permutation over $\bar{P}$ and $\odot$ is concatenation. Notice that it follows that every vertex of $P$ will precede every vertex of $\bar{P}$ in $\pi$.

Let $\mathcal{H}(\pi_P, \beta)$ be the hierarchical decomposition returned when we run the FRT cutting scheme as above with the input value of $\beta$ and with $\pi$ chosen as $\pi = \pi_P \odot \pi_{\bar{P}}$. Notice that provided $P \ne V$ we have that $\mathcal{H}(\pi_P, \beta)$ is randomly generated. Let $f(\pi_P, \beta) := \sum_v p_v \cdot \Pr_{\pi_{\bar{P}}}(v \text{ is } \Omega(1/\log n)\text{-padded in } \mathcal{H}(\pi_P, \beta))$ be the fraction of $\Omega(1/\log n)$-padded nodes by weight in expectation in $\mathcal{H}(\pi_P, \beta)$. We now show that there is a so-called "pessimistic estimator" $\hat{f}$ of $f$.

Lemma 21.

There is a function $\hat{f}$ such that:
1. Good start: There is some deterministically poly-time computable set $R \subseteq \mathbb{R}$ such that for some $\beta \in R$ we have $\hat{f}(\pi_\emptyset, \beta) \ge .95$;

and for any $P \subseteq V$, $\pi_P$ and $\beta$:
2. Computable: $\hat{f}(\pi_P, \beta)$ is computable in deterministic poly-time;
3. Monotone: $\hat{f}(\pi_P, \beta) \le \hat{f}(\pi_{P \cup \{v\}}, \beta)$ for some $v \in \bar{P}$;
4. Pessimistic: $\hat{f}(\pi_P, \beta) \le f(\pi_P, \beta)$ for all $\pi_P$ and $\beta$.

Proof. We will use an analysis similar to Gupta et al. [37] but which accounts for the fixed prefix $\pi_P$ of our permutation, demonstrates the above properties of our pessimistic estimator and guarantees that $R$ is computable in deterministic poly-time.

We begin by defining $\hat{f}$. Fix a $\pi_P$ and $\beta$ and let $\alpha = \Omega(1/\log n)$.

For node $v$, let $B_{i,v} := B(v, \alpha 2^i)$. Say that node $u$ protects $B_{i,v}$ if its ball at level $i$ contains $B_{i,v}$, i.e. if $r_i \ge d(u, v) + 2^i \alpha$. Say that $u$ threatens $B_{i,v}$ if its ball at level $i$ intersects $B_{i,v}$ but does not contain it, i.e. $d(u, v) - 2^i \alpha < r_i < d(u, v) + 2^i \alpha$. Finally, say that $u$ cuts $B_{i,v}$ if it threatens $B_{i,v}$ and is the first node in $\pi$ to threaten or protect $B_{i,v}$. Notice that if $B_{i,v}$ is not cut by any node for all $i$ then $v$ will be $\alpha$-padded.

In order for $B_{i,v}$ to be cut by $u$ it must be the case that $u$ threatens $B_{i,v}$ and no node before $u$ in $\pi$ threatens or protects $B_{i,v}$. By how we choose $r_i$, $u$ threatens $B_{i,v}$ if

$d(u, v) - 2^i \alpha < \beta \cdot 2^{i-1} < d(u, v) + 2^i \alpha$. (3)

In order for $u$ to be the first node to threaten or protect $B_{i,v}$, it certainly must be the case that every node which is closer to $v$ than $u$ appears after $u$ in $\pi$ (since every such node either threatens or protects $B_{i,v}$). Thus, we let $N_v(u) := \{w : d(w, v) \le d(u, v)\}$ be all nodes which are nearer to $v$ than $u$.

Lastly, a node which is too far or too close to $v$ cannot cut $B_{i,v}$. In particular, a node $u$ can only cut $B_{i,v}$ if

$2^{i-2} - 2^i \alpha \le d(u, v) \le 2^{i-1} + 2^i \alpha$. (4)

We let $C_{i,v} := \{u : 2^{i-2} - 2^i \alpha \le d(u, v) \le 2^{i-1} + 2^i \alpha\}$ be all such nodes which might cut $B_{i,v}$.

It follows that $B_{i,v}$ is cut only if there exists some $u$ in $C_{i,v}$ which both threatens $B_{i,v}$ and precedes all $w \in N_v(u) \setminus \{u\}$ in $\pi$. Thus, we define $\hat{f}$ as follows:

$\hat{f}(\pi_P, \beta) := 1 - \sum_{v,i} p_v \sum_{u \in C_{i,v}} \Pr_{\pi_{\bar{P}}}(u \text{ precedes all } w \in N_v(u) \setminus \{u\} \text{ in } \pi) \cdot I(u \text{ threatens } B_{i,v})$,

where, again, $I$ is the indicator function. We now verify properties (2)-(4).

2. Computable: Clearly $C_{i,v}$ is deterministically computable in poly-time since we need only check if Equation (4) holds for each vertex. Similarly $I(u \text{ threatens } B_{i,v})$ for each $u \in C_{i,v}$ can be computed by checking if Equation (3) holds. We can deterministically compute $\Pr_{\pi_{\bar{P}}}(u \text{ precedes all } w \in N_v(u) \setminus \{u\} \text{ in } \pi)$ for each $u \in C_{i,v}$ as follows: if $u$ precedes all $w \in N_v(u) \cap P$ in $\pi_P$ then this probability is 1; if $u$ is preceded in $\pi_P$ by some $w \in N_v(u)$ then this probability is 0; otherwise $P \cap N_v(u) = \emptyset$, meaning the order in $\pi$ of all nodes in $N_v(u)$ is set by $\pi_{\bar{P}}$; in this case $u$ precedes all nodes in $N_v(u) \setminus \{u\}$ with probability exactly $1/|N_v(u)|$.

3. Monotonicity is immediate by an averaging argument: in particular, $\hat{f}(\pi_P, \beta)$ is just an expectation taken over the randomness of $\pi_{\bar{P}}$ and so there must be some way to fix the next element of the prefix from $\bar{P}$ which achieves the expectation.

4. Pessimism is immediate from the above discussion; in particular, as discussed above a ball $B_{i,v}$ is cut only if there is some $u \in C_{i,v}$ which threatens $B_{i,v}$ and which precedes all $w$ in $N_v(u) \setminus \{u\}$ in $\pi$; it follows by a union bound that $v$ fails to be $\alpha$-padded with probability at most

$\sum_i \sum_{u \in C_{i,v}} \Pr_{\pi_{\bar{P}}}(u \text{ precedes all } w \in N_v(u) \setminus \{u\} \text{ in } \pi) \cdot I(u \text{ threatens } B_{i,v})$.

Finally, we conclude property (1): that there is some $\beta \in R$ where $R$ is computable in deterministic poly-time and $\hat{f}(\pi_\emptyset, \beta) \ge .95$. Consider drawing a $\beta \in [\frac{1}{2}, 1)$ as in the FRT cutting scheme; we will argue that $E_\beta[\hat{f}(\pi_\emptyset, \beta)] \ge .95$ and so there must be some $\beta$ for which $\hat{f}(\pi_\emptyset, \beta) \ge .95$. Letting $\pi$ be a uniformly random permutation, we have

$E_\beta[\hat{f}(\pi_\emptyset, \beta)] = 1 - \sum_{v,i} p_v \sum_{u \in C_{i,v}} \Pr_\pi(u \text{ precedes all } w \in N_v(u) \setminus \{u\} \text{ in } \pi) \cdot \Pr_\beta(u \text{ threatens } B_{i,v})$.

If $u$ is the $s$-th closest node to $v$ then we have that $\Pr_\pi(u \text{ precedes all } w \in N_v(u) \setminus \{u\} \text{ in } \pi) = 1/s$. Moreover, $u$ threatens $B_{i,v}$ only if Equation (3) holds and since $\beta \cdot 2^{i-1}$ is distributed uniformly in $[2^{i-2}, 2^{i-1})$, this happens with probability at most $2^{i+1} \alpha / 2^{i-2} = 8\alpha$. Next, we claim that for a fixed $v$, each $u$ occurs in at most 3 of the $C_{i,v}$. In particular, notice that if $u$ is in $C_{i,v}$ and $C_{i',v}$ then we know that $2^{i-2} - 2^i \alpha \le d(u, v) \le 2^{i'-1} + 2^{i'} \alpha$ which for $\alpha \le \frac{1}{8}$ (which we may assume since $\alpha = \Omega(1/\log n)$) implies $i < i' + 3$. Combining these facts with the fact that $H_n := \sum_{i=1}^{n} \frac{1}{i} \le O(\log n)$ we get

$E_\beta[\hat{f}(\pi_\emptyset, \beta)] \ge 1 - O(\alpha \log n)$,

and since $\alpha = \Omega(1/\log n)$, by fixing the constant in the $\Omega(1/\log n)$ to be sufficiently small we have $E_\beta[\hat{f}(\pi_\emptyset, \beta)] \ge .95$ as desired.

Lastly, we define $R$ and argue that there must be some $\beta \in R$ such that $\hat{f}(\pi_\emptyset, \beta) \ge .95$. In particular, notice that since $E_\beta[\hat{f}(\pi_\emptyset, \beta)] \ge .95$, it suffices to argue that there are polynomially-many efficiently computable intervals which partition $[\frac{1}{2}, 1)$ such that any $\beta_1$ and $\beta_2$ in the same interval satisfy $\hat{f}(\pi_\emptyset, \beta_1) = \hat{f}(\pi_\emptyset, \beta_2)$; letting $R$ take an arbitrary element from each such interval will give the desired result.

Notice that $\hat{f}(\pi_\emptyset, \beta_1) \ne \hat{f}(\pi_\emptyset, \beta_2)$ only if there is some $i$, $v$ and $u$ such that $u$ threatens $B_{i,v}$ with $\beta$ set to $\beta_1$ but does not threaten $B_{i,v}$ with $\beta$ set to $\beta_2$. By definition of what it means to threaten, we have $d(u, v) - 2^i \alpha < \beta_1 \cdot 2^{i-1} < d(u, v) + 2^i \alpha$ but either $d(u, v) - 2^i \alpha \ge \beta_2 \cdot 2^{i-1}$ or $\beta_2 \cdot 2^{i-1} \ge d(u, v) + 2^i \alpha$. We then have either

$\beta_2 \le d(u, v) \cdot 2^{1-i} - 2\alpha < \beta_1$ (5)

or

$\beta_1 < d(u, v) \cdot 2^{1-i} + 2\alpha \le \beta_2$. (6)

With Equations (5) and (6) in mind, we define $R_l := \{d(u, v) \cdot 2^{1-i} + 2\alpha : u, v \in V, i \in [h]\}$ to be all the lower thresholds of when a change in $\beta$ affects $\hat{f}$ and define $R_u := \{d(u, v) \cdot 2^{1-i} - 2\alpha : u, v \in V, i \in [h]\}$ to be all such upper thresholds. Let $t(l)$ be the $l$-th smallest element of $(R_l \cup R_u) \cap [\frac{1}{2}, 1)$ and let $R$ consist of one arbitrary element from the interval between $t(l)$ and $t(l+1)$ for $l \ge 1$, where the interval includes $t(l)$ only if $t(l) \in R_l$ and includes $t(l+1)$ only if $t(l+1) \in R_u$; the left endpoint $\frac{1}{2}$ is always included and the right endpoint $1$ never is. It follows that any $\beta_1$ and $\beta_2$ which are in the same interval satisfy $\hat{f}(\pi_\emptyset, \beta_1) = \hat{f}(\pi_\emptyset, \beta_2)$; moreover, these intervals partition $[\frac{1}{2}, 1)$ by construction.

We know $|R| = \mathrm{poly}(n)$ since $h \le O(\log n)$ by our assumption that $\max_{u,v} d(u, v)$ is $\mathrm{poly}(n)$ and there are at most $n^2$ pairs $u, v$. Clearly $R$ is computable in deterministic poly-time. Thus, by the above discussion $R$ must contain some $\beta$ such that $\hat{f}(\pi_\emptyset, \beta) \ge .95$.

Lemma 22.

There is a deterministic algorithm which, given metric $(V, d)$ and a distribution $\{p_v\}_v$ over nodes, returns a hierarchical decomposition $\mathcal{H}$ in which at least a $.95$ fraction of nodes are $\Omega(1/\log n)$-padded by weight; i.e. $\sum_v p_v \cdot I(v \text{ is } \Omega(1/\log n)\text{-padded in } \mathcal{H}) \ge .95$.

Proof.

Our derandomization algorithm is as follows. First, choose the $\beta \in R$ which maximizes $\hat{f}(\pi_\emptyset, \beta)$. Call this $\beta^*$. Next, initially let $P = \emptyset$ and repeat the following until $P = V$: for each $v \in \bar{P}$ we compute $\hat{f}(\pi_{P \cup \{v\}}, \beta^*)$; we add to $P$ whichever $v$ maximizes $\hat{f}(\pi_{P \cup \{v\}}, \beta^*)$. Lastly, we return $\mathcal{H}(\pi_V, \beta^*)$.

By Lemma 21 we know that $\beta^*$ will satisfy $\hat{f}(\pi_\emptyset, \beta^*) \ge .95$. Moreover, since $\hat{f}$ is monotone by Lemma 21 we know that the $\pi_V$ we choose will satisfy $\hat{f}(\pi_V, \beta^*) \ge .95$. Lastly, since $\hat{f}$ is pessimistic, it follows that $f(\pi_V, \beta^*) \ge \hat{f}(\pi_V, \beta^*) \ge .95$ and so $\mathcal{H}(\pi_V, \beta^*)$ is padded on a $.95$ fraction of nodes by weight as desired.

The deterministic polynomial runtime of our algorithm is immediate from the deterministic poly-time computability of $\hat{f}$ and the fact that $R$ is computable in deterministic poly-time.

Using the above node-weighted derandomization lemma gives our deterministic copy tree embedding construction. In particular, we run the following multiplicative-weights-type algorithm with $\epsilon = .01$ and set the number of iterations as $\tau := 4 \ln n / \epsilon^2$. In the following we let $p_v^{(t)} := w_v^{(t)} / \sum_v w_v^{(t)}$ be the proportional share of $v$'s weight in iteration $t$.

1. Uniformly set the initial weights: $w_v^{(1)} = 1$ for all $v \in V$.
2. For $t \in [\tau]$:
   (a) Run the algorithm given in Lemma 22 using distribution $p^{(t)}$ and let $\mathcal{H}_t$ be the resulting hierarchical decomposition.
   (b) Set mistakes: For each vertex $v$ which is $\Omega(1/\log n)$-padded in $\mathcal{H}_t$ let $m_v^{(t)} = 1$. Let $m_v^{(t)} = 0$ for all other $v$.
   (c) Update weights: for all $v \in V$, let $w_v^{(t+1)} \leftarrow \exp(-\epsilon m_v^{(t)}) \cdot w_v^{(t)}$.
3. Return $(\mathcal{H}_t)_{t=1}^{\tau}$.

We state a well-known fact regarding multiplicative weights in our notation. Readers familiar with multiplicative weights may recognize this as the fact that the expected performance of multiplicative weights over logarithmically-many rounds is competitive with every expert.

Lemma 23 (Arora et al. [8]). The above algorithm guarantees that for any $v \in V$ we have

$\frac{1}{\tau} \sum_{t \le \tau} p^{(t)} \cdot m^{(t)} \le \epsilon + \frac{1}{\tau} \sum_{t \le \tau} m_v^{(t)}$,

where $p^{(t)} \cdot m^{(t)} := \sum_v p_v^{(t)} m_v^{(t)}$ is the usual inner product.

Using this fact we conclude that we are able to produce a good set of hierarchical decompositions.
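The multiplicative-weights loop above can be sketched in a few lines. This is a hedged illustration rather than the paper's implementation: `padded_oracle` stands in for the algorithm of Lemma 22 and is assumed to return a set of nodes whose total weight under the supplied distribution is at least $.95$; the default round count $4 \ln n / \epsilon^2$ is our assumption about the constant; and weights are kept in log-space purely for numerical stability. The oracle `heavy_first_oracle` is a hypothetical stand-in that pads the heaviest nodes first.

```python
import math


def multiplicative_weights(nodes, padded_oracle, eps=0.01, tau=None):
    """Run tau rounds; return the list of padded sets, one per round."""
    if tau is None:
        tau = math.ceil(4 * math.log(len(nodes)) / eps ** 2)  # assumed constant
    log_w = {v: 0.0 for v in nodes}        # log of w_v, initially log(1) = 0
    history = []
    for _ in range(tau):
        top = max(log_w.values())
        unnormalized = {v: math.exp(lw - top) for v, lw in log_w.items()}
        total = sum(unnormalized.values())
        p = {v: x / total for v, x in unnormalized.items()}   # p_v^(t)
        padded = set(padded_oracle(p))     # nodes with mistake m_v^(t) = 1
        history.append(padded)
        for v in padded:
            log_w[v] -= eps                # w_v <- exp(-eps * m_v) * w_v
    return history


def heavy_first_oracle(p):
    """Hypothetical oracle: pad the heaviest nodes until they carry weight >= .95."""
    chosen, mass = set(), 0.0
    for v in sorted(p, key=lambda x: (-p[x], x)):
        if mass >= 0.95:
            break
        chosen.add(v)
        mass += p[v]
    return chosen


def padded_fraction(history, v):
    """Fraction of rounds in which v was padded."""
    return sum(v in padded for padded in history) / len(history)
```

Down-weighting a node each time it is padded means an oracle that keeps skipping the same node soon finds that node too heavy to skip, which is the mechanism behind Lemma 24.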

Lemma 24.

The above algorithm returns a collection of hierarchical decompositions $\{\mathcal{H}_t\}_{t=1}^{\tau}$ where $\tau = \Theta(\log n)$ and every vertex is $\Omega(1/\log n)$-padded in at least $.9\tau$ of the decompositions.

Proof. Since $\epsilon$ is a constant, our choice of $\tau$ gives $\tau = \Theta(\log n)$.

We need only argue, then, that each node is padded in at least $.9\tau$ of the $\mathcal{H}_t$ in total. Let

$f_v := \frac{1}{\tau} \sum_{t \le \tau} I(v \text{ is } \Omega(1/\log n)\text{-padded in } \mathcal{H}_t)$

be the fraction of the decompositions in which $v$ is padded. Consider a fixed node $v$. By Lemma 23 we know that

$\frac{1}{\tau} \sum_{t \le \tau} p^{(t)} \cdot m^{(t)} \le \epsilon + \frac{1}{\tau} \sum_{t \le \tau} m_v^{(t)}$. (7)

By definition of $m_v^{(t)}$ we have that the right hand side of Equation (7) is $\epsilon + f_v$. On the other hand, by how we set $m^{(t)}$, the left hand side of Equation (7) is $\frac{1}{\tau} \sum_t \sum_v p_v^{(t)} \cdot I(v \text{ is } \Omega(1/\log n)\text{-padded in } \mathcal{H}_t)$ which by Lemma 22 is at least $.95$. Combining these facts we have $.95 \le \epsilon + f_v$ and so by our choice of $\epsilon$ we know $.9 \le f_v$ as desired.

Combining Lemma 24 with Lemma 13 gives Theorem 3.

In this section we observe that the support of the FRT distribution can be merged to produce copy tree embeddings with cost stretch $O(\log n)$ and copy number $O(n \log n)$. In particular, we rely on the known fact that one can make the size of the support of the FRT distribution $O(n \log n)$ and compute said support in deterministic poly-time, as summarized in the following theorem.

Theorem 25 ([19, 29, 49]). Given a weighted graph $G = (V, E, w)$ and root $r \in V$, there exists a distribution $\mathcal{D}$ supported over $O(n \log n)$ well-separated weighted trees on $V$ rooted at $r$ where for any $u, v \in V$ we have $E_{T \sim \mathcal{D}}[d_T(u, v)] \le O(\log n \cdot d_G(u, v))$ and for every $T$ in the support of $\mathcal{D}$ we have $d_G(u, v) \le d_T(u, v)$. Also, (the support and probabilities of) $\mathcal{D}$ can be computed in deterministic poly-time.

Merging the trees of this distribution and some simple probabilistic method arguments give a copy tree embedding with the desired properties.
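The merging step used in this section (and in the proof of Lemma 19) is simple enough to sketch. The code below is illustrative and assumes each tree is given as a list of weighted edges `(u, v, weight)` over a common vertex set containing the root; non-root vertices of the $i$-th tree are relabelled `(v, i)` so that copies stay distinct, while all copies of the root are identified. The name `merge_at_root` and the data layout are ours.

```python
def merge_at_root(trees, root):
    """Merge trees by identifying their roots; return (edges, copies).

    `copies[v]` is the set of vertices of the merged tree that are copies of v,
    i.e. a dict representation of the copy mapping phi.
    """
    def label(v, i):
        # The root keeps its name in every tree; other vertices get a per-tree label.
        return root if v == root else (v, i)

    merged_edges = []
    copies = {}
    for i, edges in enumerate(trees):
        for u, v, weight in edges:
            merged_edges.append((label(u, i), label(v, i), weight))
            for x in (u, v):
                copies.setdefault(x, set()).add(label(x, i))
    return merged_edges, copies


# Hypothetical example: two trees on the vertex set {r, a, b}.
example_trees = [
    [("r", "a", 1), ("a", "b", 2)],
    [("r", "b", 1), ("b", "a", 2)],
]
merged, phi = merge_at_root(example_trees, "r")
merged_vertices = {x for u, v, _ in merged for x in (u, v)}
```

With $k$ trees on $n$ vertices the merged tree has $k \cdot n - (k - 1)$ vertices, since only the root is shared.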

Theorem 4.

There is a poly-time deterministic algorithm which, given any weighted graph $G = (V, E, w)$ and root $r \in V$, computes an efficient and well-separated $O(\log n)$-approximate copy tree embedding with copy number $O(n \log n)$.

Proof. Let $T_1, \ldots, T_k$ with $k = O(n \log n)$ be the trees in the support of the distribution $\mathcal{D}$ as guaranteed by Theorem 25. Then, we let $T$ be the result of identifying each copy of $r$ as the same vertex in each $T_i$ (but not identifying copies of other vertices in $V$ as the same vertex); that is, $|V(T)| = k \cdot n - (k - 1)$. $T$'s weight function is inherited from each $T_i$ in the natural way. Similarly, we let $\phi(v)$ be the set containing each copy of $v$ in each of the $T_i$. It is easy to verify that $\phi$ is indeed a copy mapping. Also, note that $\phi(v)$ is computable in deterministic poly-time, our copy number is $O(n \log n)$ by construction and $T$ is well-separated since each $T_i$ is well-separated.

We next specify $\pi_{G \to T}(F)$ for a fixed $F$. For tree $T_i$, let $T'_i \subseteq T_i$ be the subgraph of $T_i$ which contains the unique tree path between $u$ and $v$ iff $\{u, v\} \in F$. By Theorem 25 we know that $E_{T_i \sim \mathcal{D}}[w_{T_i}(T'_i)] \le O(\log n \cdot w_G(F))$ and so there must be some $j$ such that $w_{T_j}(T'_j) \le O(\log n \cdot w_G(F))$. Thus, we let $\pi_{G \to T}(F) := T'_j$.

We argue that $\pi_{G \to T}$ satisfies the stated connectivity properties. In particular, notice that by construction we have that if $u$ and $v$ are connected in $F$ then they will have some copy connected in $\pi_{G \to T}(F)$: if $u$ and $v$ are connected in $F$ by path $(v_1, v_2, \ldots)$ then the path in $T_j$ which connects the copy of $v_l$ and the copy of $v_{l+1}$ is contained in $\pi_{G \to T}(F)$ and the concatenation of these paths for all $l$ connects the copies of $u$ and $v$ contained in $T_j$. Moreover, notice that $\pi_{G \to T}(F)$ satisfies the required cost preservation properties since $w_T(\pi_{G \to T}(F)) = w_{T_j}(T'_j) \le O(\log n \cdot w_G(F))$ by construction.

Lastly, we specify $\pi_{T \to G}(F')$. We let $\pi_{T \to G}(F')$ be the graph induced by $\{P_{uv} : \{u', v'\} \in F'\}$ where $P_{uv}$ is an arbitrary shortest path in $G$ between $u$ and $v$, and $u'$ and $v'$ are copies of $u$ and $v$. We first verify the required connectivity preservation properties: if $u'$ and $v'$ are connected in $F'$ by path $(v'_1, v'_2, \ldots)$ then we know that $v_l$ and $v_{l+1}$ will be connected in $\pi_{T \to G}(F')$ for every $l$ by $P_{v_l v_{l+1}}$ where $v'_i$ is some copy of $v_i$. Thus, $u$ and $v$ will be connected in $\pi_{T \to G}(F')$. We next verify the required cost-preservation properties. By Theorem 25 we have for every $i$ that $w_{T_i}(e') \ge w_G(P_{uv})$ for each $e' = \{u', v'\} \in T_i$. Thus, $w_T(F') = \sum_{e' \in F'} w_T(e') \ge \sum_{\{u', v'\} \in F'} w_G(P_{uv}) \ge w_G(\pi_{T \to G}(F'))$ where we have again used $u$ and $v$ to stand for $\phi^{-1}(u')$ and $\phi^{-1}(v')$ respectively. Lastly, we note that $\pi_{T \to G}(F')$ is trivially computable in deterministic poly-time.

In this section we prove that the guarantees of our copy tree embeddings are sufficient to generalize any deterministic algorithm for online group Steiner tree on trees to general graphs, thereby reducing an open question posed by Alon et al. [7] to its tree case. We show that a similar result holds for the online group Steiner forest problem which generalizes online group Steiner tree.

In general, mapping an instance $I$ of a problem $P$ onto an equivalent instance $I'$ on the copy tree embedding often results in $I'$ not being an instance of the same problem $P$. However, group Steiner tree (resp., forest) problems have the notable property that mapping them onto a copy tree embedding simply results in another instance of the group Steiner tree (resp., forest) problem, this time on a tree. This property, albeit somewhat hidden in the proof, is the main reason why copy tree embeddings are well suited for these two problems.

Because past work on group Steiner tree and group Steiner forest has stated runtimes and approximation guarantees as functions of the maximum group size and number of groups rather than just $n$ (see, e.g., [14, 34]), we will give our results in the same generality with respect to these parameters.

We begin with our results for online group Steiner tree.

Offline Group Steiner Tree:

In the group Steiner tree problem we are given a weighted graph $G = (V, E, w)$ as well as pairwise disjoint groups $g_1, g_2, \ldots, g_k \subseteq V$ and root $r \in V$. We let $N := \max_i |g_i|$ be the maximum group size. Our goal is to find a (connected) tree $T$ rooted at $r$ which is a subgraph of $G$ and satisfies $T \cap g_i \ne \emptyset$ for every $i$. We wish to minimize our cost, $w(T) := \sum_{e \in E(T)} w(e)$.

Online Group Steiner Tree:

Online group Steiner tree is the same as offline group Steiner tree but where our solution need not be a tree and groups are revealed in time steps $t = 1, 2, \ldots$. That is, in time step $t$ an adversary reveals a new group $g_t$ and the algorithm must maintain a solution $T_t$ where: (1) $T_{t-1} \subseteq T_t$; (2) $T_t$ is feasible for the group Steiner tree problem on groups $g_1, \ldots, g_t$; and (3) $T_t$ is competitive with the optimal offline solution for this problem, where the competitive ratio of our algorithm is $\max_t w(T_t) / \mathrm{OPT}_t$ and $\mathrm{OPT}_t$ is the cost of the optimal offline group Steiner tree solution on the first $t$ groups. Here, we let $k$ be the number of possible groups revealed by the adversary.

Theorem 26.

If there exists:
1. A poly-time deterministic algorithm to compute an efficient, well-separated $\alpha$-approximate copy tree embedding with copy number $\chi$; and
2. A poly-time $f(n, N, k)$-competitive deterministic algorithm for online group Steiner tree on well-separated trees
then there exists an $(\alpha \cdot f(\chi n, \chi N, k))$-competitive deterministic algorithm for online group Steiner tree (on general graphs).

Proof. We will use our copy tree embedding to produce a single tree on which we must solve deterministic online group Steiner tree.

In particular, consider an instance of online group Steiner tree on weighted graph $G = (V, E, w)$ with root $r$. We first compute a copy tree embedding $(T, \phi, \pi_{G \to T}, \pi_{T \to G})$ deterministically with respect to $G$ and $r$, which is possible by assumption. Next, given an instance $I_t$ of group Steiner tree on $G$ with groups $g_1, \ldots, g_t$, we let $I'_t$ be the instance of group Steiner tree on $T$ with groups $\phi(g_1), \ldots, \phi(g_t)$ and root $r' := \phi(r)$, where we have used the notation $\phi(g_i) := \bigcup_{v \in g_i} \phi(v)$. If the adversary has required that we solve instance $I_t$ in time step $t$, then we require that our deterministic algorithm for online group Steiner tree on trees solves $I'_t$ in time step $t$, and we let $H'_t$ be the solution it returns for $I'_t$. Lastly, we return as our solution for $I_t$ in time step $t$ the set $H_t := \pi_{T \to G}(H'_t)$.

Let us verify that the resulting algorithm is indeed feasible and of the appropriate cost.

First, we have that $H_t \subseteq H_{t+1}$ for every $t$, since $H'_t \subseteq H'_{t+1}$ (our algorithm for trees returns a feasible solution for its online problem) and $\pi_{T \to G}$ is monotone by definition of a copy tree embedding. Moreover, we claim that $H_t$ connects at least one vertex from each $g_i$ to $r$ for $i \le t$ and every $t$. To see this, notice that since $H'_t$ is a feasible solution for $I'_t$, for each $i \le t$ it connects at least one vertex from $\phi(g_i)$, that is, at least one copy of a vertex in $g_i$, to $r' = \phi(r)$; by the connectivity preservation properties of a copy tree embedding it follows that at least one vertex from $g_i$ is connected to $r$ in $H_t$. Thus, our solution is indeed feasible in each time step.

Next, we verify the cost of our solution. Let $\mathrm{OPT}'_t$ be the cost of the optimal solution to $I'_t$ and let $n'$ and $N'$ be the number of vertices and the maximum size of a group in $I'_t$ for any $t$. By our assumption on the cost of the algorithm we run on $T$, and since $n' \le \chi n$ and $N' \le \chi N$ by definition of copy number, we know (taking $f$ to be non-decreasing in its arguments) that
$w_T(H'_t) \le \mathrm{OPT}'_t \cdot f(n', N', k) \le \mathrm{OPT}'_t \cdot f(\chi n, \chi N, k)$.

Next, let $H^*_t$ be the optimal solution to $I_t$. We claim that $\pi_{G \to T}(H^*_t)$ is feasible for $I'_t$. This follows because $H^*_t$ connects a vertex from each of $g_1, \ldots, g_t$ to $r$, and so by the connectivity preservation property of copy tree embeddings some vertex from each of $\phi(g_1), \ldots, \phi(g_t)$ is connected to $r' = \phi(r)$. Applying this feasibility of $\pi_{G \to T}(H^*_t)$ and the cost preservation property of our copy tree embedding, it follows that
$\mathrm{OPT}'_t \le w_T(\pi_{G \to T}(H^*_t)) \le \alpha \cdot w_G(H^*_t) = \alpha \cdot \mathrm{OPT}_t$.

Similarly, we know by the cost preservation property of our copy tree embedding that $w_G(\pi_{T \to G}(H'_t)) \le w_T(H'_t)$. Combining these observations we have
$w_G(\pi_{T \to G}(H'_t)) \le w_T(H'_t) \le \mathrm{OPT}'_t \cdot f(\chi n, \chi N, k) \le \mathrm{OPT}_t \cdot \alpha \cdot f(\chi n, \chi N, k)$,
thereby showing that our solution is within the required cost bound.

Footnote (to the definition of group Steiner tree): The assumption that the tree is rooted in group Steiner tree is without loss of generality as we may always brute-force search over a root. Similarly, the assumption that all groups are pairwise disjoint is without loss of generality since if $v$ is in groups $\{g_1, g_2, \ldots\}$ then we can remove $v$ from all groups and add vertices $v_1, v_2, \ldots$ to $G$ which are connected only to $v$ so that $v_i \in g_i$ and $w((v, v_i)) = 0$.

Plugging in our first construction (Theorem 3) or our second construction (Theorem 4) of a copy tree embedding immediately gives the following corollary.

Corollary 27.

If there is an f ( n, N, k ) -competitive deterministic algorithm for online group Steiner tree onwell-separated trees then there are O ( log n ⋅ f ( O ( n log n ) , O ( nN ) , k )) and O ( log n ⋅ f ( O ( n log n ) , O ( N log n ) , k )) -competitive deterministic algorithms for online group Steiner tree (on general graphs). In this section we show a black-box reduction from the poly-log-approximate online deterministic groupSteiner forest in a general graph G to poly-log-approximate online deterministic group Steiner forest whenthe underlying graph is a tree. A formal deﬁnition of the problem follows. Oﬄine Group Steiner Forest:

In the group Steiner forest problem we are given a weighted graph $G = (V, E, w)$ as well as pairs of subsets of nodes $(A_1, B_1), (A_2, B_2), \ldots, (A_k, B_k)$ where $A_i, B_i \subseteq V$. Our goal is to find a forest $F$ which is a subgraph of $G$ and in which for each $i$ there is an $a_i \in A_i$ and $b_i \in B_i$ such that $a_i$ and $b_i$ are connected in $F$. We wish to minimize our cost, $w(F) := \sum_{e \in E(F)} w(e)$. We let $N := \max_i \max(|A_i|, |B_i|)$ be the maximum subset size.

Online Group Steiner Forest:

Online group Steiner forest is the same as group Steiner forest but each pair $(A_t, B_t)$ is revealed at time step $t = 1, 2, \ldots$ by an adversary and in each time step $t$ we must maintain a forest $F_t$ which is feasible for pairs $(A_1, B_1), \ldots, (A_t, B_t)$ so that $F_{t-1} \subseteq F_t$. The competitive ratio of an online algorithm with solution $\{F_t\}_t$ is $\max_t w(F_t) / \mathrm{OPT}_t$ where $\mathrm{OPT}_t$ is the cost of the optimal offline solution for the group Steiner forest problem we must solve in time step $t$. For the online problem let $k$ be the number of possible pairs revealed by the adversary.

Note that group Steiner forest directly generalizes group Steiner tree since a tree instance on a weighted graph $G$ with root $r \in V(G)$ can be reduced to an equivalent forest instance on the same graph $G$ by mapping each group $g$ to the pair $(\{r\}, g)$. This reduction is valid in both the offline and online settings (and also in the demand-robust setting defined later).

We now show that a deterministic algorithm for online group Steiner forest on trees gives a deterministic algorithm for online group Steiner forest on general graphs up to small losses. These results and the corresponding proofs are quite similar to those of the previous section, so we defer a full proof to the appendix.

Theorem 28. If there exists:
1. A poly-time deterministic algorithm to compute an efficient, well-separated $\alpha$-approximate copy tree embedding with copy number $\chi$; and
2. A poly-time $f(n, N, k)$-competitive deterministic algorithm for online group Steiner forest on well-separated trees
then there exists an $(\alpha \cdot f(\chi n, \chi N, k))$-competitive deterministic algorithm for online group Steiner forest (on general graphs).

Proof Sketch. The properties of a copy tree embedding show that an instance of group Steiner forest on a general graph maps exactly to an instance of group Steiner forest on our copy tree. In particular, if we must connect $(A_i, B_i)$ in the general graph then we can just connect $(\bigcup_{v \in A_i} \phi(v), \bigcup_{v \in B_i} \phi(v))$ on our copy tree and map back the solution with $\pi_{T \to G}$. The full proof is available in Appendix A.

Plugging in our first construction (Theorem 3) or our second construction (Theorem 4) of a copy tree embedding immediately gives the following corollary.

Corollary 29.

If there is an f ( n, N, k ) -competitive deterministic algorithm for online group Steiner forest onwell-separated trees then there are O ( log n ⋅ f ( O ( n log n ) , O ( nN ) , k )) and O ( log n ⋅ f ( O ( n log n ) , O ( N log n ) , k )) -competitive deterministic algorithms for online group Steiner forest (on general graphs). Lastly, we note that Theorem 5 follows immediately from Corollary 27 and Corollary 29.
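The black-box reduction behind Theorem 26 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `CopyTreeReduction` and the callables `tree_step` (the online algorithm on the copy tree, returning its current solution $H'_t$) and `map_back` (playing the role of $\pi_{T \to G}$) are hypothetical names introduced here, and the copy mapping $\phi$ is assumed to be given as a dictionary.

```python
class CopyTreeReduction:
    """Runs an online tree algorithm on a general graph via a copy tree embedding.

    phi:       dict mapping each vertex of G to the set of its copies in T (assumed given)
    tree_step: callable taking the lifted group phi(g_t) and returning the tree
               algorithm's current solution H'_t (hypothetical interface)
    map_back:  callable playing the role of pi_{T->G} (hypothetical interface)
    """

    def __init__(self, phi, tree_step, map_back):
        self.phi = phi
        self.tree_step = tree_step
        self.map_back = map_back
        self.solution = set()

    def reveal(self, group):
        # Lift the group: phi(g_t) is the union of the copy sets of its vertices.
        lifted = set()
        for v in group:
            lifted |= self.phi[v]
        # Solve the step on the tree, then map the whole tree solution back to G.
        tree_solution = self.tree_step(frozenset(lifted))
        self.solution = self.map_back(tree_solution)
        return self.solution
```

Monotonicity of the returned sets is not enforced by the wrapper; as in the proof, it relies on the tree algorithm being monotone and on `map_back` being monotone.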

In this section we give a deterministic bicriteria algorithm for the online partial group Steiner tree problemwhich is the same as online group Steiner tree but where we must connect at least of all vertices from eachgroup to the root. The algorithm is bicriteria in the sense that it relaxes both the 1 / G = ( V, E, w ) with groups { g i } i and root r to an instanceof partial group Steiner tree as follows. For each group g i we add ∣ g i ∣ − r . Our partial group Steiner tree problem will be on the resulting graph with root r andgroups { g ′ i } i where g ′ i consists of g i along with its corresponding ∣ g i ∣ − g i to r . Conversely,by connecting all of the dummy nodes we added to our graph to r by their cost 0 edges, it is easy to see thata solution for group Steiner tree on the input graph exactly corresponds to a solution for our partial groupSteiner tree instance. Moreover, it is also easy to see that any deterministic bicriteria algorithm for online partial group Steiner treealso gives a poly-log-competitive deterministic (unicriteria) algorithm for online (non-group) Steiner tree.In particular, given an instance of Steiner tree on weighted graph G = ( V, E, w ) with root r where we mustconnect terminals A ⊆ V to r , it suﬃces to solve the partial group Steiner tree problem where each vertexin A is in a singleton group with any constant bicriteria relaxation. This is because connecting any c > r will connect at least one vertex to r by the integrality of the number of connectedvertices. Thus, our result generalizes the fact that deterministic poly-log approximations are known for online(non-group) Steiner tree [43]. However, we do note that our (deterministic) poly-log-approximate bicriteria As a minor techincal caveat: we have assumed that edge weights are at least 1 throughout this paper; it is easy to see thatby scaling weights up by a polynomial factor and then using weight 1 edges instead of weight 0 edges this reduction still works. 
O ( max i ∣ g i ∣ f i ⋅ (cid:15) ) bicriteriaapproximation for what we call the f -partial group Steiner tree problem which requires connecting at least f i vertices from group g i to the root; our bicriteria algorithm will connect at least f i ⋅ ( − (cid:15) ) vertices fromeach group for any speciﬁed input (cid:15) >

0. It will be convenient for us to consider this problem as opposed to partial group Steiner tree since group Steiner tree is just $f$-partial group Steiner tree with $f_i = 1$ for every $i$. Thus, as an immediate corollary of our algorithm we will be able to give a deterministic algorithm for online group Steiner tree with a competitive ratio that is linear in the maximum group size.

Offline $f$-Partial Group Steiner: In the $f$-partial group Steiner tree problem we are given a weighted graph $G = (V, E, w)$ as well as pairwise disjoint groups $g_1, g_2, \ldots, g_k \subseteq V$, desired numbers of connected vertices $1 \le f_i \le |g_i|$ for each group $g_i$, and root $r \in V$. Our goal is to find a tree $T$ rooted at $r$ which is a subgraph of $G$ and satisfies $|T \cap g_i| \ge f_i$ for every $i$. We wish to minimize our cost, $w(T) := \sum_{e \in E(T)} w(e)$.

Online $f$-Partial Group Steiner: Online $f$-partial group Steiner tree is the same as offline $f$-partial group Steiner tree but where our solution need not be a tree and groups are revealed in time steps $t = 1, 2, \ldots$. That is, in time step $t$ an adversary reveals a new group $g_t$ and the algorithm must maintain a solution $T_t$ where: (1) $T_{t-1} \subseteq T_t$; (2) $T_t$ is feasible for the (offline) $f$-partial group Steiner tree problem on groups $g_1, \ldots, g_t$; and (3) $T_t$ is cost-competitive with the optimal offline solution for this problem, where the cost-competitive ratio of our algorithm is $\max_t w(T_t) / \mathrm{OPT}_t$ and $\mathrm{OPT}_t$ is the cost of the optimal offline $f$-partial group Steiner tree solution on the first $t$ groups. We will give a bicriteria approximation for online $f$-partial group Steiner tree; thus we say that an online solution is $\rho$-connection-competitive if for each $t$ we have $|T_t \cap g_i| \ge f_i \cdot \rho$ for every $i \le t$.

We note that the partial group Steiner tree problem as mentioned above is simply the special case of $f$-partial group Steiner tree where $f_i = |g_i| / 2$ for every $i$.
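As a small illustration of the bicriteria requirement, the following Python sketch checks whether a set of vertices connected to the root is $\rho$-connection-competitive. The function name `is_connection_competitive` and the plain list and set data layout are assumptions made for this example only.

```python
def is_connection_competitive(connected, groups, f, rho):
    """Check |T_t ∩ g_i| >= rho * f_i for every revealed group g_i.

    connected: set of vertices connected to the root by the current solution
    groups:    list of vertex sets g_1, ..., g_t
    f:         list of requirements f_1, ..., f_t
    rho:       connection-competitiveness factor, e.g. 1 - eps
    """
    return all(len(connected & g) >= rho * f_i for g, f_i in zip(groups, f))
```

With $\rho = 1$ this is ordinary feasibility for $f$-partial group Steiner tree; the algorithms in this section target $\rho = 1 - \epsilon$.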
$f$-Partial Group Steiner Tree on a Tree

We begin by giving a bicriteria deterministic online algorithm for $f$-partial group Steiner tree on trees based on a "water-filling" approach. Informally, in iteration $t$ each unconnected vertex in each group will grow the solution towards the root at an equal rate until at least $f_t \cdot (1 - \epsilon)$ vertices in $g_t$ are connected to $r$.

More formally, we will solve a problem which is a slight generalization of $f$-partial group Steiner tree on trees. We solve this problem on a tree rather than just $f$-partial group Steiner tree on a tree because, unlike group Steiner tree, the "groupified" version of $f$-partial group Steiner tree is not necessarily another instance of $f$-partial group Steiner tree. Roughly, instead of groups we now have groups of groups, hence we call this problem 2-level $f$-partial group Steiner tree.

Offline 2-Level $f$-Partial Group Steiner Tree: In 2-level $f$-partial group Steiner tree we are given a weighted graph $G = (V, E, w)$, root $r \in V$ and groups of groups $\mathcal{G}_1, \ldots, \mathcal{G}_k$ where $\mathcal{G}_i$ consists of groups $\{g^{(i)}_1, \ldots, g^{(i)}_{k_i}\}$ where each $g^{(i)}_j \subseteq V$. We are also given connectivity requirements $f_1, \ldots, f_k$. Our goal is to compute a minimum-weight tree $T$ containing $r$ where for each $i \le k$ we have $|\{g^{(i)}_j : g^{(i)}_j \cap T \ne \emptyset\}| \ge f_i$. We let $n_i := |\{v : \exists j \text{ s.t. } v \in g^{(i)}_j\}|$. Notice that $f$-partial group Steiner tree is just 2-level $f$-partial group Steiner tree where each $g^{(i)}_j$ is a singleton set.

Footnote: As with group Steiner tree, the assumption that the tree is rooted and that the groups are pairwise disjoint is without loss of generality.

[Figure 5, panels: (a) Graph $T$. (b) $\mathcal{G}_1$ arrives. (c) "Fill water." (d) Choose solution.]

Figure 5: Solution our algorithm gives after one group of groups, $\mathcal{G}_1$, is revealed where $f_1 = 2$. Nodes in groups in $\mathcal{G}_1$ are outlined in green and nodes are colored according to the group of $\mathcal{G}_1$ which contains them. Saturated edges are given in blue and edges with $0 < x_e < w_e$ are annotated with "$x_e / w_e$". All other edges are labeled by $w_e$.

Online 2-Level $f$-Partial Group Steiner Tree: Online 2-level $f$-partial group Steiner tree is the same as the offline problem but where $\mathcal{G}_t$ is revealed in time step $t$ by an adversary. In particular, for each time step $t$ we must maintain a solution $T_t$ where: (1) $T_{t-1} \subseteq T_t$ for all $t$; (2) $T_t$ is feasible for the (offline) 2-level $f$-partial group Steiner tree problem on $\mathcal{G}_1, \ldots, \mathcal{G}_t$ with connectivity requirements $f_1, \ldots, f_t$; and (3) $T_t$ is cost-competitive with the optimal offline solution for this problem, where the cost-competitive ratio of our algorithm is $\max_t w(T_t) / \mathrm{OPT}_t$ and $\mathrm{OPT}_t$ is the cost of the optimal offline 2-level $f$-partial group Steiner tree solution on the first $t$ groups of groups.

We will give a bicriteria approximation for online 2-level $f$-partial group Steiner tree on trees; thus we say that an online solution is $\rho$-connection-competitive if for each $t$ we have $|\{g^{(i)}_j : g^{(i)}_j \cap T_t \ne \emptyset\}| \ge \rho \cdot f_i$ for every $i \le t$.

We now formally describe our algorithm for 2-level $f$-partial group Steiner tree on weighted tree $T = (V, E, w)$ given an $\epsilon > 0$. We will maintain a fractional variable $0 \le x_e \le w_e$ for each edge indicating the extent to which we buy $e$, where our $x_e$s will be monotonically increasing as our algorithm runs. Say that an edge $e$ is saturated if $x_e = w_e$.

Let us describe how we update our solution in the $t$-th time step. Let $T_t$ be the connected component of all saturated edges containing $r$. Then, we repeat the following until $|\{g^{(t)}_j : g^{(t)}_j \cap T_t \ne \emptyset\}| \ge f_t \cdot (1 - \epsilon)$. Let $\mathcal{G}'_t := \{g^{(t)}_j \in \mathcal{G}_t : g^{(t)}_j \cap T_t = \emptyset\}$ be all groups in $\mathcal{G}_t$ not yet connected and let $g'_t := \bigcup_{S \in \mathcal{G}'_t} S$ be all vertices in a group which has not yet been connected to $r$. We say that $e$ is on the frontier for $v \in g'_t$ if it is the first edge on the path from $v$ to $r$ which is not saturated. Similarly, let $r_e$ be the number of vertices $v \in g'_t$ for which $e$ is on the frontier. Then, for each edge $e$ we increase $x_e$ by $r_e \cdot \delta$ where $\delta = \min_e (w_e - x_e) / r_e$, with the minimum taken over edges with $r_e > 0$. Our solution in the $t$-th time step is $T_t$ once $|\{g^{(t)}_j : g^{(t)}_j \cap T_t \ne \emptyset\}| \ge (1 - \epsilon) \cdot f_t$.

We illustrate one iteration of this algorithm in Figure 5. We proceed to analyze the above algorithm and give its properties.
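The water-filling update described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the class name `WaterFilling`, the parent-pointer representation of the tree, and the use of exact `Fraction` arithmetic are all choices made for this example. Each non-root vertex `v` stands for the edge from `v` to `parent[v]`.

```python
from fractions import Fraction


class WaterFilling:
    """Sketch of the water-filling step on a rooted tree (illustrative names)."""

    def __init__(self, parent, weight, root, eps):
        self.parent = parent                      # child -> parent
        self.weight = {v: Fraction(w) for v, w in weight.items()}
        self.root = root
        self.eps = Fraction(eps)
        self.x = {v: Fraction(0) for v in weight}  # x_e, with 0 <= x_e <= w_e

    def frontier(self, v):
        """First unsaturated edge on the path from v to the root, or None."""
        while v != self.root:
            if self.x[v] < self.weight[v]:
                return v
            v = self.parent[v]
        return None

    def reveal(self, groups, f):
        """Process one group of groups with connectivity requirement f."""
        target = (1 - self.eps) * f
        while True:
            # A group is unconnected if none of its vertices reaches the root.
            unconnected = [g for g in groups
                           if all(self.frontier(v) is not None for v in g)]
            if len(groups) - len(unconnected) >= target or not unconnected:
                break
            # r_e: number of unconnected vertices whose frontier edge is e.
            load = {}
            for v in set().union(*unconnected):
                e = self.frontier(v)
                load[e] = load.get(e, 0) + 1
            delta = min((self.weight[e] - self.x[e]) / r for e, r in load.items())
            for e, r in load.items():
                self.x[e] += r * delta
        return self.solution()

    def solution(self):
        """Saturated edges in the component containing the root."""
        return {(v, self.parent[v]) for v in self.weight
                if self.frontier(v) is None}
```

Each pass of the loop saturates at least one edge (the minimizer of $(w_e - x_e)/r_e$), so the loop runs at most once per tree edge for each revealed group of groups.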

Theorem 30.

There is a deterministic poly-time algorithm for online 2-level $f$-partial group Steiner tree on trees which is $\frac{1}{\epsilon} \cdot \left(\max_i \frac{n_i}{f_i}\right)$-cost-competitive and $(1 - \epsilon)$-connection-competitive.

Proof. We begin by verifying that our algorithm returns a monotonically increasing and $(1 - \epsilon)$-connection-competitive solution. First, notice that our solution is monotonically increasing since our $x_e$s are monotonically increasing and our solution only includes saturated edges. To see that our solution is $(1 - \epsilon)$-connection-competitive, notice that at least one new edge becomes saturated from each update to the $x_e$s (namely $\arg\min_e (w_e - x_e) / r_e$), and if all edges are saturated then $T_t = T$, which clearly satisfies $|\{g^{(t)}_j : g^{(t)}_j \cap T_t \ne \emptyset\}| \ge (1 - \epsilon) \cdot f_t$. Hence this process will eventually halt with a $(1 - \epsilon)$-connection-competitive solution in the $t$-th iteration. For the same reason our algorithm is deterministic poly-time.

It remains to argue that our solution is $\frac{1}{\epsilon} \cdot \left(\max_i \frac{n_i}{f_i}\right)$-cost-competitive. We will argue that we can uniquely charge each unit of increase of our $x_e$s to an appropriate cost portion of the optimal solution. Fix an iteration $t$. Let $\delta^{(i,j)}$ for $i \le t$ be the value of $\delta$ in the $i$-th iteration the $j$-th time we increase the value of our $x_e$s. Similarly, let $\delta^{(i,j)}_x$ be the increase in $\sum_e x_e$ when we do so and let $\delta^{(i,j)}_y$ be the increase in $\sum_{e \in T^*_t} x_e$, where $T^*_t$ is the optimal offline solution to the 2-level $f$-partial group Steiner tree problem we must solve in the $t$-th iteration. Lastly, let $y := \sum_{i \le t} \sum_j \delta^{(i,j)}_y$ be the value of $\sum_{e \in T^*_t} x_e$ at the end of the $t$-th iteration; since $x_e \le w_e$ for every edge, clearly we have $y \le \mathrm{OPT}_t$. We claim that it suffices to show for each $i \le t$ and each $j$ that $\delta^{(i,j)}_x \le \frac{1}{\epsilon} \cdot \frac{n_i}{f_i} \cdot \delta^{(i,j)}_y$, since it would follow that at the end of iteration $t$ we have

$w(T_t) \le \sum_e x_e = \sum_{i \le t} \sum_j \delta^{(i,j)}_x \le \frac{1}{\epsilon} \sum_{i \le t} \sum_j \frac{n_i}{f_i} \delta^{(i,j)}_y \le \frac{1}{\epsilon} \left(\max_i \frac{n_i}{f_i}\right) y \le \frac{1}{\epsilon} \left(\max_i \frac{n_i}{f_i}\right) \mathrm{OPT}_t$.

We proceed to show that $\delta^{(i,j)}_x \le \frac{1}{\epsilon} \cdot \frac{n_i}{f_i} \cdot \delta^{(i,j)}_y$ for each $i \le t$ and $j$. We fix an $i$ and $j$ and for cleanliness of notation we drop the dependence on $i$ and $j$ in our $\delta$s henceforth.

First, notice that we have

$\delta_x \le n_i \cdot \delta$ (8)

since each vertex $v \in g'_i$ is uniquely responsible for up to a $\delta$ increase on $x_e$, where $e$ is the edge on $v$'s frontier.

On the other hand, notice that if a group in $\mathcal{G}_i$ is connected to $r$ by $T^*_t$ but is not yet connected by $T_i$ then such a group uniquely contributes at least $\delta$ to $\delta_y$. Since $T^*_t$ connects at least $f_i$ groups in $\mathcal{G}_i$ to $r$ but at the moment of our increase $T_i$ connects fewer than $(1 - \epsilon) \cdot f_i$, there are at least $\epsilon \cdot f_i$ such groups in $\mathcal{G}_i$ which are connected to $r$ by $T^*_t$ but not by $T_i$. Thus, we have

$\delta_y \ge \epsilon \cdot f_i \cdot \delta$ (9)

Combining Equations 8 and 9 shows $\delta_x \le \frac{1}{\epsilon} \cdot \frac{n_i}{f_i} \cdot \delta_y$ as required.

$f$-Partial Group Steiner Tree on General Graphs

Next, we apply our first construction to give an algorithm for $f$-partial group Steiner tree on general graphs. Crucially, the following result relies on a single copy tree embedding with poly-logarithmic copy number, making our second construction unsuitable for this problem.

Theorem 31.

There is a deterministic poly-time algorithm for online $f$-partial group Steiner tree (on general graphs) which is $O\left(\frac{\log^3 n}{\epsilon} \cdot \max_i \frac{|g_i|}{f_i}\right)$-cost-competitive and $(1 - \epsilon)$-connection-competitive.

Proof. We will use our copy tree embedding to produce a single tree on which we must deterministically solve online 2-level $f$-partial group Steiner tree. We will then apply the algorithm from Theorem 30 to solve online 2-level $f$-partial group Steiner tree on this tree.

More formally, consider an instance of online $f$-partial group Steiner tree on weighted graph $G = (V, E, w)$ with root $r$. We first compute a copy tree embedding $(T, \phi, \pi_{G \to T}, \pi_{T \to G})$ deterministically with respect to $G$ and $r$ as in Theorem 3 with cost approximation $O(\log^2 n)$ and copy number $O(\log n)$. Next, given our instance $I_t$ of $f$-partial group Steiner tree on $G$ with groups $g_1, \ldots, g_t$ and connection requirements $f_1, \ldots, f_t$, we let $I'_t$ be the instance of 2-level $f$-partial group Steiner tree on $T$ with groups of groups $\mathcal{G}_1, \ldots, \mathcal{G}_t$ where $\mathcal{G}_i = \{\phi(v) : v \in g_i\}$, connection requirements $f_1, \ldots, f_t$ and root $\phi(r)$. If the adversary has required that we solve instance $I_t$ in time step $t$, then we require that the algorithm in Theorem 30 solves $I'_t$ in time step $t$, and we let $H'_t$ be the solution it returns for $I'_t$. Lastly, we return as our solution for $I_t$ in time step $t$ the set $H_t := \pi_{T \to G}(H'_t)$.

Let us verify that the resulting algorithm is indeed feasible (i.e., monotone and $(1 - \epsilon)$-connection-competitive) and of the appropriate cost.

First, we have that $H_t \subseteq H_{t+1}$ for every $t$, since $H'_t \subseteq H'_{t+1}$ (our algorithm for trees returns a feasible solution for its online problem) and $\pi_{T \to G}$ is monotone by definition of a copy tree embedding. Moreover, we claim that $H_t$ connects at least $(1 - \epsilon) \cdot f_i$ vertices from $g_i$ to $r$ for $i \le t$ and every $t$. To see this, notice that there are at least $(1 - \epsilon) \cdot f_i$ groups from $\mathcal{G}_i$ containing a vertex connected to $r$ by $H'_t$. Since each such group consists of the copies of a distinct vertex, by the connectivity preservation properties of a copy tree embedding it follows that $H_t$ connects at least $(1 - \epsilon) \cdot f_i$ vertices from $g_i$ to $r$.

Next, we verify the cost of our solution. Let $\mathrm{OPT}'_t$ be the cost of the optimal solution to $I'_t$. Notice that since our copy number is $O(\log n)$, it follows that $n_i \le O(\log n \cdot |g_i|)$. Thus, by the guarantees of Theorem 30 we have

$w_T(H'_t) \le \frac{1}{\epsilon} \cdot \left(\max_i \frac{n_i}{f_i}\right) \mathrm{OPT}'_t \le O\left(\frac{\log n}{\epsilon}\right) \cdot \left(\max_i \frac{|g_i|}{f_i}\right) \mathrm{OPT}'_t$. (10)

Next, we bound $\mathrm{OPT}'_t$. Let $H^*_t$ be the optimal solution to $I_t$. We claim that $\pi_{G \to T}(H^*_t)$ is feasible for $I'_t$. This follows because $H^*_t$ connects at least $f_i$ vertices from $g_i$ to $r$ for $i \le t$, and so by the connectivity preservation property of copy tree embeddings there are at least $f_i$ groups in $\mathcal{G}_i$ with a vertex connected to $r$ by $\pi_{G \to T}(H^*_t)$. Thus, combining this with the $O(\log^2 n)$ cost preservation of our copy tree embedding we have

$\mathrm{OPT}'_t \le w_T(\pi_{G \to T}(H^*_t)) \le O(\log^2 n) \cdot w_G(H^*_t)$. (11)

Lastly, by the cost preservation property of our copy tree embedding we have that $w_G(H_t) \le w_T(H'_t)$, which when combined with Equations 10 and 11 gives

$w_G(H_t) \le O\left(\frac{\log^3 n}{\epsilon} \cdot \max_i \frac{|g_i|}{f_i}\right) \cdot w_G(H^*_t)$,

thereby showing that our solution is within the required cost bound.

As a consequence of the above result we have a poly-log bicriteria deterministic approximation algorithm for online partial group Steiner tree; we restate the relevant theorem below.

Theorem 6.

There is a deterministic poly-time algorithm for online partial group Steiner tree which givenany (cid:15) > is O ( log n(cid:15) ) -cost-competitive and ( − (cid:15) ) -connection competitive. Since group Steiner tree is exactly f -partial group Steiner tree where f i = i ∣ g i ∣ f i ≤ N whereagain N is the maximum size of a group. Moreover, since any solution can only connect an integral numberof vertices from each group, it follows that a -connection-competitive solution for partial group Steiner treewhere f i = We note that one can use an aforementioned property of our ﬁrst construction—that if u is connected to r by F ⊆ E thenevery vertex in φ ( u ) is connected to φ ( r ) in π G → T ( F ) —to reduce the O ( log n ) s in this section to O ( log n ) s. In particular, ifone were to use this property then when we map the solution to our f -partial group Steiner tree problem on G to our copy treeembedding, the resulting solution will connect at least f i groups in G i at least Θ ( log n ) times. It follows that when we run ourwater ﬁlling algorithm each time it increases ∑ e x e by 1 we know that it cover at least Ω ( log n ) units of the optimal solutionby weight rather than 1 unit of the optimal solution as in the current analysis. orollary 32. There is an O ( N log n ) -competitive deterministic algorithm for online group Steiner treewhere N ∶= max i ∣ g i ∣ is the maximum group size. In this section, we give a poly-log-approximate algorithm for the demand-robust versions of the group Steinertree and group Steiner forest problems. The high-level approach will be to ﬁnd a black-box reduction fromthe problem on a general graph to a problem on a tree, and then to solve the tree problem. 
However, the properties that the copy tree embedding needs to ensure in this setting are slightly different, hence we will define and introduce a new, demand-robust copy tree embedding in Definition 33.

On a general note, the demand-robust setting provides a robust counterpart to classic optimization problems like (group) Steiner tree, minimum cut, shortest path, etc. In this setting, instead of a single input, one is given a set of scenarios $\mathcal{S} = \{S_1, \ldots, S_m\}$, where each scenario $S_i$ corresponds to a classic input to the problem. The goal is to "prepare" for the worst-case scenario in $\mathcal{S}$ by buying a "first-stage solution" $X_0$ at a discount before one knows which scenario is realized. After committing to $X_0$, the realized scenario $S_i$ is revealed and one needs to extend $X_0$ with a "second-stage solution" $X_i$ (where the cost of $X_i$ is inflated by a factor $\sigma_i \ge 1$) such that $X_0 \cup X_i$ satisfies scenario $S_i$. We want to minimize the total cost (of both the first-stage and the second-stage solution) in case of a realization of the worst-case scenario.

We first give formal descriptions of the demand-robust group Steiner tree and group Steiner forest problems. Note that the formal descriptions of the offline versions were given in Section 5.1 and Section 5.2, respectively.

Demand-robust versions of the group Steiner tree/forest problem:

Let $G = (V, E, w)$ be a weighted graph with a distinguished node $r \in V$ called the root, where the weight $w(e)$ is the "first-stage cost" of an edge $e$. We are given a set of scenarios $\mathcal{S} = \{S_1, \ldots, S_m\}$ with $m := |\mathcal{S}| \le \mathrm{poly}(n)$ where:
1. In the group Steiner tree problem, a scenario $S_i$ consists of a set of groups $g_{i,1}, g_{i,2}, \ldots, g_{i,k(i)}$, with $g_{i,j} \subseteq V$, and an inflation factor $\sigma_i \ge 1$. We assume $k(i) \le \mathrm{poly}(n)$.
2. In the group Steiner forest problem, a scenario $S_i$ consists of a set of pairs $(A_{i,1}, B_{i,1}), (A_{i,2}, B_{i,2}), \ldots, (A_{i,k(i)}, B_{i,k(i)})$, with $A_{i,j}, B_{i,j} \subseteq V$, and an inflation factor $\sigma_i \ge 1$. We assume $k(i) \le \mathrm{poly}(n)$.

We wish to buy the (optimal) set of first-stage edges $X_0 \subseteq E$ in order to minimize the cost of the worst-case scenario being realized. The cost of scenario $S_i$ being realized is the smallest value of $w(X_0) + \sigma_i \cdot w(X_i)$ over all sets of edges $X_i \subseteq E$ such that $X_0 \cup X_i$ is a valid solution to the offline version of the problem for scenario $i$ (e.g., in the group Steiner tree problem, $X_0 \cup X_i$ connects at least one node $v \in g_{i,j}$ to the root $r$ for each group $g_{i,j}$ in scenario $i$).

An alternative way to define the demand-robust version of the above problems is to say that we want to find subsets $X_0, X_1, \ldots, X_m$ which minimize $\max_{i=1}^{m} \{w(X_0) + \sigma_i \cdot w(X_i)\}$ such that for all $1 \le i \le m$, $X_0 \cup X_i$ satisfies scenario $S_i$ for the offline version. Let $\mathrm{OPT} := \max_{i=1}^{m} \{w(X_0) + \sigma_i \cdot w(X_i)\}$ be the cost of the optimal solution.

We now introduce the demand-robust copy tree embedding and prove its existence. One notable difference between this embedding (which is appropriate for the demand-robust setting) and the copy tree embedding of Definition 2 is that the forward- and backward-mapping functions map tuples of subgraphs to tuples of subgraphs (of equal length). This is because the first- and second-stage solutions must be mapped in a coordinated fashion, a requirement that was not necessary in the previous settings.
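To make the demand-robust objective defined above concrete, the following Python sketch evaluates $\max_i \{w(X_0) + \sigma_i \cdot w(X_i)\}$ for given first- and second-stage edge sets. The function name `demand_robust_cost` and the dictionary-based edge weights are assumptions for this example; the sketch only evaluates the cost and does not check that $X_0 \cup X_i$ satisfies scenario $S_i$.

```python
def demand_robust_cost(weight, first_stage, second_stage, inflation):
    """Worst-case cost max_i ( w(X_0) + sigma_i * w(X_i) ).

    weight:       dict edge -> first-stage cost w(e)
    first_stage:  set of edges X_0
    second_stage: list of edge sets X_1, ..., X_m (one per scenario)
    inflation:    list of inflation factors sigma_1, ..., sigma_m
    Feasibility of X_0 ∪ X_i for scenario S_i is NOT checked here.
    """
    def w(edges):
        return sum(weight[e] for e in edges)

    base = w(first_stage)
    return max(base + sigma * w(x_i)
               for x_i, sigma in zip(second_stage, inflation))
```

For instance, with edge costs 1, 2 and 5, first stage $\{e_1\}$, second stages $\{e_2\}$ and $\{e_3\}$, and inflation factors 3 and 1, the two scenario costs are $1 + 3 \cdot 2 = 7$ and $1 + 1 \cdot 5 = 6$, so the objective is 7.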

Definition 33.

Let $G = (V, E, w)$ be a weighted graph with some distinguished root $r \in V$. An $\alpha$-approximate demand-robust copy tree embedding $C = (T, \phi, \pi_{G \to T}, \pi_{T \to G})$ consists of a weighted rooted tree $T = (V', E', w')$ with root $r'$, a copy mapping $\phi : V \to 2^{V'}$ with $\phi(r) = \{r'\}$, and edge mapping functions $\pi_{G \to T}$ and $\pi_{T \to G}$ that map tuples of subgraphs (of any length $m$) to equal-length tuples of subgraphs.

The "forward-mapping function" $\pi_{G \to T}$ maps at most $m \le \mathrm{poly}(n)$ subgraphs (more precisely, subsets of $E$), namely $X_0, X_1, \ldots, X_m$, to subsets of $E'$, namely $X'_0, X'_1, \ldots, X'_m$, such that the following always holds:
1. Demand-Robust Connectivity Preservation: For all $1 \le i \le m$, and all $u, v \in V$ that are connected via $X_0 \cup X_i$, we have that $\phi(u)$ and $\phi(v)$ are connected via $X'_0 \cup X'_i$.
2. Cost Preservation: For every $0 \le i \le m$ we have that $w'(X'_i) \le \alpha \cdot w(X_i)$.

The "backward-mapping function" $\pi_{T \to G}$ maps $m \le \mathrm{poly}(n)$ subsets of $E'$, namely $X'_0, X'_1, \ldots, X'_m$, to subsets of $E$, namely $X_0, X_1, \ldots, X_m$, such that the following always holds:
1. Demand-Robust Connectivity Preservation: For all $1 \le i \le m$, and all $u, v \in V'$ that are connected via $X'_0 \cup X'_i$, we have that $\phi^{-1}(u)$ and $\phi^{-1}(v)$ are connected via $X_0 \cup X_i$.
2. Cost Preservation: For every $0 \le i \le m$ we have that $w(X_i) \le w'(X'_i)$.

A demand-robust copy tree embedding is efficient if $T$, $\phi$, and $\pi_{T \to G}$ are all poly-time computable, and well-separated if $T$ is well-separated.

Comparing the above with Definition 2, we note that an $\alpha$-approximate demand-robust copy tree embedding is also an $\alpha$-approximate copy tree embedding. However, the converse might not hold: for example, the "merging FRT support construction" as defined in Section 4.3 (in particular, where the mapping function $\pi_{G \to T}$ simply embeds a subgraph into the cheapest tree) is not a $\log^{O(1)} n$-approximate demand-robust copy tree embedding. However, by changing the forward-mapping function of the FRT support construction, we are able to obtain the following guarantees.

Theorem 34.

There is a poly-time deterministic algorithm which, given any weighted graph $G = (V, E, w)$ and root $r \in V$, computes an efficient and well-separated $O(\log^2 n)$-approximate demand-robust copy tree embedding.

Proof. We show that the "merging FRT support construction" (same as Section 4.3, which we reintroduce here for convenience) also suffices for the demand-robust setting. We let $T_1, T_2, \ldots, T_q$ be the trees in the support of the FRT distribution guaranteed by Theorem 25. Then, we let $T$ be the result of identifying each copy of $r$ as the same vertex in each $T_i$ (but not identifying copies of other vertices in $V$ as the same vertex). The weight function $w'_T$ of $T$ is inherited from each $T_i$ in the natural way. Similarly, we let $\phi(v)$ be the set containing each copy of $v$ in each of the $T_i$. It is easy to verify that $\phi$ is indeed a copy mapping. Also, note that $\phi(v)$ is computable in deterministic poly-time.

We now describe $\pi_{G \to T}$. Let $X_0, X_1, \ldots, X_m \subseteq E$ be a tuple of subsets of $E$. We use the probabilistic method to show there exists a tuple of subsets $X'_0, X'_1, \ldots, X'_m \subseteq E' := E(T)$ which satisfy the above properties. Note that the overall construction will still be deterministic, as we only need to show the existence of $\pi_{G \to T}$ (in particular, we do not need to be able to compute $\pi_{G \to T}$ efficiently).

Independently sample $k := O(\log m) = O(\log n)$ random FRT trees, namely $T'_1, \ldots, T'_k$, and let $w_{T'_i}$ be their corresponding weights. In each $T'_i$ let $T'_i(X_0)$ be the unique forest (subgraph of $T'_i$) which has the same connected components as $X_0$. Finally, we set $X'_0 := \bigcup_{i=1}^{k} T'_i(X_0)$. Due to the properties of FRT, we have that $\mathbb{E}[w_{T'_i}(X_0)] \le O(\log n) \cdot w_G(X_0)$, hence $w'_T(X'_0) \le k \cdot O(\log n) \cdot w_G(X_0) = O(\log^2 n) \cdot w_G(X_0)$ with at least constant probability.

We now fix a subset $X_i$.
For each $j \in [k]$ we have that $w_{T'_j}(X_i) \le O(\log n) \cdot w_G(X_i)$ with at least constant probability, hence with probability at least $1 - \exp(-O(k)) = 1 - n^{-O(1)}$ there exists some $j(i) \in [k]$ where the property holds. Assuming this is the case, we set $X'_i := T'_{j(i)}(X_i)$. Applying a union bound over all subgraphs $X'_i$ for $i \in \{1, 2, \ldots, m\}$, we conclude that all of the above properties are satisfied with at least constant probability, hence via the probabilistic method at least one such mapping exists. By construction, the forward mapping satisfies the cost preservation properties with $\alpha = O(\log^2 n)$. Furthermore, if two nodes $u, v \in V(G)$ are connected in $X_0 \cup X_i$, then they are connected in $T'_{j(i)}(X_0) \cup T'_{j(i)}(X_i) \subseteq X'_0 \cup X'_i$. To see this, consider an edge $e$ that is either in $X_0$ or in $X_i$: in the former case the endpoints of the edge are connected in $T'_{j(i)}(X_0)$ and in the latter they are connected in $T'_{j(i)}(X_i)$.

Lastly, we specify $\pi_{T \to G}$. While the original definition acts on a tuple $(X'_i)_{i=0}^{m}$ of subsets of $E'$, we specify its action on a single subset $\pi_{T \to G}(F')$ and then extend this function to all elements of the tuple, i.e., $X_i := \pi_{T \to G}(X'_i)$ for all $i$. We let $\pi_{T \to G}(F')$ be $\bigcup_{(u', v') \in F'} P_{uv}$, where $P_{uv}$ is an arbitrary shortest path in $G$ between $u$ and $v$, and $u'$ and $v'$ are copies of $u$ and $v$. We first verify the cost preservation: for every $F' \subseteq E(T)$ we have $w_G(\pi_{T \to G}(F')) \le \sum_{(u', v') \in F'} w_G(P_{uv}) \le \sum_{(u', v') \in F'} w'_T(u', v') = w'(F')$, where the last inequality holds because distances in FRT trees dominate distances in $G$. This proves the cost preservation.

Next, we verify the demand-robust connectivity preservation: for each edge $e \in X'_0 \cup X'_i$, the vertices of $G$ whose copies are its endpoints are connected either via $X_0 = \pi_{T \to G}(X'_0)$ (if $e \in X'_0$) or via $X_i = \pi_{T \to G}(X'_i)$ (if $e \in X'_i$), hence if two nodes are connected via $X'_0 \cup X'_i$, then they are connected via $X_0 \cup X_i$.
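As a rough illustration of the backward mapping $\pi_{T \to G}$ used in this proof, the following sketch replaces every tree edge $(u', v')$ by one shortest path in $G$ between the vertices $u$ and $v$ that these endpoints copy, and returns the union of those paths. The adjacency-dictionary representation, the `origin` dictionary (sending a copy to the vertex it copies), and the function names are assumptions made for this example only; the graph is assumed to be connected.

```python
import heapq

def shortest_path_edges(adj, src, dst):
    """Dijkstra on an undirected weighted graph given as adj[u][v] = weight.
    Returns the edge set (as frozensets) of one shortest src-dst path."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    # Walk predecessor pointers back from dst to src.
    edges = set()
    node = dst
    while node != src:
        edges.add(frozenset((prev[node], node)))
        node = prev[node]
    return edges

def backward_map(adj, origin, tree_edges):
    """Sketch of pi_{T->G}: union of shortest paths P_uv over tree edges (u', v')."""
    result = set()
    for u_copy, v_copy in tree_edges:
        u, v = origin[u_copy], origin[v_copy]
        if u != v:  # two copies of the same vertex need no path in G
            result |= shortest_path_edges(adj, u, v)
    return result
```

Because each path $P_{uv}$ costs at most the tree distance between $u'$ and $v'$, summing over the edges of $F'$ gives the cost bound $w_G(\pi_{T \to G}(F')) \le w'(F')$ stated in the proof.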
It is easy to check that $T$, $\phi$, and $\pi_{T \to G}$ can all be constructed in deterministic poly-time.

We also remark that the construction of merging partial tree embeddings can also be made into a demand-robust embedding of a smaller size. However, this approach seems more complicated and yields the same cost approximation, hence we do not present it here.

In this section we show how to map the demand-robust group Steiner tree and forest problems on a general graph to an equivalent problem on a demand-robust copy tree embedding with a poly-log loss in the approximation factor. We formally describe the mapping and then proceed to prove its properties.

Mapping to a copy tree embedding. We describe how to map an instance $I = (G, r, \mathcal{S})$ of the demand-robust group Steiner tree/forest problem to a copy tree embedding $C = (T, \phi, \pi_{G \to T}, \pi_{T \to G})$. We define an instance $I' = (G', r', \mathcal{S}')$ where $G' := T$ with $r'$ being the root of $T$. We set $\mathcal{S}' \leftarrow \mathcal{S}$ with the following changes applied:

1. In the group Steiner tree problem, each group $g \in S_i \in \mathcal{S}$ is changed to $g' := \bigcup_{v \in g} \phi(v)$. In other words, each node $v$ in a group is replaced by all of its copies $\phi(v)$ in the copy tree embedding.
2. In the group Steiner forest problem, each pair $(A, B) \in S_i \in \mathcal{S}$ is changed to $\left(\bigcup_{a \in A} \phi(a), \bigcup_{b \in B} \phi(b)\right)$.

Note that the demand-robust group Steiner tree/forest instance maps to another instance of the same problem (e.g., a group Steiner tree problem maps to a group Steiner tree problem).

We remind the reader that the group Steiner forest problem directly generalizes the group Steiner tree problem: given a group Steiner tree problem on $G$ with groups $(g_i)$, we can reduce it to an equivalent group Steiner forest problem on the same graph $G$ and root $r$, where each group $g$ is mapped to the pair $(\{r\}, g)$. Comparing the mapping to the copy tree embedding with the above reduction, a natural question is whether one should apply the reduction before or after applying the mapping to the copy tree embedding. However, one can easily check that there is no difference: these two transformations "commute".

The following lemma illustrates why such a mapping definition is appropriate and shows the utility of Definition 33.

Lemma 35.

Suppose that an instance $I$ of the demand-robust group Steiner tree (resp., forest) problem maps to a demand-robust group Steiner tree (resp., forest) instance $I'$ via an $\alpha$-approximate demand-robust copy tree embedding $C$. Then:

1. If $X_0, X_1, \ldots, X_m$ ($X_i \subseteq E(G)$) is a feasible solution for $I$ of cost $\mathrm{OPT}$, then $(X'_i)_{i=0}^{m} := \pi_{G \to T}((X_i)_{i=0}^{m})$ is a feasible solution to $I'$ with cost at most $\alpha \cdot \mathrm{OPT}$.
2. If $X'_0, X'_1, \ldots, X'_m$ ($X'_i \subseteq E(T)$) is a feasible solution for $I'$ of cost $\mathrm{ALG}$, then $(X_i)_{i=0}^{m} := \pi_{T \to G}((X'_i)_{i=0}^{m})$ is a feasible solution to $I$ with cost at most $\mathrm{ALG}$.

Proof.

We first prove (1). It is sufficient to prove the result for the forest problem: take the tree instance on $G$ with a feasible solution $X$ of cost $\mathrm{OPT}$, reduce it to an equivalent forest instance, map it to $C$ and, applying the forest claim, conclude there is a feasible solution $X'$ of value at most $\alpha \cdot \mathrm{OPT}$. By commutativity, $X'$ is also a feasible solution for the reduction of the mapping of the original tree instance to $C$, hence it is a feasible solution (of cost at most $\alpha \cdot \mathrm{OPT}$) for the mapping of the original problem to $C$, proving the claim.

We now prove (1) for the forest problem. Fix a scenario $S_i \in \mathcal{S}$. By feasibility, for each pair $(A, B) \in S_i$ in the original instance, there exist $a \in A$ and $b \in B$ which are connected via $X_0 \cup X_i$. Therefore, by the demand-robust connectivity preservation, there exist $a' \in \phi(a)$ and $b' \in \phi(b)$ that are connected via $X'_0 \cup X'_i$. In other words, the set of vertices $\bigcup_{a \in A} \phi(a)$ is connected to the set of vertices $\bigcup_{b \in B} \phi(b)$ via $X'_0 \cup X'_i$, hence the solution is feasible for $I'$.

Finally, we analyze the cost. By the cost preservation property, we have that $w'(X'_i) \le \alpha \cdot w(X_i)$, hence the cost is:

$$w'(X'_0) + \max_{1 \le i \le m} \sigma_i \cdot w'(X'_i) \le \alpha \cdot \left( w(X_0) + \max_{1 \le i \le m} \sigma_i \cdot w(X_i) \right) \le \alpha \cdot \mathrm{OPT}.$$

Next, we prove (2). It is sufficient to prove the result for the forest problem: take the tree problem on $G$, map it to $C$, then reduce to a forest problem and obtain a feasible solution $X'$ of cost $\mathrm{ALG}$. By commutativity and assuming the claim for the forest problem, $X$ is a feasible solution to the reduction of the original tree instance to a forest instance. Hence, $X$ is a feasible solution (of cost at most $\mathrm{ALG}$) to the original tree instance.

We now prove (2) for the forest problem. Fix a scenario $S_i \in \mathcal{S}$. By feasibility, for each pair $(A, B) \in S_i$ in the original instance, the set of vertices $\bigcup_{a \in A} \phi(a)$ is connected to the set of vertices $\bigcup_{b \in B} \phi(b)$ via $X'_0 \cup X'_i$.
Therefore, there exist $a' \in \phi(a)$ with $a \in A$ and $b' \in \phi(b)$ with $b \in B$ such that $a', b'$ are connected via $X'_0 \cup X'_i$. By the demand-robust connectivity preservation, we have that $a = \phi^{-1}(a')$ and $b = \phi^{-1}(b')$ are connected via $X_0 \cup X_i$, hence the solution is feasible for $I$.

Finally, we analyze the cost. By the cost preservation property, we have that $w(X_i) \le w'(X'_i)$, hence the cost is:

$$w_G(X_0) + \max_{1 \le i \le m} \sigma_i \cdot w_G(X_i) \le w'(X'_0) + \max_{1 \le i \le m} \sigma_i \cdot w'(X'_i) \le \mathrm{ALG}.$$

G is a Tree

In this section we give a poly-log-approximation algorithm for the demand-robust group Steiner tree problem when the underlying graph $G$ is a weighted and rooted tree. The main result of the section follows.

Theorem 7.

There is a randomized poly-time $O(\log^2 n)$-approximation algorithm for the demand-robust group Steiner tree problem on weighted trees.

We note that combining Theorem 7 with the mapping of Lemma 35 and the demand-robust copy tree embedding construction of Theorem 34 immediately yields a randomized $O(\log^4 n)$-competitive poly-time algorithm for the group Steiner tree on general graphs, namely Theorem 9.

The rest of this section is dedicated to proving Theorem 7. The general outline of our proofs is as follows.

1. We prove an important structural property on the first-stage solution that allows us to conclude that there exists a first-stage solution that is a rooted subtree of $G$ (i.e., it is connected and contains the root of $G$).
2. We write the linear program that fractionally relaxes the demand-robust group Steiner tree problem.
3. We show how to utilize the randomized rounding for the online group Steiner tree problem of [7] to construct a demand-robust solution. We remark that a more naive attempt at utilizing the randomized rounding techniques on a general graph (i.e., without transferring the problem to a demand-robust copy tree embedding) would not yield a poly-logarithmic approximation ratio: we crucially use the fact that $G$ is a tree to make the randomized rounding work.

First, we prove an important structural property on the first-stage solution, first proved in [25]: there exists a 2-approximate first-stage solution that is a union of minimal feasible solutions for a subset of scenarios. For the demand-robust group Steiner tree problem, we say that $M_i \subseteq E$ is a minimal feasible solution to the scenario $S_i$ if no proper subset $M'_i \subsetneq M_i$ is feasible for the scenario (i.e., there exists at least one group in $S_i$ that is not connected to the root via $M'_i$).

Lemma 36 (Adapted from [25]).
In the demand-robust group Steiner tree problem on the graph $G = (V, E)$, there exists a first-stage solution $X_0 \subseteq E$ which can be extended to a solution of (worst-case realization) cost $2 \cdot \mathrm{OPT}$ and which has the following structure. There exists a subset $I \subseteq \{1, 2, \ldots, m\}$ and a set $\{M_i\}_{i \in I}$, where $M_i$ is some minimal feasible solution (i.e., no proper subset is feasible) to the scenario $S_i$, such that $X_0 = \bigcup_{i \in I} M_i$.

The proof of this result is directly argued via the proof of Lemma 4.1 in Section 4.1 of [25]. However, our claim requires slightly weaker structural properties compared to [25]: it stipulates that the first-stage solution $X_0$ is a union of minimal feasible solutions instead of being the minimal solution for a particular instance. The proof remains unchanged: every time the if condition in (2b) is true (as given in [25]), we add $I \leftarrow I \cup \{i\}$ and observe that $M_i := X^*_i \cup X^*_i$ is a minimal feasible solution for scenario $i$. By construction, $X_0 = \bigcup_{i \in I} M_i$ and, as argued in the proof, the cost of $X_0$ is at most $2 \cdot \mathrm{OPT}$.

Relaxation $\mathrm{LP}_{\mathrm{GST}}$. We now give the linear program for a tree $G = (V, E, w)$ with a root $r \in V$ that relaxes the original problem. We say that a vector $x \in \mathbb{R}^E$ is decreasing on root-leaf paths if for every $e \in E$ not incident to the root $r$ and its parent edge $\mathrm{parent}(e)$ we have $x_{\mathrm{parent}(e)} \ge x_e$. This condition is required by the randomized online rounding technique and can be argued to be a valid constraint due to Lemma 36. The LP jointly optimizes over the first-stage solution $\{x_{0,e}\}_e$ and the second-stage parts of the solution $\{x_{i,e}\}_{i \in [m], e \in E}$ while ensuring (1) that the first-stage solution is decreasing on root-leaf paths, and (2) that the maximum flow between the root and each group $g_{i,j}$ (in scenario $i$) is at least 1 when using $x_0 + x_i$ as edge capacities.
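We formally write out the linear program $\mathrm{LP}_{\mathrm{GST}}$.

$$
\begin{aligned}
\min\ & z \\
\text{such that}\quad
& \forall i \in [m]: && \sum_{e \in E} w(e) \left[ x_{0,e} + \sigma_i \cdot x_{i,e} \right] \le z \\
& \forall i \in [m],\ \forall j \in [k(i)]: && \mathrm{maxflow}(x_0 + x_i, \{r\}, g_{i,j}) \ge 1 \\
& \forall e \in E \text{ not incident to } r: && x_{0,\mathrm{parent}(e)} \ge x_{0,e} \\
& \forall i \in \{0\} \cup [m],\ \forall e \in E: && x_{i,e} \ge 0
\end{aligned}
\qquad (\mathrm{LP}_{\mathrm{GST}})
$$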
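On a tree, the two structural constraints of the linear program are easy to evaluate for a fixed capacity vector. The sketch below is only an illustration under assumed conventions: the tree is given as a `children` dictionary, `cap[v]` is the capacity of the edge from `v` to its parent, and the function names are hypothetical. The flow from the root to a group is computed bottom-up: a node inside the group absorbs unlimited flow (it is attached to the super-sink with infinite capacity), and any other node forwards the sum over its children of the minimum of the edge capacity and the child's value.

```python
import math

def tree_maxflow(children, cap, root, group):
    """Max flow from root to the node set `group` in a rooted tree.
    cap[v] is the capacity of the edge between v and its parent."""
    def down(v):
        if v in group:
            return math.inf  # v is attached to the super-sink
        return sum(min(cap[c], down(c)) for c in children.get(v, []))
    return down(root)

def is_root_leaf_decreasing(children, cap, root):
    """Check x_parent(e) >= x_e for every edge e not incident to the root."""
    stack = [(c, math.inf) for c in children.get(root, [])]
    while stack:
        v, parent_cap = stack.pop()
        if cap[v] > parent_cap:
            return False
        stack.extend((c, cap[v]) for c in children.get(v, []))
    return True
```

For a candidate solution one would call `tree_maxflow` with capacities $x_0 + x_i$ for each group $g_{i,j}$ and compare the result against 1, and call `is_root_leaf_decreasing` on $x_0$ alone.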

In the linear program we introduced the notation $\mathrm{maxflow}(x, A, B)$, where $x \in \mathbb{R}^E_{\ge 0}$, $A \subseteq V$, $B \subseteq V$, which corresponds to the maximum flow between the set $A$ and the set $B$ when the capacity of each edge $e$ is set to $x_e$. The maximum flow between two sets $A$, $B$ is defined as the flow between the super-source $a$ and super-sink $b$ when a new virtual node $a$ is connected to all nodes in $A$ with infinite capacity, and analogously for $b$. The condition that this maximum flow using capacities $x_0 + x_i$ is at least 1 can be expressed as a linear program with a polynomial number of variables and constraints, hence $\mathrm{LP}_{\mathrm{GST}}$ can be solved in poly-time.

Let $z^*$ be the optimal cost of the linear program. We argue that the LP is a relaxation of the original problem (with a factor-2 loss), i.e., $z^* \le 2 \cdot \mathrm{OPT}$. Let $X^*$ be the first-stage solution that satisfies the stipulations of Lemma 36, hence $w(X^*) \le 2 \cdot \mathrm{OPT}$. Moreover, $X^*$ is decreasing on root-leaf paths since each minimal feasible solution is decreasing on root-leaf paths, hence we can deduce the same about their union. The flow and positivity properties are trivially satisfied by any feasible integral solution. Therefore, $z^* \le w(X^*) \le 2 \cdot \mathrm{OPT}$.

Rounding the LP.

We use the online algorithm for the group Steiner tree problem on trees from Alon et al. [7]. Intuitively, given a sequence of fractional solutions $y_1, y_2, \ldots$, where each $y_i \in [0, 1]^E$ represents the extent to which the edges in $E$ are bought and the solutions satisfy some simple monotonicity properties, the algorithm maintains a sequence of non-decreasing integral solutions $F_1, F_2, \ldots$ where $F_i \subseteq E$ such that (1) the cost of the integral solution is competitive with the cost of the fractional solution, and (2) the integral solution satisfies the same set of constraints as the fractional solution. The result is formalized as follows.

Lemma 37 ([7]). Let $G = (V, E, w)$ be a weighted tree with a distinguished root $r \in V$. There exists a polynomial-time randomized algorithm which accepts a sequence of vectors $y_1, y_2, \ldots, y_T \in [0, 1]^E$ where each $y_i$ is decreasing on root-leaf paths for $i \in \{1, \ldots, T\}$ and $y_i(e) \le y_{i+1}(e)$ for all $i \in \{1, \ldots, T-1\}$, $e \in E$. For each $i \in \{1, \ldots, T\}$, upon receiving the vector $y_i$, the algorithm outputs a set $F_i \subseteq E$ which includes the previous output (i.e., $F_{i-1} \subseteq F_i$ if $i > 1$) and (1) $\Pr[e \in F_i] = y_i(e)$ for each $e \in E$, and (2) for each $i$ and every set $g \subseteq V$, if $\mathrm{maxflow}(y_i, \{r\}, g) \ge 1$, then $F_i$ connects some node of $g$ to the root with probability at least $\Omega(1/\log n)$.

This algorithm is explicitly explained in Section 4.2 of [7]. Property (1) is argued via Lemma 10 and Property (2) matches Lemma 12.

Using the online rounding scheme of Lemma 37, we show how to round $\mathrm{LP}_{\mathrm{GST}}$ to obtain an (integral) demand-robust solution.

Lemma 38.

Consider a demand-robust group Steiner tree problem on a weighted rooted tree $G = (V, E, w)$. Given a feasible solution $x$ to $\mathrm{LP}_{\mathrm{GST}}$ with objective value $z$, there exists a polynomial-time randomized algorithm that outputs $X_0 \subseteq E, \ldots, X_m \subseteq E$ such that $w(X_0) + \sigma_i \cdot w(X_i) \le O(\log^2 n) \cdot z$ for all $i \in [m]$, and each group $g_{i,j}$ is connected to the root via $X_0 \cup X_i$ with probability at least $1 - n^{-O(1)}$ (both $O$-constants can be jointly increased).

Proof. We run $C \cdot \log^2 n$ (for a sufficiently large constant $C > 0$) independent copies of the algorithm of Lemma 37. We feed each copy $y_1 := x_0$ and note that $x_0$ is valid, since it is decreasing on root-leaf paths due to the constraint in $\mathrm{LP}_{\mathrm{GST}}$. We output the union of the outputs of all the copies as the first-stage solution $X_0$. We remember the state of the algorithm copies and perform the following for each scenario $i \in [m]$ (reverting the state upon completion).

Suppose now that some scenario $S_i \in \mathcal{S}$ is realized. We set $y_2 := x_0 + x_i$, hence clearly $y_1 \le y_2$. Furthermore, we can assume without loss of generality that $y_2$ is decreasing on root-leaf paths, since otherwise we can lower the value $(y_2)_e$ of any violating edge without decreasing the maximum flow to any group $g \subseteq V$; clearly, the value will not fall below $(y_1)_e$. Therefore, we can feed $y_2$ to all the copies and recover (as the union of their outputs) $X_i$, which will be our second-stage solution.

We argue that this solution $X_0, X_1, \ldots, X_m$ is feasible. We remark here that $X_0$ only depends on $y_1$, and $X_i \supseteq X_0$. Furthermore, the probability that a single copy does not satisfy a group is $1 - 1/O(\log n) \le \exp(-1/O(\log n))$. Therefore, we can conclude via the independence of the randomness of our algorithm copies and a union bound that every group is satisfied by at least one copy of the algorithm with probability at least $1 - \mathrm{poly}(n) \cdot \exp\left(-\tfrac{1}{O(\log n)} \cdot C \log^2 n\right) \ge 1 - n^{-C'}$ (where $C' = O(1)$ can be made arbitrarily large by increasing $C = O(1)$).

Finally, we argue our cost bound. Let $z$ be the objective value of $x$ and let $(F_0, F_1, \ldots, F_m)$ be the output of a fixed copy of the algorithm. For each $i \in [m]$ we have:

$$\mathbb{E}[w(F_0) + \sigma_i \cdot w(F_i)] = \sum_{e \in E} w(e) \left( \Pr[e \in F_0] + \sigma_i \cdot \Pr[e \in F_i] \right) \le \sum_{e \in E} w(e) \left( x_{0,e} + \sigma_i \cdot x_{i,e} \right) \le z.$$

Therefore, we have $\mathbb{E}[w(X_0) + \sigma_i \cdot w(X_i)] \le C \cdot \log^2 n \cdot z = O(\log^2 n) \cdot z$, bounding the cost.

We conclude with our proof of Theorem 7.

Proof of Theorem 7.

Let $x^*$ represent the optimal solution to $\mathrm{LP}_{\mathrm{GST}}$. We apply Lemma 38 on $x^*$ (with $f_{i,j} := 1$ for all $i, j$); the described poly-time algorithm outputs an (integral) solution $X_0, X_1, \ldots, X_m$ such that $X_0 \cup X_i$ connects each group $g_{i,j}$ to the root with probability at least $1 - n^{-O(1)}$. Since there are at most $m \le \mathrm{poly}(n)$ scenarios, and each scenario has at most $\mathrm{poly}(n)$ groups, we can conclude via a union bound that the solution is feasible with probability at least $1 - \mathrm{poly}(n) \cdot n^{-O(1)} \ge 1 - 1/\mathrm{poly}(n)$.

G is a Tree

In this section we give a poly-log-approximation algorithm for the demand-robust group Steiner forest problem when the underlying graph $G$ is a weighted and rooted tree. The main result of the section follows.

Theorem 8.

There is a randomized poly-time $O(D \cdot \log n)$-approximation algorithm for the demand-robust group Steiner forest problem on weighted trees of depth $D$.

We note that combining Theorem 8 with the mapping of Lemma 35 and the demand-robust copy tree embedding construction of Theorem 34 immediately yields a randomized poly-log-competitive poly-time algorithm for the group Steiner forest on general graphs when the aspect ratio is polynomial, namely Theorem 10. Note that here we used the fact that for graphs with polynomial aspect ratio the depth of the FRT trees can be assumed to be $D = O(\log n)$. The rest of this section is dedicated to proving Theorem 8.

We proceed in a similar way to the demand-robust group Steiner tree on a tree: we first write a linear programming relaxation and then utilize the online rounding scheme for the group Steiner forest problem (presented in [53]) to obtain a demand-robust solution. Again, we remark that using the randomized rounding scheme in a more naive way (without going through the demand-robust copy tree embedding) does not immediately yield poly-logarithmic approximation ratios.

Relaxation $\mathrm{LP}_{\mathrm{GSF}}$. We write a somewhat more complicated linear programming relaxation than we did in the demand-robust group Steiner tree case. Remember that $G$ is a rooted tree. We make $D + 1$ copies $G_0, G_1, \ldots, G_D$ of the tree $G$. Next, the $\ell$-th copy $G_\ell$ deletes all nodes whose depth is less than $\ell$ (e.g., for $\ell = 0$ the graph is $G$ and for $\ell = D$ the graph is a set of isolated nodes). Note that $G_\ell$ is a forest; let $\mathcal{T}_\ell$ be the set of (maximal) trees in $G_\ell$. For each edge $e \in E(G_\ell)$ in a copy $G_\ell$ we introduce first-stage and second-stage variables $x_{\ell,i,e}$ for $\ell \in \{0, 1, \ldots, D\}$ and $i \in \{0, 1, \ldots, m\}$. Similarly as in the group Steiner tree case, we require that the first-stage solution is root-leaf decreasing in order for the online rounding scheme to work.
Lastly, over $\ell, i, j$ (same range as before) and for $T \in \mathcal{T}_\ell$ we introduce a "flow variable" $f_{\ell,T,i,j}$ which corresponds to the amount of flow that can be routed via $x_{\ell,0} + x_{\ell,i}$ between the root of $T$ and the nodes in $A_{i,j}$ and $B_{i,j}$ (we want the same amount of flow to be routable to both of them). The linear program requires that the total amount of flow $f$ across all the trees in $\bigcup_{\ell=0}^{D} \mathcal{T}_\ell$ is at least 1.

$$
\begin{aligned}
\min\ & z \\
\text{such that}\quad
& \forall i \in [m]: && \sum_{\ell=0}^{D} \sum_{e \in E(G_\ell)} w(e) \left[ x_{\ell,0,e} + \sigma_i \cdot x_{\ell,i,e} \right] \le z \\
& \forall \ell \in \{0, \ldots, D\},\ i \in [m],\ j \in [k(i)],\ T \in \mathcal{T}_\ell: && \mathrm{maxflow}(x_{\ell,0} + x_{\ell,i}, \{\text{root of } T\}, A_{i,j}) \ge f_{\ell,T,i,j} \\
& \forall \ell \in \{0, \ldots, D\},\ i \in [m],\ j \in [k(i)],\ T \in \mathcal{T}_\ell: && \mathrm{maxflow}(x_{\ell,0} + x_{\ell,i}, \{\text{root of } T\}, B_{i,j}) \ge f_{\ell,T,i,j} \\
& \forall i \in [m],\ j \in [k(i)]: && \sum_{\ell=0}^{D} \sum_{T \in \mathcal{T}_\ell} f_{\ell,T,i,j} \ge 1 \\
& \forall \ell \in \{0, \ldots, D\},\ e \in E(G_\ell) \text{ not at the top of its tree in } G_\ell: && x_{\ell,0,\mathrm{parent}(e)} \ge x_{\ell,0,e} \\
& \forall \ell \in \{0, \ldots, D\},\ i \in \{0\} \cup [m],\ e \in E(G_\ell): && x_{\ell,i,e} \ge 0 \\
& \forall \ell \in \{0, \ldots, D\},\ i \in [m],\ j \in [k(i)],\ T \in \mathcal{T}_\ell: && f_{\ell,T,i,j} \ge 0
\end{aligned}
\qquad (\mathrm{LP}_{\mathrm{GSF}})
$$
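The level copies $G_\ell$ over which this linear program is indexed can be built directly from the parent pointers of the tree. The following sketch is an illustration under assumed conventions (a `parent` dictionary for all non-root nodes and hypothetical function names): the trees of $\mathcal{T}_\ell$ are exactly the subtrees rooted at the nodes of depth $\ell$, and an edge survives in $G_\ell$ when its upper endpoint has depth at least $\ell$.

```python
def node_depths(parent, root):
    """Depth of every node, given parent[v] for each non-root node v."""
    depth = {root: 0}
    for v in parent:
        path = []
        while v not in depth:  # climb until a node of known depth is found
            path.append(v)
            v = parent[v]
        d = depth[v]
        for u in reversed(path):
            d += 1
            depth[u] = d
    return depth

def level_forests(parent, root):
    """For each level l = 0..D return (roots of the trees in T_l, edges of G_l)."""
    depth = node_depths(parent, root)
    max_depth = max(depth.values())
    levels = []
    for ell in range(max_depth + 1):
        roots = sorted(v for v in depth if depth[v] == ell)
        # Edge (p, v) survives iff its upper endpoint p was not deleted.
        edges = sorted((p, v) for v, p in parent.items() if depth[p] >= ell)
        levels.append((roots, edges))
    return levels
```

In this sketch an edge whose upper endpoint has depth $d$ appears in the levels $0, \ldots, d$, which is where the $O(D)$ loss from copying edges comes from.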

The condition that this maximum flow using capacities $x_{\ell,0} + x_{\ell,i}$ is at least $f_{\ell,T,i,j}$ can be expressed as a linear program with a polynomial number of variables and constraints, hence $\mathrm{LP}_{\mathrm{GSF}}$ can be solved in poly-time.

We now argue that $\mathrm{LP}_{\mathrm{GSF}}$ relaxes the original problem (up to a factor of $O(D)$ loss). To this end we introduce some notation. Let $p$ be a simple path in $G$ and consider the highest (closest to the root) node $x \in V(p)$ it passes through. We say that $p$ peaks at node $x$. The high-level idea is that we can consider the optimal integral solution and, for each pair $(A_{i,j}, B_{i,j})$, observe the path that connects a node in $A_{i,j}$ with a node in $B_{i,j}$. If this path peaks at node $x$, we assign this pair to the tree in $\mathcal{T}_{\mathrm{depth}(x)}$ whose root is exactly $x$. Then, by applying the structural Lemma 36 on each tree in $\bigcup_{\ell=0}^{D} \mathcal{T}_\ell$, we can conclude that there is a root-leaf decreasing integral solution that solves the pairs assigned to the tree, hence the integral solution satisfies all the properties of $\mathrm{LP}_{\mathrm{GSF}}$, which is therefore a relaxation.
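The assignment of a pair to a tree can be illustrated concretely. In a rooted tree, the node at which the path between two nodes peaks is their lowest common ancestor, so the pair is assigned to the level equal to the depth of that ancestor. The sketch below assumes a `parent` dictionary for non-root nodes and a precomputed `depth` dictionary; both names and the function names are hypothetical.

```python
def peak(parent, depth, a, b):
    """Node at which the unique tree path between a and b peaks (their LCA)."""
    while a != b:
        # Move the deeper endpoint up; on ties either endpoint may move.
        if depth[a] >= depth[b]:
            a = parent[a]
        else:
            b = parent[b]
    return a

def assign_pair(parent, depth, a, b):
    """Return (level, root): the pair goes to the tree of T_level rooted at `root`."""
    x = peak(parent, depth, a, b)
    return depth[x], x
```

Here `a` and `b` stand for the two endpoints of the connecting path chosen from $A_{i,j}$ and $B_{i,j}$ in the optimal integral solution.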

Lemma 39.

Let $z^*$ be the optimal objective value of $\mathrm{LP}_{\mathrm{GSF}}$ with respect to some demand-robust group Steiner forest problem with optimal value $\mathrm{OPT}$ on an underlying tree with depth $D$. Then $z^* \le O(D) \cdot \mathrm{OPT}$.

Proof.

Let $X^*_0, X^*_1, \ldots, X^*_m$ be the optimal first-stage and second-stage solutions (as defined on $G$). We define $X^*_{\ell,T,i}$ for $\ell \in \{0, 1, \ldots, D\}$, $i \in \{0, 1, \ldots, m\}$, $T \in \mathcal{T}_\ell$ as a natural extension of $X^*_i$ to $T$: if $e' \in E(T)$ is copied from $e \in E(G)$, then $e' \in X^*_{\ell,T,i} \iff e \in X^*_i$. Therefore, since each edge is copied at most $D + 1$ times, for each $i \in [m]$ we have that $\sum_{\ell=0}^{D} \sum_{T \in \mathcal{T}_\ell} w(X^*_{\ell,T,0}) + \sigma_i \cdot w(X^*_{\ell,T,i}) \le (D + 1) \cdot \mathrm{OPT}$.

Let $p$ be the path connecting (some node in) $A_{i,j}$ to (some node in) $B_{i,j}$. Suppose that $p$ peaks at node $x$, let $\ell$ be the depth (in $G$) of $x$, and let $T$ be the maximal tree in $G_\ell$ whose root is at $x$. Since in the optimal solution both $A_{i,j}$ and $B_{i,j}$ are connected to $x$, the root of $T$, we assign the "groups" $A_{i,j}$ and $B_{i,j}$ to $T$ (both $A_{i,j}$ and $B_{i,j}$ are considered stand-alone groups, i.e., we forget that they were paired beforehand). Clearly, since the optimal solution is feasible, each (element of a) pair is assigned to exactly one tree.

Fix a particular (maximal) tree $T$ in $\bigcup_{\ell=0}^{D} G_\ell$ and consider the set $\mathcal{P}_T$ of groups assigned to $T$. Grouping the groups by their originating scenario, we can rewrite $\mathcal{P}_T$ as $\mathcal{P}'_T := (\mathcal{P}_{T,i})_{i=1}^{m}$, where $\mathcal{P}_{T,i}$ is the set of groups from $\mathcal{P}_T$ that originated from scenario $i$. Finally, we note that $(X^*_{\ell,T,i})_{i=0}^{m}$ is a feasible solution to the demand-robust group Steiner tree problem with scenarios $\mathcal{P}'_T$.

Applying Lemma 36 on each such tree $T$, there exists a (first-stage and second-stage) solution $(X'_{\ell,T,i})_{i=0}^{m}$ such that for all $\ell, T, i$, we have (i) $w(X'_{\ell,T,i}) \le 2 \cdot w(X^*_{\ell,T,i})$, (ii) the first-stage solution $X'_{\ell,T,0}$ is a subtree of $T$ with coinciding roots, and (iii) $(X'_{\ell,T,i})_i$ is a feasible solution to $\mathcal{P}'_T$ (i.e., for each pair $(A_{i,j}, B_{i,j})$ assigned to $T$, $X'_{\ell,T,0} \cup X'_{\ell,T,i}$ connects $A_{i,j}$ to the root of $T$ as well as $B_{i,j}$).

We now define $x_{\ell,i,e} := 1$ if $e \in X'_{\ell,T,i}$ for the unique tree $T \in \mathcal{T}_\ell$ such that $e \in E(T)$, and $0$ otherwise. Furthermore, if groups $A_{i,j}$ and $B_{i,j}$ are assigned to a tree $T \in \mathcal{T}_\ell$, we set $f_{\ell,T,i,j} := 1$, and otherwise we set $f_{\ell,T,i,j} := 0$. We argue that $(x, f)$ is a feasible solution to the linear program $\mathrm{LP}_{\mathrm{GSF}}$. Property (ii) of $X'$ ensures that $x_{\ell,0}$ is decreasing on all root-leaf paths of each tree in $G_\ell$. Finally, from property (iii) we conclude that the requirement that the maximum flow is at least $f_{\ell,T,i,j}$ is also satisfied, hence proving that $(x, f)$ is a feasible solution. Therefore, the bound on the objective follows from condition (i); for all $i \in [m]$ we have that:

$$
\begin{aligned}
z^* &\le \sum_{\ell=0}^{D} \sum_{T \in \mathcal{T}_\ell} \sum_{e \in E(T)} w(e) \left[ x_{\ell,0,e} + \sigma_i \cdot x_{\ell,i,e} \right] \\
&= \sum_{\ell=0}^{D} \sum_{T \in \mathcal{T}_\ell} w(X'_{\ell,T,0}) + \sigma_i \cdot w(X'_{\ell,T,i}) \\
&\le 2 \cdot \sum_{\ell=0}^{D} \sum_{T \in \mathcal{T}_\ell} w(X^*_{\ell,T,0}) + \sigma_i \cdot w(X^*_{\ell,T,i}) \\
&\le O(D) \cdot \mathrm{OPT}.
\end{aligned}
$$

We now present the randomized online rounding scheme from [53] which enables us to round $\mathrm{LP}_{\mathrm{GSF}}$ into a demand-robust solution.

Lemma 40 ([53]). Let $G = (V, E, w)$ be a forest, namely a collection of (maximal) rooted trees $G_1, G_2, \ldots, G_m$ with roots $r_1, \ldots, r_m$. There exists a polynomial-time randomized algorithm which accepts a sequence of vectors $y_1, y_2, \ldots, y_T \in [0, 1]^E$ where each $y_i$ is decreasing on root-leaf paths for $i \in \{1, \ldots, T\}$ and $y_i(e) \le y_{i+1}(e)$ for all $i \in \{1, \ldots, T-1\}$, $e \in E$. For each $i \in \{1, \ldots, T\}$, upon receiving the vector $y_i$, the algorithm outputs a set $F_i \subseteq E$ which includes the previous output (i.e., $F_{i-1} \subseteq F_i$ if $i > 1$) and such that (1) $\Pr[e \in F_i] = y_i(e)$ for each $e \in E$, and (2) for each $i$ and each pair $(A, B)$ where $A \subseteq V$, $B \subseteq V$, if $\sum_{j=1}^{m} \min(\mathrm{maxflow}(y_i, r_j, A), \mathrm{maxflow}(y_i, r_j, B)) \ge 1$, then with probability $\Omega(1/\log n)$ there is a root $r_j$ that connects to both a node in $A$ and a node in $B$ via $F_i$.

The algorithm is implicitly explained in Section 3 of [53]. Their description talks about an online rounding algorithm for the group Steiner forest problem on a tree $G$. The algorithm accepts an increasing sequence of vectors $y_1, \ldots, y_T \in [0, 1]^{E(G)}$ and proceeds by splitting $G$ into a forest $\bigcup_{\ell=0}^{D} \mathcal{T}_\ell$ and providing the guarantees specified in this claim. The guarantees are proven in Lemma 6 of the paper.

Finally, we combine the relaxation with the LP rounding to prove the main result of this section.

Proof of Theorem 8.

Let $(x^*, f^*)$ be the optimal LP solution of the demand-robust Steiner forest problem with respect to scenarios $\mathcal{S}$ and let $z^* \le O(D) \cdot \mathrm{OPT}$ be the objective value (Lemma 39).

Splitting $G$ into a forest $G'$. Given a tree $G$, we construct a forest $G'$ as composed of $\bigsqcup_{\ell=0}^{D} \mathcal{T}_\ell$ (i.e., each tree in $\mathcal{T}_\ell$ will be included as a component in $G'$). Note that for each $i \in \{0, \ldots, m\}$ the input $x_i$ can be naturally understood as a real vector indexed over the set $E(G')$.

Furthermore, an edge $e \in E(G)$ corresponds to possibly multiple (but at most $O(D)$) edges in $E(G')$, whereas an edge $e' \in E(G')$ corresponds to a unique edge $e \in E(G)$. Therefore, we define a projection $\pi_{G' \to G} : E(G') \to E(G)$ which maps an edge $e' \in E(G')$ to its corresponding edge $e = \pi_{G' \to G}(\{e'\})$, and we extend this to subgraphs $F' \subseteq E(G')$ via $\pi_{G' \to G}(F') = \bigcup_{e' \in F'} \pi_{G' \to G}(\{e'\})$.

Constructing the solution.

We set $y_1 := x^*_0$ and apply Lemma 40 on $G'$ to obtain the integral first-stage solution $X_0$. Note that $x^*_0$ is decreasing on root-leaf paths due to a constraint in $\mathrm{LP}_{\mathrm{GSF}}$.

The second-stage solutions $(X_1, \ldots, X_m)$ are obtained by saving the state of the algorithm and performing the following for each scenario $i \in [m]$ (reverting the state upon completion). In case some scenario $S_i \in \mathcal{S}$ is realized, we set $y_2 := x^*_0 + x^*_i$, hence clearly $y_1 \le y_2$. Furthermore, we can assume without loss of generality that $y_2$ is decreasing on root-leaf paths, since otherwise we can lower the value $(y_2)_e$ of any violating edge without decreasing the maximum flow to any subset of $V$; clearly, the value will not fall below $(y_1)_e$. Therefore, $y_2$ is valid and can be fed to the algorithm, recovering $X_i \subseteq E(G')$.

Analysis.

The cost analysis is straightforward: $\mathbb{E}[w(X_0) + \sigma_i w(X_i)] \le z^* \le O(D) \cdot \mathrm{OPT}$.

By construction of $\mathrm{LP}_{\mathrm{GSF}}$, for each pair $(A_{i,j}, B_{i,j})$ the fractional solution $x_0 + x_i \in \mathbb{R}^{E(G')}_{\ge 0}$ yields a flow of at least 1 across all of $\bigcup_\ell \mathcal{T}_\ell$, or equivalently, $G'$. Therefore, by Lemma 40, with probability $\Omega(1/\log n)$ some node in $A_{i,j}$ and some node in $B_{i,j}$ will be connected to the same root of a tree in $G'$ via $X_0 \cup X_i \subseteq E(G')$. Furthermore, by construction of $\pi_{G' \to G}$, this implies that (with the same probability) these two nodes are connected in $G$ via $\pi_{G' \to G}(X_0 \cup X_i)$. We can run $O(\log n)$ independent copies to recover the result with high probability (at least $1 - 1/n$) and have the cumulative cost be $O(D \cdot \log n) \cdot \mathrm{OPT}$.
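The amplification by independent copies follows a standard calculation, sketched here in generic form. The symbols $p$ (per-copy success probability for a fixed pair), $k$ (number of independent copies), $N$ (total number of pairs over all scenarios) and $c$ (target exponent) are introduced only for this illustration and are not notation from the analysis above.

```latex
\begin{align*}
\Pr[\text{all } k \text{ copies fail on a fixed pair}]
  &\le (1 - p)^k \le e^{-pk}, \\
\Pr[\text{some pair is left unconnected}]
  &\le N \cdot e^{-pk} \le n^{-c}
  \quad \text{whenever } k \ge \frac{\ln N + c \ln n}{p}, \\
\mathbb{E}[\text{cost of the union of the } k \text{ copies}]
  &\le k \cdot z^*.
\end{align*}
```

Under these assumptions, the number of copies needed scales as $\log n$ divided by the per-copy success probability when $N \le \mathrm{poly}(n)$, and the expected cost grows by the same factor $k$.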

Online and dynamic algorithms built on probabilistic tree embeddings seem inherently randomized and necessarily not robust to adaptive adversaries. In this work we gave an alternative to probabilistic tree embeddings, the copy tree embedding, which is better suited to deterministic and adaptive-adversary-robust algorithms. We illustrated this by giving several new results in online and demand-robust algorithms, including a reduction of deterministic online group Steiner tree and group Steiner forest to their tree cases, a bicriteria deterministic algorithm for online partial group Steiner tree, and new algorithms for demand-robust Steiner forest, group Steiner tree and group Steiner forest.

As a conceptual contribution, we believe that copy tree embeddings will prove to be useful far beyond the selected algorithmic problems covered in this paper. We conclude by providing just some directions for such future works.

As mentioned earlier, Bienkowski et al. [15] recently gave a deterministic algorithm for online non-metric facility location (which is equivalent to online group Steiner tree on trees of depth 2) with a poly-log-competitive ratio and stated that they expect their techniques will extend to online group Steiner tree on trees. A very exciting direction for future work would thus be to extend these techniques to general depth trees which, when combined with our reduction to the tree case, would prove the existence of a deterministic poly-log-competitive algorithm for online group Steiner tree, settling the open question of Alon et al. [7].

While our focus has been on two specific constructions, it would be interesting to prove lower bounds on copy tree embedding parameters, such as more rigorously characterizing the tradeoffs between the number of copies and the cost approximation factor. One should also consider the possibility of improved constructions. For example: Is it possible to get a logarithmic approximation with few copies, maybe even a constant number of copies?
It is easy to see that with an exponential number of copies—one for each possible subgraph—aperfect cost approximation factor of one is possible. Can one show that a sub-logarithmic distortion isimpossible with a polynomial number of copies? We currently do not even have a proof that excludes aconstant cost approximation factor with a constant copy number.Furthermore, while this paper focused on online group Steiner problems, there are many other online anddynamic algorithms where copy tree embeddings might be able to give deterministic and adaptive-adversary-35obust solutions for general graphs. Several such works are: Englert et al. [28] and Englert and R¨acke [27]give an algorithm for the reordering buﬀer problem; Guo et al. [36] recently gave a dynamic algorithmfor facility location; Gupta et al. [40] gives an algorithm for fully dynamic metric matching. All theseworks feature a deterministic algorithm which works against adaptive adversaries in trees but then useFRT to obtain a randomized algorithm for general graph, which unsurprisingly only works against obliviousadversaries. The work on the reordering buﬀer problem seems especially promising since the algorithm fortrees is quite similar in spirit to our water-ﬁlling algorithm for partial group Steiner tree. We believe that thenatural generalization of this water-ﬁlling algorithm to copy tree embeddings should work and generalize thedeterministic algorithm from trees to general graphs. While there has been follow-up work on this problemwhich does not use FRT for this problem [48] this would still improve the known bounds for this problemfor some parameter settings.Lastly, a recent work of Bartal et al. [14] gave online embeddings for network design with logarithmicapproximation guarantees in the number of terminals rather than n . 
It would be exciting to marry these ideas with the ones presented here to get the best of both worlds: a deterministic online copy tree embedding with distortion as a function of the number of terminals.

References

[1] Ittai Abraham and Ofer Neiman. Using petal-decompositions to build a low stretch spanning tree. In Annual ACM Symposium on Theory of Computing (STOC), pages 395–406, 2012.

[2] Ittai Abraham, Yair Bartal, and Ofer Neiman. Nearly tight low stretch spanning trees. In Symposium on Foundations of Computer Science (FOCS), pages 781–790. IEEE, 2008.

[3] Ittai Abraham, Shiri Chechik, Michael Elkin, Arnold Filtser, and Ofer Neiman. Ramsey spanning trees and their applications. In Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1650–1664. SIAM, 2018.

[4] Ajit Agrawal, Philip Klein, and Ramamoorthi Ravi. When trees collide: An approximation algorithm for the generalized Steiner problem on networks. SIAM Journal on Computing, 24(3):440–456, 1995.

[5] Noga Alon, Richard M Karp, David Peleg, and Douglas West. A graph-theoretic game and its application to the k-server problem. SIAM Journal on Computing, 24(1):78–100, 1995.

[6] Noga Alon, Baruch Awerbuch, and Yossi Azar. The online set cover problem. In Annual ACM Symposium on Theory of Computing (STOC), pages 100–105, 2003.

[7] Noga Alon, Baruch Awerbuch, Yossi Azar, Niv Buchbinder, and Joseph Naor. A general approach to online network optimization problems. ACM Transactions on Algorithms (TALG), 2(4):640–660, 2006.

[8] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1):121–164, 2012.

[9] Baruch Awerbuch and Yossi Azar. Buy-at-bulk network design. In Symposium on Foundations of Computer Science (FOCS), pages 542–547. IEEE, 1997.

[10] Nikhil Bansal, Niv Buchbinder, Aleksander Madry, and Joseph Naor. A polylogarithmic-competitive algorithm for the k-server problem. In Symposium on Foundations of Computer Science (FOCS), pages 267–276. IEEE, 2011.

[11] Yair Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In Symposium on Foundations of Computer Science (FOCS), pages 184–193. IEEE, 1996.

[12] Yair Bartal, Avrim Blum, Carl Burch, and Andrew Tomkins. A polylog(n)-competitive algorithm for metrical task systems. In Annual ACM Symposium on Theory of Computing (STOC), pages 711–719, 1997.

[13] Yair Bartal, Nova Fandina, and Ofer Neiman. Covering metric spaces by few trees. In International Colloquium on Automata, Languages and Programming (ICALP), 2019.

[14] Yair Bartal, Nova Fandina, and Seeun William Umboh. Online probabilistic metric embedding: a general framework for bypassing inherent bounds. In Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1538–1557. SIAM, 2020.

[15] Marcin Bienkowski, Björn Feldkord, and Paweł Schmidt. A nearly optimal deterministic online algorithm for non-metric facility location. arXiv preprint arXiv:2007.07025, 2020.

[16] Guy E. Blelloch, Yan Gu, and Yihan Sun. Efficient construction of probabilistic tree embeddings. In International Colloquium on Automata, Languages and Programming (ICALP), volume 80, pages 26:1–26:14, 2017.

[17] Niv Buchbinder and Joseph Naor. The design of competitive online algorithms via a primal-dual approach. Now Publishers Inc, 2009.

[18] Moses Charikar, Chandra Chekuri, Ashish Goel, and Sudipto Guha. Rounding via trees: deterministic approximation algorithms for group Steiner trees and k-median. In Annual ACM Symposium on Theory of Computing (STOC), pages 114–123, 1998.

[19] Moses Charikar, Chandra Chekuri, Ashish Goel, Sudipto Guha, and Serge Plotkin. Approximating a finite metric by a small number of tree metrics. In Symposium on Foundations of Computer Science (FOCS), pages 379–388. IEEE, 1998.

[20] Shiri Chechik and Tianyi Zhang. Dynamic low-stretch spanning trees in subpolynomial time. In Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 463–475. SIAM, 2020.

[21] Chandra Chekuri, Guy Even, and Guy Kortsarz. A greedy approximation algorithm for the group Steiner problem. Discrete Applied Mathematics, 154(1):15–34, 2006.

[22] Chandra Chekuri, Mohammad Taghi Hajiaghayi, Guy Kortsarz, and Mohammad R Salavatipour. Approximation algorithms for non-uniform buy-at-bulk network design. In Symposium on Foundations of Computer Science (FOCS), pages 677–686, 2006.

[23] Chandra Chekuri, Guy Even, Anupam Gupta, and Danny Segev. Set connectivity problems in undirected graphs and the directed Steiner network problem. ACM Transactions on Algorithms (TALG), 7(2):1–17, 2011.

[24] Erik D Demaine, MohammadTaghi Hajiaghayi, and Philip N Klein. Node-weighted Steiner tree and group Steiner tree in planar graphs. In International Colloquium on Automata, Languages and Programming (ICALP), pages 328–340. Springer, 2009.

[25] Kedar Dhamdhere, Vineet Goyal, R Ravi, and Mohit Singh. How to pay, come what may: Approximation algorithms for demand-robust covering problems. In Symposium on Foundations of Computer Science (FOCS), pages 367–376. IEEE, 2005.

[26] Michael Elkin, Yuval Emek, Daniel A Spielman, and Shang-Hua Teng. Lower-stretch spanning trees. SIAM Journal on Computing, 38(2):608–628, 2008.

[27] Matthias Englert and Harald Räcke. Reordering buffers with logarithmic diameter dependency for trees. In Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1224–1234. SIAM, 2017.

[28] Matthias Englert, Harald Räcke, and Matthias Westermann. Reordering buffers for general metric spaces. In Annual ACM Symposium on Theory of Computing (STOC), 2007.

[29] Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. A tight bound on approximating arbitrary metrics by tree metrics. Journal of Computer and System Sciences, 69(3):485–497, 2004.

[30] Uriel Feige, Kamal Jain, Mohammad Mahdian, and Vahab Mirrokni. Robust combinatorial optimization with exponential scenarios. In Conference on Integer Programming and Combinatorial Optimization (IPCO), pages 439–453. Springer, 2007.

[31] Amos Fiat and Manor Mendel. Better algorithms for unfair metrical task systems and applications. SIAM Journal on Computing, 32(6):1403–1422, 2003.

[32] Arnold Filtser. Clan embeddings into trees, and low treewidth graphs. arXiv preprint arXiv:2101.01146, 2021.

[33] Sebastian Forster, Gramoz Goranci, and Monika Henzinger. Dynamic maintenance of low-stretch probabilistic tree embeddings with applications. arXiv preprint arXiv:2004.10319, 2020.

[34] Naveen Garg, Goran Konjevod, and R Ravi. A polylogarithmic approximation algorithm for the group Steiner tree problem. Journal of Algorithms, 37(1):66–84, 2000.

[35] Daniel Golovin, Vineet Goyal, and R Ravi. Pay today for a rainy day: improved approximation algorithms for demand-robust min-cut and shortest path problems. In International Symposium on Theoretical Aspects of Computer Science (STACS), pages 206–217. Springer, 2006.

[36] Xiangyu Guo, Janardhan Kulkarni, Shi Li, and Jiayi Xian. On the facility location problem in online and dynamic models. In International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.

[37] Anupam Gupta, Mohammad T Hajiaghayi, and Harald Räcke. Oblivious network design. In Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 970–979, 2006.

[38] Anupam Gupta, Viswanath Nagarajan, and Ramamoorthi Ravi. Thresholded covering algorithms for robust and max-min optimization. In International Colloquium on Automata, Languages and Programming (ICALP), pages 262–274. Springer, 2010.

[39] Anupam Gupta, Viswanath Nagarajan, and R Ravi. Robust and maxmin optimization under matroid and knapsack uncertainty sets. ACM Transactions on Algorithms (TALG), 12(1):1–21, 2015.

[40] Varun Gupta, Ravishankar Krishnaswamy, and Sai Sandeep. Permutation strikes back: The power of recourse in online metric matching. In International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), 2020.

[41] Bernhard Haeupler, D Ellis Hershkowitz, and Goran Zuzic. Tree embeddings for hop-constrained network design. In Annual ACM Symposium on Theory of Computing (STOC), 2021.

[42] D Ellis Hershkowitz, R Ravi, and Sahil Singla. Prepare for the expected worst: Algorithms for reconfigurable resources under uncertainty. In International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), 2019.

[43] Makoto Imase and Bernard M Waxman. Dynamic Steiner tree problem. SIAM Journal on Discrete Mathematics, 4(3):369–384, 1991.

[44] Richard M Karp. A 2k-competitive algorithm for the circle. Manuscript, August, 5, 1989.

[45] Adam Kasperski and Paweł Zieliński. On the approximability of robust spanning tree problems. Theoretical Computer Science, 412(4-5):365–374, 2011.

[46] Maleq Khan, Fabian Kuhn, Dahlia Malkhi, Gopal Pandurangan, and Kunal Talwar. Efficient distributed approximation algorithms via probabilistic tree embeddings. Distributed Computing, 25(3):189–205, 2012.

[47] Rohit Khandekar, Guy Kortsarz, Vahab Mirrokni, and Mohammad R Salavatipour. Two-stage robust network design with exponential scenarios. In Annual European Symposium on Algorithms (ESA), pages 589–600. Springer, 2008.

[48] Matthias Kohler and Harald Räcke. Reordering buffer management with a logarithmic guarantee in general metric spaces. In International Colloquium on Automata, Languages and Programming (ICALP). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2017.

[49] Goran Konjevod, R Ravi, and F Sibel Salman. On approximating planar metrics by tree metrics. Information Processing Letters (IPL), 80(4):213–219, 2001.

[50] Ioannis Koutis, Gary L Miller, and Richard Peng. A nearly-m log n time solver for SDD linear systems. In Symposium on Foundations of Computer Science (FOCS), pages 590–598. IEEE, 2011.

[51] Manor Mendel and Assaf Naor. Ramsey partitions and proximity data structures. In Symposium on Foundations of Computer Science (FOCS), pages 109–118. IEEE, 2006.

[52] Assaf Naor and Terence Tao. Scale-oblivious metric fragmentation and the nonlinear Dvoretzky theorem. Israel Journal of Mathematics, 192(1):489–504, 2012.

[53] Joseph Naor, Debmalya Panigrahi, and Mohit Singh. Online node-weighted Steiner tree and related problems. In Symposium on Foundations of Computer Science (FOCS), pages 210–219. IEEE, 2011.

[54] Harald Räcke. Minimizing congestion in general networks. In Symposium on Foundations of Computer Science (FOCS), pages 43–52. IEEE, 2002.

[55] Gabriele Reich and Peter Widmayer. Beyond Steiner's problem: A VLSI oriented generalization. In International Workshop on Graph-theoretic Concepts in Computer Science, pages 196–210. Springer, 1989.

A Deferred Proofs

Theorem 28. If there exist:

1. A poly-time deterministic algorithm to compute an efficient, well-separated $\alpha$-approximate copy tree embedding with copy number $\chi$; and
2. A poly-time $f(n, N, k)$-competitive deterministic algorithm for online group Steiner forest on well-separated trees

then there exists an $(\alpha \cdot f(\chi n, \chi N, k))$-competitive deterministic algorithm for group Steiner forest (on general graphs).

Proof. We will use our copy tree embedding to produce a single tree on which we must solve deterministic online group Steiner forest.

In particular, consider an instance of online group Steiner forest on a weighted graph $G = (V, E, w)$. We first compute a copy tree embedding $(T, \phi, \pi_{G \to T}, \pi_{T \to G})$ deterministically with respect to $G$ and an arbitrary root $r \in V$, which is possible by assumption. Next, given an instance $I_t$ of group Steiner forest on $G$ with pairs $(S_1, T_1), \ldots, (S_t, T_t)$, we let $I'_t$ be the instance of group Steiner forest on $T$ with pairs $(\phi(S_1), \phi(T_1)), \ldots, (\phi(S_t), \phi(T_t))$, where we have used the notation $\phi(W) := \bigcup_{v \in W} \phi(v)$ for $W \subseteq V$. If the adversary has required that we solve instance $I_t$ in time step $t$, then we require that our deterministic algorithm for online group Steiner forest on trees solves $I'_t$ in time step $t$, and we let $H'_t$ be the solution returned by our algorithm for $I'_t$. Lastly, we return as our solution for $I_t$ in time step $t$ the set $H_t := \pi_{T \to G}(H'_t)$.

Let us verify that the resulting algorithm is indeed feasible and of the appropriate cost.

First, $H_t \subseteq H_{t+1}$ for every $t$: we have $H'_t \subseteq H'_{t+1}$ because our algorithm for trees returns a feasible solution for its online problem, and $\pi_{T \to G}$ is monotone by definition of a copy tree embedding. Moreover, we claim that $H_t$ connects at least one vertex from $S_i$ to at least one vertex from $T_i$ for every $i \leq t$ and every $t$. To see this, notice that $H'_t$ connects at least one vertex from $\phi(S_i)$ to some vertex in $\phi(T_i)$ since it is a feasible solution for $I'_t$, and so at least one copy of a vertex in $S_i$ is connected to at least one copy of a vertex in $T_i$; by the connectivity preservation properties of a copy tree embedding it follows that at least one vertex from $S_i$ is connected to at least one vertex from $T_i$. Thus, our solution is indeed feasible in each time step.

Next, we verify the cost of our solution. Let $\mathrm{OPT}'_t$ be the cost of the optimal solution to $I'_t$, let $n'$ be the number of vertices in $T$ and let $N'$ be the maximum size of a set in a pair in $I'_t$ for any $t$. By our assumption on the cost of the algorithm we run on $T$, and since $n' \leq \chi n$ and $N' \leq \chi N$ by definition of copy number, we know that

$w_T(H'_t) \leq \mathrm{OPT}'_t \cdot f(n', N', k) \leq \mathrm{OPT}'_t \cdot f(\chi n, \chi N, k)$,

where the last step takes $f$ to be non-decreasing in its first two arguments.

Next, let $H^*_t$ be the optimal solution to $I_t$. We claim that $\pi_{G \to T}(H^*_t)$ is feasible for $I'_t$. This follows because $H^*_t$ connects a vertex from $S_i$ to $T_i$ for every $i \leq t$, and so by the connectivity preservation property of copy tree embeddings we know that some vertex from $\phi(S_i)$ is connected to some vertex of $\phi(T_i)$ for every $i \leq t$ in $\pi_{G \to T}(H^*_t)$. Applying this feasibility of $\pi_{G \to T}(H^*_t)$ and the cost preservation property of our copy tree embedding, it follows that

$\mathrm{OPT}'_t \leq w_T(\pi_{G \to T}(H^*_t)) \leq \alpha \cdot w_G(H^*_t) = \alpha \cdot \mathrm{OPT}_t$.

Similarly, we know by the cost preservation property of our copy tree embedding that $w_G(\pi_{T \to G}(H'_t)) \leq w_T(H'_t)$. Combining these observations we have

$w_G(\pi_{T \to G}(H'_t)) \leq w_T(H'_t) \leq \mathrm{OPT}'_t \cdot f(\chi n, \chi N, k) \leq \mathrm{OPT}_t \cdot \alpha \cdot f(\chi n, \chi N, k)$,