Clan Embeddings into Trees, and Low Treewidth Graphs

Abstract

In low distortion metric embeddings, the goal is to embed a host "hard" metric space into a "simpler" target space while approximately preserving pairwise distances. A highly desirable target space is that of a tree metric. Unfortunately, such an embedding can incur a huge distortion. A celebrated bypass to this problem is stochastic embedding with logarithmic expected distortion. Another bypass is Ramsey-type embedding, where the distortion guarantee applies only to a subset of the points. However, both these solutions fail to provide an embedding into a single tree with a worst-case distortion guarantee on all pairs. In this paper, we propose a novel third bypass called \emph{clan embedding}. Here each point x is mapped to a subset of points f(x), called a \emph{clan}, with a special \emph{chief} point \chi(x)\in f(x). The clan embedding has multiplicative distortion t if for every pair (x,y) some copy y'\in f(y) in the clan of y is close to the chief of x: \min_{y'\in f(y)}d(y',\chi(x))\le t\cdot d(x,y). Our first result is a clan embedding into a tree with multiplicative distortion O(\frac{\log n}{\epsilon}) such that each point has 1+\epsilon copies (in expectation). In addition, we provide a "spanning" version of this theorem for graphs and use it to devise the first compact routing scheme with constant size routing tables. We then focus on minor-free graphs of diameter parameterized by D, which were known to be stochastically embeddable into bounded treewidth graphs with expected additive distortion \epsilon D. We devise Ramsey-type embedding and clan embedding analogs of the stochastic embedding. We use these embeddings to construct the first (bicriteria quasi-polynomial time) approximation scheme for the metric \rho-dominating set and metric \rho-independent set problems in minor-free graphs.


Arnold Filtser ∗, Columbia University. Email: [email protected]

∗ The research was supported by the Simons Foundation.

Introduction

Low distortion metric embeddings provide a powerful algorithmic toolkit, with applications ranging from approximation/sublinear/online/distributed algorithms [LLR95, AMS99, BCL+18, KKM+…] […] an embedding $f$ from a metric space $(X, d_X)$ to a metric space $(Y, d_Y)$ has multiplicative distortion $t$ if for every pair of points $u, v \in X$ it holds that $d_X(u,v) \le d_Y(f(u), f(v)) \le t \cdot d_X(u,v)$. A typical application of metric embeddings follows these lines: take some instance of a problem in a “hard” metric space $(X, d_X)$. Embed $X$ into a “simple” metric space $(Y, d_Y)$ via a low-distortion metric embedding $f$. Solve the problem in $Y$, and “pull back” the solution to $X$. Thus the objectives are low distortion and a “simple” target space.

Simple target spaces that immediately come to mind are Euclidean space and tree metric, or even better, an ultrametric. In a celebrated result, Bourgain [Bou85] showed that every $n$-point metric space embeds into Euclidean space with multiplicative distortion $O(\log n)$ (which is tight [LLR95]). On the other hand, any embedding of the $n$-vertex cycle graph $C_n$ into a tree metric will incur multiplicative distortion $\Omega(n)$ [RR98]. Karp [Kar89] observed that deleting a random edge from $C_n$ results in an embedding into a line with expected distortion 2 (see Figure 1(a)). This idea was developed by Bartal [Bar96, Bar98] (improving over [AKPW95]), culminating in the celebrated work of Fakcharoenphol, Rao, and Talwar [FRT04] (see also [Bar04]), who showed that every $n$-point metric space stochastically embeds into trees (actually ultrametrics) with expected multiplicative distortion $O(\log n)$. Specifically, there is a distribution $\mathcal{D}$ over dominating metric embeddings into trees (ultrametrics), such that $\forall u, v \in X$, $\mathbb{E}_{(f,T)\sim\mathcal{D}}[d_T(f(u), f(v))] \le O(\log n) \cdot d_X(u,v)$. The $O(\log n)$ multiplicative distortion is known to be optimal [Bar96]. Stochastic embeddings into trees are widely successful and have found numerous applications (see e.g. [Ind01]).

In many applications of metric embeddings a worst case distortion guarantee is required.
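Karp's observation on the cycle can be checked numerically. The following is a minimal Python sketch, assuming the vertices of $C_n$ are labeled $0,\dots,n-1$ with unit-length edges; the helper names `cycle_dist`, `path_dist` and `expected_path_dist` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def cycle_dist(n, a, b):
    """Shortest-path distance between vertices a and b on the unweighted cycle C_n."""
    d = abs(a - b) % n
    return min(d, n - d)


def path_dist(n, i, a, b):
    """Distance between a and b after deleting the edge {i, (i+1) mod n} from C_n.

    The remaining graph is the path i+1, i+2, ..., i (indices mod n), so each
    vertex is identified with its position along that path.
    """
    pos_a = (a - (i + 1)) % n
    pos_b = (b - (i + 1)) % n
    return abs(pos_a - pos_b)


def expected_path_dist(n, a, b):
    """Exact expectation of the path distance over a uniformly random deleted edge."""
    return Fraction(sum(path_dist(n, i, a, b) for i in range(n)), n)


n = 10
# Neighboring vertices: E[d_T] = (n-1)/n * 1 + 1/n * (n-1) = 2(n-1)/n.
print(expected_path_dist(n, 3, 4))  # 9/5
```

For a pair at cycle distance $d$, the same computation gives expected path distance $2d(n-d)/n < 2d$, while every single path in the support still stretches some neighboring pair by a factor of $n-1$.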
A different type of compromise (compared to expected distortion) is provided by Ramsey type embeddings. The classical Ramsey problem for metric spaces was introduced by Bourgain et al. [BFM86], and is concerned with finding “nice” structures in arbitrary metric spaces. Following [BBM06, BLMN05a], Mendel and Naor [MN07] showed that for every integer parameter $k \ge 1$, every $n$-point metric $(X, d)$ has a subset $M \subseteq X$ of size at least $n^{1-1/k}$ that embeds into a tree (ultrametric) with multiplicative distortion $O(k)$ (see [NT12, BGS16, ACE+20] for improvements). In fact, the embedding has multiplicative distortion $O(k)$ for any pair in $M \times X$. We say that the vertices in $M$ are satisfied (see Figure 1(b) for an illustration). As a corollary, every $n$-point metric space $(X, d_X)$ admits a collection $\mathcal{T}$ of $k \cdot n^{1/k}$ dominating trees over $X$, and a mapping $\mathrm{home}: X \to \mathcal{T}$, such that for every $x, y \in X$ it holds that $d_{\mathrm{home}(x)}(x,y) \le O(k) \cdot d_X(x,y)$. These are called Ramsey trees, and they have found applications to online algorithms [BBM06], approximate distance oracles [MN07, Che15], and routing [ACE+20].

New type of embedding: clan embedding

Recall that our initial goal was to embed a general metric space into a “simple” target space, specifically a tree metric. A drawback of both the stochastic embedding and the Ramsey type embedding is that the embedding is actually into a collection of trees rather than into a single one; thus the target space is not as simple as one might desire. Each embedding type makes a different type of compromise: the distortion guarantee in stochastic embedding is only in expectation, while in Ramsey type embedding only a subset of the vertices enjoys a bounded distortion guarantee. In this paper we propose a novel type of compromise, which we call clan embedding. Here we have a single embedding with a worst case guarantee on all vertex pairs. The caveat is that each vertex might be mapped to multiple copies. This violates the classical paradigm of having a one-to-one correspondence between the source and target spaces. However, we obtain a map into a single tree with a worst case guarantee, which is beneficial and opens a new array of possibilities.

An ultrametric is a metric space satisfying a strong form of the triangle inequality: $d(x,z) \le \max\{d(x,y), d(y,z)\}$ (for all $x, y, z$). Ultrametrics embed isometrically into both Euclidean space [Lem03] and tree metric. See Definition 1.

A metric embedding $f: X \to Y$ is dominating if $\forall u, v \in X$, $d_X(u,v) \le d_Y(f(u), f(v))$.

Figure 1: Three different types of embeddings of the cycle graph $C_n$ into a tree. (a) On the left, a stochastic embedding created by deleting an edge $\{v_i, v_{i+1}\}$ uniformly at random. The expected multiplicative distortion of a pair of neighboring vertices $v_j, v_{j+1}$ is $\mathbb{E}[d_T(v_j, v_{j+1})] = \frac{n-1}{n} \cdot 1 + \frac{1}{n} \cdot (n-1) = \frac{2(n-1)}{n} < 2$. (b) In the middle, a Ramsey type embedding: an arbitrary edge $\{v_i, v_{i+1}\}$ is deleted. The vertices in the subset $M$ (on the thick red line), which constitutes a $(1-2\epsilon)$ fraction of the vertex set, are satisfied. That is, they suffer multiplicative distortion at most $\frac{1}{\epsilon}$ w.r.t. any other vertex. (c) On the right, a clan embedding, where $i$ is chosen uniformly at random. The chief of a vertex $v_j$ is denoted $\tilde{v}_j$. Each vertex $v_j \in \{v_{i+1-\epsilon n}, \dots, v_{i+\epsilon n}\}$ has an additional copy $v'_j$; thus the probability that a vertex has two copies is $2\epsilon$, implying $\mathbb{E}[|f(v_a)|] = 1 + 2\epsilon$. The distortion is $\min\{d(\tilde{v}_a, \tilde{v}_b), d(v'_a, \tilde{v}_b)\} \le \frac{1}{\epsilon} \cdot d_{C_n}(v_a, v_b)$.

A one-to-many embedding $f: X \to 2^Y$ maps each point $x$ into a subset $f(x) \subset Y$ called the clan of $x$. Each vertex $x' \in f(x)$ is called a copy of $x$ (see Definition 2).
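The cycle example of Figure 1(c) can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not the paper's construction for general metrics: vertices of $C_n$ are labeled $0,\dots,n-1$, the parameter `m` plays the role of $\epsilon n$ (an integer with $1 \le m \le n/2$), the target tree is a path with integer coordinates, and the function names are hypothetical. Each vertex has a special copy (its chief), and the quantity measured is the distance from the chief of one vertex to the nearest copy of the other.

```python
def cycle_dist(n, a, b):
    """Shortest-path distance between vertices a and b on the unweighted cycle C_n."""
    d = abs(a - b) % n
    return min(d, n - d)


def cycle_clan_embedding(n, m, i):
    """Clan embedding of C_n into a path, following Figure 1(c) with m = eps * n.

    Returns a dict mapping each vertex v to (chief_position, all_positions),
    where positions are integer coordinates on a path with unit-length edges.
    """
    clans = {}
    for v in range(n):
        chief = (v - (i + 1)) % n      # chiefs occupy positions 0 .. n-1
        copies = [chief]
        if chief >= n - m:             # v in {v_{i+1-m}, ..., v_i}: extra copy on the left
            copies.append(chief - n)
        if chief < m:                  # v in {v_{i+1}, ..., v_{i+m}}: extra copy on the right
            copies.append(chief + n)
        clans[v] = (chief, copies)
    return clans


def worst_distortion(n, m, i):
    """Max over ordered pairs (x, y) of min_{x' in f(x)} d(x', chief(y)) / d(x, y)."""
    clans = cycle_clan_embedding(n, m, i)
    worst = 0.0
    for x in range(n):
        for y in range(n):
            if x == y:
                continue
            best = min(abs(p - clans[y][0]) for p in clans[x][1])
            worst = max(worst, best / cycle_dist(n, x, y))
    return worst


n, m = 20, 2                           # eps = m / n = 0.1
clans = cycle_clan_embedding(n, m, i=7)
total_copies = sum(len(copies) for _, copies in clans.values())
print(total_copies == n + 2 * m)       # exactly 2m vertices receive a second copy
print(worst_distortion(n, m, i=7) <= n / m)
```

Since the chief position of a fixed vertex is uniform over $0,\dots,n-1$ when $i$ is uniform, a vertex has two copies with probability $2m/n = 2\epsilon$, matching the caption of Figure 1(c).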
A clan embedding is a pair $(f, \chi)$, where $f$ is a one-to-many embedding, and $\chi: X \to Y$ designates for each clan $f(x)$ a special vertex $\chi(x) \in f(x)$ called the chief. Clan embeddings are dominating, that is, for every $x, y \in X$ the distance between every two copies is at least the original distance: $\min_{x' \in f(x), y' \in f(y)} d_Y(x', y') \ge d_X(x,y)$. $(f, \chi)$ has multiplicative distortion $t$ if for every $x, y \in X$ some vertex in the clan of $x$ is close to the chief of $y$: $\min_{x' \in f(x)} d_Y(x', \chi(y)) \le t \cdot d_X(x,y)$ (see Definition 3). See Figure 1(c) for an illustration.

Clan embeddings into trees

One can easily construct an isometric clan embedding into a tree by allowing $n$ copies for each vertex. On the other hand, with a single copy per vertex a clan embedding becomes a classic embedding, which requires multiplicative distortion of $\Omega(n)$. Our goal is to construct a low distortion clan embedding while keeping the number of copies of each vertex as small as possible. To this end, we construct a distribution over clan embeddings, where all the embeddings in the support have a worst case distortion guarantee, while the expected number of copies of each vertex is bounded by a constant arbitrarily close to 1.

Theorem 1 (Clan embedding into ultrametric). Consider an $n$-point metric space $(X, d_X)$ and a parameter $\epsilon \in (0, 1]$. Then there is a distribution $\mathcal{D}$ over clan embeddings $(f, \chi)$ into ultrametrics with multiplicative distortion $O(\frac{\log n}{\epsilon})$, such that for every point $x \in X$, $\mathbb{E}_{f \sim \mathcal{D}}[|f(x)|] \le 1 + \epsilon$. In addition, for every $k \in \mathbb{N}$, there is a distribution $\mathcal{D}$ over clan embeddings $(f, \chi)$ into ultrametrics with multiplicative distortion $16k$ such that for every point $x \in X$, $\mathbb{E}_{f \sim \mathcal{D}}[|f(x)|] = O(n^{1/k})$.

Our clan embedding into ultrametric is asymptotically tight (up to a constant factor in the distortion), and cannot be improved even if we embed into a general tree (rather than into the much more restricted structure of an ultrametric). Additionally, our lower bound implies that the ultra-sparse spanner construction of Elkin and Neiman [EN19] is asymptotically tight ([EN19] constructed a spanner with stretch $O(\frac{\log n}{\epsilon})$ and $(1+\epsilon)n$ edges; see Remark 1 for further details).

Theorem 2 (Lower bound for clan embedding into a tree). For every fixed $\epsilon \in (0, 1)$ and large enough $n$, there is an $n$-point metric space $(X, d_X)$ such that for every clan embedding $(f, \chi)$ of $X$ into a tree with multiplicative distortion $O(\frac{\log n}{\epsilon})$ it holds that $\sum_{x \in X} |f(x)| \ge (1+\epsilon)n$. Further, for every $k \in \mathbb{N}$, there is an $n$-point metric space $(X, d_X)$ such that for every clan embedding $(f, \chi)$ of $X$ into a tree with multiplicative distortion $O(k)$ it holds that $\sum_{x \in X} |f(x)| \ge \Omega(n^{1+1/k})$.

Often, we are given a weighted graph $G = (V, E, w)$, and the goal is to embed the shortest path metric $d_G$ of the graph into a tree $T$. However, if for example one is required to construct a network while using only pre-existing edges from $E$, it is desirable that the tree $T$ be a subgraph of $G$, also called a spanning tree. Abraham and Neiman [AN19] (improving over [EEST08]) constructed a stochastic embedding of general graphs into spanning trees with expected distortion $O(\log n \log\log n)$ (losing a $\log\log n$ factor compared to general trees [FRT04]). Later, Abraham et al. [ACE+20] constructed Ramsey spanning trees, showing that for every $k \in \mathbb{N}$, every graph can be embedded into a spanning tree with a subset $M$ of at least $n^{1-1/k}$ satisfied vertices, suffering distortion at most $O(k \log\log n)$ w.r.t. any other vertex (again losing a $\log\log n$ factor compared with general trees). Here we provide a “spanning” analog of Theorem 1. Similarly to [AN19, ACE+20], we lose a $\log\log n$ factor compared to general trees (see the introduction to Section 4 for further discussion). In particular, by Theorem 2 our spanning clan embedding is optimal up to second order terms. As an application, we construct the first compact routing scheme with routing tables of constant size (in expectation, see Section 1.1.1). We say that a clan embedding $(f, \chi)$ of a graph $G$ into a graph $H$ is spanning if $f(V(G)) = V(H)$ (i.e., every vertex in $H$ is an image of a vertex in $G$), and for every edge $\{v', u'\} \in E(H)$ where $v' \in f(v)$, $u' \in f(u)$ it holds that $\{v, u\} \in E(G)$ (see Definitions 2 and 3).

Theorem 3 (Spanning clan embedding into trees). Consider an $n$-vertex weighted graph $G = (V, E, w)$ and a parameter $\epsilon \in (0, 1]$. Then there is a distribution $\mathcal{D}$ over spanning clan embeddings $(f, \chi)$ into trees with multiplicative distortion $O(\frac{\log n \log\log n}{\epsilon})$, such that for every vertex $v \in V$, $\mathbb{E}_{f \sim \mathcal{D}}[|f(v)|] \le 1 + \epsilon$. In addition, for every $k \in \mathbb{N}$, there is a distribution $\mathcal{D}$ over spanning clan embeddings $(f, \chi)$ into trees with multiplicative distortion $O(k \log\log n)$, where for every vertex $v \in V$, $\mathbb{E}_{f \sim \mathcal{D}}[|f(v)|] = O(n^{1/k})$.

Clan embedding from minor-free graphs to bounded treewidth graphs

As [Bou85] and[FRT04] are tight, a natural question arises: by embedding from simpler space (than general n -pointmetric space) into a richer space (than trees), could the distortion be reduced? The family of lowtreewidth graphs is an excellent candidate for a target space: it is much more expressive space target3han trees, while many hard problems remain tractable. Unfortunately, as implied by the work ofChakrabarti et al. [CJLV08] (see also [CG04]), there are n vertex planar graphs such that every(stochastic) embedding into o (√ n ) -treewidth graphs must incur expected multiplicative distortionΩ ( log n ) . Bypassing this roadblock, Fox-Epstein et al. [FKS19] (improving over [EKM14]), showedhow to embed planar metrics into bounded treewidth graphs while incurring only a small additive distortion. Speciﬁcally, given a planar graph G and a parameter (cid:15) , they constructed a deterministicdominating embedding f into a graph H of treewidth poly ( (cid:15) ) , such that ∀ u, v ∈ G , d H ( f ( u ) , f ( v )) ≤ d G ( u, v )+ (cid:15)D , where D is the diameter of G . While at ﬁrst impression (cid:15)D is a crude additive bound,it is actually was used to obtain approximation schemes for some classic problems: k -center, vehiclerouting, metric ρ -dominating set, and metric ρ -isolated set.Following the success in planar graphs, Cohen-Addad et al. [CFKL20] wanted to generalizeto minor free graphs. Unfortunately, they showed that already obtaining additive distortion D for K -free graphs requires embedding into treewidth Ω (√ n ) graphs. Inspired by the case oftrees, [CFKL20] bypass this barrier by constructing a stochastic embedding from K r -free n -vertexgraphs into distribution over treewidth O r ( log n(cid:15) ) graphs with expected additive distortion (cid:15)D , that is ∀ u, v ∈ G , E ( f,H )∼D [ d H ( f ( u ) , f ( v ))] ≤ d G ( u, v ) + (cid:15)D . Similarly to the case in planar graphs,Cohen-Addad et al. 
[CFKL20] used their embedding to construct an approximation scheme for thecapacitated vehicle routing problem in K r -minor-free graphs. However, due to the stochastic natureof the embedding, it was not strong enough to imply any results for the metric ρ -dominating/isolatedproblems in minor free graphs, which remain wide open.In this paper, similarly to the case of trees, we construct Ramsey type and clan embeddinganalogs to the stochastic embedding of [CFKL20]. Our Ramsey type embedding bypasses the lowerbound of Ω (√ n ) from [CFKL20] while guaranteeing worst case distortion (for a large random subsetof vertices). As an application, we obtain a bicriteria quasi polynomial approximation scheme(QPTAS) for the metric ρ -independent set problem in minor free graphs (see Section 1.1.2). Theorem 4 (Ramsey type embedding for minor free graphs) . Given a K r -free n -vertex graph G = ( V, E, w ) with diameter D , and parameters (cid:15) ∈ ( , ) , δ ∈ ( , ) , there is a distribution overdominating embeddings g ∶ G → H , into graphs of treewidth O h ( log n(cid:15)δ ) , such that there is a subset M ⊆ V of vertices for which the following holds:1. For every u ∈ V , Pr [ u ∈ M ] ≥ − δ .2. For every u ∈ M and v ∈ V , d H ( g ( u ) , g ( v )) ≤ d G ( u, v ) + (cid:15)D . By setting δ = and repeating log n times, a straightforward corollary is the following. Corollary 1.

Given a K r -free n -vertex graph G = ( V, E, w ) with diameter D , and parameter (cid:15) ∈ ( , ) , there are log n dominating embeddings g , . . . , g log n into graphs of treewidth O h ( log n(cid:15) ) ,such that for every vertex v there is some embedding g i v , such that ∀ u ∈ V, d H iv ( g i v ( u ) , g i v ( v )) ≤ d G ( u, v ) + (cid:15)D . While Ramsey type embedding is suﬃcient for the metric ρ -independent set problem (as wecan restrict our search to independent sets in M ), we cannot use it for the metric ρ -dominating setproblem (as every good solution might contain vertices out of M ). We construct a clan embedding O r hides some function depending only on r . That is there is some function χ ∶ N → N such that O r ( x ) ≤ χ ( r ) ⋅ x .

4f minor free graphs into bounded treewidth graphs. As we have a worst case distortion guranteefor all vertex pairs, we obtain a QPTAS for the metric ρ -dominating set problem in minor freegraphs (see Section 1.1.2). Theorem 5 (Clan embedding for minor free graphs) . Consider a K r -free n -vertex graph G =( V, E, w ) of diameter D , and parameters (cid:15) ∈ ( , ) , δ ∈ ( , ) , there is a distribution D over clanembeddings ( f, χ ) with additive distortion (cid:15)D into graphs of treewidth O h ( log nδ(cid:15) ) , such that forevery v ∈ V , E [∣ f ( v )∣] ≤ + δ . A routing scheme in a network is a mechanism that allows packets to be delivered from any nodeto any other node. The network is represented as a weighted undirected graph, and each node canforward incoming data by using local information stored at the node, called a routing table , andthe (short) packet’s header . The routing scheme has two main phases: in the preprocessing phase,each node is assigned a routing table and a short label . In the routing phase, when a node receivesa packet, it should make a local decision, based on its own routing table and the packet’s header(which may contain the label of the destination, or a part of it), where to send the packet. The stretch of a routing scheme is the worst ratio between the length of a path on which a packet isrouted, to the shortest possible path.Compact routing schemes were extensively studied [PU89, ABLP90, AP92, Cow01, EGP03,TZ01, Che13, ACE + O ( n k ) table size, Awerbuchet al. [ABLP90] obtained stretch O ( k k ) , later is was improved to O ( k ) by Awerbuch andPeleg [AP92]. In their celebrated compact routing scheme, Thorup and Zwick [TZ01] obtainedstretch 4 k − O ( k ⋅ n / k ) size tables, and labels of size O ( k log n ) . The stretch wasimproved to roughly 3 . k by Chechik [Che13], using a scheme similar to [TZ01] (while keeping allother parameters in-tact). Recently, Abraham et al. [ACE +

20] devised a compact routing scheme (using Ramsey spanning trees) with labels of size $O(\log n)$, tables of size $O(k \cdot n^{1/k})$, and stretch $O(k \log\log n)$.

In all previous works the guarantees on the table size are worst case. That is, the table size of every node in the network is bounded by a certain parameter. Here our guarantee is only in expectation. Note that such an expected guarantee makes a lot of sense for a central planner constructing a routing scheme for a network, whose goal is to minimize the total amount of resources rather than the maximal amount of resources in a single spot. Even though previous works analyzed worst case guarantees, analyzing their expected bounds per vertex does not improve the guarantees. Our contribution is the following:

Theorem 6 (Compact routing scheme). Given a weighted graph $G = (V, E, w)$ on $n$ vertices and an integer parameter $k > 1$, there is a compact routing scheme with stretch $O(k \log\log n)$ that has (worst case) labels (and headers) of size $O(\log n)$, and the expected size of the routing table of each vertex is $O(n^{1/k})$.

See Table 1 for a comparison of our and previous results. We mainly focus on the very compact regime, where all the parameters are at most poly-logarithmic. A key result in [TZ01] is a stretch 1 routing scheme for the special case of a tree, where a routing table has constant size, and logarithmic label size (see Theorem 13). All the previous works are based on constructing a collection of trees. Specifically, in [TZ01, Che13] there are $n$ trees, where each vertex belongs to $O(\log n)$ trees, and for each pair of nodes there is a tree which guarantees small stretch. Routing is then done in that tree. This is the reason for their large label size of $\log^2 n$ (as a label consists of $\log n$ labels in different trees). [ACE+20] constructs $\log n$ (Ramsey spanning) trees in total, where each vertex $v$ has a home tree $T_v$, such that $v$ enjoys a small stretch w.r.t. any other vertex in $T_v$. The label then consists of the name of $T_v$ and the label of $v$ in $T_v$. However, the routing table is still somewhat large, as one needs to store the routing information in $\log n$ different trees.

In contrast, our construction is based on the spanning clan embedding $(f, \chi)$ of Theorem 3 into a single tree $T$, where the clan of each vertex consists of $O(1)$ copies (in expectation). The label of each vertex $v$ is simply the label of $\chi(v)$ in $T$. The routing table of $v$ consists of the routing tables of all the copies in $f(v)$.

Unless stated otherwise, we measure space in machine words; each word is $\Theta(\log n)$ bits.

| # | Routing scheme | Stretch | Label | Table |
|---|---|---|---|---|
| 1 | [TZ01] | $4k-5$ | $O(k \log n)$ | $O(k n^{1/k})$ |
| 2 | [Che13] | $3.68k$ | $O(k \log n)$ | $O(k n^{1/k})$ |
| 3 | [ACE+20] | $O(k \log\log n)$ | $O(\log n)$ | $O(k n^{1/k})$ |
| 4 | Thm. 6 | $O(k \log\log n)$ | $O(\log n)$ | $O(n^{1/k})$ |
| 5 | [TZ01] | $O(\log n)$ | $O(\log^2 n)$ | $O(\log n)$ |
| 6 | [Che13] | $O(\log n)$ | $O(\log^2 n)$ | $O(\log n)$ |
| 7 | [ACE+20] | $O(\log n)$ | $O(\log n)$ | $O(\log^2 n)$ |
| 8 | [ACE+20] | $\tilde{O}(\log n)$ | $O(\log n)$ | $O(\log n)$ |
| 9 | Thm. 6 | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
| 10 | Thm. 6 | $\tilde{O}(\log n)$ | $O(\log n)$ | $O(1)$ |

Table 1: The table compares various routing schemes for $n$-vertex graphs. In rows 1-4 we compare the different schemes in their full generality; here $k$ is an integer parameter. In rows 5, 6, 8, 10 we fix $k = \log n$, while in rows 7 and 9 we fix $k = \frac{\log n}{\log\log n}$. Note that our result in line 9 is superior to all previous results: it has reduced label size compared to lines 5-6, reduced table size compared to line 7, and reduced stretch compared to line 8. Our result in line 10 is the first one to obtain a constant table size. The size of the table and label is measured in words (each word is $O(\log n)$ bits). The header size is asymptotically equal to the label size in all the compared routing schemes. The main caveat is that while in all previous results the table size is analyzed w.r.t. a worst case guarantee, we only provide bounds in expectation. The label size (as well as the stretch) is a worst case guarantee in our work as well.

Baker [Bak94] introduced a “layering” technique in order to construct efficient polynomial approximation schemes (EPTAS) for many “local” problems in planar graphs such as minimum-measure dominating set (a subset $S$ of vertices of minimum measure such that each vertex is within a single hop from $S$) and maximum-measure independent set (a subset $S$ of vertices of maximum measure such that no two of them share an edge). The key observation was that planar graphs have the “bounded local treewidth” property. Baker showed that for some problems solvable on bounded treewidth graphs, one can construct efficient approximation schemes for graphs possessing the bounded local treewidth property. This approach was generalized by Demaine et al. [DHK05] to minor free graphs.

Eisenstat et al. [EKM14] proposed metric generalizations of Baker problems: minimum measure $\rho$-dominating set, and maximum measure $\rho$-independent set. Given a metric space $(X, d_X)$, a $\rho$-independent set is a subset $S \subseteq X$ of points such that for every two distinct points $x, y \in S$, $d_X(x,y) > \rho$. Similarly, a $\rho$-dominating set is a subset $S \subseteq X$ such that for every $x \in X$, there exists $y \in S$ such that $d_X(x,y) \le \rho$.
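The two definitions translate directly into feasibility checks. The following Python sketch is illustrative only: the metric is passed as a distance function, the toy line metric and the function names `is_rho_independent` and `is_rho_dominating` are assumptions made for this example, and no optimization over the measure is attempted.

```python
from itertools import combinations


def is_rho_independent(S, dist, rho):
    """True if every two distinct points of S are at distance strictly greater than rho."""
    return all(dist(x, y) > rho for x, y in combinations(S, 2))


def is_rho_dominating(S, points, dist, rho):
    """True if every point in `points` has some y in S with dist(x, y) <= rho."""
    return all(any(dist(x, y) <= rho for y in S) for x in points)


# Toy metric (an assumption for this example): the integers 0..9 on a line.
X = range(10)
line = lambda x, y: abs(x - y)
S = [0, 3, 6, 9]

print(is_rho_independent(S, line, rho=2))    # True: all pairwise distances are >= 3
print(is_rho_dominating(S, X, line, rho=1))  # True: every point is within 1 of S
print(is_rho_independent(S, line, rho=3))    # False: dist(0, 3) = 3 is not > 3
```

Passing a subset of the points as the second argument of `is_rho_dominating` checks domination of that subset only.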
A polynomial approximation scheme (PTAS) is an algorithm that for any $\epsilon \in (0, 1)$ provides a $(1+\epsilon)$-approximation in $n^{f(\epsilon)}$ time, where $n$ is the size of the input and $f$ is some function of $\epsilon$. An efficient polynomial approximation scheme (EPTAS) has running time $n^{O(1)} \cdot f(\epsilon)$. A quasi polynomial approximation scheme (QPTAS) has running time $2^{f(\epsilon) \cdot \mathrm{polylog}(n)}$.

Given a measure $\mu: X \to \mathbb{R}_+$, the goal in the metric $\rho$-dominating (resp. independent) set problem is to find a $\rho$-dominating (resp. independent) set of minimum (resp. maximum) measure. It is often the case that the problems are much easier under uniform measure. Sometimes, in addition, we are given a set of terminals $K \subseteq X$, and require only that the terminals be dominated ($\forall x \in K$, $\exists y \in S$ s.t. $d_X(x,y) \le \rho$). Note that the metric generalization of Baker problems in structured graphs (e.g. planar) is considerably harder than the original problems. This is because the graph describing dominance/independence relations no longer possesses the original structure (e.g. planarity).

An approximation scheme for the $\rho$-dominating (resp. independent) set problem returns a $\rho$-dominating (resp. independent) set $S$ such that for every $\rho$-dominating (resp. independent) set $S'$ it holds that $\mu(S) \le (1+\epsilon)\mu(S')$ (resp. $\mu(S) \ge (1-\epsilon)\mu(S')$). A bicriteria approximation scheme for the $\rho$-dominating (resp. independent) set problem returns a $(1+\epsilon)\rho$-dominating (resp. $(1-\epsilon)\rho$-independent) set $S$ such that for every $\rho$-dominating (resp. independent) set $S'$ it holds that $\mu(S) \le (1+\epsilon)\mu(S')$ (resp. $\mu(S) \ge (1-\epsilon)\mu(S')$).

For unweighted graphs with treewidth $\mathrm{tw}$, Borradaile and Le [BL16] provided an exact algorithm for the $\rho$-dominating set problem with $O((2\rho+1)^{\mathrm{tw}+1} n)$ running time (see also [DFHT05]). For general treewidth $\mathrm{tw}$ graphs, using a dynamic programming technique, Katsikarelis et al. [KLP19] constructed a fixed parameter tractable (FPT) approximation algorithm for the metric $\rho$-dominating set problem with $(\mathrm{tw}/\epsilon)^{O(\mathrm{tw})} \cdot \mathrm{poly}(n)$ runtime that returns a $(1+\epsilon)\rho$-dominating set $S$, such that for every $\rho$-dominating set $S'$ it holds that $\mu(S) \le \mu(S')$. A similar result was also obtained for the metric $\rho$-independent set problem [KLP20]. In particular, even for the very basic case of bounded treewidth graphs, no true approximation scheme (even with quasi-polynomial time) is known for these problems. Additional evidence was provided by Marx and Pilipczuk [MP15] (see also [FKS19]), who showed that the existence of an EPTAS for either the $\rho$-dominating or the $\rho$-independent set problem in planar graphs would refute the exponential-time hypothesis (ETH). Given this evidence, it is natural to settle for bicriteria approximation.

For unweighted planar graphs and constant $\rho$, there are linear time approximation schemes (not bicriteria) for the metric $\rho$-independent/dominating set problems [EILM16, DFHT05]. In weighted planar graphs, under uniform measure, Marx and Pilipczuk [MP15] gave an exact $n^{O(\sqrt{k})}$ time solution to both metric $\rho$-dominating/isolated set problems, provided that the solution is guaranteed to be of size at most $k$. Using their embedding of planar graphs into $\epsilon^{-O(1)} \log n$ treewidth graphs with additive distortion $\epsilon D$, Eisenstat et al. [EKM14] provided a bicriteria PTAS for both metric $\rho$-independent/dominating set problems in planar graphs. Later, using their improved embedding into $\epsilon^{-O(1)}$-treewidth graphs, Fox-Epstein et al. [FKS19] improved this to a bicriteria EPTAS.

Finally we turn to the most challenging case of minor-free graphs. For the restricted uniform measure case, using local search (similarly to [CKM19]), we construct a PTAS for both metric $\rho$-dominating/independent set problems. See Theorems 16 and 17 in Appendix A for details. However, the local search approach seems to be hopeless for general measures.
Alternatively, one can try the metric embedding approach (for which bicriteria approximation is inherent). Unfortunately, unlike the classic embeddings in [EKM14, FKS19], Cohen-Addad et al. [CFKL20] provided a stochastic embedding with an expected distortion guarantee. Such a stochastic guarantee is not strong enough to construct approximation schemes for the metric $\rho$-independent/dominating set problems. Using our clan and Ramsey-type embeddings, we are able to provide the first bicriteria QPTAS for these problems. See Table 2 for a summary of previous and current results.

Theorem 7 (Metric $\rho$-independent set). There is a bicriteria quasi-polynomial approximation

| # | Reference | Family | Result | Technique |
|---|---|---|---|---|
| 1 | [MP15] | planar | No EPTAS under ETH | |
| 2 | [KLP19, KLP20] | treewidth | FPT with approx. $(1+\epsilon)\rho$ | Dynamic programming |
| 3 | [EKM14] | planar | Bicriteria PTAS | Deterministic embedding |
| 4 | [FKS19] | planar | Bicriteria EPTAS | Deterministic embedding |
| 5 | Theorems 16 & 17 | minor-free | PTAS (uniform measure) | Local search |
| 6 | Theorems 7 & 8 | minor-free | Bicriteria QPTAS | Clan/Ramsey type embedding |

Table 2:

The table compares diﬀerent approximation schemes for metric Becker problems on weightedgraphs. All compared results apply to both metric ρ -dominating/independent set problems. All the results(other than in line 5) apply to the general measure case. scheme (QPTAS) for the metric ρ -independent set problem in K r -free graphs.Speciﬁcally, given a weighted n -vertex K r -free graph G = ( V, E, w ) , measure µ ∶ X → R + andparameters (cid:15) ∈ ( , ) , ρ > , in ˜ O r ( log2 n(cid:15) ) time, one can ﬁnd a ( − (cid:15) ) ρ -independent set S ⊆ Y suchthat for every ρ -independent set ˜ S , µ ( S ) ≥ ( − (cid:15) ) µ ( ˜ S ) . Theorem 8 (Metric ρ -dominating set) . There is a bicriteria quasi-polynomial approximationscheme (QPTAS) for the metric ρ -dominating set problem in K r -free graphs.Speciﬁcally, given a weighted- n vertex K r -free graph G = ( V, E, w ) , measure µ ∶ V → R + , a subsetof terminals K ⊆ V , and parameters (cid:15) ∈ ( , ) , ρ > , in ˜ O r ( log2 n(cid:15) ) time, one can ﬁnd a ( + (cid:15) ) ρ -dominating set S ⊆ V such that for every ρ -dominating set ˜ S , µ ( S ) ≤ ( + (cid:15) ) µ ( ˜ S ) . The paper overview uses terminology presented in the preliminaries section 2.

Clan embedding into ultrametric

The main task is to prove a "distributional" version of Theorem 1. Specifically, given a parameter $k$ and a measure $\mu: X\to\mathbb{R}_{\ge 1}$, we construct a clan embedding with distortion $16k$ such that $\sum_{x\in X}\mu(x)\cdot|f(x)|\le\mu(X)^{1+\frac{1}{k}}$, where $\mu(X)=\sum_{x\in X}\mu(x)$ (Lemma 2). Later, Theorem 1 follows using the minimax theorem.

The algorithm to construct the distributional version is a deterministic recursive ball growing algorithm, somewhat similar to previous deterministic algorithms constructing Ramsey trees [Bar11, ACE+20]. Let $D$ be the diameter of the metric space. We grow a ball $B(v,R)$ around a point $v$ and partition the space into two clusters: the interior $B(v,R+\frac{D}{16k})$ and the exterior $X\setminus B(v,R-\frac{D}{16k})$ of the ball, while points at distance at most $\frac{D}{16k}$ from the boundary of the ball belong to both clusters. Then, recursively, we create a clan embedding into an ultrametric for each of the two clusters. These two embeddings are later combined into a single ultrametric, where the root has label $D$. See Figure 2 for an illustration. The $16k$ distortion guarantee follows from the wide "belt" around the boundary of the ball that belongs to both clusters. Note that the image of each vertex in this "belt" will have copies from the clan embeddings of both clusters, while "non-belt" points will have copies coming from a single embedding only. Both clusters, however, have cardinality strictly smaller than $|X|$. The key is to carve the partition while ensuring that the relative measure of the points belonging to both clusters is small compared to the reduction in cardinality.

Spanning clan embedding into trees

In Theorem 3, the spanning version, we try to imitate the approach of Theorem 1. However, we cannot simply carve balls and continue recursively. The reason is that the diameter of a cluster could grow unboundedly after deleting some vertices. In particular, there is no clear upper bound on the distance between separated points.

To imitate the ball growing approach nonetheless, we use the petal-decomposition framework that was previously applied to create stochastic embeddings into spanning trees [AN19], and Ramsey spanning trees [ACE+20]. The framework partitions the graph into clusters (called petals), which have properties resembling balls. The algorithm continues recursively on the petals. Later, the petals are connected back to create a spanning tree. The key property is that while creating a petal, we have a certain degree of freedom to choose its "radius", which enables us to use the ball growing approach from above. Crucially, the framework guarantees that for every choice of radii (within the certified limits), the diameter of the resulting tree will be only a constant times larger than that of the original graph. However, the petal decomposition framework does not provide us with the freedom of choosing the center of the petal. This makes the task of controlling the number of copies more subtle.

Lower bound for clan embedding into a tree

We provide here a proof sketch for the first assertion in Theorem 2. We begin by constructing an $n$-vertex graph $G=(V,E)$ with $(1+\epsilon)n$ edges and girth $g=\Omega(\frac{\log n}{\epsilon})$ (the girth being the length of the shortest cycle). Consider an arbitrary clan embedding of $G$ into a tree $T$ with distortion $\frac{g}{c}=O(\frac{\log n}{\epsilon})$ (for some constant $c$) and $\kappa$ copies overall. We create a new graph $H$ by contracting all the copies of each vertex into a single vertex. There is a naturally defined classic embedding from $G$ to $H$ with distortion $\le\frac{g}{c}$. The Euler characteristic of the graph $G$ equals $\chi(G)=|E|-|V|+1=\epsilon n+1$, while the Euler characteristic of $H$ is at most $\chi(H)\le\kappa-n$. According to [RR98], if an embedding from a girth $g$ graph $G$ has distortion $\le\frac{g}{c}$, the host graph must have Euler characteristic at least as large as that of $G$. Hence $\kappa-n\ge\epsilon n+1$, and it follows that $\kappa\ge(1+\epsilon)n+1$.

Ramsey type embedding for minor free graphs

The structure theorem of Robertson and Seymour [RS03] states that every minor free graph can be decomposed into a collection of graphs embedded on a surface of constant genus (with some vortices and apices), glued together into a tree structure by taking clique-sums. The stochastic embedding of [CFKL20], from minor free graphs into a distribution over bounded treewidth graphs, was constructed according to the layers of the structure theorem. First they constructed an embedding for a planar graph with a single vortex. Then they generalized it to planar graphs with multiple vortices, next to graphs embedded on a surface of constant genus with multiple vortices, and next to surface embeddable graphs with multiple vortices and apices. Finally, they incorporated clique-sums and generalized to minor free graphs. Most crucially for this paper, the only step requiring randomness was the incorporation of apices. Specifically, [CFKL20] constructed a deterministic embedding for graphs embedded on a surface of constant genus with multiple vortices. This is the starting point of both our embeddings.

Our first step is to incorporate apices. However, instead of guaranteeing that the distance of each pair is distorted by $\epsilon D$ in expectation, we will show that each vertex, with probability $1-\delta$, enjoys small distortion w.r.t. any other vertex. We begin by deleting all the apices $\Psi$, obtaining a surface embeddable graph with multiple vortices $G'=G[V\setminus\Psi]$. However, the diameter of the resulting graph is essentially unbounded. Pick an arbitrary vertex $r$, and partition $G'$ into layers of width $O(\frac{D}{\delta})$ w.r.t. distances from $r$, with a random shift. It follows that every vertex $v$ is $D$-padded (that is, the ball $B(v,D)$ is fully contained in a single layer) with probability $1-\delta$. The set $M$ of satisfied vertices is defined to be the set of all $D$-padded vertices. We then use the deterministic embedding from [CFKL20] on every layer with distortion parameter $\epsilon'=\Theta(\epsilon\delta)$, incurring additive distortion $\epsilon D$. Finally, we combine all these embeddings into a single embedding that also contains the apices.

The next step is to incorporate clique-sums. This is done recursively w.r.t. the clique-sum decomposition tree $\mathcal{T}$. In each step we pick a central piece $\tilde{G}\in\mathcal{T}$ such that $\mathcal{T}\setminus\tilde{G}$ breaks into connected components $\mathcal{T}_1,\mathcal{T}_2,\dots$, where each $\mathcal{T}_i$ contains at most $|\mathcal{T}|/2$ pieces. We construct a Ramsey type embedding for $\tilde{G}$ using the lemma above, obtaining a set $\tilde{M}$ of satisfied vertices. Recursively, we construct a Ramsey type embedding for each $\mathcal{T}_i$, obtaining a set $M_i$ of satisfied vertices. We ensure that all these embeddings are clique-preserving. Thus, even though eventually we will obtain a one-to-one embedding, during the process we keep the embeddings one-to-many and clique-preserving. This provides us with a natural way to combine all the embeddings of $\tilde{G},\mathcal{T}_1,\mathcal{T}_2,\dots$ into a single embedding into a graph of bounded treewidth (by identifying vertices of the respective clique copies). All the vertices in $\tilde{M}$ will be satisfied. A vertex $v\in\mathcal{T}_i$ will be satisfied if $v\in M_i$ and all the vertices in the clique $Q_i$, used in the clique-sum of $\tilde{G}$ with $\mathcal{T}_i$, are satisfied, i.e., $Q_i\subseteq\tilde{M}$. Analyzing the entire process, we show that each vertex is satisfied with probability at least $(1-\delta)^{\log n}$. The theorem follows once we use the parameter $\delta'=\Theta(\frac{\delta}{\log n})$.

Clan embedding for minor free graphs

The construction here follows similar lines to our Ramsey type embedding. However, we cannot simply "give up" on vertices, as we are required to provide a worst case distortion guarantee on all vertex pairs. Similarly to the Ramsey type case, we build on the deterministic embedding of surface embeddable graphs with vortices from [CFKL20], and generalize it to a clan embedding of graphs including the apices. However, there is one crucial difference when creating the "layering" (with the random shift). In the Ramsey type embedding, vertices near the boundary between two layers simply fail and do not join $M$. Here instead, the layers will somewhat overlap, such that the copies of vertices near boundary areas will be split into two unrelated sets. In particular, cliques that lie near boundary areas will have two separate clique copies, one w.r.t. each corresponding layer (at most two). Even though each vertex will actually have an essentially unbounded number of copies (due to the clique-preservation requirement), the copies of each vertex will be divided into either one or two sets, such that in the final embedding it will be enough to pick an arbitrary single copy from each set. The copies of a vertex will split into two sets only if it is in the area of the boundary, the probability of which is bounded by $\delta$.

The generalization to clique-sums also follows similar lines to the Ramsey type embedding. We create a clan embedding for $\tilde{G}$ into a bounded treewidth graph $\tilde{H}$ as above, and recursively clan embeddings $H_1,H_2,\dots$ for $\mathcal{T}_1,\mathcal{T}_2,\dots$. For each $\mathcal{T}_i$, we will make the vertices of the clique $Q_i$, used for the clique-sum between $\tilde{G}$ and $\mathcal{T}_i$, into apices, thus ensuring that $H_i$ will succeed on $Q_i$. In particular, every vertex $v\in Q_i$ will have a single copy in $H_i$. When combining $H_i$ with $\tilde{H}$ there are two cases. If the embedding $\tilde{H}$ was successful w.r.t. $Q_i$, we simply identify the two clique copies and are done. Otherwise, $\tilde{H}$ will contain two disjoint clique copies $\tilde{Q}_i^1,\tilde{Q}_i^2$ of $Q_i$. We will create two copies of $H_i$: $H_i^1,H_i^2$, and identify the two copies of $Q_i$ in $H_i^1,H_i^2$ with $\tilde{Q}_i^1,\tilde{Q}_i^2$, respectively. It follows that for a vertex $v\in\mathcal{T}_i$, with probability at least $1-\delta$, the number of copies it will have is the same as in $H_i$, while with probability at most $\delta$ it will be doubled. Analyzing the entire process (and picking a single copy from each relevant set as above), we show that each vertex is expected to have at most $(1+\delta)^{\log n}$ copies. The theorem follows once we use the parameter $\delta'=\Theta(\frac{\delta}{\log n})$.

Footnote. As an alternative to the layering with a random shift, one could use a strong padded decomposition [Fil19] (as in [CFKL20]) into clusters of diameter $O_r(\frac{D}{\delta})$ such that each radius $D$ ball is fully contained in a single cluster with probability $1-\delta$. However, this approach will not work for our clan embedding, as there is no bound on the number of copies we will need for failed vertices. We use the layering approach for Theorem 4 as well, in order to keep the proofs of Theorems 4 and 5 similar.

The constructions of Ramsey trees are asymptotically tight [BBM06], and as was shown by Bartal et al. [BFN19], cannot be substantially improved even for planar graphs with constant doubling dimension. (Footnote. Specifically, for every $\alpha>0$, [BFN19] constructed a planar graph with constant doubling dimension, such that for every tree embedding, the subset of vertices enjoying distortion $\le\alpha$ is of size at most $n^{1-\Omega(\frac{1}{\alpha\log\alpha})}$, which is almost as bad as for general graphs.) Therefore [BFN19] suggested studying a weaker guarantee provided by tree covers. Here the goal is to construct a small collection of dominating embeddings into trees such that every pair of vertices has small distortion in some tree in the collection. Among other results, [BFN19] showed that every $n$-vertex minor free graph admits a collection of $O_r(\frac{\log n}{\epsilon})$ trees with multiplicative distortion $1+\epsilon$, or a collection of $O(1)$ trees with $O(1)$ multiplicative distortion.

Different types of embeddings were studied for minor free graphs. $K_r$-free graphs embed into $\ell_p$ space with multiplicative distortion $O_r(\log^{\min\{\frac{1}{2},\frac{1}{p}\}}n)$ [Rao99, KLMN05, AGG+19, AFGN18]; in particular, they embed into $\ell_\infty$ of dimension $O_r(\log n)$ with constant multiplicative distortion. They also admit spanners with multiplicative distortion $1+\epsilon$ and $\tilde{O}_r(\epsilon^{-3})$ lightness [BLW17]. On the other hand, there are other graph families that embed well into bounded treewidth graphs. Talwar [Tal04] showed that graphs with doubling dimension $d$ and aspect ratio $\Phi$ stochastically embed into graphs with treewidth $\epsilon^{-O(d\log d)}\cdot\log^{d}\Phi$ with expected distortion $1+\epsilon$. (Footnote. The aspect ratio of a metric space $(X,d)$ is the ratio between the maximal and minimal distances, $\frac{\max_{x,y}d(x,y)}{\min_{x\ne y}d(x,y)}$.) Similar embeddings are known for graphs with highway dimension $h$ [FFKP18] (into treewidth $(\log\Phi)^{-O(\log\frac{h}{\epsilon})}$ graphs), and graphs with correlation dimension $k$ [CG12] (into treewidth $\tilde{O}_{k,\epsilon}(\sqrt{n})$ graphs).

$\tilde{O}$ notation hides poly-logarithmic factors, that is, $\tilde{O}(g)=O(g)\cdot\mathrm{polylog}(g)$, while $O_r$ notation hides factors in $r$, e.g. $O_r(m)=O(m)\cdot f(r)$ for some function $f$ of $r$. All logarithms are at base 2 (unless specified otherwise).

We consider connected undirected graphs $G=(V,E)$ with edge weights $w_G: E\to\mathbb{R}_{\ge 0}$. A graph is called unweighted if all its edges have unit weight. Additionally, we denote $G$'s vertex set and edge set by $V(G)$ and $E(G)$, respectively. Often we will abuse notation and write $G$ instead of $V(G)$. $d_G$ denotes the shortest path metric in $G$, i.e., $d_G(u,v)$ equals the minimal weight of a path from $u$ to $v$. Note that every metric space can be represented as the shortest path metric of a weighted complete graph. We will use the notions of metric space and weighted graph interchangeably. When the graph is clear from the context, we might use $w$ to refer to $w_G$, and $d$ to refer to $d_G$. $G[S]$ denotes the subgraph induced by $S$. The diameter of $S$, denoted by $\mathrm{diam}(S)$, is $\max_{u,v\in S}d_{G[S]}(u,v)$.

An ultrametric $(X,d)$ is a metric space satisfying a strong form of the triangle inequality, that is, for all $x,y,z\in X$, $d(x,z)\le\max\{d(x,y),d(y,z)\}$. The following definition is known to be an equivalent one (see [BLMN05b]).

Definition 1.

An ultrametric is a metric space $(X,d)$ whose elements are the leaves of a rooted labeled tree $T$. Each $z\in T$ is associated with a label $\ell(z)\ge 0$ such that if $x\in T$ is a descendant of $z$ then $\ell(x)\le\ell(z)$, and $\ell(x)=0$ iff $x$ is a leaf. The distance between leaves $x,y\in X$ is defined as $d_T(x,y)=\ell(\mathrm{lca}(x,y))$, where $\mathrm{lca}(x,y)$ is the least common ancestor of $x$ and $y$ in $T$.

Classically, a metric embedding is defined as a function $f: X\to Y$ between the points of two metric spaces $(X,d_X)$ and $(Y,d_Y)$. A metric embedding $f$ is said to be dominating if for every pair of points $x,y\in X$, it holds that $d_X(x,y)\le d_Y(f(x),f(y))$. The distortion of a dominating embedding $f$ is $\max_{x,y\in X}\frac{d_Y(f(x),f(y))}{d_X(x,y)}$. Here we will study a more permissive generalization of metric embedding introduced by Cohen-Addad et al. [CFKL20], which is called one-to-many embedding.

Definition 2 (One-to-many embedding). A one-to-many embedding is a function $f: X\to 2^{Y}$ from the points of a metric space $(X,d_X)$ into non-empty subsets of points of a metric space $(Y,d_Y)$, where the subsets $\{f(x)\}_{x\in X}$ are disjoint. $f^{-1}(x')$ denotes the unique point $x\in X$ such that $x'\in f(x)$. If no such point exists, $f^{-1}(x')=\emptyset$. A point $x'\in f(x)$ will be called a copy of $x$, while $f(x)$ is called the clan of $x$. For a subset $A\subseteq X$ of vertices, denote $f(A)=\cup_{x\in A}f(x)$.
We say that $f$ is dominating if for every pair of points $x,y\in X$, it holds that $d_X(x,y)\le\min_{x'\in f(x),y'\in f(y)}d_Y(x',y')$. We say that $f$ has multiplicative distortion $t$ if it is dominating and $\forall x,y\in X$ it holds that $\max_{x'\in f(x),y'\in f(y)}d_Y(x',y')\le t\cdot d_X(x,y)$. Similarly, $f$ has additive distortion $\epsilon D$ if $f$ is dominating and $\forall x,y\in X$, $\max_{x'\in f(x),y'\in f(y)}d_Y(x',y')\le d_X(x,y)+\epsilon D$.
A stochastic one-to-many embedding is a distribution $\mathcal{D}$ over dominating one-to-many embeddings. We say that a stochastic one-to-many embedding has expected multiplicative distortion $t$ if $\forall x,y\in X$, $\mathbb{E}[\max_{x'\in f(x),y'\in f(y)}d_Y(x',y')]\le t\cdot d_X(x,y)$. Similarly, it has expected additive distortion $\epsilon D$ if $\forall x,y\in X$, $\mathbb{E}[\max_{x'\in f(x),y'\in f(y)}d_Y(x',y')]\le d_X(x,y)+\epsilon D$.
For a one-to-many embedding $f$ between weighted graphs $G=(V,E,w)$ and $H=(V',E',w')$, we say that $f$ is spanning if $V'=f(V)$ (i.e. $f$ is "onto"), and for every edge $(u,v)\in E'$, it holds that $(f^{-1}(u),f^{-1}(v))\in E$ and $w'(u,v)=w(f^{-1}(u),f^{-1}(v))$.

This paper is mainly devoted to the new notion of clan embeddings.
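To make Definitions 1 and 2 concrete, the following is a minimal Python sketch, not taken from the paper. It evaluates ultrametric distances as labels of least common ancestors, and checks whether a one-to-many embedding is dominating and what its multiplicative distortion is. The helper names (`lca_distance`, `one_to_many_distortion`) and the three-point toy instance are assumptions introduced only for illustration.

```python
from itertools import combinations


def lca_distance(parent, label, a, b):
    """Distance between leaves a and b: the label of lca(a, b) (Definition 1)."""
    ancestors = set()
    node = a
    while node is not None:          # collect a and all of its ancestors
        ancestors.add(node)
        node = parent[node]
    node = b
    while node not in ancestors:     # climb from b until we meet that chain
        node = parent[node]
    return label[node]


def one_to_many_distortion(points, d_X, f, d_Y):
    """Return (is_dominating, t), where t is the smallest multiplicative
    distortion in the sense of Definition 2 (max over all pairs of copies)."""
    dominating, t = True, 1.0
    for x, y in combinations(points, 2):
        dists = [d_Y(xc, yc) for xc in f[x] for yc in f[y]]
        if min(dists) < d_X(x, y):
            dominating = False
        t = max(t, max(dists) / d_X(x, y))
    return dominating, t


# Hypothetical toy instance: the path a - b - c with unit edges.
pos = {"a": 0, "b": 1, "c": 2}
d_X = lambda x, y: abs(pos[x] - pos[y])

# Labeled tree: root r (label 2) with children u and w (label 1 each).
parent = {"r": None, "u": "r", "w": "r",
          "a1": "u", "b1": "u", "b2": "w", "c1": "w"}
label = {"r": 2, "u": 1, "w": 1, "a1": 0, "b1": 0, "b2": 0, "c1": 0}
d_Y = lambda p, q: lca_distance(parent, label, p, q)

f = {"a": ["a1"], "b": ["b1", "b2"], "c": ["c1"]}   # b has two copies
result = one_to_many_distortion(list(pos), d_X, f, d_Y)
```

In this toy instance the embedding is dominating, and the one-to-many distortion is governed by the farthest pair of copies, which is exactly the requirement that the clan embedding notion relaxes.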

Definition 3 (Clan embedding). A clan embedding from a metric space $(X,d_X)$ into a metric space $(Y,d_Y)$ is a pair $(f,\chi)$ where $f: X\to 2^{Y}$ is a dominating one-to-many embedding, and $\chi: X\to Y$ is a classic embedding. For every $x\in X$ it holds that $\chi(x)\in f(x)$; here $f(x)$ is called the clan of $x$, while $\chi(x)$ is referred to as the chief of the clan of $x$ (or simply the chief of $x$).
We say that a clan embedding $(f,\chi)$ has multiplicative distortion $t$ if for every $x,y\in X$, $\min_{y'\in f(y)}d_Y(y',\chi(x))\le t\cdot d_X(x,y)$. Similarly, $(f,\chi)$ has additive distortion $\epsilon D$ if for every $x,y\in X$, $\min_{y'\in f(y)}d_Y(y',\chi(x))\le d_X(x,y)+\epsilon D$.
A clan embedding $(f,\chi)$ is said to be spanning if $f$ is a spanning one-to-many embedding.

Footnote (on the diameter $\mathrm{diam}(S)$ defined above). This is often called the strong diameter. A related notion is the weak diameter of a cluster $S$, defined as $\max_{u,v\in S}d_G(u,v)$. Note that for a metric space, weak and strong diameter are equivalent.
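The distortion condition of Definition 3 is asymmetric: for an ordered pair $(x,y)$ only the chief of $x$ matters, and any copy of $y$ may serve as its partner. The short Python sketch below, which is not taken from the paper, computes the smallest such $t$ for a given clan embedding. The function name `clan_distortion` and the toy data are assumptions for illustration only.

```python
def clan_distortion(points, d_X, f, chi, d_Y):
    """Smallest t such that for every ordered pair (x, y), x != y,
    min over copies y' of y of d_Y(y', chi(x)) is at most t * d_X(x, y)."""
    t = 0.0
    for x in points:
        assert chi[x] in f[x], "the chief must belong to the clan"
        for y in points:
            if x == y:
                continue
            nearest = min(d_Y(yc, chi[x]) for yc in f[y])
            t = max(t, nearest / d_X(x, y))
    return t


# Hypothetical toy instance: the path a - b - c with unit edges.
pos = {"a": 0, "b": 1, "c": 2}
d_X = lambda x, y: abs(pos[x] - pos[y])

# Ultrametric on the copies, given directly as a table: copies that share
# a parent are at distance 1, all other distinct copies are at distance 2.
close = {frozenset(("a1", "b1")), frozenset(("b2", "c1"))}
d_Y = lambda p, q: 0 if p == q else (1 if frozenset((p, q)) in close else 2)

f = {"a": ["a1"], "b": ["b1", "b2"], "c": ["c1"]}
chi = {"a": "a1", "b": "b1", "c": "c1"}

t = clan_distortion(list(pos), d_X, f, chi, d_Y)
avg_clan_size = sum(len(f[x]) for x in pos) / len(pos)
```

In this instance the pair $(c,b)$ is served by the copy `b2` at distance 1, whereas the pair $(b,c)$ is stretched to 2, because $c$ has no copy next to the chief `b1`.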

We will construct embeddings for minor free graphs using a divide-and-conquer approach. First we will construct an embedding on each piece (see below). Then, in order to combine the different embeddings into a single one, it will be important that these embeddings are clique-preserving.

Definition 4 (Clique-copy). Consider a one-to-many embedding $f: G\to 2^{H}$, and a clique $Q$ in $G$. A subset $Q'\subseteq f(Q)$ is called a clique copy of $Q$ if $Q'$ is a clique in $H$, and for every vertex $v\in Q$, $Q'\cap f(v)$ is a singleton.

Definition 5 (Clique-preserving embedding). A one-to-many embedding $f: G\to 2^{H}$ is called a clique-preserving embedding if for every clique $Q$ in $G$, $f(Q)$ contains a clique copy of $Q$. A clan embedding $(f,\chi)$ is clique-preserving if $f$ is clique-preserving.

In this section, we review notation used in graph minor theory by Robertson and Seymour. Informally speaking, the celebrated theorem of Robertson and Seymour (Theorem 9, [RS03]) says that every minor-free graph can be decomposed into a collection of graphs nearly embeddable in a surface of constant genus, glued together into a tree structure by taking clique-sums. To formally state the Robertson-Seymour decomposition, we need additional notation.

Definition 6 (Tree/Path decomposition). A tree decomposition of $G(V,E)$, denoted by $\mathcal{T}$, is a tree satisfying the following conditions:
1. Each node $i\in V(\mathcal{T})$ corresponds to a subset of vertices $X_i$ of $V$ (called bags), such that $\cup_{i\in V(\mathcal{T})}X_i=V$.
2. For each edge $uv\in E$, there is a bag $X_i$ containing both $u,v$.
3. For a vertex $v\in V$, all the bags containing $v$ make up a subtree of $\mathcal{T}$.
The width of a tree decomposition $\mathcal{T}$ is $\max_{i\in V(\mathcal{T})}|X_i|-1$, and the treewidth of $G$, denoted by $\mathrm{tw}$, is the minimum width among all possible tree decompositions of $G$. A path decomposition of a graph $G(V,E)$ is a tree decomposition where the underlying tree is a path. The pathwidth of $G$, denoted by $\mathrm{pw}$, is defined accordingly.

A vortex is a graph $G$ equipped with a path decomposition $\{X_1,X_2,\dots,X_t\}$ and a sequence of $t$ designated vertices $x_1,\dots,x_t$, called the perimeter of $G$, such that $x_i\in X_i$ for all $1\le i\le t$. The width of the vortex is the width of its path decomposition. We say that a vortex $W$ is glued to a face $F$ of a surface embedded graph $G$ if $W\cap F$ is the perimeter of $W$, whose vertices appear consecutively along the boundary of $F$.

Nearly $h$-embeddability. A graph $G$ is nearly $h$-embeddable if there is a set of at most $h$ vertices $A$, called apices, such that $G\setminus A$ can be decomposed as $G_\Sigma\cup\{W_1,W_2,\dots,W_h\}$, where $G_\Sigma$ is (cellularly) embedded on a surface $\Sigma$ of genus at most $h$ and each $W_i$ is a vortex of width at most $h$ glued to a face of $G_\Sigma$.

$h$-Clique-sum. A graph $G$ is an $h$-clique-sum of two graphs $G_1,G_2$, denoted by $G=G_1\oplus_h G_2$, if there are two cliques of size exactly $h$ each such that $G$ can be obtained by identifying the vertices of the two cliques and removing some clique edges of the resulting identification.

Note that clique-sum is not a well-defined operation, since the clique-sum of two graphs is not unique due to the clique edge deletion step. We are now ready to state the decomposition theorem.

Theorem 9 (Theorem 1.3 [RS03]). There is a constant $h=O_r(1)$ such that any $K_r$-minor-free graph $G$ can be decomposed into a tree $\mathbb{T}$ where each node of $\mathbb{T}$ corresponds to a nearly $h$-embeddable graph such that $G=\cup_{X_iX_j\in E(\mathbb{T})}X_i\oplus_h X_j$.

The graphs corresponding to the nodes in the clique-sum decomposition above are referred to as pieces. Note that the pieces in $\mathbb{T}$ may not be subgraphs of $G$, as in the clique-sum, some edges of a node (namely, some edges of a nearly $h$-embeddable subgraph associated with a node) may not be present in $G$. We will slightly modify the graph to ensure that this never happens. Specifically, for any pair $u,v$ of vertices used in a clique-sum for a piece $X$ of $\mathbb{T}$ whose edge is not present in $G$, we add the edge $(u,v)$ to $G$ and set its weight to be $d_G(u,v)$. In the decomposition of the resulting graph, the clique-sum operation does not remove any edge. Note that this operation changes neither the Robertson-Seymour decomposition of the graph nor its shortest path metric. Thus, from a metric point of view, the two graphs are equivalent.

Cohen-Addad et al. [CFKL20] showed that every $n$-vertex $K_r$-minor free graph has a stochastic one-to-many embedding with expected additive distortion $\epsilon D$ into a graph with treewidth $O(\frac{\log n}{\epsilon})$. The only reason [CFKL20] used randomness is the presence of apices. The following lemma from [CFKL20] states that nearly $h$-embeddable graphs without apices embed deterministically into bounded treewidth graphs. We will use this embedding in a black box manner.

Lemma 1 (Multiple Vortices and Genus, [CFKL20]). Consider a graph $G=G_\Sigma\cup W_1\cup\dots\cup W_h$ of diameter $D$, where $G_\Sigma$ is (cellularly) embedded on a surface $\Sigma$ of genus $h$, and each $W_i$ is a vortex of width at most $h$ glued to a face of $G_\Sigma$. There is a one-to-many clique-preserving embedding $f$ from $G$ to a graph $H$ of treewidth at most $O_h(\frac{\log n}{\epsilon})$ with additive distortion $\epsilon D$.

This section is devoted to proving Theorem 1. We restate it for convenience.

Theorem 1 (Clan embedding into ultrametric). Consider an $n$-point metric space $(X,d_X)$, and parameter $\epsilon\in(0,1]$. Then there is a distribution $\mathcal{D}$ over clan embeddings $(f,\chi)$ into ultrametrics with multiplicative distortion $O(\frac{\log n}{\epsilon})$, such that for every point $x\in X$, $\mathbb{E}_{f\sim\mathcal{D}}[|f(x)|]\le 1+\epsilon$.
In addition, for every $k\in\mathbb{N}$, there is a distribution $\mathcal{D}$ over clan embeddings $(f,\chi)$ into ultrametrics with multiplicative distortion $16k$ such that for every point $x\in X$, $\mathbb{E}_{f\sim\mathcal{D}}[|f(x)|]=O(n^{\frac{1}{k}})$.

First, we will prove a "distributional" version of Theorem 1. That is, we will receive a distribution $\mu$ over the points, and deterministically construct a single clan embedding $(f,\chi)$ such that $\sum_{x\in X}\mu(x)|f(x)|$ will be bounded. Later, we will use the minimax theorem to conclude Theorem 1. We begin with some definitions: a measure over a finite set $X$ is simply a function $\mu: X\to\mathbb{R}_{\ge 0}$. The measure of a subset $A\subseteq X$ is $\mu(A)=\sum_{x\in A}\mu(x)$. Given some function $f: X\to\mathbb{R}$, its expectation w.r.t. $\mu$ is $\mathbb{E}_{x\sim\mu}[f]=\sum_{x\in X}\mu(x)\cdot f(x)$. We say that $\mu$ is a probability measure if $\mu(X)=1$. We say that $\mu$ is a $(\ge 1)$-measure if for every $x\in X$, $\mu(x)\ge 1$.

Lemma 2. Given an $n$-point metric space $(X,d_X)$, a $(\ge 1)$-measure $\mu: X\to\mathbb{R}_{\ge 1}$, and an integer parameter $k\ge 1$, there is a clan embedding $(f,\chi)$ into an ultrametric with multiplicative distortion $16k$ such that $\mathbb{E}_{x\sim\mu}[|f(x)|]\le\mu(X)^{1+\frac{1}{k}}$.

Proof. Our proof is inspired by Bartal's lecture notes [Bar11], which provided a deterministic construction of Ramsey trees. Lemma 2 could also be proved using the techniques of Abraham et al. [ACE+20] (and indeed we will use their approach for our clan embedding into a spanning tree, see Lemma 4); however, the proof based on [Bar11] that we present here is shorter. For a subset $A\subseteq X$, denote by $B_A(x,r):=B_X(x,r)\cap A$ the ball in the metric space $(X,d_X)$ restricted to $A$. Set $\mu^*(A):=\max_{x\in A}\mu(B_A(x,\frac{\mathrm{diam}(A)}{4}))$. Note that $\mu^*$ is monotone, i.e. $A'\subseteq A$ implies $\mu^*(A')\le\mu^*(A)$, and $\forall A$, $\mu^*(A)\le\mu(A)$. The following claim is crucial for our construction; its proof appears below. See Figure 2 for an illustration of the claim.

Figure 2: On the left, the clusters $P,Q,\bar{Q}$ from Claim 1 are illustrated. On the right, we illustrate the clan embedding of the metric space $(X,d_X)$ into the ultrametric $U$. $r_U$ is the root of $U$, and its children are the roots of the ultrametrics $U_P,U_{\bar{Q}}$, which were constructed recursively. The vertex $x\in P\setminus\bar{Q}$ has $f(x)=f_P(x)$ and $\chi(x)=\chi_P(x)$ (where $|f(x)|=2$). The vertex $y$ is in $\bar{Q}\setminus P$ and thus $f(y)=f_{\bar{Q}}(y)$ and $\chi(y)=\chi_{\bar{Q}}(y)$ (there is a single copy of $y$). The vertex $z$ belongs to $P\cap\bar{Q}$, where $d_X(v,z)>R+\frac{1}{16k}\cdot\mathrm{diam}(X)$, hence $f(z)=f_P(z)\cup f_{\bar{Q}}(z)$ and $\chi(z)=\chi_{\bar{Q}}(z)$. Note that $|f_P(z)|=|f_{\bar{Q}}(z)|=2$, and hence $|f(z)|=4$.

Claim 1.

There is a point $v\in X$ and radius $R\in(0,\frac{\mathrm{diam}(X)}{2}]$, such that the sets $P=B_X(v,R+\frac{1}{8k}\cdot\mathrm{diam}(X))$, $Q=B_X(v,R)$, and $\bar{Q}=X\setminus Q$ satisfy $\mu(P)\le\mu(Q)\cdot\left(\frac{\mu^*(X)}{\mu^*(P)}\right)^{\frac{1}{k}}$.

The construction of the embedding is by induction on $n$, the number of points in the metric space. We assume that for a metric space $X$ with strictly fewer than $n$ points, and an arbitrary $(\ge 1)$-measure $\mu$, we can construct a clan embedding $(f,\chi)$ with distortion $16k$ such that $\mathbb{E}_{x\sim\mu}[|f(x)|]\le\mu(X)\cdot\mu^*(X)^{\frac{1}{k}}\le\mu(X)^{1+\frac{1}{k}}$. Find sets $P,Q,\bar{Q}\subseteq X$ using Claim 1. Let $\mu_P$ (resp. $\mu_{\bar{Q}}$) be the $(\ge 1)$-measure $\mu$ restricted to $P$ (resp. $\bar{Q}$). Using the induction hypothesis, construct clan embeddings $(f_P,\chi_P)$ for $P$, and $(f_{\bar{Q}},\chi_{\bar{Q}})$ for $\bar{Q}$, into ultrametrics $U_P,U_{\bar{Q}}$ respectively. Construct a new ultrametric $U$ by combining $U_P$ and $U_{\bar{Q}}$ via a new root node $r_U$ with label $\mathrm{diam}(X)$. For every $x\in X$ set $f(x)=f_P(x)\cup f_{\bar{Q}}(x)$. If $d_X(v,x)\le R+\frac{1}{16k}\cdot\mathrm{diam}(X)$ set $\chi(x)=\chi_P(x)$, otherwise set $\chi(x)=\chi_{\bar{Q}}(x)$. This finishes the construction; see Figure 2 for an illustration.

Next we argue that the clan embedding $(f,\chi)$ has multiplicative distortion $16k$. Consider a pair of points $x,y\in X$. We will show that $\min_{y'\in f(y)}d_U(y',\chi(x))\le 16k\cdot d_X(x,y)$. Suppose first that $d_X(v,x)\le R+\frac{1}{16k}\cdot\mathrm{diam}(X)$. If $y\in P$, then by the induction hypothesis
\[
\min_{y'\in f(y)}d_U(y',\chi(x))\le\min_{y'\in f_P(y)}d_{U_P}(y',\chi_P(x))\le 16k\cdot d_P(x,y)=16k\cdot d_X(x,y).
\]
Else, $y\notin P$, implying $d_X(v,y)>R+\frac{1}{8k}\cdot\mathrm{diam}(X)$. Using the triangle inequality, $d_X(x,y)\ge d_X(v,y)-d_X(v,x)\ge\frac{1}{16k}\cdot\mathrm{diam}(X)$. Note that the label of $r_U$ is $\mathrm{diam}(X)$, implying $\min_{y'\in f(y)}d_U(y',\chi(x))\le\mathrm{diam}(X)\le 16k\cdot d_X(x,y)$. The case where $d_X(v,x)>R+\frac{1}{16k}\cdot\mathrm{diam}(X)$ is symmetric (using $\bar{Q}$ instead of $P$).

Next we bound the weighted number of leaves in the ultrametric. Note that the process is deterministic and there is no probability involved. Using the induction hypothesis, it holds that
\[
\begin{aligned}
\mathbb{E}_{x\sim\mu}[|f(x)|] &= \sum_{x\in X}\mu(x)\cdot\left(|f_P(x)|+|f_{\bar{Q}}(x)|\right)\\
&= \mathbb{E}_{x\sim\mu_P}[|f_P(x)|]+\mathbb{E}_{x\sim\mu_{\bar{Q}}}[|f_{\bar{Q}}(x)|]\\
&\le \mu_P(P)\,\mu_P^*(P)^{\frac{1}{k}}+\mu_{\bar{Q}}(\bar{Q})\,\mu_{\bar{Q}}^*(\bar{Q})^{\frac{1}{k}}\\
&\le \mu(P)\,\mu^*(P)^{\frac{1}{k}}+\mu(\bar{Q})\,\mu^*(\bar{Q})^{\frac{1}{k}}\\
&\overset{(*)}{\le} \mu(Q)\,\mu^*(X)^{\frac{1}{k}}+\mu(\bar{Q})\,\mu^*(X)^{\frac{1}{k}} = \mu(X)\,\mu^*(X)^{\frac{1}{k}},
\end{aligned}
\]
where in the inequality $(*)$ we used Claim 1 and the fact that $\mu^*(\bar{Q})\le\mu^*(X)$.

Proof of Claim 1. Let $v$ be the vertex minimizing the ratio $\frac{\mu(B_X(v,\mathrm{diam}(X)/4))}{\mu(B_X(v,\mathrm{diam}(X)/8))}$. Set $\rho=\frac{\mathrm{diam}(X)}{8k}$, and for $i\in[0,k]$ let $Q_i=B_X(v,\frac{\mathrm{diam}(X)}{8}+i\cdot\rho)$. Let $i\in[0,k-1]$ be the index minimizing $\frac{\mu(Q_{i+1})}{\mu(Q_i)}$. Then,
\[
\left(\frac{\mu(Q_k)}{\mu(Q_0)}\right)^{\frac{1}{k}}=\left(\frac{\mu(Q_1)}{\mu(Q_0)}\cdot\frac{\mu(Q_2)}{\mu(Q_1)}\cdots\frac{\mu(Q_k)}{\mu(Q_{k-1})}\right)^{\frac{1}{k}}\ge\left(\frac{\mu(Q_{i+1})}{\mu(Q_i)}\right)^{k\cdot\frac{1}{k}}=\frac{\mu(Q_{i+1})}{\mu(Q_i)}.
\]
Set $R=\frac{\mathrm{diam}(X)}{8}+i\cdot\rho$; then $P=B_X(v,R+\rho)$, $Q=B_X(v,R)$, $\bar{Q}=X\setminus Q$. Note that $\mathrm{diam}(P)\le 2\cdot\left(\frac{\mathrm{diam}(X)}{8}+k\cdot\rho\right)=\frac{\mathrm{diam}(X)}{2}$. Let $u_P$ be the vertex defining $\mu^*(P)$, that is, $\mu^*(P)=\mu(B_P(u_P,\frac{\mathrm{diam}(P)}{4}))\le\mu(B_P(u_P,\frac{\mathrm{diam}(X)}{8}))$. Using the minimality of $v$, it holds that
\[
\frac{\mu(P)}{\mu(Q)}\le\left(\frac{\mu(Q_k)}{\mu(Q_0)}\right)^{\frac{1}{k}}=\left(\frac{\mu(B_X(v,\mathrm{diam}(X)/4))}{\mu(B_X(v,\mathrm{diam}(X)/8))}\right)^{\frac{1}{k}}\le\left(\frac{\mu(B_X(u_P,\mathrm{diam}(X)/4))}{\mu(B_X(u_P,\mathrm{diam}(X)/8))}\right)^{\frac{1}{k}}\le\left(\frac{\mu^*(X)}{\mu^*(P)}\right)^{\frac{1}{k}}.
\]

Next we translate the language of $(\ge 1)$-measures used in Lemma 2 to probability measures:

Lemma 3.

Given an $n$-point metric space $(X,d_X)$ and a probability measure $\mu: X\to\mathbb{R}_{\ge 0}$, we can construct the two following clan embeddings $(f,\chi)$ into an ultrametric:
1. For every parameter $k\ge 1$, multiplicative distortion $16k$ such that $\mathbb{E}_{x\sim\mu}[|f(x)|]\le O(n^{\frac{1}{k}})$.
2. For every parameter $\epsilon\in(0,1]$, multiplicative distortion $O(\frac{\log n}{\epsilon})$ such that $\mathbb{E}_{x\sim\mu}[|f(x)|]\le 1+\epsilon$.

Proof. We define the following probability measure $\tilde{\mu}$: $\forall x\in X$, $\tilde{\mu}(x)=\frac{1}{2n}+\frac{1}{2}\mu(x)$. Set the following $(\ge 1)$-measure: $\tilde{\mu}_{\ge 1}(x)=2n\cdot\tilde{\mu}(x)$. Note that $\tilde{\mu}_{\ge 1}(X)=2n$. We execute Lemma 2 w.r.t. the $(\ge 1)$-measure $\tilde{\mu}_{\ge 1}$, and a parameter $\frac{1}{\delta}\in\mathbb{N}$ to be determined later. It holds that
\[
\tilde{\mu}_{\ge 1}(X)\cdot\mathbb{E}_{x\sim\tilde{\mu}}[|f(x)|]=\mathbb{E}_{x\sim\tilde{\mu}_{\ge 1}}[|f(x)|]\le\tilde{\mu}_{\ge 1}(X)^{1+\delta}=\tilde{\mu}_{\ge 1}(X)\cdot(2n)^{\delta},
\]
implying
\[
(2n)^{\delta}\ge\mathbb{E}_{x\sim\tilde{\mu}}[|f(x)|]=\frac{1}{2}\cdot\mathbb{E}_{x\sim\mu}[|f(x)|]+\frac{\sum_{x\in X}|f(x)|}{2n}\ge\frac{1}{2}\cdot\mathbb{E}_{x\sim\mu}[|f(x)|]+\frac{1}{2},
\]
where the last inequality holds because every clan is non-empty. Equivalently, $\mathbb{E}_{x\sim\mu}[|f(x)|]\le 2\cdot(2n)^{\delta}-1$.
1. Set $\delta=\frac{1}{k}$; then we have multiplicative distortion $\frac{16}{\delta}=16k$, and $\mathbb{E}_{x\sim\mu}[|f(x)|]\le 2\cdot(2n)^{\delta}=O(n^{\frac{1}{k}})$.
2. Choose $\delta\in(0,1]$ such that $\frac{1}{\delta}=\left\lceil\frac{\ln(2n)}{\ln(1+\epsilon/2)}\right\rceil$; note that $\delta\le\frac{\ln(1+\epsilon/2)}{\ln(2n)}$, and hence $(2n)^{\delta}\le 1+\frac{\epsilon}{2}$. Then we have multiplicative distortion $O(\frac{1}{\delta})=O(\frac{\log n}{\epsilon})$, and $\mathbb{E}_{x\sim\mu}[|f(x)|]\le 2\cdot(2n)^{\delta}-1\le 1+\epsilon$.

Finally, using the minimax theorem, we conclude the main theorem of the section:

Proof of Theorem 1. Let $\mu$ be an arbitrary probability measure over the vertices, and $\mathcal{D}$ be any distribution over clan embeddings $(f,\chi)$ of $(X,d_X)$ into trees with multiplicative distortion $O(\frac{\log n}{\epsilon})$. Using Lemma 3 and the minimax theorem we have that
\[
\min_{\mathcal{D}}\max_{\mu}\mathbb{E}_{(f,\chi)\sim\mathcal{D},x\sim\mu}[|f(x)|]=\max_{\mu}\min_{(f,\chi)}\mathbb{E}_{x\sim\mu}[|f(x)|]\le 1+\epsilon.
\]
Let $\mathcal{D}$ be the distribution from above, and denote by $\mu_z$ the probability measure where $\mu_z(z)=1$ (and $\mu_z(y)=0$ for $y\ne z$). Then for every $z\in X$,
\[
\mathbb{E}_{(f,\chi)\sim\mathcal{D}}[|f(z)|]=\mathbb{E}_{(f,\chi)\sim\mathcal{D},x\sim\mu_z}[|f(x)|]\le\max_{\mu}\mathbb{E}_{(f,\chi)\sim\mathcal{D},x\sim\mu}[|f(x)|]\le 1+\epsilon.
\]
The second case is proven using the exact same argument.
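The recursive construction from the proof of Lemma 2 can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the function names, the tuple encoding of the ultrametric tree, and the use of exact rational arithmetic are assumptions made here. It follows the proof's choices: the center $v$ minimizes $\mu(B(v,\mathrm{diam}/4))/\mu(B(v,\mathrm{diam}/8))$, the step is $\rho=\mathrm{diam}/(8k)$, and the chief is taken from the side of $P$ exactly when $d(v,x)\le R+\rho/2$.

```python
from fractions import Fraction
from itertools import count


def clan_ultrametric(points, dist, mu, k):
    """Ball-growing sketch of Lemma 2. Assumes distinct points are at positive
    distance and mu[x] >= 1. Returns (tree, f, chi): tree is ("leaf", id) or
    ("node", label, left, right); f maps a point to its leaf ids (its clan);
    chi maps a point to its chief leaf id."""
    ids = count()

    def ball(A, v, r):
        return [x for x in A if dist[v][x] <= r]

    def weight(S):
        return sum(mu[x] for x in S)

    def build(A):
        if len(A) == 1:
            leaf = next(ids)
            return ("leaf", leaf), {A[0]: [leaf]}, {A[0]: leaf}
        D = Fraction(max(dist[x][y] for x in A for y in A))  # diam(A)
        # Center minimizing mu(B(v, D/4)) / mu(B(v, D/8)), as in Claim 1.
        v = min(A, key=lambda c: Fraction(weight(ball(A, c, D / 4)))
                / weight(ball(A, c, D / 8)))
        rho = D / (8 * k)
        Q = [ball(A, v, D / 8 + i * rho) for i in range(k + 1)]
        # Index with the smallest growth ratio mu(Q_{i+1}) / mu(Q_i).
        i = min(range(k), key=lambda j: Fraction(weight(Q[j + 1])) / weight(Q[j]))
        R = D / 8 + i * rho
        P, inner = Q[i + 1], set(Q[i])
        Q_bar = [x for x in A if x not in inner]
        tree_P, f_P, chi_P = build(P)
        tree_Q, f_Q, chi_Q = build(Q_bar)
        # Points of the belt (in both P and Q_bar) receive copies from both sides.
        f = {x: f_P.get(x, []) + f_Q.get(x, []) for x in A}
        chi = {x: chi_P[x] if dist[v][x] <= R + rho / 2 else chi_Q[x] for x in A}
        return ("node", D, tree_P, tree_Q), f, chi

    return build(list(points))


def leaf_distances(tree):
    """Return (leaves, d) with d[(a, b)] = label of lca(a, b) for leaves a != b."""
    if tree[0] == "leaf":
        return [tree[1]], {}
    _, label, left, right = tree
    leaves_l, d = leaf_distances(left)
    leaves_r, d_r = leaf_distances(right)
    d.update(d_r)
    for a in leaves_l:
        for b in leaves_r:
            d[(a, b)] = d[(b, a)] = label
    return leaves_l + leaves_r, d
```

As a usage example, one can take a handful of points on a line with the unit measure, run `clan_ultrametric(points, dist, mu, k)`, and verify on the output the two guarantees of the lemma: the clan distortion is at most $16k$, and $\sum_x\mu(x)|f(x)|\le\mu(X)^{1+1/k}$. The sketch favors readability over efficiency and recomputes balls from scratch in every recursive call.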

This section is devoted to proving Theorem 3. We restate it for convenience.

Theorem 3 (Spanning clan embedding into trees) . Consider an n -vertex weighted graph G =( V, E, w ) , and parameter (cid:15) ∈ ( , ] . Then there is a distribution D over spanning clan embeddings ( f, χ ) into trees with multiplicative distortion O ( log n log log n(cid:15) ) , such that for every vertex v ∈ V , E f ∼D [∣ f ( v )∣] ≤ + (cid:15) .In addition, for every k ∈ N , there is distribution D over spanning clan embeddings ( f, χ ) intotrees with multiplicative distortion O ( k log log n ) , where for every vertex v ∈ V , E f ∼D [∣ f ( v )∣] = O ( n k ) . In this section we construct spanning clan embeddings into trees. We will use the framework ofpetal decomposition proposed by Abraham and Neiman [AN19], who originally used itto constructa stochastic embedding of a graph into spanning trees with bounded expected distortion. Theframework was also previously used by Abraham et al. [ACE +

20] to construct Ramsey spanningtrees. The petal decomposition is an iterative method to build a spanning tree of a given graph.In each level, the current graph is partitioned into smaller diameter pieces (called petals ), and asingle central piece (called stigma ), which are then connected by edges in a tree structure. Each ofthe petals is a ball in a certain cone metric. When creating a petal from a cluster of diameter ∆,one has the freedom to choose a radius from an interval of length Ω ( ∆ ) . The crucial property, isthat regardless of the radii choices during the execution of the algorithm, the framework guaranteesthat the diameter of the resulting tree will be O ( ∆ ) .However, as we are constructing clan embedding rather than a classical one, some vertices willhave multiple copies. As a result, some mild changes will be introduced to the construction of17AN19]. Once we establish the petal decomposition framework for clan embeddings, the proof ofTheorem 3 will follow similar lines to Theorem 1. The additional log log n factor is a phenomenaalso appearing in previous uses of the petal decomposition framework [AN19, ACE + Organization:

In Section 4.1 we will describe the petal decomposition framework in general. In Section 4.2 we will describe our specific usage of it, i.e. the algorithm choosing the radii (with some leftovers in Section 4.4). Then in Section 4.3 we will prove Lemma 4, which appears below. Lemma 4 is a "distributional" version of Theorem 3, and has a role parallel to Lemma 2 in Section 3. Finally, in Section 4.5 we will deduce Theorem 3 using Lemma 4.

Lemma 4. Given an $n$-vertex weighted graph $G=(V,E,w)$, a $(\ge 1)$-measure $\mu:V\to\mathbb{R}_{\ge 1}$, and an integer parameter $k\ge 1$, there is a spanning clan embedding $(f,\chi)$ into a tree with multiplicative distortion $O(k\log\log n)$ such that $\mathbb{E}_{v\sim\mu}[|f(v)|]\le\mu(V)^{1+\frac{1}{k}}$.

We begin with some notations specific for this section. For a subset $S\subseteq G$ and a center vertex $x_0\in S$, the radius of $S$ w.r.t. $x_0$, denoted $\Delta_{x_0}(S)$, is the minimal $\Delta$ such that $B_{G[S]}(x_0,\Delta)=S$. (If for every $\Delta$, $B_{G[S]}(x_0,\Delta)\ne S$, which can happen iff $S$ is not connected, we say that $\Delta_{x_0}(S)=\infty$.) When the center $x_0$ is clear from context or is not relevant, we will omit it. Given two vertices $u,v$, $P_{u,v}(X)$ denotes the shortest path between them in $G[X]$, the graph induced by $X$ (we will assume that every pair has a unique shortest path; this can be arranged by a tiny perturbation of the edge weights).

Given a graph $G=(V,E,w)$ and a cluster $A\subseteq V$ (with center $x_0$), we say that a vertex $y\in A$ is $\rho$-padded by the cluster $A'\subseteq A$ (w.r.t. $A$) if $B(y,\Delta_{x_0}(A)/\rho,G)\subseteq A'$.

(Illustration omitted: a cluster $A$ of radius $\Delta$ around $x_0$, a sub-cluster $A'$, and a vertex $y$ whose ball of radius $\Delta/\rho$ is contained in $A'$.)

Next we provide a concise description of the petal decomposition algorithm, focusing on the main properties we will use. For proofs and further details we refer to [AN19]. The presentation here differs slightly from [AN19] as our goal is to construct a spanning clan embedding into a tree, rather than a classic one. However, the changes are straightforward, and no new ideas are required.

The hierarchical-petal-decomposition (see Algorithm 1) is a recursive algorithm. The input is $G[X]$ (a graph $G=(V,E,w)$ induced over a set of vertices $X\subseteq V$), a center $x_0\in X$, a target $t\in X$, and the radius $\Delta=\Delta_{x_0}(X)$. The algorithm invokes the petal-decomposition procedure to create clusters $\tilde{X}_0,\tilde{X}_1,\dots,\tilde{X}_s$ of $X$ (for some integer $s$), and also provides a set of edges $\{(x_1,y_1),\dots,(x_s,y_s)\}$ and targets $t_0,t_1,\dots,t_s$.
The hierarchical-petal-decomposition algorithm now recurses on each $(G[\tilde{X}_j],x_j,t_j,\Delta_{x_j}(\tilde{X}_j))$ for $0\le j\le s$, to get trees $\{T_j\}_{0\le j\le s}$ (and clan embeddings $\{(f_j,\chi_j)\}_{0\le j\le s}$), which are then connected by the edges $\{(x_j,y_j)\}_{1\le j\le s}$ to form a tree $T$ (the recursion ends when $X_j$ is a singleton). The one-to-many embedding $f$ is simply defined as the union of the one-to-many embeddings $\{f_j\}_{0\le j\le s}$. Note however, that the clusters $\tilde{X}_0,\tilde{X}_1,\dots,\tilde{X}_s$ are not disjoint. Therefore in addition, for each cluster $\tilde{X}_j$ the petal-decomposition procedure will also provide us with sub-clusters $\underline{X}_j\subseteq X_j\subseteq\tilde{X}_j$ that will be used to determine the chiefs (i.e. the $\chi$ part) of the clan embedding.

(Footnote: Rather than inferring $\Delta=\Delta_{x_0}(X)$ from $G[X]$ and $x_0$ as in [AN19], we will follow [ACE+20] and think of $\Delta$ as part of the input. We shall allow any $\Delta\ge\Delta_{x_0}(X)$. We stress that in fact in the algorithm we always use $\Delta_{x_0}(X)$, and consider this degree of freedom only in the analysis.)

Algorithm 1: $(T,f,\chi)=$ hierarchical-petal-decomposition$(G[X],x_0,t,\Delta)$
  if $|X|=1$ then
    return $G[X]$
  Let $(\{\underline{X}_j,X_j,\tilde{X}_j,x_j,t_j,\Delta_j\}_{j=0}^{s},\{(y_j,x_j)\}_{j=1}^{s})=$ petal-decomposition$(G[X],x_0,t,\Delta)$
  for each $j\in[0,\dots,s]$ do
    $(T_j,f_j,\chi_j)=$ hierarchical-petal-decomposition$(G[\tilde{X}_j],x_j,t_j,\Delta_j)$
  for each $z\in X$ do
    Set $f(z)=\cup_{j=0}^{s}f_j(z)$
    if $\exists j>0$ such that $z\in X_j$ then
      Let $j>0$ be the minimal index such that $z\in X_j$. Set $\chi(z)=\chi_j(z)$
    else
      Set $\chi(z)=\chi_0(z)$
  Let $T$ be the tree formed by connecting $T_0,\dots,T_s$ using the edges $\{\chi(y_1),\chi(x_1)\},\dots,\{\chi(y_s),\chi(x_s)\}$
  return $(T,f,\chi)$

Next we describe the petal-decomposition procedure (see Algorithm 2). Initially it sets $Y_0=X$, and for $j=1,2,\dots,s$ it carves out the petal $\tilde{X}_j$ from the graph induced on $Y_{j-1}$, and sets $Y_j=Y_{j-1}\setminus\underline{X}_j$, where $\underline{X}_j$ is a sub-petal of $\tilde{X}_j$, consisting of all the vertices that are padded by $\tilde{X}_j$. The idea is that $Y_j$ is defined w.r.t. a smaller set than the petal itself, thus by duplicating some vertices we will be able to guarantee that each vertex is padded somewhere. In order to control the radius increase, the first petal might be carved using different parameters (see [AN19] for details and explanation of this subtlety). The definition of petal guarantees that the radius $\Delta_{x_0}(Y_j)$ is non-increasing, and when at step $s$ it becomes at most $3\Delta/4$, define $X_0=Y_s$ and then the petal-decomposition routine ends. In carving of the petal $\tilde{X}_j\subseteq Y_{j-1}$, the algorithm chooses an arbitrary target $t_j\in Y_{j-1}$ (at distance at least $3\Delta/4$ from $x_0$) and a range $[\mathrm{lo},\mathrm{hi}]$ of size $\mathrm{hi}-\mathrm{lo}\in\{\Delta/8,\Delta/4\}$ which are provided to the sub-routine create-petal.

Both hierarchical-petal-decomposition and petal-decomposition are essentially the algorithms that appeared in [AN19]. The only technical difference is that in [AN19] $\tilde{X}_j=X_j$ for every $j$ (as they created an actual spanning tree, while we are constructing a clan embedding). The more important difference lies in the create-petal procedure, depicted in Algorithm 3. It carefully selects a radius $r\in[\mathrm{lo},\mathrm{hi}]$, which determines the petal $\tilde{X}_j$ together with a connecting edge $(x_j,y_j)\in E$, where $x_j\in\tilde{X}_j$ is the center of $\tilde{X}_j$ and $y_j\in Y_j$. It is important to note that the target $t_0\in X_0$ of the central cluster $X_0$ is determined during the creation of the first petal $X_1$. The petals are created using an alternative metric on the graph, known as the cone-metric.

(Footnote: One may notice that in line 15 of the petal-decomposition procedure, the weight of some edges is changed by a factor of 2. This can happen at most once for each copy of every edge throughout the hierarchical-petal-decomposition execution, thus it may affect the padding parameter by a factor of at most 2. This re-weighting is ignored here for simplicity. We again refer to [AN19] for details and further explanation.)
Algorithm 2: $(\{\underline{X}_j,X_j,\tilde{X}_j,x_j,t_j,\Delta_j\}_{j=0}^{s},\{(y_j,x_j)\}_{j=1}^{s})=$ petal-decomposition$(G[X],x_0,t,\Delta)$
  Let $Y_0=X$
  Set $j=1$
  if $d_X(x_0,t)\ge 5\Delta/8$ then
    Let $(\underline{X}_1,X_1,\tilde{X}_1)=$ create-petal$(G[Y_0],[d_X(x_0,t)-5\Delta/8,\ d_X(x_0,t)-\Delta/2],x_0,t)$
    $Y_1=Y_0\setminus\underline{X}_1$
    Let $\{x_1,y_1\}$ be the unique edge on the shortest path $P_{x_1t}$ from $x_0$ to $t$ in $Y_0$, where $x_1\in X_1$ and $y_1\in Y_1$
    Set $t_0=y_1$, $t_1=t$; $j=2$
  else
    set $t_0=t$
  while $Y_{j-1}\setminus B_X(x_0,3\Delta/4)\ne\emptyset$ do
    Let $t_j\in Y_{j-1}$ be an arbitrary vertex satisfying $d_X(x_0,t_j)>3\Delta/4$
    Let $(\underline{X}_j,X_j,\tilde{X}_j)=$ create-petal$(G[Y_{j-1}],[0,\Delta/8],x_0,t_j)$
    $Y_j=Y_{j-1}\setminus\underline{X}_j$
    Let $\{x_j,y_j\}$ be the unique edge on the shortest path $P_{x_jt_j}$ from $x_0$ to $t_j$ in $Y_{j-1}$, where $x_j\in\tilde{X}_j$ and $y_j\in Y_j$
    Consider $G_j=G[\tilde{X}_j]$, the graph induced by $\tilde{X}_j$. For each edge $e\in P_{x_jt_j}(\tilde{X}_j)$, set its weight to be $w(e)/2$
    Let $j=j+1$
  Let $s=j-1$
  Let $\underline{X}_0=X_0=\tilde{X}_0=Y_s$
  return $(\{\underline{X}_j,X_j,\tilde{X}_j,x_j,t_j,\Delta_{x_j}(\tilde{X}_j)\}_{j=0}^{s},\{(y_j,x_j)\}_{j=1}^{s})$

Definition 7 (Cone-metric). Given a graph $G=(V,E)$, a subset $X\subseteq V$ and points $x,y\in X$, define the cone-metric $\rho=\rho(X,x,y):X^2\to\mathbb{R}_+$ as $\rho(u,v)=|(d_X(x,u)-d_X(y,u))-(d_X(x,v)-d_X(y,v))|$.

The cone-metric is in fact a pseudo-metric, i.e., distances between distinct points are allowed to be 0. The ball $B_{(X,\rho)}(y,r)$ in the cone-metric $\rho=\rho(X,x,y)$ contains all vertices $u$ whose shortest path to $x$ is increased (additively) by at most $r$ if forced to go through $y$. In the create-petal algorithm, while working in a subgraph $G[Y]$ with two specified vertices, a center $x_0$ and a target $t$, we define

$$W_r(Y,x_0,t)=\bigcup_{p\in P_{x_0t}\,:\,d_Y(p,t)\le r}B_{(Y,\rho(Y,x_0,p))}\Big(p,\frac{r-d_Y(p,t)}{2}\Big),$$

which is a union of balls in the cone-metric, where any vertex $p$ on the shortest path from $x_0$ to $t$ at distance at most $r$ from $t$ is the center of a ball with radius $\frac{r-d_Y(p,t)}{2}$. See Figure 3 for illustration.
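The cone-metric ball and the petal $W_r$ can be sketched in a few lines. This is only an illustration under stated assumptions: the graph is connected and given as a dictionary of dictionaries, shortest paths are unique, the ball radius is taken to be $(r-d_Y(p,t))/2$, and the helper names `dijkstra`, `cone_ball` and `petal` are hypothetical (they do not come from [AN19] or from this paper).

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path distances and parent pointers from src (adj: {u: {v: weight}})."""
    dist = {src: 0.0}
    parent = {src: None}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def cone_ball(adj, x, y, r):
    """Ball of radius r around y in the cone-metric rho(X, x, y):
    all u whose shortest path to x grows by at most r when forced through y."""
    dx, _ = dijkstra(adj, x)
    dy, _ = dijkstra(adj, y)
    return {u for u in adj if dx[y] + dy[u] - dx[u] <= r + 1e-9}

def petal(adj, x0, t, r):
    """W_r(Y, x0, t): union of cone balls centred at the points p of the
    x0-t shortest path with d(p, t) <= r, each of radius (r - d(p, t)) / 2."""
    _, parent = dijkstra(adj, x0)
    path, p = [], t
    while p is not None:  # walk the shortest path from t back to x0
        path.append(p)
        p = parent[p]
    dt, _ = dijkstra(adj, t)
    result = set()
    for p in path:
        if dt[p] <= r:
            result |= cone_ball(adj, x0, p, (r - dt[p]) / 2)
    return result
```

On a unit-weight path 0-1-2-3-4 with an extra leaf 5 attached to vertex 3, center 0 and target 4, the petals grow from the target backwards along the path, and the leaf joins as soon as vertex 3 becomes a ball center.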
The parameters $(Y,x_0,t)$ are usually clear from context, and omitted. The following fact from [AN19] demonstrates that petals are similar to balls.

Fact 1 ([AN19]). For every $y\in W_r(Y,x_0,t)$ and $l\ge 0$, $B_{G[Y]}(y,l)\subseteq W_{r+4l}(Y,x_0,t)$.

Note that Fact 1 implies that $W_r$ is monotone in $r$, i.e., for $r\le r'$ it holds that $W_r\subseteq W_{r'}$. For each $j$, the clusters $\underline{X}_j,X_j,\tilde{X}_j$ returned by the create-petal procedure executed on $(G[Y_{j-1}],[\mathrm{lo},\mathrm{hi}],x_0,t_j)$ will all be petals of the form $W_r(Y_{j-1},x_0,t_j)$ for $r\in[\mathrm{lo},\mathrm{hi}]$. Specifically, we will choose some $r_1,r_2,r_3\in[\mathrm{lo},\mathrm{hi}]$ such that $\underline{X}_j=W_{r_1}(Y_{j-1},x_0,t_j)$, $X_j=W_{r_2}(Y_{j-1},x_0,t_j)$ and $\tilde{X}_j=W_{r_3}(Y_{j-1},x_0,t_j)$, while $r_3-r_2=r_2-r_1=\Theta\big(\frac{\mathrm{hi}-\mathrm{lo}}{k\log\log\mu(Y_{j-1})}\big)$.

Figure 3: On the left we illustrate the ball $B_{(X,\rho)}(t,r)$ in the cone-metric $\rho=\rho(X,x,t)$, containing all vertices $u$ whose shortest path to $x$ is increased (additively) by at most $r$ if forced to go through $t$. The red vertex $z$ joins $B_{(X,\rho)}(t,r)$ as $d_X(z,t)+d_X(t,x)\le d_X(z,x)+r$. The blue point $q$ on the path $P_{t,x}$ at distance $r/2$ from $t$ is the last point on $P_{t,x}$ to join $B_{(X,\rho)}(t,r)$. On the right we illustrate the petal $W_r(X,x,t)=\bigcup_{p\in P_{xt}\,:\,d_X(p,t)\le r}B_{(X,\rho(X,x,p))}\big(p,\frac{r-d_X(p,t)}{2}\big)$. In the illustration, each point $p_i$ on the path is the center of a ball of radius $\frac{r-d_X(p_i,t)}{2}$ in the respective cone metric.

The following facts were proven in [AN19] regarding the petal-decomposition procedure. They hold in our version of the algorithm using the exact same proofs.

Fact 2 ([AN19]). Consider the petal-decomposition procedure executed on $X$ with center $x_0$, target $t$ and radius $\Delta$. It creates clusters $(\underline{X}_0,X_0,\tilde{X}_0),(\underline{X}_1,X_1,\tilde{X}_1),\dots,(\underline{X}_s,X_s,\tilde{X}_s)$. During the process we had temporary metrics $Y_0=X$, and $Y_j=Y_{j-1}\setminus\underline{X}_j$. For $j\ge 1$ the cluster $\tilde{X}_j$ had center $x_j$ connected to $y_j\in Y_j$ and target $t_j\in\tilde{X}_j$. Throughout the execution the following hold:
1. For every $j$ and $z\in Y_j$, $P_{z,x_0}(X)\subseteq G[Y_j]$. In particular, the radius of the $Y_j$'s is monotonically non-increasing: $\Delta_{x_0}(Y_0)\ge\Delta_{x_0}(Y_1)\ge\dots\ge\Delta_{x_0}(Y_s)$. In particular $X_0$ is a connected cluster with radius at most $3\Delta/4$.
2. For each $j\ge 1$, $\tilde{X}_j$ is a connected cluster with center $x_j$ and target $t_j$ such that $\Delta_{x_j}(\tilde{X}_j)\le 3\Delta/4$. In particular the entire shortest path from $x_j$ to $t_j$ (in $Y_{j-1}$) is in $\tilde{X}_j$.
3. If a special first cluster is created, then $y_1\in X_0$ and $P_{x_0,t}(X)\subseteq G[X_0\cup X_1]$. If no special first cluster is created then $P_{x_0,t}(X)\subseteq G[X_0]$.

Next we cite the relevant properties regarding the hierarchical-petal-decomposition procedure. The proofs follow almost the same lines as [AN19], with slight and natural adaptations due to the embedding being a clan embedding with duplicate copies for some vertices. In any case, no new ideas are required and we will skip the proof.

Fact 3 ([AN19]). Consider the hierarchical-petal-decomposition procedure executed on $X$ with center $x_0$, target $t$ and radius $\Delta$. The following hold:
1. The algorithm returns a spanning clan embedding into a tree $T$.
2. The tree $T$ has radius at most $4\Delta_{x_0}(X)$. That is, $\Delta_{x_0}(T)\le 4\Delta_{x_0}(X)$.

Note that it follows from Fact 3 that the distance between every pair of vertices in the tree $T$ is at most $8\Delta_{x_0}(X)$.

We will need the following observation. Roughly speaking, it says that when the petal-decomposition algorithm is carving out $(\underline{X}_{j+1},X_{j+1},\tilde{X}_{j+1})$, it is oblivious to the past petals, edges and targets: it only cares about $Y_j$ and the original diameter $\Delta$.

Observation 1.

Assume that petal-decomposition on input $(G[X],x_0,t,\Delta_{x_0}(X))$ returns as output $(\{\underline{X}_j,X_j,\tilde{X}_j,x_j,t_j,\Delta_j\}_{j\in\{0,\dots,s\}},\{(y_j,x_j)\}_{j\in\{1,\dots,s\}})$. Then running petal-decomposition on input $(G[Y_l],x_0,t_0,\Delta_{x_0}(X))$ will output $(\{\underline{X}_j,X_j,\tilde{X}_j,x_j,t_j,\Delta_j\}_{j\in\{0,l+1,\dots,s\}},\{(y_j,x_j)\}_{j\in\{l+1,\dots,s\}})$.

Fix some $1\le j\le s$, and consider carving the petal $(\underline{X}_j,X_j,\tilde{X}_j)$ from the graph induced on $Y=Y_{j-1}$. Our choice of radius bears similarities to the one in [ACE+20]. The petal-decomposition algorithm provides an interval $[\mathrm{lo},\mathrm{hi}]$ of size at least $\Delta/8$, and for each $r\in[\mathrm{lo},\mathrm{hi}]$ let $W_r(Y,x_0,t)\subseteq Y$ denote the petal of radius $r$ (usually we will omit $(Y,x_0,t)$). Our algorithm will return three clusters $\underline{X}_j\subseteq X_j\subseteq\tilde{X}_j$, which will correspond to three petals $W_{r-\frac{R}{4Lk}}\subseteq W_r\subseteq W_{r+\frac{R}{4Lk}}$ respectively, where $\frac{R}{4Lk}=\Theta\big(\frac{\mathrm{hi}-\mathrm{lo}}{k\log\log\mu(Y)}\big)=\Theta\big(\frac{\Delta}{k\log\log\mu(Y)}\big)$. The algorithm will be executed recursively on $\tilde{X}_j$, while $\underline{X}_j$ will be removed from $Y$. The cluster $X_j$ will only be used in order to define $\chi$ (during the hierarchical-petal-decomposition procedure). Fact 1 implies that the vertices in $X_j$ are padded by $\tilde{X}_j$, while the vertices in $Y\setminus X_j$ are padded by $Y\setminus\underline{X}_j$. If a pair of vertices $u,v$ do not belong to the same cluster (e.g. $u\in X_j$ and $v\notin\tilde{X}_j$) then $d_Y(u,v)=\Omega\big(\frac{\Delta}{k\log\log\mu(Y)}\big)$. By Fact 3, the diameter of the final tree will be $O(\Delta)$. In particular, the distance in the embedded tree between every copy of $u$ and $v$ will be bounded by $O(\Delta)=O(k\log\log\mu(Y))\cdot d_Y(u,v)$. Note that only the vertices in $\tilde{X}_j\setminus\underline{X}_j$ are duplicated. Our goal, thus, is to choose a radius $r$ such that the measure of the duplicated vertices is small.

Our algorithm to select a radius is based on region growing techniques as in [ACE+20] [...] ($t_j$ must be at distance [...] from $x_0$), we first choose an appropriate range that mimics that choice (see line 5 in Algorithm 3); this is the reason for the extra factor of $\log\log\mu(Y)$. The basic idea in region growing is to charge the measure of the duplicated vertices (i.e. $\tilde{X}_j\setminus\underline{X}_j$) to all the vertices in the cluster $\tilde{X}_j$. In order to avoid a range in $[\mathrm{lo},\mathrm{hi}]$ that contains more than half of the measure, we will cut either in $[\mathrm{lo},\mathrm{mid}]$ or in $[\mathrm{mid},\mathrm{hi}]$, where $\mathrm{mid}=(\mathrm{hi}+\mathrm{lo})/2$. Specifically, in the case where $W_{\mathrm{mid}}$ has measure at least $\mu(Y)/2$, we "cut backwards" in the regime $[\mathrm{mid},\mathrm{hi}]$, and charge the duplicated vertices to the remaining graph $Y_j$, rather than to $\tilde{X}_j$.

Algorithm 3: $(\underline{X},X,\tilde{X})=$ create-petal$(G[Y],\mu,[\mathrm{lo},\mathrm{hi}],x_0,t)$
  $L=\lceil 1+\log\log\mu(Y)\rceil$
  $R=\mathrm{hi}-\mathrm{lo}$; $\mathrm{mid}=(\mathrm{lo}+\mathrm{hi})/2=\mathrm{lo}+R/2$
  For every $r$, denote $W_r=W_r(Y,x_0,t)$, $w_r=\mu(W_r)$
  if $w_{\mathrm{mid}}\le\mu(Y)/2$ then
    Choose $[a,b]\subseteq[\mathrm{lo},\mathrm{mid}]$ such that $b-a=\frac{R}{2L}$ and $w_a\ge w_b^2/\mu(Y)$   // see Lemma 7
    Pick $r\in[a+\frac{b-a}{2k},\ b-\frac{b-a}{2k}]$ such that $w_{r+\frac{b-a}{2k}}\le w_{r-\frac{b-a}{2k}}\cdot\big(\frac{w_b}{w_a}\big)^{1/k}$   // see Lemma 8
  else
    For every $r\in[\mathrm{lo},\mathrm{hi}]$, denote $q_r=\mu(Y\setminus W_r)$
    Choose $[b,a]\subseteq[\mathrm{mid},\mathrm{hi}]$ such that $a-b=\frac{R}{2L}$ and $q_a\ge q_b^2/\mu(Y)$   // see Lemma 9
    Pick $r\in[b+\frac{a-b}{2k},\ a-\frac{a-b}{2k}]$ such that $q_{r-\frac{a-b}{2k}}\le q_{r+\frac{a-b}{2k}}\cdot\big(\frac{q_b}{q_a}\big)^{1/k}$   // see Lemma 10
  return $(W_{r-\frac{R}{4Lk}},\ W_r,\ W_{r+\frac{R}{4Lk}})$

Let $u,v\in V$ be a pair of vertices, and let $(f,\chi)$ be the spanning clan embedding into a tree $T$ returned by calling hierarchical-petal-decomposition on $(G[V],z,z,\Delta_z(V))$ for arbitrary $z\in V$.

Lemma 5.

The clan embedding ( f, χ ) has distortion O ( ρ ) = O ( k log log µ ( X )) .Proof. The proof is by induction on the radius ∆ of the graph (w.r.t. the center). The basiccase is where the graph is a singleton and ∆ = u, v . Let ({ X j , X j , ̃ X j , x j , t j , ∆ j } sj = , {( y j , x j )} sj = ) be the output of the call to the petal-decomposition procedure on X, x . For each j ≥

0, let Y j − be the graph held duringthe j ’th stage of the algorithm. Note that Y s = X . Then we created the petals ( X j , X j , ̃ X j ) =( W r j − R Lk , W r j , W r j + R Lk ) , and Y j + = Y j / X j + , where L = ⌈ + log log µ ( Y j )⌉ , and R ≥ ∆8 . Set ρ = ⌈ + log log µ ( V )⌉ ⋅ k = O ( k log log µ ( V )) . Note that for every execution of the create-petal procedure at this stage it holds that ∆ ρ ≤ ⋅ R Lk .First consider the case where d G ( u, v ) ≥ ∆ ρ . By Fact 3 the distance between any pair of verticesin T is O ( ∆ ) . In particularmin v ′ ∈ f ( v ) d T ( v ′ , χ ( u )) ≤ O ( ∆ ) = O ( ρ ) ⋅ d G ( u, v ) . Else, it holds that d G ( u, v ) < ∆ ρ , set B = B X ( u, ∆ ρ ) . For ease of notation set X s + = X s + =̃ X s + = X = Y s . Let j u ∈ [ , s + ] be the minimal index such that u ∈ X j . We argue that B ⊆ Y j u − .Assume for contradiction otherwise, and let j ∈ [ , j u − ] be the minimal index such that B ⊈ Y j .Thus there is a vertex u ′ ∈ B ∩ X j ⊆ W r j − R Lk , while by the minimality of j it holds that B ⊆ Y j − .Using Fact 1 it follows that u ∈ B Y j − ( u ′ , ∆ ρ ) ⊆ W r j − R Lk + ⋅ ∆ ρ ⊆ W r j = X j , a contradiction to the minimality of j u . 23ext we argue that B ⊆ ̃ X j u . If j u = s +

1, then we have B ⊆ Y s = X = ̃ X s + and done. Otherwise,as u ∈ X j u = W r ju , using Fact 1 again we obtain B = B X ( u, ∆ ρ ) = B Y ju − ( u, ∆ ρ ) ⊆ W r ju + ⋅ ∆ ρ ⊆ W r ju + R Lk = ̃ X j u . Following the hierarchical-petal-decomposition algorithm we create a clan embedding ( f j u , χ j u ) of ̃ X j u into a tree T j u . The tree T j u is incorporated into a global tree T , where f ( u ) = ∪ j f j ( u ) , f ( v ) = ∪ j f j ( v ) , and χ ( u ) = χ j u ( u ) by the deﬁnition of j u . As d G ( u, v ) < ∆ ρ ,it holds that v ∈ B . In particular, the shortest path from v to u in G belongs to B , thus d G [ X ju ] ( u, v ) = d G ( u, v ) . By Fact 2 the radius of X j u is at most ∆, hence using the inductionhypothesis we conclude:min v ′ ∈ f ( v ) d T ( v ′ , χ ( u )) ≤ min v ′ ∈ f ju ( v ) d T ju ( v ′ , χ j u ( u )) = O ( ρ ) ⋅ d G [ X ju ] ( u, v ) = O ( ρ ) ⋅ d G ( u, v ) . Lemma 6. E v ∼ µ [∣ f ( v )∣] ≤ µ ( V ) + / k .Proof. We prove by induction on ∣ X ∣ and ∆ that the one-to-many embedding f constructedusing the Hierarchical-petal-decomposition algorithm w.r.t. any (≥ ) -measure µ fulﬁlls E v ∼ µ [∣ f ( v )∣] ≤ µ ( X ) + / k . The base case where X is a singleton is trivial. For the inductivestep, assume we call petal-decomposition on ( G [ X ] , x , t, ∆ ) with ∆ ≥ ∆ x ( X ) and measure µ .Assume that the petal-decomposition algorithm does a non-trivial clustering of X to ̃ X , ̃ X , . . . , ̃ X s (if it is the case that all vertices are suﬃciently close to x , then no petals will be cre-ated, and the hierarchical-petal-decomposition will simply recurse on ( G [ X ] , x , t, ∆ x ( X )) ,so we can ignore this case). Let ̃ X = W r + R Lk be the ﬁrst petal created by the petal-decomposition algorithm, and Y = X / X , where X = W r − R Lk . 
Denote by $\mu_{\tilde{X}_j}$ the measure $\mu$ restricted to $\tilde{X}_j$, and by $f_{\tilde{X}_j}$ the one-to-many embedding our algorithm constructs for $\tilde{X}_j$. By Observation 1, we can consider the remaining execution of petal-decomposition on $Y_1$ as a new recursive call of petal-decomposition with input $(G[Y_1],x_0,t_0,\Delta)$. In particular, the recursive calls on $\tilde{X}_0,\tilde{X}_2,\dots,\tilde{X}_s$ are completely independent from $\tilde{X}_1$. Denote $f_{Y_1}=\cup_{j=0,2,\dots,s}f_{\tilde{X}_j}$, and by $\mu_{Y_1}$ the measure $\mu$ restricted to $Y_1$. Since $|\tilde{X}_1|,|Y_1|<|X|$, the induction hypothesis implies that $\mathbb{E}_{v\sim\mu_{\tilde{X}_1}}[|f_{\tilde{X}_1}(v)|]\le\mu_{\tilde{X}_1}(\tilde{X}_1)^{1+\frac1k}=\mu(\tilde{X}_1)^{1+\frac1k}$ and $\mathbb{E}_{v\sim\mu_{Y_1}}[|f_{Y_1}(v)|]\le\mu_{Y_1}(Y_1)^{1+\frac1k}=\mu(Y_1)^{1+\frac1k}$. Note that by our construction,

$$\mathbb{E}_{v\sim\mu}[|f(v)|]=\sum_{j=0}^{s}\mathbb{E}_{v\sim\mu_{\tilde{X}_j}}[|f_j(v)|]=\mathbb{E}_{v\sim\mu_{\tilde{X}_1}}[|f_{\tilde{X}_1}(v)|]+\mathbb{E}_{v\sim\mu_{Y_1}}[|f_{Y_1}(v)|].$$

The rest of the proof is by case analysis according to the choice of radius in Algorithm 3. Recall that $w_{r'}=\mu(W_{r'})$ and $q_{r'}=\mu(Y\setminus W_{r'})=\mu(X\setminus W_{r'})$ for every parameter $r'$.

1. Case 1: $w_{\mathrm{mid}}\le\mu(X)/2$. In this case we pick $a,b\in[\mathrm{lo},\mathrm{hi}]$ where $b-a=R/(2L)$, and $r\in[a+\frac{b-a}{2k},b-\frac{b-a}{2k}]$ such that $w_a\ge w_b^2/\mu(X)$ and $w_{r+\frac{b-a}{2k}}\le w_{r-\frac{b-a}{2k}}\cdot\big(\frac{w_b}{w_a}\big)^{1/k}$. Here $\tilde{X}_1=W_{r+\frac{b-a}{2k}}$, while $Y_1=X\setminus\underline{X}_1=X\setminus W_{r-\frac{b-a}{2k}}$. Using these two inequalities we have that

$$\mu(\tilde{X}_1)^{1+\frac1k}=w_{r+\frac{b-a}{2k}}\cdot w_{r+\frac{b-a}{2k}}^{\frac1k}\le w_{r-\frac{b-a}{2k}}\cdot\Big(\frac{w_b}{w_a}\Big)^{\frac1k}\cdot w_{r+\frac{b-a}{2k}}^{\frac1k}\le w_{r-\frac{b-a}{2k}}\cdot\Big(\frac{\mu(X)}{w_b}\Big)^{\frac1k}\cdot w_{r+\frac{b-a}{2k}}^{\frac1k}\le w_{r-\frac{b-a}{2k}}\cdot\mu(X)^{\frac1k},$$

where we used the fact that $r+\frac{b-a}{2k}\le b$ (and that $w_r$ is monotone). Using the induction hypothesis we conclude

$$\mathbb{E}_{x\sim\mu}[|f(x)|]=\mathbb{E}_{x\sim\mu_{\tilde{X}_1}}[|f_{\tilde{X}_1}(x)|]+\mathbb{E}_{x\sim\mu_{Y_1}}[|f_{Y_1}(x)|]\le\mu(\tilde{X}_1)^{1+\frac1k}+\mu(Y_1)^{1+\frac1k}\le w_{r-\frac{b-a}{2k}}\cdot\mu(X)^{\frac1k}+\mu(Y_1)\cdot\mu(X)^{\frac1k}=\big(\mu(W_{r-\frac{b-a}{2k}})+\mu(X\setminus W_{r-\frac{b-a}{2k}})\big)\cdot\mu(X)^{\frac1k}=\mu(X)^{1+\frac1k},$$

where the second inequality follows as $\mu(Y_1)\le\mu(X)$.

2. Case 2: $w_{\mathrm{mid}}>\mu(X)/2$. This case is completely symmetric. Denoting $q_r=\mu(X\setminus W_r)$, we picked $b,a\in[\mathrm{lo},\mathrm{hi}]$ so that $a-b=R/(2L)$, and $r\in[b+\frac{a-b}{2k},a-\frac{a-b}{2k}]$ such that $q_a\ge q_b^2/\mu(X)$ and $q_{r-\frac{a-b}{2k}}\le q_{r+\frac{a-b}{2k}}\cdot\big(\frac{q_b}{q_a}\big)^{1/k}$. Here $\tilde{X}_1=W_{r+\frac{a-b}{2k}}$, while $Y_1=X\setminus W_{r-\frac{a-b}{2k}}$. Note that $\mu(Y_1)=q_{r-\frac{a-b}{2k}}$ while $\mu(\tilde{X}_1)=\mu(X)-q_{r+\frac{a-b}{2k}}$. Using these two inequalities we have that

$$\mu(Y_1)^{1+\frac1k}=q_{r-\frac{a-b}{2k}}\cdot q_{r-\frac{a-b}{2k}}^{\frac1k}\le q_{r+\frac{a-b}{2k}}\cdot\Big(\frac{q_b}{q_a}\Big)^{\frac1k}\cdot q_{r-\frac{a-b}{2k}}^{\frac1k}\le q_{r+\frac{a-b}{2k}}\cdot\Big(\frac{\mu(X)}{q_b}\Big)^{\frac1k}\cdot q_{r-\frac{a-b}{2k}}^{\frac1k}\le q_{r+\frac{a-b}{2k}}\cdot\mu(X)^{\frac1k},$$

where we used the fact that $b\le r-\frac{a-b}{2k}$ (so that $q_{r-\frac{a-b}{2k}}\le q_b$). Following previous calculations, we conclude

$$\mathbb{E}_{x\sim\mu}[|f(x)|]\le\mu(\tilde{X}_1)^{1+\frac1k}+\mu(Y_1)^{1+\frac1k}\le\mu(\tilde{X}_1)\cdot\mu(X)^{\frac1k}+q_{r+\frac{a-b}{2k}}\cdot\mu(X)^{\frac1k}=\big(\mu(W_{r+\frac{a-b}{2k}})+\mu(X\setminus W_{r+\frac{a-b}{2k}})\big)\cdot\mu(X)^{\frac1k}=\mu(X)^{1+\frac1k}.$$

Lemma 4 follows by the combination of Lemma 5 and Lemma 6.

create-petal procedure (Algorithm 3)

In this section we prove that the choices made in the create-petal procedure are all legal. In all the lemmas that follow, we shall use the notation of Algorithm 3.

Lemma 7. If $w_{\mathrm{mid}}\le\mu(Y)/2$ then there is $[a,b]\subseteq[\mathrm{lo},\mathrm{mid}]$ such that $b-a=\frac{R}{2L}$ and $w_a\ge w_b^2/\mu(Y)$.

Proof. Seeking contradiction, assume that for every such $a,b$ with $b-a=\frac{R}{2L}$ it holds that $w_b>\sqrt{\mu(Y)\cdot w_a}$. Applying this on $b=\mathrm{mid}-\frac{iR}{2L}$ and $a=\mathrm{mid}-\frac{(i+1)R}{2L}$ for every $i=0,1,\dots,L-2$, we have that

$$w_{\mathrm{mid}}>\mu(Y)^{1/2}\cdot w_{\mathrm{mid}-\frac{R}{2L}}^{1/2}>\dots>\mu(Y)^{1-2^{-(L-1)}}\cdot w_{\mathrm{mid}-\frac{(L-1)R}{2L}}^{2^{-(L-1)}}\ge\mu(Y)\cdot\frac12\cdot w_{\mathrm{lo}}^{2^{-(L-1)}}\ge\frac{\mu(Y)}{2},$$

where we used that $\log\log\mu(Y)\le L-1$ and $\mathrm{mid}=\mathrm{lo}+R/2$. In the last inequality we also used that $W_{\mathrm{lo}}$ contains at least one vertex, thus $w_{\mathrm{lo}}\ge 1$. The contradiction follows.
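The two choices in the first branch of Algorithm 3 (an interval $[a,b]$, then a radius $r$) can be found by scanning the same grids that the contradiction arguments use. The sketch below is an illustration only, not the paper's implementation: the petal measure is assumed to be given as a monotone non-decreasing callable `w` with `w(lo) >= 1`, the total measure `mu_Y` is assumed to be at least 2, logarithms are taken base 2, and the function names `choose_interval` and `choose_radius` are hypothetical.

```python
import math

def choose_interval(w, lo, hi, mu_Y):
    """Scan grid intervals of length R/(2L) inside [lo, mid], from mid downwards,
    and return the first [a, b] with w(a) >= w(b)^2 / mu_Y.
    Assumes w is monotone, w(lo) >= 1, w(mid) <= mu_Y / 2 and mu_Y >= 2."""
    L = math.ceil(1 + math.log2(math.log2(mu_Y)))
    R = hi - lo
    mid = lo + R / 2
    step = R / (2 * L)
    for i in range(L):
        b = mid - i * step
        a = b - step
        if w(a) >= w(b) ** 2 / mu_Y:
            return a, b
    raise ValueError("no interval found; check the assumptions on w and mu_Y")

def choose_radius(w, a, b, k):
    """Scan the k midpoints r = b - (i + 1/2)(b - a)/k and return the first r with
    w(r + h) <= w(r - h) * (w(b)/w(a))^(1/k), where h = (b - a)/(2k)."""
    h = (b - a) / (2 * k)
    ratio = (w(b) / w(a)) ** (1.0 / k)
    for i in range(k):
        r = b - (2 * i + 1) * h
        # small relative slack guards against floating-point rounding
        if w(r + h) <= w(r - h) * ratio * (1 + 1e-12):
            return r
    raise ValueError("no radius found; w must be monotone and positive on [a, b]")
```

The second branch of Algorithm 3 is symmetric: the same scans can be run on `q(r) = mu_Y - w(r)` over $[\mathrm{mid},\mathrm{hi}]$ with the roles of the endpoints swapped.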

Lemma 8. There is $r\in[a+\frac{b-a}{2k},b-\frac{b-a}{2k}]$ such that $w_{r+\frac{b-a}{2k}}\le w_{r-\frac{b-a}{2k}}\cdot\big(\frac{w_b}{w_a}\big)^{1/k}$.

Proof. Seeking contradiction, assume there is no such choice of $r$. Then applying this for $r=b-(i+\frac12)\cdot\frac{b-a}{k}$ for $i=0,1,\dots,k-1$ we obtain

$$w_b>w_{b-\frac{b-a}{k}}\cdot\Big(\frac{w_b}{w_a}\Big)^{1/k}>\dots>w_{b-k\cdot\frac{b-a}{k}}\cdot\Big(\frac{w_b}{w_a}\Big)^{k/k}=w_a\cdot\frac{w_b}{w_a}=w_b,$$

a contradiction.

The following two lemmas are symmetric to the two lemmas above.

Lemma 9. If $w_{\mathrm{mid}}>\mu(Y)/2$ (which implies $q_{\mathrm{mid}}\le\mu(Y)/2$), then there is $[b,a]\subseteq[\mathrm{mid},\mathrm{hi}]$ such that $a-b=\frac{R}{2L}$ and $q_a\ge q_b^2/\mu(Y)$.

Lemma 10. There is $r\in[b+\frac{a-b}{2k},a-\frac{a-b}{2k}]$ such that $q_{r-\frac{a-b}{2k}}\le q_{r+\frac{a-b}{2k}}\cdot\big(\frac{q_b}{q_a}\big)^{1/k}$.

The proof of Theorem 3 using Lemma 4 follows the same lines as the proof of Theorem 1 from Lemma 2. First we transform the language of $(\ge 1)$-measure to that of probability measure.

Lemma 11. Given an $n$-point weighted graph $G=(V,E,w)$ and a probability measure $\mu:V\to\mathbb{R}_{\ge 0}$, we can construct the two following spanning clan embeddings $(f,\chi)$ into a tree:
1. For integer $k\ge 1$, multiplicative distortion $O(k\log\log n)$ such that $\mathbb{E}_{x\sim\mu}[|f(x)|]\le O(n^{1/k})$.
2. For $\epsilon\in(0,1]$, multiplicative distortion $O(\frac{\log n\log\log n}{\epsilon})$ such that $\mathbb{E}_{x\sim\mu}[|f(x)|]\le 1+\epsilon$.

The proof of Lemma 11 is identical to that of Lemma 3 and we will skip it. The only subtlety to note is that the $(\ge 1)$-measure $\tilde{\mu}_{\ge 1}$ constructed during the proof of Lemma 3 fulfills $\tilde{\mu}_{\ge 1}(V)=n$, and thus the multiplicative distortion guarantee from Lemma 4 will be $O(k\log\log n)$. Theorem 3 now follows from the minimax theorem (in the exact same way as the proof of Theorem 1).

Lower Bound for Clan Embeddings into Trees

This section is devoted to proving Theorem 2. We restate it for convenience.

Theorem 2 (Lower bound for clan embedding into a tree). For every fixed $\epsilon\in(0,1)$ and large enough $n$, there is an $n$-point metric space $(X,d_X)$ such that for every clan embedding $(f,\chi)$ of $X$ into a tree with multiplicative distortion $O(\frac{\log n}{\epsilon})$ it holds that $\sum_{x\in X}|f(x)|\ge(1+\epsilon)n$. Further, for every $k\in\mathbb{N}$, there is an $n$-point metric space $(X,d_X)$ such that for every clan embedding $(f,\chi)$ of $X$ into a tree with multiplicative distortion $O(k)$ it holds that $\sum_{x\in X}|f(x)|\ge\Omega(n^{1+\frac1k})$.

The girth of an unweighted graph $G$ is the length of the shortest cycle in $G$. The Erdős girth conjecture states that for any $g$ and $n$, there exists an $n$-vertex graph with girth $g$ and $\Omega(n^{1+\frac{2}{g-2}})$ edges. The conjecture is known to hold for $g=4,6,8,12$ (see [Ben66, Wen91]). However, the best known provable lower bound for general $g$ is due to Lazebnik et al. [LUW95].

Theorem 10 ([LUW95]). For every even $g$, and $n$, there exists an unweighted graph with girth $g$ and $\Omega(n^{1+\frac43\cdot\frac{1}{g-2}})$ edges.

From the upper bound perspective, the (generalized) Moore's bound [AHL02, BR10] states that every $n$-vertex graph with girth $g$ has at most $n^{1+\frac{2}{g-2}}$ edges for $g\le n$, and for larger $g$ at most $n\big(1+(1+o(1))\frac{\ln(m-n+1)}{g}\big)$ edges (where $m$ is the number of edges).

We will be able to use Theorem 10 to prove the second assertion in Theorem 2, namely that a clan embedding into a tree with distortion $O(k)$ requires that $\sum_{x\in X}|f(x)|\ge\Omega(n^{1+\frac1k})$. However, the first assertion requires a much stricter lower bound of $(1+\epsilon)n$ on the number of edges. Therefore the asymptotic nature of Theorem 10 is unfortunately not strong enough for our needs. We begin by showing that for large enough $n$ and $\epsilon\in(0,1)$, there exists an $n$-vertex graph with $(1+\epsilon)n$ edges and girth $\Omega(\frac{\log n}{\epsilon})$. We are not aware of this very basic fact previously appearing in the literature. Note that Lemma 12 matches Moore's upper bound (up to a constant dependency on the girth $g$).

Lemma 12. For every fixed $\epsilon\in(0,1)$, and large enough $n$, there exists a graph with at least $(1+\epsilon)n$ edges, and girth $\Omega(\frac{\log n}{\epsilon})$.

Remark 1 (Ultra sparse spanners). Given a graph $G=(V,E,w)$, a spanner is a subgraph $H$ of $G$. The stretch of the spanner is the minimal $t$ such that for every pair of vertices $u,v\in V$, $d_H(u,v)\le t\cdot d_G(u,v)$. For every fixed $\epsilon\in(0,1)$, Elkin and Neiman [EN19] constructed ultra-sparse spanners with $(1+\epsilon)n$ edges and stretch $O(\frac{\log n}{\epsilon})$. Even though they noted that the sparsity of their spanner matches the Moore bound, it actually remained open whether one can construct better spanners. As the only $(g-2)$-spanner of a graph with girth $g$ is the graph itself, Lemma 12 implies that the ultra sparse spanner from [EN19] is tight (up to a constant in the stretch).

The first step is to replace the asymptotic notation in the lower bound on the number of edges from Theorem 10 with an explicit bound, for the case of girth $\Omega(\log n)$.

Claim 2.

For every n ∈ N , there exist an n -vertex graph with n edges, and girth Ω ( log n ) .Proof. Set p = n ( n ) = n − . Consider a graph G = ( V, E ) sampled according to G ( n, p ) (that is eachedge sampled to G i.i.d. with probability p ). It holds that E [∣ E ∣] = ( n ) ⋅ p = n . By Chernoﬀ bound,Pr [∣ E ∣ < n ] ≤ e − E [ E ] = e − n . t ≥

3, denote by C t the set of cycles of length exactly t . Then, E [∣ C t ∣] ≤ n ( n − )⋯( n − t + ) ⋅ p t = n ( n − )⋯( n − t + )( n − ) t ⋅ t < t . Denote by C the set of all cycles of length smaller than log n . Then E [∣C∣] = log n − ∑ t = E [∣ C t ∣] ≤ log n − ∑ t = t < log n = n . By Markov inequality, Pr [∣C∣ ≥ n ] ≤ E [∣C∣] n < n − < . By union bound, there exist a graph G withat least 3 n edges, and at most n cycles of length less than log n . Let G ′ be the graph obtainedby deleting an arbitrary single edge from each cycle. Continue deleting edges until G ′ has exactly2 n edges. We conclude that G ′ has 2 n edges and girth at least log n as required. Proof of Lemma 12.

Fix $\delta=\frac{1-\epsilon}{2\epsilon}$. Set $n'=\epsilon n=\frac{n}{1+2\delta}$. We ignore issues of integrality during the proof. Such issues could be easily fixed as we don't state an explicit bound on the girth. Using Claim 2, construct a graph $G'$ with $n'$ vertices, $2n'$ edges, and girth $\Omega(\log n')$. Let $G$ be the graph obtained from $G'$ by replacing each edge by a path of length $\delta+1$. Then:

$$|V(G)|=|V(G')|+\delta\cdot|E(G')|=n'+\delta\cdot 2n'=n'(1+2\delta)=n,$$
$$|E(G)|=(\delta+1)\cdot|E(G')|=(\delta+1)\cdot 2n'=n\cdot\frac{2(1+\delta)}{1+2\delta}=(1+\epsilon)n,$$

where the last equality follows by the definition of $\delta$. Note that the girth of $G$ is at least $\Omega((1+\delta)\log n')=\Omega(\frac{\log\epsilon n}{\epsilon})=\Omega(\frac{\log n}{\epsilon})$, for $n$ large enough.

The Euler characteristic of a graph $G$ is defined as $\chi(G):=|E(G)|-|V(G)|+1$. Our lower bound is based on the following theorem by Rabinovich and Raz [RR98].

Theorem 11 ([RR98]). Consider an unweighted graph $G$ with girth $g$, and consider a (classic) embedding $f:G\to H$ of $G$ into a weighted graph $H$, such that $\chi(H)<\chi(G)$. Then $f$ has multiplicative distortion at least $\frac{g}{4}-\frac{3}{2}$.

Next, we transfer the language of classic embeddings into graphs used in Theorem 11 to that of clan embeddings into trees.

Lemma 13. Consider an unweighted, $n$-vertex graph $G=(V,E)$ with girth $g$, and let $(f,\chi)$ be a clan embedding of $G$ into a tree $T$ with multiplicative distortion $t<\frac{g}{4}-\frac{3}{2}$. Then necessarily $\sum_{v\in V}|f(v)|\ge n+\chi(G)$.

Proof. Let $H$ be the graph obtained from $T$ by contracting all the copies of each vertex. Specifically, arbitrarily order the vertices in $V$: $v_1,v_2,\dots,v_n$. Iteratively construct a series of graphs $H_0=T,H_1,H_2,\dots,H_n$ with one-to-many embeddings $f_i:G\to H_i$. In the $i$'th iteration we create $H_i,f_i$ out of $H_{i-1},f_{i-1}$ by replacing all the vertices in $f_{i-1}(v_i)$ by a single vertex $\tilde{v}_i$. For a vertex $u\in H_{i-1}$, we add an edge from $u$ to $\tilde{v}_i$ if there was an edge from $u$ to some vertex in $f_{i-1}(v_i)$. In case we add the edge $\{u,\tilde{v}_i\}$, its weight is defined to be $\min_{v'\in f_{i-1}(v_i)}w_{H_{i-1}}(v',u)$. Set $H=H_n$ and $\tilde{f}=f_n$. Clearly distances in $H$ can only decrease compared to $T$. This is as for every $u,v\in V$, $d_H(\tilde{u},\tilde{v})\le\min_{u'\in f(u),v'\in f(v)}d_T(u',v')\le\min_{u'\in f(u)}d_T(u',\chi(v))\le t\cdot d_G(u,v)$. On the other hand, by induction (and the triangle inequality), as $f$ was a dominating embedding, one can show that so is $\tilde{f}$. That is, $\forall u,v\in V$, $d_H(\tilde{u},\tilde{v})\ge d_G(u,v)$.

We conclude that $\tilde{f}$ is a classic embedding of $G$ with multiplicative distortion at most $t<\frac{g}{4}-\frac{3}{2}$. By Theorem 11, it follows that $\chi(H)\ge\chi(G)$. For every $i$, it holds that

$$\chi(H_i)=|E(H_i)|-|V(H_i)|+1\le|E(H_{i-1})|-(|V(H_{i-1})|-|f(v_i)|+1)+1=\chi(H_{i-1})+|f(v_i)|-1.$$

As $T$ is a tree, $\chi(T)=0$, and hence

$$\chi(G)\le\chi(H)=\chi(H_n)\le\sum_i(|f(v_i)|-1)+\chi(T)=\sum_{v\in V}|f(v)|-n.$$

We are now ready to prove Theorem 2.
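The counting behind Lemma 12 and Lemma 13 can be checked on a toy instance. The sketch below is an illustration under stated assumptions: graphs are simple, unweighted and connected, $\delta$ is a non-negative integer, and the helper names `to_adj`, `subdivide`, `girth` and `euler_characteristic` are hypothetical. It subdivides every edge into a path of length $\delta+1$ (the step used in the proof of Lemma 12) and evaluates $\chi(G)=|E(G)|-|V(G)|+1$, which subdivision leaves unchanged while the girth is multiplied by $\delta+1$.

```python
from collections import deque
from itertools import count

def to_adj(edges):
    """Adjacency sets of a simple undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def subdivide(edges, delta):
    """Replace every edge by a path with delta new internal vertices (length delta + 1)."""
    fresh = count()
    new_edges = []
    for u, v in edges:
        chain = [u] + [("sub", next(fresh)) for _ in range(delta)] + [v]
        new_edges.extend(zip(chain, chain[1:]))
    return new_edges

def girth(adj):
    """Length of the shortest cycle, via a BFS from every vertex (inf for a forest)."""
    best = float("inf")
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # non-tree edge closes a cycle through the BFS tree
                    best = min(best, dist[u] + dist[v] + 1)
    return best

def euler_characteristic(adj):
    """chi(G) = |E| - |V| + 1."""
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return m - len(adj) + 1
```

For example, subdividing $K_4$ with $\delta=2$ gives 16 vertices, 18 edges, girth 9 and Euler characteristic 3, so the quantity $n+\chi(G)$ appearing in Lemma 13 equals $|E(G)|+1=19$.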

Proof of Theorem 2.

For the first assertion, using Lemma 12, let $G$ be an unweighted graph with girth $g=\Omega(\frac{\log n}{\epsilon})$ and $(1+\epsilon)n$ edges. Consider a clan embedding of $G$ into a tree with distortion smaller than $\frac{g}{4}-\frac{3}{2}=\Omega(\frac{\log n}{\epsilon})$. According to Lemma 13, it holds that $\sum_{v\in V}|f(v)|\ge n+\chi(G)=|E(G)|+1>(1+\epsilon)n$.

The second assertion follows by similar lines. Set $g=2\cdot\lfloor\frac{k+2}{2}\rfloor$. Note that $g$ is the largest even number up to $k+2$. Using Theorem 10, let $G$ be an unweighted graph with girth $g$ and $\Omega(n^{1+\frac43\cdot\frac{1}{g-2}})\ge\Omega(n^{1+\frac1k})$ edges. Consider a clan embedding of $G$ into a tree with distortion smaller than $\frac{g}{4}-\frac{3}{2}=\Omega(k)$. According to Lemma 13, it holds that $\sum_{v\in V}|f(v)|\ge n+\chi(G)=|E(G)|+1=\Omega(n^{1+\frac1k})$.

This section is devoted to proving the following theorem,

Theorem 4 (Ramsey type embedding for minor free graphs) . Given a K r -free n -vertex graph G = ( V, E, w ) with diameter D , and parameters (cid:15) ∈ ( , ) , δ ∈ ( , ) , there is a distribution overdominating embeddings g ∶ G → H , into graphs of treewidth O h ( log n(cid:15)δ ) , such that there is a subset M ⊆ V of vertices for which the following holds:1. For every u ∈ V , Pr [ u ∈ M ] ≥ − δ .2. For every u ∈ M and v ∈ V , d H ( g ( u ) , g ( v )) ≤ d G ( u, v ) + (cid:15)D . We begin by proving Theorem 4 for the special case of nearly- h -embeddable graphs.29 emma 14. Given a nearly h -embeddable n -vertex graph G = ( V, E, w ) of diameter D , and param-eters (cid:15) ∈ ( , ) , δ ∈ ( , ) , there is a distribution over one-to-many, clique preserving, dominatingembeddings f into treewidth O h ( log n(cid:15)δ ) graphs, such that there is a subset M ⊆ V of vertices forwhich the following holds:1. For every clique Q ⊆ V , Pr [ Q ⊆ M ] ≥ − δ .2. For every u ∈ M and v ∈ V , max u ′ ∈ f ( u ) ,v ′ ∈ f ( v ) d H ( u ′ , v ′ )) ≤ d G ( u, v ) + (cid:15)D .Proof. Consider a nearly h -embedded graph G = ( V, E, w ) . Assume w.l.o.g. that D =

1, otherwisewe will scale accordingly. We assume that 1 / δ is an integer, otherwise we solve for δ ′ such that δ ′ = ⌈ δ ⌉ . Let Ψ be the set of apices. We will construct q = δ embeddings, all satisfying property (2)of Lemma 14. The ﬁnal embeddings will be obtained by choosing one of this embeddings uniformlyat random. We ﬁrst create a new graph G ′ = G [ V ∖ Ψ ] by deleting all the apex vertices Ψ. Inthe tree decomposition of H to be constructed, the set Ψ will belong to all the bags (with edgestowards all the vertices). Thus we can assume that G ′ is connected, as otherwise we can simplysolve the problem on each connected component separately, and combine the solutions: i.e. takingthe union of all graphs/embeddings.Let r ∈ G ′ be an arbitrary vertex. For σ ∈ { , . . . , δ } set I − ,σ = [ , σ ] , I +− ,σ = [ , σ + ] , andfor j ≥

0, set I j,σ = [ jδ + σ, ( j + ) δ + σ ) , and I + j,σ = [ jδ + σ − , ( j + ) δ + σ + ) , Set U j,σ = { v ∈ G ′ ∣ d G ′ ( r, v ) ∈ I j,σ } and similarly U + j,σ and U − j,σ w.r.t. I + j,σ . Let G j,σ be the graphinduced by U + j,σ , plus the vertex r . In addition, for every vertex v ∈ U + j,σ who has a neighbor in ∪ j ′ < j U + j ′ ,σ ∖ U + j,σ , we add an edge towards r of weight d G ( v, r ) . Equivalently, G j,σ can be constructedby taking the graph induced by ∪ j ′ ≤ j U + j ′ ,σ , and contracting all the internal edges out of U + j,σ into r .See Figure 4 (in Section 7) for illustration. Note that all the edges towards r have weight at most D =

1, thus G j,σ is a nearly h -embedded graph with diameter at most 2 ⋅ ( δ + ) = O ( δ ) , and noapices.Fix some σ and j . Using Lemma 1 with parameter Θ ( (cid:15) ⋅ δ ) , we construct a one-to-many em-bedding f j,σ , of G j,σ into a graph H j,σ with treewidth O h ( log n(cid:15) ⋅ δ ) , such that f j,σ is clique preserving,and has additive distortion Θ ( (cid:15) ⋅ δ ) ⋅ O ( δ ) = (cid:15) . After the application of Lemma 1, we will identifybetween all the copies of r , and add edges from r to all the other vertices (where the weight of anew edge ( r, v ) is d G ( r, v ) ). Note that this increases the treewidth by at most 1. Further, we willassume that there is a bag containing only the vertex r (as we can simply add such a bag). Next,ﬁx σ . Let H ′ σ be a union of the graphs ∪ j ≥− H j,σ . We identify the vertex r with itself, but all theother vertices that participate in more that a single graph will remain as separate copies. Formally,we deﬁne a one-to-many embedding f σ , where f σ ( r ) equals to the unique r , and for every othervertex v ∈ V ∖ Ψ, f σ ( v ) = ⋃ j ≥− f j,σ ( v ) . Note that H ′ σ has a tree decomposition of width O h ( log n(cid:15) ⋅ δ ) ,by identifying the bag containing only r in all the graphs. Finally, we create the graph H σ byadding the set Ψ with edges towards all the vertices in H ′ σ , where the weight of a new edge ( u ′ , v ) is d G ( u, v ) . For v ∈ Ψ, set f σ ( v ) = { v } . As Ψ = O h ( ) , H σ has treewidth O h ( log n(cid:15) ⋅ δ ) . Finally, set M j,σ = { v ∈ G ′ ∣ d G ′ ( r, v ) ∈ [ jδ + σ + , ( j + ) δ + σ − )} , and M σ = Ψ ∪ { r } ∪ ⋃ j ≥− M j,σ . This ﬁnishesthe construction.The one-to-many embedding f σ is dominating. This follows by triangle inequality as every edge { u ′ , v ′ } for u ′ ∈ f σ ( u ) , v ′ ∈ f σ ( v ) in the graph has weight d G ( u, v ) . Next we argue that f σ is clique30reserving. Consider a clique Q in G , and let ˜ Q = Q ∖ Ψ be the non apex vertices in Q . 
We willshow that f σ contains a clique copy of ˜ Q . As the apices have edges towards all the other vertices,it will imply that f σ is clique-preserving. Let v ∈ ˜ Q be some arbitrary vertex, and j be the uniqueindex such that v ∈ U j,σ . For every u ∈ ˜ Q , d G ′ ( v, u ) = d G ( v, u ) ≤

1, implying u ∈ U + j,σ . We concludethat all ˜ Q vertices belong to G j,σ . As f j,σ is clique preserving, it follows that there is a bag in H j,σ ,and thus also in H σ , containing a clique copy of ˜ Q .Next, we argue that property (1) holds. We say that f failed on a vertex v ∈ V if v ∉ M , and wesay that f failed on a clique Q if Q ⊈ M . Consider some clique Q , we can assume w.l.o.g. that Q does not contain any apex vertices (as f never fails on apex vertex). Let s Q , t Q ∈ Q be the closestand farthest vertices from r in G ′ , respectively. Then d G ′ ( r, t Q ) − d G ′ ( r, s Q ) ≤ d G ′ ( s Q , t Q ) ≤ f σ fails on Q iﬀ there is a non-empty intersection between the interval [ d G ′ ( r, t Q ) , d G ′ ( r, s Q )) and teinterval [ jδ + σ − , jδ + σ + ) for some j . Note that there are at most 5 values of σ for which thisintersection is non-empty. As we constructed q = δ embeddings,Pr σ [ Q ⊆ M σ ] = ∣{ σ ∈ [ q ] ∣ Q ⊆ M σ }∣ q ≤ q − q = − δ Finally, we show that f σ has additive distortion (cid:15)D w.r.t. M σ . Consider a pair of vertices u ∈ M σ and v ∈ V . If one of u, v belongs to Ψ ∪ { r } then for every u ′ ∈ f σ ( u ) and v ′ ∈ f σ ( v ) , d H σ ( u ′ , v ′ ) = d G ( u, v ) . Otherwise, if d G ′ ( u, v ) > d G ( u, v ) , then the shortest path between u to v in G goes through an apex vertex z ∈ Ψ. In H σ , f σ ( z ) is a singleton that have an edge towards all theother vertices. It follows that max u ′ ∈ f σ ( u ) ,v ′ ∈ f σ ( v ) d H σ ( u ′ , v ′ ) ≤ max u ′ ∈ f σ ( u ) ,v ′ ∈ f σ ( v ) d H σ ( u ′ , f σ ( z )) + d H σ ( f σ ( z ) , v ′ ) = d G ( u, z ) + d G ( z, v ) = d G ( u, v ) . Else, d G ′ ( u, v ) = d G ( u, v ) ≤ D =

1. Let j be the unique index such that u ∈ U j,σ ( u ) . As u ∈ M j,σ ,it implies that there is no index j ′ ≠ j such that v ∈ U + j ′ ,σ . In particular, all the vertices in theshortest path between u to v in G are in u ∈ U j,σ ( u ) . It holds thatmax u ′ ∈ f σ ( u ) ,v ′ ∈ f σ ( v ) d H σ ( u ′ , v ′ ) ≤ max u ′ ∈ f j,σ ( u ) ,v ′ ∈ f j,σ ( v ) d H j,σ ( u ′ , v ′ ) ≤ d G j,σ ( u, v ) + (cid:15)D = d G ( u, v ) + (cid:15)D . Consider a K r -minor-free graph G , and let T be its clique-sum decomposition. That is G =∪ ( G i ,G j )∈ E ( T ) G i ⊕ h G j where each G i is a nearly h ( r ) -embeddable graph. We call the clique involvedin the clique-sum of G i and G j the joint set of the two graphs.We denote by h ( r ) the parameter such that K r -free minor graph could be decomposed intoa clique-sum of h ( r ) -free graphs. Let φ h be some function depending only on h such that thetreewidth of the graphs constructed in Lemma 14 is bounded by φ h ⋅ log n(cid:15) ⋅ δ .The embedding of G is deﬁned recursively, where some vertices from former levels will be addedto future levels as apices. In order control for the number of such apices we will use the followingdeﬁnition. Deﬁnition 8 (Enhanced minor free graph) . A graph G is called ( r, s, t ) -enhanced minor free graph if there is a set S of at most s vertices called elevated vertices, such that every elevated vertex u ∈ S has edges towards all the other vertices, and G ∖ S is a K r -free graph that has a clique-sumdecomposition with t pieces. We will prove the following claim by induction on t :31 emma 15. Given an n -vertex ( r, s, t ) -enhanced minor free graph G of diameter D with a set S of elevated vertices, and parameter (cid:15) ∈ ( , ) , there is a distribution over one-to-many, cliquepreserving, dominating embeddings f into graphs H of treewidth φ h ( r ) ⋅ log n(cid:15) ⋅ δ + s + h ( r ) ⋅ log t , suchthat there is a subset M ⊆ V of vertices for which the following hold:1. 
For every $v\in V$, $\Pr[v\in M]\ge 1-\delta\cdot\log 2t$.
2. For every $u\in M$ and $v\in V$, $\max_{u'\in f(u),\,v'\in f(v)}d_H(u',v')\le d_G(u,v)+\epsilon D$.

Lemma 15 easily implies Theorem 4:

Proof of Theorem 4.

Note that every $K_r$-free graph is $(r,0,n)$-enhanced minor free. Apply Lemma 15 using parameters $\epsilon$ and $\delta'=\frac{\delta}{\log 2n}$. Define an embedding $g$ by setting $g(v)$, for each $v\in V$, to be an arbitrary vertex from $f(v)$. We obtain a distribution over embeddings into graphs of treewidth $\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta'}+0+h(r)\cdot\log n=O_r\!\left(\frac{\log^2 n}{\epsilon\delta}\right)$ with additive distortion $\epsilon D$, such that for every vertex $v\in V$, $\Pr[v\in M]\ge 1-\delta'\cdot\log 2n=1-\delta$.

Proof of Lemma 15.

It follows from Lemma 14 that the claim holds for the base case $t=1$. This is because the first step in the embeddings constructed by Lemma 14 is to remove all the apices (and add them back at the end). In particular, the treewidth is bounded by $\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s$.

We now turn to the induction step. Consider an $(r,s,t)$-enhanced minor free graph $G$. Let $G'$ be a $K_r$-free graph obtained from $G$ by removing a set $S$ of elevated vertices. Let $T$ be the clique-sum decomposition of $G'$ with $t$ pieces. We use the following lemma to pick a central piece $\tilde G$ of $T$.

Lemma 16 ([Jor69]). Given a tree $T$ of $n$ vertices, there is a vertex $v$ such that every connected component of $T\setminus\{v\}$ has at most $\frac{n}{2}$ vertices.

Let $G_1,\dots,G_p$ be the neighbors of $\tilde G$ in $T$. Note that $T\setminus\tilde G$ contains $p$ connected components $T_1,\dots,T_p$, where $G_i\in T_i$, and $T_i$ contains at most $|T|/2=t/2$ pieces. Let $Q_i$ be the clique used in the clique-sum of $G_i$ with $\tilde G$ in $T$. For every $i$, we add edges between the vertices of $Q_i$ and all the vertices in $T_i$ (that is, we elevate $Q_i$ w.r.t. $G_i$). Every new edge $\{u,v\}$ has weight $d_G(u,v)$. Let $G_i$ be the graph induced on the vertices of $T_i\cup S$ (and the newly added edges). Note that $G_i$ is an $(r,s',t')$-enhanced minor free graph for $t'\le\frac{t}{2}$ and $s'\le|S|+|Q_i|\le s+h(r)$. Further, for every $u,v\in G_i$ it holds that $d_{G_i}(u,v)=d_G(u,v)$, implying that each $G_i$ has diameter at most $D$. Using the inductive hypothesis on $G_i$, we sample a dominating embedding $f_i$ into $H_i$, and a subset $M_i\subseteq G_i$ of vertices. Note that properties (1)-(2) hold, and $H_i$ has treewidth
$\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s'+h(r)\cdot\log 2t'\le\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s+h(r)\cdot\log 2t$.

Let $\tilde G$ be the graph induced on $\tilde G\cup S$. Note that $\tilde G$ has diameter at most $D$. We apply Lemma 14 on $\tilde G$ to sample a dominating embedding $\tilde f$ into $\tilde H$, and a subset $\tilde M$ of vertices.
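As an aside, the choice of a central piece via Lemma 16 can be illustrated with a minimal sketch. It assumes the clique-sum decomposition tree is given as an edge list over pieces labeled $0,\dots,n-1$; the function name `tree_centroid` and this input format are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict

def tree_centroid(n, edges):
    """Return a node whose removal leaves components of size at most n/2.

    Assumes `edges` describes a tree on nodes 0..n-1 (hypothetical format).
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from node 0, recording parents and the visiting order.
    parent = {0: None}
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)

    # Subtree sizes, accumulated from the leaves upward.
    size = {u: 1 for u in order}
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]

    # A node is a centroid if its largest remaining component has size <= n/2.
    for u in order:
        heaviest = n - size[u]  # the component containing the parent of u
        for w in adj[u]:
            if parent[w] == u:  # w is a child of u
                heaviest = max(heaviest, size[w])
        if 2 * heaviest <= n:
            return u
    return None  # unreachable for a valid tree
```

On a path with five nodes the middle node is returned, and on a star the center is returned; each component left after removing the returned node then plays the role of some $T_i$ with at most $t/2$ pieces.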
Note that properties (1)-(2) hold; in particular, the treewidth of $\tilde H$ is bounded by $\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s$ (as the construction first deletes the elevated vertices and eventually adds them to all the bags).

As the embeddings $\tilde f,f_1,\dots,f_p$ are clique-preserving embeddings into $\tilde H,H_1,\dots,H_p$, there is a natural way to combine them into a single graph $H$ of treewidth $\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s+h(r)\cdot\log 2t$. In more detail, initially we just take a disjoint union of all the graphs $\tilde H,H_1,\dots,H_p$, keeping all copies of the different vertices separate. Next, we identify all the copies of the elevated vertices. Finally, for each $i$, as both $\tilde f$ and $f_i$ are clique-preserving, we simply take two clique copies of $Q_i$, from $\tilde f$ and from $f_i$, and identify the respective vertices in these two clique copies. Note that every vertex $v\in Q_i$ is an elevated vertex in $G_i$, and thus $f_i(v)$ is unique. The embedding $f$ is defined as follows: for $v\in\tilde G$, set $f(v)=\tilde f(v)$, while for $v\in G_i\setminus\tilde G$ for some $i$, set $f(v)=f_i(v)$.

Next we define the subset $M\subseteq V$. Every vertex $v\in\tilde M$ joins $M$. A vertex $v\in G_i\setminus\tilde G$ joins $M$ if and only if $v\in M_i$ and $Q_i\subseteq\tilde M$. Note that for vertices in $\tilde G$ property (1) holds trivially, while for $v\in G_i\setminus\tilde G$, using the induction hypothesis and a union bound,
$\Pr[v\notin M]\le\Pr[v\notin M_i]+\Pr[Q_i\not\subseteq\tilde M]\le\delta\cdot\log 2t'+\delta\le\delta\cdot\log 2t$.
Hence property (1) holds. Note that $f$ is clique preserving, as every clique must be contained in either $\tilde G$ or some $G_i$. Finally, we show that property (2) holds. Consider a vertex $u\in M$ and $v\in V$. We proceed by case analysis.
• If a shortest path from $u$ to $v$ goes through a vertex $z\in S$ (this in particular covers the case where either $u$ or $v$ is in $S$), then for every $u'\in f(u)$ and $v'\in f(v)$ it holds that $d_H(u',v')\le d_H(u',f(z))+d_H(f(z),v')=d_G(u,z)+d_G(z,v)=d_G(u,v)$.
• Else, if both $u,v\in\tilde G$, then by Lemma 14,
$\max_{u'\in f(u),\,v'\in f(v)}d_H(u',v')\le\max_{u'\in\tilde f(u),\,v'\in\tilde f(v)}d_{\tilde H}(u',v')\le d_{\tilde G}(u,v)+\epsilon D=d_G(u,v)+\epsilon D$.
• Else, if there is an $i\in[p]$ such that both $u,v\in G_i\setminus\tilde G$, then by the induction hypothesis
$\max_{u'\in f(u),\,v'\in f(v)}d_H(u',v')\le\max_{u'\in f_i(u),\,v'\in f_i(v)}d_{H_i}(u',v')\le d_{G_i}(u,v)+\epsilon D=d_G(u,v)+\epsilon D$.
• Else, if $u\in\tilde G$ and there is an $i\in[p]$ such that $v\in G_i$: there is necessarily a vertex $x\in Q_i$ such that there is a shortest path from $u$ to $v$ in $G$ going through $x$. Let $\hat x$ be the copy of $x$ used to connect $\tilde H$ and $H_i$. Note that there is an edge between $\hat x$ and every copy $v'\in f_i(v)$ in $H_i$. In addition, as $u\in\tilde M$, by the second case it holds that $\max_{u'\in f(u)}d_H(u',\hat x)\le\max_{u'\in f(u),\,x'\in f(x)}d_H(u',x')\le d_G(u,x)+\epsilon D$. We conclude
$\max_{u'\in f(u),\,v'\in f(v)}d_H(u',v')\le\max_{u'\in f(u)}d_H(u',\hat x)+\max_{v'\in f(v)}d_H(\hat x,v')\le d_G(u,x)+\epsilon D+d_G(x,v)=d_G(u,v)+\epsilon D$. (1)
• Else, if $v\in\tilde G$ and there is an $i\in[p]$ such that $u\in G_i\setminus\tilde G$: there is necessarily a vertex $x\in Q_i$ such that there is a shortest path from $u$ to $v$ in $G$ going through $x$. As $u\in M$, it follows that $x\in\tilde M\subseteq M$. Let $\hat x$ be the copy of $x$ used to connect $\tilde H$ and $H_i$. Inequality (1) holds.
• Else, there are $i\ne j$ such that $u\in G_i\setminus\tilde G$ and $v\in G_j\setminus\tilde G$. There is necessarily a vertex $x\in Q_i$ such that there is a shortest path from $u$ to $v$ in $G$ going through $x$. As $u\in M$, it follows that $x\in\tilde M\subseteq M$. Let $\hat x$ be the copy of $x$ used to connect $\tilde H$ and $H_i$. By the fourth case, it holds that $\max_{x'\in f(x),\,v'\in f(v)}d_H(x',v')\le d_G(x,v)+\epsilon D$.
Thus
$\max_{u'\in f(u),\,v'\in f(v)}d_H(u',v')\le\max_{u'\in f(u)}d_H(u',\hat x)+\max_{v'\in f(v)}d_H(\hat x,v')\le d_G(u,x)+d_G(x,v)+\epsilon D=d_G(u,v)+\epsilon D$.

Clan Embedding for Minor-Free Graphs

This section is devoted to proving Theorem 5 (restated below for convenience). The proof of Theorem 5 builds upon an approach similar to that of Theorem 4; however, it is more delicate and considerably more involved. We present the proof here without assuming familiarity with the proof of Theorem 4. Nonetheless, from a pedagogical standpoint, we recommend that the reader first understand the proof of Theorem 4, and only later approach this section.

Theorem 5 (Clan embedding for minor free graphs) . Consider a K r -free n -vertex graph G =( V, E, w ) of diameter D , and parameters (cid:15) ∈ ( , ) , δ ∈ ( , ) , there is a distribution D over clanembeddings ( f, χ ) with additive distortion (cid:15)D into graphs of treewidth O h ( log nδ(cid:15) ) , such that forevery v ∈ V , E [∣ f ( v )∣] ≤ + δ . Remark 2.

Note that Theorem 5 implies a weak version of Theorem 4, where the distortion guarantee is for pairs $u,v\in M$ rather than for $u\in M$ and $v\in V$: simply use the chief part $\chi$ as a Ramsey-type embedding and set $M=\{v\mid|f(v)|=1\}$. Interestingly, this weaker version is still sufficient for our application to the metric $\rho$-independent set problem (Theorem 7).

We begin with Lemma 17, which handles the special case of nearly-embeddable graphs. Later, we will generalize to minor-free graphs using clique-sums. Specifically, we will inductively use Lemma 17 for each piece, and integrate it into the general embedding. However, for this integration to go through, we will need the intermediate embedding to be clique-preserving. As a consequence, we will not attempt to bound the size of $f$. Instead, for every vertex $v$, $f(v)$ will be a union of two sets $\chi(v)$ and $\psi(v)$. Eventually, for the clan embedding, we will take one copy from each set. We will say that the embedding succeeds on a vertex $v$ if $\psi(v)=\emptyset$.

Lemma 17.

Consider a nearly h -embeddable n -vertex graph G = ( V, E, w ) with set of apices Ψ ,diameter D , and parameters (cid:15) ∈ ( , ) , δ ∈ ( , ) . Then there is a distribution over one-to-many,dominating embeddings f into treewidth O h ( log n(cid:15)δ ) graphs, such that for every vertex v ∈ V , f ( v ) can be partitioned into sets χ ( v ) , ψ ( v ) where χ ( v ) ⊍ ψ ( v ) = f ( v ) . It holds that:1. For every pair of vertices u, v , min { max u ′ ∈ χ ( u ) ,v ′ ∈ χ ( v ) d H ( u ′ , v ′ ) , max u ′ ∈ ψ ( u ) ,v ′ ∈ χ ( v ) d H ( v ′ , u ′ )} ≤ d G ( u, v ) + (cid:15)D . (2)

2. We say that $f$ fails on a vertex $v$ if $\psi(v)\ne\emptyset$. For a clique $Q\subseteq V$, we say that $f$ fails on $Q$ if it fails on some vertex in $Q$. For every clique $Q\subseteq V$, $\Pr[f\text{ fails on }Q]\le\delta$.
3. Consider a clique $Q$; one of the following holds:
(a) $f$ succeeds on $Q$. In particular, $\chi(Q)$ contains a clique copy of $Q$.
(b) $f$ fails on $Q$, and $\chi(Q)$ contains a clique copy of $Q$. In addition, consider the set $Q^F=\{v\in Q\mid\psi(v)\ne\emptyset\}$; then $\psi(Q^F)$ contains a clique copy of $Q^F$.
(c) $f$ fails on $Q$, and $f(Q)$ contains two clique copies $Q_1,Q_2$ of $Q$ such that for every vertex $v\in Q\setminus\Psi$, both $\chi(v)\cap(Q_1\cup Q_2)$ and $\psi(v)\cap(Q_1\cup Q_2)$ are singletons. In this case, in addition to equation (2), it also holds that for every $u\in V$ and $v\in Q\setminus\Psi$,
$\min\left\{\max_{u'\in\chi(u),\,v'\in\psi(v)}d_H(u',v'),\ \max_{u'\in\psi(u),\,v'\in\psi(v)}d_H(u',v')\right\}\le d_G(u,v)+\epsilon D$. (3)

Note that $\psi(v)$ might be an empty set. A maximum over an empty set is defined to be $\infty$.

Proof.

Consider a nearly h -embedded graph G = ( V, E, w ) . Assume w.l.o.g. that D =

1, otherwisewe can scale accordingly. We assume that 1 / δ is an integer, otherwise we solve for δ ′ such that δ ′ = ⌈ δ ⌉ . We will construct q = δ embeddings, all satisfying property (1) of Lemma 17. The ﬁnalembeddings will be obtained by choosing one of this embeddings uniformly at random. Denote by G ′ = G [ V ∖ Ψ ] the induced subgraph obtain by removing the apices. In the tree decomposition of H we will construct, the set Ψ will belong to all the bags (with edges towards all the vertices). Thus wecan assume that G ′ is connected, as otherwise we can simply solve the problem on each connectedcomponent separately, and combine the solutions: i.e. taking the union of all graphs/embeddings.Let r ∈ G ′ be an arbitrary vertex. For σ ∈ { , . . . , δ } , set I − ,σ = [ , σ ] , I +− ,σ = [ , σ + ] ,and for j ≥

0, set I j,σ = [ jδ + σ, ( j + ) δ + σ ) , and I + j,σ = [ jδ + σ − , ( j + ) δ + σ + ) . Set U j,σ ={ v ∈ G ′ ∣ d G ′ ( r, v ) ∈ I j,σ } and similarly U + j,σ w.r.t. I + j,σ . Note that by triangle inequality, for ev-ery pair of neighboring vertices u, v it holds that d G ( u, v ) ≤ D =

1, thus u ∈ U j,σ implies v ∈ U + j,σ .Let G j,σ be the graph induced by U + j,σ , plus the vertex r . In addition, we add edges from the vertex r towards all the vertices with neighbors in (∪ q < j U + q,σ ) ∖ U + j,σ (where the weight of a new edge ( r, v ) is d G ( r, v ) ). Equivalently, G j,σ can be constructed by taking the graph induced by ∪ j ′ ≤ j U + j ′ ,σ , andcontracting all the internal edges out of U + j,σ into r . Note that all the edges towards r have weightat most D =

1. Furthermore, for every vertex v ∈ G j,σ , d G j,σ ( v, r ) < + δ +

4. Thus G j,σ is anearly h -embedded graph with diameter at most δ + = O ( δ ) , and no apices. See Figure 4 forillustration.Fix some σ and j . Using Lemma 1 with parameter Θ ( (cid:15) ⋅ δ ) , we construct a dominating one-to-many embedding f j,σ , of G j,σ into a graph H j,σ with treewidth O h ( log n(cid:15) ⋅ δ ) , such that f j,σ is cliquepreserving, and has additive distortion Θ ( (cid:15) ⋅ δ ) ⋅ O ( δ ) = (cid:15) . After the application of Lemma 1 wewill add edges from r to all the other vertices (where the weight of a new edge ( r, v ) is d G ( r, v ) ).Note that this increases the treewidth by at most 1. Further, we will assume that there is a bagcontaining only the vertex r (as we can simply add such a bag). Next, ﬁx σ . Let H ′ σ be a union of thegraphs ∪ j ≥− H j,σ . We identify the vertex r with itself, but all the other vertices that participatein more that a single graph will remain as separate copies. Formally, we deﬁne a one-to-manyembedding f σ , where f σ ( r ) equals to the unique vertex r , and for every other vertex v ∈ V ∖ Ψ, f σ ( v ) = ⋃ j ≥− f j,σ ( v ) . Note that H ′ σ has a tree decomposition of width O h ( log n(cid:15) ⋅ δ ) , by identifyingthe bag containing only r in all the graphs. Finally, we create the graph H σ by adding the set Ψwith edges towards all the vertices in H ′ σ , where the weight of a new edge ( u ′ , v ) for u ∈ f σ ( u ) and v ∈ Ψ is d G ( u, v ) . For v ∈ Ψ, set f σ ( v ) = { v } . As Ψ = O h ( ) , H σ has treewidth O h ( log n(cid:15) ⋅ δ ) . Theone-to-many embedding f σ is dominating. This follows by triangle inequality as every edge { u, v } in the graph has weight d G ( u, v ) . Finally, the embedding f is chosen to equal f σ , for σ chosenuniformly at random. This concludes the deﬁnition of the embedding f .Next, we deﬁne the partition χ σ ( v ) ⊍ ψ σ ( v ) of f σ ( v ) for each vertex v ∈ V as follows: • If v ∈ Ψ ∪ { r } , then there is a single copy of v in f σ . 
Set χ σ ( v ) = f σ ( v ) and ψ σ ( v ) = ∅ .35igure 4: On the left is the graph G ′ . r is the big black vertex in the middle. The dashed orange linesseparate between the layers of U − ,σ , U ,σ , U ,σ , . . . . The two blue lines are the boundaries of U + ,σ . All thevertices in U + ,σ (and the edges between them) are black, while all other vertices (and the edges incident onthem) are gray. On the right is the graph G ,σ with vertex set U + ,σ ∪ { r } , where the edges added from r tovertices with neighbors in U + ,σ ∖ U − ,σ are marked in red. • Else, let j be the unique index such that v ∈ U j,σ . Set χ σ ( v ) = f j,σ ( v ) . If there is anotherindex j ′ such that v ∈ U + j ′ ,σ , set ψ σ ( v ) = f j ′ ,σ ( v ) , otherwise set ψ σ ( v ) = ∅ .Clearly, as there are at most 2 indices j such that v ∈ U + j,σ , χ σ ( v ) ⊍ ψ σ ( v ) = f σ ( v ) .Next we prove property (1)- the stretch bound. Consider a pair of vertices u, v ∈ V . If v ∈ Ψ ∪{ r } then f σ ( v ) is a singleton with an edge towards every copy of u , thus property (1) holds. The sameargument holds also if u ∈ Ψ ∪{ r } . Otherwise, if d G ′ ( u, v ) > d G ( u, v ) , then the shortest path between u to v in G goes through an apex vertex z ∈ Ψ. In particular, f σ ( z ) is a singleton with an edgetowards every other vertex. It follows that in H σ , the distance between every two copies in f σ ( v ) and f σ ( u ) is exactly d G ( u, z ) + d G ( z, v ) = d G ( u, v ) . Else, d G ′ ( u, v ) = d G ( u, v ) . Let j be the uniqueindex such that v ∈ U j,σ , then u ∈ U + j,σ . Furthermore, d G j,σ ( u, v ) = d G ′ ( u, v ) (as the entire shortestpath between them is in U + j,σ ). By Lemma 1,min { max u ′ ∈ χ σ ( u ) ,v ′ ∈ χ σ ( v ) d H σ ( u ′ , v ′ ) , max u ′ ∈ ψ σ ( u ) ,v ′ ∈ χ σ ( v ) d H σ ( v ′ , u ′ )}≤ max u ′ ∈ f j,σ ( u ) ,v ′ ∈ f j,σ ( v ) d H j,σ ( u ′ , v ′ ) ≤ d G j,σ ( u, v ) + (cid:15)D = d G ′ ( u, v ) + (cid:15)D = d G ( u, v ) + (cid:15)D . 
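The layering behind the partition $\chi_\sigma(v)\uplus\psi_\sigma(v)$ can be sketched in a few lines. This is a minimal illustration under stated assumptions: distances are normalized so that $D=1$ (hence `margin=1.0`), the band width is passed as a hypothetical parameter `width` (standing in for the constant multiple of $1/\delta$ used in the proof), the bottom band is taken half-open, and all function names are made up for this example.

```python
import math

def primary_band(dist, sigma, width):
    """Index j of the unique band U_{j,sigma} containing a vertex at distance `dist` from r."""
    if dist < sigma:
        return -1
    return math.floor((dist - sigma) / width)

def in_extended_band(dist, j, sigma, width, margin=1.0):
    """Whether a vertex at distance `dist` lies in the extended band U^+_{j,sigma}."""
    if j == -1:
        return dist <= sigma + margin
    return j * width + sigma - margin <= dist < (j + 1) * width + sigma + margin

def extended_bands(dist, sigma, width, margin=1.0):
    """All indices j with the vertex in U^+_{j,sigma}; at most two when width > 2 * margin."""
    j = primary_band(dist, sigma, width)
    return [k for k in (j - 1, j, j + 1)
            if k >= -1 and in_extended_band(dist, k, sigma, width, margin)]

def fails(dist, sigma, width, margin=1.0):
    """The embedding fails on a vertex (psi is non-empty) iff it lies in two extended bands."""
    return len(extended_bands(dist, sigma, width, margin)) > 1

def failing_shifts(dist, width, margin=1.0):
    """Shifts sigma in {1, ..., width} for which the vertex fails."""
    return [s for s in range(1, width + 1) if fails(dist, s, width, margin)]
```

For a vertex at distance 10.5 with `width=8`, only the shifts 2 and 3 place it within the margin of a band boundary, which matches the intuition used in the failure-probability argument: only a constant number of shifts can fail on any fixed vertex or clique.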
Next we argue property (2)- the failure probability of a clique. Recall that f, χ, ψ will equal to f σ , χ σ , ψ σ for σ ∈ , . . . , δ chosen uniformly at random. Consider some clique Q , we can assumew.l.o.g. that Q does not contain any apex vertices (as f never fails on apex vertex). Let s Q , t Q ∈ Q be the closest and farthest vertices from r in G ′ , respectively. Then d G ′ ( r, t Q ) − d G ′ ( r, s Q ) ≤ d G ′ ( s Q , t Q ) ≤ D = f σ fails on Q iﬀ there is a non-empty intersection between the interval [ d G ′ ( r, t Q ) , d G ′ ( r, s Q )) (of length at most 1) and interval [ jδ + σ − , jδ + σ + ) for some j . Note36igure 5: Illustration of the diﬀerent cases in prop-erty (3). The green area marks all the vertices in U + j,σ . The vertices in U j,σ are enclosed between thetwo black semicircles. The vertices in U + j,σ ∩ U + j + ,σ (resp. U + j − ,σ ∩ U + j,σ ) are enclosed between the red(resp. orange) dashed semicircles. In the ﬁrst case (a) , all the vertices of Q are in U j,σ and no vertexfailed. In the second case (b) , all the vertices of Q are in U j,σ and some vertices failed. In the thirdcase (c) , the vertices of Q non-trivially partitionedbetween U j,σ and U j + ,σ , and all of them failed. jδ + σ − jδ + σ j +1) δ + σ + 2 (a)(b) (c) j +1) δ + σ r that there are at most 5 choices of σ on which this happens. We conclude that Pr [ f fails on Q ] ≤ / δ − ≤ δ .Finally, we prove property (3)- clique preservation. Consider a clique Q , note that we canassume that Q ⊆ G ′ , as f σ will not fail on any apex. Farther, if r ∈ Q then no vertex in Q fails as Q ⊆ B G ′ ( r, ) ⊆ U − ,σ ∖ U + ,σ . Thus we can assume that r ∉ Q . We proceed by case analysis, thecases are illustrated in Figure 5.(a) if f σ succeeds on Q , then f σ ( Q ) = χ σ ( Q ) . In particular there is a unique j such that Q ⊆ U j,σ .As f j,σ is clique preserving, it contains a clique copy of Q . In particular, χ σ ( Q ) contain aclique copy of Q .Otherwise f σ failed on Q . 
Then there is a unique index j such that the intersection of Q with both U + j,σ and U + j + ,σ is non empty.(b) First, consider the case such that Q ⊆ U j,σ (the case Q ⊆ U j + ,σ is symmetric). Here χ σ ( Q ) = f j,σ ( Q ) , and ψ σ ( Q ) = ψ σ ( Q Fσ ) = f j + ,σ ( Q Fσ ) , where Q Fσ = { v ∈ Q ∣ ψ σ ( v ) ≠ } . As f j,σ and f j + ,σ are clique-preserving, χ σ ( Q ) contain a clique copy of Q , while ψ σ ( Q Fσ ) contains a cliquecopy of Q Fσ .(c) Finally, consider the case where Q intersect both U j,σ and U j + ,σ . It holds that d G ′ ( r, s Q ) < ( j + ) δ + σ ≤ d G ′ ( r, t Q ) , hence ( j + ) δ + σ − ≤ d G ′ ( r, s Q ) and d G ′ ( r, t Q ) < ( j + ) δ + σ + s Q , t Q ∈ Q are the closest and farthest vertices from r , respectively). Necessarily, Q ⊆ U + j,σ ∩ U + j + ,σ .In particular, as f j,σ ( Q ) , and f j + ,σ ( Q ) are clique preserving, they contain clique copies Q , Q of Q (respectively). Furthermore, Q , Q ⊆ f σ ( Q ) , and for every vertex v ∈ Q , both χ ( v ) ∩ ( Q ∪ Q ) and ψ ( v ) ∩ ( Q ∪ Q ) are singletons.It remains to prove the additional stretch gurantee. Consider a vertex v ∈ Q , suppose that v ∈ U j,σ (the case v ∈ U j + ,σ is symmetric). Here χ σ ( v ) = f j,σ ( v ) and ψ σ ( v ) = f j + ,σ ( v ) .Consider some vertex u ∈ V , in similar manner to the general distortion argument, if either u ∈ Ψ ∪ { r } , or the shortest path from u to v in G goes through Ψ ∪ { r } , then the distancebetween every two copies in f σ ( v ) and f σ ( u ) is exactly d G ( u, v ) , and equation (3) holds.Else, d G ′ ( u, v ) = d G ( u, v ) , and it holds that d G ′ ( r, u ) ≥ d G ′ ( r, v ) − d G ′ ( u, v ) ≥ d G ′ ( r, s Q ) − ≥ ( j + ) δ + σ −

2, thus u ∈ U + j + ,σ . Furthermore, d G j + ,σ ( u, v ) = d G ′ ( u, v ) (as the entire shortestpath between them is in U + j + ,σ ). By Lemma 1,min { max u ′ ∈ χ ( u ) ,v ′ ∈ ψ ( v ) d H ( u ′ , v ′ ) , max u ′ ∈ ψ ( u ) ,v ′ ∈ ψ ( v ) d H ( u ′ , v ′ )}≤ max u ′ ∈ f j + ,σ ( u ) ,v ′ ∈ f j + ,σ ( v ) d H j + ,σ ( u ′ , v ′ ) ≤ d G j + ,σ ( u, v ) + (cid:15)D = d G ′ ( u, v ) + (cid:15)D = d G ( u, v ) + (cid:15)D . K r -minor-free graph G , and let T be its clique-sum decomposition. That is G =∪ ( G i ,G j )∈ E ( T ) G i ⊕ h G j where each G i is a nearly h ( r ) -embeddable graph. We call the clique involvedin the clique-sum of G i and G j the joint set of the two graphs.We denote by h ( r ) the parameter such that K r -free minor graph could be decomposed intoa clique-sum of h ( r ) -free graphs. Let φ h be some function depending only on h such that thetreewidth of the graphs constructed in Lemma 17 is bounded by φ h ⋅ log n(cid:15) ⋅ δ . The embedding of G is deﬁned recursively, where some vertices from former levels will be added to future levels asapices. In order control for the number of such apices we will use the following deﬁnition. RecallDeﬁnition 8 from Section 6 of enhanced minor free graph. We will prove the following lemma byinduction on t : Lemma 18.

Given an ( r, s, t ) -enhanced minor free graph G of diameter D with a speciﬁed set S ofelevated vertices, and parameters (cid:15) ∈ ( , ) , δ ∈ ( , ) , there is a distribution over one-to-many, cliquepreserving, dominating embeddings f into graphs of treewidth φ h ( r ) ⋅ log n(cid:15) ⋅ δ + s + h ( r ) ⋅ log t , such thatfor every vertex v ∈ V , f ( v ) can be partitioned into sets g ( v ) , g ( v ) , . . . where ⊍ j ≥ g j ( v ) = f ( v ) .It holds that:1. For every v ∈ V , let q v be the maximal index j such that g j ( v ) ≠ ∅ , then E [ q v ] ≤ ( + δ ) log 2 t .In addition, if v ∈ S then ∣ f ( v )∣ = and thus q v = .2. For every pair of vertices u, v , min j max u ′ ∈ g j ( u ) ,v ′ ∈ g ( v ) d H ( u ′ , v ′ ) ≤ d G ( u, v ) + (cid:15)D . Assuming Lemma 18, Theorem 5 easily follows.

Proof of Theorem 5.

Note that every K r free graph is ( r, , n ) -enhanced minor free. We applyLemma 18 using parameters (cid:15) and δ ′ = δ n . For every vertex v ∈ V , let g ( v ) ⊆ f ( v ) be a setcontaining a single copy from each non empty set g j ( v ) . Let χ ( v ) = g ( v ) ∩ g ( v ) be the copy in g ( v ) from g ( v ) . The distortion guarantee is straightforward to verify. The treewidth of the resultinggraph is φ h ( r ) ⋅ log n(cid:15) ⋅ δ ′ + + h ( r ) ⋅ log n = O r ( log n(cid:15) ) . Finally, for every vertex v ∈ V , it holds that E [∣ g ( v )∣] ≤ ( + δ n ) log 2 n < e δ < + δ .The rest of the section is devoted to proving Lemma 18. Proof of Lemma 18.

The claim is proved by induction on $t$. It follows from Lemma 17 that Lemma 18 holds for the base case $t=1$. This is because the first step in the embeddings constructed by Lemma 17 is to remove all the apices. Thus the treewidth will be $\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s$.

We turn now to the induction step. Consider an $(r,s,t)$-enhanced minor free graph $G$. Let $G'$ be a $K_r$-free graph obtained from $G$ by removing a set $S$ of size at most $s$. Let $T$ be the clique-sum decomposition of $G'$ with $t$ pieces. Using Lemma 16, choose a central piece $\tilde G\in T$ of $T$. Let $G_1,\dots,G_p$ be the neighbors of $\tilde G$ in $T$. Note that $T\setminus\tilde G$ contains $p$ connected components $T_1,\dots,T_p$, where $G_i\in T_i$, and $T_i$ contains at most $|T|/2=t/2$ pieces. Let $Q_i$ be the clique used in the clique-sum of $G_i$ with $\tilde G$ in $T$. For every $i$, we add edges between the vertices of $Q_i$ and all the vertices in $T_i$ (that is, we make the vertices of $Q_i$ into apices). Every new edge $\{u,v\}$ has weight $d_G(u,v)$. Let $G_i$ be the graph induced on the vertices of $T_i\cup S$ (and the newly added edges). Note that $G_i$ is an $(r,s',t')$-enhanced minor free graph for $t'\le\frac{t}{2}$ and $s'\le s+|Q_i|\le s+h(r)$. Further, for every $u,v\in G_i$ it holds that $d_{G_i}(u,v)=d_G(u,v)$, and thus $G_i$ has diameter at most $D$. Using the inductive hypothesis on $G_i$, we sample a dominating embedding $f_i$ into $H_i$, such that for every $v\in G_i$ we have $f_i(v)=\biguplus_{j\ge1}g_{i,j}(v)$. We denote by $q^i_v$ the maximal index such that $g_{i,q^i_v}(v)\ne\emptyset$. Note that properties (1) and (2) hold; furthermore, $H_i$ has treewidth
$\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s'+h(r)\cdot\log 2t'\le\phi_{h(r)}\cdot\frac{\log n}{\epsilon\cdot\delta}+s+h(r)\cdot\log 2t$.
In addition, for a vertex $v\in S\cup Q_i$, $|f_i(v)|=q^i_v=1$. For every $v\in V$, $\mathbb{E}[q^i_v]\le(1+\delta)^{\log 2t'}\le(1+\delta)^{\log t}$.

Let $\tilde G$ be the graph induced on $\tilde G\cup S$. We apply Lemma 17 on $\tilde G$ to sample a dominating one-to-many embedding $\tilde f$ into $\tilde H$, such that for each vertex $v\in\tilde G$, $\tilde f(v)$ is partitioned into $\tilde\chi(v)$ and $\tilde\psi(v)$.
$\tilde{H}$ has treewidth $\phi_{h(r)} \cdot \frac{\log n}{\epsilon \cdot \delta} + s$ (this is because in Lemma 17 we first remove all apices and then add them back). Note also that properties (1), (2), and (3) hold.

We next describe how to combine the different parts into a single embedding. The graph (and the induced embedding) will be created by identifying some vertices in $\tilde{H}$ with vertices in each $H_i$. Some of the graphs $H_i$ will be duplicated and we will have two copies of them (depending on whether $Q_i$ failed in $\tilde{f}$). Note that the set $S$ has a single copy everywhere, and thus for every $v \in S$, we simply identify all the vertices $\tilde{f}(v), f_1(v), \dots, f_p(v)$. For a vertex $v \in \tilde{G}$, set $g_1(v) = \tilde{\chi}(v)$ and $g_2(v) = \tilde{\psi}(v)$. Consider some $i \in [p]$. Note that the clique $Q_i$ belongs to $S_i$. In particular, for every vertex $v \in Q_i$, $f_i(v)$ is a singleton, and $f_i(Q_i)$ is a clique. We continue w.r.t. the 3 cases in Lemma 17 (see Figure 5 for an illustration of the cases):

• $\tilde{f}$ succeeds on $Q_i$: Here $\tilde{\psi}(Q_i) = \emptyset$, and $\tilde{\chi}(Q_i) = g_1(Q_i)$ contains a clique copy $Q^1_i \subseteq \tilde{\chi}(Q_i)$ of $Q_i$. We simply identify each vertex in $f_i(Q_i)$ with the corresponding copy in $Q^1_i$. We will abuse notation and refer to $H_i$ as $H^1_i$, to $f_i$ as $f^1_i$, and to $g_{i,j}$ as $g^1_{i,j}$. For a vertex $v \in G_i \setminus \tilde{G}$, for every $j \ge 1$ set $g_j(v) = g^1_{i,j}(v)$.

• $\tilde{f}$ fails on $Q_i$, and $\tilde{\chi}(Q_i)$ contains a clique copy of $Q_i$: Denote by $Q^1_i \subseteq \tilde{\chi}(Q_i)$ the promised clique copy of $Q_i$. In addition, $\tilde{\psi}(Q^F_i)$ is guaranteed to contain a clique copy $Q^2_i$ of $Q^F_i = \{ v \in Q_i \mid \tilde{\psi}(v) \neq \emptyset \}$. We duplicate $H_i$ into two graphs $H^1_i$ and $H^2_i$ with respective duplicate embeddings $f^1_i, f^2_i$. However, the vertices of $Q_i \setminus Q^F_i$ are removed from $H^2_i$ and $f^2_i$. We combine $\tilde{H}$ with $H^1_i$ (resp. $H^2_i$) by identifying a clique copy from $\tilde{\chi}(Q_i)$ (resp. $\tilde{\psi}(Q^F_i)$) with the corresponding vertices from $f^1_i(Q_i)$ (resp. $f^2_i(Q^F_i)$) (recall that they are apices and thus have a single copy).
– For every vertex $v \in G_i \setminus \tilde{G}$, where $q^i_v$ is the maximal index $j$ such that $g_{i,j}(v) \neq \emptyset$: for every $j \in [1, q^i_v]$, set $g_j(v) = g^1_{i,j}(v)$ to be the corresponding copies from $f^1_i(v)$, and $g_{q^i_v + j}(v) = g^2_{i,j}(v)$ to be the corresponding copies from $f^2_i(v)$.

• $\tilde{f}$ fails on $Q_i$, and $\tilde{f}(Q_i)$ contains two clique copies $Q^1_i, Q^2_i$ of $Q_i$ such that for every $v \in Q_i$, $Q^1_i \cup Q^2_i$ intersects both $\tilde{\chi}(v)$ and $\tilde{\psi}(v)$: We duplicate $H_i$ into two graphs $H^1_i$ and $H^2_i$ with respective duplicate embeddings $f^1_i, f^2_i$. We combine $\tilde{H}$ with $H^1_i$ (resp. $H^2_i$) by identifying $Q^1_i$ (resp. $Q^2_i$) with $f^1_i(Q_i)$ (resp. $f^2_i(Q_i)$) (recall that they are apices and thus have a single copy).

– For every vertex $v \in G_i \setminus \tilde{G}$, where $q^i_v$ is the maximal index $j$ such that $g_{i,j}(v) \neq \emptyset$: for every $j \in [1, q^i_v]$, set $g_j(v) = g^1_{i,j}(v)$ to be the corresponding copies from $f^1_i(v)$, and $g_{q^i_v + j}(v) = g^2_{i,j}(v)$ to be the corresponding copies from $f^2_i(v)$.

We claim next that $f, g_1, g_2, \dots$ fulfill all the required properties. First, note that $f$ is clique preserving, as every clique must be contained in either $\tilde{G}$ or some $G_i$. Second, clearly $f$ is dominating, as the weight of every edge between a vertex in $f(v)$ and a vertex in $f(u)$ is $d_G(u, v)$. Third, as we only identify between cliques, the graph $H$ has treewidth
$$\max\left\{ \phi_{h(r)} \cdot \frac{\log n}{\epsilon \cdot \delta} + s + h(r) \cdot \log 2t ,\ \phi_{h(r)} \cdot \frac{\log n}{\epsilon \cdot \delta} + s \right\} = \phi_{h(r)} \cdot \frac{\log n}{\epsilon \cdot \delta} + s + h(r) \cdot \log 2t .$$
Fourth, it holds by definition that for every vertex $v \in V$, $f(v) = \biguplus_j g_j(v)$.

Next, we prove property (1). Clearly, for a vertex $v \in S$, we identify between all its copies and thus $f(v)$ is a singleton. Consider a vertex $v \in V$. If $v \in \tilde{G}$, then by Lemma 17, $E[q_v] = 1 + \Pr[\tilde{f} \text{ fails on } v] \le 1 + \delta$.
Else, consider $v \in G_i \setminus \tilde{G}$ for some $i$, and denote by $q^i_v$ the maximal index $j$ such that $g_{i,j}(v)$ is non-empty. We have
$$E[q_v] = E[q^i_v] \cdot \Pr[\tilde{f} \text{ succeeds on } Q_i] + 2 E[q^i_v] \cdot \Pr[\tilde{f} \text{ fails on } Q_i] = E[q^i_v] \cdot \left(1 + \Pr[\tilde{f} \text{ fails on } Q_i]\right) \le (1 + \delta)^{\log\left(2 \cdot \frac{t}{2}\right)} \cdot (1 + \delta) = (1 + \delta)^{\log 2t} ,$$
where the first equality holds as we have two copies of $H_i$ iff $\tilde{f}$ fails on $Q_i$. The second equality holds as $\Pr[\tilde{f} \text{ succeeds on } Q_i] = 1 - \Pr[\tilde{f} \text{ fails on } Q_i]$. The inequality follows by the induction hypothesis and Lemma 17.

Finally, we prove property (2). Consider a pair of vertices $u, v \in V$. We proceed by case analysis.

• If a shortest path from $u$ to $v$ goes through a vertex $z \in S$ (this in particular catches the case where either $u$ or $v$ is in $S$): Then
$$\min_j \max_{u' \in g_j(u),\, v' \in g_1(v)} d_H(u', v') \le \max_{u' \in f(u),\, v' \in f(v)} d_H(u', v') \le \max_{u' \in f(u),\, v' \in f(v)} \left( d_H(u', f(z)) + d_H(f(z), v') \right) = d_G(u, z) + d_G(z, v) = d_G(u, v) .$$
In the rest of the cases we assume that $d_{G'}(u, v) = d_G(u, v)$ (recall that $G' = G[V \setminus S]$).

• Else, if both $u, v \in \tilde{G}$: Then by Lemma 17,
$$\min_j \max_{u' \in g_j(u),\, v' \in g_1(v)} d_H(u', v') \le \min\left\{ \max_{u' \in \tilde{\chi}(u),\, v' \in \tilde{\chi}(v)} d_H(u', v') ,\ \max_{u' \in \tilde{\psi}(u),\, v' \in \tilde{\chi}(v)} d_H(u', v') \right\} \le d_G(u, v) + \epsilon D .$$

• Else, if $u \in \tilde{G}$ and there is an $i \in [p]$ such that $v \in G_i \setminus \tilde{G}$: There is necessarily a vertex $x \in Q_i$ such that there is a shortest path from $u$ to $v$ in $G$ going through $x$. The copy $g_1(v)$ of $v$ was attached to $\tilde{G}$ as a part of a graph $H^1_i$ (a copy of $H_i$) by identifying a clique copy $Q^1_i \subseteq \tilde{f}(Q_i)$ of $Q_i$ with $f^1_i(Q_i)$ (a set of singletons). We continue by case analysis:

– If either $\tilde{f}$ succeeds on $Q_i$, or $Q^1_i \subseteq \tilde{\chi}(Q_i)$: Then there is a copy $\hat{x}$ of $x$ in $g_1(x) \cap Q^1_i$.
It holds that
$$\min_j \max_{u' \in g_j(u),\, v' \in g_1(v)} d_H(u', v') \le \min_j \left( \max_{u' \in g_j(u)} d_H(u', \hat{x}) + \max_{v' \in g_1(v)} d_H(\hat{x}, v') \right) \le d_G(u, x) + \epsilon D + d_G(x, v) = d_G(u, v) + \epsilon D , \qquad (4)$$
where the second inequality follows by the second case (as $x \in \tilde{G}$), and the fact that there is an edge in $H$ between $\hat{x}$ and every vertex in $g_1(v)$.

– Else, $\tilde{f}(Q_i)$ contains two clique copies $Q^1_i, Q^2_i$ of $Q_i$. Note that $\hat{x}$ can belong to either $g_1(x) = \tilde{\chi}(x)$ or $g_2(x) = \tilde{\psi}(x)$. Nevertheless, by using either equation (2) or (3) we have that $\min_j \max_{u' \in g_j(u)} d_H(u', \hat{x}) \le d_G(u, x) + \epsilon D$. As there is an edge in $H$ between $\hat{x}$ and every vertex in $g_1(v)$, we conclude that equation (4) holds.

• Else, if $v \in \tilde{G}$ and there is an $i \in [p]$ such that $u \in G_i \setminus \tilde{G}$: There is necessarily a vertex $x \in Q_i$ such that there is a shortest path from $u$ to $v$ in $G$ going through $x$. By the second case, there is an index $j'$ such that $\max_{x' \in g_{j'}(x),\, v' \in g_1(v)} d_H(x', v') \le d_G(x, v) + \epsilon D$. As $x \in \tilde{G}$, $j' \in \{1, 2\}$. In any case, a copy of $H_i$ was attached to $\tilde{H}$ by identifying clique vertices. In particular, some vertex $\hat{x} \in g_{j'}(x)$ was identified with the apex vertex $f_i(x)$ (from the relevant copy). Therefore there is an index $j''$ such that $\hat{x}$ has edges towards all the vertices in $g_{j''}(u)$. We conclude,
$$\min_j \max_{u' \in g_j(u),\, v' \in g_1(v)} d_H(u', v') \le \max_{u' \in g_{j''}(u),\, v' \in g_1(v)} \left( d_H(u', \hat{x}) + d_H(\hat{x}, v') \right) \le d_G(u, x) + \max_{x' \in g_{j'}(x),\, v' \in g_1(v)} d_H(x', v') \le d_G(u, x) + d_G(x, v) + \epsilon D = d_G(u, v) + \epsilon D .$$

• Else, if there is an $i \in [p]$ such that $u, v \in G_i \setminus \tilde{G}$: There is a copy $H^1_i$ of $H_i$ which is embedded as is into $H$ and contains all the vertices in $g_1(v)$.
By the induction hypothesis,
$$\min_j \max_{u' \in g_j(u),\, v' \in g_1(v)} d_H(u', v') \le \min_j \max_{u' \in g_{i,j}(u),\, v' \in g_{i,1}(v)} d_{H_i}(u', v') \le d_{G_i}(u, v) + \epsilon D = d_G(u, v) + \epsilon D .$$

• Else, there are $i \neq i' \in [p]$ such that $u \in G_i \setminus \tilde{G}$ and $v \in G_{i'} \setminus \tilde{G}$: There are necessarily vertices $y \in Q_i$ and $x \in Q_{i'}$ such that there is a shortest path from $u$ to $v$ in $G$ going through $y$ and $x$. Note that the copy $H^1_{i'}$ of $H_{i'}$ containing $g_1(v)$ was added to $H$ by identifying $f^1_{i'}(Q_{i'})$ with a clique copy $Q^1_{i'}$ of $Q_{i'}$. In particular, there is a copy $\hat{x} \in Q^1_{i'}$ of $x$ which has edges towards all the vertices in $g_1(v)$. There are two cases:

– If $\hat{x} \in g_1(x)$, then by the third case there is an index $j$ such that $\max_{u' \in g_j(u)} d_H(u', \hat{x}) \le \max_{u' \in g_j(u),\, x' \in g_1(x)} d_H(u', x') \le d_G(u, x) + \epsilon D$. As there is an edge from $\hat{x}$ to every copy of $v$ in $g_1(v)$, we conclude that
$$\max_{u' \in g_j(u),\, v' \in g_1(v)} d_H(u', v') \le \max_{u' \in g_j(u)} d_H(u', \hat{x}) + \max_{v' \in g_1(v)} d_H(\hat{x}, v') \le d_G(u, x) + \epsilon D + d_G(x, v) = d_G(u, v) + \epsilon D .$$

– Else, $\hat{x} \in g_2(x)$. Necessarily $\tilde{f}$ failed on $Q_{i'}$ and $\tilde{f}(Q_{i'})$ contains two clique copies $Q^1_{i'}, Q^2_{i'}$ of $Q_{i'}$. It holds that $g_2(x) = \tilde{\psi}(x)$, thus by Lemma 18 there is an index $j$ such that $\max_{y' \in g_j(y)} d_H(y', \hat{x}) \le \max_{y' \in g_j(y),\, x' \in \tilde{\psi}(x)} d_H(y', x') \le d_G(x, y) + \epsilon D$. Note that there is an edge from $\hat{x}$ to every copy of $v$ in $g_1(v)$. Further, there is an index $j''$ such that $\hat{y}$ has edges towards all the vertices in $g_{j''}(u)$. We conclude,
$$\min_j \max_{u' \in g_j(u),\, v' \in g_1(v)} d_H(u', v') \le \max_{u' \in g_{j''}(u)} d_H(u', \hat{y}) + d_H(\hat{y}, \hat{x}) + \max_{v' \in g_1(v)} d_H(\hat{x}, v') \le d_G(u, y) + \max_{y' \in g_{j'}(y),\, x' \in g_2(x)} d_H(y', x') + d_G(x, v) \le d_G(u, y) + d_G(y, x) + \epsilon D + d_G(x, v) = d_G(u, v) + \epsilon D .$$

Remark 3.

The clan embedding in Theorem 5 directly implies a weaker version of Theorem 4, where the only difference is that the distortion guarantee applies only to pairs where both $u, v \in M$, rather than to every pair with $u \in M$. Note that this weaker version is still strong enough for our application to the $\rho$-independent set problem in Theorem 7. Sketch: sample a clan embedding $(f, \chi)$ using Theorem 5. Return $g = \chi$ with the set $M = \{ v \in V \mid |f(v)| = 1 \}$. The weaker distortion guarantee and failure probability are straightforward.

Organization: in Sections 8.1, 8.2 and 8.3 we provide the algorithms (and proofs) for our QPTAS for the metric $\rho$-independent set problem, our QPTAS for the metric $\rho$-dominating set problem, and our compact routing scheme, respectively.

We begin with a discussion on approximation schemes for the metric $\rho$-dominating/independent set problems in bounded treewidth graphs. In the $(k, r)$-center problem we are given a graph $G = (V, E, w)$, and the goal is to find a set $S$ of centers of cardinality at most $k$ such that every vertex $v \in V$ is at distance at most $r$ from some center $u \in S$. Katsikarelis, Lampis and Paschos [KLP19] provided a PTAS for the $(k, r)$-center problem in treewidth $\mathrm{tw}$ graphs using a dynamic programming approach. Specifically, for any parameters $k, r \in \mathbb{N}$ and $\epsilon \in (0, 1)$, they provided an algorithm running in $O(\frac{\mathrm{tw}}{\epsilon})^{\mathrm{tw}} \cdot \mathrm{poly}(n)$ time that either returns a solution to the $(k, (1 + \epsilon) r)$-center problem, or (correctly) declares that there is no valid solution to the $(k, r)$-center problem in $G$. This dynamic programming can be easily generalized to the case where there is a measure $\mu : V \to \mathbb{R}_+$ and a terminal set $K \subseteq V$. Specifically, the algorithm will either return a set $S$ of measure $\mu(S) \le k$ such that every vertex $v \in K$ is at distance at most $(1 + \epsilon) r$ from $S$, or will declare that there is no set $S$ of measure at most $k$ at distance at most $r$ from every vertex in $K$.

As was observed by Fox-Epstein et al.
[FKS19], using [KLP19] one can construct a bicriteria PTAS for the metric $\rho$-dominating set problem in treewidth $\mathrm{tw}$ graphs with $O(\frac{\mathrm{tw}}{\epsilon})^{\mathrm{tw}} \cdot \mathrm{poly}(n)$ running time. [FKS19] studied the basic version (with uniform measure and $K = V$); however, this observation holds for the general case as well. In a follow-up paper, Katsikarelis et al. [KLP20] constructed a similar dynamic program for the $\rho$-independent set problem with the same $O(\frac{\mathrm{tw}}{\epsilon})^{\mathrm{tw}} \cdot \mathrm{poly}(n)$ running time. It could also be generalized to work with a measure $\mu$. This dynamic program was also promised to appear in the full version of [FKS19]. We conclude this discussion:

Theorem 12 ([KLP19, KLP20]). There is a bicriteria polynomial approximation scheme (PTAS) for both the metric $\rho$-independent set and the metric $\rho$-dominating set problems in treewidth $\mathrm{tw}$ graphs with running time $O(\frac{\mathrm{tw}}{\epsilon})^{\mathrm{tw}} \cdot \mathrm{poly}(n)$.

8.1 QPTAS for the $\rho$-Independent Set Problem in Minor-Free Graphs

This subsection is devoted to proving the following theorem:

Theorem 7 (Metric ρ -independent set) . There is a bicriteria quasi-polynomial approximationscheme (QPTAS) for the metric ρ -independent set problem in K r -free graphs.Speciﬁcally, given a weighted n -vertex K r -free graph G = ( V, E, w ) , measure µ ∶ X → R + andparameters (cid:15) ∈ ( , ) , ρ > , in ˜ O r ( log2 n(cid:15) ) time, one can ﬁnd a ( − (cid:15) ) ρ -independent set S ⊆ Y suchthat for every ρ -independent set ˜ S , µ ( S ) ≥ ( − (cid:15) ) µ ( ˜ S ) .Proof. Create a new graph G ′ from G by adding a single vertex ψ at distance ρ from all theother vertices. G ′ is K r + -minor free. Note that For every u, v ∈ Y it holds that d G ′ ( u, v ) = min { ρ, d G ( u, v )} . Thus G ′ has diameter at most ρ . Furthermore, for every ρ ′ ∈ ( , ρ ) , a set S ⊆ V is a ρ ′ -independent set in G if and only if S is a ρ ′ -independent set in G ′ . Using Theorem 4with parameters (cid:15) ′ = (cid:15) and δ = (cid:15) , let g be an embedding of G ′ into a treewidth O r ( log n(cid:15) ) graph H with a set M ⊆ V ∪ { ψ } such that (1) for every u, v ∈ M , d H ( g ( u ) , g ( v )) ≤ d G ′ ( u, v ) + (cid:15) ⋅ ρ < d G ′ ( u, v ) + (cid:15)ρ , and (2) for every v ∈ V , Pr [ v ∈ M ] ≥ − (cid:15) .Deﬁne a new measure µ H in H , where for each v ∈ G ′ , µ H ( v ′ ) = ⎧⎪⎪⎨⎪⎪⎩ v ′ ∉ g ( V ∩ M ) µ ( v ) else, g ( v ) = v ′ for some v ∈ M ∖ { ψ } . In particular, µ H ( g ( ψ )) =

0. Using Theorem 12 we ﬁnd a ( − (cid:15) ) ρ -independent set S H w.r.t. µ H ,such that for every ρ -independent set ˜ S in H it holds that µ H ( S H ) ≥ ( − (cid:15) ) µ H ( ˜ S ) . We can assumethat S H ⊆ g ( M ) , as the measure of all vertices out of g ( M ) is 0. We will return S = g − ( S H ) , notethat S ⊆ M . First we argue that S is a ( − (cid:15) ) ρ -independent set. For every u, v ∈ S , g ( u ) , g ( u ) ∈ S H thus ( − (cid:15) ) ρ ≤ d H ( g ( u ) , g ( v )) ≤ d G ′ ( u, v ) + (cid:15) ρ ≤ d G ( u, v ) + (cid:15) ρ , implying d G ( u, v ) ≥ ( − (cid:15) ) ρ .Let S opt be a ρ -independent set w.r.t. d G of maximal measure. As g is dominat-ing embedding, g ( S opt ∩ M ) is a ρ -independent set in H . By linearity of expectation E [ µ ( S opt ∖ M )] = ∑ v ∈ S opt µ ( v ) ⋅ Pr [ v ∉ M ] ≤ (cid:15) ⋅ µ ( S opt ) . Using Markov inequalityPr [ µ ( S opt ∩ M ) < ( − (cid:15) ) µ ( S opt )] = Pr [ µ ( S opt ∖ M ) ≥ (cid:15) µ ( S opt )] ≤ E [ µ ( S opt ∖ M )] (cid:15) µ ( S opt ) ≤ . Thus with probability at least , H contains a ρ -independent set g ( S opt ∩ M ) of measure µ H ( g ( S opt ∩ M )) = µ ( S opt ∩ M ) ≥ ( − (cid:15) ) µ ( S opt ) . If this event indeed occurs, the independentset S H returned by [FKS19] algorithm will be of measure greater than ( − (cid:15) )( − (cid:15) ) µ ( S opt ) >( − (cid:15) ) µ ( S opt ) . High probability could be obtained by repeating the above algorithm O ( log n ) times and returning the independent set of maximal cardinality among the observed solutions. Remark 4.

The algorithm above can be derandomized as follows: first note that the algorithm from Theorem 12 is deterministic. Next, during the construction in the proof of Theorem 4, each time we execute Lemma 14 we pick $\sigma$ uniformly at random out of $O(\frac{1}{\delta})$ options, where $\delta = \Theta(\frac{\epsilon}{\log n})$. As we bound the probability $\Pr[v \notin M]$ using a simple union bound, the bound still holds if we pick the same $\sigma$ in all the executions of Lemma 14. We conclude that we can sample the embedding of Theorem 4 from a distribution with support size $O(\frac{\log n}{\epsilon})$. A derandomization follows.

8.2 QPTAS for the $\rho$-Dominating Set Problem in Minor-Free Graphs

We restate the main theorem of the subsection for convenience.

Theorem 8 (Metric ρ -dominating set) . There is a bicriteria quasi-polynomial approximationscheme (QPTAS) for the metric ρ -dominating set problem in K r -free graphs.Speciﬁcally, given a weighted- n vertex K r -free graph G = ( V, E, w ) , measure µ ∶ V → R + , a subsetof terminals K ⊆ V , and parameters (cid:15) ∈ ( , ) , ρ > , in ˜ O r ( log2 n(cid:15) ) time, one can ﬁnd a ( + (cid:15) ) ρ -dominating set S ⊆ V such that for every ρ -dominating set ˜ S , µ ( S ) ≤ ( + (cid:15) ) µ ( ˜ S ) .Proof. Similarly to Theorem 7, we start by constructing an auxiliary graph G ′ from G by addinga single vertex ψ at distance 2 ρ from all the other vertices. Extend the measure µ to ψ by setting µ ( ψ ) = ∞ . For every u, v ∈ V it holds that d G ′ ( u, v ) = min { ρ, d G ( u, v )} . It follows that G ′ is a K r + -minor free graph with diameter bounded by 4 ρ . In particular, for every ρ ′ ∈ ( , ρ ) , a set S ⊆ V is ρ ′ -dominating set (w.r.t. K ) in G if and only if S is ρ ′ dominating set in G ′ (w.r.t. K ).Using Theorem 5 with parameters (cid:15) ′ = (cid:15) and δ = (cid:15) , let ( f, χ ) be a clan embeddings of G ′ into atreewidth O r ( log n(cid:15) ) graph H ′ with additive distortion (cid:15) ′ ⋅ ρ = (cid:15) ρ . Deﬁne a new measure µ H in H ,where for each v ′ ∈ H , µ H ( v ′ ) = ⎧⎪⎪⎨⎪⎪⎩∞ v ′ ∉ f ( V ) µ ( v ) v ′ ∈ f ( v ) Set also K H = χ (K) ⊆ H to be our set of terminals. Using Theorem 12, we ﬁnd a ( + (cid:15) ) ρ -dominating set A H , such that for every χ ( v ) ∈ K H , d H ( χ ( v ) , A H ) ≤ ( + (cid:15) ) ρ , and for every ( + (cid:15) ) ρ -dominating set ˜ A w.r.t. K H it holds that µ H ( A H ) ≤ ( + (cid:15) ) µ H ( ˜ A ) . We can assume that A H contains only vertices from f ( V ) (as all other vertices have measure ∞ , while K H itself is legalsolution of ﬁnite measure). We will return A = f − ( A H ) = { u ∈ V ∣ f ( u ) ∩ A H ≠ ∅} .First we argue that A is a ( + (cid:15) ) ρ -dominating set. For every vertex v ∈ K , χ ( v ) ∈ K H . 
Thereforethere is a vertex ˆ u ∈ A H such that d H ( χ ( v ) , ˆ u ) ≤ ( + (cid:15) ) ρ . In particular, our solution A containsthe vertex u such that ˆ u ∈ f ( u ) . As ( f, χ ) is dominating embedding we conclude d G ( u, v ) ≤ min u ′ ∈ f ( u ) d H ( u ′ , χ ( v )) ≤ d H ( ˆ u, χ ( v )) ≤ ( + (cid:15) ) ρ < ( + (cid:15) ) ρ . Second, we argue that A has nearly optimal measure. Let A opt be a ρ -dominating set in G w.r.t. K of minimal measure. As ( f, χ ) has additive distortion (cid:15) ρ , f ( A opt ) is a ( + (cid:15) ) ρ -dominatingset in H (w.r.t. K H ). Indeed, consider a vertex χ ( v ) ∈ K H (for v ∈ K ). There is a vertex u ∈ A opt such that d G ( u, v ) ≤ ρ . It holds that d H ( f ( A opt ) , χ ( v )) ≤ min u ′ ∈ f ( u ) d H ( u ′ , χ ( v )) ≤ d G ( u, v ) + (cid:15) ρ ≤ ( + (cid:15) ) ρ By Theorem 12, we will ﬁnd a ( + (cid:15) ) ρ -dominating set of measure at most ( + (cid:15) ) µ H ( f ( A opt )) in H . By linearity of expectation, E [ µ H ( f ( A opt )] = ∑ u ∈ A opt µ ( u ) ⋅ E [∣ f ( u )∣] ≤ ( + (cid:15) ) ⋅ µ ( A opt ) . µ H ( f ( A opt )) ≥ µ H ( χ ( A opt )) = µ ( A opt ) . Using Markov inequality,Pr [ µ H ( f ( A opt )) ≥ ( + (cid:15) ) ⋅ µ ( A opt )] = Pr [ µ H ( f ( A opt )) − µ ( A opt ) ≥ (cid:15) µ ( A opt )]≤ E [ µ H ( f ( A opt )) − µ ( A opt )] (cid:15) µ ( A opt ) ≤ (cid:15) (cid:15) = . Thus with probability at least , H contains ( + (cid:15) ) ρ -dominating set of measure ( + (cid:15) ) µ ( A opt ) .If this event indeed occurs, the independent set A H returned by Theorem 12 will be of measureat most ( + (cid:15) ) µ ( A opt ) < ( + (cid:15) ) µ ( A opt ) . High probability could be obtained by repeating thealgorithm above O ( log n ) times and returning the set of minimum measure among the observeddominating sets. We restate the main theorem the subsection for convince.

Theorem 6 (Compact routing scheme). Given a weighted graph $G = (V, E, w)$ on $n$ vertices and an integer parameter $k > 1$, there is a compact routing scheme with stretch $O(k \log \log n)$ that has (worst case) labels (and headers) of size $O(\log n)$, and the expected size of the routing table of each vertex is $O(n^{1/k})$.

We begin by presenting a result of Thorup and Zwick [TZ01] regarding routing in a tree.

Theorem 13 ([TZ01]). For any tree $T = (V, E)$ (where $|V| = n$), there is a routing scheme with stretch $1$ that has routing tables of size $O(1)$ and labels (and headers) of size $O(\log n)$.

Recall that we measure space in machine words, where each word is $\Theta(\log n)$ bits. We stress the extremely small routing table size obtained in [TZ01]. Note that when a vertex receives a packet with a header, it makes the routing decision based only on the routing table, and does not require any knowledge of its own label. In particular, the routing table contains a unique identifier of the vertex.

An additional ingredient that our construction requires is a distance labeling scheme for trees. A distance labeling assigns to each point $x \in X$ a label $l(x)$, and there is an algorithm $A$ (oblivious to $(X, d)$) that, provided the labels $l(x), l(y)$ of arbitrary $x, y \in X$, can compute $d(x, y)$. Specifically, a distance labeling is said to have stretch $t \ge 1$ if
$$\forall x, y \in X, \qquad d(x, y) \le A(l(x), l(y)) \le t \cdot d(x, y) .$$
We refer to [FGK20] for an overview of distance labeling schemes in different regimes (and a comparison with metric embeddings). Exact distance labeling on an $n$-vertex tree requires $\Theta(\log n)$ words [AGHP16] (see also [GPPR04, Pel00]), which is already larger than the routing table size we are aiming for. Nonetheless, Freedman et al. [FGNW17] (improving upon [AGHP16, GKK+]) showed that for any $n$-vertex unweighted tree and $\epsilon \in (0, 1)$, one can construct a $(1 + \epsilon)$-labeling scheme with labels of size $O(\log \frac{1}{\epsilon})$ words. Note that this can be simply extended to a weighted tree with polynomial aspect ratio (by subdividing edges).

Theorem 14 ([FGNW17]). For any $n$-vertex tree $T = (V, E)$ with polynomial aspect ratio, and parameter $\epsilon \in (0, 1)$, there is a distance labeling scheme with stretch $1 + \epsilon$ and $O(\log \frac{1}{\epsilon})$ label size.

Proof of Theorem 6. We combine Theorem 3 with Theorem 13 and Theorem 14 to construct a compact routing scheme.
We begin by sampling a spanning clan embedding $(f, \chi)$ of $G$ into a tree $T$ with distortion $O(k \log \log n)$ such that for every vertex $v \in V$, $E[|f(v)|] \le n^{1/k}$. Using Theorem 14, we construct a distance labeling scheme for $T$ with stretch at most $2$ and $O(1)$ label size. That is, each vertex $v' \in T$ has a label $l_{dl}(v')$ of constant size, such that for every pair $v', u' \in T$, $d_T(v', u') \le A(l_{dl}(v'), l_{dl}(u')) \le 2 \cdot d_T(v', u')$ (dl stands for distance labeling). Using Theorem 13, we construct a compact routing scheme for $T$, such that each $v' \in T$ has a label $\ell_{crs}(v')$ of size $O(\log |T|) = O(\log n)$, and a routing table $\tau_{crs}(v')$ of size $O(1)$ (crs stands for compact routing scheme).

We construct a compact routing scheme for $G$ as follows: for every vertex $v \in V$, its label is defined to be $\ell_G(v) = (\ell_{crs}(\chi(v)), l_{dl}(\chi(v)))$, and its table $\tau_G(v)$ to be the concatenation of $\{ (\tau_{crs}(v'), l_{dl}(v')) \}_{v' \in f(v)}$. In words, the label $\ell_G(v)$ consists of the routing label $\ell_{crs}(\chi(v))$ and the distance label $l_{dl}(\chi(v))$ of the chief $\chi(v)$ in $T$, while the routing table $\tau_G(v)$ consists of the routing table $\tau_{crs}(v')$ and the distance label $l_{dl}(v')$ of every copy $v'$ in the clan $f(v)$. Clearly, the size of the label is $O(\log n) + O(1) = O(\log n)$, while the expected size of the routing table is $E[\sum_{v' \in f(v)} O(1)] = O(1) \cdot E[|f(v)|] = O(n^{1/k})$.

Consider a node $v$ that wants to send a packet to a node $u$, while possessing the routing label $\ell_G(u)$ of $u$. $v$ will go over all the copies $v' \in f(v)$, and choose the copy $v_u$ that minimizes the estimated distance $A(l_{dl}(v'), l_{dl}(\chi(u)))$. Then, using the routing table $\tau_{crs}(v_u)$ of $v_u$, $v$ will make a routing decision and transfer the packet to the first vertex $z' \in T$ on the shortest path from $v_u$ to $\chi(u)$ in $T$.
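As an illustration of this first routing decision, the following Python sketch picks the copy in the clan of the source that is closest to the chief of the destination and returns the next tree vertex on the way. It is only a sketch under simplifying assumptions: the names `tree`, `clan`, `chief`, `tree_dist_and_parent` and `initial_decision` are hypothetical, exact tree distances stand in for the estimate $A(\cdot, \cdot)$ computed from the distance labels of Theorem 14, and an explicit parent pointer stands in for the constant-size routing tables of Theorem 13.

```python
def tree_dist_and_parent(tree, root):
    """Traverse a weighted tree from `root`.

    Returns (dist, parent): dist[x] is the tree distance from root to x,
    and parent[x] is the neighbor of x on the path towards root.
    """
    dist, parent = {root: 0.0}, {root: None}
    stack = [root]
    while stack:
        a = stack.pop()
        for b, w in tree[a].items():
            if b not in dist:
                dist[b] = dist[a] + w
                parent[b] = a
                stack.append(b)
    return dist, parent


def initial_decision(tree, clan, chief, v, u):
    """First routing decision at source v for destination u.

    Chooses the copy v_u in clan[v] with the smallest distance to chief[u]
    (exact distances replace the label-based estimate here), and returns
    (v_u, next_hop), where next_hop is the first tree vertex after v_u on
    the tree path to chief[u], or None if v_u is already chief[u].
    """
    target = chief[u]
    # Rooting the traversal at the target makes parent[x] the next hop from x.
    dist, parent = tree_dist_and_parent(tree, target)
    v_u = min(clan[v], key=lambda copy: dist[copy])
    return v_u, parent[v_u]


# Toy instance (hypothetical): a path a - b - c - d with weights 1, 1, 5.
tree = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 5.0},
    "d": {"c": 5.0},
}
clan = {"v": ["a", "d"], "u": ["c"]}   # f(v) = {a, d}, f(u) = {c}
chief = {"v": "a", "u": "c"}           # chi(v) = a, chi(u) = c

print(initial_decision(tree, clan, chief, "v", "u"))
```

In the toy instance, the copy `a` is at tree distance 2 from the chief `c` of the destination while the copy `d` is at distance 5, so the sketch selects `a` and forwards towards `b`. Only the scan over the clan depends on $|f(v)|$; every later hop would consult a single copy.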
$v$ will transfer this packet with a header consisting of the label of $u$ and the name of $z'$. This somewhat longer routing decision process occurs only when a delivery is initiated. In any other step, a node $z$ receives a packet with a header containing the routing label $\ell_G(u)$ of the destination and the name of a copy $z' \in f(z)$. Then $z$ uses the routing table $\tau_{crs}(z')$ of $z'$ to make a routing decision and transfer the packet to the first vertex $q' \in T$ on the shortest path from $z'$ to $\chi(u)$ in $T$. As previously, $z$ will transfer the packet with a header consisting of the label of $u$ and the name of $q'$. Clearly, the size of the header is $O(\log n)$. Note that other than the first decision, each decision is made in constant time (while the first decision is made in expected $O(n^{1/k})$ time).

Finally, when routing a packet starting at $v$ towards $u$, the path corresponds exactly to a path in $T$ from a copy $v_u \in f(v)$ to $\chi(u)$. The length of this path is bounded by
$$d_T(v_u, \chi(u)) \le A(l_{dl}(v_u), l_{dl}(\chi(u))) = \min_{v' \in f(v)} A(l_{dl}(v'), l_{dl}(\chi(u))) \le \min_{v' \in f(v)} 2 \cdot d_T(v', \chi(u)) = O(k \log \log n) \cdot d_G(v, u) .$$

Acknowledgments

The author is grateful to Philip Klein for suggesting the metric $\rho$-dominating/independent set problems, which eventually led to this project. The proof of Theorem 16 was communicated to us by Vincent Cohen-Addad, who generously allowed us to publish it. The author would like to thank Alexandr Andoni for helpful discussions. Finally, the author is most grateful to Hung Le for many fruitful discussions.

References

[ABLP90] B. Awerbuch, A. Bar-Noy, N. Linial, and D. Peleg. Improved routing strategies with succinct tables. J. Algorithms, 11(3):307–341, 1990. doi:10.1016/0196-6774(90)90017-9.

[ACE+20] I. Abraham, S. Chechik, M. Elkin, A. Filtser, and O. Neiman. Ramsey spanning trees and their applications. ACM Trans. Algorithms, 16(2):19:1–19:21, 2020. Preliminary version published in SODA 2018. doi:10.1145/3371039.

[AFGN18] I. Abraham, A. Filtser, A. Gupta, and O. Neiman. Metric embedding via shortest path decompositions. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 952–963, 2018. Full version: https://arxiv.org/abs/1708.04073. doi:10.1145/3188745.3188808.

[AGG+19] I. Abraham, C. Gavoille, A. Gupta, O. Neiman, and K. Talwar. Cops, robbers, and threatening skeletons: Padded decomposition for minor-free graphs. SIAM J. Comput., 48(3):1120–1145, 2019. Preliminary version published in STOC 2014. doi:10.1137/17M1112406.

[AGHP16] S. Alstrup, I. L. Gørtz, E. B. Halvorsen, and E. Porat. Distance labeling schemes for trees. Pages 132:1–132:16, 2016. doi:10.4230/LIPIcs.ICALP.2016.132.

[AHL02] N. Alon, S. Hoory, and N. Linial. The Moore bound for irregular graphs. Graphs Comb., 18(1):53–57, 2002. doi:10.1007/s003730200002.

[AKPW95] N. Alon, R. M. Karp, D. Peleg, and D. B. West. A graph-theoretic game and its application to the k-server problem. SIAM J. Comput., 24(1):78–100, 1995. Preliminary version published in On-Line Algorithms 1991. doi:10.1137/S0097539792224474.

[AMS99] N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. J. Comput. Syst. Sci., 58(1):137–147, 1999. Preliminary version published in STOC 1996. doi:10.1006/jcss.1997.1545.

[AN19] I. Abraham and O. Neiman. Using petal-decompositions to build a low stretch spanning tree. SIAM J. Comput., 48(2):227–248, 2019. Preliminary version published in STOC 2012. doi:10.1137/17M1115575.

[AP92] B. Awerbuch and D. Peleg. Routing with polynomial communication-space trade-off. SIAM J. Discret. Math., 5(2):151–162, 1992. doi:10.1137/0405013.

[AS03] V. Athitsos and S. Sclaroff. Database indexing methods for 3D hand pose estimation. In Gesture-Based Communication in Human-Computer Interaction, 5th International Gesture Workshop, GW 2003, Genova, Italy, April 15-17, 2003, Selected Revised Papers, pages 288–299, 2003. doi:10.1007/978-3-540-24598-8_27.

[AST90] N. Alon, P. D. Seymour, and R. Thomas. A separator theorem for graphs with an excluded minor and its applications. In Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, May 13-17, 1990, Baltimore, Maryland, USA, pages 293–299, 1990. doi:10.1145/100216.100254.

[Bak94] B. S. Baker. Approximation algorithms for NP-complete problems on planar graphs. Journal of the ACM, 41(1):153–180, 1994. Preliminary version published in FOCS 1983. doi:10.1145/174644.174650.

[Bar96] Y. Bartal. Probabilistic approximations of metric spaces and its algorithmic applications. Pages 184–193, 1996. doi:10.1109/SFCS.1996.548477.

[Bar98] Y. Bartal. On approximating arbitrary metrices by tree metrics. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, Dallas, Texas, USA, May 23-26, 1998, pages 161–168, 1998. doi:10.1145/276698.276725.

[Bar04] Y. Bartal. Graph decomposition lemmas and their role in metric embedding methods. In Algorithms - ESA 2004, 12th Annual European Symposium, Bergen, Norway, September 14-17, 2004, Proceedings, pages 89–97, 2004. doi:10.1007/978-3-540-30140-0_10.

[Bar11] Y. Bartal. Lecture notes in metric embedding theory and its algorithmic applications, 2011. URL: http://moodle.cs.huji.ac.il/cs10/file.php/67720/GM_Lecture6.pdf.

[BBM06] Y. Bartal, B. Bollobás, and M. Mendel. Ramsey-type theorems for metric spaces with applications to online problems. J. Comput. Syst. Sci., 72(5):890–921, 2006. Special Issue on FOCS 2001. doi:10.1016/j.jcss.2005.05.008.

[BCL+18] S. Bubeck, M. B. Cohen, Y. T. Lee, J. R. Lee, and A. Madry. k-server via multiscale entropic regularization. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 3–16, 2018. doi:10.1145/3188745.3188798.

[Ben66] C. T. Benson. Minimal regular graphs of girths eight and twelve. Canadian Journal of Mathematics, 18:1091–1094, 1966. doi:10.4153/CJM-1966-109-8.

[BFM86] J. Bourgain, T. Figiel, and V. Milman. On Hilbertian subsets of finite metric spaces. Israel J. Math., 55(2):147–152, 1986. doi:10.1007/BF02801990.

[BFN19] Y. Bartal, N. Fandina, and O. Neiman. Covering metric spaces by few trees. Pages 20:1–20:16, 2019. doi:10.4230/LIPIcs.ICALP.2019.20.

[BGS16] G. E. Blelloch, Y. Gu, and Y. Sun. A new efficient construction on probabilistic tree embeddings. CoRR, abs/1605.04651, 2016. https://arxiv.org/abs/1605.04651, arXiv:1605.04651.

[BL16] G. Borradaile and H. Le. Optimal dynamic program for r-domination problems over tree decompositions. Pages 8:1–8:23, 2016. doi:10.4230/LIPIcs.IPEC.2016.8.

[BLMN05a] Y. Bartal, N. Linial, M. Mendel, and A. Naor. On metric Ramsey-type dichotomies. Journal of the London Mathematical Society, 71(2):289–303, 2005. doi:10.1112/S0024610704006155.

[BLMN05b] Y. Bartal, N. Linial, M. Mendel, and A. Naor. Some low distortion metric Ramsey problems. Discret. Comput. Geom., 33(1):27–41, 2005. doi:10.1007/s00454-004-1100-z.

[BLW17] G. Borradaile, H. Le, and C. Wulff-Nilsen. Minor-free graphs have light spanners. In FOCS '17, pages 767–778, 2017. doi:10.1109/FOCS.2017.76.

[Bou85] J. Bourgain. On Lipschitz embedding of finite metric spaces in Hilbert space. Israel Journal of Mathematics, 52(1-2):46–52, 1985. doi:10.1007/BF02776078.

[BR10] A. Babu and J. Radhakrishnan. An entropy based proof of the Moore bound for irregular graphs. CoRR, abs/1011.1058, 2010. arXiv:1011.1058.

[CFKL20] V. Cohen-Addad, A. Filtser, P. N. Klein, and H. Le. On light spanners, low-treewidth embeddings and efficient traversing in minor-free graphs. CoRR, abs/2009.05039, 2020. To appear in FOCS 2020. https://arxiv.org/abs/2009.05039, arXiv:2009.05039.

[CG04] D. E. Carroll and A. Goel. Lower bounds for embedding into distributions over excluded minor graph families. In Algorithms - ESA 2004, 12th Annual European Symposium, Bergen, Norway, September 14-17, 2004, Proceedings, pages 146–156, 2004. doi:10.1007/978-3-540-30140-0_15.

[CG12] T. H. Chan and A. Gupta. Approximating TSP on metrics with bounded global growth. SIAM J. Comput., 41(3):587–617, 2012. Preliminary version published in SODA 2008. doi:10.1137/090749396.

[Che13] S. Chechik. Compact routing schemes with improved stretch. In ACM Symposium on Principles of Distributed Computing, PODC '13, Montreal, QC, Canada, July 22-24, 2013, pages 33–41, 2013. doi:10.1145/2484239.2484268.

[Che15] S. Chechik. Approximate distance oracles with improved bounds. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 1–10, 2015. doi:10.1145/2746539.2746562.

[CJLV08] A. Chakrabarti, A. Jaffe, J. R. Lee, and J. Vincent. Embeddings of topological graphs: Lossy invariants, linearization, and 2-sums. Pages 761–770, 2008. doi:10.1109/FOCS.2008.79.

[CKM19] V. Cohen-Addad, P. N. Klein, and C. Mathieu. Local search yields approximation schemes for k-means and k-median in Euclidean and minor-free metrics. SIAM J. Comput., 48(2):644–667, 2019. Preliminary version published in FOCS 2016. doi:10.1137/17M112717X.

[Cow01] L. Cowen. Compact routing with minimum stretch. J. Algorithms, 38(1):170–183, 2001. Preliminary version published in SODA 1999. doi:10.1006/jagm.2000.1134.

[DFHT05] E. D. Demaine, F. V. Fomin, M. T. Hajiaghayi, and D. M. Thilikos. Fixed-parameter algorithms for (k, r)-center in planar graphs and map graphs. ACM Trans. Algorithms, 1(1):33–47, 2005. Preliminary version published in ICALP 2003. doi:10.1145/1077464.1077468.

[DHK05] E. D. Demaine, M. Hajiaghayi, and K. Kawarabayashi. Algorithmic graph minor theory: Decomposition, approximation, and coloring. In

J. Algorithms , 38(1):170–183,2001. preliminary version published in SODA 1999, doi:10.1006/jagm.2000.1134 .5[DFHT05] E. D. Demaine, F. V. Fomin, M. T. Hajiaghayi, and D. M. Thilikos. Fixed-parameter algorithms for ( k , r )-center in planar graphs and map graphs. ACMTrans. Algorithms , 1(1):33–47, 2005. preliminary version published in ICALP 2003, doi:10.1145/1077464.1077468 . 7[DHK05] E. D. Demaine, M. Hajiaghayi, and K. Kawarabayashi. Algorithmic graph minortheory: Decomposition, approximation, and coloring. In

Proceedings of the 46th An-nual IEEE Symposium on Foundations of Computer Science , pages 637–646, 2005, doi:10.1109/SFCS.2005.14 . 6[EEST08] M. Elkin, Y. Emek, D. A. Spielman, and S. Teng. Lower-stretch spanning trees.

SIAM J. Comput. , 38(2):608–628, 2008. preliminary version published in STOC 2005, doi:10.1137/050641661 . 3[EGP03] T. Eilam, C. Gavoille, and D. Peleg. Compact routing schemes with low stretchfactor.

J. Algorithms , 46(2):97–114, 2003. preliminary version published in PODC1998, doi:10.1016/S0196-6774(03)00002-6 . 5[EILM16] H. Eto, T. Ito, Z. Liu, and E. Miyano. Approximability of the distance indepen-dent set problem on regular graphs and planar graphs. In

Combinatorial Optimiza-tion and Applications - 10th International Conference, COCOA 2016, Hong Kong,China, December 16-18, 2016, Proceedings , pages 270–284, 2016, doi:10.1007/978-3-319-48749-6\_20 . 7[EKM14] D. Eisenstat, P. N. Klein, and C. Mathieu. Approximating k -center in planar graphs.In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Al-gorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014 , pages 617–627,2014, doi:10.1137/1.9781611973402.47 . 4, 6, 7, 8[EN19] M. Elkin and O. Neiman. Eﬃcient algorithms for constructing very sparse spannersand emulators.

ACM Trans. Algorithms , 15(1):4:1–4:29, 2019. preliminary versionpublished in SODA 2017, doi:10.1145/3274651 . 3, 27[FFKP18] A. E. Feldmann, W. S. Fung, J. K¨onemann, and I. Post. A (1+ (cid:15) )-embedding oflow highway dimension graphs into bounded treewidth graphs.

SIAM J. Comput. ,47(4):1667–1704, 2018. preliminary version published in ICALP 2015, doi:10.1137/16M1067196 . 11[FGK20] A. Filtser, L. Gottlieb, and R. Krauthgamer. Labelings vs. embeddings: On distributedrepresentations of distances. In

Proceedings of the 2020 ACM-SIAM Symposium on iscrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020 , pages1063–1075, 2020, doi:10.1137/1.9781611975994.65 . 45[FGNW17] O. Freedman, P. Gawrychowski, P. K. Nicholson, and O. Weimann. Optimal distancelabeling schemes for trees. In Proceedings of the ACM Symposium on Principles ofDistributed Computing, PODC 2017, Washington, DC, USA, July 25-27, 2017 , pages185–194, 2017, doi:10.1145/3087801.3087804 . 45[Fil19] A. Filtser. On strong diameter padded decompositions. In

Approximation,Randomization, and Combinatorial Optimization. Algorithms and Techniques, AP-PROX/RANDOM 2019, September 20-22, 2019, Massachusetts Institute of Tech-nology, Cambridge, MA, USA , pages 6:1–6:21, 2019, doi:10.4230/LIPIcs.APPROX-RANDOM.2019.6 . 10[FKS19] E. Fox-Epstein, P. N. Klein, and A. Schild. Embedding planar graphs into low-treewidth graphs with applications to eﬃcient approximation schemes for metric prob-lems. In

Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algo-rithms , SODA ‘19, page 1069–1088, 2019, doi:10.1137/1.9781611975482.66 . 4, 7,8, 42, 43[Fre87] G. N. Frederickson. Fast algorithms for shortest paths in planar graphs, with ap-plications.

SIAM J. Comput. , 16(6):1004–1022, 1987, doi:10.1137/0216064 . 53,54[FRT04] J. Fakcharoenphol, S. Rao, and K. Talwar. A tight bound on approximating arbi-trary metrics by tree metrics.

J. Comput. Syst. Sci. , 69(3):485–497, November 2004.preliminary version published in STOC 2003, doi:10.1016/j.jcss.2004.04.011 . 1,3[GKK +

01] C. Gavoille, M. Katz, N. A. Katz, C. Paul, and D. Peleg. Approximate dis-tance labeling schemes. In

Algorithms - ESA 2001, 9th Annual European Sympo-sium, Aarhus, Denmark, August 28-31, 2001, Proceedings , pages 476–487, 2001, doi:10.1007/3-540-44676-1\_40 . 45[GKK17] L. Gottlieb, A. Kontorovich, and R. Krauthgamer. Eﬃcient regression in metric spacesvia approximate lipschitz extension.

IEEE Trans. Inf. Theory , 63(8):4838–4849, 2017.preliminary version published in SIMBAD 2013, doi:10.1109/TIT.2017.2713820 . 1[GPPR04] C. Gavoille, D. Peleg, S. P´erennes, and R. Raz. Distance labeling in graphs.

J.Algorithms , 53(1):85–112, 2004. preliminary version published in SODA 2001, doi:10.1016/j.jalgor.2004.05.002 . 45[HBK +

03] E. Halperin, J. Buhler, R. M. Karp, R. Krauthgamer, and B. Westover. Detecting pro-tein sequence conservation via metric embeddings.

Bioinformatics , 19(suppl 1):i122–i129, 07 2003, arXiv:https://academic.oup.com/bioinformatics/article-pdf/19/suppl\_1/i122/614436/btg1016.pdf , doi:10.1093/bioinformatics/btg1016 .1 51Ind01] P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In , pages 10–33, 2001, doi:10.1109/SFCS.2001.959878 .1[Jor69] C. Jordan. Sur les assemblages de lignes. Journal f¨ur die reine und angewandteMathematik , 70:185–190, 1869. 32[Kar89] R. M. Karp. A 2k-competitive algorithm for the circle.

Manuscript, August , 5, 1989.1[KKM +

12] M. Khan, F. Kuhn, D. Malkhi, G. Pandurangan, and K. Talwar. Eﬃcient distributedapproximation algorithms via probabilistic tree embeddings.

Distributed Comput. ,25(3):189–205, 2012. preliminary version published in PODC 2008, doi:10.1007/s00446-012-0157-9 . 1[KLMN05] R. Krauthgamer, J. R. Lee, M. Mendel, and A. Naor. Measured descent: anew embedding method for ﬁnite metrics.

Geometric and Functional Analysis ,15(4):839–858, 2005. preliminary version published in FOCS 2004, doi:10.1007/s00039-005-0527-6 . 11[KLP19] I. Katsikarelis, M. Lampis, and V. T. Paschos. Structural parameters, tight bounds,and approximation for (k, r)-center.

Discret. Appl. Math. , 264:90–117, 2019. pre-liminary version published in ISAAC 2017, doi:10.1016/j.dam.2018.11.002 . 7, 8,42[KLP20] I. Katsikarelis, M. Lampis, and V. T. Paschos. Structurally parameterized d-scatteredset.

Discrete Applied Mathematics , 2020. preliminary version published in WG 2018, doi:10.1016/j.dam.2020.03.052 . 7, 8, 42[Lem03] A. Lemin. On ultrametrization of general metric spaces.

Proceedings of the Americanmathematical society , 131(3):979–989, 2003, doi:10.1090/S0002-9939-02-06605-4 .1[LLR95] N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some of itsalgorithmic applications.

Comb. , 15(2):215–245, 1995. preliminary version publishedin FOCS 1994, doi:10.1007/BF01200757 . 1[LUW95] F. Lazebnik, V. A. Ustimenko, and A. J. Woldar. A new series of dense graphsof high girth.

Bulletin of the American mathematical society , 32(1):73–79, 1995, doi:10.1090/S0273-0979-1995-00569-0 . 27[MN07] M. Mendel and A. Naor. Ramsey partitions and proximity data structures.

Journal ofthe European Mathematical Society , 9(2):253–275, 2007. preliminary version publishedin FOCS 2006, doi:10.4171/JEMS/79 . 1[MP15] D. Marx and M. Pilipczuk. Optimal parameterized algorithms for planar facilitylocation problems using voronoi diagrams. In

Algorithms - ESA 2015 - 23rd AnnualEuropean Symposium, Patras, Greece, September 14-16, 2015, Proceedings , pages 865–877, 2015, doi:10.1007/978-3-662-48350-3\_72 . 7, 852NT12] A. Naor and T. Tao. Scale-oblivious metric fragmentation and the nonlinear dvoret-zky theorem.

Israel Journal of Mathematics , 192(1):489–504, 2012, doi:10.1007/s11856-012-0039-7 . 1[Pel00] D. Peleg. Proximity-preserving labeling schemes.

J. Graph Theory , 33(3):167–176, 2000. preliminary version published in WG 1999, doi:10.1002/(SICI)1097-0118(200003)33:3<167::AID-JGT7>3.0.CO;2-5 . 45[PU89] D. Peleg and E. Upfal. A trade-oﬀ between space and eﬃciency for routing tables.

J.ACM , 36(3):510–530, 1989, doi:10.1145/65950.65953 . 5[Rao99] S. Rao. Small distortion and volume preserving embeddings for planar and Eu-clidean metrics. In

Proceedings of the Fifteenth Annual Symposium on ComputationalGeometry, Miami Beach, Florida, USA, June 13-16, 1999 , pages 300–306, 1999, doi:10.1145/304893.304983 . 11[RR98] Y. Rabinovich and R. Raz. Lower bounds on the distortion of embedding ﬁ-nite metric spaces in graphs.

Discret. Comput. Geom. , 19(1):79–94, 1998, doi:10.1007/PL00009336 . 1, 9, 28[RS03] N. Robertson and P. D. Seymour. Graph minors. XVI. Excluding a non-planargraph.

Journal of Combinatoral Theory Series B , 89(1):43–76, 2003, doi:10.1016/S0095-8956(03)00042-X . 9, 13, 14[Tal04] K. Talwar. Bypassing the embedding: algorithms for low dimensional metrics. In

STOC ’04: Proceedings of the thirty-sixth annual ACM symposium on Theory ofcomputing , pages 281–290. ACM Press, 2004, doi:http://doi.acm.org/10.1145/1007352.1007399 . 11[TZ01] M. Thorup and U. Zwick. Compact routing schemes. In

Proceedings of the Thir-teenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA2001, Heraklion, Crete Island, Greece, July 4-6, 2001 , pages 1–10, 2001, doi:10.1145/378580.378581 . 5, 6, 45[Wen91] R. Wenger. Extremal graphs with no c4’s, c6’s, or c10’s.

Journal of Combinatorial The-ory, Series B , 52(1):113 – 116, 1991, doi:https://doi.org/10.1016/0095-8956(91)90097-4 . 27

A Local Search Algorithms for Metric Becker Problems

In this section we present PTASs for the metric $\rho$-dominating/independent set problems under the uniform measure. Both algorithms are local search algorithms. The analysis of the algorithm for the metric $\rho$-dominating set problem was communicated to us by Vincent Cohen-Addad, who generously allowed us to publish it. This analysis uses techniques similar to the ones used in [CKM19] to construct PTASs for the $k$-means and $k$-median problems in minor-free graphs. The analysis for the metric $\rho$-independent set problem is original (even though similar).

Algorithm 4: Local search algorithm for metric $\rho$-dominating set
    input: $n$-vertex graph $G = (V, E, w)$, parameters $\rho, s$
    output: $\rho$-dominating set $S$
    $S \leftarrow V$
    while $\exists$ $\rho$-dominating set $S' \subseteq V$ s.t. $|S'| < |S|$ and $|S \setminus S'| + |S' \setminus S| \le s$ do
        $S \leftarrow S'$
    return $S$

In both proofs we will use $r$-divisions. The following theorem follows from [Fre87, AST90] (see [CKM19] for details).

Theorem 15 ([Fre87, AST90]). For every graph $H$, there is an absolute constant $c_H$ such that for every $r \in \mathbb{N}$ and every $n$-vertex $H$-free graph $G = (V, E)$, the vertices of $G$ can be divided into clusters $\mathcal{R}$ such that:
1. For every edge $\{u, v\} \in E$, there is a cluster $C \in \mathcal{R}$ such that $u, v \in C$.
2. For every $C \in \mathcal{R}$, $|C| \le r$.
3. Let $\mathcal{B}$ be the set of vertices appearing in more than a single cluster, called boundary vertices; then $\sum_{C \in \mathcal{R}} |C \cap \mathcal{B}| \le c_H \cdot \frac{n}{\sqrt{r}}$.

A.1 Local search for $\rho$-dominating set under uniform measure

We state and prove the theorem here when the set of terminals is $K = V$; however, it can easily be adapted to deal with a general terminal set.

Theorem 16. There is a polynomial-time approximation scheme (PTAS) for the metric $\rho$-dominating set problem in $H$-free graphs under uniform measure. Specifically, given a weighted $n$-vertex $H$-free graph $G = (V, E, w)$, and parameters $\epsilon \in (0, \frac{1}{2})$, $\rho > 0$, in $n^{O_{|H|}(\epsilon^{-2})}$ time, one can find a $\rho$-dominating set $S \subseteq V$ such that for every $\rho$-dominating set $\tilde{S}$, $|S| \le (1 + 2\epsilon)|\tilde{S}|$.

Proof. Set $r = \frac{c_H^2}{\epsilon^2}$, where $c_H$ is the constant from Theorem 15 w.r.t. $H$. Let $S$ be the set returned by the local search Algorithm 4 with parameters $\rho$ and $s = r = O_H(\epsilon^{-2})$. Clearly $S$ is a $\rho$-dominating set. The running time of each step of the while loop is at most $\binom{n}{s} \cdot \mathrm{poly}(n) = n^{O_{|H|}(\epsilon^{-2})}$; as there are at most $n$ iterations, the running time follows. Let $S_{opt}$ be a $\rho$-dominating set of minimum cardinality; it remains to prove that $|S| \le (1 + 2\epsilon)|S_{opt}|$.

Let $\tilde{V} = S \cup S_{opt}$, and let $\mathcal{P}$ be a partition of the vertices in $V$ w.r.t. the Voronoi cells with $\tilde{V}$ as centers. Specifically, each vertex $u \in V$ joins the cluster $P_v$ of the vertex $v \in \tilde{V}$ at minimal distance $\min_{v \in \tilde{V}} d_G(u, v)$. (For simplicity, we will assume that all the pairwise distances are unique. Alternatively, one can break ties in a consistent way, i.e., w.r.t. some total order.) Let $\tilde{G}$ be the graph obtained from $G$ by contracting the internal edges in each Voronoi cell (and keeping only a single copy of each edge). Alternatively, one can define $\tilde{G}$ with $\tilde{V}$ as vertex set such that $v, u \in \tilde{V}$ are adjacent iff there is an edge in $G$ between a vertex in $P_u$ and a vertex in $P_v$. Note that $\tilde{G}$ is a minor of $G$, and hence is $H$-free.

Next we use Theorem 15 on $\tilde{G}$ to obtain an $r$-division $\mathcal{R}$, with $\mathcal{B}$ as boundary vertices. Consider a cluster $C \in \mathcal{R}$, and let $C' = C \cap (\mathcal{B} \cup S_{opt})$. Fix $S' = (S \setminus C) \cup C'$.

Claim 3. $S'$ is a $\rho$-dominating set.

Proof. Consider a vertex $u \in V$. We will argue that $u$ is at distance at most $\rho$ from some vertex in $S'$. Let $v_1 \in S$ (resp. $v_2 \in S_{opt}$) be the closest vertex to $u$ in $S$ (resp. in $S_{opt}$). It holds that $d_G(u, v_1), d_G(u, v_2) \le \rho$. If either $v_1 \notin C$, $v_1 \in C \cap \mathcal{B}$, or $v_2 \in C$, then $S'$ contains at least one of $v_1, v_2$ and we are done. Thus we can assume that $v_1 \in C \setminus \mathcal{B}$ and $v_2 \notin C$. Let $\Pi = \{v_1 = z_0, z_1, \dots, z_a, u, w_0, w_1, \dots, w_b = v_2\}$ be the unique shortest path from $v_1$ to $v_2$ that goes through $u$ (the thick black line in the illustration).

Assume first that $u$ belongs to the Voronoi cell $P_{v_1}$ of $v_1$ (encircled by a blue dashed line in the illustration). For every $i$ and every $v' \in \tilde{V} \setminus \{v_1\}$ it holds that
\[
d_G(v', z_i) \ge d_G(v', u) - d_G(u, z_i) > d_G(v_1, u) - d_G(u, z_i) = d_G(v_1, z_i) .
\]
It follows that all the vertices $\{z_0, z_1, \dots, z_a\}$ belong to the Voronoi cell $P_{v_1}$. As $v_1 \in C \setminus \mathcal{B}$ and $v_2 \notin C$, there must be some index $j$ such that $w_j$ belongs to the Voronoi cell $P_{v_3}$ of some $v_3 \in C \cap \mathcal{B}$ (as otherwise there will be an edge in $\tilde{G}$ between a vertex in $C \setminus \mathcal{B}$ and a vertex not in $C$). It holds that
\[
d_G(u, v_3) \le d_G(u, w_j) + d_G(w_j, v_3) \le d_G(u, w_j) + d_G(w_j, v_2) = d_G(u, v_2) \le \rho ,
\]
where the first inequality follows by the triangle inequality, the second as $w_j \in P_{v_3}$, and the equality as $w_j$ lies on the shortest path from $u$ to $v_2$. As $v_3 \in C \cap \mathcal{B}$, it follows that $v_3 \in S'$, thus we are done. The case $u \in P_{v_2}$ is symmetric.

It holds that $|S' \setminus S| + |S \setminus S'| \le |C| \le r = s$. Thus necessarily $|S'| \ge |S|$, as otherwise Algorithm 4 would not have returned the set $S$. Hence $|C \cap (\mathcal{B} \cup S_{opt})| = |C'| \ge |C \cap S|$. As the same argument could be applied on every cluster $C \in \mathcal{R}$, we conclude that
\[
|S| = \sum_{C \in \mathcal{R}} |C \cap S| \le \sum_{C \in \mathcal{R}} |C \cap (\mathcal{B} \cup S_{opt})| \le |S_{opt}| + \sum_{C \in \mathcal{R}} |C \cap \mathcal{B}| \le |S_{opt}| + c_H \cdot \frac{|\tilde{V}|}{\sqrt{r}} \le |S_{opt}| + c_H \cdot \frac{|S|}{\sqrt{r}} .
\]
But this implies $|S_{opt}| \ge (1 - \frac{c_H}{\sqrt{r}})|S| = (1 - \epsilon)|S|$, thus $|S| \le \frac{1}{1 - \epsilon}|S_{opt}| \le (1 + 2\epsilon)|S_{opt}|$.
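The local search of Algorithm 4 can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names (`all_pairs_distances`, `is_dominating`, `neighbors`, `local_search_dominating`) and the edge-list input format are assumptions made for this example. Candidate sets are enumerated by brute force, so the sketch is only practical for small graphs and small $s$.

```python
from itertools import combinations


def all_pairs_distances(n, edges):
    """Floyd-Warshall on an undirected weighted graph with vertices 0..n-1."""
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def is_dominating(S, d, rho):
    """True iff every vertex is within distance rho of some vertex of S."""
    return all(any(d[u][v] <= rho for v in S) for u in range(len(d)))


def neighbors(S, n, s):
    """Yield every set whose symmetric difference with S has size 1..s."""
    for k in range(1, s + 1):
        for flip in combinations(range(n), k):
            yield S.symmetric_difference(flip)


def local_search_dominating(n, edges, rho, s):
    """Hypothetical rendering of Algorithm 4: start from V, shrink while possible."""
    d = all_pairs_distances(n, edges)
    S = set(range(n))  # S <- V
    improved = True
    while improved:
        improved = False
        for S2 in neighbors(S, n, s):
            # Accept a strictly smaller rho-dominating set within s swaps.
            if len(S2) < len(S) and is_dominating(S2, d, rho):
                S, improved = S2, True
                break
    return S
```

For instance, on the unit-weight path 0-1-2-3-4 with `rho=1.0` and `s=2`, the call `local_search_dominating(5, edges, 1.0, 2)` ends with a dominating set of size 2. Each pass of the inner loop inspects at most $n^{O(s)}$ candidates, which matches the per-step cost used in the running-time argument above.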
A.2 Local search for $\rho$-independent set under uniform measure

Theorem 17. There is a polynomial-time approximation scheme (PTAS) for the metric $\rho$-independent set problem in $H$-free graphs under uniform measure. Specifically, given a weighted $n$-vertex $H$-free graph $G = (V, E, w)$, and parameters $\epsilon \in (0, \frac{1}{2})$, $\rho > 0$, in $n^{O_{|H|}(\epsilon^{-2})}$ time, one can find a $\rho$-independent set $S \subseteq V$ such that for every $\rho$-independent set $\tilde{S}$, $|S| \ge (1 - \epsilon)|\tilde{S}|$.

Algorithm 5: Local search algorithm for metric $\rho$-independent set
    input: $n$-vertex graph $G = (V, E, w)$, parameters $\rho, s$
    output: $\rho$-independent set $S$
    $S \leftarrow \emptyset$
    while $\exists$ $\rho$-independent set $S' \subseteq V$ s.t. $|S'| > |S|$ and $|S \setminus S'| + |S' \setminus S| \le s$ do
        $S \leftarrow S'$
    return $S$

Proof. Set $r = \frac{c_H^2}{\epsilon^2}$, where $c_H$ is the constant from Theorem 15 w.r.t. $H$. Let $S$ be the set returned by the local search Algorithm 5 with parameters $\rho$ and $s = r = \frac{c_H^2}{\epsilon^2} = O_H(\epsilon^{-2})$. Clearly $S$ is a $\rho$-independent set. The running time of each step of the while loop is at most $\binom{n}{s} \cdot \mathrm{poly}(n) = n^{O_{|H|}(\epsilon^{-2})}$; as there are at most $n$ iterations, the running time follows. Let $S_{opt}$ be a $\rho$-independent set of maximum cardinality; it remains to prove that $|S| \ge (1 - \epsilon)|S_{opt}|$.

Construct a graph $\tilde{G}$ with $\tilde{V} = S \cup S_{opt}$ as a vertex set and edge set $E'$: we add an edge between $u, v \in \tilde{V}$ iff $d_G(u, v) < \rho$. Clearly all the edges are from $S \times S_{opt}$ (as both $S, S_{opt}$ are $\rho$-independent sets). Note that $\tilde{G}$ is a minor of $G$. This is because the shortest paths $P_{u,v}$ for $\{u, v\} \in E'$ do not intersect. To see this, assume for contradiction that there are different $u, u' \in S_{opt}$ and $v, v' \in S$ such that $\{u, v\}, \{u', v'\} \in E'$, and there is some vertex $z$ such that $z \in P_{u,v} \cap P_{u',v'}$. W.l.o.g. assume that $d_G(u, z) + d_G(u', z) \le d_G(z, v) + d_G(z, v')$. Using the triangle inequality it follows that
\[
d_G(u, u') \le d_G(u, z) + d_G(u', z) \le \tfrac{1}{2} \cdot \left( d_G(u, z) + d_G(z, v) + d_G(u', z) + d_G(z, v') \right) = \tfrac{1}{2} \cdot \left( d_G(u, v) + d_G(u', v') \right) < \rho ,
\]
a contradiction.

Next we use Theorem 15 on $\tilde{G}$ to obtain an $r$-division $\mathcal{R}$, with $\mathcal{B}$ as boundary vertices. Consider a cluster $C \in \mathcal{R}$, and let $C' = (C \cap S_{opt}) \setminus \mathcal{B}$. Fix $S' = (S \setminus C) \cup C'$.

Claim 4. $S'$ is a $\rho$-independent set.

Proof. Consider a pair of vertices $u, v \in S'$; we will show that $d_G(u, v) \ge \rho$. If both $u, v$ belong to $S$, then as $S$ is a $\rho$-independent set it follows that $d_G(u, v) \ge \rho$. The same argument holds if both $u, v$ belong to $S_{opt}$. We thus can assume w.l.o.g. that $u \in S \setminus S_{opt}$ and $v \in S_{opt} \setminus S$. It follows that $u \notin C$ while $v \in C$. However, as $v \in C \cap S'$, necessarily $v \notin \mathcal{B}$. The only vertices in $C$ with edges towards vertices outside of $C$ are in $\mathcal{B}$. It follows that $\{u, v\}$ is not an edge of $\tilde{G}$, implying $d_G(u, v) \ge \rho$.

It holds that $|S' \setminus S| + |S \setminus S'| \le |C| \le r = s$. Thus necessarily $|S'| \le |S|$, as otherwise Algorithm 5 would not have returned the set $S$. Hence $|(C \cap S_{opt}) \setminus \mathcal{B}| = |C'| \le |C \cap S|$. As the same argument could be applied on every cluster $C \in \mathcal{R}$, we conclude that
\[
|S| = \sum_{C \in \mathcal{R}} |C \cap S| \ge \sum_{C \in \mathcal{R}} |(C \cap S_{opt}) \setminus \mathcal{B}| \ge |S_{opt}| - \sum_{C \in \mathcal{R}} |C \cap \mathcal{B}| \ge |S_{opt}| - c_H \cdot \frac{|\tilde{V}|}{\sqrt{r}} \ge |S_{opt}| - c_H \cdot \frac{|S|}{\sqrt{r}} .
\]
But this implies that $|S_{opt}| \le (1 + \frac{c_H}{\sqrt{r}})|S| = (1 + \epsilon)|S|$, thus $|S| \ge \frac{1}{1 + \epsilon}|S_{opt}| \ge (1 - \epsilon)|S_{opt}|$.
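Algorithm 5 admits a similar brute-force sketch. The code below is an illustration under stated assumptions rather than the authors' implementation: the names `is_independent` and `local_search_independent` are hypothetical, the input is taken to be a precomputed shortest-path distance matrix `d`, and a set is treated as $\rho$-independent when all pairwise distances are at least $\rho$, following the edge rule $d_G(u, v) < \rho$ used in the proof.

```python
from itertools import combinations


def is_independent(S, d, rho):
    """True iff every two distinct vertices of S are at distance >= rho."""
    return all(d[u][v] >= rho for u, v in combinations(sorted(S), 2))


def local_search_independent(d, rho, s):
    """Hypothetical rendering of Algorithm 5: start empty, grow while possible.

    d is an n x n matrix of shortest-path distances (assumed precomputed).
    """
    n = len(d)
    S = set()  # S <- empty set
    improved = True
    while improved:
        improved = False
        for k in range(1, s + 1):
            for flip in combinations(range(n), k):
                # S2 differs from S in exactly the k flipped vertices.
                S2 = S.symmetric_difference(flip)
                if len(S2) > len(S) and is_independent(S2, d, rho):
                    S, improved = S2, True
                    break
            if improved:
                break
    return S
```

On the five-point path metric `d[i][j] = abs(i - j)` with `rho=2` and `s=1`, this sketch returns `{0, 2, 4}`. Larger values of `s` let the search exchange several vertices at once, which is what the analysis relies on when it compares $S$ with $(S \setminus C) \cup C'$ for a cluster $C$ of size at most $r = s$.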