Complexity and Geometry of Sampling Connected Graph Partitions

Abstract

In this paper, we prove intractability results about sampling from the set of partitions of a planar graph into connected components. Our proofs are motivated by a technique introduced by Jerrum, Valiant, and Vazirani. Moreover, we use gadgets inspired by their technique to provide families of graphs where the "flip walk" Markov chain used in practice for this sampling task exhibits exponentially slow mixing. Supporting our theoretical results we present some empirical evidence demonstrating the slow mixing of the flip walk on grid graphs and on real data. Inspired by connections to the statistical physics of self-avoiding walks, we investigate the sensitivity of certain popular sampling algorithms to the graph topology. Finally, we discuss a few cases where the sampling problem is tractable. Applications to political redistricting have recently brought increased attention to this problem, and we articulate open questions about this application that are highlighted by our results.

LORENZO NAJT∗, DARYL DEFORD†, JUSTIN SOLOMON†

1. Introduction.

The problem of graph partitioning, or dividing the vertices of a graph into a small number of connected subgraphs that extremize an objective function, is a classical task in graph theory with applications to network analytics, machine learning, computer vision, and other areas. Whereas this task is well studied in computation and mathematics, a related problem remains relatively understudied: understanding how a given partition compares to other members of the set of possible partitions. In this case, the goal is not to generate a partition with favorable properties, but rather to compare a given partition to some set of alternatives. Recent analyses of political redistricting have invoked such comparisons (see §).

Consider a graph $G = (V, E)$, and let $P_k(G)$ denote the collection of $k$-partitions of $V$ such that each block induces a connected subgraph. One approach to understanding how a given element of $P_2(G)$ compares to the other elements proceeds by uniformly sampling from $P_2(G)$, then gathering statistics about this sample and comparing them to the partition under consideration. While this is an attractive approach, this uniform sampling problem is computationally intractable, assuming NP $\neq$ RP.

We open the paper by reviewing this fact. Our first new result is that this intractability persists even if we consider partitions of equal size. Then, motivated to produce a result that is more relevant to the classes of graphs that arise in redistricting, we show that uniformly sampling $P_2(G)$ remains intractable even if $G$ is a maximal plane graph with a constant bound on the vertex degree. Beyond sampling from the uniform distribution, we also prove results about the intractability of sampling from a broader class of distributions over connected $k$-partitions.

Such worst-case results should not be considered proof that uniformly sampling from $P_k(G)$ is always impossible. However, they do indicate that algorithm designers should examine sampling heuristics with some skepticism.
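The ensemble approach described above can be made concrete on very small graphs, where $P_2(G)$ can be listed exhaustively. The following Python sketch is an illustration only and is not an algorithm from the paper: the helper names (`is_connected`, `connected_2_partitions`, `cycle_graph`, `path_graph`) are hypothetical, the graph is assumed to be a non-empty adjacency dictionary, and the enumeration takes time exponential in $|V|$, which is why it cannot serve as a practical sampler.

```python
import random
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if `nodes` is non-empty and induces a connected subgraph."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def connected_2_partitions(adj):
    """List all unordered connected 2-partitions {A, B} by brute force."""
    vertices = sorted(adj)
    anchor, rest = vertices[0], vertices[1:]
    partitions = []
    # Keep the anchor vertex in A so each unordered pair is listed once.
    for r in range(len(rest)):  # r < len(rest) keeps B non-empty
        for extra in combinations(rest, r):
            a = frozenset((anchor,) + extra)
            b = frozenset(vertices) - a
            if is_connected(a, adj) and is_connected(b, adj):
                partitions.append((a, b))
    return partitions


def cycle_graph(n):
    """Adjacency dictionary of the cycle on vertices 0..n-1 (n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def path_graph(n):
    """Adjacency dictionary of the path on vertices 0..n-1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


if __name__ == "__main__":
    plans = connected_2_partitions(cycle_graph(6))
    rng = random.Random(0)
    block_a, block_b = rng.choice(plans)  # uniform, since the list is exhaustive
    print(len(plans), sorted(block_a), sorted(block_b))
```

On a cycle with $n$ vertices, a connected 2-partition corresponds to a choice of two edges to cut, so the list has $\binom{n}{2}$ entries; on a path it has $n - 1$.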
Driven by this philosophy, we follow up our investigation of the worst-case complexity with an investigation into the applicability of a general and often extremely useful sampling tool, Markov chain Monte Carlo.

In the context of redistricting, Markov chains have rapidly become a popular tool for sampling from $P_k(G)$ to compare a districting plan against alternatives (§). A standard chain of this kind moves through $P_2(G)$ by proposing to change the block assignment of a uniformly chosen node, and accepting such moves only if the connectivity of each block is preserved [16, 31]. We call this chain the flip walk (Definition 3.1). If $G$ is 2-connected, then the flip walk on $P_2(G)$ is irreducible, and the stationary distribution is uniform [12]. In principle, running the flip walk on $P_2(G)$ for a long time will produce a uniformly random element of $P_2(G)$. For this approach to be computationally feasible, however, one must guarantee that the mixing time of the Markov chain on $P_2(G)$ is not too large compared to $|G|$.

Pursuing this angle, we explain how to engineer a family of graphs $\mathcal{G}$ so that, for $G \in \mathcal{G}$, the mixing time of the flip walk on $P_2(G)$ grows exponentially quickly in $|G|$. Based on this, as well as some empirical work motivated by the bottlenecks we discover, we conclude that there are strong reasons to doubt that Markov chain methods based on the flip walk mix in polynomial time.

∗ University of Wisconsin-Madison. Corresponding author. † Massachusetts Institute of Technology. arXiv [cs.CC].

In addition to this, we make a connection with the literature on self-avoiding walks [43] that demonstrates the existence of dramatic phase transitions in the qualitative behavior of distributions on $P_2(G)$. We provide experiments illustrating the relevance of these phase transitions to redistricting and, inspired by the ideas in those experiments, we examine the robustness of other popular approaches to sampling from $P_2(G)$, including some methods based on spanning trees [37, 41].
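A single move of the flip walk, as described informally above, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: Definition 3.1 is not reproduced in this section, so the lazy treatment of rejected proposals, the encoding of a 2-partition as a dictionary from nodes to block labels 0 and 1, and the helper names (`flip_step`, `grid_graph`, `is_connected`) are all choices made for illustration.

```python
import random


def is_connected(nodes, adj):
    """Return True if `nodes` is non-empty and induces a connected subgraph."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def flip_step(assignment, adj, rng):
    """Propose flipping one uniformly chosen node; keep the old state on rejection."""
    node = rng.choice(sorted(assignment))
    proposal = dict(assignment)
    proposal[node] = 1 - proposal[node]
    blocks = [{v for v in proposal if proposal[v] == label} for label in (0, 1)]
    # Accept only if both blocks are non-empty and connected.
    if all(is_connected(block, adj) for block in blocks):
        return proposal
    return assignment


def grid_graph(rows, cols):
    """Adjacency dictionary of the rows-by-cols grid graph."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in adj:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (r + dr, c + dc) in adj:
                adj[(r, c)].append((r + dr, c + dc))
    return adj


if __name__ == "__main__":
    adj = grid_graph(4, 4)
    state = {v: 0 if v[1] < 2 else 1 for v in adj}  # left half vs. right half
    rng = random.Random(1)
    for _ in range(1000):
        state = flip_step(state, adj, rng)
    print(sorted(v for v in state if state[v] == 0))
```

Each step costs a full connectivity check here; the point of the sketch is the accept/reject rule, not efficiency, and nothing in it says how many steps are needed before the state is close to uniform.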
Overall, the observations we make highlight interesting and difficult challenges for the sampling algorithms and inference principles being used in statistical analysis of redistricting plans.

Finally, we discuss a few classes of graphs on which it is possible to sample uniformly from $P_2(G)$ in polynomial time, but which are far from the kinds of graphs relevant to redistricting. The large gap between where we know that uniform sampling is intractable and where we know it is tractable, along with some connections to outstanding problems from statistical physics that seem to be on par with the intended redistricting application, indicates that there are many challenging questions remaining about sampling from $P_k(G)$.

Overview and contributions.

As part of a broader effort to establish mathematical underpinnings for the analytical tools used in redistricting [12, 13, 31, 67, 77], we identify challenges and opportunities for further improvement related to random sampling in the space of graph partitions. In addition to the technical material listed below, we articulate some implicit assumptions behind outlier methods used in the analysis of gerrymandering (§).

• Sampling intractability results, and bottlenecks:
  – We review why it is intractable to sample uniformly from $P_2(G)$ (§), using the correspondence between $P_2(G)$ and the set of simple cycles of the dual.
  – One realistic condition to put on samplers from $P_2(G)$ is to restrict to the set of balanced partitions, that is, partitions for which both blocks have the same number of nodes. We show that uniformly sampling balanced 2-partitions remains NP-hard (§).
  – We also prove the intractability of uniformly sampling from $P_2(G)$ for an even more constrained family of graphs: planar triangulations with bounded vertex degree (§).
  – We prove that uniformly sampling $k$-partitions is intractable, using a generalization of bond-cycle duality (Appendix A.3).
  – The gadgets used in the intractability proofs provide a means for constructing families of graphs such that the flip walk Markov chain on $P_2(G)$ has exponentially large mixing time (§); that is, on these families the flip walk on $P_2(G)$ mixes torpidly.
• Empirical results:
  – We include some empirical evidence indicating that Markov chains based on the flip walk mix slowly on grid graphs (§) and on real data (§).
  – We mention a link between our sampling problem on grid graphs and long-standing challenges regarding the self-avoiding walk model from statistical physics (§), which bear on the behavior of distributions on $P_2(G)$.
  – We provide experiments (§) showing that sampling from $P_2(G)$ is impacted in surprising ways by the discretization used to build $G$.
• Positive results:
  – We prove that there are efficient and implementable dynamic programming algorithms that can be used to sample uniformly from $P_2(G)$ and to sample uniformly from the balanced partitions in $P_2(G)$, provided that $G$ is a series-parallel graph. This algorithm succeeds in some cases where the flip walk is unreliable. We observe that these sampling problems are tractable on graphs of bounded treewidth.

We are not the first team of researchers to have considered redistricting problems from a complexity point of view. Indeed, there are many papers showing that optimization problems related to designing the most "fair" or "unfair" districts are NP-hard, for various meanings of the word fair; works in this category include [13, 67]. Other researchers have explored the complexity of finding paths through the flip walk state space [12]. We will discuss other related work in the body of the paper, such as the connection to self-avoiding walks (§).

Figure 1.1: a) Kansas with county units [25], along with a connected 2-partition. b) The corresponding state dual graph overlaid. Much more granular subdivisions of a state are often used.

Let $G = (V, E)$ be a graph; unless otherwise specified, all of our graphs will be undirected, finite, and simple. If unspecified, $n := |V|$. Given a graph $G$, $V(G)$ denotes the set of nodes, and $E(G)$ the set of edges. An (ordered) $k$-partition $P = (V_1, V_2, \ldots, V_k)$ of $G$ is a list of disjoint subsets $V_i \subseteq V$ whose union is $V$, while an unordered $k$-partition is a set $\{V_1, \ldots, V_k\}$ satisfying the same conditions. Throughout this paper we will be concerned with connected $k$-partitions, i.e., those $k$-partitions where each $V_i$ induces a connected subgraph. The set of ordered connected $k$-partitions of $G$ is denoted $P_k(G)$, and the set of unordered connected $k$-partitions of $G$ is denoted $P_k(G)$. If $A \subset V$, then we will use $\partial_E A$ to denote the edge boundary of $A$: $\partial_E A = \{\{u, v\} \in E : u \in A, v \notin A\}$.
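The notation above translates directly into code. The sketch below, in Python, shows one way to compute the edge boundary $\partial_E A$ and to test whether a list of blocks is a connected $k$-partition; the function names and the edge-list representation are assumptions made for this illustration and do not come from the paper.

```python
def edge_boundary(edges, a):
    """Edges with exactly one endpoint in the vertex set `a`."""
    a = set(a)
    return {frozenset(e) for e in edges if len(a.intersection(e)) == 1}


def induces_connected(block, edges):
    """Return True if `block` is non-empty and induces a connected subgraph."""
    block = set(block)
    if not block:
        return False
    adj = {v: set() for v in block}
    for u, v in edges:
        if u in block and v in block:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(block))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return seen == block


def is_connected_k_partition(vertices, edges, blocks):
    """Check that `blocks` are disjoint, cover `vertices`, and are each connected."""
    blocks = [set(b) for b in blocks]
    covered = set().union(*blocks)
    disjoint = sum(len(b) for b in blocks) == len(covered)
    return (
        disjoint
        and covered == set(vertices)
        and all(induces_connected(b, edges) for b in blocks)
    )


if __name__ == "__main__":
    path_edges = [(0, 1), (1, 2), (2, 3)]
    print(edge_boundary(path_edges, {0, 1}))
    print(is_connected_k_partition(range(4), path_edges, [{0, 1}, {2, 3}]))
```

For a connected 2-partition $\{A, B\}$, the set $\partial_E A$ is exactly the set of edges cut by the partition, which is the object that plane duality later turns into a simple cycle of the dual graph.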
In the United States, states are divided into small geographical units, such as in Figure 1.1a); these units are combined into voting districts, each of which elects a single representative. An assignment of these units to districts is called a districting plan. The units can be represented by the nodes of a graph, where units that share common boundaries are adjacent, as in Figure 1.1. This graph is called the state dual graph. Assuming that voting districts must be contiguous, as is usually the case, a districting plan with $k$ districts is modeled by a connected $k$-partition of the state dual graph.

It was quickly observed [57, 76] that by a clever choice of districting plan one could engineer aspects of electoral outcomes, a practice known as "gerrymandering." In an effort to counteract this, there have been many proposals to design districting plans algorithmically, a process which often involves grappling with computationally intractable problems [13, 67]. The reality of redistricting, however, is that the power to draw the graph partition is in the hands of a legislature, dedicated committee, or hired expert, rather than a piece of software. For this reason, rather than using an algorithm to draw plans in the first place, some have suggested analyzing already drawn plans for compliance with civil rights law or desirability relative to alternatives.

Arguments for or against districting plans are facilitated by understanding a plan in the context of what is possible.
For instance, an argument that a plan was drawn with the intent to discriminate might calculate that the proposed plan has more discriminatory properties than the vast majority of plans from a randomly generated collection of comparable plans; more specifically, the claim is that a particular map is an outlier compared to the other possibilities [28] (see also §). Such arguments rely on a diverse ensemble of plans that are compliant with the principles laid out by the governing body.

A variety of algorithms have been proposed to sample ensembles of graph partitions for this purpose, from genetic algorithms [32, 71] to random walks [31, 54]; recent expert reports in redistricting cases have used these tools to generate quantitative assessments of proposed plans [29, 54, 55, 80].

While random walk methods like [54, 77] are guaranteed to sample from an explicitly designed distribution if run for long enough, practical computational constraints make it impossible to reach that point if there are no guarantees on the mixing time. On the other hand, algorithms like [30, 41, 75], which are intended to generate a diverse set of partitions, sample from unknown distributions whose properties are hard to characterize.

Thus, two critical open problems arise when relying on measurements derived from random ensembles of districting plans. First, it is difficult to verify whether ensemble generation algorithms produce a statistically representative sample from a targeted distribution. We study this problem by asking whether certain distributions over partitions are efficiently sampleable (§).

2. Sampling Intractability.

In this section, we present our results about the intractability of various general sampling problems associated with connected $k$-partitions. The key idea [63] behind proving that some uniform sampling problem is intractable is to show that one can modify an algorithm that solves it into an algorithm that samples from the solutions to some hard problem. We begin by setting up some language (§), review the intractability of uniformly sampling from $P_2(G)$ (§), show that uniformly sampling from $P_2(G)$ remains intractable under certain constraints on the topology of $G$ (§), and show that, for connected $k$-partitions, certain generalizations of the uniform distribution on $P_k(G)$ are intractable to sample from (§).

In this section, we discuss some background on sampling problems, the class RP, why RP $\neq$ NP is a reasonable assumption, and what it means for a sampling problem to be intractable. We also prove lemmas that will be used throughout.

The formalism for sampling problems, which goes back to at least [63], begins with a finite alphabet $\Sigma$ and a binary relation between words in this alphabet, $R \subseteq \Sigma^* \times \Sigma^*$. We interpret $(x, y) \in R$ as asserting that $y$ is a solution to the instance $x$. For example, we can define a binary relation $R$ as those $(x, y)$ such that $x$ encodes a graph $G(x)$ and $y$ encodes the edges of a simple cycle of $G(x)$. We will consider only those relations that can be verified efficiently, which are called $p$-relations:

Definition ($p$-relations, [63]). A relation $R \subseteq \Sigma^* \times \Sigma^*$ is a $p$-relation if there is a deterministic polynomial-time Turing machine that recognizes $R$ and if there is a polynomial $p$ such that, for all $x$, $(x, y) \in R$ implies that $|y| \leq p(|x|)$. We define $R(x) = \{y \in \Sigma^* : (x, y) \in R\}$.

Now we define the sampling problems we will be considering:

Definition (Family of $p$-distributions). A family of $p$-distributions is defined by a $p$-relation $R$ and a function $f : \Sigma^* \to \mathbb{Q}_{\geq 0}$. For each instance $x \in \Sigma^*$ with $R(x) \neq \emptyset$, we require that $f$ is not identically zero on $R(x)$. For such an instance $x$, we associate a probability distribution $p_x$ on $R(x)$, where $y \in R(x)$ has weight proportional to $f(y)$. The uniform distribution on $R$ is defined by taking $f$ to be identically 1.

Definition. To each family of $p$-distributions $(R, p_x)$, there is an associated sampling problem, which we also refer to as $(R, p_x)$:

$P = (R, p_x)$ Sampling
Input: $x \in \{x \in \Sigma^* : R(x) \neq \emptyset\}$
Output: A sample drawn according to $p_x$.

Similar to approximation algorithms in the deterministic case, we can ask if a Turing machine "almost" solves a sampling problem:

Definition ($\alpha$-almost solving a sampling problem). Suppose that $P = (R, p_x)$ is some sampling problem. Let $\alpha \in [0, 1)$. We say that a probabilistic Turing machine $M$ $\alpha$-almost solves $P = (R, p_x)$ Sampling if, for all instances $x$ with $R(x) \neq \emptyset$, $M(x)$ accepts $x$ at least half the time and then outputs a sample from a distribution $q^M_x$, where $\|q^M_x - p_x\|_{TV} \leq \alpha$. In the case $\alpha = 0$, we say that $M$ solves the sampling problem.

We will use the complexity class RP to describe the intractability of a sampling problem.

Definition. RP is the class of languages $L \subseteq \Sigma^*$ such that there is a polynomial-time probabilistic Turing machine $M$ and a constant $\epsilon > 0$ so that, if $x \notin L$, $M(x)$ always rejects, and if $x \in L$, $M(x)$ accepts with probability at least $\epsilon$.

It is widely believed that RP $\neq$ NP; this belief follows from the widely believed conjectures that NP $\neq$ P [10] and BPP = RP = P [58]. Based on this reasoning, and arguments in the style of [63, Proposition 5.1] or [85, Theorem 1.17], which argue that there is likely no efficient algorithm for a sampling problem by showing that the existence of an efficient sampler would imply RP = NP, we make the following definition for when a sampling problem is intractable:

Definition. We say that a sampling problem $P$ is intractable on a language (or class) of instances $C$ if, for all $\alpha < 1$, the existence of a polynomial-time probabilistic Turing machine that $\alpha$-almost solves $P$ for all instances in $C$ implies that RP = NP.

Lemma 2.8 below abstracts the repetitive part of most proofs showing that a sampling problem is intractable. To state it cleanly, we make the following definition, which takes a $p$-relation $Q$, contained inside a $p$-relation $S$, and describes the set of instances that have solutions:

Definition (Decision problem in a $p$-relation). Let $S$ be a $p$-relation. A decision problem in $S$ is a $p$-relation $Q$, such that $Q \subseteq S$, with an associated language $L_Q = \{x \in \Sigma^* : Q(x) \neq \emptyset\}$. If $C \subseteq \Sigma^*$ is a language, and $L_Q(C) := \{x \in C : Q(x) \neq \emptyset\}$ is NP-complete, then we will say that $Q$ is a decision problem in the $p$-relation $S$ which is NP-complete on the language $C$.

Lemma 2.8. Consider some sampling problem $P = (R, p_x)$. Let $Q$ be some decision problem in a $p$-relation $S$, which is NP-complete on the language $C$. Suppose the following assumptions hold for some polynomials $p_m(n)$, $m \in \mathbb{N}_{\geq 1}$:
• There is a $p_m$-time Turing machine $B_m$ such that for any instance $x \in C$, $B_m$ constructs some $B_m(x) \in \Sigma^*$ with $R(B_m(x)) \neq \emptyset$.
• There is another $p_m$-time Turing machine $M_m$ that computes a map $\pi_m : R(B_m(x)) \to S(x)$.
• (Probability Concentration) If $|Q(x)| \geq 1$ and if $C$ is a random variable distributed according to $P$ on $R(B_m(x))$, then $\mathbb{P}(\pi_m(C) \in Q(x)) \geq 1 - 1/m$.
Then, $P$ is intractable on the language $B(C) = \{B(x) : x \in C\}$.

Proof. Fix $\alpha < 1$, and take $m = \lceil 2/(1 - \alpha) \rceil$. We fix $B = B_m$, $M = M_m$. Assume that there exists a polynomial-time probabilistic Turing machine $G$ that $\alpha$-almost solves $P$ on $B(C)$. We claim that Algorithm 2.1 gives an RP-algorithm for $L_Q(C)$. Algorithm 2.1 runs in polynomial time, since for any $X \in C$, constructing $B(X)$, sampling $C$ with $G$, and computing $\pi(C)$ with $M$ takes time polynomial in $|X|$. Thus, we only have to prove that the algorithm succeeds with the correct error bounds. Since Algorithm 2.1 clearly has no false positives, we only need to check that there is a constant lower bound on the true positive rate. We will show that if $|Q(X)| \geq 1$, then the probability of success is at least $1/m$. Let $q_Y$ be the distribution over $R(Y)$ of outputs of $G$ on input $Y$, and let $A = \{C \in R(B(X)) : \pi(C) \in Q(X)\}$. Since $\|p_{B(X)} - q_{B(X)}\|_{TV} \leq \alpha$, and in particular $p_{B(X)}(A) - q_{B(X)}(A) \leq \alpha$, it follows that $q_{B(X)}(A) \geq p_{B(X)}(A) - \alpha \geq 1 - 1/m - \alpha \geq 1/m$, where the last step uses $m \geq 2/(1 - \alpha)$. Hence, with probability at least $1/m$, the sample drawn by $G$ from $R(B(X))$ will land in $A$. In other words, if $|Q(X)| \geq 1$, then Algorithm 2.1 will answer YES with probability at least $1/m$. Since $L_Q(C)$ is NP-complete, it would follow that NP = RP. Since this argument holds for all $\alpha < 1$, $P$ is intractable on $B(C)$.

Algorithm 2.1 Lucky Guess
Input: $G$, $M$, $B$ and $x \in C$ as in the proof of Lemma 2.8.
  Construct $B(x)$
  Let $C$ be the output of $G$ on $B(x)$
  if $M(C) \in Q(x)$ then
    return YES
  else
    return NO

Figure 2.1: (a) Directed. (b) Undirected. The directed version from [63] and its undirected bigon counterpart.

Certain calculations appear repeatedly when checking the probability concentration hypothesis of Lemma 2.8. We isolate them here:

Lemma 2.9. If $H, N \geq 1$ and $H \geq DN$ for some $D > 0$, then $\frac{H}{H + N} \geq \frac{D}{1 + D}$.

Lemma 2.10. Fix $q \geq 2$. Then for any $e \in \mathbb{N}$ and $S \geq 1$, if $d \geq 2$ and $d \geq \left(\frac{\log(S) + e}{\log(q)}\right)^2$, then $q^d \geq S d^e$.

Proof. It suffices to pick $d$ so that $\frac{d}{\log(d)} \geq \frac{\log(S) + e}{\log(q)}$. Since $\frac{d}{\log(d)} \geq \sqrt{d}$ for $d \geq 2$, the claim follows.
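The control flow of Algorithm 2.1 is simple enough to sketch in Python. The following is an illustration of the pattern only, under loudly stated assumptions: the callables `build`, `sample`, `project`, and `is_solution` stand in for $B$, $G$, $M$, and the polynomial-time membership test for $Q(x)$, and the subset-sum toy used to exercise it is hypothetical. In particular, the toy sampler is a plain uniform sampler over subsets with no probability concentration guarantee, so it illustrates the one-sided error of the algorithm and not the hardness argument.

```python
import random


def lucky_guess(x, build, sample, project, is_solution, rng):
    """One run of the Lucky Guess pattern: answer YES only on a verified witness."""
    instance = build(x)                # B(x)
    candidate = sample(instance, rng)  # output of the sampler G on B(x)
    witness = project(x, candidate)    # M computes pi(C)
    return is_solution(x, witness)     # is pi(C) in Q(x)?


def repeated_guess(x, trials, build, sample, project, is_solution, rng):
    """Repeat independent runs; a single YES is conclusive, a NO is not."""
    return any(
        lucky_guess(x, build, sample, project, is_solution, rng)
        for _ in range(trials)
    )


# Hypothetical toy instance type: x = (numbers, target), solutions are
# non-empty sub-tuples of `numbers` that sum to `target`.
def toy_build(x):
    return x[0]


def toy_sample(numbers, rng):
    return tuple(n for n in numbers if rng.random() < 0.5)


def toy_project(x, candidate):
    return candidate


def toy_is_solution(x, witness):
    return len(witness) > 0 and sum(witness) == x[1]


TOY_PARTS = (toy_build, toy_sample, toy_project, toy_is_solution)

if __name__ == "__main__":
    rng = random.Random(0)
    print(repeated_guess(((1, 2, 4), 6), 500, *TOY_PARTS, rng))
    print(repeated_guess(((1, 2, 4), 100), 500, *TOY_PARTS, rng))
```

Because YES is returned only after the witness has been checked, a NO-instance can never be accepted; the probability concentration hypothesis of Lemma 2.8 is what guarantees that YES-instances are accepted with probability at least $1/m$ per run.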

The seminal paper [63] proves that the following sampling problem is intractable:

GenDirectedCycle
Input: A directed graph $G$.
Output: An element of the set of directed simple cycles of $G$, selected uniformly at random.

Theorem 2.11. The sampling problem GenDirectedCycle is intractable on the class of directed graphs.

We refer the reader to the original paper for the proof. The idea is to concentrate probability on longer cycles by replacing edges with chains of diamonds, as in Figure 2.1(a), which has the effect of increasing the number of ways to traverse a cycle by a quantity that grows exponentially in the length of the cycle.

Similarly, one can consider an undirected, plane graph version of the same problem:

GenSimpleCycle
Input: An undirected, plane graph $G$.
Output: An element of $SC(G)$, the set of simple cycles of $G$, selected uniformly at random.

Although it is similar to Theorem 2.11, we will include the proof that GenSimpleCycle is intractable on the class of plane graphs as a preface to our other results. To do so, we formalize an analog of the chain of diamonds construction:

Definition 2.12. Let $G$ be an (undirected) graph. Let $B_d(G)$ denote the graph obtained from $G$ by replacing each edge by a chain of $d$ bigons. Formally, we subdivide each edge into $d$ edges and then add a parallel edge for every edge. This construction is illustrated in Figure 2.1(b). There is a natural map $\pi_d : SC(B_d(G)) \to 2^{E(G)}$, which collapses the chains of bigons. Formally, $\pi_d(C) = \{e \in E(G) : B_d(e) \cap C \neq \emptyset\}$.

Footnote (to Theorem 2.11): We want to acknowledge a helpful Stack Exchange conversation [2] with Heng Guo that alerted us to this theorem.

Footnote: For there to be a Turing machine $M$ as in Lemma 2.8 that simulates $\pi_d$, the collection of $B_d(e)$ for $e \in E(G)$ has to be part of the encoding of the graph $B_d(G)$ as a string of $\Sigma^*$, and $B$ must construct an encoding of $B_d(G)$ with this information. We assume that all of this is true, and make sense of $SC(B_d(G))$ and similar objects in terms of the underlying graph. In general we will omit discussion of this level of detail.

Proposition 2.13. GenSimpleCycle is intractable on the class of simple plane graphs.

Proof. Given a polynomial-time probabilistic Turing machine $M$ which $\alpha$-almost solves GenSimpleCycle on the class of simple plane graphs, we obtain one that $\alpha$-almost solves GenSimpleCycle on the class of plane graphs by subdividing each edge of a given graph. We thus let $C$ be the class of plane graphs, and we now check that the conditions of Lemma 2.8 can be satisfied. For $G \in C$, with $n = |V(G)|$, we fix $m$ and take $d = n^2 + m$. Observe that $\mathrm{Im}(\pi_d) = SC(G) \cup E(G)$. If $X \in SC(G)$ is a simple cycle with $\ell$ edges, then $|\pi_d^{-1}(X)| = 2^{d\ell}$. For $e \in E(G)$, $|\pi_d^{-1}(e)| = d$. Thus, if $H \subset SC(G)$ is the set of Hamiltonian cycles of $G$, and $|H| \neq 0$, then

$|\pi_d^{-1}(H)| \geq 2^{dn} \geq m \, 2^{n^2} \, 2^{d(n-1)} \geq m \, |\pi_d^{-1}(2^{E(G)} \setminus H)|$;

here the middle inequality holds because $2^d \geq m \, 2^{n^2}$ for $d = n^2 + m$, and we have used $2^{n^2}$ as a crude upper bound on $|SC(G) \cup E(G)|$. Thus, by Lemma 2.9,

$\frac{|\pi_d^{-1}(H)|}{|SC(B_d(G))|} \geq \frac{m}{1 + m} \geq 1 - \frac{1}{m}$.

Define polynomial-time Turing machines $B$ and $M$ so that $B(G) = B_d(G)$ and $M(C) = \pi_d(C)$, and set $S(G) = E(G) \cup SC(G)$, $Q(G) = \mathrm{HamiltonianCycles}(G)$, and $R(G) = SC(G)$ for all $G \in C$. Since the Hamiltonian cycle problem is NP-complete on $C$ [49], the conditions of Lemma 2.8 are satisfied.

We now turn to $P_2(G)$. As discussed in §, the sampling problem of interest is the following:

GenConnected2Partition
Input: A graph $G$.
Output: An element of $P_2(G)$, selected uniformly at random.

We recall a fact about plane duality, which will connect Proposition 2.13 above to GenConnected2Partition.

Theorem 2.14. Let $G$ be a plane graph and $G^*$ its plane dual. Then there is a polynomial-time computable bijection between $SC(G^*)$ and $P_2(G)$.

Theorem. GenConnected2Partition is intractable on the class of plane graphs.

Proof. If $G$ is a plane graph, then a polynomial-time algorithm to sample uniformly from $P_2(G^*)$ gives an algorithm to sample uniformly from $SC(G)$ by Theorem 2.14. Now the result follows from Proposition 2.13.

This theorem implies that the broadest version of uniform sampling from the space of graph partitions is intractable. While this observation is already notable given how little is known about the graphs that appear in redistricting, our discussion does not end here. Rather, the following sections refine this result, and in §3 we will also highlight how the probability concentration gadgets identify concrete issues with Markov chains used for sampling partitions.
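The probability concentration behind Proposition 2.13 can be checked numerically on a small example. The Python sketch below is an illustration rather than part of the paper: it takes the lift counts stated in the proof ($2^{d\ell}$ lifts for a simple cycle with $\ell$ edges, and $d$ two-cycles inside the chain replacing each edge) as given, enumerates the simple cycles of $K_4$ by brute force, and computes the exact probability that a uniform simple cycle of $B_d(K_4)$ projects to a Hamiltonian cycle. The function names are hypothetical.

```python
from fractions import Fraction


def simple_cycles(adj):
    """All simple cycles (length >= 3) of a simple graph, as frozensets of edges."""
    cycles = set()

    def extend(path, visited):
        start, last = path[0], path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= 3:
                edges = zip(path, path[1:] + [start])
                cycles.add(frozenset(frozenset(e) for e in edges))
            elif nxt > start and nxt not in visited:
                # Only grow through larger vertices, so `start` is the minimum
                # vertex of every cycle found from it.
                extend(path + [nxt], visited | {nxt})

    for v in adj:
        extend([v], {v})
    return cycles


def hamiltonian_mass(adj, d):
    """Probability that a uniform simple cycle of B_d(G) projects to a Hamiltonian cycle."""
    n = len(adj)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    total = Fraction(num_edges * d)  # two-cycles inside single chains of bigons
    hamiltonian = Fraction(0)
    for cycle in simple_cycles(adj):
        lifts = 2 ** (d * len(cycle))  # one binary choice per bigon on the cycle
        total += lifts
        if len(cycle) == n:
            hamiltonian += lifts
    return hamiltonian / total


K4 = {i: [j for j in range(4) if j != i] for i in range(4)}

if __name__ == "__main__":
    for d in (1, 2, 5, 10):
        print(d, float(hamiltonian_mass(K4, d)))
```

$K_4$ has four triangles and three Hamiltonian 4-cycles, so the mass on Hamiltonian cycles is $3 \cdot 2^{4d} / (3 \cdot 2^{4d} + 4 \cdot 2^{3d} + 6d)$, which tends to 1 as $d$ grows.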

For applications in redistricting, the blocks of a connected partition should be roughly equal in population. This motivates studying the problem of sampling balanced partitions:

Definition ($\epsilon$-balanced simple cycles and 2-partitions). Let $SC^{\epsilon}(G)$ be the set of $\epsilon$-balanced simple cycles of a plane graph $G$, that is, those simple cycles such that if $\{A, B\}$ is the dual connected 2-partition of $G^*$ then $1 - \epsilon \leq \frac{|A|}{|B|} \leq 1 + \epsilon$ and $1 - \epsilon \leq \frac{|B|}{|A|} \leq 1 + \epsilon$. Similarly, we say that a partition $(A, B) \in P_2(G)$ is $\epsilon$-balanced if these inequalities hold for $\{A, B\}$; we define $P_2^{\epsilon}(G)$ to be the set of such partitions.

$\epsilon$-Balanced Uniform 2-Partition Sampling
Input: A plane graph $G$.
Output: An element of $P_2^{\epsilon}(G)$, selected uniformly at random.

The existence of a 0-balanced connected 2-partition of $G$ is not obvious. In fact, determining if there exists a balanced connected 2-partition of a given graph $G$ is NP-complete:

Footnote: We wish to acknowledge a helpful Stack Exchange discussion with Mikhail Rudoy which first drew our attention to this theorem [6].

Theorem 2.17 ([45]). The decision problem of whether a given connected graph $G$ has a 0-balanced, connected 2-partition is NP-complete.

Given Theorem 2.17, the problem of uniformly sampling from the set of 0-balanced connected 2-partitions is vacuously intractable. To circumvent this issue, we focus on the case where $G$ is 2-connected, since in that case a 0-balanced 2-partition always exists and can be constructed in polynomial time by constructive versions [90] of the Győri–Lovász Theorem [53, 72] for 2-partitions. As in the previous section, our strategy is to work with simple cycles, rather than connected partitions. In particular, the following classical theorem will let us translate a statement about Hamiltonian cycles into a statement about balanced Hamiltonian cycles:

Theorem (Grinberg's Theorem). Let $G$ be a plane graph. Assign to each face $F$ a weight $\deg(F) - 2$, where $\deg(F)$ is the number of edges in $F$. If $H$ is a Hamiltonian cycle of $G$, then the total weight of the faces inside of $H$ is equal to the total weight of the faces outside of $H$.

Grinberg's Theorem implies that every Hamiltonian cycle of a maximal plane graph is balanced, since every face is a triangle. Since the problem of determining whether a maximal plane graph has a Hamiltonian cycle is NP-complete [93], it follows that the problem of determining whether a maximal plane graph has a balanced Hamiltonian cycle is NP-complete.

We begin by defining the map $\pi$ and proving the probability concentration result used to apply Lemma 2.8. In particular, recalling the definition of $\pi_d$ (Definition 2.12), we define $\pi^{\epsilon}_d : SC^{\epsilon}(B_d(G)) \to 2^{E(G)}$ as the restriction of $\pi_d : SC(B_d(G)) \to 2^{E(G)}$ to the $\epsilon$-balanced simple cycles of $B_d(G)$. We derive the necessary inequalities in the following lemma:

Lemma. Let $G = (V, E)$ be a maximal plane graph, with $n = |V(G)|$. Let $C$ be a Hamiltonian cycle of $G$, and let $A$ be any non-Hamiltonian cycle of $G$. Let $\epsilon \geq 0$. Then we have

(2.1) $|(\pi^{\epsilon}_{2d})^{-1}(C)| \geq |(\pi^{0}_{2d})^{-1}(C)| = \binom{2dn}{dn} \geq \frac{2^{2dn}}{2dn + 1}$, and

(2.2) $|(\pi^{\epsilon}_{2d})^{-1}(A)| \leq |(\pi^{\infty}_{2d})^{-1}(A)| \leq 2^{2d(n-1)}$.

If $H$ is the set of Hamiltonian cycles of $G$, and $C \in H$, then for every $d$ at least the threshold given by Lemma 2.10 with $q = 4$, $S = 2^{m + n^2 + 2n}$, and $e = 1$,

(2.3) $|(\pi^{\epsilon}_{2d})^{-1}(H)| \geq m \, |\pi_{2d}^{-1}(2^{E(G)} \setminus H)|$.

Proof of Equation (2.1) and Equation (2.2). Since every Hamiltonian cycle of $G$ is balanced, the only way to lift the cycle to a balanced simple cycle of $B_{2d}(G)$ is to take the inward edge along exactly half of the bigons. For the lifts of non-Hamiltonian cycles, we can bound the number of lifts to a balanced simple cycle by the number of lifts to a cycle. These inequalities then follow from the standard fact that $\frac{2^{2r}}{2r + 1} \leq \binom{2r}{r}$.

Proof of Equation (2.3). By Lemma 2.10, for such $d$ we have

$2^{2dn} \geq 2^{m + n^2 + 2n} \, d \, 2^{2d(n-1)} \geq m \, 2^{n^2} (2dn + 1) \, 2^{2d(n-1)}$,

and thus

$|(\pi^{\epsilon}_{2d})^{-1}(H)| \geq |(\pi^{\epsilon}_{2d})^{-1}(C)| \geq \frac{2^{2dn}}{2dn + 1} \geq m \, 2^{n^2} \, 2^{2d(n-1)} \geq m \, |\pi_{2d}^{-1}(2^{E(G)} \setminus H)|$.

Proposition 2.20. Fix $\epsilon \geq 0$. Then, $\epsilon$-Balanced Uniform Simple Cycle Sampling is intractable on the class of graphs of the form $B_d(G)$, where $d \geq 1$ and $G$ is any maximal plane graph.

Proof. To prove this, we fit what we have calculated into the format of Lemma 2.8. Fix $m$. Then we take $d$ as in the preceding lemma, which is polynomial in the size of $G$, and set $B(G) = B_{2d}(G)$, and $M$ to compute $\pi_{2d}$. The probability concentration hypothesis follows from Lemma 2.9 and Equation (2.3), since they show that $\frac{|(\pi^{\epsilon}_{2d})^{-1}(H)|}{|SC^{\epsilon}(B_{2d}(G))|} \geq \frac{m}{1 + m} \geq 1 - \frac{1}{m}$. Finally, we set $Q$ to be the Hamiltonian cycles, and $C$ to be the class of maximal plane graphs.

Footnote: The phrase "$k$-partition" in [45] has a different meaning from the way we use it here. In their notation, a $k$-partition is a connected partition where each piece has size $k$. What we call a balanced connected 2-partition, they call an $n/2$-partition. The theorem is stated here in our notation.

Footnote: The authors wish to thank Gamow from Stack Exchange for a helpful comment [7] and directing us towards [93].

Theorem. Fix $\epsilon \geq 0$. Then $\epsilon$-Balanced Uniform 2-Partition Sampling is intractable on the class of 2-connected plane graphs.

Proof. $(B_d(G))^*$ is 2-connected when $G$ is a maximal plane graph. The claim now follows by Proposition 2.20 using Theorem 2.14.

Remark. These arguments work for any reasonable definition of nearly balanced that considers partitions with $|A| = |B|$ to be balanced.

In this section, we improve the results from §. First, we prove an NP-completeness theorem, building on the results in [49]: the Hamiltonian cycle problem is NP-complete on cubic, 3-connected plane graphs with bounded face degree. Second, in §, we use it to obtain the sampling intractability result for maximal plane graphs with bounded vertex degree.

Definition (CCP graphs with bounded face degree). Recall that a CCP graph is one that is 3-connected, cubic, simple, and plane. Let $C_m$ denote the collection of CCP graphs with face degree $\leq m$. Let $C = \bigcup_{k \geq 3} C_k$.

Definition. If $D$ is a language of graphs, let $D$-HAM $= \{G \in D : G \text{ is Hamiltonian}\}$.

Our goal in this section is to prove the following theorem:

Theorem 2.25. $C_m$-HAM is NP-complete for a suitable constant $m$.

We will prove a reduction from the following theorem:

Theorem C -HAM is NP -complete To obtain Theorem 2.25 from Theorem 2.26, we will show that we can take G ∈ C and construct a G (cid:48) ∈ C such that G (cid:48) has Hamiltonian cycle if and only if G does. We will obtain G (cid:48) from G by using analgorithm that subdivides the large faces repeatedly (Algorithm 2.2) in polynomial time (Proposition 2.33).The gadget that we use to subdivide large faces comes from the proof of Theorem 2.26 in [49], in particular,we use their 3-way OR gate gadget. We now review some relevant properties of that 3-way OR (3OR) thatwill be used in the reduction: Definition OR Gadget, [49]).

The OR gadget is pictured in Figure . This gadget has threedistinguished sets of attaching nodes, each of which consists of a path graph with nodes. Definition OR insertion). A OR gadget can be inserted into a face F of a plane graph bypicking 3 edges { e , e , e } of F , replacing each edge e i with a path containing nodes, and gluing each ofthe distinguished segments of the OR to one of those subdivided edges. For the plane embedding, we placethe rest of the OR into the interior of F , as in Figure . We will refer to this operation as inserting a3 OR into F at the edges { e , e , e } . The authors are grateful to P´alv¨olgyi D¨om¨ot¨or Honlapja for suggesting the proof strategy used in this section [3]. We areof course responsible for all errors in our execution of the strategy.9 igure 2.2: The 3OR gadget. See Figure A.1 in the appendix for more detail. The distinguished attachingnodes are colored red. Figure 2.3: Inserting a 3 OR . See Figure A.1 for more detail. Lemma

Let $H$ be a 3CCP graph, $F$ a face of $H$, and $\{e_1, e_2, e_3\}$ edges of $H$. Construct a graph $H'$ by attaching a 3OR gadget to the edges $\{e_1, e_2, e_3\}$. Then $H'$ has a Hamiltonian cycle if and only if $H$ has a Hamiltonian cycle containing at least one of the $e_i$. Additionally, $H'$ is a 3CCP graph.

Proof [49]. This amounts to an analysis of the local states, which are described in [49, Fig. 6B]. The proof that these are the only possible local states exploits detailed properties of the 3OR gadget; the reader can consult [49] for details. The proof that $H'$ is 3-connected follows from Lemma A.1, and checking that $H'$ is cubic and planar is straightforward.

Our strategy will be to take large faces and subdivide them using 3OR gadgets: Definition

Let F be a face of H . A subdivision of F is a graph H (cid:48) obtained bytaking edges of F , say e , e , e , where e and e share a vertex , and inserting a OR into F at e , e , e .See Figure for an illustration of this deﬁnition. In Figure A.1 we label a few regions of this subdivisionfor reference: , a pocket , and . The diﬀerence between ‘subdivision’ and ‘inserting a 3 OR ’ is that we require that 2 of the edges usedare adjacent. The following proposition shows that we can subdivide faces without changing Hamiltonicity: Proposition

Let $H$ be any 3CCP graph, and let $F$ be a face of $H$. Let $v$ be a vertex of $F$, let $e_1$ and $e_2$ be the edges of $F$ adjacent to $v$, and let $e_3$ be any other edge of $F$. Let $H'$ be the graph obtained by subdividing $H$ at $e_1$, $e_2$ and $e_3$. Then $H$ has a Hamiltonian cycle if and only if $H'$ has a Hamiltonian cycle. See Figure A.1.

Algorithm 2.2 Subdivision of faces with more than $d$ edges
Input: A plane graph $H$ and $d \in \mathbb{N}$.
1: Let $F$ be any face of $H$ with maximum degree.
2: if $\deg(F) \leq d$ then
3:   terminate and return $H$
4: else
5:   Use a 3OR gadget to optimally subdivide $F$ (Definition 2.32). Set this subdivided graph as $H$. Return to 1.
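The control flow of Algorithm 2.2 can be sketched in a few lines. The sketch below is only an illustration under loud assumptions: the names `subdivide_until_bounded`, `toy_max_face`, and `toy_subdivide` are hypothetical, and the toy model tracks nothing but a list of face degrees, replacing a face of degree $f$ by two faces of degree $\lfloor f/2 \rfloor + 10$ and $\lceil f/2 \rceil + 10$. It ignores the small faces, the pocket, and the degree increase of the neighboring faces, so it does not reproduce the threshold on $d$ discussed in the text.

```python
def subdivide_until_bounded(state, d, max_face, subdivide, max_steps=10_000):
    """Generic loop of Algorithm 2.2: subdivide a maximum-degree face until
    every face has degree <= d. `max_face` and `subdivide` are callbacks
    supplied by the caller (hypothetical interface, not from the paper)."""
    for _ in range(max_steps):
        face, degree = max_face(state)
        if degree <= d:
            return state
        state = subdivide(state, face)
    raise RuntimeError("no termination within max_steps; d may be too small")


def toy_max_face(degrees):
    """Toy model: the state is a list of face degrees."""
    i = max(range(len(degrees)), key=degrees.__getitem__)
    return i, degrees[i]


def toy_subdivide(degrees, i):
    """Toy model (assumption): only the two large faces are tracked."""
    f = degrees[i]
    rest = degrees[:i] + degrees[i + 1:]
    return rest + [f // 2 + 10, (f + 1) // 2 + 10]


result = subdivide_until_bounded([400, 37, 5], 178, toy_max_face, toy_subdivide)
```

Even in this toy model the loop fails to terminate when $d$ is too small (a face of degree 21 splits into faces of degree 20 and 21), which is the reason the text needs a lower bound on $d$ before claiming polynomial running time.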

Proof.

Since $H$ is cubic, every Hamiltonian cycle of $H$ uses all but one of the edges at each vertex. Thus, any Hamiltonian cycle uses at least one of $\{e_1, e_2\}$, allowing us to apply Lemma 2.29.

Since our goal is to reduce the face degree, we define the subdivisions that optimally decrease degree: Definition

Let F be a face of a CCP graph G . An optimal subdivision of F is any subdivision of F that minimizes the maximum degree of the two large faces (cf. Deﬁnition )of the subdivision. Algorithm 2.2 takes a graph H and a parameter d , and—if it terminates—returns a graph that has nofaces of degree > d . We will use this algorithm to reduce to an instance of C -HAM from one of C -HAM, sowe must show that Algorithm 2.2 terminates in polynomial time for an appropriate choice of d . To determinethe necessary d , we need the following lemma: Lemma

Let G be a CCP graph. Suppose that F is a face of degree f . Suppose we make an optimalsubdivision of F at edges e , e , e , where e is adjacent to e . Let F i be the face in G adjacent to e i for i = 1 , , . Then, the following hold: • Each F i is distinct. Moreover, the degree of each F i increases by . • The two large faces (Figure

A.1 ) inside of what was originally F each have degree (cid:98) f / (cid:99) +10 and (cid:100) f / (cid:101) +10 • The gadget itself introduces faces of degree , faces of degree , faces of degree , faces of degree , faces of degree , faces of degree , faces of degree , and of degree . We call these the“small faces.” • A face of degree 10 is introduced, which is labelled “the pocket” in Figure

A.1 .Proof.

Figure A.1 in the appendix can be used to count the degrees of these faces, and the number ofthe small faces of diﬀerent sizes. The fact that each F i is distinct follows from the fact that the graph is3 CCP . Proposition

Let H be a CCP graph with n nodes. As long as d ≥ , Algorithm 2.2 terminatesin time polynomial in | H | .Proof. We will consider a single step in the subdivision algorithm and show that in each step a certainnonnegative energy function decreases by at least one. Since the energy function starts oﬀ with value O ( n ),the proposition will follow.Let f j , j = 1 , . . . , | F ( H ) | , and f (cid:48) k , k = 1 , . . . , | F ( H (cid:48) ) | refer to the face degrees in some enumerations ofthe faces of H , before and after one step of Algorithm 2.2, respectively. Assume that f corresponds to theface F being subdivided during that step and that f , f , f correspond to the faces adjacent to F alongthe edges where the 3 OR gadget is being added. Let S = (cid:80) f i and S (cid:48) = (cid:80) f (cid:48) k . By Lemma 2.1 and notatingthe degrees of all the small faces by c i , we have that S (cid:48) = S − f + (cid:80) c i + 10 + ( (cid:98) f / (cid:99) + 10) + ( (cid:100) f / (cid:101) +10) + (cid:80) j ∈{ , , } (( f j + 6) − f j ) . Thus, if d is a positive integer so that − d + (cid:80) c i +10 +( (cid:98) d/ (cid:99) +10) +( (cid:100) d/ (cid:101) +10) +3(( d +6) − d ) ≤ − > d one step of the subdivision algorithm reduces the energy S byat least one. The precise computation of the smallest such d depends on the counts of the c i listed inLemma 2.1. Using these counts, it can be checked that taking d = 178 suﬃces to ensure that the energydecreases by at least one in each step.Finally, note that the initial energy S is bounded by ( (cid:80) i f i ) = (2 | E ( H ) | ) = O ( | V ( H ) | ). Therefore,after O ( | V ( H ) | ) subdivision steps, Algorithm 2.2 with d = 178 terminates. Since each step of Algorithm 2.2 R R Figure 2.4: The ﬁrst 3 terms in the sequence of gadgets. 
a b c b (cid:48) c (cid:48) a (cid:48) c b a c a + = b a b c b (cid:48) c (cid:48) a (cid:48) c b a Figure 2.5: The node labels in the recursive constructiontakes polynomial time, the result follows.In particular, we can use Algorithm 2.2 to eliminate all of the faces of degree greater than or equal to178. Combining this with Theorem 2.26, we can now prove Theorem 2.25:

Proof of Theorem . Let H ∈ C . Apply Algorithm 2.2 to obtain an H (cid:48) ∈ C , such that H (cid:48) has aHamiltonian cycle if and only if H has one. Constructing H (cid:48) takes polynomial time, by Proposition 2.33. R d . In this section, we will construct the corresponding gadgetto the chain of bigons, which will allow us to concentrate probability on the longer cycles while remaining3

CCP . Instead of replacing edges with a chain of d bigons, which allowed for 2 d choices of ways to routethrough that edge, the gadgets R d we construct here will replace cubic vertices and allow for Θ(5 d ) choicesthrough that vertex. The ﬁrst few R d ’s are displayed in Figure 2.4, and we give a deﬁnition below: Definition R d , C d , R (cid:48) d ). Deﬁne R to be a -cycle, withnodes labeled with ( a , b , c ) . For each d ≥ , we will construct R d +1 from R d . First, we construct R (cid:48) d from R d by subdividing the edges { x d , y d } for all x (cid:54) = y ∈ { a, b, c } . The node that subdivided the edge { x d , y d } getslabelled z (cid:48) d , where { x, y, z } = { a, b, c } . For each x ∈ { a, b, c } , we attach a node x d +1 and an edge { x (cid:48) d , x d +1 } .Then, we separately build a cycle C d +1 with nodes labelled by ( a d +1 , b d +1 , c d +1 ) . We obtain R d +1 by gluing R (cid:48) d to C d by identifying the nodes with the same labels. See Figure for an illustration of this construction. To show probability concentration, we will need to compute the number of choices a simple cycle willhave when passing through an R d gadget, as well as the number of simple cycles internal to an R d gadget.The latter we know how to describe as | SC ( R d ) | , and the former is captured by the following deﬁnition: Definition

We call the simple paths in R d that go from any two points x (cid:54) = y ∈ { a , b , c } simple boundary links , denoted SBL ( R d ; x, y ) , where { a , b , c } are as in Deﬁni-tion . We denote SBL ( R d ) := SBL ( R d ; a , b ) . Because of the rotational symmetry of the gadget, | SBL ( R ) | does not change if we choose a diﬀerenttwo element subset of { a , b , c } as the start and stop vertices. We next introduce notation that will be The inspiration for this construction came from [93], wherein one step a reduction from Hamiltonian cycle on 3

CCP graphsto Hamiltonian cycle on maximal plane graphs is to replace cubic vertices by a certain gadget.12 b Figure 2.6: The decompositions used for Equation (2.4) and Equation (2.6). The S d (resp. D d ) part is inred. Observe that there are two options for the simple path R d [ C d ], one of which is colored blue.useful when computing | SC ( R n ) | and | SBL ( R n ) | : Definition X to Y ). For any graph G , with X, Y ⊆ V ( G ) , we let SP X,Y ( G ) denote the set of simple paths in G that start in X , and stop at the ﬁrst positive time they reach Y : SP X,Y ( G ) = { γ = ( x , x , . . . , x n ) : x i ∈ V ( G ) , { x i , x i +1 } ∈ E ( G ) , x ∈ X, x n ∈ Y, x i (cid:54)∈ Y for < i < n } . Theorem

Let $R_d$ be as in Definition . Then:

$$|SC(R_d)| = \tfrac{1}{4}\left(3 \cdot 5^{d+1} - 8d - 11\right) \qquad (2.4)$$
$$5^d \leq |SC(R_d)| \leq 5^{d+1} \qquad (2.5)$$
$$|SBL(R_d)| = \tfrac{1}{2}\left(5^{d+1} - 1\right) \qquad (2.6)$$
$$5^d \leq |SBL(R_d)| \leq 5^{d+1}. \qquad (2.7)$$

Proof of Equation (2.4). We partition the simple cycles in $R_d$ into those that touch $C_d$ and those that do not. Those simple cycles that do not touch $C_d$ can be identified with simple cycles in $R_{d-1}$. To describe the simple cycles that touch $C_d$, we start by defining $S_d = \{X \in SP_{V(C_d),V(C_d)}(R_d) : X \cap E(C_d) = \emptyset\}$. Among the cycles that touch $C_d$, there is $C_d$ itself, and there are the cycles that can be decomposed into an element of $SP_{V(C_d),V(C_d)}(R_d[C_d])$ along with an element of $S_d$, as in Figure 2.6. Writing $SC_d = |SC(R_d)|$ and, by abuse of notation, $S_d$ for $|S_d|$, we get $SC_d = SC_{d-1} + 1 + 2S_d$.

It can be checked that $S_d = 5S_{d-1} + 6$, by analyzing the number of ways to extend an element of $S_{d-1}$ to an element of $S_d$ and by accounting for the six elements of $S_d$ not obtained by extensions of $S_{d-1}$, as in Figure 2.7. This second calculation also shows that $S_1 = 6$. We can solve this recurrence relation to conclude that $S_d = (3/2)(5^d - 1)$. From $SC_d = SC_{d-1} + 1 + 2S_d$ we have that $SC_d = SC_{d-1} + 1 + 3(5^d - 1)$. As $SC_0 = 1$, we conclude that $SC_d = \tfrac{1}{4}(3 \cdot 5^{d+1} - 8d - 11)$.

Proof of Equation (2.6). We partition the simple boundary links in $R_d$ into those that pass through $C_d$ and those that do not. Those elements of $SBL(R_d)$ that do not touch $C_d$ can be identified with $SBL(R_{d-1})$. To analyze those that pass through $C_d$, we define $D_d$ to be the set of pairs of disjoint simple paths, one from $a_0$ that stops at the first point it touches $C_d$, and the other from $b_0$ that stops at the first point it touches $C_d$. The elements of $SBL(R_d)$ that touch $C_d$ can be decomposed into an element $\gamma$ of $D_d$ and one of the two simple paths in $R_d[C_d]$ that connect the points where $\gamma$ meets $C_d$, as in Figure 2.6. Writing $BL_d = |SBL(R_d)|$ and $D_d$ for $|D_d|$, we get $BL_{d+1} = BL_d + 2D_{d+1}$.

We can compute that $D_{d+1} = 5D_d$, by analyzing how elements of $D_d$ can be extended to elements of $D_{d+1}$, as in Figure 2.7. As $D_0 = 1$, we have that $D_d = 5^d$. From $BL_{d+1} = BL_d + 2D_{d+1}$, we have that $BL_{d+1} = BL_d + 2(5^{d+1})$. As $BL_0 = 2$, we can solve the recurrence to find that $BL_d = (1/2)(5^{d+1} - 1)$.

Proof of Equation (2.5) and Equation (2.7). These follow directly from Equations (2.4) and (2.6).

Figure 2.7: a) The five ways to extend an element of $D_n$ to one of $D_{n+1}$, or one of $S_n$ to one of $S_{n+1}$. b) The 6 elements of $S_n$ that are not extensions of elements of $S_{n-1}$. The inner 3-cycle is $C_n$; most of $R_n$ is not pictured.

We establish the analogous construction to replacing edges by chains of bigons and state a few results that we will need to finish the intractability proof.
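As a sanity check on the counting argument for the $R_d$ gadget, the recurrences used in the proofs of Equations (2.4) and (2.6) can be iterated numerically and compared with the closed forms. This is a sketch under stated assumptions: it takes the closed forms to read $|SC(R_d)| = \tfrac14(3 \cdot 5^{d+1} - 8d - 11)$ and $|SBL(R_d)| = \tfrac12(5^{d+1} - 1)$, uses the base values $SC_0 = 1$, $D_0 = 1$, $BL_0 = 2$ from the text, and additionally assumes $S_0 = 0$ (consistent with $S_1 = 6$ and $S_d = 5S_{d-1} + 6$). The function names are hypothetical.

```python
def gadget_counts(max_d):
    """Iterate S_d, SC_d, D_d, BL_d for d = 0..max_d using the recurrences
    S_d = 5 S_{d-1} + 6,  SC_d = SC_{d-1} + 1 + 2 S_d,
    D_d = 5 D_{d-1},      BL_d = BL_{d-1} + 2 D_d."""
    s, sc, dd, bl = 0, 1, 1, 2  # S_0 (assumed), SC_0, D_0, BL_0
    rows = [(0, s, sc, dd, bl)]
    for d in range(1, max_d + 1):
        s = 5 * s + 6
        sc = sc + 1 + 2 * s
        dd = 5 * dd
        bl = bl + 2 * dd
        rows.append((d, s, sc, dd, bl))
    return rows


def closed_forms(d):
    """Closed forms for S_d, SC_d, BL_d (integer arithmetic is exact here)."""
    s = 3 * (5**d - 1) // 2
    sc = (3 * 5 ** (d + 1) - 8 * d - 11) // 4
    bl = (5 ** (d + 1) - 1) // 2
    return s, sc, bl


for d, s, sc, dd, bl in gadget_counts(12):
    assert (s, sc, bl) == closed_forms(d)   # recurrences match closed forms
    assert dd == 5**d
    assert 5**d <= sc <= 5 ** (d + 1)       # bound (2.5)
    assert 5**d <= bl <= 5 ** (d + 1)       # bound (2.7)
```

The check only confirms that the recurrences and closed forms agree with each other; it does not verify the combinatorial claims about the gadget itself.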

Definition G → R d ( G ) vertex replacement construction). Given a cubic graph G , we let R d ( G ) denote the graph obtained by keeping the edges and replacing each vertex of G with a copy of R d (Deﬁni-tion ). Proposition

The construction G (cid:55)→ R d ( G ) sends C m to C m . See Appendix A.2 for proof of this proposition.With the R d construction in place, we analyze the relationship between R d ( G ) and G to show that R d can be used to concentrate probability onto the longer simple cycles of G . Definition

There is a natural inclusion map i : E ( G ) → E ( R d ( G )) . We will call the edges in the image of i the original edges . This lets us deﬁne a map π d :2 E ( R d ( G )) → E ( G ) by π d ( X ) = i − ( X ) . Lemma

For any cubic graph G , Im( π d ) = SC ( G ) ∪ {∅} for d ≥ .Proof. That SC ( H ) ∪ {∅} is contained in the image is straightforward. Now, let β ∈ SC ( R d ( G )). If π ( β )has degree 1 or 3 at a node v ∈ V ( G ), then β had an odd degree node in R d ( v ), which impossible as β is asimple cycle. Moreover, π ( β ) is connected. Since the simple cycles can be characterized as the non-emptyconnected edge subgraphs such that all nodes are degree 2, this concludes the proof.We now compute the probability concentration lemma necessary for applying Lemma 2.8: Lemma

Suppose that C is a Hamiltonian cycle of G , where G has n ≥ nodes. If d ≥ n + n + m ,and X is a uniform sample from SC ( R d ( G )) , then P ( π d ( X ) is a Hamiltonian cycle of G ) ≥ m m .Proof. Let H be the set of Hamiltonian cycles of G . From Equation (2.5) and Equation (2.7) it followsthat | π − d ( ∅ ∪ SC ( G ) \ H ) | ≤ n ( d +1)( n − + n d +1 ≤ n +1 ( d +1)( n − and | π − d ( H ) | ≥ dn . For d ≥ n + n + m , 5 dn ≥ m n +1 n − d ( n − . The claim now follows by Lemma 2.9. Theorem

If the Hamiltonian cycle problem is NP-complete on C d , then the problem of uniformlysampling simple cycles is intractable on the class of graphs C d .Proof. We follow the notation of Lemma 2.8. For ﬁxed m , we take d = n + n + m . Deﬁne B by B ( G ) = R d ( G ), and M by the map π d , and Q is the set of Hamiltonian Cycles of G . Lemma 2.42 assuresthat the conditions for Lemma 2.8 are satisﬁed. Corollary

The SC ( G ) uniform sampling problem is intractable on C .Proof. Immediate from Theorem 2.25 and Theorem 2.43.We can now prove the main result of this section: heorem The P ( G ) uniform sampling problem is intractable on the class of maximal plane graphswith vertex degree ≤ .Proof. Since the dual graphs of those graphs in C are exactly the maximal plane graphs with vertexdegree ≤

531 and since simple cycles correspond bijectively to (unordered) connected 2-partitions under thatduality, this result follows from Corollary 2.44. k -partitions. To obtain an intractabilitytheorem about uniformly sampling connected k -partitions, we follow a similar approach as in previoussections. First, we recall a plane duality theorem, and then we prove that a relevant optimization problem isNP-complete. Finally, we introduce a gadget that concentrates samples on the certiﬁcates to that problem.We remind the reader that P k ( G ) denotes the set of unordered k -partitions of G , such that eachblock induces a connected subgraph. In this section, we will show that uniformly sampling from P k ( G )is intractable on the class of planar graphs. We will also show intractability for a family of probabilitydistributions that weights partitions according to the size of their boundary. k -partitions. Key to our proof will be a duality theorem Theorem 2.52,which is proven in detail in the appendix. We now list the deﬁnitions necessary for its statement.

Definition

Let P c ( G ) be the set of unordered partitions of V ( G ) such that each block induces a connected subgraph. That is, P c ( G ) = (cid:83) | V ( G ) | k =1 P k ( G ) . Definition

Let P be a partition of V ( G ) . If P = { A , . . . , A k } , then we refer to the A i as the blocks of P . Let cut( P ) denote the set of edges of G with endpoints in diﬀerent blocks of P . Definition

Given J ⊆ E ( G ) , deﬁne a partition comp( J ) ∈ P c ( G ) as thepartition into the connected components of G \ J . Definition

Let E ( G ) denote the set of subsets of edges of G suchthat each connected component of the induced subgraph is -edge connected: E ( G ) = { J ⊆ E ( G ) : Each component of G [ J ] is -edge connected } . Definition

Let G be a graph. Then h ( G ) denotes thecircuit rank of G , that is, the minimum number of edges that must be removed to make G a forest. Also, h ( G ) denotes the number of connected components of G . Definition k -partition). We deﬁne P ∗ k ( G ) = { J ∈ E ( G ) : h ( G [ J ]) = k − } . We call theelements of P ∗ k ( G ) dual k -partitions. Theorem P k ( G ) and P ∗ k ( G ∗ )). Let G be a plane graph, and G ∗ its planardual. Let D : 2 E ( G ) → E ( G ∗ ) be the natural bijection. The map D ◦ cut : P k ( G ) → P ∗ k ( G ∗ ) is a bijection,with comp ◦ D − : P ∗ k ( G ∗ ) → P k ( G ) as its inverse. Both are computable in polynomial time. NP -complete problem. We will show that it is NP-complete to decideif P ∗ k ( G ) has length maximizing elements. Definition

Let J be a subset of edges of a graph G . We say that J spans G if every node of G is incident to some edge of J . Proposition

Let $G = (V, E)$ be a graph. The maximum number of edges any set $J \subseteq E$ with $h_1(G[J]) = k - 1$ can have is $|V| + k - 2$. Moreover, a $J \subseteq E$ with $h_1(G[J]) = k - 1$ has $|V| + k - 2$ edges if and only if $G[J]$ has one component and spans $G$.

Proof. Let $E_J$, $V_J$ be the number of edges and vertices of $G[J]$, respectively. From $E_J - V_J = h_1(J) - h_0(J)$ (Proposition A.21), we have $E_J - V_J = k - 1 - h_0(J)$. Thus, $E_J = V_J + k - 1 - h_0(J) \leq |V| + k - 2$, as $h_0(J) \geq 1$ and $V_J \leq |V|$. This establishes the upper bound. Moreover, these inequalities become equalities if and only if $V_J = |V|$ and $h_0(G[J]) = 1$, which is to say, if and only if $J$ spans and has one component.

To define the NP-complete decision problem we will use for the sampling intractability proof, we single out the elements of $P^*_k(G)$ that achieve the upper bound of Proposition 2.54 and define a corresponding decision problem.

Definition ($k$-partition). Let $P^*_k(G)_m$ be the subset of $P^*_k(G)$ consisting of the subgraphs that have $|V(G)| + k - 2$ edges, which is the maximal number of edges possible.

PlanarMaxEdgesDualkpartition

Input: A planar graph G Output: YES if P ∗ k ( G ) m (cid:54) = ∅ , NO, otherwise.We will prove that PlanarMaxEdgesDualkpartition is NP-complete, by reducing from the Hamil-tonian cycle problem on grid graphs:

Theorem

Let $G$ be a finite subgraph of the square grid graph $\mathbb{Z}^2$, wherein integer points are adjacent if and only if their Euclidean distance is $1$. Deciding whether $G$ has a Hamiltonian cycle is NP-complete. Proposition

The problem

PlanarMaxEdgesDualkPartition is NP -complete.Proof. The language of graphs that have P ∗ k ( G ) m (cid:54) = ∅ is in NP, since checking if a given set of edges isin P ∗ k ( G ) m can be done in polynomial time. We now show the reduction from Hamiltonicity of grid graphs.Let G be a some subgraph of the grid graph, which we assume without loss of generality is 2-connected,since otherwise it has no Hamiltonian cycle. Because G is 2-connected, the lexicographic upper-left mostnode v of G must have degree 2. We build a new graph, G (cid:48) , by removing v and connecting its neighborswith a chain of k − G (cid:48) has an element of P ∗ k ( G ) m if and only if G has a Hamiltonian cycle. If G has aHamiltonian cycle, that cycle had to pass through both edges of v , hence we can replace those edges with thediamonds in G (cid:48) . The result has h = k −

1, spans G (cid:48) , and is connected. Thus it is an element of P ∗ k ( G (cid:48) ) m .Going in the other direction, suppose that there is an X ∈ P ∗ k ( G (cid:48) ) m . Since X must span G (cid:48) , X mustcontain each node in the chain of diamonds. Moreover, since elements of P ∗ k ( G ) have no bridge edges(Lemma A.1), this implies that X contains all the edges in that chain of diamonds. Thus, k − h ( X ) isaccounted for by the diamonds, and for h ( X ) = k −

1, the rest of X must be a simple cycle, which spans G \ v because X spans G (cid:48) . Replacing those diamonds by the path { a, v, b } , we obtain a Hamiltonian cycleof G . va b Figure 2.8: A step in the reduction in Proposition 2.57 k -Partitions. We will concernourselves with the following sampling problem, where we ﬁx λ > λ -sampling connected dual k -partitions Input: A graph G .Output: An element of P ∗ k ( G ), drawn accordingto the probability distribution that assigns a set J ∈ P ∗ k ( G ) weight proportional to λ | J | .To make some estimates, we give the probability distribution in this problem a name: ) A degenerating dual 3-partition b) A Dual 3-partition with a bigon froman B r,d , but which does not degenerateFigure 2.9: Example of degeneration and subtle non-degeneration. Definition ν λ and N λ ). Let G be a graph, and let λ > . For any collection of setsof edges Y ⊆ E ( G ) , we deﬁne the measure N λ on Y by N λ ( J ) = λ | J | for all J ∈ Y . We deﬁne a probabilitymeasure ν λ by normalizing N λ : ν λ ( J ) = N λ ( J ) N λ ( Y ) . To prove that sampling from ν λ is intractable, instead of using chains of bigons like we did for theuniform distribution, we will use chains of order- r dipoles, where r will be chosen so that rλ ≥ Definition r dipoles). Deﬁne B r,d ( G ) as the graph obtained from G by subdividingeach edge of G into d segments, and then replacing each edge of the resulting graph by r parallel edges (i.e.order- r dipoles). Let B r,d ( e ) denote the chain of d order- r dipoles that replaces the edge e . Definition

We deﬁne a map π d : P ∗ k ( B r,d ( G )) → E ( G ) by π d ( X ) = { e ∈ E ( G ) : X ∩ B r,d ( e ) (cid:54) = ∅} . Now we discuss the main technical hurdle to overcome in this section. Recall from § C , there was only one way that π d ( C ) could fail to be a simple cycle, namely, if C wasone of the bigons. Crucially, there were only d | E | ways this could happen, which was negligible comparedto the size of the domain of π d . On the other hand, elements of P ∗ k ( G ) can degenerate in more complicatedways, as is shown in Figure 2.9. The next few propositions establish that the degenerating elements—andall of the other possibilities—remain negligible compared to the preimages of P ∗ k ( G ) m : Lemma

For each Y ∈ P ∗ k ( G ) , there is a (cid:101) Y ∈ P ∗ k ( B r,d ( G )) such that π d ( (cid:101) Y ) = Y .Proof. Simply replace each edge e ∈ Y with a simple path of length d along B r,d ( e ). This does notchange the topology, and hence the result has the same cycle rank and no bridges, which characterizes theelements of P ∗ k by Lemma A.1.It will be convenient to single out the particular kind of π d -preimage constructed in the proof of the lastlemma: Definition If Y ∈ P ∗ k ( G ) , we refer to any of the (cid:101) Y ∈ P ∗ k ( B r,d ( G )) obtained by replacingeach edge e ∈ Y with a simple path of length d through B r,d ( e ) as a lift of Y . To prove the probability concentration estimates (Lemma 2.67) we need to characterize the elementsof Im( π d ) that have the most N λ mass above them as the elements of P ∗ k ( G ) m . For that we will need tocharacterize the elements of the image with the most edges, which will be accomplished by Proposition 2.63and Proposition 2.64. Proposition

The elements in the image of $\pi_d : P^*_k(B_{r,d}(G)) \to 2^{E(G)}$ all have $h_1 \leq k - 1$ and $\leq |V(G)| + k - 2$ edges.

Proof. The bound on the number of edges will follow from Proposition 2.54 once we argue that all the elements in the image have $h_1 \leq k - 1$. So, let $X \in P^*_k(B_{r,d}(G))$, and let $1_{C_1}, \ldots, 1_{C_m}$ be a basis for the cycle space of $\pi_d(X)$, where $1_Z$ is the indicator function of any set $Z \subseteq E(G)$. For each $i = 1, \ldots, m$, define $\tilde{C}_i$ as a lift of $C_i$. We will show that the $1_{\tilde{C}_i}$ are independent, by showing that any linear dependence gives a corresponding dependence between the $1_{C_i}$. Set $E_d = E(B_{r,d}(G))$. Define a linear map $T : \mathbb{R}^{E_d} \to \mathbb{R}^{E(G)}$, which, for each $e \in E(G)$, adds up the values along all edges of $B_{r,d}(e)$ and sets that as the value of $e$. In particular, $T(1_{\tilde{C}_i}) = d \, 1_{C_i}$. Suppose that we had that $0 = \sum_{i=1}^{m} a_i 1_{\tilde{C}_i}$, as functions on $E_d$. Then by applying $T$ to this equation, we obtain $0 = \sum_{i=1}^{m} a_i d \, 1_{C_i}$, which implies that $a_i = 0$ for $i = 1, \ldots, m$. Hence $m \leq h_1(X)$, which implies that $m \leq k - 1$, so $h_1(\pi_d(X)) = m \leq k - 1$. Proposition

The elements of $\mathrm{Im}(\pi_d)$ that have $|V(G)| + k - 2$ edges are the elements of $P^*_k(G)_m$.

Proof. Suppose $K = \pi_d(X)$ has $|K| = |V(G)| + k - 2$. By Proposition 2.63, $h_1(K) \leq k - 1$. By Proposition 2.54, $h_1(K) \geq k - 1$, so $h_1(K) = k - 1$. To finish the claim, we prove that $K$ has no bridge edges. Suppose, for contradiction, that $e$ is a bridge edge, and let $\tilde{e}$ be any edge in $B_{r,d}(e) \cap X$. $X$, having no bridges, must have a simple cycle $C$ that contains $\tilde{e}$. Now, let $C_1, \ldots, C_{k-1}$ be a cycle basis for $K$. Since $e$ is a bridge edge, none of the $C_i$ contain $e$, hence their lifts do not contain $\tilde{e}$. This implies that $C$ is not in the span of the lifts of $C_1, \ldots, C_{k-1}$, hence $h_1(X) \geq k$, which contradicts $X \in P^*_k(B_{r,d}(G))$. This shows that every element in $\mathrm{Im}(\pi_d)$ with $|V(G)| + k - 2$ edges is in $P^*_k(G)_m$. Since we already showed in Lemma 2.61 that every element of $P^*_k(G)$ has a lift to an element of $P^*_k(B_{r,d}(G))$, the claim follows.

For the probability concentration lemma we will need the upper and lower bounds on the $N_\lambda$ mass above an $m$ edge element of $\mathrm{Im}(\pi_d)$ provided by the next two propositions: Proposition

Let $K \in P^*_k(G)_m$. Then $N_\lambda(\pi_d^{-1}(K)) \geq (r\lambda)^{d(|V(G)| + k - 2)}$.

Proof. Since $K$ has $|V(G)| + k - 2$ edges, and since for each $B_{r,d}(e)$ there are $r^d$ simple paths, there are $r^{d(|V(G)| + k - 2)}$ lifts obtained by choosing one of those paths for each edge. Since the length of each such lift is $d(|V(G)| + k - 2)$, the total $N_\lambda$ mass is $r^{d(|V(G)| + k - 2)} \lambda^{d(|V(G)| + k - 2)} = (r\lambda)^{d(|V(G)| + k - 2)}$. Since there may be other preimages, as shown in Figure 2.9, we only get a lower bound. Proposition

Let K ∈ Im( π d ) , with | E ( K ) | = m . Assume λ ∈ (0 , and assume r is such thatthat rλ ≥ . Then N λ ( π − d ( K )) ≤ m r km d km ( rλ ) dm .Proof. We begin by bounding the number of possible conﬁgurations of X ∩ B r,d ( e ) for any e ∈ E ( G ),and X ∈ P ∗ k ( B r,d ( G )). First, observe that there are r d simple paths across B r,d ( e ). We treat two cases,depending on whether or not X contains one of these paths.If X contains one of those paths then X may contain at most k − B r,d ( e ), becauseeach one increases the rank of h . Thus, r d (cid:0) rdk − (cid:1) upper bounds the number of conﬁgurations which includesa path through B r,d ( e ). Moreover, each such conﬁguration contains at least d edges, hence each one has N λ mass at most λ d , as λ ≤ X ∩ B r,d ( e ) may not contain any of the simple paths crossing B r,d ( e ). However, in thiscase, we have that | X ∩ B r,d ( e ) | ≤ k − h by one, and thus anupper bound on the number of such conﬁgurations is (cid:0) rd k − (cid:1) . Moreover, each one of these conﬁgurationscontains at least 2 edges, so has mass at most λ . Thus, the total N λ mass obtained from this case is (cid:0) rd k − (cid:1) λ .Combining these two cases, we have a bound for the total N λ mass of π − d ( K ) restricted to B r,d ( e ),namely: (cid:88) X ∈ π − d ( K ) N λ ( X ∩ B r,d ( e )) ≤ (cid:18) rdk − (cid:19) r d λ d + (cid:18) rd k − (cid:19) λ ≤ rd ) k ( rλ ) d . Here, the last inequality follows from max( (cid:0) rdk − (cid:1) , (cid:0) rd k − (cid:1) ) ≤ ( rd ) k and from rλ ≥ ≥ λ . To determine N λ ( π − d ( K )), we ﬁrst observe that N λ ( X ) = λ | X | = (cid:81) e ∈ E ( K ) λ | X ∩ E ( B r,d ( e )) | = (cid:81) e ∈ E ( K ) N λ ( X ∩ B r,d ( e )) for ny X ∈ π − d ( K ). 
From this, the result follows:

$$\begin{aligned}
N_\lambda(\pi_d^{-1}(K)) &= \sum_{X \in \pi_d^{-1}(K)} N_\lambda(X) \\
&= \sum_{X \in \pi_d^{-1}(K)} \prod_{e \in E(K)} N_\lambda(X \cap B_{r,d}(e)) \\
&\leq \prod_{e \in E(K)} \sum_{X \in \pi_d^{-1}(K)} N_\lambda(X \cap B_{r,d}(e)) \\
&\leq \prod_{e \in E(K)} 2(rd)^k (r\lambda)^d = 2^m (rd)^{km} (r\lambda)^{dm}.
\end{aligned}$$

We now prove the probability concentration lemma necessary for applying the lucky guess algorithm, Lemma 2.8:

Lemma

Let G = ( V, E ) have n = | V | . Let λ ∈ (0 , and ﬁxan integer r such that rλ ≥ . Assuming that P ∗ k ( G ) m is non-empty, then for d = 4 (cid:100) ( a +2 n +2 kn (log( r )+1)log( rλ ) ) (cid:101) the probability under ν λ that an element of P ∗ k ( B r,k ( G )) maps via π d to an element of P ∗ k ( G ) m is at least a a .Proof. Let M = n + K −

2. Since elements of Im( π d ) \ P ∗ k ( G ) m have ≤ M − | Im ( π d ) | ≤ n , we have N d := N λ ( π − d (2 E ( G ) \ P ∗ k ( G ) m )) ≤ n M − ( rd ) k ( M − ( rλ ) d ( M − (Proposi-tion 2.66). We also have that H d := N λ ( π − d ( P ∗ k ( G ) m )) ≥ ( λr ) dM (Proposition 2.65). Using Lemma 2.10and M ≤ n , for d ≥ log( S )+2 kn log( q ) ) , where S = 2 a n r kn and q = rλ , we have that H d ≥ a N d . Hence,by Lemma 2.9, this d suﬃces for H d N d + H d ≥ a a . k -partitions is intractable. In this section we prove in-tractability of sampling dual k -partitions, and connect this to the intractability of sampling connected k -partitions. Theorem

For any ﬁxed λ ∈ (0 , , λ -sampling connected dual k -partitions is intractable on theclass of 2-connected planar graphs.Proof. We will assemble the ingredients for Lemma 2.8, the Lucky Guess lemma. We ﬁx a choice of a so that a a +1 ≥ − /m . Deﬁne B is by the construction G → B r,d ( G ) where r is chosen so that rλ ≥ d = (cid:100) ( a +2 n +2 kn (log( r )+1)log( rλ ) ) (cid:101) . The map M is given by π d , where we have S ( G ) = 2 E ( G ) . The problem Q is deﬁned by Q ( G ) = P ∗ k ( G ) m , which is NP-complete by Proposition 2.57. Moreover, by Lemma 2.67, wehave that P ( π d ( C ) ∈ Q ( G ) : C is distributed according to ν λ on P ∗ k ( B r,d ( G )) ≥ − /m . Thus, we obtainthe result from Lemma 2.8.For any ﬁxed λ >

0, we deﬁne a distribution on connected k -partitions. λ -sampling connected k -partitions Input: A graph G .Output: An element of P k ( G ), drawn accord-ing to the probability distribution that assignsa partition P ∈ P k ( G ) weight proportional to λ | cut( P ) | .Now we state the main theorem of this section: Theorem

Fix λ ∈ (0 , . Then λ -sampling connected k -partitions is intractable on the class of -connected planar graphs.Proof. This follows as a corollary to Theorem 2.68, using Theorem 2.52. . The Flip Markov Chain. In the previous section, we examined the worst case complexity of thepartition sampling problem. However, worst case intractability results do not necessarily mean that theproblem is intractable on examples of interest, since there can be algorithms which are eﬀective only oncertain cases. In this section and the next, we examine the performance of one such algorithm, which isbased on Markov chains.Markov chains provide a generic means of sampling from prescribed distributions over a state space Ω.This technique starts with a seed in Ω and randomly applies perturbations to walk around the space; themore steps in the random walk, the closer the sample is to being distributed according to the stationary distribution of the chain rather than the seed point. While this approach provides an elegant means ofsampling, a mathematical analysis of the mixing time (see § P ( G ), which we call the ﬂip walk . This chain has seen wide use in the analysis of gerrymandering [16, 31, 77]. We know from ouranalysis in the previous sections that one cannot hope for this chain to mix rapidly on general graphs, unlessone also believes that RP = NP. To make this more concrete in this section we show that the gadgets usedin our complexity proofs directly yield bottlenecks impeding the mixing of the ﬂip chain. Later, in §

4, wewill use ideas from this section to analyze the mixing of the ﬂip chain on examples relevant to redistricting.

The ﬂip walk is analogous to Glauber dynamics and Potts models. Contiguityof the blocks is not usually considered in these physical settings, and it is part of what makes samplingdistricting plans challenging. A diﬃculty in analyzing the combinatorics of contiguity constraints is thatit is deﬁned through global, rather than local interactions; a physical model with similar challenges is theself-avoiding walk, which we consider further in ( § A Markov chain with similarproposal moves as in Deﬁnition 3.1 was studied in [21], but restricted to a state space of st -cuts instead ofconnected k -partitions. They show that this Markov chain mixes slowly, even if the underlying graph isof bounded treewidth. They prove a tree-width ﬁxed parameter tractability results for the counting andsampling problems they consider which, similarly to our work in §

5, build on Courcelle’s theorem and arebased on dynamic programming. Their results therefore share some similarities with ours, except that westudied connected all cuts weighted by a function of the edge-cut. The example in their section 7 . st -paths. Another place in the literature where the ﬂip walk appearsis in [78], where the problem of sampling simple st -paths using the ﬂip walk is studied. Their paper providesanother example where there is a bottleneck [78, Theorem 7] and gives a proof of ergodicity for their versionof the ﬂip chain (which is restricted to st -paths for ﬁxed s and t ). They also make the observation that if the st -path ﬂip chain on the grid graph is restricted to paths that are monotone in one direction, then the ﬂipwalk is rapidly mixing on that restricted state space. We remark that the techniques based on [63, Prosition5.1] that we discussed in § st -paths problem is NP-complete. Additionally, the techniques we discuss below in § st -path sampling problem to a corresponding counting problem, which will be tractable on certainclasses of graphs, such as series parallel graphs or graphs of bounded treewidth.The question of sampling simple paths has also received some attention in the literature: [56] provesthat a certain Markov chain on simple paths in a complete graph mixes rapidly (Theorem 4.1.2) but thata Metropolis-Hasting’s version with weights has bottlenecks (Theorem 4.2.2), and repeats a similar analysisfor sampling simple paths in trees (Theorem 4.3.2 and Theorem 4.4.1). The existence of a FPRAS forweighted simple paths on the complete graph, where weights can be set to zero to forbid edges, would implythe existence of a FPRUS for simple paths in any graph, which would imply that RP = NP by using the hain of bigons trick from [63, Prosition 5.1] and the NP-completeness of the Hamiltonian path problem;this negatively answers one of the open problems given in the conclusion of [56]. 
[56] also provides a dynamicprogram algorithm to count and uniformly sample weighted simple paths in trees and DAGs (Section 5). Theundirected case can be extended using Courcelle’s theorem, so it is likely that a reasonably implementableﬁxed-parameter in treewidth algorithm for sampling simple paths exists, perhaps along similar lines to § We put a graph structure on the set of connected 2-partitions P ( G ) as follows.Let ( A, B ) ∈ P ( G ). Given any x ∈ V ( G ), consider the partition ( A ∪ { x } , B \ { x } ) = ( A (cid:48) , B (cid:48) ). Providedthat ( A (cid:48) , B (cid:48) ) ∈ P ( G ), including the case when ( A (cid:48) , B (cid:48) ) = ( A, B ), this deﬁnes an edge between two elementsof P ( G ). If ( A (cid:48) , B (cid:48) ) (cid:54)∈ P ( G ), that is, if either A (cid:48) or B (cid:48) does not induce a connected subgraph of G , thenwe add a self loop to ( A, B ). Do the same also for ( A \ { x } , B ∪ { x } ). This deﬁnes a | V ( G ) | -regular graphstructure on P ( G ). Given this graph structure, we deﬁne the ﬂip walk: Definition

The flip walk on $P(G)$ is the Markov chain obtained by performing a lazy simple random walk on $P(G)$, using the graph structure defined in the previous paragraph. We abuse notation and refer to the Markov chain, the graph, and the set by $P(G)$.

If $G$ is 2-connected, then $P(G)$ is irreducible [12] and hence ergodic. Since every node of $P(G)$ has degree $|V(G)|$, the uniform distribution is the stationary distribution for the flip walk on $P(G)$. Thus, by standard Markov chain theory [70], this flip walk eventually produces nearly uniformly distributed elements in $P(G)$. However, we will see examples in this section where the flip walk on $P(G)$ can take exponential time in $|G|$ to generate a nearly uniform sample.

We make a short digression to review a few notions from Markov chain theory. For details we have left out, see [70]. Since the goal of our discussion is to give examples where the random walk on $P(G)$ mixes slowly, we will recall the notion of mixing time in the context of (discrete) Markov chains:

Definition

Given two probability distributions $\mu$ and $\nu$ on a finite set $\Omega$, the total variation distance between $\mu$ and $\nu$ is given by $\|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|$.

Definition

Let $\mu$ be the stationary distribution of the (discrete time) Markov chain $M = (\Omega, P)$. Let $P^t \delta_x$ denote the distribution at time $t$ of the Markov chain $M$ started at $x$. Define $d_M(t) := \max_{x \in \Omega} \|P^t \delta_x - \mu\|_{TV}$. Then, the mixing time of $M$ is

(3.1)  $t^M_{\mathrm{mix}}(\epsilon) = \inf\{t : d_M(t) \leq \epsilon\}$.

If the chain is clear from the discussion, we omit the superscript $M$.

The definitions above help formalize what it means for a Markov chain to mix rapidly or torpidly:

Definition

A family of Markov chains $M \in \mathcal{M}$ is said to be rapidly mixing if there is a polynomial $p(x, y)$ so that $t^M_{\mathrm{mix}}(\epsilon) \leq p(\log |M|, \log \epsilon)$ for all $M \in \mathcal{M}$, where $|M|$ denotes the size of the state space of $M$.

To prove rapid mixing, it is equivalent to find a polynomial $q(x)$ so that $t^M_{\mathrm{mix}}(1/4) \leq q(\log |M|)$ for all $M \in \mathcal{M}$, as $t^M_{\mathrm{mix}}(\epsilon) \leq \lceil \log_2(\epsilon^{-1}) \rceil \, t^M_{\mathrm{mix}}(1/4)$ [70, Equation (4.36)].
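The definitions of the flip walk, total variation distance, and mixing time can be made concrete on a very small graph. The following is a minimal sketch, not part of the experiments reported in this paper: it builds the lazy flip walk on the connected 2-partitions of a 4-cycle and computes $t_{\mathrm{mix}}(1/4)$ by brute force. The function names are hypothetical, partitions are stored as ordered pairs via the block $A$, the empty block is treated as connected (an assumption), and the total variation distance is taken with the usual factor $1/2$.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """True if `nodes` induces a connected subgraph; the empty set counts as connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def connected_2_partitions(adj):
    """All ordered partitions (A, V - A), stored as frozenset A, with both blocks connected."""
    vertices = sorted(adj)
    states = []
    for k in range(len(vertices) + 1):
        for block in combinations(vertices, k):
            a = frozenset(block)
            if is_connected(a, adj) and is_connected(set(vertices) - a, adj):
                states.append(a)
    return states


def flip_walk_matrix(adj):
    """Lazy flip walk: hold with probability 1/2, otherwise propose flipping a uniform vertex."""
    vertices = sorted(adj)
    states = connected_2_partitions(adj)
    index = {a: i for i, a in enumerate(states)}
    size = len(states)
    matrix = [[0.0] * size for _ in range(size)]
    for a, i in index.items():
        matrix[i][i] += 0.5
        for x in vertices:
            target = a.symmetric_difference({x})
            j = index.get(target, i)  # a rejected proposal becomes a self-loop
            matrix[i][j] += 0.5 / len(vertices)
    return states, matrix


def tv_distance(mu, nu):
    """Total variation distance, with the factor 1/2."""
    return 0.5 * sum(abs(p - q) for p, q in zip(mu, nu))


def step(dist, matrix):
    """One step of the chain applied to a row distribution."""
    size = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def mixing_time(matrix, eps=0.25, t_max=10_000):
    """Smallest t with max_x ||P^t delta_x - uniform||_TV <= eps (uniform is stationary here)."""
    size = len(matrix)
    uniform = [1.0 / size] * size
    dists = [[1.0 if i == s else 0.0 for i in range(size)] for s in range(size)]
    for t in range(t_max + 1):
        if max(tv_distance(d, uniform) for d in dists) <= eps:
            return t
        dists = [step(d, matrix) for d in dists]
    raise RuntimeError("did not mix within t_max steps")


cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
states, matrix = flip_walk_matrix(cycle4)
print(len(states), mixing_time(matrix))
```

Because the transition matrix is symmetric, the uniform distribution is stationary, which is why `mixing_time` compares against it directly. The enumeration is exponential in $|V(G)|$, so this is only useful for sanity checks on tiny graphs.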

Definition

If there is an exponentially growing function $f(n)$ such that $t^M_{\mathrm{mix}}(1/4) \geq f(\log |M|)$ for all $M \in \mathcal{M}$, then we say that $\mathcal{M}$ is torpidly mixing.

A standard means of arguing about mixing times for random walks on regular graphs comes from measuring bottlenecks, as in the next definition:

Definition

Let $G$ be a $d$-regular graph, and $M$ the Markov chain obtained by a lazy random walk on $G$. We define the conductance of $M$ as

$\Phi(M) = \min_{U \subset V(G),\ |U| \leq |V(G)|/2} \frac{|\partial_E U|}{2d|U|}$.

Loosely speaking, such a set $U$ which proves that $\Phi(M)$ is small is called a bottleneck.

[Figure 3.1: The chain of bigons construction and its dual. (a) Illustrating that $D_d(H) = (B_d(H^*))^*$ for plane $H$. (b) Doubled $d$-star.]

The following theorem connects mixing time and conductance and will be used to show that the chain $P(G)$ mixes torpidly for certain families of graphs, by building explicit bottleneck sets that upper bound the conductance:

Theorem

For every Markov chain $M$, $t^M_{\mathrm{mix}}(1/4) \geq \frac{1}{4\Phi(M)}$.

Due to the sampling intractability results (§

Definition (Doubled $d$-star, original nodes). Let $H$ be a graph. The doubled $d$-star construction applied to $H$, notated $D_d(H)$, is obtained by replacing each edge of $H$ with $d$ parallel edges and then subdividing each new edge once. For $e \in E(G)$, we will let $D_d(e)$ denote the doubled $d$-star subgraph that replaced it. See Figure 3.1(b) for an illustration.

There is an obvious inclusion $V(G) \hookrightarrow V(D_d(G))$, and we call the nodes in the image of that inclusion the original vertices. The other vertices in $D_d(G)$ are called new vertices.

We will find bottlenecks in $P(D_d(G))$ by relating it to $P(G)$ using Lemma 3.1:

Lemma

For any ( A, B ) ∈ P ( D d ( G )) , ( A ∩ V ( G ) , B ∩ V ( G )) ∈ P ( G ) .Proof. Let x, y ∈ V ( G ) be members of the same block of ( A, B ), say A . There exists a path γ in A from x to y . This path alternates between new vertices and original vertices. Forgetting the new vertices in thispath gives a path in A ∩ V ( G ) between x and y .We use Lemma 3.1 to make the following deﬁnition: Definition

Define a map $R_d : P(D_d(G)) \to P(G)$ by setting $R_d((A, B)) = (A \cap V(G), B \cap V(G))$.

Footnote (on the definition of conductance): This is not the usual definition of the conductance, but this is the correct formula for the conductance of a lazy random walk on a $d$-regular graph [70, p. 144]. The formula there has a typo, which was corrected in the errata.

We now explain the key intuition behind the bottlenecks. In order for the flip walk to move between the fibers of $R_d$ (that is, to change the assignment of an old node) a certain rare event must occur. In particular, if $u$ and $v$ are adjacent old nodes, and $u \in A$ and $v \in B$, then to reassign $u$ to $B$, every new node in $D_d(\{u, v\})$ must already be in $B$. However, under the flip walk with $u \in A$, $v \in B$, the new nodes of $D_d(\{u, v\})$ behave like a random walk on a hypercube, and in particular, it is unlikely for them to become part of the same block. Pursuing this intuition, the next lemma proves that the fibers of $R_d$ have much smaller edge boundary than size, which will mean that they are bottleneck sets:

Lemma

A, B ) across the new nodes of D d ( G ), byconsidering each edge e ∈ E ( G ) separately. If e ∈ cut( A, B ), then one can assign new nodes of D d ( e )arbitrarily without aﬀecting contiguity, and therefore one has 2 d choices. If e (cid:54)∈ cut( A, B ), suppose bothendpoints of e are in A . Since B (cid:54) = ∅ , to preserve connectivity all the new nodes of D d ( e ) must be in A .Therefore, there is only one choice for how to extend ( A, B ) along this edge. Combining these two casesyields (3.2).
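The case analysis above (each cut edge contributes $2^d$ free choices, each non-cut edge contributes one) suggests that a partition $(A, B)$ of $G$ with both blocks non-empty has exactly $2^{d\,|\mathrm{cut}(A, B)|}$ extensions to $D_d(G)$. The following brute-force check on a 4-cycle is a sketch under that reading; the function names and the labelling of new vertices are hypothetical.

```python
from itertools import product


def doubled_d_star(edges, d):
    """Adjacency of D_d(H): each edge {u, v} becomes d length-two paths u - x - v."""
    adj = {}
    new_nodes = []
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        for i in range(d):
            x = ("new", u, v, i)  # hypothetical label for the i-th subdivision vertex
            adj[x] = {u, v}
            adj[u].add(x)
            adj[v].add(x)
            new_nodes.append(x)
    return adj, new_nodes


def is_connected(nodes, adj):
    """True if `nodes` induces a connected subgraph; the empty set counts as connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def fiber_size(edges, d, block_a):
    """Count connected 2-partitions of D_d(G) whose original vertices in A are exactly block_a."""
    adj, new_nodes = doubled_d_star(edges, d)
    count = 0
    for bits in product((False, True), repeat=len(new_nodes)):
        a = set(block_a) | {x for x, chosen in zip(new_nodes, bits) if chosen}
        b = set(adj) - a
        if is_connected(a, adj) and is_connected(b, adj):
            count += 1
    return count


cycle4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
for d in (1, 2, 3):
    # The partition ({0, 1}, {2, 3}) cuts two edges of the 4-cycle.
    print(d, fiber_size(cycle4_edges, d, {0, 1}), 2 ** (2 * d))
```

The disconnected choice $A = \{0, 2\}$ has an empty fiber, in line with Lemma 3.1.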

Proof of (3.3). Let $e \in \partial_E R_d^{-1}((A, B))$ be an edge between $(L, M), (L', M') \in P(D_d(G))$, with $R_d((L', M')) =: (A', B')$. There is an $x \in V(G)$ such that $L' = L + x$, $M' = M - x$, $A' = A + x$ and $B' = B - x$. Since $L \neq \emptyset$, for $L'$ to be connected there has to be at least one node $l \in L$ so that $l \sim x$. Moreover, since $M - x$ is connected and $l, x \notin M - x$, $M - x$ can contain at most one new node of $D_d(\{l, x\})$. Hence, there are at most $d + 1$ extensions of $(A', B')$ onto the new nodes of $D_d(\{l, x\})$. As there are at most $2^{(\mathrm{cut}(A, B) - 1)d}$ extensions onto the remaining new nodes, it follows that there are at most $(d + 1)2^{(\mathrm{cut}(A, B) - 1)d}$ elements of $\partial_E R_d^{-1}((A, B))$ that map to $\{(A, B), (A', B')\}$ under $R_d$. Finally, the claim follows because there are at most $n$ candidates for the original node $x$ that gets flipped when making a step across $\partial_E R_d^{-1}((A, B))$.

We now use these computations to show the slow mixing of the flip chain.
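The boundary estimate can be checked numerically on a small case. The sketch below is an illustration under stated assumptions, not the authors' code: it reads bound (3.3) as $|\partial_E R_d^{-1}((A, B))| \leq n(d + 1)2^{(|\mathrm{cut}(A, B)| - 1)d}$, takes the conductance of a set $U$ to be $|\partial_E U| / (2 \deg |U|)$ with $\deg = |V(D_d(G))|$, and uses hypothetical function names. It counts the flip-walk moves that leave one fiber of $R_d$ over the 4-cycle.

```python
from itertools import product


def doubled_d_star(edges, d):
    """Adjacency of D_d(H): each edge {u, v} becomes d length-two paths u - x - v."""
    adj = {}
    new_nodes = []
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        for i in range(d):
            x = ("new", u, v, i)
            adj[x] = {u, v}
            adj[u].add(x)
            adj[v].add(x)
            new_nodes.append(x)
    return adj, new_nodes


def is_connected(nodes, adj):
    """True if `nodes` induces a connected subgraph; the empty set counts as connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def fiber_and_boundary(edges, d, block_a):
    """Return (fiber size, number of valid flips leaving the fiber, |V(D_d(G))|)."""
    adj, new_nodes = doubled_d_star(edges, d)
    everything = set(adj)

    def valid(a):
        return is_connected(a, adj) and is_connected(everything - a, adj)

    fiber = set()
    for bits in product((False, True), repeat=len(new_nodes)):
        a = frozenset(block_a) | {x for x, chosen in zip(new_nodes, bits) if chosen}
        if valid(a):
            fiber.add(a)
    boundary = 0
    for a in fiber:
        for x in adj:
            target = a.symmetric_difference({x})
            if target not in fiber and valid(target):
                boundary += 1  # only flips of original vertices can leave the fiber
    return len(fiber), boundary, len(adj)


cycle4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
d, n, cut = 2, 4, 2
size, boundary, degree = fiber_and_boundary(cycle4_edges, d, {0, 1})
bound = n * (d + 1) * 2 ** ((cut - 1) * d)
print(size, boundary, bound, boundary / (2 * degree * size))
```

On this example the observed boundary is well below the bound, which is expected since (3.3) is a worst-case estimate.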

Theorem

Let $G$ be any 2-connected graph with at least two distinct connected 2-partitions $P, Q \in P(G)$, neither of which has the empty set as a block. Let $n = |V(G)|$. Then, the family $P(D_d(G))$, indexed by $d$, is torpidly mixing. In particular, we have the following bounds:

(3.4)  $\Phi(P(D_d(G))) \leq (d + 1)2^{-d-1}$,
(3.5)  $t^{P(D_d(G))}_{\mathrm{mix}}(1/4) \geq \frac{2^{d-1}}{d + 1}$, and
(3.6)  $|P(D_d(G))| \leq 2^{|D_d(G)|} \leq 2^{n + dn^2}$.

Proof.

Without loss of generality, assume that cut( P ) ≤ cut( Q ). We have that | R − d ( P ) | ≤ / | P ( D d ( G )) | since R − d ( P ) ∩ R − d ( Q ) = ∅ and | R − d ( Q ) | ≥ | R − d ( P ) | by (3.2). Combining (3.2) with (3.3) yields (3.4).Equation (3.5) follows from this by Theorem 3.7. Finally, (3.6) follows from the construction of D d ( G ).From (3.6), we have that log | P ( D d ( G )) | ≤ n + dn .For a concrete example, take G to be a 4-cycle, and take the two 0-balanced cuts. A snapshot of the evolutionof the ﬂip walk on D ( G ) can be seen in Figure 3.2.We pause to note a key division between the intractability of uniformly sampling P ( G ) and the mixingof the ﬂip walk. First, observe that if G is a series-parallel graph , then D d ( G ) is series parallel as well.We show in § P ( G ) on the class ofseries-parallel graphs, and yet our proof above shows that the ﬂip walk still mixes slowly on this class ofgraphs. Thus, even in cases where uniform sampling is tractable, the ﬂip walk on P ( G ) still may not be aneﬃcient means of sampling. igure 3.2: A snapshot of the ﬂip walk evolving, illustrating the bottleneck of Theorem 3.11.Figure 3.3: The aﬀect of T R d construction. The example of Theorem 3.11 is not entirely satisfying,for example because the degrees of its nodes increase without bound. We will address some of its weaknessesin this section by producing a family of maximal plane graphs with vertex degree ≤

9, such that thecorresponding family of ﬂip walk chains is torpidly mixing. As in § G → T d ( G ) which uses gadgets to reﬁne certain features of G , and a map P ( T d ( G )) → P ( G ),where we can count the size of the ﬁbers and the size of the edge boundaries of the ﬁbers. All graphsin this section are assumed to be embedded in the plane. Additionally, we will freely describe a partition( A, B ) ∈ P ( G ) by a map p : V ( G ) → { a, b } . Definition T d , original vertices, original triangles). Let G be a maximal plane graph. Let T d ( G ) be the graph deﬁned by T d ( G ) = ( R d ( G ∗ )) ∗ , where R d is as in Deﬁnition . There is a natural injection i : V ( G ) → V ( T d ( G )) , since there is a natural injection F aces ( G ∗ ) to F aces ( R d ( G ∗ )) , and we call the nodesin Im( i ) the original vertices. Moreover, if F is any triangular face in G , then we call the original verticesof F in V ( T d ( G )) an original triangle. The aﬀect of T d is to take every triangular face and reﬁne it by gluing a graph hanging from the threenodes of the triangle. Figure 3.3 shows the aﬀect of applying T on a single triangular face.In § Definition

Let G be a plane graph. Consider a partition ( A, B ) ∈ P ( G ) deﬁned by a function p : V ( G ) → { a, b } . We will call a face F pure of assignment a (resp. b ), if p takes thevalue a (resp. b ) on all of its nodes. We will call the face mixed otherwise, that is, if p takes on both valueson the vertices of F . For a partition ( A, B ) ∈ P ( G ) , we let P ( A,B ) be the function on the set of faces of G that assigns a to all pure a -faces, b to all pure b -faces, and m to all mixed faces. Additionally, we deﬁne M : P ( G ) → N as the number of mixed faces in a partition. We are going to ﬁnd bottlenecks in this section by deﬁning sets of partitions of T d ( G ) by whether alloriginal triangles are mixed. To leave such a set, some of the triangles will have to become pure, which willforce the new nodes of that triangle to be in a speciﬁc arrangement. A convenient tool for expressing thiswill be to describe directed edges of P ( G ) as being purifying or not. Definition

For a graph $G$, let $DP(G)$ be the directed graph version of the flip walk adjacency structure on $P(G)$. That is, it has a node for each node of $P(G)$ and for each edge $e = \{P, Q\}$ in $P(G)$, $DP(G)$ has two edges: $(P, Q)$ and $(Q, P)$.

Definition

We call an edge e = ( P, Q ) ∈ DP ( G ) purifying if there is a face F of G so that P P ( F ) = m but P Q ( F ) ∈ { a, b } . Let DP C ( G ) be thegraph obtained from DP ( G ) by removing all purifying edges. For Q ∈ P ( G ) , we let C Q ⊆ P ( G ) be the setof all vertices strongly reachable from Q in DP C ( G ) . Finally, we will need a way to relate connected 2-partitions of T d ( G ) to those of G , so that we canpartition P ( T d ( G )) based on which faces of G are pure or mixed. Lemma

For any partition ( A, B ) ∈ P ( T d ( G )) deﬁned by p : V ( T d ( G )) → { a, b } , let p o : V ( G ) →{ a, b } denote the restriction to the original vertices, which we identify with V ( G ) . Then p o : V ( G ) → { a, b } deﬁnes a connected partition of G .Proof. Let x, y ∈ i ( V ( G )) ∩ A . Since T d ( G )[ A ] is connected, there exist a path γ in A from x to y .Forgetting the new vertices in this path gives a path in G [ i − ( A )], since all the vertices on the boundary ofany original triangle are adjacent. Definition

Deﬁne F d : P ( T d ( G )) → P ( G ) to be the restriction map F d ( p ) = p o , with notation as in Lemma 3.2 Lemma

Let $G$ and $F_d$ be as above. Let $P \in P(G)$. Then,

(3.7)  $|F_d^{-1}(P)| \geq 5^{dM(P)}$

and, supposing additionally that $P$ is such that $C_P = \{P\}$,

(3.8)  $|\partial_E F_d^{-1}(P)| \leq n\,5^{(d+1)(M(P)-1)}$.

Proof of (3.7). A mixed face of a 2-partition of $T_d(G)$ corresponds to an $SBL(R_d)$ segment of the simple cycle dual to that 2-partition. Using the estimates in Equation (2.7), each of the mixed faces can be in at least $5^d$ configurations, and so the claim follows.

Proof of (3.8). Since $C_P = \{P\}$, any edge out of $F_d^{-1}(P)$ must cause one mixed face of $P$ to become pure. This mixed face must be in a configuration where all but one node has the same block assignment, and that one exceptional node must be an original node. Since there are at most $n$ original nodes of $G$ which can switch during this step, and the other mixed faces have at most $5^{d+1}$ configurations each, the result follows by the bound on $SBL$ from Equation (2.7).

Taking $G = K_4$ yields the following corollary:

Corollary

There is a family of graphs H d , d ∈ N , that are triangulations of the plane (maximalplanar graphs) such the vertex degree is bounded by and | V ( H d ) | = O ( d ) , and for which P ( H d ) and theunordered partition chain have mixing times at least d . H is shown in Figure b).Proof. One can compute DP C ( K ) to ﬁnd that there are three P i ∈ P ( K ) with C P i = { P i } . Fig-ure 3.4a) shows two such examples. Let P be the top partition and Q the bottom one in Figure 3.4a).By symmetry, we have that | F − d ( P ) | = | F − d ( Q ) | , and so since F − d ( P ) ∩ F − d ( Q ) = ∅ , it follows that | F − d ( P ) | ≤ | P ( T d ( G )) | / . Hence, F − d ( P ), is a candidate bottleneck set. We compute, | ∂ E F − d ( P ) | n | F − d ( P ) | ≤

12 5 ( d +1)( M ( P ) − − dM ( P ) . As M ( P ) = 4, it follows that Φ( P ( T d ( G )) ≤ (5 − d ). We obtain corresponding bottleneck sets in thequotient chain of unordered partitions. The result now follows by Theorem 3.7.This last example illustrates that controlling neither the degree, nor the face degree, nor insisting on3-connectedness of G can improve the mixing time of the ﬂip walk on P ( G ). On the other hand, these graphsstill have a lot of area enclosed by length 3 loops, which is arguably unrealistic for redistricting, except wecould in principle see similar behavior around very dense cities. In the next section, we will use statisticsinspired by the idea that certain nodes may change their assignment infrequently, as well as literature on selfavoiding walks, to investigate the ﬂip walk on connected partitions of a grid graph and on state dual graphs. a bb ab ba a) The elements of DP C ( K ) used in Corollary 3.18 b) H of the family of Corollary 3.18.Figure 3.4Figure 4.1: L n

4. Empirical Examples.

The torpid mixing of the ﬂip walk highlighted in the previous section isnot only a theoretical observation. In this section, we present several experiments showing slow mixing inpractical applications of the ﬂip walk to redistricting. The key statistic we study is closely linked to ourbottleneck proofs, wherein we were able to identify sets of nodes that ﬂip infrequently. For an empiricalanalysis, we will observe the frequency at which nodes ﬂip during simulations of the ﬂip walk on lattice likegraphs ( § § § § § In this section, we will review some special features of connected partitions of thegrid graph, in particular through the connection to the self-avoiding walk model from statistical physics.

In one special case, the objects we are studying are closely linked to a famous topic from statistical physics, namely self-avoiding walks on lattices. A self-avoiding walk (resp. polygon) is a simple path (resp. cycle) in the integer lattice graph. These were introduced as models for polymers, and have shown themselves to be a difficult and rich object of mathematical investigation. An excellent reference for this topic is [74]; [87] gives an overview of Monte Carlo methods used to investigate this topic. Self-avoiding walks on other lattices are also of interest, and we discuss one of those below. We will primarily be interested in walks that are constrained to lie in certain subsets of the lattice.

Take $L_n$ to be the grid graph with shaved corners, as in Figure 4.1. The dual graph, $L_n^*$, is an $(n-1) \times (n-1)$ grid graph $G_{n-1}$ with an additional "supernode" $V$ corresponding to the unbounded face. The simple cycles of $L_n^*$ break into two classes. First, there are those that do not contain the supernode. These can be thought of as self-avoiding polygons in $G_{n-1}$. Second, there are those that do contain the supernode. We can think of these as self-avoiding walks in $G_{n-1}$ between two points on the boundary. We will call these chordal self-avoiding walks.

This connection to self-avoiding walks is important to us because such self-avoiding walks display phase transitions as one varies the preference for longer or shorter walks. As we will see, these phase transitions persist into the distribution $\nu_\lambda$ on 2-partitions that we studied in Definition 2.58. After recalling the relevant facts and history about self-avoiding walks, we will present our experiments.

Definition (Family $P_\lambda$ of probability distributions on self-avoiding walks). Fix $\lambda > 0$. Given a finite set of self-avoiding walks, $P_\lambda$ is a probability distribution that assigns mass to each walk $\omega$ proportionally to $\lambda^{|\omega|}$. Here $|\omega|$ counts the number of edges in the walk.
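To illustrate how the weight $\lambda^{|\omega|}$ shifts mass between short and long walks, the following sketch enumerates self-avoiding walks on $\mathbb{Z}^2$ started at the origin and computes the law of the walk length under $P_\lambda$. The choice of finite set (all walks from the origin with at most `n_max` edges, rather than chordal walks in a bounded domain) and the function names are assumptions made for illustration only.

```python
def count_saws(n_max):
    """counts[n] = number of self-avoiding walks with n edges on Z^2 starting at the origin."""
    counts = [0] * (n_max + 1)

    def extend(x, y, visited, length):
        counts[length] += 1
        if length == n_max:
            return
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                extend(nxt[0], nxt[1], visited, length + 1)
                visited.remove(nxt)

    extend(0, 0, {(0, 0)}, 0)
    return counts


def length_distribution(counts, lam):
    """Law of |omega| under P_lambda on the finite set of walks with at most len(counts)-1 edges."""
    weights = [c * lam ** n for n, c in enumerate(counts)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_length(counts, lam):
    """Expected number of edges of a P_lambda-distributed walk."""
    return sum(n * p for n, p in enumerate(length_distribution(counts, lam)))


counts = count_saws(8)
for lam in (0.1, 0.3, 0.5, 1.0):
    print(lam, round(mean_length(counts, lam), 3))
```

The expected length increases with $\lambda$, since $P_\lambda$ is an exponential family in $|\omega|$. The enumeration grows exponentially in `n_max`, so this approach only reaches short walks.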
It is known that the geometry of P λ -typical chordal self-avoiding walks in G n display a phase transitionas the parameter λ is varied: depending on λ , for suﬃciently large n , a path drawn from the distribution P λ will have certain properties with high probability. To state this phase transition, we will recall an importantconstant called the connective constant of the square lattice.

Theorem

Let c n be the number of self-avoidingwalks on the lattice Z that start at the origin. The limit µ = lim n →∞ n √ c n exists. µ is called the connectiveconstant of the square lattice, and µ ≈ . . The phase transition for the qualitative properties of P λ on chordal self-avoiding walks (SAW) occurs at λ = 1 /µ , which is called the critical “fugacity”. In particular: • Subcritical fugacity λ < /µ : A P λ -typical SAW resembles a geodesic on the grid graph [43]. • Critical fugacity λ = 1 /µ : A P λ -typical SAW resembles a sample path from chordal SLE / [43, 69]. • Supercritical fugacity λ > /µ : A P λ -typical SAW is “space ﬁlling”, in a sense made precise in [43]. It should come as no surprise that a Markov chainas natural as the ﬂip walk has been investigated before, especially given the interest in the self avoiding walkmodel. Indeed, the plane dual of the ﬂip walk moves were applied to the study of self avoiding walks in Z with ﬁxed endpoints (but not constrained to lie in a bounded region) in the BFACF algorithm [87, Section6.7.1]. However, the state space of this walk is inﬁnite, unlike our setting. It was proven that this walkhas inﬁnite exponential autocorrelation time [88]. Various eﬀorts were made to improve the mixing time ofthe BFACF algorithm [87, Section 6.7.2], since physicists were interested in sampling from the stationarydistribution in addition to observing the paths of the chain itself [86]. Additionally, Markov chains on selfavoiding walks have been considered in constrained domains [87, p.69] just as in our setting, but it appearsthat little is known. In a somewhat diﬀerent direction, and conditional on conjectures about the asymptoticsof c n , there are rapidly mixing Markov chains for uniformly sampling from unconstrained, ﬁxed length selfavoiding walks starting at the origin with free endpoint, see [82, 89].Additionally, known bounds on self-avoiding walks provide estimates for the size of P ( L n ). 
For lowerbounds, estimates on the number of self avoiding walks [73] can be used. For the upper bound, methods insection 5.1 of [24] can be used. Interestingly, [24] cites [11] as the inspiration for their method, and [11] waswritten to address the question of how many ways one could design districting plans for a grid-like state. Our goal in this section is to document experiments about MCMCmethods built on top of the ﬂip walk. These experiments were designed to investigate whether the chain wasdrawing samples from its stationary distribution. As the proposal distribution, we used the ﬂip walk on the simply-connected elements of P ( L n ). We used a Metropolis score function S (( A, B )) = λ −| cut( A,B ) | , so thatthe stationary distribution would be ν λ . To tune λ around the critical fugacity 1 /µ , we used the estimates µ ∈ [2 . , . µ ≈ . X % of the total number ofnodes beyond the number of nodes in a perfectly balanced partition, which we call an allowed populationdeviation (APD) of X %.These experiments are recorded in Figure 4.2 and Figure 4.3. We wish to thank a helpful conversation on MathOverﬂow for drawing our attention to this fact [4].27a)

Very low fugacity, without tight population con-straints

Population deviation 90%, λ = 1 /

10, Steps= 5 , , , Very low fugacity, with tight population constraints

Population deviation 10%, λ = 1 / < /µ , Steps= 2 , , ,

156 This quickly rotated from the diagonalto be horizontal, and oscillated around that.(c)

Very low fugacity, with extremely tight populationconstraints

AP D = 1%, λ = 1 / < /µ , Steps =2 , , , Critical fugacity, with loose population constraints

AP D = 50%, λ = 1 /µ , Steps = 1 , , , Figure 4.2: These examples start from an upper left to bottom right diagonal partition of a 40 ×

40 gridgraph. Each node kept track of the number of times it was ﬂipped, and this number is reported and coloredaccording to the key. Some interpretation of the history of the path revealed by the ﬁgures is provided. Inall of these examples a symmetry argument demonstrates that the chain has evolved into some metastableregion P . In this section, we repeat the experiments we performed on thegrid graph the state dual graphs ( § So far we have discussed phase transitions of self-avoiding walks and connected 2-partitions in the grid graph, and remarked that the critical fugacity occursat 1 /µ ≈ . λ where the part of the partition boundary in the square grid acts super critically, and the part inthe triangular acts subcritically. These experiments about phase transitions in geographic compactness scores ﬁt in with other observa-tions about compactness, namely that many features of the scores are not robust under changes of scale, Connected 2-partitions of the triangular lattice correspond to self avoiding walks in the dual lattice, which is a hexagonallattice. Since [44] puts the connective constant of the hexagonal lattice at (cid:112) √

2, the phase transition for partitions of thetriangular part occurs around . ) Average block b) FlipsFigure 4.3: Two measurements of a single run. AP D = 90%, x = 1, steps = 194 , , −

80, and blue indicating < ×

50. See Figure 4.6 for results of running the ﬂip walk.

The example of the λ cut distribution and the Frankengraphteaches us that the choice of graph used to discretize the same underlying geography can dramatically aﬀectthe distribution over partitions produced by a ﬁxed algorithm. However, in those cases the geographicreasonableness of the distribution also changed dramatically, making it easy to classify a single partition asarising from one discretization or the other. In this section, we look at a diﬀerent distribution over partitionsof a rectangle, with the property that changing the discretization noticeably changes the distributions, butsuch that diﬀerences between the two distributions cannot be easily detected by observing natural geometricproperties of individual plans. ) λ = 1, AP D = 90%. Steps= 93 , ,

894 b) λ = . AP D = 90%. Steps= 1 , , , M CM C on the state dual graph of Kansas. Top: Starting plan.Middle: Counting Flips. Bottom: Ending plan.Figure 4.6: A ﬂip walk based MCMC run with λ = 1 /

2, starting from the diagonal partition of the Franken-graph. 90% population constraint. 17 , , ,

990 steps. e will now explain this other sampling algorithm, which underlies the method used in [36, 41]. LetUST be an algorithm that takes a graph G and returns a uniform spanning tree, for example using Wilson’salgorithm [94]. Let MST refer to the minimum spanning tree obtained by picking iid Uniform([0 , G ) refer to either UST( G ) or MST( G ). Removing an edge e from Tree( G ) gives aforest with two components, and hence an element of P ( G ). If we repeat this algorithm, only selectingthose edges that provide an (cid:15) -balanced connected 2-partition of G , then we obtain a distribution over P (cid:15) ( G ). For both UST and MST, this distribution over partitions has some favorable properties, such asits concentration on partitions with small edge-cuts, which have made it appealing as a tool for samplingdistricting plans [36, Section 3.1.1]. We will call this distribution UST-partition or MST-partition, orTree-partition if we refer to either.We construct a sequence of graphs from the 36 ×

36 grid graph G by by triangulating a set of its facesin the following way. Fix some w ∈ [0 , i, j ), i, j ∈ [0 , x, y ) ∈ V ( G ), if 12 ≤ y ≤

20 and 0 ≤ x ≤ w or 34 − w ≤ x ≤

34, add an edge $((x, y), (x+1, y+1))$ if $x$ is even, and an edge $((x, y), (x+1, y-1))$ if $x$ is odd. We call the resulting graph $G_w$, and $w$ is referred to as the width. We think of partitions of $G_w$ as modelling the same state, but with different choices regarding adjacency between geographic units. By changing the width, we can see a change in the shape of a typical Tree-partition. The results are displayed in Figure 4.7, where we use a number in $[0, 3]$ to quantify the width of each half of the gate; the entire graph has dimensions $36 \times 36$, and the length of each half-gate is width $\times$ 6. The effect is clear: closing the gap in the gate squeezes the boundary of the partitions in between the two halves of the gate.

In Figure 4.8 we show that MST-partitions and UST-partitions produce different partisan outlier measurements, despite the similarity between the descriptions of the algorithms. In particular, if the underlying graph is the one with width 2, and if the distribution chosen as a baseline for outlier analysis is MST-partition, then most of the time the seat share is 1, and many random samples from UST-partition would be considered extreme outliers. We discuss this further in § §

There are also a variety of resolutions on which maps can be viewed, from counties to tracts to census blocks. All of this could potentially impact outlier analysis in a way similar to the Frankengraph example and Figure 4.7. Thus, someone sampling partitions of a state dual graph to investigate the political geography of redistricting should keep in mind that they are making a potentially significant choice at the level of the choice of model graph. To comport with best practices in statistics [50], these decisions should be made as transparently and impartially as possible.

In many U.S. states, precincts, which are the atomic geographic units where electoral data is reported, are drawn at the discretion of the local municipalities. The examples in this section show that the people who choose the smaller geographic units potentially have a lot of control over the results reported by outlier methods. If outlier methods become standard, it would open the possibility of metamandering through the deliberate manipulation of these boundaries. No comprehensive analysis has been done to understand the impacts of these decisions.

Figure 4.7: An edge is yellower if it is more frequently a cut-edge of a sampled partition. The left hand diagrams were made using MST-partition, and the right hand side with UST-partition. Both use 1000 sampled plans, and plans are conditioned on having at most a 5% size deviation between the two blocks. Some sample partitions are displayed next to the cut-edge picture. The UST-partitions tend to have slightly longer boundaries on average.

Footnote: A fact partially reflected in [65, Corollary 2]; the connection being that the probability that a UST-partition $(A, B) \in P(G)$ is chosen is proportional to $T(A)\,T(B)\,\mathrm{cut}(A, B)$, where $T(\cdot)$ counts the number of spanning trees, and based on this one can rearrange the asymptotics in Kenyon's paper to deduce that among the rectilinear partitions those with smaller perimeter are asymptotically preferred. However, rectilinear partitions are a set of extremely small measure in the UST-partition distribution, and so this explanation for concentration of UST-partition on smaller fundamental cut-sets is only partial.

Footnote: That is, between declaring two units adjacent if they have a common edge, or if they have any common point.
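The footnote in this section states that a UST-partition $(A, B)$ is chosen with probability proportional to $T(A)\,T(B)\,\mathrm{cut}(A, B)$. As a minimal sketch of how that unnormalized weight could be evaluated on a small graph, the following Python computes $T(\cdot)$ with the matrix-tree theorem (a standard fact, not a method taken from the paper). The function names `spanning_tree_count` and `ust_partition_weight` are hypothetical, and the code assumes a simple graph given as an edge list with nonempty blocks.

```python
from fractions import Fraction


def spanning_tree_count(vertices, edges):
    """Count spanning trees of the subgraph induced on `vertices`
    (matrix-tree theorem: determinant of a reduced Laplacian)."""
    vertices = list(vertices)
    n = len(vertices)
    if n <= 1:
        return 1  # a single vertex has exactly one (empty) spanning tree
    index = {v: i for i, v in enumerate(vertices)}
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        if u in index and v in index and u != v:
            i, j = index[u], index[v]
            lap[i][i] += 1
            lap[j][j] += 1
            lap[i][j] -= 1
            lap[j][i] -= 1
    # Delete the last row and column, then take an exact determinant.
    m = n - 1
    mat = [row[:m] for row in lap[:m]]
    det = Fraction(1)
    for c in range(m):
        pivot = next((r for r in range(c, m) if mat[r][c] != 0), None)
        if pivot is None:
            return 0  # singular reduced Laplacian: induced subgraph is disconnected
        if pivot != c:
            mat[c], mat[pivot] = mat[pivot], mat[c]
            det = -det
        det *= mat[c][c]
        for r in range(c + 1, m):
            factor = mat[r][c] / mat[c][c]
            for j in range(c, m):
                mat[r][j] -= factor * mat[c][j]
    return int(det)


def ust_partition_weight(block_a, block_b, edges):
    """Unnormalized weight T(A) * T(B) * cut(A, B) of the 2-partition (A, B)."""
    cut = sum(
        1
        for u, v in edges
        if (u in block_a and v in block_b) or (u in block_b and v in block_a)
    )
    return (
        spanning_tree_count(block_a, edges)
        * spanning_tree_count(block_b, edges)
        * cut
    )
```

Under these assumptions, a block that induces a disconnected subgraph receives weight 0, so only connected 2-partitions carry mass.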

5. Positive results.

In this section, we provide several results regarding the tractability of sampling connected $k$-partitions and simple cycles. Most of these results will follow from the tractability of counting connected $k$-partitions on various families of graphs. Unlike many $p$-relations, connected 2-partitions and simple cycles do not appear to be encodable in a self-reducible way (in the sense used in [63, 66]), meaning that the equivalence between counting and sampling requires variations on the ideas explained in [63]. However, we can use the chain of bigons construction to evaluate the marginal probabilities that self-reducibility would normally reduce to a counting problem. We can also directly modify the counting algorithms we find to compute the marginals. First, we recall how to use certain marginal probabilities to sample from a probability distribution over subsets of a given set.

The following algorithm is a standard part of the equivalence between counting and sampling, and is usually stated in the context of self-reducible structures. In Appendix B.1 we provide a proof of correctness.
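As an illustration of the marginal-based sampling recalled here (stated formally as Algorithm 5.1 in this section), the following Python sketch builds a random subset of $[n]$ one element at a time from the conditional probabilities $p(i \mid J)$. The names `inductive_sample` and `brute_force_marginal` are hypothetical, and the brute-force oracle is only an assumption made so that the sketch is self-contained; it enumerates the whole support and is not the polynomial-time oracle the paper constructs.

```python
import random


def inductive_sample(n, marginal, rng):
    """Sample a subset of {1, ..., n} given an oracle marginal(i, J) that
    returns P(i in S | S ∩ {1, ..., i-1} = J)."""
    chosen = frozenset()
    for i in range(1, n + 1):
        # Include i with its conditional probability given the prefix decisions.
        if rng.random() < marginal(i, chosen):
            chosen = chosen | {i}
    return chosen


def brute_force_marginal(weights):
    """Build the oracle from an explicit table {frozenset: nonnegative weight}.
    Exponential in general; for illustration on tiny examples only."""

    def marginal(i, prefix_choice):
        prefix = set(range(1, i))
        consistent = [
            (s, w) for s, w in weights.items() if s & prefix == prefix_choice
        ]
        total = sum(w for _, w in consistent)
        if total == 0:
            return 0.0
        return sum(w for s, w in consistent if i in s) / total

    return marginal
```

For example, with weights `{frozenset(): 1, frozenset({1}): 1, frozenset({1, 2}): 2}` the oracle gives $p(1 \mid \emptyset) = 3/4$, and every sample returned lies in the support of the table.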

Definition

Let $p$ be a probability distribution on $2^{[n]}$, the set of subsets of $[n] = \{1, 2, 3, \ldots, n\}$. Let $S$ be a random variable distributed according to $p$. For a set $J \subseteq [i-1]$, define $p(i \mid J) = P(i \in S \mid S \cap [i-1] = J)$; that is, the probability that $S$ contains $i$, conditioned on containing $J$ and being disjoint from $[i-1] \setminus J$. (As a convention, take $[0] = \emptyset$.)

The use of the previous definition is in the following algorithm for sampling from a probability distribution over subsets of a set, given access to the marginals of Definition 5.1.

Footnote: Using techniques similar to [66] we can prove that at least one reasonable encoding of simple cycles is not self-reducible, unless P = NP. See Appendix B.5.

Width    MST             UST
0        1.346, 1.339    1.348, 1.273
1        1.045, 1.479    1.220, 1.314
2        1.003, 1.689    1.191, 1.425
3        1.072, 1.559    1.239, 1.379

Figure 4.8: 1000 samples, on a $36 \times 36$ node graph. Each half gate is $6 \times$ width vertices long, so width 3 represents the gate cutting the graph entirely in half. The numbers reported are obtained in the following way: some nodes are set to be 1 and others to 0. Each district majority votes to decide the "party" of the representative in that region, which is a number in $\{0, 1\}$. By summing the party across both districts, each plan is assigned a total party value in $\{0, 1, 2\}$. We have reported the mean total party value across 1000 trials, according to two different voting population distributions. The left hand numbers reported are based on a distribution of voters where the left 60% of the nodes in the square are party 1, and the right hand numbers are based on a voter distribution where the bottom 60% are party 1. Since the voting data forces the total party value into $\{1, 2\}$, these are all Bernoulli random variables, and therefore the entire distribution can be read from the mean that we reported.

Algorithm 5.1 InductiveSampling

Input: A probability distribution $p$ on $2^{[n]}$ described via an oracle $O$ that can compute $p(i \mid J)$ for any $i \in [n]$ and any $J \subseteq [n]$.

Output: A random element of $2^{[n]}$ distributed according to $p$.

    Set $J_0 = \emptyset$
    for $i \in [n]$ do
        Use $O$ to calculate $p(i \mid J_{i-1})$
        With probability $p(i \mid J_{i-1})$, set $J_i = J_{i-1} \cup \{i\}$. Else, set $J_i = J_{i-1}$.
    Return $J_n$.

The correctness of this algorithm (Appendix B.1) proves the following:

Theorem

Let $\mathcal{C}$ be a language encoding graphs (resp. node-weighted graphs), and suppose that there is a polynomial time Turing machine $M$ (resp. $M_B$) that, on input $G \in \mathcal{C}$ and $J, J' \subseteq E(G)$, computes the number of simple cycles containing $J$ and disjoint from $J'$ (resp. the number of balanced connected 2-partitions whose cut set contains $J$ and is disjoint from $J'$). Then there is a polynomial time probabilistic Turing machine that uniformly samples from $SC(G)$ (resp. uniformly samples from $P(G)$) for $G \in \mathcal{C}$.

The next sections will be concerned with computing the marginals that are necessary for Theorem 5.2. This will be done by showing that we can solve certain counting problems. The tractability of the counting problems on graphs of bounded treewidth will follow from extensions of Courcelle's theorem, such as those in [14]. In particular, the cut edges of a connected $k$-partition can be expressed in MSO (see § $k$-partition can be expressed in EMS, as defined in [14]. The constants in these meta-theorems are too large to be practically useful, as the automata on which they are based grow in size like a tower of exponentials in the size of the formula, and there are no general tricks to avoid this [48]. Therefore, although we appeal to these meta-theorems to conclude complexity theory statements, we emphasize practical approaches to solving these counting and sampling problems on series-parallel graphs, which give some directions for practical implementations on wider classes of graphs. For example, forthcoming work [91] extends the ideas applied to series-parallel graphs to arrive at a reasonably implementable algorithm for counting and sampling simple cycles fixed-parameter tractably in the treewidth.

Footnote: For background on second order logic, the reader is referred to [46, Chapter 7]; for background on these meta-theorems and MSO, the reader is referred to [14]. A brief summary of the meta-theorem that we use is given in §

Definition

Given a graph $G = (V, E)$ and $J, J' \subseteq E$, let $G_{J,J'}(d)$ denote the graph where the edges in $J'$ are deleted, and the edges in $J$ are replaced by a chain of $d$ bigons.

The next lemma shows that for sufficiently large $d$, the number of simple cycles in $G$ containing $J$ and disjoint from $J'$ can be inferred from $|SC(G_{J,J'}(d))|$, by using division with remainder and the same exponential growth rate comparisons that drove the intractability results:

Definition

If $\mathcal{C}$ is some language encoding graphs, and $k : \mathcal{C} \to \mathbb{N}$ some function, then we call $k$ a parametrized language of graphs.

Lemma

Let $k : \mathcal{C} \to \mathbb{N}$ be a parametrized language of graphs. Suppose that there is a polynomial $p$, a computable function $f$, and a Turing machine $M$ that can calculate $|SC(G_{J,J'}(d))|$ in time $f(k(G_{J,J'}(d)))\, p(|G|, d)$ for all $G \in \mathcal{C}$ and $d \geq 1$ and for any $J, J' \subseteq E(G)$. Then, there is a polynomial $q$ and a TM which calculates $b_{J,J'} := |\{T \in SC(G) : J \subseteq T,\ J' \cap T = \emptyset\}|$ in time $O(f(k(G_{J,J'}(36n)))\, q(|G|))$ for all $G \in \mathcal{C}$ and $J, J' \subseteq E(G)$.

Proof. If $|J| = 0$, then $|SC(G_{J,J'}(d))| = b_{J,J'}$. We assume that $|J| \geq 1$, and let $n = |G|$. Observe, in the manner of Proposition 2.13, that

$$|SC(G_{J,J'}(d))| = \sum_{k=0}^{|J|} 2^{dk}\, |\{X \in SC(G) : X \cap J' = \emptyset,\ |X \cap J| = k\}| + d|J|.$$

We define $a_d = |SC(G_{J,J'}(d))|$ and the remainder term

$$R_d = \sum_{k=0}^{|J|-1} 2^{dk}\, |\{X \in SC(G) : X \cap J' = \emptyset,\ |X \cap J| = k\}| + d|J|.$$

$R_d$ is bounded above by $2^{d(|J|-1)} 2^{n} + d|J| \leq d n\, 2^{d(|J|-1)} 2^{n}$. By Lemma 2.10, for $d = 4\lceil (n + \log(n) + 1) \rceil$, $2^{d|J|} > d n\, 2^{d(|J|-1)} 2^{n}$, and hence for such $d$, we have $a_d = 2^{d|J|} b_{J,J'} + R_d$, where $2^{d|J|} > R_d$. Since each term in that expression is an integer, $R_d$ is the remainder of dividing $a_d$ by $2^{d|J|}$, and $b_{J,J'}$ is the quotient. This division with remainder can be performed in $O(\log(|SC(G_{J,J'}(d))|) \log(2^{d|J|}))$ time [84, Theorem 3.3], which is polynomial in $(|G|, d)$. Since $d = 4\lceil (n + \log(n) + 1) \rceil \leq 36n$, the result follows.

We briefly recall a definition of treewidth:

Definition ($k$-trees, partial $k$-trees and treewidth). A $k$-tree is any graph that can be recursively constructed in the following manner. We start with a tree $T$ that is a $k$-clique. Then, we obtain $T_n$ from $T_{n-1}$ by picking any $k$-clique $Q$ of $T_{n-1}$, adding a new vertex $v$, and connecting $v$ to each node of $Q$. A partial $k$-tree is any subgraph of a $k$-tree. The treewidth of a graph $G$ is the smallest $k$ such that $G$ is a partial $k$-tree.

The operation $G \to G_{J,J'}(d)$ preserves the class of series-parallel graphs, and in addition, does not increase the treewidth of any graph with treewidth $\geq 2$. There are polynomial time dynamic programming algorithms for counting the number of simple cycles of a series-parallel graph, and it follows from Courcelle's theorem that counting the number of simple cycles is fixed-parameter tractable in the treewidth. Thus, from Lemma 5.5 and Theorem 5.2 we have the following:

Theorem

The problem of uniformly sampling simple cycles is FPT in the treewidth.

Footnote: See Appendix B.3.

Footnote: This follows from Proposition 5.11 and Theorem 6.56 in [34]. We are grateful to Mamadou Moustapha Kanté for pointing this out on Stack Exchange [5]. There is an explicit MSO formula for simple cycles at the same link.

Since the treewidth of a plane dual changes by at most one [23, 68], we obtain a similar result for sampling connected 2-partitions. However, in the next section we will show how to apply Courcelle's theorem directly to show that sampling connected $k$-partitions is FPT in the treewidth.

The treewidth of a typical state dual graph used for redistricting (§ $P(G)$ tractable when $G$ is a state dual graph. We discuss this further in § $\leq 2$. We now introduce a definition to describe these distributions.

Definition

Let $G$ be a graph, and $c : E(G) \to \mathbb{Q}_{\geq 0}$ some weight function. Let $N_c$ denote the measure on simple cycles that gives weight $N_c(C) = \prod_{e \in C} c(e)$ to each simple cycle $C$, and let $\nu_c$ denote the probability distribution obtained by normalizing $N_c$.

Theorem

Sampling from $\nu_c$ is polynomial-time solvable on the class of graphs of treewidth $\leq 2$.

Proof. See Lemma B.14, which shows how to directly compute the required marginal probabilities for sampling from $\nu_c$ via Algorithm 5.1, without using Lemma 5.5.

The space of distributions on $SC(G)$ is far larger than the distributions described by $\nu_c$. We mention several other tractable distributions on $SC(G)$ and $P(G)$ in § $k$-partitions. First, we will briefly review MSO and the counting meta-theorem in § § formula that defines connected $k$-partitions. Finally, we will tie these together to prove the following:

Theorem

Uniformly sampling from $P_k(G)$ is FPT in the treewidth.
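For intuition about the set $P_k(G)$ appearing in this theorem, the following Python sketch enumerates connected $k$-partitions of a tiny graph by brute force. It is not the fixed-parameter tractable method of the paper: it runs in time exponential in the number of vertices and is only meant to make the object concrete. The function names are hypothetical, and the sketch assumes that partitions are unordered and that every block must be nonempty and induce a connected subgraph.

```python
from itertools import product


def is_connected(nodes, adjacency):
    """Check that `nodes` induces a connected subgraph (depth-first search)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def connected_k_partitions(vertices, edges, k):
    """Return the set of unordered partitions of `vertices` into k nonempty
    blocks, each inducing a connected subgraph. Brute force over labelings."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    found = set()
    for labels in product(range(k), repeat=len(vertices)):
        blocks = [
            frozenset(v for v, label in zip(vertices, labels) if label == b)
            for b in range(k)
        ]
        if any(len(block) == 0 for block in blocks):
            continue
        if all(is_connected(block, adjacency) for block in blocks):
            found.add(frozenset(blocks))  # a set of blocks ignores block order
    return found
```

On the 4-cycle, for instance, this enumeration yields six connected 2-partitions: four that split off a single vertex and two that split the cycle into opposite edges.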

MSO formulas. We mostly follow [14]. We consider a relational vocabulary $R = (V, E, J, J', \mathrm{inc})$, where $V, E, J, J'$ are unary relations, and inc is a binary relation. Additionally, we consider a set of formulas

$$\Gamma = \{\forall x\, V(x) \lor E(x),\ \ \forall x\, V(x) \leftrightarrow \neg E(x),\ \ \forall x \forall y\, \mathrm{inc}(x, y) \to V(x) \land E(y),\ \ \forall x\, J(x) \lor J'(x) \to E(x)\}.$$

Mod($\Gamma$) denotes the set of models of $\Gamma$. Given a model of $\Gamma$ with universe $A$, $A$ is partitioned by the two sets defined by $V$ and $E$, which we refer to as $V$ and $E$, by abuse of notation. We interpret $V$ as the set of vertices, $E$ as the set of edges, $J$ and $J'$ as two collections of edges, and inc as the incidence relation. That is, $\mathrm{inc}(v, e)$ is interpreted as meaning that vertex $v$ is incident to edge $e$. Thus, a model for $\Gamma$ is a graph along with two sets of edges.

We denote by MSO the second order logic with signature $R$ that allows only unary relational variables. Given a formula $\Phi(X)$ in MSO with a free variable $X$, the enumeration problem for $\Phi$ is that of computing $|\{X : G \models \Phi(X)\}|$ for any given $G \in \mathrm{Mod}(\Gamma)$.

Then, we have:

Theorem [14, Theorem 5.7]. For each MSO formula $\Phi(X)$, and for each class $K$ of graphs with universally bounded treewidth, the enumeration problem for $\Phi$ can be solved in $O(|G| \log(|G|))$ time if $G$ is given with a tree-decomposition.

For the purposes of obtaining an MSO formula, it is convenient to represent a connected $k$-partition by the complement of the cut-set, similarly to Appendix A.3.

Definition

Given an (unordered) $k$-partition $P$, define $F(P) = \mathrm{cut}(P)^c$. Let $\mathrm{Flats}_k(G)$ be the set $\{F(P) : P \in P_k(G)\}$. For any $J', J \subseteq E$, define $F_{J,J'}(G) \subseteq \mathrm{Flats}_k(G)$ as those $Q \in \mathrm{Flats}_k(G)$ with $J \subseteq Q$ and $Q \cap J' = \emptyset$.

Since a connected $k$-partition is determined by its cut set (Proposition A.10), it follows that $F : P_k(G) \to \mathrm{Flats}_k(G)$ is a bijection. We are describing (unordered) connected partitions as flats in the graphic matroid, hence the notation "F" and "$\mathrm{Flats}_k$."

MSO formula for edge sets in $\mathrm{Flats}_k(G)$. Let $G = (V, E)$ be a graph. For $X \subseteq E$ we will build up to an MSO formula that checks if $X \in \mathrm{Flats}_k(G)$. Our building blocks are inspired by the examples in [35, Chapter 7]. First, we define a formula that checks if a set of nodes, $Y$, is contained in $G[X]$:

$$\mathrm{In}(Y, X) = \forall v \in Y\ \exists e \in X\ \mathrm{inc}(v, e).$$

Given two sets of vertices, $U$ and $W$, we define a formula that checks if there is an edge in $X$ connecting a node in $U$ to a node in $W$:

$$\mathrm{Bridge}(U, W, X) = \exists u \in U,\ w \in W,\ e \in E\ \ \mathrm{inc}(u, e) \land \mathrm{inc}(w, e).$$

Next we define a formula that checks if $G[X]$ is connected, by checking whether there are any non-trivial 2-partitions of $V(G[X])$ with no edges in $X$ between the different blocks.

$$\mathrm{connE}(X) := \forall Y \subseteq V\ [\mathrm{In}(Y, X) \land Y \neq \emptyset \land [\exists U \subseteq V\ \mathrm{In}(U, X) \land U \cap Y = \emptyset \land U \neq \emptyset \land U \cup Y = V]] \to \mathrm{Bridge}(Y, U, X).$$

We next define a formula that checks if an edge has both endpoints in the nodes of a subgraph induced by a set of edges $Y$:

$$\mathrm{ep}(e, Y) = \forall v \in V\ \mathrm{inc}(e, v) \to (\exists e' \in Y\ \mathrm{inc}(e', v)).$$

Finally, for each $k \geq 1$, we define a formula that takes a collection of edges, $X$, and checks whether it is in $\mathrm{Flats}_k(G)$. This is accomplished by checking that every node of $G$ is incident to some edge in $X$ and that $X$ is a union of $k$ sets of edges, each of which induces a connected subgraph and so that any edge with both endpoints in one of those connected subgraphs is in $X$.

$$F'_k(X) = \mathrm{In}(V, X) \land \Big(\exists X_1, \ldots, X_k \subseteq E\ \Big(X = \bigcup X_i \land \bigwedge_i \mathrm{connE}(X_i) \land \Big(\forall e \in E\ \bigwedge_i (\mathrm{ep}(e, X_i) \to e \in X_i)\Big)\Big)\Big)$$

Recall that we considered $J$ and $J'$ to be part of the relational structure $R$, so we can define the MSO formula whose solution sets are the members of $F_{J,J'}$:

(5.1)    $F_k(X) = F'_k(X) \land (J \subseteq X) \land (J' \cap X = \emptyset)$

Lemma

Let $A$ be a model of $\Gamma$, i.e. a graph $G = (V, E)$ with vertex-edge incidence matrix given by inc and two distinguished subsets of edges, $J$ and $J'$. $F_k(X)$ is true in $A$ if and only if $X = F(P)$ for some connected $k$-partition $P$ of $G$ and $X \cap J' = \emptyset$ and $J \subseteq X$.

We now prove Theorem 5.1.

Proof. Let $K$ be a class of graphs with universally bounded treewidth. For any $G \in K$ and $J, J' \subseteq E(G)$, by Theorem 5.2 and Lemma 5.3, we can count $|F_{J,J'}(G)|$ in time $O(|G| \log(|G|))$ with constant dependent only on the bound on the treewidth and the formula $F_k$. The conditions for running Algorithm 5.1 are satisfied.

Remark

It is easy to add a relational formula (see [14]) to Equation (5.1) that restricts our count to only balanced connected $k$-partitions. In particular, the balanced connected $k$-partition problem is in extended monadic second order logic (EMS). From this it should follow that the counting and sampling problems are XP in the treewidth. However, as noted at this Stack Exchange question [9], the corresponding meta-theorem appears to be missing from the literature.

We mentioned in § counting the number of balanced connected 2-partitions of a given node-weighted series-parallel graph $G$ in time polynomial in $G$ and pseudopolynomial in the weights. We present the details of this algorithm in Appendix B.4. To turn such a counting algorithm into an algorithm for calculating the marginals necessary for Theorem 5.2, we proceed along similar lines as in the simple cycle case.

Definition ($W_{J,J'}(G, w)(d)$). Let $(G, w)$ be a node weighted graph, and let $J, J' \subseteq E(G)$. Define $G_{J,J'}$ by replacing edges in $J$ with the doubled $d$-star gadgets from Definition and contracting the edges in $J'$, deleting any self loops that arise in this way. Assign the "new nodes" of $D_d(e)$ weight 0 for each $e \in J$, and the old nodes the same weight as they had in $G$. The resulting node-weighted graph is denoted $W_{J,J'}(G, w)(d)$.

We now show that the marginals necessary for Algorithm 5.1 can be computed from $|P(W_{J,J'}(G, w)(d))|$ (with notation as in § §

Proposition

Let $(G, w)$ be a weighted graph. Then:

(5.2)    $|P(W_{J,J'}(G, w)(d))| = 2^{d|J|}\, |\{X \in P(G, w) : J \subseteq \mathrm{cut}(X),\ \mathrm{cut}(X) \cap J' = \emptyset\}| + R_d,$

where $R_d$ is a non-negative integer with:

(5.3)    $R_d \leq 2^{n}\, 2^{d(|J|-1)}.$

Proof. Let $G/J'$ denote the quotient graph obtained by identifying $u, v \in V(G)$ if $\{u, v\} \in J'$. First, we decompose

(5.4)    $P(G/J') = \bigcup_{k=0}^{|J|} \{(A, B) \in P(G/J') : |\mathrm{cut}(A, B) \cap J| = k\}.$

We define $R_J : P(G_{J,J'}(d)) \to P(G/J')$ as $R_d$ is in Definition 3.9, by forgetting the assignment of new nodes. We pull back Equation (5.4) along $R_J$ to obtain:

$$P(G_{J,J'}(d)) = \bigcup_{k=0}^{|J|} R_J^{-1}(\{(A, B) \in P(G/J') : |\mathrm{cut}(A, B) \cap J| = k\}).$$

The map $\phi^* : P(G/J') \to P(G)$ defined by $\phi^*((A, B)) = (\phi^{-1}(A), \phi^{-1}(B))$ is an injection, and the image is $\{(A, B) \in P(G) : \mathrm{cut}(A, B) \cap J' = \emptyset\}$.

Hence we have a partition of $P(G_{J,J'}(d))$,

$$P(G_{J,J'}(d)) = \bigcup_{k=0}^{|J|} (\phi^*_{J'} \circ R_J)^{-1}(\{(A, B) \in P(G) : \mathrm{cut}(A, B) \cap J' = \emptyset,\ |\mathrm{cut}(A, B) \cap J| = k\}).$$

So far we have decomposed the set of partitions of $G_{J,J'}(d)$. Next, we compute the 0-balanced partitions in each block of that decomposition. The elements of $(\phi^*_{J'} \circ R_J)^{-1}(\{(A, B) \in P(G) : \mathrm{cut}(A, B) \cap J' = \emptyset,\ |\mathrm{cut}(A, B) \cap J| = k\})$ are obtained by extending a partition in $\{(A, B) \in P(G) : \mathrm{cut}(A, B) \cap J' = \emptyset,\ |\mathrm{cut}(A, B) \cap J| = k\}$ onto the new nodes. Since each new node has weight 0, it is impossible to assign new nodes in such a way as to make unbalanced partitions of $G$ balanced.

The balanced partitions that have $J$ contained in the cut have exactly $2^{d|J|}$ balanced extensions each. This proves Equation (5.2). We are left to show the upper bound of Equation (5.3) for the remaining partitions, namely:

$$\mathrm{Rem}_d = \Big(\bigcup_{k=0}^{|J|-1} (\phi^*_{J'} \circ R_J)^{-1}(\{(A, B) \in P(G) : \mathrm{cut}(A, B) \cap J' = \emptyset,\ |\mathrm{cut}(A, B) \cap J| = k\})\Big) \cap P(W_{J,J'}(G)(d)).$$

We have that $R_d = |\mathrm{Rem}_d|$. Suppose that $X$ is some balanced partition of $G$, with $|\mathrm{cut}(X) \cap J| \leq |J| - 1$. The number of ways to extend $X$ to the new nodes and get a balanced partition is at most $2^{d(|J|-1)}$. Since $|P(G)| \leq 2^{n}$, this provides the upper bound on the remainder term.

Proposition

Let $\mathcal{C}$ be some class of graphs that is closed under the operation $G \to G_{J,J'}(d)$ of Definition , for all $d \geq 1$. Let $p$ be a polynomial. Suppose that $M$ is a Turing machine which can compute $|P(G)|$ on all weighted graphs $(G, w)$ where $G \in \mathcal{C}$ and $w : V(G) \to$ { , , . . . , }, in time bounded by $p(|G|, w(G))$. Then there is a polynomial time probabilistic Turing machine that uniformly samples from $P(G, w)$ in time polynomial in $(|G|, w(G))$ for all $G \in \mathcal{C}$.

Proof. Due to Algorithm 5.1, to sample in polynomial time it suffices to be able to compute $a_{J,J'} := |\{X \in P(G, w) : |\mathrm{cut}(X) \cap J| = |J|,\ \mathrm{cut}(X) \cap J' = \emptyset\}|$ in polynomial time for any given $J, J' \subseteq E(G)$. We will do this by computing $|P(W_{J,J'}(G, w)(d))|$ at a value of $d$ which is polynomially large in $|G|$.

If $d = n + 1$, then $2^{d|J|} > 2^{n}\, 2^{d(|J|-1)}$. Now, given $N_d = |P(W_{J,J'}(G, w)(d))|$, from Proposition 5.13 we know that we can write $N_d = a_{J,J'}\, 2^{d|J|} + R_d$. Since we fixed $d = n + 1$, we can compute $2^{d|J|}$ and $N_d$ in time polynomial in $(|G|, w(G))$, and so by division with remainder we can compute $a_{J,J'}$ in time polynomial in $(|G|, w(G))$. Thus, we have calculated the marginal that we need for sampling.

Theorem

There is an algorithm for uniformly sampling from the balanced partitions of a node weighted series-parallel graph $(G, w)$, which runs in time polynomial in $(|G|, w(G))$.

Proof. This follows from Proposition 5.14 and the dynamic program for counting balanced partitions on series-parallel graphs presented in Appendix B.4, since the class of node weighted series-parallel graphs is closed under the operation $G \to G_{J,J'}(d)$ for all $d \geq 1$; this is because series-parallel graphs are closed under replacing edges by doubled $d$-trees, and under edge contractions (provided we eliminate self loops).

Remark

It may be possible to extend this to an XP in treewidth algorithm for sampling balanced $k$-partitions, using similar ideas as well as those mentioned in the conclusion of [60].

We conclude this section by pointing out that many distributions on $P_k(G)$ and $P_k(G)$ are tractable to sample. A general strategy for building $k$-partitions of $G$ is to contract $G$ in some random way onto a simpler graph, $G'$, and then pull back $k$-partitions from the simpler graph. The following lemma shows that one can pull back connected $k$-partitions along quotient maps obtained by contracting connected partitions:

Lemma

Let $G$ be a graph, and let $\phi : G \to G/R$ be a graph quotient map, where $R$ is an equivalence relation on the nodes such that the equivalence classes of $R$ induce connected subgraphs. Then for any $(A_1, \ldots, A_k) \in P_k(G/R)$, $(\phi^{-1}(A_1), \ldots, \phi^{-1}(A_k)) \in P_k(G)$. Moreover, if $w$ is a node weight function on $G$, then if we assign each equivalence class of $G/R$ the total weight of all its elements, $\phi^{-1}$ preserves the weight of blocks, and thus also pulls back balanced partitions.

This lemma can be used to give a recipe for chaining together random partitions of $G$ with many blocks into an algorithm for obtaining random partitions into $k$ blocks. For example, at each stage one can take $R$ to be an equivalence relation induced by random sets of edges, such as the edges of a random matching, or the monochromatic edges in a sample from a distribution over colorings, or a random forest, as we did in § contracted onto partially triangulated grid graphs with similar treewidth, which suggests that understanding the connected $k$-partition sampling problem for partially triangulated grid graphs is an open problem with important implications for sampling connected partitions of state dual graphs.

Another means of producing connected 2-partitions is via min cuts, since min cuts are always connected. There are polynomial time algorithms for uniformly sampling min $s,t$-cuts of genus $g$ graphs given in [27]. On general graphs, one can also sample min-cuts in a way that is fixed parameter tractable in the size of the min-cut [20], but the running time of this algorithm is practical only for very small min-cut sizes.

We emphasize that even though these distributions can be efficiently sampled from, it is not clear how to characterize their properties in terms of interpretable features of districting plans.

As we discussed in §

6. Conclusions.

6.1. Broad overview of paper.
We motivated this paper by discussing attempts at characterizing outlier redistricting plans through ensemble methods (§

• The flip walk proposal distribution used in practice is likely not rapidly mixing (§ § §

• The complexity results (Theorem 2.43, Proposition 2.20 and Theorem 2.68) show that for many classes of graphs and distributions, there are likely to be no efficient replacements for the flip walk when it comes to sampling from certain prescribed distributions.

• Even if it were possible to sample from an explicitly designed distribution, that distribution may undergo phase changes in its qualitative behavior if the description is slightly modified (§ § may not be robust to small changes in the set up.

We now describe some directions that the computational redistricting community can go in to address these challenges.

There are local measurements of gerrymandering that do not require guarantees about sampling. One example [31, 80] is supported by a rigorous theory for a particular meaning of gerrymandering regarding "carefully crafted" plans. It remains an important question to investigate the extent to which the decisions we highlighted in § § §

Currently a consensus is developing that the notion of an extreme partisan outlier is robust between different sampling methods. For example, a handful of recent court decisions reference favorably the outcome of such outlier analysis [40, 79, 95]. The consensus asserts that different ensemble methods are measuring a consistent and interpretable feature of the political data because those methods used in practice seem to detect the same extreme partisan outliers.

The hypothesis that there is a consistent and robust notion of an extreme outlier, which we will call the "extreme outlier hypothesis" (EOH), is likely to be a rich source of challenges and questions about ensemble based redistricting. This hypothesis was partially explored in [36], where it was found that certain changes in the sampling algorithm had little effect on the tails of some chosen statistics, but that other changes caused certain tails to become exaggerated. However, the changes that exaggerated the tails resulted from interpreting different legal constraints for permissible maps; for example, section 3.4 of [36] shows some examples of how adherence to the Voting Rights Act can have dramatic impacts on the distribution of partisan scores.

It is easy to fabricate distributions where any specific plan appears to be an outlier, but such fabricated distributions may not reflect principles important in real world redistricting. To be useful, EOH must incorporate redistricting principles and not just mathematical abstractions. For example, in § § district's geometry, a more subtle change that may or may not have bearing on the EOH in practice. Of our experiments, the most challenging for the EOH is the significant difference between the UST-partition and MST-partition in the third row of Figure 4.8.

A reasonable formulation of EOH hypothesizes that it is implausibly difficult to fabricate distributions over districting plans that are defensible as a baseline for redistricting, even under scrutiny by adversarial experts, but which report different extreme outliers. To falsify EOH, it would suffice to find reasonable operationalizations of the same set of legal requirements into different sampling algorithms, which nonetheless report different extreme outliers. The problem of establishing precise guidelines for what constitutes a representative distribution over districting plans is understudied and critical to understanding the EOH.

A pragmatic resolution to the hard questions raised by the EOH may be to pick a handful of sampling algorithms that will be consistently recognized as a baseline. Some states already incorporate restrictions on the use of partisan data in the drawing of districting plans, and ensemble based methods could be one additional way for those states to operationalize these intentions. However, there is likely not a single collection of distributions that will suit every geography and political culture. Beyond purely mathematical analysis, one route to finding suitable distributions is through empirical analysis and successive refinements under real world conditions.

Although the procedure for determining which distributions to set as baselines is politically fraught, understanding the sampling algorithms to promote informed decisions is a hard scientific problem. In addition to characterizing the distributions generated by these algorithms, an analysis of algorithmic reliability, as in § § metamandering in § if the EOH fails, then giving someone the power to choose a baseline distribution creates the opportunity for subtler manipulation of voting outcomes.
Along with exploring the EOH, developing empirically motivated partition sampling algorithms and understanding their trade-offs is a key direction for future research in this area.

We summarize a handful of remaining questions about sampling connected partitions:

• What sort of ground truth models could be useful for assessing the accuracy of outlier methods? Of the districting plans historically or presently used, what proportion of them are flagged as outliers by suggested methodology? Does it correlate with other evidence for gerrymandering? Is the flagging consistent between methods?

• Can the intractability results be unified and strengthened? It seems unlikely that Theorem 2.25 is optimal.

• Can a general and practically useful sufficient condition for the existence of a bottleneck in the flip walk be extracted from the examples in §

• Besides treewidth, are there other graph parameters that make uniformly sampling from $P(G)$ tractable?

• All of our intractability results relied on reductions from Hamiltonian cycle, by proving that any algorithm sampling from certain distributions can be modified to put large mass on the longer simple cycles of a graph. However, the partitions that are of interest to redistricting tend to have relatively short boundaries on the order of $\Theta(\sqrt{|V|})$, rather than $\Theta(|V|)$. As mentioned in § $\sqrt{|V|})$ for the graphs that arise as state dual graphs (§

• Is it possible to uniformly sample from $P(L_n)$, where $L_n$ is the $n \times n$ grid graph from § $L_n$ by adding some diagonal edges as in Figure 4.7, or partially triangulated grid graphs [38].

• Are there families of graphs with unbounded treewidth where $\nu_\lambda$ sampling $P(G)$ is tractable?

• Statistical evidence, including repeating the tests in [64] as well as the flip pictures in § $\nu_\lambda$ Metropolis-Hastings weighted flip walk Markov chain on $P(L_n)$ mixes rapidly only at the critical value $\lambda = 1/\mu$. Is this true? What is the dependence on the population balance restriction?

• Which distributions over $P_k(G)$ can we efficiently sample from? Which of these distributions is robust to changes in the discretization?

• Recalling the motivation (§ §

• Although there are many plans, many of them are similar in shape. This may lead one to guess that there is a small collection of plans that are near to every other plan; i.e. that there is an epsilon net in the space of reasonable plans. For low dimensional shapes, it is reasonable to find an epsilon net, but for shapes of dimension $d$, the number of points needed to form an epsilon net grows roughly like $(1/\epsilon)^d$. It would be interesting to determine whether the space of reasonable plans is high or low dimensional from this point of view. The authors conjecture that this space will behave as if extremely high dimensional, but if there are ways to constrain it to be low dimensional, then the potential existence of a computable $\epsilon$-net opens another way to discuss typicality while being distribution agnostic, for example through the analysis of Pareto fronts between measurements.

Acknowledgements.

We want to thank the following people for their patience, enthusiasm, eagerness to share knowledge, insightful questions and helpful discussions: Hugo Akitaya, Eric Bach, Assaf Bar-Natan, Jin-Yi Cai, Sarah Cannon, Ed Chien, Sebastian Claici, Moon Duchin, Charlie Frogner, Jordan Ellenberg, Heng Guo, David Hayden, Pálvölgyi Dömötör Honlapja, Mamadou Moustapha Kanté, Fredrik Berg Kjolstad, Tianyu Liu, Aleksander Mądry, Elchanan Mossel, Marshall Mueller, David Palmer, Wes Pegden, Sebastien Roch, Mikhail Rudoy, Zach Schutzmann, Allan Sly, and Nike Sun.

We want to thank Jin-Yi Cai and Tianyu Liu for several in-depth discussions that helped to guide this investigation, for catching some mistakes in an earlier version of § §

4: Mary Barker, Daryl DeFord, Robert Dougherty-Bliss, Max Hully, Anthony Pizzimenti, Preston Ward.

Funding.

The first author was partially supported by the NSF RTG award DMS-1502553 and by U.S. National Science Foundation grants DMS-1107452, DMS-1107263, DMS-1107367. The authors acknowledge the generous support of NSF grant IIS-1838071 and the Prof. Amar G. Bose Research Grant. This work was partially completed at the Voting Rights Data Institute in the summer of 2018.

REFERENCES

[1] Replication code, https://github.com/LorenzoNajt/Code-For-Complexity-and-Geometry-of-Sampling-Connected-Graph-Partitions.
[2] Stack exchange answer, https://cstheory.stackexchange.com/a/41367/44995.
[3] Stack exchange answer, https://cstheory.stackexchange.com/a/42567/44995.
[4] Stack exchange answer, https://mathoverflow.net/a/313003/41873.
[5] Stack exchange answer, https://cstheory.stackexchange.com/a/43865/44995.
[6] Stack exchange answer and discussion, https://cstheory.stackexchange.com/a/41272/44995.
[7] Stack exchange comment, https://cstheory.stackexchange.com/q/41998/44995.
[8] Stack exchange comments, https://mathoverflow.net/q/316132/41873.
[9] Stack exchange question, https://cstheory.stackexchange.com/q/44338/44995.
[10] S. Aaronson, $P \stackrel{?}{=} NP$, in Open problems in mathematics, Springer, 2016, pp. 1–122.
[11] H. Abbott and D. Hanson, A lattice path problem, Ars Combinatoria, 6 (1978), pp. 163–178.
[12] H. A. Akitaya, M. D. Jones, M. Korman, C. Meierfrankenfeld, M. J. Munje, D. L. Souvaine, M. Thramann, and C. D. Tóth, Reconfiguration of connected graph partitions, arXiv preprint arXiv:1902.10765, (2019).
[13] M. Altman, Is automation the answer: The computational complexity of automated redistricting, Rutgers Computer and Law Technology Journal, 23 (1997).
[14] S. Arnborg, J. Lagergren, and D. Seese, Easy problems for tree-decomposable graphs, Journal of Algorithms, 12 (1991), pp. 308–340.
[15] S. Arora and B. Barak, Computational complexity: a modern approach, Cambridge University Press, 2009.
[16] S. Bangia, C. V. Graves, G. Herschlag, H. S. Kang, J. Luo, J. C. Mattingly, and R. Ravier, Redistricting: Drawing the Line, arXiv:1704.03360 [stat], (2017), http://arxiv.org/abs/1704.03360.
[17] A. Bar-Natan, L. Najt, and Z. Schutzman, The gerrymandering jumble: Map projections permute districts' compactness scores, arXiv preprint arXiv:1905.03173, (2019).
[18] R. Barnes and J. Solomon, Gerrymandering and compactness: Implementation flexibility and abuse, arXiv preprint arXiv:1803.02857, (2018).
[19] D. W. Barnette, On steinitz's theorem concerning convex 3-polytopes and on some properties of planar graphs, in The many facets of graph theory, Springer, 1969, pp. 27–40.
[20] P. Bergé, B. Mouscadet, A. Rimmel, and J. Tomasik, Fixed-parameter tractability of counting small minimum $(s, t)$-cuts, arXiv preprint arXiv:1907.02353, (2019).
[21] I. Bezáková, E. W. Chambers, and K. Fox, Integrating and sampling cuts in bounded treewidth graphs, in Advances in the Mathematical Sciences, Springer, 2016, pp. 401–415.
[22] H. L. Bodlaender and B. De Fluiter, Parallel algorithms for series parallel graphs, in European Symposium on Algorithms, Springer, 1996, pp. 277–289.
[23] V. Bouchitté, F. Mazoit, and I. Todinca, Treewidth of planar graphs: connections with duality, in Euroconference on Combinatorics, Graph Theory and Applications, vol. 10, 2001, pp. 34–38.
[24] M. Bousquet-Mélou, A. J. Guttmann, and I. Jensen, Self-avoiding walks crossing a square, Journal of Physics A: Mathematical and General, 38 (2005), p. 9159.
[25] U. C. Bureau, Tiger/line shapefiles
[26] S. Caldera, D. DeFord, M. Duchin, S. C. Gutekunst, and C. Nix, Mathematics of nested districts: The case of alaska, (2019), https://mggg.org/uploads/Alaska.pdf.
[27] E. W. Chambers, K. Fox, and A. Nayyeri, Counting and sampling minimum cuts in genus g graphs, Discrete & Computational Geometry, 52 (2014), pp. 450–475.
[28] G.-U. E. Charles et al., Amicus brief of mathematicians, law professors, and students in support of the appellees and affirmance, https://mggg.org/SCOTUS-MathBrief.pdf.
[29] J. Chen, Expert report of Jowei Chen, ph.d.
[30] J. Chen and J. Rodden, Unintentional Gerrymandering: Political Geography and Electoral Bias in Legislatures
[31] M. Chikina, A. Frieze, and W. Pegden, Assessing significance in a Markov chain without mixing
[32] W. Cho and Y. Liu, Toward a Talismanic Redistricting Tool: A Computational Method for Identifying Extreme Redistricting Plans, Election Law Journal: Rules, Politics, and Policy, 15 (2016).
[33] W. K. T. Cho and Y. Y. Liu, Sampling from complicated and unknown distributions: Monte Carlo and Markov Chain Monte Carlo methods for redistricting
[34] B. Courcelle and J. Engelfriet, Graph structure and monadic second-order logic: a language-theoretic approach, vol. 138, Cambridge University Press, 2012.
[35] M. Cygan, F. V. Fomin, Ł. Kowalik, D. Lokshtanov, D. Marx, M. Pilipczuk, M. Pilipczuk, and S. Saurabh, Parameterized algorithms, vol. 4, Springer, 2015.
[36] D. DeFord and M. Duchin, Redistricting reform in virginia: Districting criteria in context, Virginia Policy Review, 12(2) (2019), pp. 120–146.
[37] D. DeFord, H. Lavenant, Z. Schutzman, and J. Solomon, Total Variation Isoperimetric Profiles, arXiv:1809.07943 [cs, math], (2018), http://arxiv.org/abs/1809.07943.
[38] E. D. Demaine and M. Hajiaghayi, The bidimensionality theory and its algorithmic applications, The Computer Journal, 51 (2008), pp. 292–302.
[39] E. D. Demaine, M. Hajiaghayi, and D. M. Thilikos, The bidimensional theory of bounded-genus graphs, in International Symposium on Mathematical Foundations of Computer Science, Springer, 2004, pp. 191–203.
[40] T. S. C. O. P. M. District, League of women voters of pennsylvania v. the commonwealth of pennsylvania
[41] M. Duchin, D. DeFord, and J. Solomon, Recombination: A family of markov chains for redistricting (forthcoming).
[42] M. Duchin and B. E. Tenner, Discrete geometry for electoral geography, arXiv preprint arXiv:1808.05860, (2018).
[43] H. Duminil-Copin, G. Kozma, and A. Yadin, Supercritical self-avoiding walks are space-filling, in Annales de l'IHP Probabilités et statistiques, vol. 50, 2014, pp. 315–326.
[44] H. Duminil-Copin and S. Smirnov, The connective constant of the honeycomb lattice equals $\sqrt{2+\sqrt{2}}$, Annals of Mathematics, 175 (2012), pp. 1653–1665.
[45] M. Dyer and A. Frieze, On the complexity of partitioning graphs into connected subgraphs, Discrete Applied Mathematics, 10 (1985), pp. 139–153, http://linkinghub.elsevier.com/retrieve/pii/0166218X85900083.
[46] H.-D. Ebbinghaus, J. Flum, and W. Thomas, Mathematical logic, Springer Science & Business Media, 2013.
[47] J. Erickson, Planar graphs, http://jeffe.cs.illinois.edu/teaching/comptop/chapters/02-planar-graphs.pdf.
[48] M. Frick and M. Grohe, The complexity of first-order and monadic second-order logic revisited, Annals of pure and applied logic, 130 (2004), pp. 3–31.
[49] M. Garey, D. Johnson, and R. Tarjan, The Planar Hamiltonian Circuit Problem is NP-Complete, SIAM Journal on Computing, 5 (1976), pp. 704–714, https://epubs.siam.org/doi/10.1137/0205049.
[50] A. Gelman and C. Hennig, Beyond subjective and objective in statistics, Journal of the Royal Statistical Society: Series A (Statistics in Society), 180 (2017), pp. 967–1033.
[51] E. Grinbergs, On planar regular graphs degree three without hamiltonian cycles, (2009), https://arxiv.org/abs/arXiv:0908.2563.
[52] A. Große, J. Rothe, and G. Wechsung, Relating partial and complete solutions and the complexity of computing smallest solutions, in Italian Conference on Theoretical Computer Science, Springer, 2001, pp. 339–356.
[53] E. Győri, On division of graphs to connected subgraphs, North-Holland Publ. Comp, Amsterdam; Oxford; New York, 1978, pp. 485–494.
[54] G. Herschlag, H. S. Kang, J. Luo, C. V. Graves, S. Bangia, R. Ravier, and J. C. Mattingly, Quantifying Gerrymandering in North Carolina, (2018), http://arxiv.org/abs/1801.03783.
[55] G. Herschlag, R. Ravier, and J. C. Mattingly, Evaluating Partisan Gerrymandering in Wisconsin, arXiv:1709.01596 [physics, stat], (2017), http://arxiv.org/abs/1709.01596.
[56] T. R. Hoens, Counting and sampling paths in graphs, (2008).
[57] T. R. Hunter, The first gerrymander? Patrick Henry, James Madison, James Monroe, and Virginia's 1788 congressional districting, Early American Studies, (2011), pp. 781–820.
[58] R. Impagliazzo and A. Wigderson, P = BPP unless E has subexponential circuits: derandomizing the XOR lemma, in Proceedings of the 29th STOC, 1997, pp. 220–229.
[59] A. Itai, C. H. Papadimitriou, and J. L. Szwarcfiter, Hamilton paths in grid graphs, SIAM Journal on Computing, 11 (1982), pp. 676–686.
[60] T. Ito, X. Zhou, and T. Nishizeki, Partitioning a graph of bounded tree-width to connected subgraphs of almost uniform size, Journal of discrete algorithms, 4 (2006), pp. 142–154.
[61] I. Jensen, A parallel algorithm for the enumeration of self-avoiding polygons on the square lattice, Journal of Physics A: Mathematical and General, 36 (2003), p. 5731.
[62] I. Jensen, Improved lower bounds on the connective constants for two-dimensional self-avoiding walks, Journal of Physics A: Mathematical and General, 37 (2004), p. 11521.
[63] M. R. Jerrum, L. G. Valiant, and V. V. Vazirani, Random generation of combinatorial structures from a uniform distribution
[64] T. Kennedy, Monte carlo tests of stochastic loewner evolution predictions for the 2d self-avoiding walk, Physical review letters, 88 (2002), p. 130601.
[65] R. Kenyon, The asymptotic determinant of the discrete laplacian, Acta Mathematica, 185 (2000), pp. 239–286.
[66] S. Khuller and V. V. Vazirani, Planar graph coloring is not self-reducible, assuming $P \neq NP$, Theoretical Computer Science, 88 (1991), pp. 183–189.
[67] R. Kueng, D. G. Mixon, and S. Villar, Fair redistricting is hard, Theoretical Computer Science, (2019).
[68] D. Lapoire, Treewidth and duality for planar hypergraphs, (1996).
[69] G. F. Lawler, O. Schramm, and W. Werner, On the scaling limit of planar self-avoiding walk, arXiv preprint math/0204277, (2002).
[70] D. A. Levin, Y. Peres, and E. L. Wilmer, Markov Chains and Mixing Times, American Mathematical Soc., 2009.
[71] Y. Y. Liu, W. K. T. Cho, and S. Wang, PEAR: a massively parallel evolutionary computation approach for political redistricting optimization and analysis
[72] L. Lovasz, A homology theory for spanning trees of a graph, Acta Mathematica Hungarica, 30 (1977), pp. 241–251.
[73] N. Madras, Critical behaviour of self-avoiding walks: that cross a square, Journal of Physics A: Mathematical and General, 28 (1995), p. 1535.
[74] N. Madras and G. Slade, The self-avoiding walk, Springer Science & Business Media, 1996.
[75] D. B. Magleby and D. B. Mosesson, A New Approach for Developing Neutral Redistricting Plans, Political Analysis, 26 (2018), pp. 147–167.
[76] K. C. Martis, The original gerrymander, Political Geography, 8 (2008), pp. 833–839.
[77] J. Mattingly, Declaration of Jonathan Mattingly, Common Cause vs. Rucho, (2017), http://s10294.pcdn.co/wp-content/uploads/2016/05/Expert-Report-of-Jonathan-Mattingly.pdf.
[78] S. Montanari and P. Penna, On sampling simple paths in planar graphs according to their lengths, in International Symposium on Mathematical Foundations of Computer Science, Springer, 2015, pp. 493–504.
[79] T. M. D. of North Carolina, Common cause v. rucho
[80] W. Pegden, Pennsylvania's congressional districting is an outlier: Expert report
[81] A. Pönitz and P. Tittmann, Improved upper bounds for self-avoiding walks in $\mathbb{Z}^d$, Electron. J. Combin, 7 (2000).
[82] D. Randall and A. Sinclair, Self-testing algorithms for self-avoiding walks, Journal of Mathematical Physics, 41 (2000), pp. 1570–1584.
[83] J. M. Schmidt, Structure and constructions of 3-connected graphs, PhD thesis, 2011.
[84] V. Shoup, A computational introduction to number theory and algebra, Cambridge university press, 2009.
[85] A. J. Sinclair, Randomised algorithms for counting and generating combinatorial structures, (1988).
[86] A. D. Sokal, How to beat critical slowing-down: 1990 update, Nuclear Physics B-Proceedings Supplements, 20 (1991), pp. 55–67.
[87] A. D. Sokal, Monte carlo methods for the self-avoiding walk, arXiv preprint hep-lat/9405016, (1994).
[88] A. D. Sokal and L. E. Thomas, Absence of mass gap for a class of stochastic contour models, Journal of Statistical Physics, 51 (1988), pp. 907–947.
[89] A. D. Sokal and L. E. Thomas, Exponential convergence to equilibrium for a class of random-walk models, Journal of Statistical Physics, 54 (1989), pp. 797–828.
[90] H. Suzuki, N. Takahashi, and T. Nishizeki, A linear algorithm for bipartition of biconnected graphs, Information Processing Letters, 33 (1990), pp. 227–231.
[91] I. S. Vicente and L. Najt, Practical algorithms for counting and sampling simple cycles for graphs of bounded tree-width, (Forthcoming).
[92] J. A. Wald and C. J. Colbourn, Steiner trees, partial 2-trees, and minimum IFI networks, Networks, 13 (1983), pp. 159–167.
[93] A. Wigderson, The Complexity of the Hamiltonian Circuit Problem for Maximal Planar Graphs
[94] D. B. Wilson, Generating random spanning trees more quickly than the cover time, in STOC, vol. 96, Citeseer, 1996, pp. 296–303.
[95] U. S. D. C. F. T. W. D. O. Wisconsin, Whitford v. gill

Appendix A. Appendix for complexity results.

[Figure A.1: The 3 OR subdivision as it appears in Lemma 2.1. Labels in the figure: e, Pocket, Large Face, Adjacent Face.]

A.1. Verifying Lemma 2.1.

A.2. Proving that $R_d$ preserves CCP graphs.

By construction, $R_d(G)$ (Definition 2.38) remains cubic, and $R_d(G)$ is planar if $G$ is planar. The next few lemmas show that if $G$ is 3-connected, so is $R_d(G)$.

[Figure A.2: Constructions described in Lemma A.1. Labels in the figure: $l$, $v$, $G$, $A$, $A(G)$, $A'$, $w$.]

We will let $\tilde{R}_d$ be the graph obtained from $R_d$ by adding 3 leaf edges to each of $\{a, b, c\}$. The following lemma will show us that we can replace cubic vertices of $G$ with copies of $R_d$ and preserve 3-connectedness:

Lemma A.1. Suppose that $A$ is a graph with 3 leaf nodes, denoted by $L = \{l_1, l_2, l_3\}$. Let $A'$ be the graph obtained by identifying the 3 leaf nodes of $A$. Let $G$ be some graph with a cubic vertex $v \in V(G)$. Let $A(G)$ be a graph obtained from $G$ by replacing $v$ by $A$: that is, by deleting $v$ from $G$ and choosing some identification between the leaf nodes of $A$ and the neighbors of $v$. Then, if $G$ is 3-connected and $A'$ is 3-connected, $A(G)$ is 3-connected.

Proof. Suppose that $a, b, x, y \in V(A(G))$. We will show that there is a path in $A(G) \setminus \{a, b\}$ between $x$ and $y$. Let $B = V(A) \setminus L$, and let $C : V(A(G)) \to V(G)$ be the map that contracts $B$ back to $v$: for $s \notin B$, $C(s) = s$, and for $s \in B$, $C(s) = v$. Write $w$ for the node of $A'$ obtained by identifying the leaf nodes (see Figure A.2). There are two cases to consider:

1. If $C(x) \neq v$ and $C(y) \neq v$, then there is a path between $C(x)$ and $C(y)$ in $G \setminus \{C(a), C(b)\}$, since $G$ is 3-connected. This can be lifted to a path between $x$ and $y$ in $A(G) \setminus \{a, b\}$.
2. If $C(x) = v$, it is always possible to find a path in $A(G) \setminus \{a, b\}$ from $x$ to some $x'$ with $C(x') \neq v$. There are three cases:
(a) If $|L \cap \{a, b\}| = 0$: As $A'$ is 3-connected, there is a path in $A' \setminus (B \cap \{a, b\})$ from $x$ to $w$. This gives a path in $A(G)$ from $x$ to a node of $L$.
(b) If $|L \cap \{a, b\}| = 1$: At most one node of $\{a, b\}$ can be contained in $B = V(A) \setminus L$. Thus, $A' \setminus (B \cap \{a, b\})$ is 2-connected, so there are two paths in $A' \setminus (B \cap \{a, b\})$ from $x$ to $w$. These give paths in $A(G)$, which only intersect $\{a, b\}$ at $L$, and one of these paths connects to a node in $L \setminus \{a, b\}$.
(c) If $|L \cap \{a, b\}| = 2$: Since $A'$ is 3-connected, there are three node disjoint paths from $x$ to $w$. In $A(G)$, this corresponds to a path to each of the three leaf nodes, one of which is not contained in $\{a, b\}$.

Likewise, if $C(y) = v$, then we can connect $y$ to some $y'$, with $C(y') \neq v$. Once we have connected $x$ to $x'$ and $y$ to $y'$ outside of $A(v)$, we are back in Case 1.

The following lemma is well known; it is one of the Barnette-Grünbaum (BG) operations, introduced in [19, Proof of Theorem 2]. See also [83].

Lemma A.2 (BG-operation). Let $G$ be a 3-connected graph. Let $e_1$ and $e_2 \in E(G)$. Suppose that $G'$ is the graph obtained from $G$ by subdividing each $e_i$ by introducing a vertex $x_i$, and then adding an edge from $x_1$ to $x_2$. Then $G'$ is 3-connected.

Lemma A.3. If $G$ is 3-connected, then so is $R_d(G)$.

Proof. If we show that $(\tilde{R}_d)'$ (in the notation of Lemma A.1 and the paragraph preceding it) is 3-connected, then the claim follows from Lemma A.1 by considering $R_d(G)$ as obtained by replacing each node of $G$ by an $\tilde{R}_d$ one at a time in the sense given by Lemma A.1. To prove that $(\tilde{R}_d)'$ is 3-connected we argue by induction. In the base case, $(\tilde{R})'$ is a $K_4$ graph, so it is 3-connected. Let $Q_d$ be obtained from $(\tilde{R}_d)'$ by adding a single node $c$ in the center connected with three edges subdividing edges of the inner circle of $R_d$. If $(\tilde{R}_d)'$ is 3-connected, then it follows from applying the BG-operation of Lemma A.2 twice that $Q_d$ is also 3-connected. From this it follows that $(\tilde{R}_{d+1})'$ is 3-connected: $(\tilde{R}_{d+1})'$ is obtained by replacing $c$ with an $\tilde{R}$, so it is also 3-connected by Lemma A.1.

Remark A.4. Lemma A.1 and Lemma A.2 make it relatively straightforward to check that inserting the OR gadget preserves 3-connectedness, which is stated in [49] without proof.

Lemma A.5. Suppose that $H$ is a cubic planar graph, with face degree bounded by $d$. Then $R_d(H)$ has face degree bounded by $3d$.

Proof. For each face, each vertex along that face supplies two additional edges when we replace vertices with copies of $R_d$. Thus, the face degree multiplies by 3. The claim follows.

Altogether, we have shown that for all $d$, the construction $G \to R_d(G)$ sends $C_m$ into $C_{3m}$, where $C_m$ is as in Definition 2.23.

A.3. Duality for connected $k$-partitions.

In this section, we prove Theorem 2.52. There are three main steps in the proof, which answer three questions:

1. Can you recover a connected partition from its edge boundary?
2. What does the edge boundary of a partition of a plane graph look like in the dual graph?
3. In what way is the number of blocks of a connected partition reflected in its representation in the dual graph?

We state and prove the theorems that answer these questions in the next three subsections, and the end result is Theorem 2.52. The reader may note that 1) is answered by matroid duality between the graphic and cographic matroids (specifically between flats, which correspond to connected partitions, and unions of circuits), that the answer to 2) follows quickly from the usual bond-cycle duality, and that 3) is a discrete version of Alexander duality.

A.3.1. Connected partitions and edge cuts.

We first recall the bijection between connected partitions and edge-cuts:

Definition A.6 (Unordered connected partitions). Let $P(G)$ denote the set of unordered partitions of $V(G)$. Let $P_c(G) \subseteq P(G)$ denote the set of partitions such that each block induces a connected subgraph. That is, $P_c(G) = \bigcup_{k=1}^{|V(G)|} P_k(G)$.

Definition A.7 (Edge cut).

Definition A.8 (Cut sets). Let $\mathrm{Cuts}(G)$ be the set of the cuts of partitions of $V(G)$. That is, $\mathrm{Cuts}(G) = \{\mathrm{cut}(P) : P \in P(G)\}$. The elements of $\mathrm{Cuts}(G)$ are called cut sets.

Definition A.9 (Component map). Given $J \in \mathrm{Cuts}(G)$, define a partition $\mathrm{comp}(J) \in P_c(G)$ as the connected components of $G \setminus J$. This defines a function $\mathrm{comp} : \mathrm{Cuts}(G) \to P_c(G)$.

Proposition A.10. The functions comp and cut induce a bijection between $\mathrm{Cuts}(G)$ and $P_c(G)$.

Proof. To show that cut is surjective, we observe that if $P$ is any partition, we can define a connected partition $P'$, whose blocks are the connected components of the blocks of $P$, and $\mathrm{cut}(P) = \mathrm{cut}(P')$. We will conclude by showing that $\mathrm{comp} \circ \mathrm{cut} = \mathrm{id}$. First, observe that $\mathrm{comp} \circ \mathrm{cut}$ does not merge any blocks, since every path in $G$ between two blocks has to cross a cut edge. Second, observe that $\mathrm{comp} \circ \mathrm{cut}$ does not split any blocks, since two points in any block of a connected partition are always connected by a path that does not use any cut edges.

So far we have established that a connected partition is determined by the boundaries between its blocks. Next, we work towards characterizing the shapes that can arise as such boundaries, by treating them as subgraphs of the planar dual.

A.3.2. Dual connected partitions and connected partitions.
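Since the rest of this subsection manipulates cut sets directly, it may help to see the maps cut and comp of Proposition A.10 on a small example. The following is a minimal Python sketch, not code from the paper. It assumes that cut(P) denotes the set of edges whose endpoints lie in different blocks of P; the function names `cut` and `comp`, the edge-list representation, and the 4-cycle example are illustrative choices.

```python
def cut(edges, partition):
    """Edges whose endpoints lie in different blocks (assumed meaning of cut(P))."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    return frozenset(e for e in edges if block_of[e[0]] != block_of[e[1]])


def comp(vertices, edges, J):
    """Connected components of G minus J, as a frozenset of frozensets."""
    adj = {v: set() for v in vertices}
    for (u, v) in edges:
        if (u, v) not in J:
            adj[u].add(v)
            adj[v].add(u)
    seen, blocks = set(), set()
    for s in vertices:
        if s in seen:
            continue
        # Depth-first search from s collects one component.
        stack, block = [s], {s}
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in block:
                    block.add(v)
                    stack.append(v)
        seen |= block
        blocks.add(frozenset(block))
    return frozenset(blocks)


# Example graph: the 4-cycle 0-1-2-3-0.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (0, 3)]

# A connected partition, and a partition whose blocks are not connected.
connected = frozenset({frozenset({0, 1}), frozenset({2, 3})})
disconnected = frozenset({frozenset({0, 2}), frozenset({1, 3})})

# comp(cut(P)) returns P when P is connected, and otherwise returns the
# refinement P' of P into the connected components of its blocks.
refined = comp(V, E, cut(E, disconnected))
```

On this example, `comp(V, E, cut(E, connected))` recovers `connected`, while `refined` is the partition into singletons, which has the same cut as `disconnected`. This mirrors the two halves of the proof: passing to P' shows surjectivity, and comp after cut is the identity on connected partitions.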

The following straightforward lemma is useful for proving the duality theorem:

Lemma A.1. Let $G$ be a graph, and $J \subseteq E(G)$. Then each connected component of $G[J]$ is two edge-connected if and only if each connected component of $G[J]$ has no bridge edges if and only if $G[J]$ is a union of not-necessarily disjoint simple cycles.

(We wish to acknowledge a helpful MathOverflow discussion that drew our attention to this connection to matroids [8].)

Definition A.11 (Dual connected partitions). Let $\mathcal{E}(G)$ denote the set of subsets of edges of $G$ that are unions of not-necessarily disjoint simple cycles. We will call these the dual connected partitions.

The purpose of the next few propositions is to show that dual connected partitions are plane duals of the cuts of connected partitions. First we recall the bijection between the edges of a plane graph and the edges of its dual:

Definition A.12 (Dual edges). Let $G$ be a plane graph. For an edge $e \in E(G)$, let $e^*$ denote the edge in $G^*$ with the property that the two endpoints of $e^*$ are the shores of $e$, i.e., the two faces that are separated by $e$. For a set $J \subseteq E(G)$, denote by $J^*$ the corresponding set of edges in $G^*$. We define a function $D(J) = J^*$, which is a bijection $E(G) \to E(G^*)$.

We aim to prove a plane duality between $\mathrm{Cuts}(G)$ and $\mathcal{E}(G^*)$. In particular, we want to show that $D$ induces a bijection between $\mathrm{Cuts}(G)$ and $\mathcal{E}(G^*)$. Towards that, we will recall the plane duality between even subgraphs and the edge boundaries, which will be useful for controlling the topology of $D(J)$ for $J \in \mathrm{Cuts}(G)$.

Definition A.13 (Edge boundary). Let $G = (V, E)$ be a graph, and $A \subseteq V$. Denote by $\mathrm{cut}(A) = \mathrm{cut}(\{A, A^c\}) = \partial E(G)$, the edge boundary of $A$.

Definition A.14 (Even Subgraphs). Let $G = (V, E)$ be a graph. A subset $J \subseteq E$ defines an even subgraph $G[J]$ if the degree of each node of $G[J]$ is even. Let $\mathrm{Even}(G) = \{J \subseteq E(G) : G[J] \text{ is an even subgraph}\}$.

Proposition A.15 (Proposition 2.1 in [47]). Let $G$ be a connected plane graph, and let $H \subseteq E(G)$. Then, $H$ is an even subgraph if and only if $H^*$ is an edge boundary. Moreover, $H$ is a simple cycle if and only if $H^*$ is the cut of a connected 2-partition.

(Beware that the terminology of [47] is different from ours; specifically, [47] refers to what we call an edge boundary as an edge cut.)

The following well-known lemma will be useful for relating $\mathrm{Even}(G)$ to $\mathcal{E}(G)$:

Lemma A.16 (Euler). Let $G$ be a graph. Then $J \subseteq E(G)$ is an even subgraph if and only if $J$ is a union of pairwise edge-disjoint simple cycles.

The previous theorem characterized edge boundaries, which are the cut sets of not-necessarily connected 2-partitions, using the planar dual. The next proposition will characterize the cut sets of connected $k$-partitions using the planar dual.

Proposition A.17. Let $G$ be a connected plane graph. Let $H \subset E(G)$. Then $H \in \mathcal{E}(G)$ if and only if $H^*$ is a cut set. In particular, $D$ gives a bijection between $\mathcal{E}(G)$ and $\mathrm{Cuts}(G^*)$.

Proof. ($\Rightarrow$) Suppose that $H \in \mathcal{E}(G)$, and let $P \in P_c(G^*)$ be the connected partition defined by the connected components of $G^* \setminus H^*$. We show that $\mathrm{cut}(P) = H^*$. Let $e \in H$, and let $C \subseteq H$ be a cycle containing it. Then the shores of $e$ are necessarily in different components of $G^* \setminus C^*$ by Proposition A.15, and thus in different components of $G^* \setminus H^*$. Thus, $e^* \in \mathrm{cut}(P)$, so $H^* \subseteq \mathrm{cut}(P)$. On the other hand, if $e^* \in \mathrm{cut}(P)$, then the shores of $e$ are in different components of $G^* \setminus H^*$, so every path in $G^*$ between the two shores of $e$ must pass through $H^*$. In particular, the path $e^*$ must pass through $H^*$, which means that $e^* \in H^*$. Thus, $\mathrm{cut}(P) \subseteq H^*$.

($\Leftarrow$) Now suppose that $H^* = \mathrm{cut}(P)$ for some $P \in P_c(G^*)$. Take any $e \in H$. We want to show that $e$ is not a bridge edge. Suppose that $A$ and $B$ are the blocks of $P$ containing the faces in the two shores of $e$. Now, create a not necessarily connected 2-partition $P'$ by reassigning all of the blocks of $P$ that are not $A$ or $B$ to be part of block $A$. Now, $\mathrm{cut}(P') \subseteq \mathrm{cut}(P)$ and $e^* \in \mathrm{cut}(P')$. Since $\mathrm{cut}(P')^*$ is a union of edge disjoint cycles (Lemma A.16), there is a simple cycle $C$ in $\mathrm{cut}(P')^*$ that contains $e$. In particular, $\mathrm{cut}(P)^* = H \supseteq C$, so $e$ could not have been a bridge edge of $H$.

We summarize the previous two results in a single duality statement:

Proposition A.18 (Duality between connected partitions and dual connected partitions). The functions $\mathrm{comp} \circ D^{-1}$ and $D \circ \mathrm{cut}$ are mutual inverses, inducing a bijection between $P_c(G)$ and $\mathcal{E}(G^*)$.

Proof. Since $D$ induces a bijection between $\mathcal{E}(G^*)$ and $\mathrm{Cuts}(G)$ (Proposition A.17) and comp and cut give a bijection between $\mathrm{Cuts}(G)$ and $P_c(G)$ (Proposition A.10), the claim follows.

A.3.3. The number of blocks and the circuit rank.

Now we will review some facts that relate the number of blocks in a connected partition of a plane graph to the circuit rank of the corresponding dual connected partition.
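As a concrete companion to the quantities used in this subsection, the circuit rank of an edge-induced subgraph can be computed from the standard identity (circuit rank) = |E| - |V| + (number of connected components). The sketch below is illustrative Python, not code from the paper; the function name `circuit_rank` and the edge-list input format are assumptions, and components are counted with a small union-find.

```python
def circuit_rank(J):
    """Return (circuit rank, number of components) of the subgraph induced by edge list J."""
    vertices = {v for e in J for v in e}
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(vertices)
    for u, v in J:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    rank = len(J) - len(vertices) + components
    return rank, components


triangle = [(0, 1), (1, 2), (0, 2)]
# Two triangles glued along the edge (1, 2): a theta-shaped graph.
theta = [(0, 1), (1, 2), (0, 2), (1, 3), (2, 3)]
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
```

Drawn in the plane, a single simple cycle such as `triangle` has circuit rank 1 and separates the plane into 2 regions, while `theta` has circuit rank 2 and separates it into 3. This is the pattern that the results below make precise: circuit rank k - 1 on the dual side corresponds to k blocks.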

Definition A.19 ($h_1$ and $h_0$). Let $G$ be a graph. Then $h_1(G)$ denotes the circuit rank of $G$, and $h_0(G)$ denotes the number of connected components of $G$.

Proposition A.20. Let $G = (V, E)$ be a plane graph. Then $h_0(G) = |V(G)| - |E(G)| + |F(G)| - 1$.

Proof. One can add $h_0 - 1$ edges to connect the components of $G$ without changing the number of faces. If the resulting connected plane graph has $E'$ edges, then $E' = E + h_0 - 1$, and we have $V - E' + F = 2$, from which the formula follows.

Proposition A.21. If $G$ is a graph, then $h_1(G) - h_0(G) = |E(G)| - |V(G)|$.

Proof. Since $h_1$ is the cycle rank, which is the dimension of the kernel of the boundary map $\partial : \mathbb{F}^E \to \mathbb{F}^V$, and $h_0$ is the rank of the cokernel of $\partial$, this is just a statement of the rank-nullity theorem.

Proposition A.22. Let $G$ be a connected plane graph. For $P \in P_c(G)$ and $J \in \mathcal{E}(G)$, $h_1(G^*[\mathrm{cut}(P)^*]) = |P| - 1$ and $|\mathrm{comp}(J^*)| - 1 = h_1(G[J])$. Here $|P|$ counts the number of blocks of $P$.

Proof. Let $P \in P_c(G)$. Let $J = \mathrm{cut}(P)^*$. Proposition A.20 and Proposition A.21 together yield that $h_1(G^*[J]) = h_0 - V + E = |F(G^*[J])| - 1$. Since the number of faces of $G^*[J]$ is the number of components of $G \setminus J^*$, and since $\mathrm{comp}(J^*) = \mathrm{comp} \circ \mathrm{cut}(P) = P$, we obtain $h_1(G^*[\mathrm{cut}(P)^*]) = |P| - 1$. Now consider $J \in \mathcal{E}(G^*)$, and let $P = \mathrm{comp}(J^*)$. Since $J = \mathrm{cut}(P)^*$, it follows from $h_1(G^*[\mathrm{cut}(P)^*]) = |P| - 1$ that $h_1(G^*[J]) = |P| - 1$, so the claim follows from $(G^*)^* = G$.

Finally, we present the duality theorem:

Definition A.23 (Dual $k$-partition). We define $P^*_k(G) = \{J \in \mathcal{E}(G) : h_1(G[J]) = k - 1\}$. We call the elements of this set dual $k$-partitions.

Theorem A.24 (Duality between $P_k(G)$ and $P^*_k(G^*)$). The map $D \circ \mathrm{cut} : P_k(G) \to P^*_k(G^*)$ is a bijection, with $\mathrm{comp} \circ D^{-1} : P^*_k(G^*) \to P_k(G)$ as its inverse. Both are computable in polynomial time.

Proof. The bijection follows from Proposition A.22 and Proposition A.18. It is well known that $D$, comp, and cut can be computed in polynomial time.

Appendix B. Positive results.

B.1. Using marginal counts.

In this section, we prove the correctness of Algorithm 5.1.
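For orientation, the proof below analyzes a sampler that builds a random subset of $[n]$ one element at a time, including element $m+1$ with the conditional probability that $m+1 \in S$ given the choices already made. The following Python sketch shows that scheme under stated assumptions and is not a transcription of Algorithm 5.1: the oracle interface `oracle(W, m)`, the brute-force `make_oracle`, and the helper `output_distribution` are all hypothetical names introduced for illustration.

```python
import random
from fractions import Fraction


def make_oracle(p, n):
    """Brute-force oracle for a distribution p on subsets of {1, ..., n}.

    p maps frozensets to probabilities. The returned function gives
    P(m in S | S ∩ {1, ..., m-1} = W). This stands in for the oracle O.
    """
    def oracle(W, m):
        prefix = set(range(1, m))
        denom = sum(q for S, q in p.items() if S & prefix == W)
        numer = sum(q for S, q in p.items() if S & prefix == W and m in S)
        return numer / denom
    return oracle


def sample_subset(oracle, n, rng=random):
    """Draw one subset using n oracle calls, one per element."""
    J = frozenset()
    for m in range(1, n + 1):
        if rng.random() < oracle(J, m):
            J = J | {m}
    return J


def output_distribution(oracle, n):
    """Exact law of sample_subset, obtained by following every branch."""
    dist = {frozenset(): Fraction(1)}
    for m in range(1, n + 1):
        nxt = {}
        for W, q in dist.items():
            a = oracle(W, m)
            if a > 0:
                nxt[W | {m}] = nxt.get(W | {m}, 0) + q * a
            if a < 1:
                nxt[W] = nxt.get(W, 0) + q * (1 - a)
        dist = nxt
    return dist


# A small target distribution on subsets of {1, 2, 3}.
p = {
    frozenset(): Fraction(1, 10),
    frozenset({2}): Fraction(2, 10),
    frozenset({1, 3}): Fraction(3, 10),
    frozenset({1, 2, 3}): Fraction(4, 10),
}
```

Here `output_distribution(make_oracle(p, 3), 3)` reproduces `p` exactly, which is the statement of Equation (B.1) at $m = n$ for this toy input. The loop in `sample_subset` makes one oracle call per element, matching the $O(nT(n))$ runtime in the proposition.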

Proposition B.1. The output of Algorithm 5.1 is a random variable $J$ valued in $2^{[n]}$ and drawn with distribution $p$. Moreover, if a call to the oracle $O$ takes time $O(T(n))$, then the total runtime of Algorithm 5.1 is $O(nT(n))$.

Proof. Let $S$ be a random variable distributed according to $p$. Let $\mathbb{P}$ be the probability measure underlying the process of the algorithm and the random variable $S$. Let $J_k$ be the random set $J$ on the $k$th step of Algorithm 5.1. Using induction, we will show that, for all $m \in [n]$,

(B.1) $\mathbb{P}(J_m = W) = \mathbb{P}(S \cap [m] = W)$.

The desired conclusion is the case $m = n$. In the base case, when $m = 1$, Equation (B.1) holds because $\mathbb{P}(J_1 = \{1\}) = p(1 \mid \emptyset) = \mathbb{P}(1 \in S) = \mathbb{P}(S \cap \{1\} = \{1\})$. Now, suppose that for some $m \geq 1$ with $m < n$, it holds that $\mathbb{P}(J_m = W) = \mathbb{P}(S \cap [m] = W)$ for all $W \subseteq [m]$. Recall that from the definition of Algorithm 5.1 we have that $\mathbb{P}(J_{m+1} = W \cup \{m+1\} \mid J_m = W) = \mathbb{P}(m+1 \in S \mid S \cap [m] = W)$.

The inductive step now follows by a computation:

$$\begin{aligned}
\mathbb{P}(J_{m+1} = W \cup \{m+1\}) &= \mathbb{P}(J_{m+1} = W \cup \{m+1\} \mid J_m = W)\, \mathbb{P}(J_m = W) \\
&= \mathbb{P}(m+1 \in S \mid S \cap [m] = W)\, \mathbb{P}(S \cap [m] = W) \\
&= \mathbb{P}(S \cap [m+1] = W \cup \{m+1\}).
\end{aligned}$$

Likewise $\mathbb{P}(J_{m+1} = W) = \mathbb{P}(S \cap [m+1] = W)$.

B.2. Series-parallel graphs.

We recall the definition of a series-parallel graph, a class of graphs well suited to dynamic programming algorithms.

Definition

B.2 (Two-terminal graphs).

A two-terminal graph G is a graph with two distinguishednodes: a source, σ ( G ) , and a sink τ ( G ) . A pair of two-terminal graphs, G and H , are said to be iso-morphic, G ∼ = H , if there is an isomorphism of the underlying graphs which maps the source to the sourceand the sink to the sink. Example

B.3.

The complete graph on { , } is naturally a two-terminal graph, where we set σ ( K ) = 0 and τ ( K ) = 1 . We denote it by K Definition

B.4 (Series Composition).

Let G and G be two-terminal graphs. We deﬁne G ◦ G asthe graph obtained from the disjoint union of G and G by identifying τ and σ , and we make it into atwo-terminal graph by setting σ ( G ◦ G ) = σ ( G ) and τ ( G ◦ G ) = τ ( G ) . Definition

B.5 (Parallel Composition).

In the notation of Deﬁnition

B.4 , we deﬁne G (cid:107) G as thegraph obtained from the disjoint union of G and G by making the identiﬁcations σ ∼ σ and τ ∼ τ .We make G (cid:107) G into a two-terminal graph by deﬁning s ( G (cid:107) G ) = [ σ ( G )] = [ σ ( G )] and t ( G (cid:107) G ) =[ τ ( G )] = [ τ ( G )] . Definition

B.6 (Series-Parallel graphs).

We define the class of series-parallel graphs as the smallest class of two-terminal graphs that is closed under Parallel Composition and Series Composition, and which contains the two-terminal graph $K_2$.

The feature which makes series-parallel graphs convenient for dynamic programming is that we can record the series and parallel composition operations in a tree, called the SP-tree, around which we can organize dynamic programs:

Definition B.7 (SP-tree). An SP-tree is a rooted binary tree, where the children of any internal node have an ordering, and where each internal node of the tree is labelled P or S. We assign to each leaf a copy of the two-terminal graph $K_2$. Then, to each internal node we assign the graph obtained by applying either Series Composition or Parallel Composition to its children, depending on the label of that internal node; for series composition (labelled S), the composition follows the order on the children prescribed by the tree, and for parallel composition (labelled P) the order does not matter. If $T$ is an SP-tree, define $G(T)$ as the two-terminal graph assigned to the root of $T$. If $X = (G, \sigma, \tau)$ is a series-parallel graph, we say that an SP-tree $T$ is an SP-tree for $X$ if $X \cong G(T)$.

Lemma B.8 (Theorem 4.1 of [22]).

Given a graph $G$, determining if $G$ is series-parallel and, if so, building an SP-tree for it can be done in linear time.

B.3. Computing marginal probabilities on graphs of treewidth 2. In this section we give a polynomial time algorithm for counting simple cycles on graphs of treewidth 2. We also prove, as a byproduct of the method, that it is possible to efficiently sample from a much broader family of distributions than just the uniform one, namely those defined by Definition 5.8. For these computations, it will be convenient to extend the concept of a network to allow edge weights in other rings. These results are extended in [91] to graphs of bounded treewidth.
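The series and parallel compositions of Definitions B.4 and B.5, and the evaluation $G(T)$ of an SP-tree from Definition B.7, can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: a two-terminal graph is stored as a dictionary with keys `edges`, `source` and `sink`, an SP-tree is either the string `"K2"` or a tuple `(label, left, right)`, and the helper names (`k2`, `relabel`, `series`, `parallel`, `graph_of`) are hypothetical. It does not implement the linear-time recognition algorithm of Lemma B.8.

```python
from itertools import count

def k2(fresh):
    """Leaf: the two-terminal graph K_2 on two fresh vertex labels."""
    s, t = next(fresh), next(fresh)
    return {"edges": [(s, t)], "source": s, "sink": t}

def relabel(graph, mapping):
    """Rename vertices according to `mapping`; unmapped vertices keep their label."""
    r = lambda v: mapping.get(v, v)
    return {
        "edges": [(r(u), r(v)) for u, v in graph["edges"]],
        "source": r(graph["source"]),
        "sink": r(graph["sink"]),
    }

def series(g1, g2):
    """Series composition: identify the sink of g1 with the source of g2."""
    g2 = relabel(g2, {g2["source"]: g1["sink"]})
    return {"edges": g1["edges"] + g2["edges"],
            "source": g1["source"], "sink": g2["sink"]}

def parallel(g1, g2):
    """Parallel composition: identify the two sources and the two sinks."""
    g2 = relabel(g2, {g2["source"]: g1["source"], g2["sink"]: g1["sink"]})
    return {"edges": g1["edges"] + g2["edges"],
            "source": g1["source"], "sink": g1["sink"]}

def graph_of(tree, fresh=None):
    """Evaluate an SP-tree: "K2" is a leaf, (label, left, right) an internal node."""
    if fresh is None:
        fresh = count()  # globally fresh labels make every union disjoint
    if tree == "K2":
        return k2(fresh)
    label, left, right = tree
    g1, g2 = graph_of(left, fresh), graph_of(right, fresh)
    return series(g1, g2) if label == "S" else parallel(g1, g2)

# A path of two edges in parallel with a single edge is a triangle.
triangle = graph_of(("P", ("S", "K2", "K2"), "K2"))
vertices = {v for edge in triangle["edges"] for v in edge}
```

Edges are kept in a list rather than a set because parallel composition can create multi-edges, for instance in `graph_of(("P", "K2", "K2"))`.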

Definition B.9 ($R$-network). Let $G$ be a graph and $R$ a ring, and let $w : E(G) \to R$ be a function. Then we will call $(G, w)$ an $R$-network.

Let $\mathbb{Q}$ be the rationals. Let $(G, w)$ be a $\mathbb{Q}$-network with non-negative weights, and let $J, J' \subseteq E(G)$. For sampling with Algorithm 5.1 we would like to be able to compute the total $\nu_w$ (Definition 5.8) mass of the simple cycles containing $J$ and disjoint from $J'$. The approach here will be to encode that mass as the evaluation of a generating function (Definition B.10), which we can evaluate efficiently on series-parallel graphs by dynamic programming.

Definition B.10 (Simple cycle generating function).

Let $(G, w)$ be an $R$-network. Let
$$f_{SC}(G, w) := \sum_{C \in SC(G)} \prod_{e \in C} w(e)$$
denote the generating function of the simple cycles of $G$ evaluated at the weights $w$.

Let $(G, c)$ be a $\mathbb{Q}$-network. To sample from $\nu_c$ on $G$, the marginals we need in the course of Proposition B.1 are easily computed from
$$N_c(\{C \in SC(G) : C \cap J' = \emptyset,\ C \supseteq J\}) = \sum_{C \in SC(G),\ C \cap J' = \emptyset,\ C \supseteq J} \ \prod_{e \in C} c(e).$$
To obtain these measurements for any given $J, J'$, let $x$ be a formal variable, and set $w(e) = x\,c(e)$ for $e \in J$, $w(e') = 0$ for $e' \in J'$, and $w(f) = c(f)$ otherwise. Then the coefficient of the $x^{|J|}$ term of $f_{SC}(G, w)$ is $\sum_{C \in SC(G) :\ J \subseteq C,\ J' \cap C = \emptyset} \prod_{e \in C} c(e)$, which is the $N_c$ mass of all the simple cycles that are disjoint from $J'$ and that contain $J$. We next show that we can compute $f_{SC}(G, w)$ if $G$ is a series-parallel graph, via a dynamic programming algorithm which runs in time polynomial in $|G|$ and $|w|$. We also need to keep track of the corresponding generating function for the simple paths, which we define next.

Definition B.11 (Simple path generating function).

Let $(G, w)$ be a series-parallel $R$-network, with source $\sigma$ and sink $\tau$. Then we define
$$f_{SP}(G, w) = \sum_{\gamma \in SP_{\sigma,\tau}(G)} \prod_{e \in \gamma} w(e),$$
where $SP_{\sigma,\tau}(G)$ is the set of simple paths from $\sigma$ to $\tau$ in $G$, and where a path is a sequence of edges.

Lemma B.12. Let $(G_1, w_1)$ and $(G_2, w_2)$ be series-parallel $R$-networks. Let $w : E(G_1) \sqcup E(G_2) \to R$ be the unique weight function that restricts to $w_1$ and $w_2$. Then $w$ makes both $G_1 \circ G_2$ and $G_1 \parallel G_2$ into $R$-networks, using that both have edge set $E(G_1) \cup E(G_2)$. Then:

(B.2) $f_{SC}(G_1 \circ G_2, w) = f_{SC}(G_1, w_1) + f_{SC}(G_2, w_2)$
(B.3) $f_{SP}(G_1 \circ G_2, w) = f_{SP}(G_1, w_1)\, f_{SP}(G_2, w_2)$
(B.4) $f_{SC}(G_1 \parallel G_2, w) = f_{SC}(G_1, w_1) + f_{SC}(G_2, w_2) + f_{SP}(G_1, w_1)\, f_{SP}(G_2, w_2)$
(B.5) $f_{SP}(G_1 \parallel G_2, w) = f_{SP}(G_1, w_1) + f_{SP}(G_2, w_2)$

Proof.

In each case, the equality on generating functions will follow from a bijection of sets.
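Before the individual bijections, it may help to see the four identities of Lemma B.12 used as a recursion over an SP-tree. The sketch below is an illustration under our own assumptions: a leaf of the tree is simply the weight of its $K_2$ edge, an internal node is a tuple `(label, left, right)`, and the function name `evaluate` is hypothetical. The base case, in which $K_2$ has no simple cycle and one source-to-sink path, is spelled out by us rather than taken from the text.

```python
from fractions import Fraction

def evaluate(tree):
    """Return (f_SC, f_SP) for the series-parallel network described by `tree`.

    A leaf is the weight of a single K_2 edge; an internal node is a tuple
    (label, left, right) with label "S" (series) or "P" (parallel).
    """
    if not isinstance(tree, tuple):
        # K_2 has no simple cycle and exactly one source-to-sink path.
        return 0 * tree, tree
    label, left, right = tree
    sc1, sp1 = evaluate(left)
    sc2, sp2 = evaluate(right)
    if label == "S":
        return sc1 + sc2, sp1 * sp2            # (B.2) and (B.3)
    return sc1 + sc2 + sp1 * sp2, sp1 + sp2    # (B.4) and (B.5)

# Three internally disjoint two-edge paths from source to sink, unit weights:
# choosing two of the three paths gives 3 simple cycles, and there are 3 paths.
theta = ("P", ("P", ("S", 1, 1), ("S", 1, 1)), ("S", 1, 1))

# A triangle with rational edge weights 1/2, 1/3 (in series) and 2 (in parallel).
weighted_triangle = ("P", ("S", Fraction(1, 2), Fraction(1, 3)), Fraction(2))
```

The recursion only uses `+` and `*`, so the same function should work for weights in any ring that Python can add and multiply, including a polynomial type for the formal variable $x$ used above.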

Proof of Equation (B.2). $SC(G_1 \circ G_2) = SC(G_1) \cup SC(G_2)$.

Proof of Equation (B.3). $SP(G_1 \circ G_2) = \{\gamma_1 \circ \gamma_2 : \gamma_i \in SP(G_i)\}$, where $\circ$ between paths denotes concatenation.

Proof of Equation (B.4). $SC(G_1 \parallel G_2) = SC(G_1) \cup SC(G_2) \cup \{\mathrm{Set}(\gamma_1 \circ \mathrm{Reverse}(\gamma_2)) : \gamma_i \in SP(G_i)\}$, where the Reverse of a path is that path run backwards, and $\mathrm{Set}(\cdot)$ takes the sequence of edges and turns it into a subset of $E(G_1 \parallel G_2)$.

Proof of Equation (B.5). $SP(G_1 \parallel G_2) = SP(G_1) \cup SP(G_2)$.

Now we recall a key lemma that will allow us to reduce the treewidth 2 case to the series-parallel case, and prove the theorem about efficiently evaluating $f_{SC}$:

Lemma B.13 ([92] Theorem 4.1). If $H$ is a graph of treewidth $\le 2$, then there is a series-parallel graph $G$ and an embedding $i : H \to G$. Both $G$ and $i$ are computable from $H$ in linear time.

Lemma B.14.

Suppose $(H, w)$ is a $\mathbb{Q}[x]$-network with treewidth $\le 2$. Then $f_{SC}(H, w)$ can be computed in time polynomial in $|(H, w)|$. In particular, if $(H, c)$ is a $\mathbb{Q}$-network, then $N_c(\{C \in SC(H) : C \cap J' = \emptyset,\ C \supseteq J\})$ can be computed in time polynomial in $|(H, c)|$.

Proof. By Lemma B.13, there is a series-parallel graph $G$ which contains $H$ as a subgraph, and moreover $G$ and $i : H \to G$ can be constructed in linear time. Extend $w$ to all the edges of $G$ by defining $w(e) = 0$ for $e \in E(G) \setminus E(H)$. Then we have that $f_{SC}(G, w) = f_{SC}(H, w)$, since all the cycles of $G$ that are not cycles of $H$ have weight zero. We let $T_G$ be a binary SP-tree of $G$, which has $O(|G|)$ nodes. Using Lemma B.12, we compute $f_{SC}$ and $f_{SP}$ at each node. The cost of the calculation at each node is bounded by the cost of multiplying and adding the corresponding generating functions for the node's children. Each generating function has degree at most $O(|G| \max_{e \in E} \deg(w(e)))$, and has coefficients whose binary encoding has length polynomial in $|(G, w)|$. Thus, the first claim follows. The second claim follows because if we set $w(e) = c(e)\, x\, 1_{e \in J} + c(e)\, 1_{e \notin J' \cup J}$, then the coefficient of the $x^{|J|}$ term of $f_{SC}(G, w)$ is $N_c(\{C \in SC(H) : C \cap J' = \emptyset,\ C \supseteq J\})$.

B.4. Dynamic program for counting balanced partitions on series-parallel graphs. In this section we show how to set up a dynamic program that will count the number of balanced connected 2-partitions of a series-parallel graph. We will be interested in the case where nodes have weights valued in $\mathbb{N} = \{0, 1, 2, \ldots\}$, but it will be convenient to extend these weights to take values in a larger monoid, $\overline{\mathbb{N}}$, which will also keep track of when a set of nodes is non-empty.

Definition B.15 (The monoid $\overline{\mathbb{N}}$). Let $E$ be the commutative monoid given by $E = \{\{\emptyset, \neg\emptyset\}, \cup, \emptyset\}$, where $\emptyset$ is the identity element and $\neg\emptyset \cup \neg\emptyset = \neg\emptyset$. Let $\mathbb{N}$ be the monoid of natural numbers with addition. Let $\overline{\mathbb{N}} = \mathbb{N} \times E$, and by abuse of notation we let $0 \in \overline{\mathbb{N}}$ denote the additive identity, and $+$ the binary operation in $\overline{\mathbb{N}}$. Let $n : \overline{\mathbb{N}} \to \mathbb{N}$ and $e : \overline{\mathbb{N}} \to E$ be the natural projections.

Definition B.16 ((Admissible) node-weighted graphs). $(G, w)$ will denote a graph $G = (V, E)$ along with a function $w : V(G) \to \overline{\mathbb{N}}$. If $e(w(v)) = \neg\emptyset$ for all $v \in V$, then we call $(G, w)$ admissible. For any $A \subseteq V(G)$, define $w(A) = \sum_{a \in A} w(a)$.

Definition B.17 (The DP-table $X(G, w)$).

For a series-parallel graph $G$ with source $\sigma$ and sink $\tau$, and weight function $w : V(G) \to \overline{\mathbb{N}}$, we imitate [45] and define:

$X(G, w) = \{(a_1, a_2, a_3, m) \in \overline{\mathbb{N}}^3 \times \mathbb{N}$ | there are exactly $m$ partitions of $V(G)$ into blocks $V_1, V_2, V_3$ such that:
(i) $w(V_i) = a_i$ and $G[V_i]$ is connected for $i = 1, 2, 3$,
(ii) $\sigma \in V_1$,
(iii) $a_3 = 0$ implies that $\tau(G) \in V_1$, and $a_3 \neq 0$ implies that $\tau(G) \in V_3$ $\}$.

That is, $X(G, w)$ is a function $\overline{\mathbb{N}}^3 \to \mathbb{N}$ that counts the number of connected 3-partitions of $G$ with blocks of given weights. If we know $X(G, w)$, we will be able to calculate $|P(G, w)|$ (Theorem B.23).

Definition B.18 (Series and parallel composition for weighted SP graphs).

Let $(G_1, w_1)$ and $(G_2, w_2)$ be series-parallel graphs with $\overline{\mathbb{N}}$-valued weights on the nodes. Define weights $w$ on the nodes of $G = G_1 \circ G_2$ and $G = G_1 \parallel G_2$ by first thinking of $w_1$ and $w_2$ as functions on $V(G)$, through extending them to the other nodes by assigning them the value $(0, \neg\emptyset)$, and then setting $w = w_1 + w_2$.

Definition B.19 (Naming conventions for series-parallel compositions).

In the case of $G_1 \circ G_2$, we let $\epsilon$ denote the node $\tau_1 = \sigma_2$, and we set $\sigma = \sigma_1$ and $\tau = \tau_2$. In the case of $G_1 \parallel G_2$, $\sigma = \sigma_1 = \sigma_2$ and $\tau = \tau_1 = \tau_2$.

Let $(G, w)$ be a node-weighted series-parallel graph. The next several propositions will show that we can compute $X(G, w)$ by a dynamic programming algorithm on a binary SP-tree of $G$. Specifically, we will show how to compute $X((G_1, w_1) \circ (G_2, w_2))$ and $X((G_1, w_1) \parallel (G_2, w_2))$ from $X(G_1, w_1)$ and $X(G_2, w_2)$, using algorithms that are polynomial time in $|G|$ and pseudopolynomial time in the total weights. To prove these algorithms correct, we will compare the dynamic calculation of $X(G, w)$ with a dynamic enumeration of the partitions in question, using the following definition:

Definition B.20 (The enumeration version of $X(G, w)$).

$\widetilde{X}(G) = \{(V_1, V_2, V_3) \in P(V(G))^3 : (V_1, V_2, V_3)$ form a partition of $V(G)$ such that
(i) $G[V_i]$ is connected for $i = 1, 2, 3$,
(ii) $\sigma \in V_1$,
(iii) $V_3 = \emptyset$ implies that $\tau(G) \in V_1$, and $V_3 \neq \emptyset$ implies that $\tau(G) \in V_3$ $\}$.

In the proof of correctness for the dynamic program for evaluating $X((G, w))$, we will explain how to compute $\widetilde{X}(G_1 \circ G_2)$ from $\widetilde{X}(G_1)$ and $\widetilde{X}(G_2)$ by merging the blocks sharing the node $\epsilon$, and accepting the output when it results in an element of $\widetilde{X}(G_1 \circ G_2)$.

Algorithm B.1 SeriesPartitions
Input: $X((G_1, w_1))$ and $X((G_2, w_2))$ for $(G_1, w_1)$ and $(G_2, w_2)$.
Output: $X((G_1, w_1) \circ (G_2, w_2))$.
  Set $f$ as constantly zero on $\{(a_1, a_2, a_3) \in \overline{\mathbb{N}}^3 : 0 \le a_i,\ \sum a_i = w((G_1, w_1) \circ (G_2, w_2))\}$.
  for $(a_1, a_2, a_3, m_1) \in X(G_1, w_1)$ and $(b_1, b_2, b_3, m_2) \in X(G_2, w_2)$ do
    if ($a_3 = 0$ and $b_3 = 0$) and ($a_2 = 0$ or $b_2 = 0$) then
      $f(a_1 + b_1,\ a_2 + b_2,\ 0)$ += $m_1 m_2$
    if ($a_3 = 0$ and $b_3 \neq 0$) and ($a_2 = 0$ or $b_2 = 0$) then
      $f(a_1 + b_1,\ a_2 + b_2,\ b_3)$ += $m_1 m_2$
    if ($a_3 \neq 0$ and $b_3 = 0$) and ($a_2 = 0$ or $b_2 = 0$) then
      $f(a_1,\ a_2 + b_2,\ a_3 + b_1)$ += $m_1 m_2$
    if $a_3 \neq 0$ and $b_3 \neq 0$ and ($a_2 = 0$ and $b_2 = 0$) then
      $f(a_1,\ a_3 + b_1,\ b_3)$ += $m_1 m_2$
  Return $f$

Figure B.1: The four cases in Algorithm B.1: (a) Case 1, (b) Case 2, (c) Case 3, (d) Case 4.

Proposition B.21. If $(G, w)$ is admissible, then Algorithm B.1 runs correctly and in polynomial time in $(|G|, n(w(G)))$.

Proof.

We need to check that each element of $P(G)$ is counted exactly once. To verify this, we explain another algorithm, Algorithm B.2, that computes $\widetilde{X}(G_1 \circ G_2)$ from $\widetilde{X}(G_1)$ and $\widetilde{X}(G_2)$, but in exponential time. We will verify that in the course of this algorithm each element of $\widetilde{X}(G_1 \circ G_2)$ is computed exactly once. Finally, we will explain that the correctness of Algorithm B.1 can be seen by coupling it with an accelerated version of Algorithm B.2, and we compute the time it takes for Algorithm B.1 to run. This is pseudopolynomial in the total weight.

Algorithm B.2 SetLevelSeriesPartitions
Input: series-parallel graphs $G_1$ and $G_2$ along with sets $\widetilde{X}(G_1)$ and $\widetilde{X}(G_2)$.
Output: $\widetilde{X}(G_1 \circ G_2)$.
  Initialize $F$ as the zero function on $P(V(G))^3$, where $P$ denotes the powerset.
  for $(X_1, X_2, X_3) \in \widetilde{X}(G_1)$ and $(Y_1, Y_2, Y_3) \in \widetilde{X}(G_2)$ do
    if $X_3 = \emptyset$ AND $Y_3 = \emptyset$ AND ($X_2 = \emptyset$ OR $Y_2 = \emptyset$) then
      $F(X_1 \cup Y_1,\ X_2 \cup Y_2,\ \emptyset)$ += 1
    if $X_3 = \emptyset$ AND $Y_3 \neq \emptyset$ AND ($X_2 = \emptyset$ OR $Y_2 = \emptyset$) then
      $F(X_1 \cup Y_1,\ X_2 \cup Y_2,\ Y_3)$ += 1
    if $X_3 \neq \emptyset$ AND $Y_3 = \emptyset$ AND ($X_2 = \emptyset$ OR $Y_2 = \emptyset$) then
      $F(X_1,\ X_2 \cup Y_2,\ X_3 \cup Y_1)$ += 1
    if $X_3 \neq \emptyset$ AND $Y_3 \neq \emptyset$ AND $X_2 = \emptyset$ AND $Y_2 = \emptyset$ then
      $F(X_1,\ X_3 \cup Y_1,\ Y_3)$ += 1
  Return $F$

Our first goal is to verify that the function $F$ returned by Algorithm B.2 is the indicator of $\widetilde{X}(G)$ in $P(V(G))^3$. First, it is straightforward to check that $\mathrm{supp}(F) \subseteq \widetilde{X}(G)$, by examining each case. Let $(Z_1, Z_2, Z_3) \in \widetilde{X}(G)$. There are four cases based on how $Z$ separates $(\sigma, \epsilon, \tau)$. The cases are:
1. There is a block containing $\sigma$, $\epsilon$ and $\tau$.
2. There is a block containing $\sigma$ and $\epsilon$, and a distinct block containing $\tau$.
3. There is a block containing $\sigma$, and a distinct block containing $\epsilon$ and $\tau$.
4. $\sigma$, $\epsilon$ and $\tau$ are each in a different block.

This accounts for 4 of the 5 equivalence relations on $\{\sigma, \epsilon, \tau\}$. The missing equivalence relation on $\{\sigma, \epsilon, \tau\}$ would put $\sigma$ and $\tau$ in a block, and $\epsilon$ in a different block. This case cannot occur due to the requirement that the blocks are connected. One can now go through the four cases, and observe that for each $Z = (Z_1, Z_2, Z_3)$, $Z$ can be produced in exactly one of the four cases in the algorithm, because each step is distinguished by how the partitions it produces separate $\{\sigma, \epsilon, \tau\}$. Moreover, because within each case the $X_i$ and $Y_i$ can be recovered by taking appropriate intersections of the $Z_i$ with $G_1$ and $G_2$, there is a unique pair of partitions of $G_1$ and $G_2$ that produce each $(Z_1, Z_2, Z_3)$. Thus, $F(Z_1, Z_2, Z_3) = 1$.
We now go through the cases in more detail:
• The case $X_3 = \emptyset$ and $Y_3 = \emptyset$, equivalently, $\sigma, \tau \in Z_1$. See Figure B.1(a). Recovery: $X_1 = Z_1 \cap G_1$, $X_2 = Z_2 \cap G_1$, $X_3 = \emptyset$, and $Y_1 = Z_1 \cap G_2$, $Y_2 = Z_2 \cap G_2$, $Y_3 = \emptyset$.
• The case $X_3 = \emptyset$, $Y_3 \neq \emptyset$, equivalently, $\sigma, \epsilon \in Z_1$, $\tau \in Z_3$. See Figure B.1(b). Recovery: $X_1 = Z_1 \cap G_1$, $X_2 = Z_2 \cap G_1$, $X_3 = \emptyset$, and $Y_1 = Z_1 \cap G_2$, $Y_2 = Z_2 \cap G_2$, $Y_3 = Z_3$.
• The case $X_3 \neq \emptyset$, $Y_3 = \emptyset$, equivalently, $\sigma \in Z_1$, $\epsilon, \tau \in Z_3$. See Figure B.1(c). Recovery: $X_1 = Z_1 \cap G_1$, $X_2 = Z_2 \cap G_1$, $X_3 = Z_3 \cap G_1$, and $Y_1 = Z_3 \cap G_2$, $Y_2 = Z_2 \cap G_2$, $Y_3 = \emptyset$.
• The case $X_3 \neq \emptyset$, $Y_3 \neq \emptyset$, equivalently, $\sigma$, $\epsilon$, $\tau$ are each in a different block. See Figure B.1(d). Recovery: $X_1 = Z_1 \cap G_1$, $X_2 = \emptyset$, $X_3 = Z_2 \cap G_1$, and $Y_1 = Z_2 \cap G_2$, $Y_2 = \emptyset$, $Y_3 = Z_3 \cap G_2$.

Finally, we examine the relationship between Algorithm B.2 and Algorithm B.1. In particular, we will couple them together by sorting the elements of $\widetilde{X}((G_1, w_1))$ so that those with the same sequence of weights appear together, and the same for $\widetilde{X}((G_2, w_2))$. The weights of the two 3-partitions we are merging uniquely determine which of the four cases Algorithm B.2 is in since, by construction of the $E$ coordinate of every node weight, a set is empty iff its weight is zero. The weights of the two merging partitions also determine the weights of the resulting 3-partition, say $(c_1, c_2, c_3)$. Thus, while the set-level Algorithm B.2 putters through $\Theta(m_1 m_2)$ set-operations and $m_1 m_2$ updates to a function, Algorithm B.1 computes $m_1 m_2$ and adds that to the value of $f$ over some efficiently computable tuple $(c_1, c_2, c_3)$.

Finally, after having verified that Algorithm B.1 has the correct output, we remark that it consists of a single loop over the product of two sets, which has size bounded by $O(w(G_1)\, w(G_2))$, since the number of elements in $X(G)$ is in general bounded by $w(G)$. Within each loop, each $m_i \le |V(G)|$, so the cost of multiplications and additions is polynomial in $|G|$.
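The four cases of Algorithm B.1 translate almost line for line into code. The sketch below is a minimal illustration under assumptions of our own: an element of $\mathbb{N} \times E$ is a pair `(weight, nonempty)`, a DP-table is a dictionary from triples $(a_1, a_2, a_3)$ to counts, and the names `ZERO`, `add`, `leaf_table` and `series_partitions` are hypothetical. The table for a single $K_2$ leaf is derived by us directly from Definition B.17 and is not stated in the text.

```python
from collections import defaultdict

ZERO = (0, False)  # additive identity of N x E: weight 0, flag "empty"

def add(a, b):
    """Addition in N x E: add the weights, take the union of the flags."""
    return (a[0] + b[0], a[1] or b[1])

def leaf_table(w_source, w_sink):
    """X(K_2, w): both nodes in V1, or the source in V1 and the sink in V3."""
    return {(add(w_source, w_sink), ZERO, ZERO): 1,
            (w_source, ZERO, w_sink): 1}

def series_partitions(x1, x2):
    """Merge the tables of G1 and G2 into the table of G1 composed in series with G2."""
    f = defaultdict(int)
    for (a1, a2, a3), m1 in x1.items():
        for (b1, b2, b3), m2 in x2.items():
            one_middle = a2 == ZERO or b2 == ZERO
            if a3 == ZERO and b3 == ZERO and one_middle:        # case 1
                f[(add(a1, b1), add(a2, b2), ZERO)] += m1 * m2
            elif a3 == ZERO and b3 != ZERO and one_middle:      # case 2
                f[(add(a1, b1), add(a2, b2), b3)] += m1 * m2
            elif a3 != ZERO and b3 == ZERO and one_middle:      # case 3
                f[(a1, add(a2, b2), add(a3, b1))] += m1 * m2
            elif a3 != ZERO and b3 != ZERO and a2 == ZERO and b2 == ZERO:  # case 4
                f[(a1, add(a3, b1), b3)] += m1 * m2
    return dict(f)

# Path on four unit-weight nodes, built as ((K2 o K2) o K2). The shared node
# gets its weight from the left factor and (0, nonempty) from the right factor.
one, shared = (1, True), (0, True)
p2 = leaf_table(one, one)
p3 = series_partitions(p2, leaf_table(shared, one))
p4 = series_partitions(p3, leaf_table(shared, one))

half = (2, True)
balanced = p4.get((half, half, ZERO), 0) + p4.get((half, ZERO, half), 0)
```

On this path the table `p4` has one entry per way of cutting the path into at most three consecutive segments, and `balanced` counts the single split into two halves of weight 2.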
This concludes the proof.

We next explain how to compute $X((G_1, w_1) \parallel (G_2, w_2))$ from $X((G_1, w_1))$ and $X((G_2, w_2))$.

Algorithm B.3 ParallelPartitions

Input: $X((G_1, w_1))$ and $X((G_2, w_2))$ for $(G_1, w_1)$ and $(G_2, w_2)$.
Output: $X((G_1, w_1) \parallel (G_2, w_2))$.
  Set $f$ to be constantly zero on $\{(a_1, a_2, a_3) : 0 \le a_i,\ \sum a_i = w((G_1, w_1) \parallel (G_2, w_2))\}$.
  for $(a_1, a_2, a_3, m_1) \in X(G_1, w_1)$ and $(b_1, b_2, b_3, m_2) \in X(G_2, w_2)$ do
    if $a_3 = 0$ AND $b_3 = 0$ AND ($a_2 = 0$ OR $b_2 = 0$) then
      $f(a_1 + b_1,\ a_2 + b_2,\ 0)$ += $m_1 m_2$
    if $a_3 = 0$ AND $b_3 \neq 0$ AND ($a_2 = 0$ OR $b_2 = 0$) then
      $f(a_1 + b_1 + b_3,\ a_2 + b_2,\ 0)$ += $m_1 m_2$
    if $a_3 \neq 0$ AND $b_3 = 0$ AND ($a_2 = 0$ OR $b_2 = 0$) then
      $f(a_1 + b_1 + a_3,\ a_2 + b_2,\ 0)$ += $m_1 m_2$
    if $a_3 \neq 0$ AND $b_3 \neq 0$ AND ($a_2 = 0$ OR $b_2 = 0$) then
      $f(a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3)$ += $m_1 m_2$
  Return $f$

Figure B.2: The four cases in Algorithm B.3: (a) Case 1, (b) Case 2, (c) Case 3, (d) Case 4.

Proposition B.22. If $(G, w)$ is admissible, then Algorithm B.3 runs correctly and in time polynomial in $(|G|, n(w(G)))$.

Proof. This proof follows the same structure as Proposition B.21.

Algorithm B.4 SetLevelParallelPartitions
Input: series-parallel graphs $G_1$ and $G_2$ along with $\widetilde{X}(G_1)$ and $\widetilde{X}(G_2)$.
Output: $\widetilde{X}(G_1 \parallel G_2)$. (Convention to align with the pictures: $G_1$ will be the top SP-graph, whose partitions are denoted with $X_i$.)
  Initialize $F$ as the zero function on $P(V(G))^3$.
  for $(X_1, X_2, X_3) \in \widetilde{X}(G_1)$ and $(Y_1, Y_2, Y_3) \in \widetilde{X}(G_2)$ do
    if $X_3 = \emptyset$ and $Y_3 = \emptyset$ AND ($X_2 = \emptyset$ or $Y_2 = \emptyset$) then
      $F(X_1 \cup Y_1,\ X_2 \cup Y_2,\ \emptyset)$ += 1
    if $X_3 = \emptyset$ and $Y_3 \neq \emptyset$ AND ($X_2 = \emptyset$ or $Y_2 = \emptyset$) then
      $F(X_1 \cup Y_1 \cup Y_3,\ X_2 \cup Y_2,\ \emptyset)$ += 1
    if $X_3 \neq \emptyset$ and $Y_3 = \emptyset$ AND ($X_2 = \emptyset$ or $Y_2 = \emptyset$) then
      $F(X_1 \cup X_3 \cup Y_1,\ X_2 \cup Y_2,\ \emptyset)$ += 1
    if $X_3 \neq \emptyset$ and $Y_3 \neq \emptyset$ AND ($X_2 = \emptyset$ or $Y_2 = \emptyset$) then
      $F(X_1 \cup Y_1,\ X_2 \cup Y_2,\ X_3 \cup Y_3)$ += 1
  Return $F$

We verify that the function $F$ returned by Algorithm B.4 is the indicator of $\widetilde{X}(G)$ in $P(V(G))^3$. First, it is straightforward to check that $\mathrm{supp}(F) \subseteq \widetilde{X}(G)$. Next, let $(Z_1, Z_2, Z_3) \in \widetilde{X}(G)$. There are four cases based on how $Z_1$ connects $\sigma$ and $\tau$:
1. $Z_1$ has a path through $G_1$ and a path through $G_2$ from $\sigma$ to $\tau$.
2. $Z_1$ has a path through only $G_1$ from $\sigma$ to $\tau$.
3. $Z_1$ has a path through only $G_2$ from $\sigma$ to $\tau$.
4. No paths, that is: $\sigma \in Z_1$ and $\tau \in Z_3$.

One can now observe that for each $Z = (Z_1, Z_2, Z_3)$, $Z$ can be produced in exactly one of the four cases in Algorithm B.4, because each case is distinguished by what kind of paths in $Z_1$ there are from $\sigma$ to $\tau$. Moreover, because the $X_i$ and $Y_i$ can be recovered by taking appropriate intersections of the $Z_i$ with $G_1$ and $G_2$, there is a unique pair of partitions of $G_1$ and $G_2$ that produce each $(Z_1, Z_2, Z_3)$. Thus, $F(Z_1, Z_2, Z_3) = 1$. We now list the cases of the algorithm in more detail:
• The case $X_3 = \emptyset$ and $Y_3 = \emptyset$, equivalently, $Z_1$ has a path through $G_1$ and a path through $G_2$ from $\sigma$ to $\tau$. See Figure B.2(a). Recovery: $X_1 = Z_1 \cap G_1$, $X_2 = Z_2 \cap G_1$, $X_3 = \emptyset$, and $Y_1 = Z_1 \cap G_2$, $Y_2 = Z_2 \cap G_2$, $Y_3 = \emptyset$.
• The case $X_3 = \emptyset$ and $Y_3 \neq \emptyset$, equivalently, $Z_1$ has a path through only $G_1$ from $\sigma$ to $\tau$. See Figure B.2(b). Recovery: $X_1 = Z_1 \cap G_1$, $X_2 = Z_2 \cap G_1$, $X_3 = \emptyset$, and $Y_1$ is the component of $Z_1 \cap G_2$ containing $\sigma$, $Y_2 = Z_2 \cap G_2$, and $Y_3$ is the component of $Z_1 \cap G_2$ containing $\tau$.
• The case $X_3 \neq \emptyset$ and $Y_3 = \emptyset$, equivalently, $Z_1$ has a path through only $G_2$ from $\sigma$ to $\tau$. See Figure B.2(c). Recovery: $X_1$ is the component of $Z_1 \cap G_1$ containing $\sigma$, $X_2 = Z_2 \cap G_1$, $X_3$ is the component of $Z_1 \cap G_1$ containing $\tau$, and $Y_1 = Z_1 \cap G_2$, $Y_2 = Z_2 \cap G_2$, $Y_3 = \emptyset$.
• The case $X_3 \neq \emptyset$ and $Y_3 \neq \emptyset$, equivalently, $\sigma \in Z_1$ and $\tau \in Z_3$. See Figure B.2(d). Recovery: $X_1 = Z_1 \cap G_1$, $X_2 = Z_2 \cap G_1$, $X_3 = Z_3 \cap G_1$, and $Y_1 = Z_1 \cap G_2$, $Y_2 = Z_2 \cap G_2$, $Y_3 = Z_3 \cap G_2$.

The relationship between Algorithm B.4 and Algorithm B.3 is the same as in Proposition B.21, proving that Algorithm B.3 has the correct output. Moreover, it consists of a single loop over the product of two sets, which has size bounded by $O(w(G_1)\, w(G_2))$, since the number of elements in $X(G)$ is in general bounded by $w(G)$. Within each loop, each $m_i \le |V(G)|$, so the cost of multiplications and additions is polynomial in $|G|$.

Theorem B.23.

Let $(G, w)$ be a node $\mathbb{N}$-weighted series-parallel graph. Then $|P(G, w)|$ can be calculated in time polynomial in $(|G|, w(G))$.

Proof.

We extend $w$ to weights valued in $\overline{\mathbb{N}} = \mathbb{N} \times E$, by setting $w'(a) = (w(a), \neg\emptyset)$ for all $a \in V(G)$. Thus, $(G, w')$ is admissible. We let $T$ be a binary SP-tree for $G$. This is a binary tree with $|E(G)|$ leaves, so it has $O(|E(G)|)$ nodes. Each node of $T$ is associated with a subgraph of $G$, and we make them into node-weighted series-parallel graphs by setting the $E$ component to be $\neg\emptyset$ on all nodes, and by setting the $\mathbb{N}$ component of the weight function in any way that adds up correctly using Definition B.18; for example, if $H$ is a node in $T$, with left child $L$ and right child $R$, we can assign the $\mathbb{N}$ part of the weight on $L$ to be the restriction of $w$ to $L$, and on $R$ to be the restriction of $w$ to $R \setminus L$, and zero elsewhere. The resulting node-weighted graphs are all admissible by construction.

Moreover, the graph at each P node is the node-weighted parallel composition of the graphs at each child node, and the graph at each S node is the node-weighted series composition of the graphs at each child node. Computing $X((H, w'))$ at each of the leaves takes time $O(N(w(G)))$. Computing the value of $X((H', w'))$ for each P or S node, given the values at the children, takes time $O(p(|G|, w(G)))$ for some fixed polynomial $p$ (given by Proposition B.21 and Proposition B.22). Thus, the total time to compute $X((H', w'))$ at each node of the tree by memoization is $O(|E(G)|\, p(|G|, w(G)))$.

From $X((G, w'))$ we can calculate $|P(G, w)|$ as
$$|P(G, w)| = X(G, w')\big((\tfrac{w(V)}{2}, \neg\emptyset),\ (\tfrac{w(V)}{2}, \neg\emptyset),\ 0\big) + X(G, w')\big((\tfrac{w(V)}{2}, \neg\emptyset),\ 0,\ (\tfrac{w(V)}{2}, \neg\emptyset)\big).$$
The input to the polynomial is the size of $w(G)$, not the binary encoding of $w(G)$.

Figure B.3: (a) Replacing each vertex of $G$ with a triangle. Here the blue edge is the one added to $J$. (b) The routing rules illustrating that any Hamiltonian cycle in $G$ gives an extension of $J$ to a simple cycle in $G'$, and any extension of $J$ in $G'$ gives a Hamiltonian cycle of $G$.

B.5. The natural p-relation for simple cycles and un-ordered 2-partitions is not self-reducible. Here we prove that a natural encoding of the simple cycles of a graph is not self-reducible, unless P = NP. We take our inspiration from [66], where it is proven that a particular p-relation encoding 4-coloring a planar graph is not self-reducible. We use the same definition for self-reducibility as in [66].

We encode a graph $G = (V, E)$ in such a way that the edges are ordered. A solution $X \in SC(G)$ is described by an element of $2^{|E|}$, a binary sequence of length $|E|$, where the $k$th term is 1 if and only if the $k$th edge of $G$ is in $X$. We call this p-relation $R_{SC}$.

Let $\Sigma$ be some alphabet, where $\Sigma$ has some ordering, and we extend that ordering to $\Sigma^n$ for all $n$ using the lexicographic order. We let $R \subseteq \Sigma^* \times \Sigma^*$ be any p-relation with $|y| = p(|x|)$ for all $(x, y) \in R$ for some polynomial $p(n)$. Now we define the following problem (see also [52, Definition 3.2]):

LF-$R$ (Lexicographically First)
Input: $x \in \Sigma^*$.
Output: The lexicographically first element of $R(x)$, provided $R(x) \neq \emptyset$.

For example, LF-$R_{SC}$ is the problem of finding the lexicographically first simple cycle under the particular encoding of $R_{SC}$. The following proposition is well known:

Proposition B.24 ([66], Prefix-Search).

Suppose that $R$ is a self-reducible p-relation, and suppose that there is a polynomial time Turing machine which, given $x$, answers if $R(x) \neq \emptyset$. Then there is a polynomial time algorithm for LF-$R$.

The idea in [66] is to show that a certain p-relation $R$ is not self-reducible, as long as P $\neq$ NP, by showing that checking $R(x) \neq \emptyset$ is in P, but that LF-$R$ is NP-hard. We will follow the same approach by reducing to LF-$R_{SC}$ from the following problem, which we will shortly show is NP-complete.

ExtendingToSimpleCycle
Input: An undirected graph $G$ and a set of edges $J \subseteq E(G)$.
Output: YES if $J$ can be extended to a simple cycle of $G$. NO, otherwise.

Proposition B.25. ExtendingToSimpleCycle is NP-complete on the class of 3CCP graphs with face degree bounded by 531.

Proof.

Let $G \in C$. Construct $G'$ by replacing each vertex of $G$ with a triangle as in Figure B.3(a); the result remains 3CCP by Lemma A.1, and has face degree bounded by 531. Build a set $J$ by taking one edge from each of those triangles. By examining the local routings in Figure B.3(b), we can see that $J$ has an extension to a simple cycle of $G'$ iff $G$ has a Hamiltonian cycle. Since the latter problem is NP-complete by Theorem 2.25, the proposition follows.
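The construction in this proof can be tried out on a very small cubic graph. The sketch below is an illustration under our own assumptions: the input is an adjacency dictionary of a cubic graph with comparable vertex labels, the corner `(v, u)` of the triangle at `v` is the one attached to the old edge `{v, u}`, the edge placed in $J$ is an arbitrary side of each triangle (the figure's specific choice is not reproduced), and the extension test is an exponential brute force meant only for tiny inputs. All function names are hypothetical.

```python
from itertools import combinations

def replace_vertices_with_triangles(adjacency):
    """Return (edges of G', J) for a cubic graph given as {vertex: [3 neighbours]}."""
    edges, marked = [], []
    for v, nbrs in adjacency.items():
        corners = [(v, u) for u in nbrs]
        triangle = [frozenset(pair) for pair in combinations(corners, 2)]
        edges += triangle
        marked.append(triangle[0])  # arbitrary side of the triangle goes into J
        # Each old edge {v, u} now joins corner (v, u) to corner (u, v).
        edges += [frozenset([(v, u), (u, v)]) for u in nbrs if v < u]
    return edges, marked

def is_simple_cycle(edge_set):
    """A nonempty edge set is a simple cycle iff it is 2-regular and connected."""
    neighbours = {}
    for e in edge_set:
        u, v = tuple(e)
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    if not edge_set or any(len(n) != 2 for n in neighbours.values()):
        return False
    start = next(iter(neighbours))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(neighbours)

def extends_to_simple_cycle(edges, J):
    """Brute-force ExtendingToSimpleCycle: try every superset of J."""
    free = [e for e in edges if e not in J]
    return any(is_simple_cycle(list(J) + list(extra))
               for k in range(len(free) + 1)
               for extra in combinations(free, k))

# K4 is cubic and Hamiltonian, so J should extend to a simple cycle of G'.
k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
edges, J = replace_vertices_with_triangles(k4)
```

For $K_4$ the graph $G'$ has 18 edges and $J$ has 4, so the search over the remaining 14 edges is still small. Adding the other two sides of one triangle to $J$ closes that triangle into a cycle by itself, and then no extension exists.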

Theorem B.26. Fix any class of graphs $K$ which contains $C$. Then the relation $R_{SC}$ is not self-reducible on $K$ assuming P $\neq$ NP.

Proof. Let $G, J$ be some instance of ExtendingToSimpleCycle. We order the edges of $G$ so that the edges of $J$ come first, and we order the alphabet so that 1 precedes 0, so that encodings beginning with more 1s come earlier. If there is an extension of $J$ to a simple cycle, then one of the simple cycles extending $J$ is the lexicographically first simple cycle among all simple cycles of $G$. If we assume that $R_{SC}$ is self-reducible, then Proposition B.24 guarantees that we can determine the lexicographically first simple cycle in polynomial time, which puts the extension problem into P. This contradicts P $\neq$ NP because we have proven that ExtendingToSimpleCycle is NP-hard.

We remark that, since we have proven Theorem B.26 in the context of plane graphs, we obtain results about the non-self-reducibility of encodings of connected 2-partitions. One encoding that immediately reduces to the theorem just proven is to encode a connected 2-partition as the set of cut edges.
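The prefix search behind Proposition B.24 can be sketched as follows. This is an illustration under our own assumptions, not the construction of [66]: the callable `has_extension` is a hypothetical stand-in for the polynomial-time emptiness test applied to the instance obtained by self-reduction, and the solution set used in the example is a toy list of bit strings rather than the cycles of an actual graph.

```python
def lexicographically_first(has_extension, length, alphabet):
    """Prefix search: grow the answer one symbol at a time.

    `has_extension(prefix)` reports whether some solution starts with `prefix`;
    `alphabet` lists the symbols in increasing order.
    """
    if not has_extension(""):
        return None  # the solution set is empty
    prefix = ""
    for _ in range(length):
        # The smallest symbol that still admits a solution is the next symbol.
        prefix += next(s for s in alphabet if has_extension(prefix + s))
    return prefix

# Toy solution set: hypothetical edge-indicator strings over four ordered edges.
solutions = {"1101", "0111", "1010"}
oracle = lambda prefix: any(s.startswith(prefix) for s in solutions)

# With 1 ordered before 0, the first solution uses as many leading edges as possible.
first = lexicographically_first(oracle, 4, "10")

# If J consists of the first two edges, J extends iff the answer starts with "11".
j_extends = first is not None and first.startswith("11")
```

The search makes at most `length * len(alphabet)` oracle calls after the initial emptiness check, which is why a polynomial-time oracle gives a polynomial-time algorithm for the lexicographically first solution.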