[PDF] Optimal Mixing in Transport Networks: Numerical Optimization and Analysis

Abstract

Many foraging microorganisms rely upon cellular transport networks to deliver nutrients, fluid and organelles between different parts of the organism. Networked organisms ranging from filamentous fungi to slime molds demonstrate a remarkable ability to mix or disperse molecules and organelles in their transport media. Here we introduce mathematical tools to analyze the structure of energy efficient transport networks that maximize mixing and sending signals originating from and arriving at each node. We define two types of entropies on flows to quantify mixing and develop numerical algorithms to optimize the combination of entropy and energy on networks, given constraints on the amount of available material. We present an in-depth exploration of optimal single source-sink networks on finite triangular grids, a fundamental setting for optimal transport networks in the plane. Using numerical simulations and rigorous proofs, we show that, if the constraint on conductances is strict, the optimal networks are paths of every possible length. If the constraint is relaxed, our algorithm produces loopy networks that fan out at the source and pour back into a single path that flows to the sink. Taken together, our results expand the class of optimal transportation networks that can be compared with real biological data, and highlight how real network morphologies may be shaped by tradeoffs between transport efficiency and the need to mix the transported matter.

Full PDF

OOPTIMAL MIXING IN TRANSPORT NETWORKS: NUMERICALOPTIMIZATION AND ANALYSIS ∗ CASSIDY MENTUS † AND

MARCUS ROPER † Abstract.

Many foraging microorganisms rely upon cellular transport networks to deliver nu-trients, ﬂuid and organelles between diﬀerent parts of the organism. Networked organisms rangingfrom ﬁlamentous fungi to slime molds demonstrate a remarkable ability to mix or disperse mole-cules and organelles in their transport media. Here we introduce mathematical tools to analyze thestructure of energy eﬃcient transport networks that maximize mixing and sending signals originatingfrom and arriving at each node. We deﬁne two types of entropies on ﬂows to quantify mixing anddevelop numerical algorithms to optimize the combination of entropy and energy on networks, givenconstraints on the amount of available material. We present an in-depth exploration of optimal singlesource-sink networks on ﬁnite triangular grids, a fundamental setting for optimal transport networksin the plane. Using numerical simulations and rigorous proofs, we show that, if the constraint onconductances is strict, the optimal networks are paths of every possible length. If the constraint isrelaxed, our algorithm produces loopy networks that fan out at the source and pour back into a singlepath that ﬂows to the sink. Taken together, our results expand the class of optimal transportationnetworks that can be compared with real biological data, and highlight how real network morpholo-gies may be shaped by tradeoﬀs between transport eﬃciency and the need to mix the transportedmatter.

Key words.

Transportation networks, Numerical optimization, Mixing entropy, Cellular net-works

AMS subject classiﬁcations.

1. Introduction.

Work by Murray in the 1920s [19] ﬁrst probed the idea thatvessels in biological transportation networks may optimize knowable target functions.Murray hypothesized that blood vessels may have optimal radii are set by tradeoﬀsbetween the need to minimize friction within the vessel (which favors large vessels),and the energetic cost of maintaining the vessel (which penalizes large vessels). Thescalings and geometric relationships that he derived from this trade-oﬀ have foundsome experimental support for the blood networks of animals [28] and water transportnetworks of plants [17]. More recent theoretical work has extended the analysis ofsingle vessels or branch points to whole networks of vessels in which the sources andsinks are prescribed but the network is given many choices for how to connect thesepoints [7, 11], added damage or ﬂuctuations in source and sink strengths [13, 10], ordeveloped models for how feedbacks between ﬂows and network growth allow suchoptimal networks to be grown [12, 21].Hundreds of thousands of species of microorganisms, including slime molds, watermolds and fungi rely on internal transportation networks. These networks have similarfunctions – they continuously grow as the organism claims territory or searches forhosts or resources. Within the network nutrients, ﬂuid and cellular matter (includingnuclei and other organelles) are transported from sites of production or uptake to sitesof utilization. Minimization of friction, in conjunction with robustness to damage,appears to underlie features of some of the foraging networks made for example bywood rotting basidiomycete fungi [6] and slime molds [27]. However, organisms build ∗ Submitted to the editors on July 28, 2020.

Funding:

This work was funded by the National Science Foundation under grant no. DMS–1351860. † a r X i v : . [ n li n . AO ] J u l C. MENTUS AND M. ROPER networks with a tremendous diversity of morphologies that can not be explainedby friction minimization alone. Do these morphologies emerge from other physicalprinciples besides minimizing friction, from constraints on the pathways used to growthe network, or from neutral diﬀerences in network morphology that do not aﬀect theorganism’s ﬁtness? We start from the position that to understand the extent of therole that optimization plays in determining the structure of networks, we must ﬁrstunderstand what the optimal network is for a given target function. This approachpreviously guided us to develop gradient-descent methods for optimizing networks forarbitrary diﬀerentiable functions [9].In this work we focus on a quantity with many points of non-diﬀerentiability:the amount of mixing occurring within the network. This quantity, which is giventwo diﬀerent quantiﬁcations below, is non-diﬀerentiable in the conductances of thenetwork at any point where the ﬂow in an edge goes to 0. Since the optimizationof the network requires searching over possible topologies for ﬂow; i.e. reversing thedirections of ﬂow on edges, we develop here a new numerical optimization methodthat is adapted to deal with this pervasive non-diﬀerentiability.Why are might real networks seek to maximize mixing? Three kinds of mixingseem to be relevant to network-forming microorganisms:1. In fungal networks cellular growth occurs at the periphery of the networkthrough the continuous extension of hyphae at their tips, and in fast growing fungi,such as the model organism

Neurospora crassa growth requires the continuous supplyof nuclei and other organelles to the edge of the mycelium [15]. Within

N. crassa nucleioften take tortuous and multidirectional paths toward the tips, and the network isknown to be organized so that pairs of nuclei that start close together within themycelium are unlikely to be delivered to the same site of growth at the periphery,potentially to stop deleterious mutations accumulating in one region of the fungus[23].2. Recent experiments in the dung fungus

Coprinopsis cinerea show large swathesof the network responding to the external threat of predatory nematode worms. Whennematode grazing is detected in one part of the fungal network, a suite of defensechemicals is expressed, not just at the site of grazing, but spreading in multipledirections through the network [20]. Spreading out nematoxin production may prepareother parts of the network for further attacks or enable the cost of labor to be spreadthrough the network [22].3. Plasmodial slime molds, such as

Physarum polycephalum live in heterogeneousenvironments containing patches of nutrients [4]. The network remodels globallywhen it discovers a new nutrient source, and it is thought that individual tubes in thenetwork respond to a cue carried within the ﬂow carried within the network [5]. Aglobal response to this cue requires that it be dispersed through the entire network.In Section 2 we deﬁne an entropy of mixing within the network. In Section 3 wedescribe a numerical method for choosing the conductances within the network, andin Section 4 we show simulation results. A highlight result is that for small valuesof the parameter, γ , which represents the penalty of dividing one edge into two, theoptimal networks become a set of paths linking source and sink. We prove why pathsare favored, and analytically expose the set of possible path optima in Section 5.

2. Mathematical model and mixing entropies.2.1. Movement of signals through a ﬂow network.

Our mathematicalmodel for the biological transportation network consists of a network (graph) withnodes (vertices) N , enumerated 1 , , . . . , N , and edges E . The nodes are arranged on PTIMAL MIXING IN TRANSPORT NETWORKS κ ij with each node and edge conductance labeled. Thesource is at node 1, the bottom-left corner, and the sink is at node 25, the top rightcorner.a regular triangular lattice, so that each vertex in the interior is linked by equal lengthedges to 6 neighbors (we write n ( i ) for the set of neighbors of i ). The conductance ofthe edge ( i, j ) is denoted by κ ij . Fluid (protoplasm) is continually pushed through thenetwork by pressure diﬀerences between the nodes. In our model the ultimate originof these pressure diﬀerences are ﬂows into and out of the network via diametricallyopposite nodes. Definition

The rate of ﬂuid entering or exiting the network at i ∈ N is the boundary ﬂow at node i , and is denoted Q i . Boundary ﬂow Q i > i (i.e.the node is a source), and Q i < i (i.e. the node is a sink). The total volume of ﬂuid contained in the network isconstant, so total inﬂows and outﬂows must be balanced: (cid:80) i Q i = 0. The boundaryﬂows in turn engender ﬂows, q ij , on the edges. Flows must also be balanced on eachnode in the network, a fact that is known as Kirchhoﬀ’s ﬁrst law of circuits: Definition

A ﬂow is called compatible with regards to the boundary ﬂows Q i if (cid:80) j ∈ n ( i ) q ij = Q i for all i ∈ N . For any set of boundary ﬂows, there are typically multiple compatible ﬂows on thenetwork. The ﬂow we are interested in, called the physical ﬂow , is the uniquecompatible ﬂow that minimizes the dissipation:

Definition

For a conductance network κ ij , the dissipation D from ﬂows q ij is the rate at which work must be done to maintain the ﬂuid ﬂows on all edges ofthe network: D ( q ij ) = (cid:80) ij q ij κ ij . The ﬂow that minimizes the dissipation can be derived from Kirchhoﬀ’s ﬁrst andsecond laws [8], which introduce a pressure variable that is deﬁned on each node inthe network:

C. MENTUS AND M. ROPER

Proposition

Kirchoﬀ ’s second law for circuits

Let κ ij be a connectedconductance network with nodes N and edges E . Let Q be boundary ﬂows such that (cid:80) i Q i = 0 . Let q ij be the physical ﬂows of this network. Then there exists p i ∈ R ,called the pressure at node i , such that q ij = κ ij ( p i − p j ) . We can compute the pressures by deﬁning a vector of pressures p = { p i } i ∈N , a vectorof boundary ﬂows Q = { Q i } i ∈N and the network Laplacian ∆ κ , a |N | × |N | matrixwith entries(2.1) ∆ κ,ij = (cid:26) − κ ij if i (cid:54) = j (cid:80) j κ ij if i = j Then conservation of mass at each node (Proposition 2.4) is equivalent to solving(2.2) ∆ κ p = Q . So long as every connected component of a physical network has one node with adeﬁned pressure, the pressures are uniquely solvable, otherwise they are solvable upto a single additive constant per connected component[8]. When the conductancenetwork is connected and the pressure at node i is known: p i = P , we add P to Q i and construct the invertible matrix ˜∆ κ by adding 1 to ∆ κ,ii .We now consider the mixing produced by the ﬂows within the network. Ourﬂow network model contains all of the scenarios for mixing described in Section 1.The signals passing through the network could represent genetically diverse nuclei(scenario 1), or chemical cues (scenarios 2 and 3). Our model does not need torepresent the entire network, it could represent the portion of network that supplies asingle hyphal tip. This supply network would be linked to supply networks for othertips, and acquires signals, randomly at each node from these other networks . Signalsare made up of blobs: either molecules or organelles. We compile a list of the nodesvisited by each signal blob: call the t -th node visited by a signal, x t . Then x t is arandom walk, with transition probability:(2.3) T ij ≡ P ( x t +1 = j | x t = i ) = q ij (cid:80) k ∈ n ( i ): q ik > q ik − Q i Q i < . that is, the ﬂow of signal from i to j is simply proportional to the total ﬂow alongthe edge ( i, j ). Eﬀectively we assume that signal is uniformly dispersed in the ﬂowingprotoplasm, ignoring any physical eﬀects such as diﬀusion [16] that move signalsindependently of ﬂows. When the signal reaches a sink node it may exit the modelednetwork (with the exit probability proportional to − Q i , so (cid:80) j T ij ≤ i performs a random walk down the pressure gradient.There are two senses in which signals may be considered to mix on the network: 1.Given the node i at which it originates we are interested in the number of nodes thatthe signal visits before exiting the network. 2. Alternately, given a node j , we areinterested in the number of diﬀerent sites of origin that signals passing through j mayhave. To quantify either form of diversity, we must focus on the probability that asignal originating at node i ever visits a node j deﬁned by:(2.4) P ij = P ( x t = j for some t ≥ | x = i ) . Signals can be transferred between supply networks without ﬂow between them, since motorprotein traﬃcking (of nuclei) or diﬀusion (of chemical cues) provide alternate transport mechanisms.PTIMAL MIXING IN TRANSPORT NETWORKS P ij from a N × N matrix. To calculate this matrix from the transitionprobabilities T , note that the probability of getting from i to j by following exactly n edges is ( T n ) ij . Hence, P = (cid:80) N − n =0 T n (note that T N = 0, because a signal can visitat most N nodes before exiting the network and signals can not visit the same nodetwice). Alternatively by summing the geometric progression:(2.5) P = ( I − T ) − , where I is the N × N identity matrix. We deﬁne two types of information entropyon the ﬂows q ij . The ﬁrst is a measure of the accumulation of signals at every node inthe network and the second represents the dispersal of signals throughout the network.We call the two entropies, respectively, total receiver entropy (or total mixingentropy ) and total sender entropy . Let f i be the total ﬂow through node i , i.e. f i = (cid:80) j ∈ n ( i ): q ij > q ij + Q i Q i > . The rate at which ﬂuid ﬂows from i to j is then˜ q ij = P ij f i . We refer to this as the ﬂow from i to j . We assume that the rate at whicha signal is produced at a node is proportional to the total ﬂow through that node.This assumption certainly makes sense if our signal consists of new nuclei that aregenerated by divisions within the protoplasm, since the ﬂow through a node will beproportional to the rate at which nuclei pass through it. For other signal productionscenarios (such as when the signal is produced in response to predation), we canarrive at this assumption if we assume that product of the new signal is rate-limitedby a component that is contained within the protoplasm, so signal production rateis proportional to rate of protoplasm cleared through the node in unit time. Underthis assumption the relative proportions of signals received at node j from upstreamnodes i are the same as the relative proportions of ˜ q ij .In our model each site in the network can send signals to other sites in the network,and any point in the network may potentially receive signals from any other point. Wecannot tell ahead of time which nodes will provide the useful signals, so we considerall nodes as possible sources of signals. We also make no assumption about siteswhere diversity needs to be maximized (this is in contrast to [23], in which geneticdiversity was considered only at hyphal tips), so we consider all of the possible nodesthat signals can reach within the network when computing the mixing entropy.To compute the entropy of the distribution of signals arriving at i we deﬁne theprobability distribution on up-stream nodes of i :(2.6) P i ( j ) = ˜ q ji N i where N i ≡ (cid:88) j :˜ q ji > ˜ q ji , eﬀectively forming a new matrix from ˜ q ij in which all columns are normalized to sum to1. We may deﬁne the local receiver entropy at node i as the Shannon informationentropy of P i : H ( P i ) = − (cid:80) j P i ( j ) log ( P i ( j )). We consider the total ﬂow through i as a measure of the “importance” of the node. In our model, the diversity of signalsis more important at high traﬃc nodes than at low traﬃc nodes. This principle isuseful mathematically, since it ensures that rearrangements of very low conductanceedges don’t greatly aﬀect the overall mixing associated with a network. At the sametime, the weighting is intended to reﬂect the relative biological importance of nodeswithin the network – a node with high ﬂow supplies a greater volume of cytoplasm tothe rest of the network, so it is more important that all of the signals (whether cues C. MENTUS AND M. ROPER or nucleotypes) are present at the node. Hence the total receiver entropy is:(2.7) H = (cid:88) i f i H ( P i ) . Similar to [26] H represents the conditional entropy associated with choosing a re-ceiving node at random with probability proportional to f i and then conditioned onour choice of node i we chose a sending node at random via the distribution P i . Our proofs in Section 5 often requirethat we partition N into subsets of nodes. It is convenient to be able to evaluatethe contributions of each subset to the total network entropy. We deﬁne restrictedentropies for subsets F ⊂ N as follows: For all i ∈ F deﬁne P F i ( j ) = ˜ q ji (cid:80) k ∈F ˜ q ki if j ∈ F and P F i ( j ) = 0 otherwise. Definition

The local negative mixing entropy restricted to F is deﬁned tobe (2.8) H ( P F i ) = − (cid:88) j ∈F :˜ q ji > P F i ( j ) log ( P F i ( j )) and the total mixing entropy restricted to F is (2.9) H F = (cid:88) i ∈F f i H ( P F i ) . It may also be important for the network tospread out signals to as many downstream nodes as possible. We deﬁne an entropyfor the places that can be reached by a new signal originating at a node within thenetwork. Speciﬁcally, instead of taking the mass distribution of incoming ﬂows andnormalizing them to a probability distribution, we use the out-going ﬂows. That is wedeﬁne the probability distribution of nodes that can be reached by a signal introducedat node i :(2.10) P (cid:48) i ( j ) = ˜ q ij (cid:80) j :˜ q ij > ˜ q ij . This is equivalent to normalizing the matrix ˜ q so that all rows sum to 1. We deﬁnethe local sending entropy at node i to be the Shannon information entropy [24]of the distribution P (cid:48) i :(2.11) H ( P (cid:48) i ) = − (cid:88) j P (cid:48) i ( j ) log( P (cid:48) i ( j )) . and we deﬁne the total sending entropy of the entire network to be the weighted sumof the node entropies:(2.12) H (cid:48) = (cid:88) i f i H ( P (cid:48) i ) . Although the en-tropies H and H (cid:48) oﬀer alternate representations of the mixing that occurs within thenetwork, they are linked by an equivalence principle: PTIMAL MIXING IN TRANSPORT NETWORKS Theorem

Let q ij be a ﬂow network compatible with boundary ﬂows Q i . Let q (cid:48) ij and Q (cid:48) i be the ﬂow network and boundary ﬂows obtained from q ij and Q i by re-versing the ﬂows, i.e. q (cid:48) ij = − q ij and Q (cid:48) i = − Q i . Then H (cid:48) ( q ij ) = H ( q (cid:48) ij ) .Proof. Notice that reversing the ﬂows doesn’t aﬀect the ﬂow strengths of nodeswithin the network because f i = (cid:88) j : q ij > q ij + | Q i | Q i < = (cid:88) j : q ij < | q ij | + | Q i | Q i > (2.13) = (cid:88) j : q (cid:48) ij > | q (cid:48) ij | + | Q (cid:48) i | Q (cid:48) i < . (2.14)The equivalence principle boils down to proving the statement ˜ q ij = ˜ q (cid:48) ji where ˜ q (cid:48) ij isthe ﬂow from node i to node j in the ﬂow-reversed network. We derive this equality bycomparing the probability of a signal path x t : t = 0 , , . . . , T under the ﬂow q ij withthe probability of the reversed path x (cid:48) t ≡ x T − t : t = 0 , , . . . , T under the reversed ﬂow q (cid:48) ij : since | q ij | = | q (cid:48) ij | it follows that T ij f i = T (cid:48) ji f j and so T (cid:48) ji = f i f j T ij . We multiplythe probability of the path x t by the strength of the starting node, f x , to obtain f x (cid:81) T − t =0 T x t x t +1 , and rewrite via a telescoping product:(2.15) f x T − (cid:89) t =0 T x t x t +1 = f x T T (cid:89) t =0 f x t f x t +1 T x t x t +1 = f x T T (cid:89) t =0 T (cid:48) x t +1 x t = f x T T (cid:89) t =0 T (cid:48) x (cid:48) t x (cid:48) t +1 . For any nodes i and j in the network, we can sum over the probability of all possiblepaths i to j in the regular network and j to i in the ﬂow-reversed network to obtain:˜ q ij = f i P ij = f j P (cid:48) ji = ˜ q (cid:48) ji . Hence the distributions P (cid:48) i ( j ) for the ﬂow network q ij areequal to the distributions P i ( j ) for the network q (cid:48) ij , so H (cid:48) i ( q ij ) = H i ( q (cid:48) ij ) leading tothe required result.The physical ﬂow on the network (see Proposition 2.4) can be reversed by reversingthe sources and sinks in the network; that is, replacing a source with inﬂow Q i by asink with outﬂow Q i , and conversely. In the cases that we will analyze in this paper,the sources and sinks are matched in number and strength (e.g. a single source andsingle sink at opposite corners of a square grid network); so a network that optimizesreceiving entropy can be transformed into a network that optimizes sending entropysimply by rotating the source into the sink and conversely. For this reason, we donot have to develop separate results for the two entropies. We focus on analyzing thereceiving entropy, which we refer to simply as mixing entropy henceforth. Buildingand using ﬂow networks requires energy investment; an organism’s optimal networkwill reﬂect tradeoﬀs between mixing eﬀectiveness and the cost of the network. Thecost has two components: each edge in the network must be built and maintained,and the ﬂuid transported within the network dissipates energy due to friction. Thetwo cost components play slightly diﬀerent roles in our optimization, we incorporatethe ﬁrst cost via a holonomic constraint, and the second via a penalty.Murray [19] posited that the cost of a maintaining a vessel is either proportionalto its volume or surface area. Since all of the vessels in our networks have the samelength, and the Hagen-Poiseuille law states that conductance is proportional to thefourth power of the radius, these scenarios correspond respectively to the cost of anedge being proportional to κ / ij or to κ / ij . We constrain the cost the total material C. MENTUS AND M. ROPER available to the network, requiring (cid:80) κ γij = C where 0 < γ < { Θ( κ ij ) ≡ − H ( κ ij ) + cD ( κ ij ) : κ ij ≥ ∀ ( i, j ) ∈ E , (cid:88) κ γij = C } . We refer to Θ as the mixing-dissipation cost (abbreviated:

CMD ). Since theset of allowed conductances is compact, we know that the minimizer exists. Theconstant c represents the relative priority to the network of minimizing dissipationover maximizing mixing. Along with γ it is one of the main parameters that weexplore in this work. We stretch our notation to refer to the minimum value of Θ fora given value of c as Θ( c ). Lemma

The minimal mixing-dissipation cost is a concave function of c . Thatis, for c , c ≥ : Θ( tc + (1 − t ) c ) ≥ t Θ( c ) + (1 − t )Θ( c ) for all ≤ t ≤ .Proof. Set c = tc + (1 − t ) c , and let κ i be a minimizer of − H + c i D ; i = 1 , , − tH ( κ ) + tc D ( κ ) − (1 − t ) H ( κ ) + (1 − t ) c D ( κ ) ≤ − tH ( κ ) + tc D ( κ ) − (1 − t ) H ( κ )+(1 − t ) c D ( κ )= − H ( κ )+ ( tc + (1 − t ) c ) D ( κ ) . (2.17)To ﬁnish formulating the optimization problem we restrict the set of networktopologies that we are searching over: Let G be an unweighted undirected networkwith nodes N and edges E . Choosing E allows us to constrain e.g. the maximumdegree of the nodes in our optimal network. Our optimal network is restricted tobe a subnetwork of G : we refer to G as the ambient network . For the purposesof this study we will assume that the network is planar (this assumption is almostcertainly true for slime mold networks, but is less valid in fungal networks, wherehyphae often crossover without connecting to each other). In this paper we restrict toregular triangular networks, in which all of the edges in the ambient network have thesame length. We do not think that our results are sensitive to the choice of (regular)ambient network: we have for example, reproduced all of the results discussed in thispaper with square grid ambient networks [18]. Solutions of our optimization problem depend upon the value of C = (cid:80) κ γij . However, as c is increased from 0 to ∞ , the same sequences of optimal networksare found, independent of C . For suppose that κ ij is a conductance network that solvesEq. 2.16 with (cid:80) κ γij = C . Then rescaling κ (cid:48) ij = (cid:16) C (cid:48) C (cid:17) /γ κ ij produces a new networkwith (cid:80) κ (cid:48) γij = C (cid:48) . The ﬂows are unaltered in this network, so H ( κ (cid:48) ij ) = H ( κ ij ).However, dissipation is changed: D ( κ (cid:48) ij ) = (cid:0) CC (cid:48) (cid:1) /γ D . So the new network minimizes θ in Eq. 2.16 for the new dissipation weighting c (cid:48) = (cid:16) C (cid:48) C (cid:17) /γ c . Sweeping through allvalues c ≥

0, with (cid:80) κ γij = C , we generate in one-to-one correspondence all optimal PTIMAL MIXING IN TRANSPORT NETWORKS κ , and κ , . At some criticalperturbation of the conductances, the ﬂows q , and q , are respectively reversed(reversed ﬂows are circled in the network plots). The landscape of mixing entropy(middle) has ridge lines where the ﬂow is reversed.networks for c (cid:48) ≥ (cid:80) κ γij = C (cid:48) . The choice of the value for C is thereforearbitrary.

3. Numerical optimization.

Chang and Roper [9] optimized networks for gen-eral diﬀerentiable functions using gradient descent. However, the mixing entropy thatwe seek to optimize here is non-diﬀerentiable wherever the ﬂow through an edge isequal to 0. It is necessary that the optimization algorithm be able to navigate throughsuch points, because as conductances are updated to maximize mixing entropy it isoften necessary to reverse the direction of ﬂow on one or more edges. In Fig 2, weshow a contour map of varying two edge conductances within a network (the originalnetwork is shown at top right, and networks with reversed ﬂow bottom and left). Thelandscape is tiled into watersheds, each watershed represents the set of entropies thatcan be attained by varying the conductances without reversing the direction of ﬂowon any edge. Between the watersheds are ridgelines, and crossing a ridgeline reversesthe direction of ﬂow on one or more edges. Within a watershed, gradient descent canmove the network toward the local optimum for the watershed, but deteriorates if thelocal optimum is on the ridgeline.Even diﬀerentiable functions like dissipation produce landscapes with many lo-0

C. MENTUS AND M. ROPER cal optima; accordingly, in [13] simulated annealing and diﬀusive rearrangements ofconductances were implemented to prevent networks from being trapped at unfavor-able local optima. We follow a similar approach, by augmenting a gradient-basedsearch that is constrained to remain within a single watershed, with a perturbationmethod that is designed to provide the network with alternate routes to explore, andby intentional search over adjacent watersheds. We describe the three parts and theirintegration below.

We perform a gradient-based search, via MATLAB’s implementation of the interior-point method in fmincon. To ensure the search is not challenged to cross the ridgesthat divide diﬀerent ﬂow topologies, we enforce the sign of ﬂow in each edge viaa set of non-linear constraints on the conductances. Although fmincon is capableof calculating the derivative of Θ numerically within a watershed, we accelerate thealgorithm by computing the gradient analytically using Lagrange multipliers to encodeall of the relationships between conductance, ﬂow, transition probabilities and mixingentropy:We rewrite the array κ ij as |E| -entry vector. The function Θ (from Eqn. 2.16)that we are seeking to optimize is built up from κ ij via a chain of dependencies(3.1) κ ij (cid:55)−→ p i (cid:90) = ⇒ f i (cid:90) = ⇒ T ij (cid:55)−→ P ij (cid:90) = ⇒ ˜ q ij (cid:55)−→ N j (cid:90) = ⇒ H. Where a single arrow (cid:55)−→ represents a function of the immediately preceding variableand (cid:90) = ⇒ represents a function of more than one of the variables to the left. All of therelationships between variables are described in Section 2. Although it is possible tocarry derivatives through this list of compositions, the overhead from isolating andusing several derivatives of arrayed functions with respect to arrayed variables, makesthe gradient computation forbiddingly slow [18]. Instead we follow a similar approachto [8] and use Lagrange multipliers to enforce all of the functional relationships thatare embodied in Eq. (3.1). Eq. (3.1) then becomes a road-map for the order in whichwe solve for each of the Lagrange multipliers in our system. The constrained versionof Eq. (2.16), omitting the dissipation, becomes:Θ = (cid:88) i ∈N f i (cid:88) j :˜ q ji > ˜ q ji N i log (cid:18) ˜ q ji N i (cid:19) − (cid:88) i ∈N α i  N i − (cid:88) j :˜ q ji > ˜ q ji  − (cid:88) i,j ∈N γ ij (˜ q ij − f i P ij ) − (cid:88) ij ∈N µ ij (cid:32) δ ij − (cid:32) P ij − (cid:88) l ∈N T il P lj (cid:33)(cid:33) − (cid:88) i (cid:88) j ∈ n ( i ) λ ij (cid:18) T ij − q ij q ij > f i (cid:19) (3.2) − (cid:88) i ∈N β i  f i − (cid:88) j ∈ n ( i ) q ij q ij > + | Q i | Q i <  − (cid:88) i ν i  Q i − (cid:88) j ∈ n ( i ) κ ij ( p i − p j )  . For our gradient descent, we make use of the derivative: ∂ Θ ∂κ ab = λ ab q ab > ( p a − p b ) f a + λ ba q ba > ( p b − p a ) f b ( β a ( p a − p b ) q ab > + β b ( p b − p a ) q ba > )+ ( ν a − ν b )( p a − p b ) − c ( p a − p b ) . (3.3)In which we have made use of the derivatives compiled in Appendix A to calculatethe derivative of the dissipation. PTIMAL MIXING IN TRANSPORT NETWORKS log conductances , deﬁned by: κ ij = exp(˜ κ ij ).Additionally we want to ensure that the total material investment in the networkremains constant; i.e. to ensure (cid:80) κ γij = C . In [8] this constraint was added viaan additional Lagrange multiplier, but this method guaranteed that the constraintis satisﬁed only at leading order in the step size. Hence, here we simply rescale theconductances: κ ij (cid:55)→ (cid:16) C (cid:80) ( i,j ) κ γij (cid:17) γ κ ij after each perturbation. Both transformationsneed to be considered when calculating the derivatives. For the rescaling we get: ∂∂κ ij (cid:18) C /γ κ ab (cid:80) κ γcd (cid:19) = C /γ ( (cid:80) κ γcd ) /γ (cid:32) δ ( ab ) , ( ij ) − κ ab κ γ − ij (cid:80) κ γcd (cid:33) (3.4)To turn derivatives with respect to κ ij into derivatives with respect to ˜ κ ij we pre-multiply them by ∂κ ab ∂ ˜ κ ij = δ ai δ bj κ ij . Similar to dissipation-minimizing networks [13] our optimization algorithm has manylocal optima in which source and sink are sparsely connected. To ﬁnd the true globaloptimum, our algorithm includes a step for redistributing material within the net-work, in a way that presents the algorithm with a range of paths of diﬀerent lengthsbetween source and sink. However, although [13] previously redistributed materialby diﬀusing it on the graph, we found this method tends to short circuit the net-work by introducing much shorter paths between source and sink. The appearance ofthese paths is catastrophic for the optimization algorithm, since they are attractinglocal optima but far from the global optima [18]. Since our algorithm does not sendconductances exactly to zero, we deﬁne a threshold conductance κ c , and say that anedge (as well as the vertices that it connects) is in the support of the network if itsconductance exceeds κ c . In practice we found that a value κ c = 2 × − worked forall of the simulations shown in this paper.We redistribute material using a network growth step , which adds spurs of materialfrom the network’s support. Our algorithm takes the form of a set of operators:Grow up-right ( κ ij ) where “up-right” in the subscript can be replaced with the “up-left”, “down-left” or “down-right” to denote the direction in which material is added.Growth in the up-right direction adds edges that link nodes in the support to nodesnot in the support that are upwards and right of them. Each step of the growthalgorithm concatenates growth in two non-parallel directions. In practice we did notﬁnd it necessary to include right or left growth. We will describe the up-right growthstep: other growth steps can be derived from this step by symmetry.1. First locate up-right edges in the triangle grid for which the top-right node isoutside of the support of the network and the bottom-left node is inside thesupport.2. Add positive conductance to each of these edges to form a new network κ up-right (Fig. 3b). Each new edge is assigned conductance equal to the av-erage conductance of the edges from the support adjacent to its bottom-leftnode in the set of bottom-left nodes identiﬁed in 1.3. In κ up-right locate the nodes which are top right nodes of edges in the support.Call this set the top-right nodes (Fig. 3c).4. For every top-right node i , if i is the apex of a triangle whose base and up-right edge lie in the support of κ up-right , complete this triangle with an up-left2 C. MENTUS AND M. ROPER(a) (b) (c) (d) (e)

Fig. 3: Sequence of steps performed to construct κ up-right from an initial network κ shown in (a). (b) blue edges are added up-rightwards from a (bottom) node in thesupport of the network to a (top) node outside of the support of the network. (c)We identify top-right nodes (blue circles) whose bottom left edges have non-zeroconductance following the ﬁrst step. (d) Add leftwards (orange) and down-rightwards(green) edges if they complete triangles with edges in the current support. (e) Ouroptimization scheme applies the up-right growth algorithm twice.edge whose conductance is the arithmetic mean of the other two edges. Forevery top-right node i that is the left vertex of an inverted triangle, whoseup-left and up-right edges lie in the support of κ up-right , complete the trianglewith a horizontal edge whose conductance is the arithmetic mean of the othertwo edges (Fig 3d). Part 1 of our optimization algorithm can reliably locate local optima while respectingthe directions of ﬂow on every edge (i.e. the ﬂow topology ). To ﬁnd the true global op-timum we search systematically over adjacent topologies. To do this, we take one edgewithin the network, and ﬁnd the smallest increase and decrease in the conductanceof the edge that will change the direction of ﬂow in at least one other edge within thenetwork. To ﬁnd the smallest change in conductance, we use the Sherman-Morrisonformula [25], which allows us to calculate an explicit expression for the conductancechange necessary to reverse the ﬂow in any edge of the network (Eq. B.4). Givena causal edge ( a, b ), Eq. B.4 enables us to compute a set of perturbations t abuv to κ ab to reverse the ﬂow on any edge ( u, v ). We ﬁlter these perturbations to keep onlyperturbations in which κ ab is not allowed to become too small (in practice a thresholdof 10 − gives good results), to prevent this part of the algorithm getting stuck en-gineering and then re-engineering ﬂow reversals on edges that already have very lowconductance. This method was used to ﬁnd the set of ﬂow changes shown in Fig. 2. We initialize thealgorithm by assigning each edge within the ambient network an U (0 ,

1) conductance,and then scaling all conductances to ensure (cid:80) κ γij = C . A single step of the algorithmconsists of running all three of its parts sequentially. Part 1 locates a locally optimalnetwork that respects the ﬂow directions given to it, while the random choice ofgrowth directions in Part 2 and of causal conductances in Part 3 stochastically altersthe topology of the network. We compare the local optima arrived at the end ofconsecutive Part 1’s; if the new local optimum has a lower value of Θ than the old,we keep it, otherwise we revert to the old optimum.Our descent step uses the MATLAB optimization function fmincon using theinterior-point algorithm with 1000 max iterations, with ﬂow directions constrained onall edges with non-negligible conductance (see below), and with analytically computedderivative. Our growth step requires ﬁrst picking a single direction in which to grow PTIMAL MIXING IN TRANSPORT NETWORKS − . When this count reaches 4 we terminate the algorithm. Otherwisewe allow the algorithm to run for 50 iterations. Usually the stopping criterion isreached in fewer than 15 steps. We tested that our algorithm reliably (i.e. in morethan half of runs) located the theoretically obtained optimal network when c = 0 . κ ij = exp(˜ κ ij ). In practice, the local optima located by ouralgorithm use only a subset of the edges in the ambient network. We disregard edgeswith small conductances (in practice any edges with conductance less than 2 × − ):speciﬁcally the directions of ﬂow on these edges are not considered when constrainingﬂow directions in part 1 or when determining the perturbations that cause ﬂows toswitch in part 3.After the algorithm terminates we perform a ﬁnal ﬁltering step to deal with thefact that our gradient search is somewhat slow at removing edges from the networkor redistributing material between high conductance edges. To ﬁlter, we set all edgeswith conductance ≤ − to 10 − , re-scale all edges so that the material cost of thenetwork stays the same and then run our gradient-search with 10000 max iterations.Most of our simulations involve sweeps over c − values (see Section 4), typicallyinvolving 100-200 replicate networks whose c values are close enough that we expectthem to be topologically equivalent. We can further boost coverage since any localoptimum, ˆ κ discovered by our algorithm at c = ˆ c can be compared with local optimafor diﬀerent values of c by tracing the line: ˆ θ = − H (ˆ κ ) + cD (ˆ κ ). We form theenvelope of these straight lines (Fig. 6). At any value of c , we identify the networkthat produces the straight line on which ( c, Θ( c )) lies as the global optimal networkfor that value of c .

4. Results from numerical optimization.4.1. Optimal networks are paths for small values of γ . We ﬁrst studiedthe eﬀect of ﬁxing the value of c and constructing optimal networks over a rangeof values for γ . We found that at each value of c the number of loops in the pathincreased with γ (cf. dissipation minimizing networks, which form loops only when γ > γ ( γ ≈ . − .

5) the globallyoptimal networks are simple loopless paths linking source to sink. c . We noticed that in Fig.4 changing c changes the number of edges in the network. To investigate the eﬀect of c more systematically, we performed a numerical sweep of c values, holding γ = 0 . c increases, the globallyoptimal network systematically explored all path lengths from N − N −

1; (the shortest path linking source to sink).Fig. 6 shows the complete Θ( c ) trace, including for the networks included in Fig. 5.The numerically obtained Θ( c ) is piecewise linear, with slope discontinuities at each c -value where the length of the optimal network increases by one.

5. Optimal path networks.

Our numerical results from Section 4 highlightthree properties of optimal networks: that they are simple paths at small and mod-erate values of γ , that the path length decreases as c increases, and that for smallvalues of c the simple path visits every vertex in the network. In this section we will4 C. MENTUS AND M. ROPER

Fig. 4: Numerically computed optimal networks on a 5 × c (rows) and four diﬀerent values of γ (columns). At suﬃcientlysmall values of γ (in particular for γ = 0 .

45 at each assayed value of c ), optimalnetworks are all paths from source to sink. Increasing γ progressively adds loops tothe network. Increasing c increases the number of nodes visited by the network.Fig. 5: Selected optimal networks for γ = 0 .

45, on a 5 × c (values given above each panel), are all paths with decreasing lengths.Every possible length of path between 24 and 9 edges is obtained. The networksshown are the magenta points in Figure 6.rigorously state and prove theorems justifying these properties.We will ﬁrst prove separate results for mixing and for dissipation. We will showthat the optimal network for mixing (that is, without considering the cost of dissipa-tion) is a tour – a path that visits every node in the network from source to sink.Then given only the ﬂows on a network, we bound its dissipation. Second, we will PTIMAL MIXING IN TRANSPORT NETWORKS × γ = 0 . K = 24. The c -domain is divided into subintervals [ . , c , ], [ c m +1 ,m , c m,m − ]for m = 9 , , . . . ,

24 and [ c , ,

7] ( c m,n is deﬁned in Section 5), and 100 locallyoptimal networks are generated within each subinterval. Global optima are derivedis described in Section 3.4. Selected globally optimal networks (magenta points) areplotted in Fig. 5.consider optimization among path (i.e. loopless) networks, showing that as the costof dissipation is increased, the length of the optimal path network decreases mono-tonically in length in steps of 1, from the tour to a geodesic (shortest path). Finallywe show that among all networks, over many diﬀerent values for the dissipation pen-alty factor c , the optimal network is a path for all suﬃciently small γ . We start byintroducing a notation for paths of diﬀerent lengths, assuming that conductances areuniform, i.e. the same on each edge within the path, which is favored for minimizingdissipation. Definition

Say that a path from source to sink has length m if it visitsexactly m nodes. We use the notation τ m to denote any uniform conductance pathof length m . Further, we call τ |N | , the path that visits every node in the ambientnetwork, a tour . Theorem

Suppose that q ij is a ﬂow network with node-set N , and |N | = n .Then the maximum possible total mixing entropy is log( n !) , and this maximum isattained only for a path that visits all n nodes exactly once. The intuitive interpretation of this result is that all of the nodes in the network can beordered by their pressures, p i . Signals from node i can reach node j only if p i > p j . Anoptimal mixing conﬁguration is one in which signals from node i reach all downstream j with probability 1, which requires that the downstream network is a path that visitseach downstream node in turn. Proof.

Let x = { x t } t ≥ be the random walk on the ﬂow network deﬁned in Section2. Let V k ⊂ N be the set of nodes v that receive signals from exactly k nodes: thatis, V k = { u ∈ N : { v : P vu > } = k } .For a subset of nodes S ⊂ N we say that x hits S , if for some t ∈ Z ≥ x t ∈ S . Let6 C. MENTUS AND M. ROPER k ≥ V k is non-empty. Let u (cid:54) = v ∈ V k . A signal x can not visit more thanone node in V k for, if x hits both v and v ∈ V k , and WLOG p v > p v , then for any u ∈ N with P uv >

0, we must also have P uv >

0. So { u : P uv > } (cid:40) { u : P uv > } ,which is impossible since v , v ∈ V k implies both of these sets contain k elements.Let v ∈ V k . Then H ( P v ) ≤ log( k ) because there are exactly k nodes u with˜ q uv >

0. This inequality, together with (cid:80) v ∈ V k f k ≤ H :(5.1) (cid:88) i ∈N f i H ( P i ) = n (cid:88) k =1 (cid:88) v ∈ V k f v H v ≤ n (cid:88) k =1 log( k ) = log( n !) . We show that the only network with n nodes attaining the maximal entropy is apath. For the path, labeling the nodes 1 , , . . . , n in the order in which they arevisited from source to sink we ﬁnd for 1 ≤ i < j ≤ n , ˜ q ij = 1. Therefore theprobability distribution of signals P i is the uniform distribution on i atoms and hasentropy H i = log( i ). Hence, H = (cid:80) ni =1 f i H i = log( n !).Conversely, if q ij (cid:54) = τ n , then there must be at least one k ∈ { , , . . . n } such that V k = ∅ . In this case, H diﬀers from log( n !) by at least log( k ). Corollary

Let q ij be a ﬂow network on nodes N and ∅ (cid:54) = F ⊂ N with |F| = m . Then H F ≤ log( m !) . To deduce the corollary, we treat the probabilities P F i in the same fashion as wetreated the probabilities P i in the proof of Theorem 5.2. The dissipation, D , for a network is a function both of its conductances κ ij and its ﬂows q ij . However, we can bound the dissipation based on the q ij , alone,given only the constraint that (cid:80) ij κ γij = C . Theorem

Murray’s law . Let q ij be a network of ﬂows, then if (cid:80) ij κ γij = C , the smallest possible dissipation in the network is: (cid:32)(cid:80) ij q γγ +1 ij (cid:33)

1+ 1 γ C γ . This Theorem is equivalent to Murray’s law [8]: it is based on assigning each edgethe conductance that minimizes the overall network dissipation.

Proof.

Fixing ﬂows, we minimize the total dissipation over conductances obeyingthe building constraint (cid:80) κ γij = C . That is we minimize the overall function:(5.2) D ( κ ij ) = (cid:88) ij q ij κ ij + λ (cid:88) ij κ γij − C  where the Lagrange multiplier λ maximizes the dissipation and we restrict to edgeson which q ij (cid:54) = 0. The minimization of D is performed on a compact set ( κ ij ≥ (cid:80) ij κ γij = C ) so the minimum certainly exists. Since D → ∞ whenever κ ij = 0 sothe optimal value of D occurs at an interior point within this set. So at the minimumpoint:(5.3) 0 = ∂D∂κ ab = − q ab κ ab + λγκ γ − ab PTIMAL MIXING IN TRANSPORT NETWORKS κ ab ∝ q γ +1 ab (Murray’s law), and we ﬁnd our constant ofproportionality by imposing the constraint (cid:80) ab κ γab = C :(5.4) κ ab = C /γ q γ +1 ab (cid:18)(cid:80) ij q γγ +1 ij (cid:19) /γ . Substituting for κ ab in the dissipation yields the required inequality. Our main results will concernnetworks that are close to paths; for example path networks that have low conductanceexcursions adjoined to some of the path nodes. How much do these additions aﬀectthe network’s mixing entropy? Thinking more generally, we consider networks inwhich some edges are strong, and others are weak (we will deﬁne strong and weak)and bound the contribution of the weak nodes to the network entropy.

Definition

Let , , . . . , N be a labelling of the nodes in the network G indecreasing order of total ﬂow f i (that is; f ≥ f ≥ · · · ≥ f N ). Select < δ < whichwill be referred to as the dominance factor . Let k = min { i : δf i > f i +1 } . Thenodes , , . . . , k are referred to as the strong nodes above dominance factor δ ,denoted F δ . Theorem

Let (cid:15) > and let q ij be a ﬂow network on an ambient network G with nodes N . Then there exists δ > , depending only on G , such that | H − H F δ | < (cid:15) . In addition, δ may be chosen such that for each node i ∈ F δ thenodes adjacent to the largest magnitude in-ﬂow at i and the largest magnitude out-ﬂow at i are also strong nodes. That is, if u, v ∈ n ( i ) are such that q ui = max j ∈ n ( i ) q ji and q iv = max j ∈ n ( i ) q ij then u, v ∈ F δ .Proof. Let δ > F = F δ to bethe strong nodes in N over dominance factor δ . By the triangle inequality, we boundthe diﬀerence(5.5) | H − H F δ | ≤ (cid:88) i ∈F f i | H i − H F δ i | + (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:88) i (cid:54)∈F δ f i H i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . First we bound the ﬁrst sum on the right-hand side, a sum over the absolute diﬀerencebetween the diﬀerent mixing entropies. Let i ∈ F δ and j (cid:54)∈ F . Then(5.6) P i ( j ) = ˜ q ji (cid:80) k (cid:54)∈F δ ˜ q ki + (cid:80) k ∈F δ ˜ q ki < f i δf i < δ. P F δ i is obtained by omitting fewer than N = |N | states from P i , each with probabilityless than δ , and then renormalizing to give a new probability distribution. Sinceentropy is uniformly continuous on the simplex (cid:110)(cid:80) Ni =1 p i = 1 , p i ≥ (cid:111) , we can choose δ so that | H i − H F i | < (cid:15) N so (cid:80) i ∈F f i | H i − H F i | < (cid:15) .We now bound the magnitude of the second term on the right-hand side. H F δ i is an entropy of a random variable taking on less than N = |N | values. Therefore | H F i | < log N . The total ﬂow through each node, f i ≤ δ for all i (cid:54)∈ F δ . Hence, wehave(5.7) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:88) i (cid:54)∈F δ f i H i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) < (cid:88) i (cid:54)∈F δ δ log N < N δ log N. C. MENTUS AND M. ROPER

And so we can choose δ so that the second term is bounded by (cid:15) . To complete theproof, note that the magnitudes of the largest in- and out-ﬂows are ≥ f i deg i ≥ f i K where K is the largest degree of a node in G . Thus, so long as δ < K the nodes connectedto the largest in- and out-ﬂows of degree i have total ﬂows > K f i > δf i meaning theyare also strong nodes.We refer to the network formed by linking the nodes F δ up using the edges carryingthe maximum inﬂow and outﬂow at each node as the strong network, and re-usenotation by using F δ to represent the strong network. Θ over paths. Anticipating our proof in Section 6 thatoptimal networks are paths for suﬃciently small γ , we start by restricting our op-timization to paths. When restricted to path networks Θ( c ) = min m ( − H ( τ m ) + cD ( τ m )), i.e. Θ is the lower envelope of straight lines. We ﬁrst ask, if c is varied,does the sequence of Θ-minimizing paths always recapitulate Fig. 5; i.e. start witha tour (at vanishingly small c ) and end at large, ﬁnite c with a geodesic, with theintermediate states being paths whose length increases by 1, at ﬁnite and predictable c values. We can rationalize this sequence as follows: For a uniform conductance pathof length m each edge carries ﬂow 1, and has conductance ( C/ ( m − /γ , so thetotal dissipation is D ( τ m ) = C − /γ ( m − /γ , which increases monotonically in m . Increasing c increases the relative strength of dissipation to mixing in Θ. Mixingfavors tours and, more generally, paths that visit as many nodes as possible, whiledissipation favors shorter paths. At each c , the optimal path length emerges from thebalance of these two competing eﬀects.Two paths of diﬀerent lengths: τ m and τ n , give rise to straight lines c (cid:55)→ − H ( τ m )+ cD ( τ m ) and c (cid:55)→ − H ( τ n ) + cD ( τ n ), with diﬀerent slopes. Denote the point of inter-section between the lines by c m,n :(5.8) K − γ c m,n = log( n !) − log( m !)( n − γ − ( m − γ = log(Γ( n + 1)) − log(Γ( m + 1))( n − γ − ( m − γ . Lemma

The point of intersection c m,n is monotonic decreasing in both m and n , for m, n ≥ .Proof. Let x m = ( m − γ , then:(5.9) c m,n = log (cid:16) Γ (cid:16) x γγ +1 n (cid:17)(cid:17) − log (cid:16) Γ (cid:16) x γγ +1 m (cid:17)(cid:17) x n − x m . So c m,n is the slope of the secant from ( x m , f ( x m )) to ( x n , f ( x n )) where f ( x ) =log (cid:16) Γ (cid:16) x γγ +1 n (cid:17)(cid:17) . Since f is an increasing function we need to show it is concavein order to show that these secant slopes decrease as either x n or x m increases.Given f = log( u ( x )) where u ( x ) = Γ (cid:16) x γγ +1 (cid:17) we have that(5.10) f (cid:48)(cid:48) ( x ) = u (cid:48)(cid:48) ( x ) u ( x ) − ( u (cid:48) ( x )) ( u ( x )) . To show that f is concave, we then need that u (cid:48)(cid:48) u − ( u (cid:48) ) <

0. To compute thesederivatives recall ddx

Γ = ΓΨ where Ψ is the digamma function. The trigammafunction Ψ is deﬁned to be Ψ (cid:48) , and so d dx Γ = (Ψ + Ψ )Γ. Pulling these results PTIMAL MIXING IN TRANSPORT NETWORKS u (cid:48)(cid:48) u − u (cid:48) = γ (cid:18) γ + 1 (cid:19) x − γ +1 (cid:16) γ Ψ − x − γγ +1 Ψ (cid:17) Γ Let z = x γγ +1 + 2. Then z is an increasing function of x and visa versa. Since all ofthe other multipicative terms in the expression u (cid:48)(cid:48) u − u (cid:48) are positive we only need toshow that γ Ψ − x − γγ +1 Ψ is negative for all z = x γγ +1 n + 2 = n − ≥

3. We have γ Ψ − x − γγ +1 Ψ ≤ Ψ − x − γγ +1 Ψ = Ψ ( z ) − z − ( z ) , ≤ z + 1 z + 1 z ( z − − log zz − , = (cid:18) − z − log z (cid:19) z − . Here we made use of the inequalities [14] Ψ ( z ) ≥ log z − z , and Ψ ( z ) ≤ z + z [14]for all z >

0. The last line is < z ≥

3, proving the lemma.

Theorem

When Θ is optimized among paths, on a triangular ambient gridwith n nodes, and c is increased from 0, the optimal path decreases in length by 1 atpredictable values of c : c n,n − , c n − ,n − , c n − ,n − . . . . That is: τ n for c < c n,n − , τ n − for c n,n − < c < c n − ,n − , τ n − for c n − ,n − < c < c n − ,n − and so on.Proof. The theorem follows directly from the monotonicity property proven inLemma 5.7. We have already shown that the tour τ n is the optimal path at c = 0.As c is increased, the line Θ( τ n ) intersects with all lines Θ( τ m ) for m < n , at c n,m .Because of monotonicity, the smallest of these points of intersection is c n,n − . Thus τ n is the optimal path for c < c n,n − . Θ( τ n − ) intersects with all lines Θ( τ m ) for m < n at c n − ,m . The ﬁrst point of intersection is c n − ,n − . So τ n − is replaced by τ n − , and in turn by τ n − and so on.

6. All optimal networks are paths for suﬃciently small γ . Now we provethat for c m +1 ,m < c < c m,m − , networks with a unit source-sink pair optimizing H + cD are approximately paths of length m in the limit as γ →

0. Our proof worksfor any subinterval of ( c m +1 ,m , c m,m − ). The parameter 0 < σ < c m +1 ,m , c m,m − ] covered by the subinterval. We can also represent thesubinterval by c m +1 ,m + ρ < c < c m − ,m − ρ , where ρ = (1 − σ )( c m − ,m − c m +1 ,m ). Theorem

Let G be an ambient network with a single unit-ﬂow source andsink. Let m be a possible length of a path in G connecting the source to the sink. Let < σ < . We claim that there exists (cid:15) > and Γ > such that if δ > and F δ is the network of strong nodes such that | H F δ − H | < (cid:15) provided by by Theorem 5.6,and the material cost exponent γ < Γ , then for any c m +1 ,m + ρ < c < c m,m − − ρ thenetwork F δ is a path of length m .Proof. We can simplify the calculations in our proof by appealing to the resultfrom Section 2.7, that the sequence of optimizers is identical as c is varied for anyvalue of C . Accordingly we consider the special case C = m −

1. For this choice ofmaterial cost C , D ( τ n ) = ( m − − /γ ( n − /γ . Then the computation of Θ onpaths is is drastically simpliﬁed:(6.1) D ( τ n ) →  n < mm − n = m ∞ if n > m . C. MENTUS AND M. ROPER − log( m − − log m ! − log( m + 1)! c m,m − c m,m − − ρ c m +1 ,m + ρ Θ Θ ( τ m +1 ) Θ ( τ m − ) Θ ( τ m ) c Fig. 7: Θ-against- c loci for the simple paths τ m − (dotted, black), τ m (dashed, black), τ m (solid, gray), in the limit as γ → τ m − , τ m , τ m +1 are shown in Fig. 7. Deﬁne ρ as above, for the ﬁxedmaterial cost C = m −

1. Let Γ be such that we can choose (cid:15) with Θ( τ m )+ (cid:15) < Θ( τ m ± )for all c m +1 ,m + ρ < c < c m,m − − ρ and γ < Γ. Let q ij be a ﬂow on G with thespeciﬁed source and sink. According to Theorem 5.6, we can deﬁne δ > F δ a network of strong nodes such that | H ( F δ ) − H ( q ij ) | < (cid:15) . F δ has noleaf nodes except, potentially, the source and the sink. Suppose F δ contains n nodes.Then by Euler’s topological formula, it must contain n − F edges, where F isthe number of faces in the network (given the constraints on F δ , F = 0 if and onlyif F δ is a path). Hence − H ( F δ ) ≥ log n !. Each edge must carry, at minimum,ﬂow δ n − F . Accordingly, the dissipation in the network can be bounded below by D ∗ ( n ) = δ n − F ) (cid:16) n − Fm − (cid:17) /γ by Theorem 5.4. D ∗ ( n ) → ∞ as γ → n + F >m . Since networks with bounded dissipation exist, the optimal network must have n + F ≤ m . We can then compare the strong network with τ m − , and τ m . If n ≤ m − F δ ) ≥ − H ( F δ ) ≥ − log n ! ≥ − log( m − γ → Θ( τ m − ).This implies that Θ( q ij ) > Θ( τ m ), and so q ij is in fact sub-optimal. Therefore n = m .Since n + F ≤ m , F = 0, i.e. F δ is a path of length m .By choice of (cid:15) , F δ can approximate F arbitrarily closely in Θ. Since our convergenceresult can be made uniform in (cid:15) over the interval c m +1 ,m + ρ < c < c m,m − − ρ , itfollows that F converges to some path τ m , as γ →

0, except possibly at the points c m,m − . Our proof method does not provide us with a way to prove convergence atthese points, but based on our numerical simulations, we think it is likely that as γ → τ m and τ m − , with indistinguishable Θ values atthese crossover c -values.

7. Discussion.

We introduced and analyzed theoretically and by numerical sim-ulations two measures of mixing quality on networks, one measuring the diversity ofplaces within a network that may be reached by cues originating within that network(sender entropy), and the other reﬂecting the diversity of cues that are received ateach point within the network (receiver entropy). Happily, we were able to show thatsender entropy for a network is equivalent to the receiver entropy on the same net-work if ﬂows are reversed, allowing us to focus on optimizing just one kind of entropy

PTIMAL MIXING IN TRANSPORT NETWORKS γ optimizingeither entropy will produce identical networks, at biologically relevant values of γ ,which type of mixing is most important to the network inﬂuences important featuresof its organization, such as the placement of loops.We proved that in the single source-single sink geometry, the optimal networksconverge to simple paths joining source to sink, with the path length determined by thediﬀerent priorities that the network gives to mixing (which favors long paths) and todissipation (which favors short paths). Intriguingly, our numerical simulations suggestthat there is a ﬁnite value of γ , which for our 5 × γ →

0, and a bifurcation that removes edges at a ﬁnite value of γ , so we mustbe cautious about interpreting the disappearance of loops at ﬁnite γ as evidence of aphase transition in the network, analogous to the disappearance of loops at γ = 1 fordissipation-minimizing networks [7].We have narrowly focused on the case where there is only one source and onesink within the network, allowing us to rigorously validate our numerical results.However, our numerical optimization method is equally applicable to networks withmultiple sources and sinks, and it is possible to explore for example the conditionsunder which a network that is transporting material from a pair of sources to a pairof sinks, will determine to maintain two separate ﬂows, or bring these ﬂows together[18]. In particular, determining whether real network forming organisms such as fungiand slime molds have mixing-optimizing networks will require that we properly modelthe locations of the sources and sinks that drive their ﬂows.It is equally important when comparing optimal mixing networks with real biolog-ical networks to pin down the value of γ for these networks. On theoretical grounds,we expect real biological networks in which vessels are simple tubes to operate in arange of γ from γ = 1 / γ = 1 / γ is impossible, the branching hierarchies of xylemvessels in plants and some levels of cellular tubes in the slime mold Physarum poly-cephalum obey Murray’s law (Theorem 5.4) [17, 1]. Speciﬁcally if Q ∝ κ γ +12 , then (cid:80) κ γ +12 will be conserved between diﬀerent levels of a hierarchical network. For sim-ple tubes, we may assume the Hagen-Poiseuille law (that the conductance of a vesseland its radius, a are related by κ ∝ a ), it follows that (cid:80) a γ +2 is conserved.In P. polycephalum , (cid:80) r α is conserved across diﬀerent levels of the hierarchy, witha range of α values between 2 . − . . < α < . α valuesare similar for plants, but determining γ from vessel radii is complicated by the factthat the xylem vessels (like the cords of mycorrhizal fungal networks) are constitutedof many smaller tubes. Suppose these tubes have radius A but individually obey theHagen-Poiseuille law, then: κ ∝ a A . The total number of tubes at the same levelin the hierarchy is reported to increase by a factor F ≈ . a decreases by a factor of 2 − /α when one tube2 C. MENTUS AND M. ROPER splits into two then A ∝ a − α + α log F , and so α = ( γ + 1) (cid:0) − α + α log F (cid:1) . Wetherefore estimate that the plants in [17] have γ values ranging from 0.8 for Fraxinuspensylvanica ( F = 1 . α = 2 .

2) up to 1.4 for

Campsis radicans ( F = 1 . α = 3).So slime molds span the value of γ ≈ . γ (cid:38) .

5, qualitatively resemble thereal structures of migrating slime mold networks, in which densely interconnected‘fans’ of tubes are linked together by sparsely connected or even loopless networks(see e.g. Fig. 1 in [3]). In future work, we plan to analyze the optimal loopy networksfound by our algorithm to determine why optimal mixing requires fans (loopy regions)don’t appear throughout the network but are located only near the source, as wellas to understand how the tradeoﬀs between mixing and dissipation can be used topredict the size of the fan relative to the total length of the network.That real network forming organisms do not form tours may result from their γ values being too high. However, even at low values of γ networks face other tradeoﬀs,such as resistance to damage and or the need to minimize dissipation when the sourcesand sinks ﬂuctuate in strength [13]. An additional property that must be highlightedfor organisms such as fungi and slime molds that have indeterminate growth is thatorganisms need to maintain their mixing while growth pushes sources and sinks everfurther apart. A tour can be extended indeﬁnitely to include to an ever increasingnumber of nodes by extending it node by node. However, this model of growth ex-tends the network only by adding a single edge at a time, restricting growth to asingle growing tip and is an ineﬃcient strategy for a fungus or other foraging organ-ism, that must compete for space and resources with other organisms. The type ofnetwork formed by a network is also shaped by the constraints on how it must formthis network. Fast foraging may favor growth in multiple directions simultaneously,facilitated by the organism having multiple growing tips. Thus optimization prin-ciples such as those developed in this paper only achieve true biological relevancewhen linked to a set of rules that a growing organism can follow to attain the optima.Such rules have been only recently elucidated for dissipation minimizing networks (seee.g. [12]), leaving unmet the challenge of constructing rules to achieve more complexobjectives, including mixing. Acknowledgments.

We thank Karen Alim, Eleni Katifori and Sebastien Rochfor many useful discussions at a sequence of Square Meetings hosted by the AmericanInstitute for Mathematics, where the idea for this project was developed.

Appendix A. Computation of derivatives of Θ . To diﬀerentiate Θ we compute the all of intermediate variables appearing inEq.(3.1): i.e. p i , q ij , f i , T ij , P ij , ˜ q ij and N i . The pressures p i are ﬁrst obtained bysolving Eqn. 2.2, using the Matlab function mldivide. We then solve a chain ofequations to obtain the Lagrange multipliers: ∂ Θ ∂Ni =0 −−−−→ α ∂ Θ ∂ ˜ qab =0 −−−−−→ γ ∂ Θ ∂Pab =0 −−−−−→ µ ∂ Θ ∂Tab =0 −−−−−→ λ ∂ Θ ∂fa =0 −−−−→ β ∂ Θ ∂pa =0 −−−−→ ν . (A.1)First:(A.2) ∂ Θ ∂N a = (cid:88) i (cid:18) − ˜ q ia N a log (cid:18) ˜ q ia N a (cid:19) − ˜ q ia N a (cid:19) − α a = 0 . PTIMAL MIXING IN TRANSPORT NETWORKS ∂ Θ ∂ ˜ q ab = f b N b log (cid:18) ˜ q ab N b (cid:19) + f b N b + α b − γ ab = 0Third:(A.4) ∂ Θ ∂P ab = µ ab − (cid:88) i µ ib T ia + γ ab f a = 0so:(A.5) γ ab = (cid:32)(cid:88) i µ ib T ia − µ ab (cid:33) /f a Fourth:(A.6) ∂ Θ ∂T ab = − (cid:88) j ∈N µ aj P bj − λ ab = 0Fifth:(A.7) ∂ Θ ∂f a = (cid:88) j :˜ q ja > ˜ q ja N a log (cid:18) ˜ q ja N a (cid:19) + (cid:88) j ∈N γ aj P aj − (cid:88) j ∈ n ( i ) λ aj q aj q aj > f a − β a = 0 . Sixth, to calculate ∂ Θ ∂p a we make use of the results ∂∂p a q ai = κ ai and ∂∂p a q ia = − κ ia .Thus: ∂ Θ ∂p a = (cid:88) j κ aj ( ν a − ν j ) + (cid:88) i ∈ n ( a ) ( β a κ ai q ai > − β i κ ai q ia > )+ (cid:88) i ∈ n ( a ) (cid:18) λ ai κ ai q ai > f a − λ ia κ ai q ia > f i (cid:19) = 0 . (A.8)Thus solving for the Lagrange multipliers ν i requires solving a Poisson equation onthe network similar to Eqn. 2.2. Appendix B. Finding adjacent ﬂow topologies.

We assume that the network of non-zero conductances has a single connected com-ponent, because although very small conductances are treated as negligible throughoutour algorithm, they remain large enough to keep the Laplacian rank complete. Wetake the inverse of the version of the Laplacian deﬁned in Section 2.1, ˜∆ κ for theinitial network. We compute the directions of ﬂow on each edge within the network(edges with low ﬂows are ignored). The set of networks with the same directions ofﬂow constitutes one of the watersheds shown in Fig 2. We systematically vary oneconductance κ ab within the network to ﬁnd an adjacent watershed – i.e. a ﬂow net-work in which some subset of the non-negligible ﬂows have been reversed. We ﬁndthe threshold values for κ ab at which one or more ﬂow directions are reversed, byappealing to the Sherman-Morrison formula [25] (we thank Eleni Katifori for bringingthe S.M. formula to our attention). Speciﬁcally, if the conductance in edge ( a, b ) isincreased to κ ab + t , then the Laplacian for the new network becomes(B.1) ˜∆ ˜ κ ij = ˜∆ κ ij + t ( e a − e b )( e a − e b ) T . C. MENTUS AND M. ROPER

Then the Sherman-Morrison formula yields˜∆ − κ ij = (cid:16) ˜∆ κ ij + t ( e a − e b )( e a − e b ) T (cid:17) − = ˜∆ − κ ij − ˜∆ − κ ij t ( e a − e b )( e a − e b ) T ˜∆ − κ ij e a − e b ) T ˜∆ − κ ij t ( e a − e b ) . (B.2)Given another edge ( u, v ), We wish to ﬁnd a perturbation to κ ab such that theﬂow along ( u, v ) is reversed. Let R i be the i th row of ˜∆ − κ ij , and d ij be the i, j entryof ˜∆ κ ij . Then the pressure drop is given by:(B.3) ˜ p u − ˜ p v = ( R u − R v ) Q − t ( d au − d av − d bu + d bv ) ( R a − R b ) Q t ( d aa − d ab − d ba + d bb ) . Therefore the pressure drop is a monotonic function of t so the the zero of this equationis where the pressure reverses. Setting the left side to 0 we get the value t = t abuv atwhich ﬂow reversal occurs:(B.4) t abuv p u − p v ( d au − d av − d bu + d bv ) ( p a − p b ) − ( d aa − d ab − d ba + d bb ) ( p u − p v ) . REFERENCES[1]

D. Akita, I. Kunita, M. D. Fricker, S. Kuroda, K. Sato, and T. Nakagaki , Experimentalmodels for murray’s law , Journal of Physics D: Applied Physics, 50 (2016), p. 024001.[2]

D. Akita, I. Kunita, M. D. Fricker, S. Kuroda, K. Sato, and T. Nakagaki , Experimentalmodels for murray’s law , Journal of Physics D: Applied Physics, 50 (2017), p. 024001.[3]

K. Alim , Fluid ﬂows shaping organism morphology , Philosophical Transactions of the RoyalSociety B: Biological Sciences, 373 (2018), p. 20170112.[4]

K. Alim, N. Andrew, and A. Pringle , Physarum , Current Biology, 23 (2013), pp. R1082–R1083.[5]

K. Alim, N. Andrew, A. Pringle, and M. P. Brenner , Mechanism of signal propagationin physarum polycephalum , Proceedings of the National Academy of Sciences, 114 (2017),pp. 5136–5141.[6]

D. P. Bebber, J. Hynes, P. R. Darrah, L. Boddy, and M. D. Fricker , Biological solutionsto transport network design , Proceedings of the Royal Society B: Biological Sciences, 274(2007), pp. 2307–2315.[7]

S. Bohn and M. O. Magnasco , Structure, scaling, and phase transition in the optimal trans-port network , Physical review letters, 98 (2007), p. 088702.[8]

S.-S. Chang and M. Roper , Minimal transport networks with general boundary conditions ,SIAM Journal on Applied Mathematics, 78 (2018), pp. 1511–1535.[9]

S.-S. Chang and M. Roper , Microvascular networks with uniform ﬂow , Journal of theoreticalbiology, 462 (2019), pp. 48–64.[10]

F. Corson , Fluctuations and redundancy in optimal transport networks , Physical Review Let-ters, 104 (2010), p. 048703.[11]

M. Durand , Structure of optimal transport networks subject to a global constraint , PhysicalReview Letters, 98 (2007), p. 088701.[12]

D. Hu and D. Cai , Adaptation and optimization of biological transport networks , Physicalreview letters, 111 (2013), p. 138701.[13]

E. Katifori, G. J. Sz¨oll˝osi, and M. O. Magnasco , Damage and ﬂuctuations induce loopsin optimal transport networks , Physical Review Letters, 104 (2010), p. 048704.[14]

A. Laforgia and P. Natalini , Exponential, gamma and polygamma functions: Simple proofsof classical and new inequalities , Journal of Mathematical Analysis and Applications, 407(2013), pp. 495–504.[15]

R. R. Lew , How does a hypha grow? the biophysics of pressurized growth in fungi , NatureReviews Microbiology, 9 (2011), p. 509.[16]

S. Marbach, K. Alim, N. Andrew, A. Pringle, and M. P. Brenner , Pruning to increasetaylor dispersion in physarum polycephalum networks , Physical review letters, 117 (2016),p. 178103.PTIMAL MIXING IN TRANSPORT NETWORKS [17] K. A. McCulloh, J. S. Sperry, and F. R. Adler , Water transport in plants obeys murray’slaw , Nature, 421 (2003), pp. 939–942.[18]

C. Mentus , Information Theoretic and Statistical Models for Spatial Transportation Networks:Total Mixing Entropy on Optimal Fluid Flow Networks and Time Dependent StochasticBlock Models , PhD thesis, UCLA, 2019.[19]

C. D. Murray , The physiological principle of minimum work applied to the angle of branchingof arteries , The Journal of general physiology, 9 (1926), p. 835.[20]

D. F. Plaza, S. S. Schmieder, A. Lipzen, E. Lindquist, and M. K¨unzler , Identiﬁcation ofa novel nematotoxic protein by challenging the model mushroom coprinopsis cinerea witha fungivorous nematode , G3: Genes, Genomes, Genetics, 6 (2016), pp. 87–98.[21]

H. Ronellenfitsch and E. Katifori , Global optimization, local adaptation, and the role ofgrowth in distribution networks , Physical review letters, 117 (2016), p. 138301.[22]

M. Roper, C. Ellison, J. W. Taylor, and N. L. Glass , Nuclear and genome dynamics inmultinucleate ascomycete fungi , Current biology, 21 (2011), pp. R786–R793.[23]

M. Roper, A. Simonin, P. C. Hickey, A. Leeder, and N. L. Glass , Nuclear dynamics in afungal chimera , Proceedings of the National Academy of Sciences, 110 (2013), pp. 12875–12880.[24]

C. E. Shannon , A mathematical theory of communication , Bell system technical journal, 27(1948), pp. 379–423.[25]

J. Sherman and W. J. Morrison , Adjustment of an inverse matrix corresponding to a changein one element of a given matrix , The Annals of Mathematical Statistics, 21 (1950),pp. 124–127.[26]

T. Tanyimboh and A. Templeman , Calculating maximum entropy ﬂows in networks , Journalof the Operational Research Society, 44 (1993), pp. 383–396.[27]

A. Tero, S. Takagi, T. Saigusa, K. Ito, D. P. Bebber, M. D. Fricker, K. Yumiki,R. Kobayashi, and T. Nakagaki , Rules for biologically inspired adaptive network de-sign , Science, 327 (2010), pp. 439–442.[28]