Optimal Mixing in Transport Networks: Numerical Optimization and Analysis
OOPTIMAL MIXING IN TRANSPORT NETWORKS: NUMERICALOPTIMIZATION AND ANALYSIS ∗ CASSIDY MENTUS † AND
MARCUS ROPER † Abstract.
Many foraging microorganisms rely upon cellular transport networks to deliver nu-trients, fluid and organelles between different parts of the organism. Networked organisms rangingfrom filamentous fungi to slime molds demonstrate a remarkable ability to mix or disperse mole-cules and organelles in their transport media. Here we introduce mathematical tools to analyze thestructure of energy efficient transport networks that maximize mixing and sending signals originatingfrom and arriving at each node. We define two types of entropies on flows to quantify mixing anddevelop numerical algorithms to optimize the combination of entropy and energy on networks, givenconstraints on the amount of available material. We present an in-depth exploration of optimal singlesource-sink networks on finite triangular grids, a fundamental setting for optimal transport networksin the plane. Using numerical simulations and rigorous proofs, we show that, if the constraint onconductances is strict, the optimal networks are paths of every possible length. If the constraint isrelaxed, our algorithm produces loopy networks that fan out at the source and pour back into a singlepath that flows to the sink. Taken together, our results expand the class of optimal transportationnetworks that can be compared with real biological data, and highlight how real network morpholo-gies may be shaped by tradeoffs between transport efficiency and the need to mix the transportedmatter.
Key words.
Transportation networks, Numerical optimization, Mixing entropy, Cellular net-works
AMS subject classifications.
1. Introduction.
Work by Murray in the 1920s [19] first probed the idea thatvessels in biological transportation networks may optimize knowable target functions.Murray hypothesized that blood vessels may have optimal radii are set by tradeoffsbetween the need to minimize friction within the vessel (which favors large vessels),and the energetic cost of maintaining the vessel (which penalizes large vessels). Thescalings and geometric relationships that he derived from this trade-off have foundsome experimental support for the blood networks of animals [28] and water transportnetworks of plants [17]. More recent theoretical work has extended the analysis ofsingle vessels or branch points to whole networks of vessels in which the sources andsinks are prescribed but the network is given many choices for how to connect thesepoints [7, 11], added damage or fluctuations in source and sink strengths [13, 10], ordeveloped models for how feedbacks between flows and network growth allow suchoptimal networks to be grown [12, 21].Hundreds of thousands of species of microorganisms, including slime molds, watermolds and fungi rely on internal transportation networks. These networks have similarfunctions – they continuously grow as the organism claims territory or searches forhosts or resources. Within the network nutrients, fluid and cellular matter (includingnuclei and other organelles) are transported from sites of production or uptake to sitesof utilization. Minimization of friction, in conjunction with robustness to damage,appears to underlie features of some of the foraging networks made for example bywood rotting basidiomycete fungi [6] and slime molds [27]. However, organisms build ∗ Submitted to the editors on July 28, 2020.
Funding:
This work was funded by the National Science Foundation under grant no. DMS–1351860. † a r X i v : . [ n li n . AO ] J u l C. MENTUS AND M. ROPER networks with a tremendous diversity of morphologies that can not be explainedby friction minimization alone. Do these morphologies emerge from other physicalprinciples besides minimizing friction, from constraints on the pathways used to growthe network, or from neutral differences in network morphology that do not affect theorganism’s fitness? We start from the position that to understand the extent of therole that optimization plays in determining the structure of networks, we must firstunderstand what the optimal network is for a given target function. This approachpreviously guided us to develop gradient-descent methods for optimizing networks forarbitrary differentiable functions [9].In this work we focus on a quantity with many points of non-differentiability:the amount of mixing occurring within the network. This quantity, which is giventwo different quantifications below, is non-differentiable in the conductances of thenetwork at any point where the flow in an edge goes to 0. Since the optimizationof the network requires searching over possible topologies for flow; i.e. reversing thedirections of flow on edges, we develop here a new numerical optimization methodthat is adapted to deal with this pervasive non-differentiability.Why are might real networks seek to maximize mixing? Three kinds of mixingseem to be relevant to network-forming microorganisms:1. In fungal networks cellular growth occurs at the periphery of the networkthrough the continuous extension of hyphae at their tips, and in fast growing fungi,such as the model organism
Neurospora crassa growth requires the continuous supplyof nuclei and other organelles to the edge of the mycelium [15]. Within
N. crassa nucleioften take tortuous and multidirectional paths toward the tips, and the network isknown to be organized so that pairs of nuclei that start close together within themycelium are unlikely to be delivered to the same site of growth at the periphery,potentially to stop deleterious mutations accumulating in one region of the fungus[23].2. Recent experiments in the dung fungus
Coprinopsis cinerea show large swathesof the network responding to the external threat of predatory nematode worms. Whennematode grazing is detected in one part of the fungal network, a suite of defensechemicals is expressed, not just at the site of grazing, but spreading in multipledirections through the network [20]. Spreading out nematoxin production may prepareother parts of the network for further attacks or enable the cost of labor to be spreadthrough the network [22].3. Plasmodial slime molds, such as
Physarum polycephalum live in heterogeneousenvironments containing patches of nutrients [4]. The network remodels globallywhen it discovers a new nutrient source, and it is thought that individual tubes in thenetwork respond to a cue carried within the flow carried within the network [5]. Aglobal response to this cue requires that it be dispersed through the entire network.In Section 2 we define an entropy of mixing within the network. In Section 3 wedescribe a numerical method for choosing the conductances within the network, andin Section 4 we show simulation results. A highlight result is that for small valuesof the parameter, γ , which represents the penalty of dividing one edge into two, theoptimal networks become a set of paths linking source and sink. We prove why pathsare favored, and analytically expose the set of possible path optima in Section 5.
2. Mathematical model and mixing entropies.2.1. Movement of signals through a flow network.
Our mathematicalmodel for the biological transportation network consists of a network (graph) withnodes (vertices) N , enumerated 1 , , . . . , N , and edges E . The nodes are arranged on PTIMAL MIXING IN TRANSPORT NETWORKS κ ij with each node and edge conductance labeled. Thesource is at node 1, the bottom-left corner, and the sink is at node 25, the top rightcorner.a regular triangular lattice, so that each vertex in the interior is linked by equal lengthedges to 6 neighbors (we write n ( i ) for the set of neighbors of i ). The conductance ofthe edge ( i, j ) is denoted by κ ij . Fluid (protoplasm) is continually pushed through thenetwork by pressure differences between the nodes. In our model the ultimate originof these pressure differences are flows into and out of the network via diametricallyopposite nodes. Definition
The rate of fluid entering or exiting the network at i ∈ N is the boundary flow at node i , and is denoted Q i . Boundary flow Q i > i (i.e.the node is a source), and Q i < i (i.e. the node is a sink). The total volume of fluid contained in the network isconstant, so total inflows and outflows must be balanced: (cid:80) i Q i = 0. The boundaryflows in turn engender flows, q ij , on the edges. Flows must also be balanced on eachnode in the network, a fact that is known as Kirchhoff’s first law of circuits: Definition
A flow is called compatible with regards to the boundary flows Q i if (cid:80) j ∈ n ( i ) q ij = Q i for all i ∈ N . For any set of boundary flows, there are typically multiple compatible flows on thenetwork. The flow we are interested in, called the physical flow , is the uniquecompatible flow that minimizes the dissipation:
Definition
For a conductance network κ ij , the dissipation D from flows q ij is the rate at which work must be done to maintain the fluid flows on all edges ofthe network: D ( q ij ) = (cid:80) ij q ij κ ij . The flow that minimizes the dissipation can be derived from Kirchhoff’s first andsecond laws [8], which introduce a pressure variable that is defined on each node inthe network:
C. MENTUS AND M. ROPER
Proposition
Kirchoff ’s second law for circuits
Let κ ij be a connectedconductance network with nodes N and edges E . Let Q be boundary flows such that (cid:80) i Q i = 0 . Let q ij be the physical flows of this network. Then there exists p i ∈ R ,called the pressure at node i , such that q ij = κ ij ( p i − p j ) . We can compute the pressures by defining a vector of pressures p = { p i } i ∈N , a vectorof boundary flows Q = { Q i } i ∈N and the network Laplacian ∆ κ , a |N | × |N | matrixwith entries(2.1) ∆ κ,ij = (cid:26) − κ ij if i (cid:54) = j (cid:80) j κ ij if i = j Then conservation of mass at each node (Proposition 2.4) is equivalent to solving(2.2) ∆ κ p = Q . So long as every connected component of a physical network has one node with adefined pressure, the pressures are uniquely solvable, otherwise they are solvable upto a single additive constant per connected component[8]. When the conductancenetwork is connected and the pressure at node i is known: p i = P , we add P to Q i and construct the invertible matrix ˜∆ κ by adding 1 to ∆ κ,ii .We now consider the mixing produced by the flows within the network. Ourflow network model contains all of the scenarios for mixing described in Section 1.The signals passing through the network could represent genetically diverse nuclei(scenario 1), or chemical cues (scenarios 2 and 3). Our model does not need torepresent the entire network, it could represent the portion of network that supplies asingle hyphal tip. This supply network would be linked to supply networks for othertips, and acquires signals, randomly at each node from these other networks . Signalsare made up of blobs: either molecules or organelles. We compile a list of the nodesvisited by each signal blob: call the t -th node visited by a signal, x t . Then x t is arandom walk, with transition probability:(2.3) T ij ≡ P ( x t +1 = j | x t = i ) = q ij (cid:80) k ∈ n ( i ): q ik > q ik − Q i Q i < . that is, the flow of signal from i to j is simply proportional to the total flow alongthe edge ( i, j ). Effectively we assume that signal is uniformly dispersed in the flowingprotoplasm, ignoring any physical effects such as diffusion [16] that move signalsindependently of flows. When the signal reaches a sink node it may exit the modelednetwork (with the exit probability proportional to − Q i , so (cid:80) j T ij ≤ i performs a random walk down the pressure gradient.There are two senses in which signals may be considered to mix on the network: 1.Given the node i at which it originates we are interested in the number of nodes thatthe signal visits before exiting the network. 2. Alternately, given a node j , we areinterested in the number of different sites of origin that signals passing through j mayhave. To quantify either form of diversity, we must focus on the probability that asignal originating at node i ever visits a node j defined by:(2.4) P ij = P ( x t = j for some t ≥ | x = i ) . Signals can be transferred between supply networks without flow between them, since motorprotein trafficking (of nuclei) or diffusion (of chemical cues) provide alternate transport mechanisms.PTIMAL MIXING IN TRANSPORT NETWORKS P ij from a N × N matrix. To calculate this matrix from the transitionprobabilities T , note that the probability of getting from i to j by following exactly n edges is ( T n ) ij . Hence, P = (cid:80) N − n =0 T n (note that T N = 0, because a signal can visitat most N nodes before exiting the network and signals can not visit the same nodetwice). Alternatively by summing the geometric progression:(2.5) P = ( I − T ) − , where I is the N × N identity matrix. We define two types of information entropyon the flows q ij . The first is a measure of the accumulation of signals at every node inthe network and the second represents the dispersal of signals throughout the network.We call the two entropies, respectively, total receiver entropy (or total mixingentropy ) and total sender entropy . Let f i be the total flow through node i , i.e. f i = (cid:80) j ∈ n ( i ): q ij > q ij + Q i Q i > . The rate at which fluid flows from i to j is then˜ q ij = P ij f i . We refer to this as the flow from i to j . We assume that the rate at whicha signal is produced at a node is proportional to the total flow through that node.This assumption certainly makes sense if our signal consists of new nuclei that aregenerated by divisions within the protoplasm, since the flow through a node will beproportional to the rate at which nuclei pass through it. For other signal productionscenarios (such as when the signal is produced in response to predation), we canarrive at this assumption if we assume that product of the new signal is rate-limitedby a component that is contained within the protoplasm, so signal production rateis proportional to rate of protoplasm cleared through the node in unit time. Underthis assumption the relative proportions of signals received at node j from upstreamnodes i are the same as the relative proportions of ˜ q ij .In our model each site in the network can send signals to other sites in the network,and any point in the network may potentially receive signals from any other point. Wecannot tell ahead of time which nodes will provide the useful signals, so we considerall nodes as possible sources of signals. We also make no assumption about siteswhere diversity needs to be maximized (this is in contrast to [23], in which geneticdiversity was considered only at hyphal tips), so we consider all of the possible nodesthat signals can reach within the network when computing the mixing entropy.To compute the entropy of the distribution of signals arriving at i we define theprobability distribution on up-stream nodes of i :(2.6) P i ( j ) = ˜ q ji N i where N i ≡ (cid:88) j :˜ q ji > ˜ q ji , effectively forming a new matrix from ˜ q ij in which all columns are normalized to sum to1. We may define the local receiver entropy at node i as the Shannon informationentropy of P i : H ( P i ) = − (cid:80) j P i ( j ) log ( P i ( j )). We consider the total flow through i as a measure of the “importance” of the node. In our model, the diversity of signalsis more important at high traffic nodes than at low traffic nodes. This principle isuseful mathematically, since it ensures that rearrangements of very low conductanceedges don’t greatly affect the overall mixing associated with a network. At the sametime, the weighting is intended to reflect the relative biological importance of nodeswithin the network – a node with high flow supplies a greater volume of cytoplasm tothe rest of the network, so it is more important that all of the signals (whether cues C. MENTUS AND M. ROPER or nucleotypes) are present at the node. Hence the total receiver entropy is:(2.7) H = (cid:88) i f i H ( P i ) . Similar to [26] H represents the conditional entropy associated with choosing a re-ceiving node at random with probability proportional to f i and then conditioned onour choice of node i we chose a sending node at random via the distribution P i . Our proofs in Section 5 often requirethat we partition N into subsets of nodes. It is convenient to be able to evaluatethe contributions of each subset to the total network entropy. We define restrictedentropies for subsets F ⊂ N as follows: For all i ∈ F define P F i ( j ) = ˜ q ji (cid:80) k ∈F ˜ q ki if j ∈ F and P F i ( j ) = 0 otherwise. Definition
The local negative mixing entropy restricted to F is defined tobe (2.8) H ( P F i ) = − (cid:88) j ∈F :˜ q ji > P F i ( j ) log ( P F i ( j )) and the total mixing entropy restricted to F is (2.9) H F = (cid:88) i ∈F f i H ( P F i ) . It may also be important for the network tospread out signals to as many downstream nodes as possible. We define an entropyfor the places that can be reached by a new signal originating at a node within thenetwork. Specifically, instead of taking the mass distribution of incoming flows andnormalizing them to a probability distribution, we use the out-going flows. That is wedefine the probability distribution of nodes that can be reached by a signal introducedat node i :(2.10) P (cid:48) i ( j ) = ˜ q ij (cid:80) j :˜ q ij > ˜ q ij . This is equivalent to normalizing the matrix ˜ q so that all rows sum to 1. We definethe local sending entropy at node i to be the Shannon information entropy [24]of the distribution P (cid:48) i :(2.11) H ( P (cid:48) i ) = − (cid:88) j P (cid:48) i ( j ) log( P (cid:48) i ( j )) . and we define the total sending entropy of the entire network to be the weighted sumof the node entropies:(2.12) H (cid:48) = (cid:88) i f i H ( P (cid:48) i ) . Although the en-tropies H and H (cid:48) offer alternate representations of the mixing that occurs within thenetwork, they are linked by an equivalence principle: PTIMAL MIXING IN TRANSPORT NETWORKS Theorem
Let q ij be a flow network compatible with boundary flows Q i . Let q (cid:48) ij and Q (cid:48) i be the flow network and boundary flows obtained from q ij and Q i by re-versing the flows, i.e. q (cid:48) ij = − q ij and Q (cid:48) i = − Q i . Then H (cid:48) ( q ij ) = H ( q (cid:48) ij ) .Proof. Notice that reversing the flows doesn’t affect the flow strengths of nodeswithin the network because f i = (cid:88) j : q ij > q ij + | Q i | Q i < = (cid:88) j : q ij < | q ij | + | Q i | Q i > (2.13) = (cid:88) j : q (cid:48) ij > | q (cid:48) ij | + | Q (cid:48) i | Q (cid:48) i < . (2.14)The equivalence principle boils down to proving the statement ˜ q ij = ˜ q (cid:48) ji where ˜ q (cid:48) ij isthe flow from node i to node j in the flow-reversed network. We derive this equality bycomparing the probability of a signal path x t : t = 0 , , . . . , T under the flow q ij withthe probability of the reversed path x (cid:48) t ≡ x T − t : t = 0 , , . . . , T under the reversed flow q (cid:48) ij : since | q ij | = | q (cid:48) ij | it follows that T ij f i = T (cid:48) ji f j and so T (cid:48) ji = f i f j T ij . We multiplythe probability of the path x t by the strength of the starting node, f x , to obtain f x (cid:81) T − t =0 T x t x t +1 , and rewrite via a telescoping product:(2.15) f x T − (cid:89) t =0 T x t x t +1 = f x T T (cid:89) t =0 f x t f x t +1 T x t x t +1 = f x T T (cid:89) t =0 T (cid:48) x t +1 x t = f x T T (cid:89) t =0 T (cid:48) x (cid:48) t x (cid:48) t +1 . For any nodes i and j in the network, we can sum over the probability of all possiblepaths i to j in the regular network and j to i in the flow-reversed network to obtain:˜ q ij = f i P ij = f j P (cid:48) ji = ˜ q (cid:48) ji . Hence the distributions P (cid:48) i ( j ) for the flow network q ij areequal to the distributions P i ( j ) for the network q (cid:48) ij , so H (cid:48) i ( q ij ) = H i ( q (cid:48) ij ) leading tothe required result.The physical flow on the network (see Proposition 2.4) can be reversed by reversingthe sources and sinks in the network; that is, replacing a source with inflow Q i by asink with outflow Q i , and conversely. In the cases that we will analyze in this paper,the sources and sinks are matched in number and strength (e.g. a single source andsingle sink at opposite corners of a square grid network); so a network that optimizesreceiving entropy can be transformed into a network that optimizes sending entropysimply by rotating the source into the sink and conversely. For this reason, we donot have to develop separate results for the two entropies. We focus on analyzing thereceiving entropy, which we refer to simply as mixing entropy henceforth. Buildingand using flow networks requires energy investment; an organism’s optimal networkwill reflect tradeoffs between mixing effectiveness and the cost of the network. Thecost has two components: each edge in the network must be built and maintained,and the fluid transported within the network dissipates energy due to friction. Thetwo cost components play slightly different roles in our optimization, we incorporatethe first cost via a holonomic constraint, and the second via a penalty.Murray [19] posited that the cost of a maintaining a vessel is either proportionalto its volume or surface area. Since all of the vessels in our networks have the samelength, and the Hagen-Poiseuille law states that conductance is proportional to thefourth power of the radius, these scenarios correspond respectively to the cost of anedge being proportional to κ / ij or to κ / ij . We constrain the cost the total material C. MENTUS AND M. ROPER available to the network, requiring (cid:80) κ γij = C where 0 < γ < { Θ( κ ij ) ≡ − H ( κ ij ) + cD ( κ ij ) : κ ij ≥ ∀ ( i, j ) ∈ E , (cid:88) κ γij = C } . We refer to Θ as the mixing-dissipation cost (abbreviated:
CMD ). Since theset of allowed conductances is compact, we know that the minimizer exists. Theconstant c represents the relative priority to the network of minimizing dissipationover maximizing mixing. Along with γ it is one of the main parameters that weexplore in this work. We stretch our notation to refer to the minimum value of Θ fora given value of c as Θ( c ). Lemma
The minimal mixing-dissipation cost is a concave function of c . Thatis, for c , c ≥ : Θ( tc + (1 − t ) c ) ≥ t Θ( c ) + (1 − t )Θ( c ) for all ≤ t ≤ .Proof. Set c = tc + (1 − t ) c , and let κ i be a minimizer of − H + c i D ; i = 1 , , − tH ( κ ) + tc D ( κ ) − (1 − t ) H ( κ ) + (1 − t ) c D ( κ ) ≤ − tH ( κ ) + tc D ( κ ) − (1 − t ) H ( κ )+(1 − t ) c D ( κ )= − H ( κ )+ ( tc + (1 − t ) c ) D ( κ ) . (2.17)To finish formulating the optimization problem we restrict the set of networktopologies that we are searching over: Let G be an unweighted undirected networkwith nodes N and edges E . Choosing E allows us to constrain e.g. the maximumdegree of the nodes in our optimal network. Our optimal network is restricted tobe a subnetwork of G : we refer to G as the ambient network . For the purposesof this study we will assume that the network is planar (this assumption is almostcertainly true for slime mold networks, but is less valid in fungal networks, wherehyphae often crossover without connecting to each other). In this paper we restrict toregular triangular networks, in which all of the edges in the ambient network have thesame length. We do not think that our results are sensitive to the choice of (regular)ambient network: we have for example, reproduced all of the results discussed in thispaper with square grid ambient networks [18]. Solutions of our optimization problem depend upon the value of C = (cid:80) κ γij . However, as c is increased from 0 to ∞ , the same sequences of optimal networksare found, independent of C . For suppose that κ ij is a conductance network that solvesEq. 2.16 with (cid:80) κ γij = C . Then rescaling κ (cid:48) ij = (cid:16) C (cid:48) C (cid:17) /γ κ ij produces a new networkwith (cid:80) κ (cid:48) γij = C (cid:48) . The flows are unaltered in this network, so H ( κ (cid:48) ij ) = H ( κ ij ).However, dissipation is changed: D ( κ (cid:48) ij ) = (cid:0) CC (cid:48) (cid:1) /γ D . So the new network minimizes θ in Eq. 2.16 for the new dissipation weighting c (cid:48) = (cid:16) C (cid:48) C (cid:17) /γ c . Sweeping through allvalues c ≥
0, with (cid:80) κ γij = C , we generate in one-to-one correspondence all optimal PTIMAL MIXING IN TRANSPORT NETWORKS κ , and κ , . At some criticalperturbation of the conductances, the flows q , and q , are respectively reversed(reversed flows are circled in the network plots). The landscape of mixing entropy(middle) has ridge lines where the flow is reversed.networks for c (cid:48) ≥ (cid:80) κ γij = C (cid:48) . The choice of the value for C is thereforearbitrary.
3. Numerical optimization.
Chang and Roper [9] optimized networks for gen-eral differentiable functions using gradient descent. However, the mixing entropy thatwe seek to optimize here is non-differentiable wherever the flow through an edge isequal to 0. It is necessary that the optimization algorithm be able to navigate throughsuch points, because as conductances are updated to maximize mixing entropy it isoften necessary to reverse the direction of flow on one or more edges. In Fig 2, weshow a contour map of varying two edge conductances within a network (the originalnetwork is shown at top right, and networks with reversed flow bottom and left). Thelandscape is tiled into watersheds, each watershed represents the set of entropies thatcan be attained by varying the conductances without reversing the direction of flowon any edge. Between the watersheds are ridgelines, and crossing a ridgeline reversesthe direction of flow on one or more edges. Within a watershed, gradient descent canmove the network toward the local optimum for the watershed, but deteriorates if thelocal optimum is on the ridgeline.Even differentiable functions like dissipation produce landscapes with many lo-0
C. MENTUS AND M. ROPER cal optima; accordingly, in [13] simulated annealing and diffusive rearrangements ofconductances were implemented to prevent networks from being trapped at unfavor-able local optima. We follow a similar approach, by augmenting a gradient-basedsearch that is constrained to remain within a single watershed, with a perturbationmethod that is designed to provide the network with alternate routes to explore, andby intentional search over adjacent watersheds. We describe the three parts and theirintegration below.
We perform a gradient-based search, via MATLAB’s implementation of the interior-point method in fmincon. To ensure the search is not challenged to cross the ridgesthat divide different flow topologies, we enforce the sign of flow in each edge viaa set of non-linear constraints on the conductances. Although fmincon is capableof calculating the derivative of Θ numerically within a watershed, we accelerate thealgorithm by computing the gradient analytically using Lagrange multipliers to encodeall of the relationships between conductance, flow, transition probabilities and mixingentropy:We rewrite the array κ ij as |E| -entry vector. The function Θ (from Eqn. 2.16)that we are seeking to optimize is built up from κ ij via a chain of dependencies(3.1) κ ij (cid:55)−→ p i (cid:90) = ⇒ f i (cid:90) = ⇒ T ij (cid:55)−→ P ij (cid:90) = ⇒ ˜ q ij (cid:55)−→ N j (cid:90) = ⇒ H. Where a single arrow (cid:55)−→ represents a function of the immediately preceding variableand (cid:90) = ⇒ represents a function of more than one of the variables to the left. All of therelationships between variables are described in Section 2. Although it is possible tocarry derivatives through this list of compositions, the overhead from isolating andusing several derivatives of arrayed functions with respect to arrayed variables, makesthe gradient computation forbiddingly slow [18]. Instead we follow a similar approachto [8] and use Lagrange multipliers to enforce all of the functional relationships thatare embodied in Eq. (3.1). Eq. (3.1) then becomes a road-map for the order in whichwe solve for each of the Lagrange multipliers in our system. The constrained versionof Eq. (2.16), omitting the dissipation, becomes:Θ = (cid:88) i ∈N f i (cid:88) j :˜ q ji > ˜ q ji N i log (cid:18) ˜ q ji N i (cid:19) − (cid:88) i ∈N α i N i − (cid:88) j :˜ q ji > ˜ q ji − (cid:88) i,j ∈N γ ij (˜ q ij − f i P ij ) − (cid:88) ij ∈N µ ij (cid:32) δ ij − (cid:32) P ij − (cid:88) l ∈N T il P lj (cid:33)(cid:33) − (cid:88) i (cid:88) j ∈ n ( i ) λ ij (cid:18) T ij − q ij q ij > f i (cid:19) (3.2) − (cid:88) i ∈N β i f i − (cid:88) j ∈ n ( i ) q ij q ij > + | Q i | Q i < − (cid:88) i ν i Q i − (cid:88) j ∈ n ( i ) κ ij ( p i − p j ) . For our gradient descent, we make use of the derivative: ∂ Θ ∂κ ab = λ ab q ab > ( p a − p b ) f a + λ ba q ba > ( p b − p a ) f b ( β a ( p a − p b ) q ab > + β b ( p b − p a ) q ba > )+ ( ν a − ν b )( p a − p b ) − c ( p a − p b ) . (3.3)In which we have made use of the derivatives compiled in Appendix A to calculatethe derivative of the dissipation. PTIMAL MIXING IN TRANSPORT NETWORKS log conductances , defined by: κ ij = exp(˜ κ ij ).Additionally we want to ensure that the total material investment in the networkremains constant; i.e. to ensure (cid:80) κ γij = C . In [8] this constraint was added viaan additional Lagrange multiplier, but this method guaranteed that the constraintis satisfied only at leading order in the step size. Hence, here we simply rescale theconductances: κ ij (cid:55)→ (cid:16) C (cid:80) ( i,j ) κ γij (cid:17) γ κ ij after each perturbation. Both transformationsneed to be considered when calculating the derivatives. For the rescaling we get: ∂∂κ ij (cid:18) C /γ κ ab (cid:80) κ γcd (cid:19) = C /γ ( (cid:80) κ γcd ) /γ (cid:32) δ ( ab ) , ( ij ) − κ ab κ γ − ij (cid:80) κ γcd (cid:33) (3.4)To turn derivatives with respect to κ ij into derivatives with respect to ˜ κ ij we pre-multiply them by ∂κ ab ∂ ˜ κ ij = δ ai δ bj κ ij . Similar to dissipation-minimizing networks [13] our optimization algorithm has manylocal optima in which source and sink are sparsely connected. To find the true globaloptimum, our algorithm includes a step for redistributing material within the net-work, in a way that presents the algorithm with a range of paths of different lengthsbetween source and sink. However, although [13] previously redistributed materialby diffusing it on the graph, we found this method tends to short circuit the net-work by introducing much shorter paths between source and sink. The appearance ofthese paths is catastrophic for the optimization algorithm, since they are attractinglocal optima but far from the global optima [18]. Since our algorithm does not sendconductances exactly to zero, we define a threshold conductance κ c , and say that anedge (as well as the vertices that it connects) is in the support of the network if itsconductance exceeds κ c . In practice we found that a value κ c = 2 × − worked forall of the simulations shown in this paper.We redistribute material using a network growth step , which adds spurs of materialfrom the network’s support. Our algorithm takes the form of a set of operators:Grow up-right ( κ ij ) where “up-right” in the subscript can be replaced with the “up-left”, “down-left” or “down-right” to denote the direction in which material is added.Growth in the up-right direction adds edges that link nodes in the support to nodesnot in the support that are upwards and right of them. Each step of the growthalgorithm concatenates growth in two non-parallel directions. In practice we did notfind it necessary to include right or left growth. We will describe the up-right growthstep: other growth steps can be derived from this step by symmetry.1. First locate up-right edges in the triangle grid for which the top-right node isoutside of the support of the network and the bottom-left node is inside thesupport.2. Add positive conductance to each of these edges to form a new network κ up-right (Fig. 3b). Each new edge is assigned conductance equal to the av-erage conductance of the edges from the support adjacent to its bottom-leftnode in the set of bottom-left nodes identified in 1.3. In κ up-right locate the nodes which are top right nodes of edges in the support.Call this set the top-right nodes (Fig. 3c).4. For every top-right node i , if i is the apex of a triangle whose base and up-right edge lie in the support of κ up-right , complete this triangle with an up-left2 C. MENTUS AND M. ROPER(a) (b) (c) (d) (e)
Fig. 3: Sequence of steps performed to construct κ up-right from an initial network κ shown in (a). (b) blue edges are added up-rightwards from a (bottom) node in thesupport of the network to a (top) node outside of the support of the network. (c)We identify top-right nodes (blue circles) whose bottom left edges have non-zeroconductance following the first step. (d) Add leftwards (orange) and down-rightwards(green) edges if they complete triangles with edges in the current support. (e) Ouroptimization scheme applies the up-right growth algorithm twice.edge whose conductance is the arithmetic mean of the other two edges. Forevery top-right node i that is the left vertex of an inverted triangle, whoseup-left and up-right edges lie in the support of κ up-right , complete the trianglewith a horizontal edge whose conductance is the arithmetic mean of the othertwo edges (Fig 3d). Part 1 of our optimization algorithm can reliably locate local optima while respectingthe directions of flow on every edge (i.e. the flow topology ). To find the true global op-timum we search systematically over adjacent topologies. To do this, we take one edgewithin the network, and find the smallest increase and decrease in the conductanceof the edge that will change the direction of flow in at least one other edge within thenetwork. To find the smallest change in conductance, we use the Sherman-Morrisonformula [25], which allows us to calculate an explicit expression for the conductancechange necessary to reverse the flow in any edge of the network (Eq. B.4). Givena causal edge ( a, b ), Eq. B.4 enables us to compute a set of perturbations t abuv to κ ab to reverse the flow on any edge ( u, v ). We filter these perturbations to keep onlyperturbations in which κ ab is not allowed to become too small (in practice a thresholdof 10 − gives good results), to prevent this part of the algorithm getting stuck en-gineering and then re-engineering flow reversals on edges that already have very lowconductance. This method was used to find the set of flow changes shown in Fig. 2. We initialize thealgorithm by assigning each edge within the ambient network an U (0 ,
1) conductance,and then scaling all conductances to ensure (cid:80) κ γij = C . A single step of the algorithmconsists of running all three of its parts sequentially. Part 1 locates a locally optimalnetwork that respects the flow directions given to it, while the random choice ofgrowth directions in Part 2 and of causal conductances in Part 3 stochastically altersthe topology of the network. We compare the local optima arrived at the end ofconsecutive Part 1’s; if the new local optimum has a lower value of Θ than the old,we keep it, otherwise we revert to the old optimum.Our descent step uses the MATLAB optimization function fmincon using theinterior-point algorithm with 1000 max iterations, with flow directions constrained onall edges with non-negligible conductance (see below), and with analytically computedderivative. Our growth step requires first picking a single direction in which to grow PTIMAL MIXING IN TRANSPORT NETWORKS − . When this count reaches 4 we terminate the algorithm. Otherwisewe allow the algorithm to run for 50 iterations. Usually the stopping criterion isreached in fewer than 15 steps. We tested that our algorithm reliably (i.e. in morethan half of runs) located the theoretically obtained optimal network when c = 0 . κ ij = exp(˜ κ ij ). In practice, the local optima located by ouralgorithm use only a subset of the edges in the ambient network. We disregard edgeswith small conductances (in practice any edges with conductance less than 2 × − ):specifically the directions of flow on these edges are not considered when constrainingflow directions in part 1 or when determining the perturbations that cause flows toswitch in part 3.After the algorithm terminates we perform a final filtering step to deal with thefact that our gradient search is somewhat slow at removing edges from the networkor redistributing material between high conductance edges. To filter, we set all edgeswith conductance ≤ − to 10 − , re-scale all edges so that the material cost of thenetwork stays the same and then run our gradient-search with 10000 max iterations.Most of our simulations involve sweeps over c − values (see Section 4), typicallyinvolving 100-200 replicate networks whose c values are close enough that we expectthem to be topologically equivalent. We can further boost coverage since any localoptimum, ˆ κ discovered by our algorithm at c = ˆ c can be compared with local optimafor different values of c by tracing the line: ˆ θ = − H (ˆ κ ) + cD (ˆ κ ). We form theenvelope of these straight lines (Fig. 6). At any value of c , we identify the networkthat produces the straight line on which ( c, Θ( c )) lies as the global optimal networkfor that value of c .
4. Results from numerical optimization.4.1. Optimal networks are paths for small values of γ . We first studiedthe effect of fixing the value of c and constructing optimal networks over a rangeof values for γ . We found that at each value of c the number of loops in the pathincreased with γ (cf. dissipation minimizing networks, which form loops only when γ > γ ( γ ≈ . − .
5) the globallyoptimal networks are simple loopless paths linking source to sink. c . We noticed that in Fig.4 changing c changes the number of edges in the network. To investigate the effect of c more systematically, we performed a numerical sweep of c values, holding γ = 0 . c increases, the globallyoptimal network systematically explored all path lengths from N − N −
1; (the shortest path linking source to sink).Fig. 6 shows the complete Θ( c ) trace, including for the networks included in Fig. 5.The numerically obtained Θ( c ) is piecewise linear, with slope discontinuities at each c -value where the length of the optimal network increases by one.
5. Optimal path networks.
Our numerical results from Section 4 highlightthree properties of optimal networks: that they are simple paths at small and mod-erate values of γ , that the path length decreases as c increases, and that for smallvalues of c the simple path visits every vertex in the network. In this section we will4 C. MENTUS AND M. ROPER
Fig. 4: Numerically computed optimal networks on a 5 × c (rows) and four different values of γ (columns). At sufficientlysmall values of γ (in particular for γ = 0 .
45 at each assayed value of c ), optimalnetworks are all paths from source to sink. Increasing γ progressively adds loops tothe network. Increasing c increases the number of nodes visited by the network.Fig. 5: Selected optimal networks for γ = 0 .
45, on a 5 × c (values given above each panel), are all paths with decreasing lengths.Every possible length of path between 24 and 9 edges is obtained. The networksshown are the magenta points in Figure 6.rigorously state and prove theorems justifying these properties.We will first prove separate results for mixing and for dissipation. We will showthat the optimal network for mixing (that is, without considering the cost of dissipa-tion) is a tour – a path that visits every node in the network from source to sink.Then given only the flows on a network, we bound its dissipation. Second, we will PTIMAL MIXING IN TRANSPORT NETWORKS × γ = 0 . K = 24. The c -domain is divided into subintervals [ . , c , ], [ c m +1 ,m , c m,m − ]for m = 9 , , . . . ,
24 and [ c , ,
7] ( c m,n is defined in Section 5), and 100 locallyoptimal networks are generated within each subinterval. Global optima are derivedis described in Section 3.4. Selected globally optimal networks (magenta points) areplotted in Fig. 5.consider optimization among path (i.e. loopless) networks, showing that as the costof dissipation is increased, the length of the optimal path network decreases mono-tonically in length in steps of 1, from the tour to a geodesic (shortest path). Finallywe show that among all networks, over many different values for the dissipation pen-alty factor c , the optimal network is a path for all sufficiently small γ . We start byintroducing a notation for paths of different lengths, assuming that conductances areuniform, i.e. the same on each edge within the path, which is favored for minimizingdissipation. Definition
Say that a path from source to sink has length m if it visitsexactly m nodes. We use the notation τ m to denote any uniform conductance pathof length m . Further, we call τ |N | , the path that visits every node in the ambientnetwork, a tour . Theorem
Suppose that q ij is a flow network with node-set N , and |N | = n .Then the maximum possible total mixing entropy is log( n !) , and this maximum isattained only for a path that visits all n nodes exactly once. The intuitive interpretation of this result is that all of the nodes in the network can beordered by their pressures, p i . Signals from node i can reach node j only if p i > p j . Anoptimal mixing configuration is one in which signals from node i reach all downstream j with probability 1, which requires that the downstream network is a path that visitseach downstream node in turn. Proof.
Let x = { x t } t ≥ be the random walk on the flow network defined in Section2. Let V k ⊂ N be the set of nodes v that receive signals from exactly k nodes: thatis, V k = { u ∈ N : { v : P vu > } = k } .For a subset of nodes S ⊂ N we say that x hits S , if for some t ∈ Z ≥ x t ∈ S . Let6 C. MENTUS AND M. ROPER k ≥ V k is non-empty. Let u (cid:54) = v ∈ V k . A signal x can not visit more thanone node in V k for, if x hits both v and v ∈ V k , and WLOG p v > p v , then for any u ∈ N with P uv >
0, we must also have P uv >
0. So { u : P uv > } (cid:40) { u : P uv > } ,which is impossible since v , v ∈ V k implies both of these sets contain k elements.Let v ∈ V k . Then H ( P v ) ≤ log( k ) because there are exactly k nodes u with˜ q uv >
0. This inequality, together with (cid:80) v ∈ V k f k ≤ H :(5.1) (cid:88) i ∈N f i H ( P i ) = n (cid:88) k =1 (cid:88) v ∈ V k f v H v ≤ n (cid:88) k =1 log( k ) = log( n !) . We show that the only network with n nodes attaining the maximal entropy is apath. For the path, labeling the nodes 1 , , . . . , n in the order in which they arevisited from source to sink we find for 1 ≤ i < j ≤ n , ˜ q ij = 1. Therefore theprobability distribution of signals P i is the uniform distribution on i atoms and hasentropy H i = log( i ). Hence, H = (cid:80) ni =1 f i H i = log( n !).Conversely, if q ij (cid:54) = τ n , then there must be at least one k ∈ { , , . . . n } such that V k = ∅ . In this case, H differs from log( n !) by at least log( k ). Corollary
Let q ij be a flow network on nodes N and ∅ (cid:54) = F ⊂ N with |F| = m . Then H F ≤ log( m !) . To deduce the corollary, we treat the probabilities P F i in the same fashion as wetreated the probabilities P i in the proof of Theorem 5.2. The dissipation, D , for a network is a function both of its conductances κ ij and its flows q ij . However, we can bound the dissipation based on the q ij , alone,given only the constraint that (cid:80) ij κ γij = C . Theorem
Murray’s law . Let q ij be a network of flows, then if (cid:80) ij κ γij = C , the smallest possible dissipation in the network is: (cid:32)(cid:80) ij q γγ +1 ij (cid:33)
1+ 1 γ C γ . This Theorem is equivalent to Murray’s law [8]: it is based on assigning each edgethe conductance that minimizes the overall network dissipation.
Proof.
Fixing flows, we minimize the total dissipation over conductances obeyingthe building constraint (cid:80) κ γij = C . That is we minimize the overall function:(5.2) D ( κ ij ) = (cid:88) ij q ij κ ij + λ (cid:88) ij κ γij − C where the Lagrange multiplier λ maximizes the dissipation and we restrict to edgeson which q ij (cid:54) = 0. The minimization of D is performed on a compact set ( κ ij ≥ (cid:80) ij κ γij = C ) so the minimum certainly exists. Since D → ∞ whenever κ ij = 0 sothe optimal value of D occurs at an interior point within this set. So at the minimumpoint:(5.3) 0 = ∂D∂κ ab = − q ab κ ab + λγκ γ − ab PTIMAL MIXING IN TRANSPORT NETWORKS κ ab ∝ q γ +1 ab (Murray’s law), and we find our constant ofproportionality by imposing the constraint (cid:80) ab κ γab = C :(5.4) κ ab = C /γ q γ +1 ab (cid:18)(cid:80) ij q γγ +1 ij (cid:19) /γ . Substituting for κ ab in the dissipation yields the required inequality. Our main results will concernnetworks that are close to paths; for example path networks that have low conductanceexcursions adjoined to some of the path nodes. How much do these additions affectthe network’s mixing entropy? Thinking more generally, we consider networks inwhich some edges are strong, and others are weak (we will define strong and weak)and bound the contribution of the weak nodes to the network entropy.
Definition
Let , , . . . , N be a labelling of the nodes in the network G indecreasing order of total flow f i (that is; f ≥ f ≥ · · · ≥ f N ). Select < δ < whichwill be referred to as the dominance factor . Let k = min { i : δf i > f i +1 } . Thenodes , , . . . , k are referred to as the strong nodes above dominance factor δ ,denoted F δ . Theorem
Let (cid:15) > and let q ij be a flow network on an ambient network G with nodes N . Then there exists δ > , depending only on G , such that | H − H F δ | < (cid:15) . In addition, δ may be chosen such that for each node i ∈ F δ thenodes adjacent to the largest magnitude in-flow at i and the largest magnitude out-flow at i are also strong nodes. That is, if u, v ∈ n ( i ) are such that q ui = max j ∈ n ( i ) q ji and q iv = max j ∈ n ( i ) q ij then u, v ∈ F δ .Proof. Let δ > F = F δ to bethe strong nodes in N over dominance factor δ . By the triangle inequality, we boundthe difference(5.5) | H − H F δ | ≤ (cid:88) i ∈F f i | H i − H F δ i | + (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:88) i (cid:54)∈F δ f i H i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . First we bound the first sum on the right-hand side, a sum over the absolute differencebetween the different mixing entropies. Let i ∈ F δ and j (cid:54)∈ F . Then(5.6) P i ( j ) = ˜ q ji (cid:80) k (cid:54)∈F δ ˜ q ki + (cid:80) k ∈F δ ˜ q ki < f i δf i < δ. P F δ i is obtained by omitting fewer than N = |N | states from P i , each with probabilityless than δ , and then renormalizing to give a new probability distribution. Sinceentropy is uniformly continuous on the simplex (cid:110)(cid:80) Ni =1 p i = 1 , p i ≥ (cid:111) , we can choose δ so that | H i − H F i | < (cid:15) N so (cid:80) i ∈F f i | H i − H F i | < (cid:15) .We now bound the magnitude of the second term on the right-hand side. H F δ i is an entropy of a random variable taking on less than N = |N | values. Therefore | H F i | < log N . The total flow through each node, f i ≤ δ for all i (cid:54)∈ F δ . Hence, wehave(5.7) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:88) i (cid:54)∈F δ f i H i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) < (cid:88) i (cid:54)∈F δ δ log N < N δ log N. C. MENTUS AND M. ROPER
And so we can choose δ so that the second term is bounded by (cid:15) . To complete theproof, note that the magnitudes of the largest in- and out-flows are ≥ f i deg i ≥ f i K where K is the largest degree of a node in G . Thus, so long as δ < K the nodes connectedto the largest in- and out-flows of degree i have total flows > K f i > δf i meaning theyare also strong nodes.We refer to the network formed by linking the nodes F δ up using the edges carryingthe maximum inflow and outflow at each node as the strong network, and re-usenotation by using F δ to represent the strong network. Θ over paths. Anticipating our proof in Section 6 thatoptimal networks are paths for sufficiently small γ , we start by restricting our op-timization to paths. When restricted to path networks Θ( c ) = min m ( − H ( τ m ) + cD ( τ m )), i.e. Θ is the lower envelope of straight lines. We first ask, if c is varied,does the sequence of Θ-minimizing paths always recapitulate Fig. 5; i.e. start witha tour (at vanishingly small c ) and end at large, finite c with a geodesic, with theintermediate states being paths whose length increases by 1, at finite and predictable c values. We can rationalize this sequence as follows: For a uniform conductance pathof length m each edge carries flow 1, and has conductance ( C/ ( m − /γ , so thetotal dissipation is D ( τ m ) = C − /γ ( m − /γ , which increases monotonically in m . Increasing c increases the relative strength of dissipation to mixing in Θ. Mixingfavors tours and, more generally, paths that visit as many nodes as possible, whiledissipation favors shorter paths. At each c , the optimal path length emerges from thebalance of these two competing effects.Two paths of different lengths: τ m and τ n , give rise to straight lines c (cid:55)→ − H ( τ m )+ cD ( τ m ) and c (cid:55)→ − H ( τ n ) + cD ( τ n ), with different slopes. Denote the point of inter-section between the lines by c m,n :(5.8) K − γ c m,n = log( n !) − log( m !)( n − γ − ( m − γ = log(Γ( n + 1)) − log(Γ( m + 1))( n − γ − ( m − γ . Lemma
The point of intersection c m,n is monotonic decreasing in both m and n , for m, n ≥ .Proof. Let x m = ( m − γ , then:(5.9) c m,n = log (cid:16) Γ (cid:16) x γγ +1 n (cid:17)(cid:17) − log (cid:16) Γ (cid:16) x γγ +1 m (cid:17)(cid:17) x n − x m . So c m,n is the slope of the secant from ( x m , f ( x m )) to ( x n , f ( x n )) where f ( x ) =log (cid:16) Γ (cid:16) x γγ +1 n (cid:17)(cid:17) . Since f is an increasing function we need to show it is concavein order to show that these secant slopes decrease as either x n or x m increases.Given f = log( u ( x )) where u ( x ) = Γ (cid:16) x γγ +1 (cid:17) we have that(5.10) f (cid:48)(cid:48) ( x ) = u (cid:48)(cid:48) ( x ) u ( x ) − ( u (cid:48) ( x )) ( u ( x )) . To show that f is concave, we then need that u (cid:48)(cid:48) u − ( u (cid:48) ) <
0. To compute thesederivatives recall ddx
Γ = ΓΨ where Ψ is the digamma function. The trigammafunction Ψ is defined to be Ψ (cid:48) , and so d dx Γ = (Ψ + Ψ )Γ. Pulling these results PTIMAL MIXING IN TRANSPORT NETWORKS u (cid:48)(cid:48) u − u (cid:48) = γ (cid:18) γ + 1 (cid:19) x − γ +1 (cid:16) γ Ψ − x − γγ +1 Ψ (cid:17) Γ Let z = x γγ +1 + 2. Then z is an increasing function of x and visa versa. Since all ofthe other multipicative terms in the expression u (cid:48)(cid:48) u − u (cid:48) are positive we only need toshow that γ Ψ − x − γγ +1 Ψ is negative for all z = x γγ +1 n + 2 = n − ≥
3. We have γ Ψ − x − γγ +1 Ψ ≤ Ψ − x − γγ +1 Ψ = Ψ ( z ) − z − ( z ) , ≤ z + 1 z + 1 z ( z − − log zz − , = (cid:18) − z − log z (cid:19) z − . Here we made use of the inequalities [14] Ψ ( z ) ≥ log z − z , and Ψ ( z ) ≤ z + z [14]for all z >
0. The last line is < z ≥
3, proving the lemma.
Theorem
When Θ is optimized among paths, on a triangular ambient gridwith n nodes, and c is increased from 0, the optimal path decreases in length by 1 atpredictable values of c : c n,n − , c n − ,n − , c n − ,n − . . . . That is: τ n for c < c n,n − , τ n − for c n,n − < c < c n − ,n − , τ n − for c n − ,n − < c < c n − ,n − and so on.Proof. The theorem follows directly from the monotonicity property proven inLemma 5.7. We have already shown that the tour τ n is the optimal path at c = 0.As c is increased, the line Θ( τ n ) intersects with all lines Θ( τ m ) for m < n , at c n,m .Because of monotonicity, the smallest of these points of intersection is c n,n − . Thus τ n is the optimal path for c < c n,n − . Θ( τ n − ) intersects with all lines Θ( τ m ) for m < n at c n − ,m . The first point of intersection is c n − ,n − . So τ n − is replaced by τ n − , and in turn by τ n − and so on.
6. All optimal networks are paths for sufficiently small γ . Now we provethat for c m +1 ,m < c < c m,m − , networks with a unit source-sink pair optimizing H + cD are approximately paths of length m in the limit as γ →
0. Our proof worksfor any subinterval of ( c m +1 ,m , c m,m − ). The parameter 0 < σ < c m +1 ,m , c m,m − ] covered by the subinterval. We can also represent thesubinterval by c m +1 ,m + ρ < c < c m − ,m − ρ , where ρ = (1 − σ )( c m − ,m − c m +1 ,m ). Theorem
Let G be an ambient network with a single unit-flow source andsink. Let m be a possible length of a path in G connecting the source to the sink. Let < σ < . We claim that there exists (cid:15) > and Γ > such that if δ > and F δ is the network of strong nodes such that | H F δ − H | < (cid:15) provided by by Theorem 5.6,and the material cost exponent γ < Γ , then for any c m +1 ,m + ρ < c < c m,m − − ρ thenetwork F δ is a path of length m .Proof. We can simplify the calculations in our proof by appealing to the resultfrom Section 2.7, that the sequence of optimizers is identical as c is varied for anyvalue of C . Accordingly we consider the special case C = m −
1. For this choice ofmaterial cost C , D ( τ n ) = ( m − − /γ ( n − /γ . Then the computation of Θ onpaths is is drastically simplified:(6.1) D ( τ n ) → n < mm − n = m ∞ if n > m . C. MENTUS AND M. ROPER − log( m − − log m ! − log( m + 1)! c m,m − c m,m − − ρ c m +1 ,m + ρ Θ Θ ( τ m +1 ) Θ ( τ m − ) Θ ( τ m ) c Fig. 7: Θ-against- c loci for the simple paths τ m − (dotted, black), τ m (dashed, black), τ m (solid, gray), in the limit as γ → τ m − , τ m , τ m +1 are shown in Fig. 7. Define ρ as above, for the fixedmaterial cost C = m −
1. Let Γ be such that we can choose (cid:15) with Θ( τ m )+ (cid:15) < Θ( τ m ± )for all c m +1 ,m + ρ < c < c m,m − − ρ and γ < Γ. Let q ij be a flow on G with thespecified source and sink. According to Theorem 5.6, we can define δ > F δ a network of strong nodes such that | H ( F δ ) − H ( q ij ) | < (cid:15) . F δ has noleaf nodes except, potentially, the source and the sink. Suppose F δ contains n nodes.Then by Euler’s topological formula, it must contain n − F edges, where F isthe number of faces in the network (given the constraints on F δ , F = 0 if and onlyif F δ is a path). Hence − H ( F δ ) ≥ log n !. Each edge must carry, at minimum,flow δ n − F . Accordingly, the dissipation in the network can be bounded below by D ∗ ( n ) = δ n − F ) (cid:16) n − Fm − (cid:17) /γ by Theorem 5.4. D ∗ ( n ) → ∞ as γ → n + F >m . Since networks with bounded dissipation exist, the optimal network must have n + F ≤ m . We can then compare the strong network with τ m − , and τ m . If n ≤ m − F δ ) ≥ − H ( F δ ) ≥ − log n ! ≥ − log( m − γ → Θ( τ m − ).This implies that Θ( q ij ) > Θ( τ m ), and so q ij is in fact sub-optimal. Therefore n = m .Since n + F ≤ m , F = 0, i.e. F δ is a path of length m .By choice of (cid:15) , F δ can approximate F arbitrarily closely in Θ. Since our convergenceresult can be made uniform in (cid:15) over the interval c m +1 ,m + ρ < c < c m,m − − ρ , itfollows that F converges to some path τ m , as γ →
0, except possibly at the points c m,m − . Our proof method does not provide us with a way to prove convergence atthese points, but based on our numerical simulations, we think it is likely that as γ → τ m and τ m − , with indistinguishable Θ values atthese crossover c -values.
7. Discussion.
We introduced and analyzed theoretically and by numerical sim-ulations two measures of mixing quality on networks, one measuring the diversity ofplaces within a network that may be reached by cues originating within that network(sender entropy), and the other reflecting the diversity of cues that are received ateach point within the network (receiver entropy). Happily, we were able to show thatsender entropy for a network is equivalent to the receiver entropy on the same net-work if flows are reversed, allowing us to focus on optimizing just one kind of entropy
PTIMAL MIXING IN TRANSPORT NETWORKS γ optimizingeither entropy will produce identical networks, at biologically relevant values of γ ,which type of mixing is most important to the network influences important featuresof its organization, such as the placement of loops.We proved that in the single source-single sink geometry, the optimal networksconverge to simple paths joining source to sink, with the path length determined by thedifferent priorities that the network gives to mixing (which favors long paths) and todissipation (which favors short paths). Intriguingly, our numerical simulations suggestthat there is a finite value of γ , which for our 5 × γ →
0, and a bifurcation that removes edges at a finite value of γ , so we mustbe cautious about interpreting the disappearance of loops at finite γ as evidence of aphase transition in the network, analogous to the disappearance of loops at γ = 1 fordissipation-minimizing networks [7].We have narrowly focused on the case where there is only one source and onesink within the network, allowing us to rigorously validate our numerical results.However, our numerical optimization method is equally applicable to networks withmultiple sources and sinks, and it is possible to explore for example the conditionsunder which a network that is transporting material from a pair of sources to a pairof sinks, will determine to maintain two separate flows, or bring these flows together[18]. In particular, determining whether real network forming organisms such as fungiand slime molds have mixing-optimizing networks will require that we properly modelthe locations of the sources and sinks that drive their flows.It is equally important when comparing optimal mixing networks with real biolog-ical networks to pin down the value of γ for these networks. On theoretical grounds,we expect real biological networks in which vessels are simple tubes to operate in arange of γ from γ = 1 / γ = 1 / γ is impossible, the branching hierarchies of xylemvessels in plants and some levels of cellular tubes in the slime mold Physarum poly-cephalum obey Murray’s law (Theorem 5.4) [17, 1]. Specifically if Q ∝ κ γ +12 , then (cid:80) κ γ +12 will be conserved between different levels of a hierarchical network. For sim-ple tubes, we may assume the Hagen-Poiseuille law (that the conductance of a vesseland its radius, a are related by κ ∝ a ), it follows that (cid:80) a γ +2 is conserved.In P. polycephalum , (cid:80) r α is conserved across different levels of the hierarchy, witha range of α values between 2 . − . . < α < . α valuesare similar for plants, but determining γ from vessel radii is complicated by the factthat the xylem vessels (like the cords of mycorrhizal fungal networks) are constitutedof many smaller tubes. Suppose these tubes have radius A but individually obey theHagen-Poiseuille law, then: κ ∝ a A . The total number of tubes at the same levelin the hierarchy is reported to increase by a factor F ≈ . a decreases by a factor of 2 − /α when one tube2 C. MENTUS AND M. ROPER splits into two then A ∝ a − α + α log F , and so α = ( γ + 1) (cid:0) − α + α log F (cid:1) . Wetherefore estimate that the plants in [17] have γ values ranging from 0.8 for Fraxinuspensylvanica ( F = 1 . α = 2 .
2) up to 1.4 for
Campsis radicans ( F = 1 . α = 3).So slime molds span the value of γ ≈ . γ (cid:38) .
5, qualitatively resemble thereal structures of migrating slime mold networks, in which densely interconnected‘fans’ of tubes are linked together by sparsely connected or even loopless networks(see e.g. Fig. 1 in [3]). In future work, we plan to analyze the optimal loopy networksfound by our algorithm to determine why optimal mixing requires fans (loopy regions)don’t appear throughout the network but are located only near the source, as wellas to understand how the tradeoffs between mixing and dissipation can be used topredict the size of the fan relative to the total length of the network.That real network forming organisms do not form tours may result from their γ values being too high. However, even at low values of γ networks face other tradeoffs,such as resistance to damage and or the need to minimize dissipation when the sourcesand sinks fluctuate in strength [13]. An additional property that must be highlightedfor organisms such as fungi and slime molds that have indeterminate growth is thatorganisms need to maintain their mixing while growth pushes sources and sinks everfurther apart. A tour can be extended indefinitely to include to an ever increasingnumber of nodes by extending it node by node. However, this model of growth ex-tends the network only by adding a single edge at a time, restricting growth to asingle growing tip and is an inefficient strategy for a fungus or other foraging organ-ism, that must compete for space and resources with other organisms. The type ofnetwork formed by a network is also shaped by the constraints on how it must formthis network. Fast foraging may favor growth in multiple directions simultaneously,facilitated by the organism having multiple growing tips. Thus optimization prin-ciples such as those developed in this paper only achieve true biological relevancewhen linked to a set of rules that a growing organism can follow to attain the optima.Such rules have been only recently elucidated for dissipation minimizing networks (seee.g. [12]), leaving unmet the challenge of constructing rules to achieve more complexobjectives, including mixing. Acknowledgments.
We thank Karen Alim, Eleni Katifori and Sebastien Rochfor many useful discussions at a sequence of Square Meetings hosted by the AmericanInstitute for Mathematics, where the idea for this project was developed.
Appendix A. Computation of derivatives of Θ . To differentiate Θ we compute the all of intermediate variables appearing inEq.(3.1): i.e. p i , q ij , f i , T ij , P ij , ˜ q ij and N i . The pressures p i are first obtained bysolving Eqn. 2.2, using the Matlab function mldivide. We then solve a chain ofequations to obtain the Lagrange multipliers: ∂ Θ ∂Ni =0 −−−−→ α ∂ Θ ∂ ˜ qab =0 −−−−−→ γ ∂ Θ ∂Pab =0 −−−−−→ µ ∂ Θ ∂Tab =0 −−−−−→ λ ∂ Θ ∂fa =0 −−−−→ β ∂ Θ ∂pa =0 −−−−→ ν . (A.1)First:(A.2) ∂ Θ ∂N a = (cid:88) i (cid:18) − ˜ q ia N a log (cid:18) ˜ q ia N a (cid:19) − ˜ q ia N a (cid:19) − α a = 0 . PTIMAL MIXING IN TRANSPORT NETWORKS ∂ Θ ∂ ˜ q ab = f b N b log (cid:18) ˜ q ab N b (cid:19) + f b N b + α b − γ ab = 0Third:(A.4) ∂ Θ ∂P ab = µ ab − (cid:88) i µ ib T ia + γ ab f a = 0so:(A.5) γ ab = (cid:32)(cid:88) i µ ib T ia − µ ab (cid:33) /f a Fourth:(A.6) ∂ Θ ∂T ab = − (cid:88) j ∈N µ aj P bj − λ ab = 0Fifth:(A.7) ∂ Θ ∂f a = (cid:88) j :˜ q ja > ˜ q ja N a log (cid:18) ˜ q ja N a (cid:19) + (cid:88) j ∈N γ aj P aj − (cid:88) j ∈ n ( i ) λ aj q aj q aj > f a − β a = 0 . Sixth, to calculate ∂ Θ ∂p a we make use of the results ∂∂p a q ai = κ ai and ∂∂p a q ia = − κ ia .Thus: ∂ Θ ∂p a = (cid:88) j κ aj ( ν a − ν j ) + (cid:88) i ∈ n ( a ) ( β a κ ai q ai > − β i κ ai q ia > )+ (cid:88) i ∈ n ( a ) (cid:18) λ ai κ ai q ai > f a − λ ia κ ai q ia > f i (cid:19) = 0 . (A.8)Thus solving for the Lagrange multipliers ν i requires solving a Poisson equation onthe network similar to Eqn. 2.2. Appendix B. Finding adjacent flow topologies.
We assume that the network of non-zero conductances has a single connected com-ponent, because although very small conductances are treated as negligible throughoutour algorithm, they remain large enough to keep the Laplacian rank complete. Wetake the inverse of the version of the Laplacian defined in Section 2.1, ˜∆ κ for theinitial network. We compute the directions of flow on each edge within the network(edges with low flows are ignored). The set of networks with the same directions offlow constitutes one of the watersheds shown in Fig 2. We systematically vary oneconductance κ ab within the network to find an adjacent watershed – i.e. a flow net-work in which some subset of the non-negligible flows have been reversed. We findthe threshold values for κ ab at which one or more flow directions are reversed, byappealing to the Sherman-Morrison formula [25] (we thank Eleni Katifori for bringingthe S.M. formula to our attention). Specifically, if the conductance in edge ( a, b ) isincreased to κ ab + t , then the Laplacian for the new network becomes(B.1) ˜∆ ˜ κ ij = ˜∆ κ ij + t ( e a − e b )( e a − e b ) T . C. MENTUS AND M. ROPER
Then the Sherman-Morrison formula yields˜∆ − κ ij = (cid:16) ˜∆ κ ij + t ( e a − e b )( e a − e b ) T (cid:17) − = ˜∆ − κ ij − ˜∆ − κ ij t ( e a − e b )( e a − e b ) T ˜∆ − κ ij e a − e b ) T ˜∆ − κ ij t ( e a − e b ) . (B.2)Given another edge ( u, v ), We wish to find a perturbation to κ ab such that theflow along ( u, v ) is reversed. Let R i be the i th row of ˜∆ − κ ij , and d ij be the i, j entryof ˜∆ κ ij . Then the pressure drop is given by:(B.3) ˜ p u − ˜ p v = ( R u − R v ) Q − t ( d au − d av − d bu + d bv ) ( R a − R b ) Q t ( d aa − d ab − d ba + d bb ) . Therefore the pressure drop is a monotonic function of t so the the zero of this equationis where the pressure reverses. Setting the left side to 0 we get the value t = t abuv atwhich flow reversal occurs:(B.4) t abuv p u − p v ( d au − d av − d bu + d bv ) ( p a − p b ) − ( d aa − d ab − d ba + d bb ) ( p u − p v ) . REFERENCES[1]
D. Akita, I. Kunita, M. D. Fricker, S. Kuroda, K. Sato, and T. Nakagaki , Experimentalmodels for murray’s law , Journal of Physics D: Applied Physics, 50 (2016), p. 024001.[2]
D. Akita, I. Kunita, M. D. Fricker, S. Kuroda, K. Sato, and T. Nakagaki , Experimentalmodels for murray’s law , Journal of Physics D: Applied Physics, 50 (2017), p. 024001.[3]
K. Alim , Fluid flows shaping organism morphology , Philosophical Transactions of the RoyalSociety B: Biological Sciences, 373 (2018), p. 20170112.[4]
K. Alim, N. Andrew, and A. Pringle , Physarum , Current Biology, 23 (2013), pp. R1082–R1083.[5]
K. Alim, N. Andrew, A. Pringle, and M. P. Brenner , Mechanism of signal propagationin physarum polycephalum , Proceedings of the National Academy of Sciences, 114 (2017),pp. 5136–5141.[6]
D. P. Bebber, J. Hynes, P. R. Darrah, L. Boddy, and M. D. Fricker , Biological solutionsto transport network design , Proceedings of the Royal Society B: Biological Sciences, 274(2007), pp. 2307–2315.[7]
S. Bohn and M. O. Magnasco , Structure, scaling, and phase transition in the optimal trans-port network , Physical review letters, 98 (2007), p. 088702.[8]
S.-S. Chang and M. Roper , Minimal transport networks with general boundary conditions ,SIAM Journal on Applied Mathematics, 78 (2018), pp. 1511–1535.[9]
S.-S. Chang and M. Roper , Microvascular networks with uniform flow , Journal of theoreticalbiology, 462 (2019), pp. 48–64.[10]
F. Corson , Fluctuations and redundancy in optimal transport networks , Physical Review Let-ters, 104 (2010), p. 048703.[11]
M. Durand , Structure of optimal transport networks subject to a global constraint , PhysicalReview Letters, 98 (2007), p. 088701.[12]
D. Hu and D. Cai , Adaptation and optimization of biological transport networks , Physicalreview letters, 111 (2013), p. 138701.[13]
E. Katifori, G. J. Sz¨oll˝osi, and M. O. Magnasco , Damage and fluctuations induce loopsin optimal transport networks , Physical Review Letters, 104 (2010), p. 048704.[14]
A. Laforgia and P. Natalini , Exponential, gamma and polygamma functions: Simple proofsof classical and new inequalities , Journal of Mathematical Analysis and Applications, 407(2013), pp. 495–504.[15]
R. R. Lew , How does a hypha grow? the biophysics of pressurized growth in fungi , NatureReviews Microbiology, 9 (2011), p. 509.[16]
S. Marbach, K. Alim, N. Andrew, A. Pringle, and M. P. Brenner , Pruning to increasetaylor dispersion in physarum polycephalum networks , Physical review letters, 117 (2016),p. 178103.PTIMAL MIXING IN TRANSPORT NETWORKS [17] K. A. McCulloh, J. S. Sperry, and F. R. Adler , Water transport in plants obeys murray’slaw , Nature, 421 (2003), pp. 939–942.[18]
C. Mentus , Information Theoretic and Statistical Models for Spatial Transportation Networks:Total Mixing Entropy on Optimal Fluid Flow Networks and Time Dependent StochasticBlock Models , PhD thesis, UCLA, 2019.[19]
C. D. Murray , The physiological principle of minimum work applied to the angle of branchingof arteries , The Journal of general physiology, 9 (1926), p. 835.[20]
D. F. Plaza, S. S. Schmieder, A. Lipzen, E. Lindquist, and M. K¨unzler , Identification ofa novel nematotoxic protein by challenging the model mushroom coprinopsis cinerea witha fungivorous nematode , G3: Genes, Genomes, Genetics, 6 (2016), pp. 87–98.[21]
H. Ronellenfitsch and E. Katifori , Global optimization, local adaptation, and the role ofgrowth in distribution networks , Physical review letters, 117 (2016), p. 138301.[22]
M. Roper, C. Ellison, J. W. Taylor, and N. L. Glass , Nuclear and genome dynamics inmultinucleate ascomycete fungi , Current biology, 21 (2011), pp. R786–R793.[23]
M. Roper, A. Simonin, P. C. Hickey, A. Leeder, and N. L. Glass , Nuclear dynamics in afungal chimera , Proceedings of the National Academy of Sciences, 110 (2013), pp. 12875–12880.[24]
C. E. Shannon , A mathematical theory of communication , Bell system technical journal, 27(1948), pp. 379–423.[25]
J. Sherman and W. J. Morrison , Adjustment of an inverse matrix corresponding to a changein one element of a given matrix , The Annals of Mathematical Statistics, 21 (1950),pp. 124–127.[26]
T. Tanyimboh and A. Templeman , Calculating maximum entropy flows in networks , Journalof the Operational Research Society, 44 (1993), pp. 383–396.[27]
A. Tero, S. Takagi, T. Saigusa, K. Ito, D. P. Bebber, M. D. Fricker, K. Yumiki,R. Kobayashi, and T. Nakagaki , Rules for biologically inspired adaptive network de-sign , Science, 327 (2010), pp. 439–442.[28]