A yield-cost tradeoff governs Escherichia coli's decision between fermentation and respiration in carbon-limited growth

Abstract

Many microbial systems are known to actively reshape their proteomes in response to changes in growth conditions induced, e.g., by nutritional stress or antibiotics. Part of the re-allocation accounts for the fact that, as the growth rate is limited by targeting specific metabolic activities, cells simply respond by fine-tuning their proteome to invest more resources into the limiting activity (i.e. by synthesizing more proteins devoted to it). However, this is often accompanied by an overall re-organization of metabolism, aimed at improving the growth yield under limitation by re-routing resources through different pathways. While both effects impact proteome composition, the latter underlies a more complex systemic response to stress. By focusing on E. coli's 'acetate switch', we use mathematical modeling and a re-analysis of empirical data to show that the transition from a predominantly fermentative to a predominantly respiratory metabolism in carbon-limited growth results from the trade-off between maximizing the growth yield and minimizing its costs in terms of the required proteome share. In particular, E. coli's metabolic phenotypes appear to be Pareto-optimal for these objective functions over a broad range of dilutions.



Matteo Mori (1), Enzo Marinari (2,3,∗), and Andrea De Martino (4,5,∗)

(1) Department of Physics, University of California, San Diego, CA, USA
(2) Dipartimento di Fisica, Sapienza Università di Roma, Rome, Italy
(3) INFN, Sezione di Roma 1, Rome, Italy
(4) Soft & Living Matter Lab, Institute of Nanotechnology (CNR-NANOTEC), Consiglio Nazionale delle Ricerche, Rome, Italy
(5) Human Genetics Foundation, Turin, Italy


The physiology of cell growth can nowadays be experimentally probed in exponentially growing bacteria both in bulk (see e.g. the bacterial growth laws detailed in [1–3]) and at single-cell resolution [4–7]. Refining the picture developed since the 1950s [8, 9], recent studies have shown that changes in growth conditions are accompanied by a massive re-organization of the cellular proteome, whereby resources are re-distributed among protein classes (e.g. transporters, metabolic enzymes, ribosome-affiliated proteins, etc.) so as to achieve optimal growth performance [10]. This in turn underlies significant modifications in cellular energetics to cope with the increasing metabolic burden of fast growth [11–13].

E. coli's 'acetate switch' is a major manifestation of the complex interplay between metabolism and gene expression. Slowly growing E. coli cells tend to operate close to the theoretical limit of maximum biomass yield [14, 15]. Fast-growing cells, instead, typically show lower yields, together with the excretion of carbon equivalents such as acetate [13]. One can argue that, in the latter regime, cells optimize enzyme usage, i.e. they minimize the protein costs associated with growth, while at slow growth they try to use nutrients as efficiently as possible [13]. As one crosses over from one regime to the other, E. coli's growth physiology appears to be determined to a significant extent by the trade-off between growth and its biosynthetic costs. Interestingly, a similar overflow scenario appears in other cell types in proliferating regimes (see e.g. the Crabtree effect in yeast [16, 17] or the Warburg effect in cancer cells [18–20]).

Several phenomenological models have tackled the issue of how metabolism and gene expression coordinate to optimize growth in bacteria [2, 21–23], while mechanistic genome-scale models can provide a more detailed picture of the cross-overs in metabolic strategies that occur as the growth rate is tuned [24–26]. Here we combine in silico genome-scale modeling with experimental data analysis to obtain a quantitative characterization of the trade-off between growth and its metabolic costs in E. coli. Specifically, we show that, for E. coli glucose-limited growth, the growth yield (i.e. the growth rate per unit of glucose taken up) is subject to a trade-off with the proteome fraction allocated to metabolic enzymes. At fast (resp. slow) growth, the latter (resp. the former) is optimized, and one crosses over from one scenario to the other as glucose availability is limited. Focusing on energy production and carbon intake as the central limiting factors of growth, we derive an explicit expression linking the biosynthetic costs of growth to the growth rate and evaluate it within a genome-scale model of E. coli's metabolism where the costs associated with different strategies for ATP synthesis can be directly assessed. The ensuing trade-off is described in terms of a Pareto front in a two-objective function landscape. Remarkably, E. coli's metabolic phenotypes are found to be Pareto-optimal over a broad range of growth rates, a picture that we validate through an analysis of proteomic data. This study therefore provides a quantitative characterization of the multi-dimensional optimality of living cells [12, 27] that directly addresses the crosstalk between growth physiology and gene expression.

∗ Co-last authors.

RESULTS

General view of the optimal proteome allocation problem

Because growth is severely affected by the synthesis of inefficient proteins [1], optimizing proteome composition is a major fitness strategy for exponentially growing bacteria in any given condition. In E. coli, for instance, a substantial re-shaping of the proteome takes place in carbon-limited growth, with ribosome-affiliated proteins taking up an increasing fraction of the proteome as the bulk growth rate $\mu$ increases, at the expense of catabolic, motility and biosynthetic proteins [1, 10, 28]. Similar changes are observed in cells subject to other modes of growth limitation, like nitrogen starvation or translational inhibition [10].

arXiv preprint [q-bio.MN].

In general terms, the problem of optimal proteome allocation can be posed as follows. Consider a generic cellular activity $L$ described by a rate $v_L$ that is subject to limitation and such that $\mu$ is proportional to $v_L$ via a 'yield' $Y$ representing the growth rate per unit of $v_L$, so that $\mu = Y v_L$. For instance, $v_L$ may be the rate at which a nutrient is imported from the growth medium and metabolized, which is limited e.g. by nutrient availability; or the rate of an intracellular flux reduced by specific stresses (e.g. high levels of toxic metabolites or antibiotics). In this scenario, as the stress is applied, $v_L$ decreases and $\mu$ is proportionally reduced. Generically, more $L$-devoted proteins will be needed to sustain a given rate $v_L$ under stress. We denote the proteome share allocated to $L$-devoted proteins by $\phi_L$ (with the rest of the proteins adding up to a fraction $\phi_{NL} = 1 - \phi_L$ of the total), and define the "proteome cost" $w_L$ of sustaining rate $v_L$ via $\phi_L = w_L v_L$. In these terms, the growth rate $\mu$ is proportional to the growth yield $Y$ and to $\phi_L$, and inversely proportional to $w_L$, i.e.

$$\mu = \frac{\phi_L \, Y}{w_L} . \tag{1}$$

This expression shows that cells can counteract an increase of $w_L$, i.e. a stress affecting $L$, in two ways (see Fig. 1a).

1. The first is by increasing $\phi_L$. This strategy underlies e.g. an upregulated synthesis of transporters and catabolic proteins in response to a nutrient shortage, or an increase of the ribosomal proteome fraction in response to antibiotics, as seen for instance in [10].
2. In addition, they can try to increase the growth yield $Y$ (or, equivalently, decrease the specific flux through the limited process, i.e. $q \equiv Y^{-1} = v_L/\mu$). The growth yield is however a systemic property that depends on the whole set of metabolic processes. Achieving a more efficient conversion of $v_L$ to $\mu$ therefore requires a re-organization of the entire non-limiting sector that occupies a fraction $\phi_{NL}$ of the proteome.

E. coli's 'acetate switch' [13, 29], whereby bacteria cross over from a predominantly fermentative to a predominantly respiratory metabolism upon carbon limitation, is an example of the latter strategy. Indeed respiration, while more costly in terms of enzymes than fermentation, has a larger ATP yield (ca. 26 mol ATP/mol glc versus ca. 12 mol ATP/mol glc [13]).

Cells may employ combinations of the above strategies: for instance, E. coli in glucose limitation tends both to increase the fraction of proteins devoted to glucose scavenging and import, and to switch to the more efficient respiratory pathways. However, the metabolic re-wiring required to increase $Y$ necessarily implies the coordinated modulation of expression levels across multiple metabolic pathways. In the following we aim to characterize more precisely the trade-off that underlies such a re-organization.

Proteome sectors in E. coli

A consistent body of experimental work has shown that, in exponential growth, E. coli's proteome can be partitioned into "sectors" whose relative weights adjust with the growth conditions [1, 3, 10]. At the simplest level, a four-way partition can be considered, in which three sectors [ribosome-affiliated proteins ($R$), metabolic enzymes ($E$) and proteins involved in the uptake system of the limiting nutrient ($C$)] respond to the growth rate $\mu$, while a fourth sector (a core $Q$ formed by housekeeping proteins) is $\mu$-independent. Normalization of proteome mass fractions imposes

$$\phi_C(\mu) + \phi_R(\mu) + \phi_E(\mu) = \underbrace{1 - \phi_Q}_{\mu\text{-independent}} . \tag{2}$$

Based on the bacterial 'growth laws' characterized in [1, 10], each $\mu$-dependent term in (2) can take the form $\phi_X(\mu) = \phi_{X,0} + \Delta\phi_X(\mu)$, with $\phi_{X,0}$ an offset value and $X \in \{R, E, C\}$. In turn, metabolic fluxes can be seen as the brokers of proteome re-shaping, assuming they are proportional to enzyme levels [1, 3, 26]. In particular, for carbon-limited growth (2) can be re-cast as (see Fig. 1b, [3, 26])

$$\underbrace{w_C v_C}_{\Delta\phi_C} + \underbrace{\sum_{i \in E} w_i |v_i|}_{\Delta\phi_E} + \underbrace{w_R \mu}_{\Delta\phi_R} = \underbrace{\phi_{\max}}_{\mu\text{-independent}} , \tag{3}$$

where $v_C$ is the rate of carbon intake, $v_i$ is the flux of reaction $i$, the sum runs over enzyme-catalyzed reactions and $\phi_{\max}$ is a constant that includes all $\mu$-independent terms (equal to about 0.48, or 48%, in E. coli). The three terms on the left-hand side of (3) give explicit representations of the $\mu$-dependent part of the proteome. The first term corresponds to the $\mu$-dependent proteome fraction to be allocated to the $C$-sector, with $w_C$ the cost of sustaining a carbon intake flux $v_C$ (i.e. the proteome share to be allocated to $C$ per unit of carbon influx). As detailed in [26], $w_C$ reflects the amount of carbon available in the extracellular medium, with large values (high import costs) associated with low carbon levels. The term $w_R \mu$ describes the empirically observed linear increase of $\phi_R$ with the growth rate [1, 3, 10], the coefficient $w_R$ corresponding to the proteome fraction to be allocated to the $R$-sector per unit of $\mu$ (in short, the "cost" of $R$). In E. coli, $w_R$ is determined in a robust way by regulation of ribosome expression via ppGpp [30], which sets its value to the inverse translational capacity $w_R \simeq [\ldots]$ h [26]. The remaining term, $\sum_{i \in E} w_i |v_i|$, represents instead the $\mu$-dependent part of the $E$-sector. The coefficients $w_i$ quantify the cost of each reaction $i \in E$ in terms of the proteome fraction to be allocated to its enzyme per unit of net flux. For the sake of simplicity, we assume here that $w_i$ is the same for each $i$ (but see [26] for a discussion of alternative choices).

Note that (2) and (3) have the form $w_L v_L + \phi_{NL} = 1$ upon identifying $w_L v_L$ with $\Delta\phi_C \equiv w_C v_C$ and including the $\mu$-dependent $E$- and $R$-sectors into $\phi_{NL}$. Consistently with the general problem of protein allocation, one expects that $\phi_{NL}$, and hence $\Delta\phi_E$, might re-shape to counteract nutrient limitation. Crucially, though, $\Delta\phi_E$ depends on intracellular fluxes and takes on different values in different metabolic states, so that, in principle, any re-shaping of the non-limiting sector will be tied to a change in the overall organization of metabolic activity. Therefore, nutrient stress (i.e. an increase of the nutrient import cost $w_C$) will be mediated by metabolic fluxes into a re-organization of the cellular proteome that in turn affects $\mu$ (see Fig. 1c). Optimal growth at each nutrient level (i.e. at each $w_C$) results from the crosstalk between proteome allocation and metabolism.

[Figure 1. Panel (a): limiting the growth rate by targeting a specific cellular activity $L$ (with $\phi_L = w_L v_L$, $\phi_{NL} = 1 - \phi_L$), growth limitation and proteome reallocation. Panel (b): proteome sectors and the protein allocation constraint $\Delta\phi_C + \Delta\phi_E + \Delta\phi_R = \text{const.}$, with $\Delta\phi_R = w_R \mu$, $\Delta\phi_E = \sum_i w_i |v_i|$, $\Delta\phi_C = w_C v_C$. Panel (c): nutrient level ($\approx w_C$), metabolic fluxes, proteome allocation, growth, optimal state (max $\mu$).]

FIG. 1. Schematic view of the proteome allocation problem. (a) Effect of growth limitation on proteome composition. An increase in the cost of the limiting sector can be counteracted by expanding the limiting sector (i.e. by increasing $\phi_L$) and/or by allocating part of the unnecessary non-limiting sector to re-organizing metabolism so as to increase the growth yield $Y = \mu/v_L$. (b) Proteome sectors considered for E. coli, namely ribosomal (R), enzymatic (E), catabolic (C), and core (Q). Following [26], we assume that all sectors but the core have $\mu$-dependent parts $\Delta\phi_j$. By normalization, their sums are constrained as in Eq. (3). (c) A change in the nutrient level is processed by the cell via metabolic fluxes, which affect growth. As the cell senses its new state, it re-allocates its proteome, thereby modulating metabolic fluxes and improving growth. The interplay between the various components leads to optimal phenotypes.

Optimal proteome fractions are constrained within tight bounds and interpolate between them as the growth rate changes

In general, any flux pattern $v = \{v_C, \{v_i\}_{i \in E}\}$ compatible with (3) and with minimal mass balance conditions is a viable non-equilibrium steady state for E. coli's metabolic network. Optimal flux patterns, and hence optimal values of $\Delta\phi_E = \sum_{i \in E} w_i |v_i|$ and $\Delta\phi_C = w_C v_C$ (which we denote respectively as $\Delta\phi_E^\star$ and $\Delta\phi_C^\star = w_C v_C^\star$), correspond to maximum $\mu$. In order to calculate such $\mu$-maximizing flux patterns $v^\star$, we resorted to genome-scale constraint-based modeling (see Methods). Focusing on E. coli in a glucose-limited minimal medium, we obtained the green curves shown in Fig. 2a–c, detailing how these quantities vary with the growth rate $\mu$. Quite surprisingly, both $\Delta\phi_E^\star$ (Fig. 2a) and $\Delta\phi_C^\star$ (Fig. 2b) appear to display an almost linear behaviour with $\mu$. Significant deviations occur only at fast growth, more evidently for $\Delta\phi_E^\star$ and for $v_C^\star$, see Fig. 2c.

This behaviour suggests that optimal proteome fractions are tightly constrained by either optimality or proteome allocation requirements. The allowed ranges of variability of $\Delta\phi_E$ and $\Delta\phi_C$ irrespective of (3) can be computed in silico by Linear Programming (LP, see Methods and Supporting Text for mathematical details). We call the limits of these ranges the $q$-bound (index $(q)$) and the $\varepsilon$-bound (index $(\varepsilon)$), respectively, such that

$$\Delta\phi_C^{(q)} \le \Delta\phi_C^\star \le \Delta\phi_C^{(\varepsilon)} , \tag{4}$$

$$\Delta\phi_E^{(\varepsilon)} \le \Delta\phi_E^\star \le \Delta\phi_E^{(q)} . \tag{5}$$

(Note that the $q$-bound is above the $\varepsilon$-bound for $\Delta\phi_E$, and vice versa for $\Delta\phi_C$, i.e. when one of the two quantities is minimal the other should stick to its largest allowed value.) The bounds corresponding to E. coli growth in glucose-limited minimal medium are reported in Fig. 2a–c as red and blue curves, respectively.

One clearly sees that $\Delta\phi_E^\star$ nearly saturates its maximum (given by the $q$-bound) for slow growth and gradually shifts to its minimum (given by the $\varepsilon$-bound) as $\mu$ increases. Vice versa, $\Delta\phi_C^\star$ interpolates between its minimum (given by the $q$-bound) and its maximum (given by the $\varepsilon$-bound) as growth gets faster. This is clearly visible in Fig. 2d, which clarifies how close $\Delta\phi_C$ and $\Delta\phi_E$ are to their respective bounds. For slow growth (below ca. 0.5/h), both the $E$- and $C$-sector saturate their $q$-bounds, and they shift gradually towards the $\varepsilon$-bound at higher $\mu$.

These results suggest that phenotypes minimizing nutrient import costs (in terms of proteome shares) are optimal at slow growth, whereas phenotypes minimizing enzyme costs are favored at fast growth. We stress that the $q$- and $\varepsilon$-bounds do not account for (3), implying that the optimal proteome allocation compatible with (3) interpolates between the physically allowed limits. In the broad cross-over region cells appear to balance the costs of importing glucose against those of metabolic processing. Optimality therefore appears to be generically characterized by a trade-off between different cost functions, with "extreme" conditions (e.g. fast versus slow growth) favoring the minimization of one over the other. In between, the trade-off is strongest and the cell has to fine-tune its metabolism so as to optimally balance the two objectives. Such a tradeoff can be shown to occur under very general assumptions, i.e. without the need to specify a detailed functional form for the protein sectors in terms of the fluxes (see Supporting Text). In other terms, as long as the growth rate is maximized, an increase of the growth yield has to be accompanied by an increase of the protein cost of metabolism, encoded in the non-limiting proteome sector (see Fig. 1a).

[Figure 2. Panels (a)–(d). Horizontal axes: growth rate $\mu$ (1/h). Vertical axes: $\Delta\phi_E$ (a), $\Delta\phi_C$ (b), carbon uptake in mmol/(g_DW h) (c), relative distance of protein fractions from bounds (d). Curves: $\varepsilon$-bound, $q$-bound, optimal.]

FIG. 2. Allowed ranges of variability and optimal values for proteome fractions. Feasibility regions for $\Delta\phi_E$ (a), $\Delta\phi_C$ (b) and carbon uptake $v_C$ (c), as functions of the growth rate $\mu$. For any $\mu$, the $q$- and the $\varepsilon$-bound are computed by minimizing either the carbon flux $v_C$ or the E-sector proteome share $\Delta\phi_E$. Optimal, $\mu$-maximizing proteome fractions, represented by green lines, interpolate between these bounds as $\mu$ changes. (d) Fractional distance of the optimal $C$- and $E$-sector fractions, $\Delta\phi_C^\star$ and $\Delta\phi_E^\star$, to the $q$- and $\varepsilon$-bounds. For slow growth, both $\Delta\phi_C^\star$ and $\Delta\phi_E^\star$ are close to the $q$-bound. As $\mu$ increases, they both shift toward the $\varepsilon$-bound. Note that $\Delta\phi_E^{(\varepsilon)} \le \Delta\phi_E^\star \le \Delta\phi_E^{(q)}$ while $\Delta\phi_C^{(q)} \le \Delta\phi_C^\star \le \Delta\phi_C^{(\varepsilon)}$.

Growth yield and enzyme costs are subject to a trade-off at optimal growth

The above scenario can be re-cast in more intuitive terms as follows. Let us assume that each metabolic flux $v_i$ scales proportionally to the growth rate $\mu$, i.e. that $v = \xi \cdot \mu$, with $\xi$ a representative flux vector identifying the "metabolic state" of the cell. (While this approximation is made for theoretical convenience here, it is empirically valid for moderate to fast growth rates [31].) We may now isolate $\mu$ from (3), obtaining

$$\mu(\xi) = \frac{\phi_{\max}}{w_R + \varepsilon(\xi) + w_C \, q(\xi)} , \tag{6}$$

where $\varepsilon \equiv \Delta\phi_E/\mu = \sum_i w_i |\xi_i|$ stands for the specific cost of the $E$-sector, whereas $q \equiv v_C/\mu = \xi_C$ denotes the specific carbon uptake (i.e. the amount of carbon taken up per unit of growth rate, or the inverse growth yield $Y^{-1}$). Because in E. coli $w_R$ and $\phi_{\max}$ are roughly constant, (6) relates directly the growth rate $\mu(\xi)$ of a given flux pattern $\xi$ to its overall "specific cost"

$$C(\xi) = \varepsilon(\xi) + w_C \, q(\xi) . \tag{7}$$

More specifically, one sees that maximizing $\mu$ is equivalent to minimizing $C$ across the different metabolic states $\xi$. Ideally, at optimality cells would minimize $C$ by minimizing $\varepsilon$ and $q$ independently. However, both quantities depend on the underlying metabolic state $\xi$, so that, as $w_C$ varies, optimal states must strike a compromise between growth yield and biosynthetic costs. In particular, for $w_C \to 0$ (i.e. in carbon-rich media), $\mu$ is maximized by minimizing $\varepsilon$, a scenario that corresponds to the $\varepsilon$-bound described in the previous section, which optimal metabolic flux patterns saturate at fast growth. On the other hand, when $w_C$ is very large (i.e. when extracellular glucose levels are low), $\mu$ is maximized by maximizing the growth yield $Y$ (or by minimizing $q$), leading to the $q$-bound that is saturated at slow growth at optimality. Intermediate glucose levels require instead a trade-off between these two objectives. Notice that this scenario is fully consistent with the fermentation-to-respiration switch that characterizes E. coli growth in carbon limitation and with the idea that its metabolism is multi-objective optimal [12].
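To make the role of $w_C$ in Eqs. (6) and (7) concrete, the following minimal Python sketch evaluates the specific cost and the growth rate for a handful of candidate metabolic states. It is only an illustration under stated assumptions: the three states, their $(q, \varepsilon)$ values and the value of $w_R$ are hypothetical placeholders, and only $\phi_{\max} \approx 0.48$ is taken from the text. The function names are likewise introduced here for illustration.

```python
# Illustrative evaluation of Eqs. (6)-(7). Apart from PHI_MAX, which is quoted
# in the text, every number here is a hypothetical placeholder.

PHI_MAX = 0.48  # mu-independent proteome budget (about 48% in E. coli)
W_R = 0.2       # hypothetical ribosomal cost w_R, in hours

# Hypothetical metabolic states xi, each summarized by
# (q, eps) = (specific carbon uptake, specific E-sector cost).
STATES = {
    "respiration-like": (1.0, 0.30),   # high yield, costly enzymes
    "mixed": (1.3, 0.24),
    "fermentation-like": (2.0, 0.15),  # low yield, cheap enzymes
}


def specific_cost(q, eps, w_c):
    """Eq. (7): C = eps + w_C * q."""
    return eps + w_c * q


def growth_rate(q, eps, w_c, w_r=W_R, phi_max=PHI_MAX):
    """Eq. (6): mu = phi_max / (w_R + eps + w_C * q)."""
    return phi_max / (w_r + specific_cost(q, eps, w_c))


def best_state(w_c):
    """Name of the state with the largest growth rate at import cost w_C."""
    return max(STATES, key=lambda name: growth_rate(*STATES[name], w_c))


if __name__ == "__main__":
    for w_c in (0.0, 0.16, 1.0):
        name = best_state(w_c)
        mu = growth_rate(*STATES[name], w_c)
        print(f"w_C = {w_c:4.2f}: best state = {name}, mu = {mu:.3f} /h")
```

With these placeholder numbers the fermentation-like state is selected when carbon import is free ($w_C = 0$), the respiration-like state when import is expensive ($w_C = 1$), and the mixed state only in a narrow window in between, which is the qualitative pattern described above.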

Quantifying the yield-cost trade-off: Pareto-optimality of E. coli's metabolism

In the present context, multi-objective optimality can be described quantitatively by a Pareto frontier that separates an accessible region of the $(q, \varepsilon)$ plane, such that each point lying therein corresponds to a viable metabolic phenotype, from an inaccessible one, with optimal states lying on the front (see Fig. 3a). Fig. 3b shows the Pareto frontier of optimal metabolic phenotypes we obtained for lactose-limited E. coli growth using constraint-based modeling (see Methods and Supporting Text). The growth rate $\mu$ increases as one moves along the Pareto front towards larger values of $q$ (i.e. lower yields). Sub-optimal states, generated by a randomized constraint-based model (see Methods), lie as expected in the feasible region. Both optimal and sub-optimal flux patterns show a robust switch to a low-yield phenotype at fast growth rates, characterized by acetate secretion and downregulated respiration [26]. Such solutions dominate at large values of the inverse yield $q$. Instead, fluxes through the TCA cycle and the glyoxylate shunt are mostly active in the high-yield flux patterns that mainly characterize slow growth.

We have validated E. coli's Pareto-optimality scenario against experimental results for lactose-limited E. coli growth by first computing $\varepsilon$ from mass spectrometry data, and then by assigning a growth yield to each state thus obtained by further constraining in silico models with the empirically found values of $\varepsilon$ (see Methods). This yields the curve in the $(q, \varepsilon)$ plane shown in Fig. 3c, which displays a remarkable qualitative agreement with our computation, confirming the cost-yield trade-off scenario. At the quantitative level, we note that the normalized protein cost $\varepsilon$ predicted in silico for high-yield states matches the observed enzyme cost at growth rate $\mu = 0.[\ldots]$/h. For faster rates, where acetate excretion sets in, our model underestimates the decrease in $\varepsilon$ by only about 10%. Likewise, at slow growth (below 0.6/h), our prediction appears to underestimate $\varepsilon$, most likely due to the decrease in enzyme efficiencies that is known to set in at low $\mu$ [25, 32–35] and which is not accounted for in the constraint-based framework we employed.

DISCUSSION

E. coli's acetate switch as a two-state system

In summary, our results indicate that, in the case of E. coli, the range of values of $w_C$ where the yield-cost trade-off is significant is relatively small. It is therefore reasonable to classify flux patterns on the Pareto frontier in two broad types (see Fig. 4a). The first one corresponds to a 'fermentation' phenotype with low yields ($q_{\rm fer} \gtrsim [\ldots]$ g_lac/g_DW) but low specific protein cost ($\varepsilon_{\rm fer} \simeq [\ldots]$ h), characterized by carbon overflow and robust flux through fermentative pathways. The second one has higher yield ($q_{\rm res} \simeq [\ldots]$ g_lac/g_DW) but higher costs ($\varepsilon_{\rm res} \gtrsim [\ldots]$ h), and uses respiration as its major energy-producing pathway. Generic flux patterns can be seen as linear combinations of these phenotypes with parameter $\alpha$ ($0 \le \alpha \le 1$), giving inverse yield $q(\alpha) = \alpha q_{\rm res} + (1 - \alpha) q_{\rm fer}$ and carrying a cost $\varepsilon(\alpha) = \alpha \varepsilon_{\rm res} + (1 - \alpha) \varepsilon_{\rm fer}$. Correspondingly, the growth rate $\mu$ can be computed as a function of $\alpha$ from (6), i.e.

$$\mu(\alpha) = \frac{\phi_{\max}}{w_R + \varepsilon(\alpha) + w_C \, q(\alpha)} . \tag{8}$$

One can see (see Supporting Text) that $\mu$ is maximized by the respiration phenotype with $\alpha = 1$ (resp. the fermentation phenotype with $\alpha = 0$) when $w_C$ is above (resp. below) the value

$$w_C^{\rm ac} \equiv \frac{\varepsilon_{\rm res} - \varepsilon_{\rm fer}}{q_{\rm fer} - q_{\rm res}} \simeq [\ldots] , \tag{9}$$

corresponding to a growth rate $\mu_{\rm ac} \simeq [\ldots]$/h, in quantitative agreement (within 10%) with the experimentally determined onset of the acetate switch [13].

Such a two-state scenario inspires a minimal coarse-grained mathematical model of E. coli's metabolism in which the cell can use either respiration or fermentation to produce energy subject to a global constraint on proteome composition (see Supporting Text). The model predicts that, at optimality, a transition between the fermentation phenotype (fast growth) and the respiration phenotype (slow growth) occurs when the cost of carbon intake matches the extra protein cost required by respiration, at which point one phenotype outperforms the other in terms of maximum achievable growth rate (see Fig. 4b). Straightforward mathematical analysis furthermore helps clarify how constraints associated with proteome costs differ from other types of mechanisms that have been suggested to drive the acetate switch in E. coli (see Supporting Text).
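The switching condition of Eq. (9) follows from the fact that the denominator of Eq. (8) is linear in $\alpha$, so that $\mu(\alpha)$ is monotonic and its maximum always sits at $\alpha = 0$ or $\alpha = 1$. The short Python sketch below checks this numerically. It is a hedged illustration rather than the model of the Supporting Text: the values of $q_{\rm res}$, $q_{\rm fer}$, $\varepsilon_{\rm res}$, $\varepsilon_{\rm fer}$ and $w_R$ are hypothetical placeholders chosen only to respect the orderings stated above ($q_{\rm fer} > q_{\rm res}$, $\varepsilon_{\rm res} > \varepsilon_{\rm fer}$), and the function names are introduced here for illustration.

```python
# Two-state sketch of Eqs. (8)-(9). All phenotype parameters and W_R are
# hypothetical placeholders; only PHI_MAX is quoted in the text.

PHI_MAX = 0.48                  # mu-independent proteome budget
W_R = 0.2                       # hypothetical ribosomal cost (h)
Q_RES, EPS_RES = 1.0, 0.30      # hypothetical respiration phenotype
Q_FER, EPS_FER = 2.0, 0.15      # hypothetical fermentation phenotype


def mixed_state(alpha):
    """Linear combination of the two phenotypes, 0 <= alpha <= 1."""
    q = alpha * Q_RES + (1.0 - alpha) * Q_FER
    eps = alpha * EPS_RES + (1.0 - alpha) * EPS_FER
    return q, eps


def growth_rate(alpha, w_c):
    """Eq. (8): mu(alpha) = phi_max / (w_R + eps(alpha) + w_C * q(alpha))."""
    q, eps = mixed_state(alpha)
    return PHI_MAX / (W_R + eps + w_c * q)


def switch_cost():
    """Eq. (9): import cost at which both phenotypes grow equally fast."""
    return (EPS_RES - EPS_FER) / (Q_FER - Q_RES)


def optimal_alpha(w_c):
    """mu(alpha) is monotonic in alpha, so the optimum is an endpoint."""
    return 1.0 if w_c > switch_cost() else 0.0


if __name__ == "__main__":
    w_ac = switch_cost()
    print(f"switch at w_C = {w_ac:.3f}")
    for w_c in (0.5 * w_ac, 2.0 * w_ac):
        a = optimal_alpha(w_c)
        print(f"w_C = {w_c:.3f}: alpha* = {a:.0f}, mu = {growth_rate(a, w_c):.3f} /h")
```

Exactly at the switching cost, the two phenotypes (and every mixture of them) give the same growth rate in this sketch, which is why the optimum jumps from one endpoint to the other as $w_C$ crosses $w_C^{\rm ac}$.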

Outlook

The Pareto scenario presented above allows one to describe, with quantitative accuracy, the complex cellular economics underlying E. coli growth in carbon-limited media in terms of a multi-objective optimization problem, and ultimately leads to a minimal, two-state model of E. coli's metabolism that includes its essential features. Many coarse-grained models of the switch between respiration and fermentation are, in fact, two-state models of the kind we have described [13, 21, 36]. The very recent model of Basan et al. [13], in particular, specifically addresses the impact of protein costs on the emergence of fermentation metabolism. The approach employed here differs in two respects. First, the yields and proteome cost parameters for respiration and fermentation used in [13] refer to the ATP yield (as opposed to the growth yield) and to specific "respiration" and "fermentation" proteomes. In the present model, both pathways are part of the same $E$-sector, and the focus is on a global re-allocation of the proteome rather than on up- or down-regulation of specific pathways. Secondly, and more importantly, the cost of carbon uptake (i.e. $w_C$) is implicitly assumed to be nil in [13]. When $w_C$ is set to zero, metabolism is completely determined by the normalization of proteome fractions and by the energy flux balance. While the switch to fermentation is still a consequence of proteome allocation, its physical origin is rather different in the two models. In [13], it is enforced by the energy demand. Under Pareto optimality, instead, it is a consequence of the tradeoff between the $C$- and $E$-sectors. This specific aspect makes it in principle possible to describe strains with different "acetate overflow lines" (e.g. mutants [37, 38] or "acetate feeding" strains obtained in evolution experiments [39, 40]), which correspond to feasible, albeit suboptimal, cellular states that would be harder to describe by the model of [13]. On the other hand, the latter characterizes, in a sense, an "optimal" strain. Future experiments may allow one to measure fitness advantages of different metabolic strategies in various environmental and ecological contexts, shedding further light on the evolution of the acetate switch.

[Figure 3. Panels (a)–(c). Horizontal axes: specific uptake $q$ (g_lac/g_DW). Vertical axes: proteome cost $\varepsilon$ (h). Panel (a) labels: Pareto frontier, suboptimal states, infeasible region, smallest proteome cost, smallest specific intake. Panel (c): experimental points labelled by growth rate $\mu$.]

FIG. 3. Trade-off between maximum yield and minimum enzyme cost in E. coli. (a) Multi-objective optimality and Pareto front. Two cost functions (specific carbon intake $q$ and specific proteome cost $\varepsilon$) are shown, together with the feasible (white) and infeasible (grey) regions, separated by the Pareto frontier. Optimal solutions lie on the latter. (b) In silico prediction for optimal E. coli growth on lactose-limited minimal medium. The red line corresponds to the computed Pareto front (see Methods), while individual points in the feasible region describe sub-optimal solutions. Blue (resp. red) markers represent solutions dominated by respiration (resp. fermentation), while purple markers denote mixtures. (c) E. coli states obtained by integrating mass spectrometry data for lactose-limited growth from [10] with in silico predictions qualitatively reproduce (with quantitative accuracy for the yield) the predicted Pareto front. The values of $\mu$ reported next to the experimental points represent the experimental growth rates.

[Figure 4. Panel (a): respiration phenotype, r (large growth yield, small $q$; large protein cost, large $\varepsilon$) and fermentation phenotype, f (small growth yield, large $q$; small protein cost, small $\varepsilon$), each sketched in terms of glycolysis, TCA cycle, ATP/ADP and growth. Panel (b): growth rate $\mu$ (1/h) versus $w_C$ (g h/mmol) for respiration and fermentation.]

FIG. 4. Phenomenological two-state view of E. coli carbon-limited growth. (a) Respiration and fermentation phenotypes as characterized by the multi-objective optimal states on the Pareto frontier of E. coli's metabolism. The respiration phenotype has a large yield (small $q$) and large specific protein costs, while the fermentation phenotype carries lower yields (higher $q$) and a smaller cost. (b) Growth rate ($\mu$) versus carbon-intake cost $w_C$ as obtained from the phenomenological two-state model discussed in the Supporting Text. For each $w_C$, the optimal phenotype is the one for which $\mu$ is largest. The switch from the fermentation to the respiration phenotype occurs when $w_C$ matches the extra protein cost required by respiration.

By slightly extending the model of [13], Vazquez and Oltvai [36] have recently linked overflow metabolism to a macromolecular crowding constraint, along the lines of [41, 42]. For E. coli, such an interpretation appears to be at odds with the empirical fact that the cell volume adjusts in response to changes in the macromolecular composition of the cell. In particular, the cell density was found to be roughly constant across several distinct conditions, including inhibition of protein synthesis, and only slightly larger in the case of protein over-expression [43, 44]. The fact that cell density is minimally perturbed by "inflating" or "deflating" cells via tuning of protein synthesis suggests a reduced role of macromolecular crowding in modulating such processes in E. coli. In addition, however, [36] points out that, at slow growth, an increase of the proteome share of proteins other than those associated with respiration and fermentation has to take place. Our results are in line with this scenario. In fact, catabolic proteins included in the $C$-sector are up-regulated at low $\mu$ (see Fig. 2b), in agreement with quantitative measurements [10]. It is indeed the relationship between the $C$- and $E$-sectors, the latter of which accounts for respiration and fermentation pathways, that we have focused on in this work.

It is known that many different organisms share E. coli's behaviour in terms e.g. of growth laws [2] and carbon overflow. Nevertheless, the picture derived here for E. coli is not universally valid across microbial species. For instance, recent studies of L. lactis, an industrial bacterium that displays carbon overflow (albeit between different types of fermentation pathways rather than between fermentation and respiration as in E. coli), suggest that protein costs are not a determining factor in its growth strategies [45]. Likewise, carbon overflow in S. cerevisiae appears to respond to the glucose intake flux rather than to the macroscopic growth rate [46]. More work is therefore required to clarify the extent to which the picture described here applies to other organisms.

METHODS

Metabolic network reconstruction

All computations were performed on E. coli's iJR904 GSM/GPR genome-scale metabolic model [47] using a glucose-limited or a lactose-limited minimal medium.

Computation of the $q$- and $\varepsilon$-bounds and of the optimal values of $\Delta\phi_C$ and $\Delta\phi_E$ via constraint-based modeling

Flux Balance Analysis (FBA [48]) approaches to metabolic network modeling search for optimal flux vectors $v = \{v_i\}$ within the space $\mathcal{F}$ defined by the mass-balance conditions $Sv = 0$, $S$ denoting the stoichiometric matrix, and by thermodynamic constraints imposing that $v_i \geq 0$ for irreversible reactions. The $q$- and $\varepsilon$-bounds are obtained by solving

$$ q\text{-bound:} \qquad \min_{v \in \mathcal{F}} \Delta\phi_C \quad \text{subject to } \mu(v) = \mu \tag{10} $$

$$ \varepsilon\text{-bound:} \qquad \min_{v \in \mathcal{F}} \Delta\phi_E \quad \text{subject to } \mu(v) = \mu \tag{11} $$

upon varying $\mu$, where $\mu(v)$ denotes the growth rate associated with $v$. Both problems are linear programs (LPs), which we solved with the openCOBRA toolbox [49]. The growth-rate dependent minimum values attained by the objective functions, which we denoted by $\Delta\phi_C^{(q)}$ and $\Delta\phi_E^{(\varepsilon)}$ in Eqs. (4) and (5) respectively, directly provide the lower bounds for $\Delta\phi_C$ and $\Delta\phi_E$. The upper bounds $\Delta\phi_E^{(q)}$ and $\Delta\phi_C^{(\varepsilon)}$ can be computed from the flux vectors $v^{(q)}$ and $v^{(\varepsilon)}$ that solve (10) and (11) respectively. The latter is simply given by $\Delta\phi_C^{(\varepsilon)} = w_C v_C^{(\varepsilon)}$. For the former, instead, since (10) only determines the value of the glucose import flux $v_C^{(q)}$, we searched for the simplest thermodynamically viable flux pattern among the vectors $v^{(q)}$ at fixed $v_C^{(q)}$ by minimizing the $L_1$-norm [50]. This effectively corresponds to performing a “loopless” version of FBA [51].

Constrained Allocation FBA (CAFBA [26]) was instead used to compute optimal flux patterns. CAFBA is a slight but significant modification of FBA in which $\mathcal{F}$ is further constrained through the additional condition described by Eq. (3). Its implementation still only requires straightforward LP as long as the biomass composition is growth-rate independent. See [26] for details. To solve CAFBA, we set the costs $w_i$ of reactions in the E-sector to the same value, namely $w_E = 8.\ldots \times 10^{-\ldots}\ \mathrm{g_{DW}\,h/mmol}$, and used E. coli-specific values for $w_R$ and $\phi_{\max}$ as done in [26].

Computation of the Pareto frontier

The Pareto front shown in Fig. 3b was computed by solving CAFBA with homogeneous costs ($w_i = w_E$ for each $i$) for different values of $w_C$, after silencing the ATP maintenance (ATPm) flux. To compensate for the lack of maintenance-associated energy costs, we increased the growth-associated ATP hydrolysis rate by an amount equal to the ATPm flux (i.e. 7.6 mmol$_{\mathrm{ATP}}$/g$_{\mathrm{DW}}$ in the iJR904 model), so that the total ATP hydrolysis flux at the maximum growth rate $\mu = 1$/h is the same as in the default model. The difference in the overall ATP hydrolysis flux (including the maintenance and growth-rate dependent components) between this implementation of CAFBA and the standard one is within 15% for growth rates above 0.5/h. For each different class of optimal solutions, the specific intake $q$ and the specific cost $\varepsilon$ were computed, returning a set of points (one for each class) in the $(q, \varepsilon)$ plane. The Pareto front is obtained by connecting these points via straight lines. Details of its construction are given in the Supporting Text.

Generation of sub-optimal CAFBA solutions

In order to generate the sub-optimal CAFBA solutions shown in Fig. 3b, we computed the values of $q = v_C/\mu$ and $\varepsilon = w_E \sum_{i \in E} |v_i| / \mu$ for flux vectors $v = \{v_C, \{v_i\}_{i \in E}\}$ different from the optimal ones. To ensure that such sub-optimal states lie sufficiently close to the Pareto front, we used flux vectors that are optimal for a version of CAFBA in which the homogeneous costs $w_i = w_E$ are replaced with independent, identically distributed random variables with mean $w_E$ and dispersion $\delta$, as for the case of CAFBA with heterogeneous weights discussed in [26]. After obtaining a large number of such vectors for different realizations of the random costs and different values of $w_C$, we computed the corresponding metabolic state vectors $\xi$ (by normalizing each $v$ by its growth rate) and, for each such $\xi$, we computed $q$ and $\varepsilon$ as defined above, i.e. using homogeneous costs $w_i = w_E$. This procedure allows one to construct viable solutions that are in general sub-optimal with respect to the CAFBA solutions obtained with homogeneous costs. The depth of the sampling, that is, the typical distance of sub-optimal solutions from the Pareto front, is controlled by the dispersion $\delta$ of the individual costs $w_i$ [26]. The sampled solutions approach the Pareto front as $\delta \to 0$. As $\delta$ increases, instead, the protein costs $\varepsilon$ associated with each state $\xi$ fluctuate widely, and metabolic states far from the Pareto front become more and more likely.

Comparison with mass spectrometry data

Mass spectrometry data from [10] include quantification of protein levels for E. coli NQ381 (a strain with a titratable lacY enzyme, based on the wild-type NCM3722 strain) grown in minimal lactose media. Five different growth rates were obtained by inducing different levels of lacY and, for each condition, quantitative proteomic data are available. The specific proteome cost $\varepsilon$ for the E-sector shown in Fig. 3c was obtained in the following way. First, reactions were assigned to the E-sector according to the partition used in [26]. Next, for each reaction, the Gene-Protein-Reaction matrix included in the iJR904 model was used to obtain a list of its corresponding enzymes. We denote by $n_{i,\mathrm{tot}}$ the number of reactions in which enzyme $i$ participates (irrespective of whether they are assigned to the E-sector or not), and by $n_{i,E}$ the number of such processes included in the E-sector. Given the experimental protein mass fractions $\phi_i$, our estimate for the specific E-sector proteome cost $\varepsilon$ is given by

$$ \varepsilon = \frac{1}{\mu} \sum_{i \in E} \frac{n_{i,E}}{n_{i,\mathrm{tot}}}\, \phi_i . \tag{12} $$

Unfortunately, the growth yields for the dataset at hand are not available. Instead, we employed an estimate obtained by solving CAFBA with heterogeneous $w_i$'s, using the default value of the ATP maintenance flux. This is justified by the quantitative accuracy that CAFBA achieves in predicting growth yields, detailed in [26]. Notice however that the yields themselves may vary considerably across experiments. For our purposes, though, the key quantity is not the absolute value of the yield but its decrease due to acetate excretion at fast growth rates, which is remarkably robust and independent of the glycolytic carbon source used [13].

[1] M. Scott, C. W. Gunderson, E. M. Mateescu, Z. Zhang, and T. Hwa, Science, 1099 (2010).
[2] M. Scott and T. Hwa, Current opinion in biotechnology, 559 (2011).
[3] C. You, H. Okano, S. Hui, Z. Zhang, M. Kim, C. W. Gunderson, Y.-P. Wang, P. Lenz, D. Yan, and T. Hwa, Nature, 301 (2013).
[4] P. Wang, L. Robert, J. Pelletier, W. L. Dang, F. Taddei, A. Wright, and S. Jun, Current Biology, 1099 (2010).
[5] G. Ullman, M. Wallden, E. G. Marklund, A. Mahmutovic, I. Razinkov, and J. Elf, Philosophical Transactions of the Royal Society B: Biological Sciences, 20120025 (2013).
[6] S. Jun and S. Taheri-Araghi, Trends in Microbiology, 4 (2015).
[7] A. S. Kennard, M. Osella, A. Javer, J. Grilli, P. Nghe, S. J. Tans, P. Cicuta, and M. C. Lagomarsino, Physical Review E, 012408 (2016).
[8] M. Schaechter, O. Maaløe, and N. Kjeldgaard, Journal of General Microbiology, 592 (1958).
[9] N. Kjeldgaard, O. Maaløe, and M. Schaechter, Journal of General Microbiology, 607 (1958).
[10] S. Hui, J. M. Silverman, S. S. Chen, D. W. Erickson, M. Basan, J. Wang, T. Hwa, and J. R. Williamson, Molecular systems biology, 784 (2015).
[11] R. Schuetz, L. Kuepfer, and U. Sauer, Molecular systems biology, 119 (2007).
[12] R. Schuetz, N. Zamboni, M. Zampieri, M. Heinemann, and U. Sauer, Science, 601 (2012).
[13] M. Basan, S. Hui, H. Okano, Z. Zhang, Y. Shen, J. R. Williamson, and T. Hwa, Nature, 99 (2015).
[14] R. U. Ibarra, J. S. Edwards, and B. O. Palsson, Nature, 186 (2002).
[15] N. D. Price, J. L. Reed, and B. Ø. Palsson, Nature Reviews Microbiology, 886 (2004).
[16] R. De Deken, Microbiology, 149 (1966).
[17] E. Postma, C. Verduyn, W. A. Scheffers, and J. P. Van Dijken, Applied and Environmental Microbiology, 468 (1989).
[18] P. P. Hsu and D. M. Sabatini, Cell, 703 (2008).
[19] R. Diaz-Ruiz, S. Uribe-Carvajal, A. Devin, and M. Rigoulet, Biochimica et Biophysica Acta (BBA)-Reviews on Cancer, 252 (2009).
[20] M. G. Vander Heiden, L. C. Cantley, and C. B. Thompson, Science, 1029 (2009).
[21] D. Molenaar, R. van Berlo, D. de Ridder, and B. Teusink, Molecular systems biology (2009).
[22] A. Flamholz, E. Noor, A. Bar-Even, W. Liebermeister, and R. Milo, Proceedings of the National Academy of Sciences, 10039 (2013).
[23] A. Maitra and K. A. Dill, Proceedings of the National Academy of Sciences, 406 (2015).
[24] A. Goelzer and V. Fromion, Biochimica et Biophysica Acta (BBA)-General Subjects, 978 (2011).
[25] E. J. O'Brien, J. A. Lerman, R. L. Chang, D. R. Hyduke, and B. Ø. Palsson, Molecular systems biology (2013).
[26] M. Mori, T. Hwa, O. C. Martin, A. De Martino, and E. Marinari, PLOS Comput Biol, e1004913 (2016).
[27] Y. Hart, H. Sheftel, J. Hausser, P. Szekely, N. B. Ben-Moshe, Y. Korem, A. Tendler, A. E. Mayo, and U. Alon, Nature methods, 233 (2015).
[28] A. Schmidt, K. Kochanowski, S. Vedelaar, E. Ahrné, B. Volkmer, L. Callipo, K. Knoops, M. Bauer, R. Aebersold, and M. Heinemann, Nature biotechnology, 104 (2016).
[29] A. J. Wolfe, Microbiology and Molecular Biology Reviews, 12 (2005).
[30] K. Potrykus and M. Cashel, Annu. Rev. Microbiol., 35 (2008).
[31] O. Neijssel, M. Teixeira De Mattos, and D. Tempest, in Escherichia coli and Salmonella: Cellular and Molecular Biology, edited by F. C. Neidhardt (ASM Press, 1996).
[32] B. D. Bennett, J. Yuan, E. H. Kimball, and J. D. Rabinowitz, Nature protocols, 1299 (2008).
[33] V. M. Boer, C. A. Crutchfield, P. H. Bradley, D. Botstein, and J. D. Rabinowitz, Molecular biology of the cell, 198 (2010).
[34] K. Valgepea, K. Adamberg, A. Seiman, and R. Vilu, Molecular BioSystems, 2344 (2013).
[35] E. J. O'Brien, J. Utrilla, and B. O. Palsson, PLoS Comput Biol, 1 (2016).
[36] A. Vazquez and Z. N. Oltvai, Scientific Reports (2016).
[37] S. Castaño-Cerezo, J. M. Pastor, S. Renilla, V. Bernal, J. L. Iborra, and M. Cánovas, Microbial cell factories, 1 (2009).
[38] K. Valgepea, K. Adamberg, R. Nahku, P.-J. Lahtvee, L. Arike, and R. Vilu, BMC systems biology, 166 (2010).
[39] R. B. Helling, C. N. Vargas, and J. Adams, Genetics, 349 (1987).
[40] D. S. Treves, S. Manning, and J. Adams, Molecular biology and evolution, 789 (1998).
[41] Q. Beg, A. Vazquez, J. Ernst, M. De Menezes, Z. Bar-Joseph, A.-L. Barabási, and Z. Oltvai, Proceedings of the National Academy of Sciences, 12663 (2007).
[42] A. Vazquez, Q. K. Beg, J. Ernst, Z. Bar-Joseph, A.-L. Barabási, L. G. Boros, Z. N. Oltvai, et al., BMC systems biology, 7 (2008).
[43] C. Woldringh, J. Binnerts, and A. Mans, Journal of bacteriology, 58 (1981).
[44] M. Basan, M. Zhu, X. Dai, M. Warren, D. Sévin, Y.-P. Wang, and T. Hwa, Molecular systems biology, 836 (2015).
[45] A. Goel, T. H. Eckhardt, P. Puri, A. Jong, F. Branco dos Santos, M. Giera, F. Fusetti, W. M. Vos, J. Kok, B. Poolman, et al., Molecular microbiology, 77 (2015).
[46] D. H. Huberts, B. Niebel, and M. Heinemann, FEMS yeast research, 118 (2012).
[47] J. L. Reed, T. D. Vo, C. H. Schilling, B. O. Palsson, et al., Genome Biol, R54 (2003).
[48] J. D. Orth, I. Thiele, and B. Ø. Palsson, Nature biotechnology, 245 (2010).
[49] J. Schellenberger, R. Que, R. M. Fleming, I. Thiele, J. D. Orth, A. M. Feist, D. C. Zielinski, A. Bordbar, N. E. Lewis, S. Rahmanian, et al., Nature protocols, 1290 (2011).
[50] D. De Martino, F. Capuani, M. Mori, A. De Martino, and E. Marinari, Metabolites, 946 (2013).
[51] J. Schellenberger, N. E. Lewis, and B. Ø. Palsson, Biophysical journal, 544 (2011).

A yield-cost tradeoff governs Escherichia coli's decision between fermentation and respiration in carbon-limited growth: SUPPORTING TEXT

Matteo Mori (1), Enzo Marinari (2, 3, ∗), and Andrea De Martino (4, 5, ∗)

(1) Department of Physics, University of California, San Diego, CA, USA
(2) Dipartimento di Fisica, Sapienza Università di Roma, Rome, Italy
(3) INFN, Sezione di Roma 1, Rome, Italy
(4) Soft and Living Matter Lab, Institute of Nanotechnology (CNR-NANOTEC), Consiglio Nazionale delle Ricerche, Rome, Italy
(5) Human Genetics Foundation, Turin, Italy

CONTENTS

I. Trade-offs in the general model of optimal protein allocation
   A. Duality of maximum growth and optimal proteome allocation
   B. Transitions between optimal phenotypes
   C. Consequences for Constrained Allocation FBA
II. Computation of the Pareto front
III. Phenomenological model: definition and solution in different cases
References

I. TRADE-OFFS IN THE GENERAL MODEL OF OPTIMAL PROTEIN ALLOCATION

A. Duality of maximum growth and optimal proteome allocation

Here we consider in more detail the problem of characterizing the optimal strategy for the cell to reallocate its proteome when a specific metabolic activity is limited. In the following, $\mu$ stands for the growth rate, $v_L$ for the flux of the limited activity, $\phi_{NL}$ for the mass fraction of the rest of the proteome and $q \equiv Y^{-1} = v_L/\mu$ for the specific flux of the limited activity (per unit of growth rate). These quantities in turn depend on internal variables of the cell such as metabolic fluxes and metabolite or enzyme levels (see e.g. [1]). However, instead of considering the whole space of states spanned by such variables, we will limit ourselves to a set of cellular states (“phenotypes”) which we label with an index $\alpha$. Each phenotype is assumed to be described by a different set of values for the limiting flux $v_L$ and for $\phi_{NL}$ at each growth rate $\mu$. We then define, for each $\alpha$, the quantity

$$ \Phi_\alpha(w_L, \mu) \equiv w_L\, v_L^\alpha(\mu) + \phi_{NL}^\alpha(\mu), \tag{1} $$

where $w_L$ denotes the cost of the protein controlling $v_L^\alpha$. Both $v_L^\alpha$ and $\phi_{NL}^\alpha$ are modulated by the dilution rate (i.e. $\Phi_\alpha \equiv \Phi_\alpha(w_L, \mu)$). In particular, we make the natural assumption that both increase with $\mu$ for each $\alpha$, namely

$$ \frac{\mathrm{d} v_L^\alpha}{\mathrm{d}\mu} > 0, \qquad \frac{\mathrm{d} \phi_{NL}^\alpha}{\mathrm{d}\mu} > 0 \qquad \forall \alpha . \tag{2} $$

∗ Co-last authors

For any given $w_L$, the growth rate $\mu_\alpha(w_L)$ pertaining to phenotype $\alpha$ can in principle be obtained by inverting the condition $\Phi_\alpha(w_L, \mu_\alpha) = 1$. The problem of growth rate maximization can thus be re-cast as the constrained maximization of $\mu_\alpha$ over the index $\alpha$ for given $w_L$ and $v^\alpha(\mu)$, i.e.

$$ \alpha^\star = \arg\max_\alpha\, \mu \quad \text{subject to } \Phi_\alpha(w_L, \mu) = 1, \tag{3} $$

or, more simply,

$$ \alpha^\star = \arg\max_\alpha\, \mu_\alpha \quad \text{subject to } \Phi_\alpha(w_L, \mu_\alpha) = 1 . \tag{4} $$

We call this the direct proteome-constrained problem. We denote by $v_L^\star$ and $\phi_{NL}^\star$ the values of $v_L$ and $\phi_{NL}$ corresponding to $\alpha^\star$.
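As a purely illustrative sketch of the direct proteome-constrained problem (3)-(4), the Python snippet below inverts the condition $\Phi_\alpha(w_L, \mu) = 1$ by bisection, relying only on the monotonicity assumption (2), and then selects the phenotype with the largest growth rate. The two phenotypes, their functional forms and all numerical values are hypothetical choices made for this example; they are not taken from the paper.

```python
from typing import Callable, Dict, Tuple

# A phenotype is a pair of increasing functions (v_L(mu), phi_NL(mu)).
Phenotype = Tuple[Callable[[float], float], Callable[[float], float]]


def growth_rate(v_L, phi_NL, w_L, mu_max=100.0, tol=1e-12):
    """Solve w_L * v_L(mu) + phi_NL(mu) = 1 for mu by bisection.

    Both functions are assumed to increase with mu (Eq. 2), so the
    left-hand side is monotone and the root in (0, mu_max) is unique.
    """
    def excess(mu):
        return w_L * v_L(mu) + phi_NL(mu) - 1.0

    lo, hi = 0.0, mu_max
    if excess(lo) >= 0.0 or excess(hi) <= 0.0:
        raise ValueError("no root of Phi = 1 in (0, mu_max)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def optimal_phenotype(phenotypes: Dict[str, Phenotype], w_L: float):
    """Return (name, mu) of the phenotype with the largest growth rate."""
    rates = {name: growth_rate(v, phi, w_L)
             for name, (v, phi) in phenotypes.items()}
    best = max(rates, key=rates.get)
    return best, rates[best]


# Hypothetical phenotypes: a low-yield/low-cost one and a high-yield/high-cost one.
PHENOTYPES: Dict[str, Phenotype] = {
    "fermentation": (lambda mu: 1.0 * mu, lambda mu: 0.1 + 0.3 * mu),
    "respiration": (lambda mu: 0.5 * mu, lambda mu: 0.1 + 0.6 * mu),
}

for w in (0.2, 2.0):
    print(w, optimal_phenotype(PHENOTYPES, w))
```

With these illustrative numbers the low-cost phenotype wins at small $w_L$ and the high-yield one at large $w_L$; bisection was chosen only because it needs nothing beyond monotonicity, and any other root finder would serve equally well.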
One can easily show that, for any growth rate $\mu$ for which all these quantities exist,

$$ v_L^{(q)} \leq v_L^\star \leq v_L^{(\varepsilon)} \tag{5} $$

$$ \phi_{NL}^{(\varepsilon)} \leq \phi_{NL}^\star \leq \phi_{NL}^{(q)} \tag{6} $$

where $(v_L^{(q)}, \phi_{NL}^{(q)})$ and $(v_L^{(\varepsilon)}, \phi_{NL}^{(\varepsilon)})$ are the values of $v_L$ and $\phi_{NL}$ obtained, respectively, from the solutions of

$$ q\text{-problem:} \qquad \alpha^{(q)} = \arg\min_\alpha\, v_L^\alpha(\mu) \quad \text{s.t. } \mu = \mu_\alpha, \tag{7} $$

$$ \varepsilon\text{-problem:} \qquad \alpha^{(\varepsilon)} = \arg\min_\alpha\, \phi_{NL}^\alpha(\mu) \quad \text{s.t. } \mu = \mu_\alpha, \tag{8} $$

where the values of $\mu$ are assumed to match the optimal growth rate in the direct formulation of the problem. In order to see this, one first has to introduce the following dual problem of the direct proteome-constrained problem (3):

$$ \alpha^\star = \arg\min_\alpha\, \Phi_\alpha(\mu) \quad \text{subject to } \mu = \mu_\alpha . \tag{9} $$

The solution of this problem is identical to that of the direct problem, Eq. (3), provided $\mathrm{d}\Phi_\star/\mathrm{d}\mu > 0$ and the growth rate $\mu$ is set so as to match the optimal one obtained in the direct problem (see [1]). Then, the proof is straightforward: because of the definitions of the optimization problems (8) and (9), $\phi_{NL}^{(\varepsilon)} \leq \phi_{NL}^\star$ and $w_L v_L^\star + \phi_{NL}^\star \leq w_L v_L^{(\varepsilon)} + \phi_{NL}^{(\varepsilon)}$, respectively. The first inequality directly gives one of the two bounds; the other, namely $v_L^\star \leq v_L^{(\varepsilon)}$, is obtained by combining both inequalities, since $w_L (v_L^\star - v_L^{(\varepsilon)}) \leq \phi_{NL}^{(\varepsilon)} - \phi_{NL}^\star \leq 0$. The proof of the remaining bounds (the $q$-bounds) is analogous.

Both the $q$- and the $\varepsilon$-problem can be shown to be equivalent to two other problems different from the direct proteome-constrained problem introduced above. Indeed, the former describes the minimization of the specific flux $q = v_L/\mu$ (or, equivalently, the maximization of the growth yield $Y = \mu/v_L$) at fixed growth rate, while the latter is equivalent to the maximization of the growth rate subject to a constraint on $\phi_{NL}$.

In other terms, optimal growth and optimal proteome allocation are dual to each other, and solutions to the direct proteome-constrained problem are bound to lie within a range defined by the $q$- and $\varepsilon$-problems.
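The bounds (5)-(6) can be checked numerically on a toy example. The sketch below assumes, for illustration only, phenotypes that are linear in the growth rate, $v_L^\alpha = q_\alpha \mu$ and $\phi_{NL}^\alpha = \varepsilon_\alpha \mu$, so that $\Phi_\alpha = 1$ gives $\mu_\alpha = 1/(w_L q_\alpha + \varepsilon_\alpha)$. The list of $(q, \varepsilon)$ pairs and the function name `solve_problems` are hypothetical.

```python
def solve_problems(states, w_L):
    """Solve the direct, q- and eps-problems for linear toy phenotypes.

    states: list of (q, eps) pairs, with v_L = q*mu and phi_NL = eps*mu.
    Returns the optimal growth rate mu_star and, for each problem, the
    pair (v_L, phi_NL) evaluated at mu_star.
    """
    # Direct problem: maximizing mu = 1/(w_L*q + eps) means minimizing the cost.
    q_s, e_s = min(states, key=lambda s: w_L * s[0] + s[1])
    mu_star = 1.0 / (w_L * q_s + e_s)
    # q-problem: smallest limited flux at fixed growth rate.
    q_q, e_q = min(states, key=lambda s: s[0])
    # eps-problem: smallest non-limited proteome fraction at fixed growth rate.
    q_e, e_e = min(states, key=lambda s: s[1])
    return mu_star, {
        "direct": (q_s * mu_star, e_s * mu_star),
        "q": (q_q * mu_star, e_q * mu_star),
        "eps": (q_e * mu_star, e_e * mu_star),
    }


# Hypothetical states (q, eps); numbers chosen only for illustration.
STATES = [(1.0, 0.2), (0.6, 0.5), (0.4, 1.0), (0.9, 0.9)]

mu_star, sol = solve_problems(STATES, w_L=1.0)
v_star, phi_star = sol["direct"]
# Eq. (5): v^(q) <= v* <= v^(eps);  Eq. (6): phi^(eps) <= phi* <= phi^(q)
print(sol["q"][0] <= v_star <= sol["eps"][0])
print(sol["eps"][1] <= phi_star <= sol["q"][1])
```

In this linear setting the $q$- and $\varepsilon$-problems reduce to picking the states with the smallest $q$ and the smallest $\varepsilon$; for phenotypes that are nonlinear in $\mu$ the minimizers would in general depend on the growth rate at which they are evaluated.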
The existence of these bounds allows one to study how cells may optimally handle different degrees of limitation, that is, different values of $w_L$, by switching between alternate solutions. In particular, for $w_L = 0$, the solution to the original problem coincides with the solution to the $\varepsilon$-problem. As $w_L$ increases, instead, the solution may shift towards states with larger yields (smaller specific rates) at the cost of increasing $\phi_{NL}$.

B. Transitions between optimal phenotypes

For any given “phenotype” $\alpha$, one can obtain a growth rate $\mu_\alpha$ as a function of $w_L$ by solving the equation $\Phi_\alpha(w_L, \mu) = 1$. Optimal phenotypes are those maximizing $\mu$ for each $w_L$. Assuming for simplicity that optimal states are unique, transitions between “phenotypes” $\alpha$ and $\beta$ may occur at values of $w_L$ such that $\mu_\alpha(w_L) = \mu_\beta(w_L)$ and $\mathrm{d}\mu_\alpha/\mathrm{d}w_L \neq \mathrm{d}\mu_\beta/\mathrm{d}w_L$. Since $\Phi_\alpha = \Phi_\beta = 1$, we would have

$$ w_L \left( v_L^\alpha - v_L^\beta \right) = \phi_{NL}^\beta - \phi_{NL}^\alpha, \tag{10} $$

i.e. any decrease in the flux $v_L$ has to be matched by an increase in the non-limited proteome $\phi_{NL}$, highlighting the tradeoff between optimal proteome allocation and efficient use of limited resources. By differentiating the condition $\Phi_\alpha(w_L, \mu_\alpha(w_L)) = 1$ with respect to $w_L$ one gets

$$ \frac{\mathrm{d}\mu_\alpha}{\mathrm{d}w_L} = - \frac{v_L^\alpha}{\mathrm{d}\Phi_\alpha/\mathrm{d}\mu}, \tag{11} $$

and therefore, assuming $\mathrm{d}\mu_\alpha/\mathrm{d}w_L > \mathrm{d}\mu_\beta/\mathrm{d}w_L$,

$$ \frac{v_L^\alpha}{\mathrm{d}\Phi_\alpha/\mathrm{d}\mu} < \frac{v_L^\beta}{\mathrm{d}\Phi_\beta/\mathrm{d}\mu} . \tag{12} $$

This inequality involves not only the absolute magnitude of the limited flux $v_L$, but also the variation of the proteome as the growth rate changes.

C. Consequences for Constrained Allocation FBA

The consequences of equations (10) and (12) depend on the model at hand. Let us consider, as in the main text, a constraint-based flux model with a proteome allocation constraint, such as CAFBA, with no maintenance ATP hydrolysis rate. In this case, as explained in the Main Text, we can introduce the metabolic states $\xi$ and express the fluxes as $v = \xi \cdot \mu$. Let us consider a set of different metabolic states $\xi_\alpha$, identified by an index $\alpha$. For each state we can compute the specific uptake $q_\alpha$ and the protein cost $\varepsilon_\alpha$, so that $v_L^\alpha = q_\alpha \mu$ and $\phi_{NL}^\alpha = \varepsilon_\alpha \mu$. The bounds in Eqs. (5) and (6) take the form

$$ q^{(q)} \leq q^\star \leq q^{(\varepsilon)} \tag{13} $$

$$ \varepsilon^{(\varepsilon)} \leq \varepsilon^\star \leq \varepsilon^{(q)}, \tag{14} $$

where $(q)$ and $(\varepsilon)$ refer to the specific states $\xi^{(q)}$ and $\xi^{(\varepsilon)}$, while the asterisk indicates the optimal state. On the other hand, Equations (10) and (12) respectively become

$$ w_L \left( q_\alpha - q_\beta \right) = \varepsilon_\beta - \varepsilon_\alpha \tag{15} $$

$$ \frac{q_\alpha \mu}{w_L q_\alpha + \varepsilon_\alpha} < \frac{q_\beta \mu}{w_L q_\beta + \varepsilon_\beta} \quad \Rightarrow \quad \frac{\varepsilon_\alpha}{q_\alpha} > \frac{\varepsilon_\beta}{q_\beta} . \tag{16} $$

Together, these two constraints imply that, at any transition between optimal states, both $q$ and $\varepsilon$ vary, and they do so in opposite directions. When $w_L = 0$, $\varepsilon^\star = \varepsilon^{(\varepsilon)}$, while the specific flux $q^\star$ is maximum. As $w_L$ increases, at each transition $q$ decreases as $\varepsilon$ increases. These properties lie at the basis of the Pareto front analysis.

II. COMPUTATION OF THE PARETO FRONT
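The transition condition (15) says that, as the intake cost grows, the optimal state moves between states that minimize the linear cost $q\,w + \varepsilon$, trading a smaller $q$ for a larger $\varepsilon$. For a finite set of states these are the vertices of the lower-left convex hull in the $(q, \varepsilon)$ plane. The Python sketch below illustrates this on a hypothetical list of states; it uses the standard monotone-chain convex-hull construction, which is an implementation choice made here and is not described in the source, and the function names are likewise assumptions.

```python
def _cross(o, a, b):
    """z-component of (a - o) x (b - o); > 0 for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def pareto_front(states):
    """States (q, eps) that minimize q*w + eps for some w >= 0, sorted by q."""
    pts = sorted(set(states))
    hull = []  # lower convex hull, built left to right
    for p in pts:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    # Keep only the part of the hull where eps decreases as q increases.
    front = [hull[0]]
    for p in hull[1:]:
        if p[1] < front[-1][1]:
            front.append(p)
        else:
            break
    return front


def switching_costs(front):
    """Cost w at which two neighboring front states are equally optimal (Eq. 15)."""
    return [(a[1] - b[1]) / (b[0] - a[0]) for a, b in zip(front, front[1:])]


# Hypothetical states (q, eps); the last two are sub-optimal by construction.
STATES = [(1.0, 0.2), (0.6, 0.5), (0.4, 1.0), (0.9, 0.9), (0.8, 0.6)]

front = pareto_front(STATES)
print(front)
print(switching_costs(front))
```

The front is returned from the highest-yield state (smallest $q$) to the cheapest one (smallest $\varepsilon$), so the switching costs come out in decreasing order; exactly collinear states are dropped by the hull step, which a more careful implementation might want to keep.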

Let us consider a value $\bar{w}_C$ of $w_C$ such that states $\alpha$ and $\beta$, with costs $C_\alpha = q_\alpha w_C + \varepsilon_\alpha$ and $C_\beta = q_\beta w_C + \varepsilon_\beta$, are optimal for $w_C = \bar{w}_C$, with $C_\alpha = C_\beta$ when $w_C = \bar{w}_C = (\varepsilon_\alpha - \varepsilon_\beta)/(q_\beta - q_\alpha)$. We consider for definiteness $q_\alpha > q_\beta$. Starting from $w_C < \bar{w}_C$ (so with $\beta$ as the optimal state), we are interested in constraining the parameters of a sub-optimal state $\gamma$ with cost $C_\gamma$, such that $C_\gamma > C_\alpha$ for $w_C > \bar{w}_C$ and $C_\gamma > C_\beta$ for $w_C < \bar{w}_C$. In what follows, we assume that $q_\beta < q_\gamma < q_\alpha$. Suppose that $w_C < \bar{w}_C$. The constraint $C_\gamma > C_\beta$ can be rewritten as

$$
\begin{aligned}
\varepsilon_\gamma &> w_C (q_\beta - q_\gamma) + \varepsilon_\beta \\
&= w_C (q_\beta - q_\gamma) + \varepsilon_\alpha + \bar{w}_C (q_\alpha - q_\beta) \\
&> \bar{w}_C (q_\beta - q_\gamma) + \varepsilon_\alpha + \bar{w}_C (q_\alpha - q_\beta) \\
&= \varepsilon_\alpha - \bar{w}_C (q_\gamma - q_\alpha),
\end{aligned} \tag{17}
$$

where we used $w_C (q_\beta - q_\gamma) > \bar{w}_C (q_\beta - q_\gamma)$ (since $w_C < \bar{w}_C$ and $q_\beta < q_\gamma$). Similarly, when $w_C > \bar{w}_C$, one gets

$$ \varepsilon_\gamma > \varepsilon_\beta - \bar{w}_C (q_\gamma - q_\beta) . \tag{18} $$

Note that conditions (17) and (18) identify the same half-space in the $(q_\gamma, \varepsilon_\gamma)$ plane, defined by the line passing through the points $(q_\alpha, \varepsilon_\alpha)$ and $(q_\beta, \varepsilon_\beta)$. Therefore, given a set of optimal states $\alpha, \beta, \ldots$, the Pareto frontier is obtained by connecting neighboring points with straight lines, as illustrated by a concrete example in Fig. 1.

[Figure 1. Left panel: total cost $C = q\,w_C + \varepsilon$ versus $w_C$, with the switching values $w_{C1}$ and $w_{C2}$ marked. Right panel: proteome cost $\varepsilon$ versus $q$.]

FIG. 1. Example of Pareto-optimality in CAFBA. Left: we show a set of possible cellular states $\alpha$, characterized by their proteome costs $\varepsilon_\alpha$ and their inverse yields $q_\alpha$. The total cost of each pathway is given by $C_\alpha = q_\alpha w_C + \varepsilon_\alpha$. Each total cost $C_\alpha$ is a linear function of $w_C$. The cost is minimized by the envelope (shown as a red line) of the lines corresponding to the three optimal pathways (dashed lines). The grey lines denote the costs of suboptimal pathways. Three optimal modes (carrying the lowest cost) appear as $w_C$ changes, numbered from one (minimum proteome cost $\varepsilon$) to three (minimum inverse yield $q$). The values of $w_C$ at which switches between optimal phenotypes occur are given by $w_{C1} = (\varepsilon_1 - \varepsilon_2)/(q_2 - q_1)$ and $w_{C2} = (\varepsilon_2 - \varepsilon_3)/(q_3 - q_2)$. Right: each grey cross in the $(q, \varepsilon)$ plane represents a suboptimal pathway. The Pareto frontier is shown in red and delimits the feasible region. The optimal modes (blue dots) lie on the Pareto frontier.

III. PHENOMENOLOGICAL MODEL: DEFINITION AND SOLUTION IN DIFFERENT CASES

We consider the simple reaction network shown in Fig. 2a. It involves 5 reactions among 3 chemical species, two of which ($m_1$ and $m_2$) can be exchanged with the exterior. The consumption of a “biomass” metabolite $e$ models growth. Reactions are summed up by

$$
\begin{array}{ll}
\text{“glycolysis” (rate } g\text{)} & m_1 \to a \cdot m_2 + b_1 \cdot e \\
\text{“respiration” (rate } r\text{)} & m_2 \to b_2 \cdot e \\
m_1 \text{ import (rate } u\text{)} & \emptyset \to m_1 \\
m_2 \text{ export (rate } v\text{)} & m_2 \to \emptyset \\
\text{“growth” (rate } \mu\text{)} & e \to \emptyset
\end{array} \tag{19}
$$

(For simplicity, the growth rate is expressed in the units of reaction fluxes.) Under steady-state mass balance, one has $u = g$, $au = r + v$ and $\mu = b_1 g + b_2 r$. Flux states compatible with these constraints can be expressed as functions of $\mu$ and $u$ alone. From $r \geq 0$ and $v \geq 0$ one gets instead the following bounds for the growth rate,

$$ b_1 u \leq \mu \leq (b_1 + a b_2)\, u, \tag{20} $$

implying that the yield $Y = \mu/u$ of the different steady states lies between the yield of fermentation $Y_{\mathrm{fer}} = b_1$ and that of respiration $Y_{\mathrm{res}} = b_1 + a b_2$. As Eq. (20) does not by itself limit the growth rate, extra constraints have to be enforced to obtain well-defined solutions to the problem of maximizing $\mu$. The nature of optimal states therefore depends on which additional constraints are imposed. We consider three distinct scenarios, whose solutions are summarized in Fig. 2b-d.

Standard FBA scenario:

A minimal choice consists in imposing an upper bound on the carbon uptake flux, $u \leq u_{\max}$, where $u_{\max}$ may coincide with the maximum carbon uptake rates observed in experiments with continuous cultures. In this condition, maximization of $\mu$ returns the state with the largest growth yield, i.e. $\mu^\star = Y_{\mathrm{res}} u_{\max}$, $r^\star = a u_{\max}$ and $v^\star = 0$. This is a well-known property of the solutions of standard FBA.

FBA with Molecular Crowding (FBAwMC) scenario [2, 3]:

In this case a “crowding constraint” is imposed, consisting of an overall bound on intracellular fluxes of the form $c_1 g + c_2 r = 1$. Now, for the growth rate one finds $\mu = b_2/c_2 + (b_1 - b_2 c_1/c_2)\, g$. For $b_2/c_2 > b_1/c_1$, the optimal solution is obtained by minimizing $g$, i.e. by setting $g = g^\star = (b_1 + a b_2)^{-1} \mu$, and presents respiratory metabolism. If instead $b_2/c_2 < b_1/c_1$, then $g$ should be maximized ($g^\star = \mu/b_1$) and the optimal solution presents fermentative metabolism. The inclusion of explicit coefficients for the carbon uptake $u$ and the fermentation flux $v$ in the crowding constraint only leads to a re-definition of the coefficients $c_1$ and $c_2$. Note however that the crowding picture for E. coli appears to be at odds with experiments showing that the cell density remains constant under a variety of perturbations [4, 5].

[Figure 2. Panel (a): network diagram with fluxes $g$, $r$, $\mu$, $u$, $v$ and species $m_1$, $m_2$, $e$. Panels (b)-(d): growth rate $\mu$ versus uptake $u$ for the bound $u = u_{\max}$, the crowding constraint, and CAFBA with $w_C < Q$ and $w_C > Q$.]

FIG. 2. (a) Coarse-grained model of metabolism (see text). A metabolite $m_1$ can be consumed to provide “energy” $e$ plus other metabolic products $m_2$. These other products can be either further processed to yield more energy $e$ (plus waste), or excreted. The metabolites $e$ are then used by the cell to grow. (b)-(d) Solution space of the model imposing the steady-state constraints $\mathrm{d}[m_1]/\mathrm{d}t = \mathrm{d}[m_2]/\mathrm{d}t = \mathrm{d}[e]/\mathrm{d}t = 0$. Grey areas denote infeasible regions. Feasible solutions (white area) are delimited by the linear bounds $b_1 u \leq \mu \leq (b_1 + a b_2)\, u$, see Eq. (20). As $\mu$ does not have an upper limit, an extra constraint is required in order to maximize the growth rate $\mu$. We consider 3 different scenarios. (b) Maximization of $\mu$ subject to a maximum intake flux (standard FBA). (c) Maximization of $\mu$ subject to a molecular crowding constraint. (d) Maximization of $\mu$ subject to a proteome allocation constraint, with the lines corresponding to the cases $w_C < Q$ and $w_C > Q$ shown explicitly. Optimal solutions for each of the examples displayed are shown as red circles.

Constrained Allocation FBA (CAFBA) scenario [6]:

In this case, the additional constraint models proteome allocation and reads $w_C u + w_1 g + w_2 r + w_R \mu = \phi_{\max}$. As in the FBAwMC case, the growth rate can be expressed as a function of $g$ alone, obtaining

$$ \mu = \frac{1}{1 + b_2 w_R / w_2} \left[ \frac{b_2 \phi_{\max}}{w_2} + \frac{b_2}{w_2} \left( Q - w_C \right) g \right] \tag{21} $$

with $Q = w_2 \frac{b_1}{b_2} - w_1$. $Q$ can be interpreted as the additional protein cost of respiration, $w_2/b_2$, with respect to the cost of fermentation, $w_1/b_1$, where the protein costs are weighted by the inverse yields of the pathways. Since $w_C$ is (roughly) inversely proportional to the nutrient level, the sign of $Q - w_C$ might change when one passes from good carbon sources to poor ones. In particular, for the realistic case where $Q > 0$, the optimal solution shifts from fermentation to respiration as $w_C$ increases (i.e. as carbon is limited).

[1] M. Mori, M. Ponce-de León, J. Peretó, and F. Montero, Frontiers in Microbiology, 1553 (2016).
[2] Q. Beg, A. Vazquez, J. Ernst, M. De Menezes, Z. Bar-Joseph, A.-L. Barabási, and Z. Oltvai, Proceedings of the National Academy of Sciences, 12663 (2007).
[3] A. Vazquez, Q. K. Beg, J. Ernst, Z. Bar-Joseph, A.-L. Barabási, L. G. Boros, Z. N. Oltvai, et al., BMC systems biology, 7 (2008).
[4] C. Woldringh, J. Binnerts, and A. Mans, Journal of bacteriology, 58 (1981).
[5] M. Basan, M. Zhu, X. Dai, M. Warren, D. Sévin, Y.-P. Wang, and T. Hwa, Molecular systems biology, 836 (2015).
[6] M. Mori, T. Hwa, O. C. Martin, A. De Martino, and E. Marinari, PLOS Comput Biol 12
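Returning to the CAFBA scenario of the phenomenological model in Section III, the switch at $w_C = Q$ can be reproduced with a few lines of Python. The sketch below compares the two extreme states allowed by Eq. (20), pure fermentation ($r = 0$) and pure respiration ($v = 0$), under the proteome constraint, and cross-checks the result against Eq. (21). All parameter values are hypothetical and chosen only so that $Q > 0$; the function names are assumptions of this example.

```python
# Hypothetical parameters (illustrative only), chosen so that Q > 0.
PARAMS = dict(a=2.0, b1=1.0, b2=2.0, w1=1.0, w2=3.0, w_R=0.5, phi_max=1.0)


def q_threshold(b1, b2, w1, w2, **_):
    """Q = w2*b1/b2 - w1, the extra protein cost of respiration."""
    return w2 * b1 / b2 - w1


def mu_of_g(g, w_C, b1, b2, w1, w2, w_R, phi_max, **_):
    """Growth rate as a function of the glycolytic flux g, Eq. (21)."""
    Q = w2 * b1 / b2 - w1
    prefactor = 1.0 / (1.0 + b2 * w_R / w2)
    return prefactor * (b2 * phi_max / w2 + (b2 / w2) * (Q - w_C) * g)


def cafba_toy(w_C, a, b1, b2, w1, w2, w_R, phi_max):
    """Return (phenotype, mu) maximizing mu under the allocation constraint.

    mu is linear in g (Eq. 21), so the optimum sits at one of the two
    extreme states allowed by Eq. (20).
    """
    # Pure fermentation: r = 0, u = g = mu / b1.
    mu_fer = phi_max / ((w_C + w1) / b1 + w_R)
    # Pure respiration: v = 0, r = a*g, mu = (b1 + a*b2) * g.
    mu_res = phi_max / ((w_C + w1 + a * w2) / (b1 + a * b2) + w_R)
    if mu_fer >= mu_res:
        return "fermentation", mu_fer
    return "respiration", mu_res


print("Q =", q_threshold(**PARAMS))
for w_C in (0.2, 1.0):
    print(w_C, cafba_toy(w_C, **PARAMS))
```

With these numbers the optimal state is fermentative below $w_C = Q$ and respiratory above it, and the two growth rates coincide at $w_C = Q$, in line with the discussion of Eq. (21).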