On the scaling of functional spaces, from smart cities to cloud computing

Spacetimes with semantics II (supplement)∗

Mark Burgess

October 10, 2018
Abstract
The study of spacetime, and its role in understanding functional systems, has received little attention in information science. Recent work, on the origin of universal scaling in cities and biological systems, provides an intriguing insight into the functional use of space, and its measurable effects. Cities are large information systems, with many similarities to other technological infrastructures, so the results shed new light, indirectly, on the expected scaling behaviour of smart pervasive infrastructures and the communities that make use of them. Using promise theory, I derive and extend the scaling laws for cities to expose what may be extrapolated to technological systems. From the promise model, I propose an explanation for some anomalous exponents in the original work, and discuss what changes may be expected due to technological advancement.
∗These notes are a continuation of my series on semantic spacetimes. This document is inspired by the studies of coarse-grained universal scaling in cities [1–3], and a comparison with models developed over the past decade or two on information systems, e.g. [4, 5]. As IT systems grow in scale, it is natural to expect a bridge between the behaviours of cities, software networks, and other functionally 'smart' spaces, and one hopes for a better understanding of pervasive information technology in social contexts. Work on this, from the low-level viewpoint, has already begun in [6, 7]. I want to show how these views relate.
Two recent works have shed light on the role of infrastructure in the scaling of functional systems: the observation of universal, scale free behaviours, both in observed data and in a 'mean field' description of cities [1, 2], and the observation, in biology, of metabolic scaling laws for organisms, following a century of observation [8]. Both of these were explained from the behaviours of their internal networks. Cities are an example of collaborative community networks, where humans and technology mix within a semantically (i.e. functionally) rich space, equipped with infrastructure. One may ask how cities differ from apparently similar communities across different scales, including tribes, collectives, companies, software installations, and even countries. Understanding the dominant processes that make these shared environments smart, creative, and productive, is a worthy investment, given how 21st century life relies on them so much for its success and survival.

In this note, I review and build on Bettencourt's model of cities [1], and discuss its implications for pervasive information technology (IT). By building on the lessons of cities, I hope to foster a better understanding of a broader range of functional systems, especially in information technology, while trying to preserve the simplicity of Bettencourt's arguments. I begin by summarizing an interpretation that lays the foundations for a more microscopic formulation of the model, using promise theory [9]. The latter may be used to relate outcomes to intentions and mechanisms, in a way that respects the idea of scaling.
This section is a review (and trivial generalization) of the city scaling arguments, and data, used by Bettencourt and collaborators at the Santa Fe Institute (SFI) [1–3], in a form suitable for comparison with other work. In section 3, I propose a deeper justification for the model, with some embellishments. Some interpretations may be my own.
Measurable attributes, of finite dynamical systems, typically scale in proportion to some measure of their size. Size may refer to the number of agents N (persons) in the system, or per unit volume V of the system, and N and V may or may not be related. This point of view is a basic tool of analysis in physical systems. For cities, N is used as the scaling variable. Across an ensemble of many systems of different size, the measurements one obtains may scale in three broad ways (see figure 1):

Figure 1: Scaling of a square quantity, relative to the circular system size: (a) sublinear, (b) linear, (c) superlinear. If a cost is superlinear, it applies a braking force on city size.
1. Sublinear scaling of quantities q of the infrastructure machinery, q ∝ N^β, β < 1. This indicates economies of scale, because, as the system size grows, the cost becomes relatively cheaper. In cities, it is observed to apply to the transport networks that animate the system (arterial systems, roads, cables, petrol stations, etc).

2. Linear scaling, simply proportional to the number, q ∝ N^β, β = 1. In cities, this seems to apply to direct consumption of goods and resources per capita (jobs, houses, water consumption, etc).

3. Superlinear yields of produce or 'output', where q ∝ N^β, β > 1, which is driven by interactivity between the parts of the system and its consequences (wages, rents, patents, crime, disease, GDP). If a process rate is superlinear, then the corresponding time for the process to run will be sublinear, and vice versa.

It is of particular interest when these patterns seem to apply across such a broad range of scales. This suggests some emergent universality, whose origin and mechanism is worth understanding. As part of a protracted project to uncover the behaviours of cities or urban metropolitan districts, Bettencourt has proposed a mean field model to explain observed scaling behaviour of certain economic measures, making only elementary assumptions about the processes within [1]. The model predicts the main features of the data, by assuming a dynamical universality, but seems to fall short of describing the full range of observed scaling exponents β. Empirical data from [2] revealed sublinear, linear, and superlinear scaling behaviour in the variety of accessible data. The data came initially from mostly American cities (see table 1), and have since been demonstrated in European cities in [10]. All cities, thus far, have a more or less comparable level of technological development, and thus fit plausibly into a statistical ensemble.

  SUBLINEAR           LINEAR               SUPERLINEAR
  Fuel sales          Housing              New patents
  Fuelling stations   Employment           Inventors
  Length of cables    Power consumption    R&D employment
  Road surface        Water consumption    Other creative employment
                                           Disease (AIDS)
                                           Crime

Table 1: Examples of city outputs with sub- and superlinear scaling per capita, reported in [2].

Cities are not just dynamical systems; they also exhibit complex semantics, or purposeful, intentional, patterned behaviours. In the physics of inanimate systems, the markers of semantics are comprised of only a few simple labels: charges, force laws, and interaction graphs. These are constant over time, and accepted as universal 'physical law'. However, in human functional systems, the range of interactions, and their assumed meanings, is much richer, and may depend on time and context. Coarse graining, by creating a mean field model, is a standard physical approach which eliminates the types, labels, and other semantics of networks, and exposes the universality of scaling phenomena. However, this simplicity is a trade-off: semantics are also what sustains the arrangement and composition of functional processes, at a deeper level, and could lead to actionable predictions.

In these notes, I show how semantics provide some additional structure and constraints on dynamics, and how both short and long range interactions may be distinguished through functional dependency. This leads to a possible explanation for the discrepancy between data and predictions of superlinear scaling exponents in [1, 2], using a slightly more detailed model than [1], based on promise theory [9].
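In practice, the exponent β for any such measure is estimated as the slope of a log–log regression of the measure against population across the ensemble of cities. A minimal sketch of that procedure (not taken from [1, 2]; the city sizes, prefactor, and noise level are illustrative assumptions):

```python
# Sketch: recovering a scaling exponent beta from ensemble data q ∝ N^beta
# by least-squares regression in log-log coordinates. All numbers here are
# synthetic and illustrative, not the empirical city data from [2].
import numpy as np

rng = np.random.default_rng(42)

N = np.logspace(4, 7, 50)               # ensemble of city populations
beta_true = 5.0 / 6.0                   # a sublinear quantity, e.g. road surface
q = 3.0 * N**beta_true * rng.lognormal(0.0, 0.1, size=N.size)  # noisy measurements

# beta is the slope of log q against log N
beta_fit, log_prefactor = np.polyfit(np.log(N), np.log(q), 1)
print(f"fitted beta = {beta_fit:.3f}")  # close to 5/6 ≈ 0.833
```

The same fit applied to a superlinear quantity would return a slope above 1; the quality of such fits, and their error margins, is exactly what is at stake in the discussion of anomalous exponents below.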
Consider a city, in the coarsest approximation, as a bounded homogeneous mass, sustained by external supplies and internally generated wealth. This is analogous to a model of gravity and pressure versus volume, whose equilibrium defines an average city size. Bettencourt likens the arrangement to a star, which gravitates together from the benefits of city infrastructure, and expands through the accretion of new inhabitants.

The city is populated by N inhabitants, which I will call 'agents', and which may include machines, animals, etc., or any proxies for human intent that lead to economic output. The data on cities, used in the comparison, are based on human population numbers, so N will refer to the people, in the first approximation. The agents are distributed within a volume V, in D dimensions. Cities are more or less two dimensional (D = 2), in spite of high rise regions, because the more or less 1 dimensional networks connecting them lie mainly in a plane.

• Let the number of agents in the network, or city community, be:

    N = N_I + N_0,    (1)

where N_0 is a partial dead-weight population that is not interacting with the city infrastructure (as user or provider), and the functional networks generating the yields are agents from the N_I population. Bettencourt does not distinguish between N_I and N_0, but it is useful to track this distinction throughout the scaling argument.

A dependence on N can mean two different things, in the formulae: an average value of a static population, across an ensemble of different cities, or a dynamically growing value of N within a single city, over the course of its evolution. The interpretation of the results is sensitive to this distinction, requiring some care. If universality were completely true, they would be the same.
• A city is a volume V of agents, accreted from a wider region, with finite compressibility, by virtue of some average space requirements per person, and a pressure: a (non-detailed) balance condition, which matches the output shared amongst interactions to the resources that can be fed to the connected agents.

• An estimate is made of the minimum fraction of the volume to be infrastructure V_I that could connect the city's agents together into a virtual mesh.

• The city infrastructure is assumed to be 'sparsely' utilized, at equilibrium, in the sense that the technology on which ensemble cities are based can absorb fluctuations in space and time, without significant contention or expansion, else it would choke to a standstill. This is a technical (statistical) property, which is necessary to achieve economies of scale, through spacetime multiplexing. A city may appear to be busy, even crowded at certain moments, but 'sparse' means that it could technically get a lot more crowded, on average, without collapsing from congestion.

• Various functional output yields Y of the city may be calculated in a form suitable for ensemble comparison, to determine their scaling with N. Some outputs stem from individual agents, and some are from the interactions between them.

To make a mean field explanation plausible, over a range of scales and circumstances, some assumptions are required. The mean field model should not be confused with a detailed model of physical processes. It is an effective representation that emphasizes universal aspects of behaviour. Expanding on what is stated in [1]:

• The ability to form an ensemble over multiple cities assumes that, when similar outputs are promised, they are made by the same kinds of agents, with the same basic capabilities [13]. If one city has a technology that makes the same output twice as productive, this will not necessarily respect the ensemble's universality assumption, leading to anomalies.
The latter assumption suggests that there might be difficulties comparing third world cities, primitive societies, or fully automated production facilities with more homogenized average candidates.

• The distribution of agents within a city cannot be predicted or described by a mean field model. I'll return to this issue in section 3. The agents involved in a specific output 'yield' Y might be concentrated into regions; however, there may also be multiple regions with the same role, adding

∗The dead weight could be green spaces, but more likely slum dwellings, such as those that dominate Baltimore: hot-beds for crime and the shadow economy. This merely interferes with city output as far as the world is concerned.

∗In Feynman's words, there is 'plenty of room at the bottom'. Fluctuations in network utilization generally exhibit long-tailed behaviour [11, 12], so we must be far away from the point of collapse, in spite of superficial appearances like traffic jams.

∗Arrival processes like Poisson and Lévy distributed events have the property that a convolution of multiple flows is form invariant.
up to the total partial volume V_Y used in the expressions. All are lumped into a single equivalent volume V_Y for the purpose of total comparison (see figure 2).

• A key assumption is that infrastructure filaments take up only a small or 'sparse' fraction of the volume of the city, V_I ≪ V, but can cope with all of its load requirements. This sparse use of resources, within the city volume, will be important to justifying the superlinear scaling.

• In this rendition of the model, I add one trivial extension to [1]: the infrastructure network connects a fraction N_I ≤ N of the total population N together, leaving some residual number N_0 of city dwellers unconnected, or non-participatory. This allows us to track the possible impact of a non-productive mass.

• The infrastructure network is not a mesh network itself, but it delivers sufficient capacity and interconnectivity to allow a virtual mesh of coverage, i.e. any agent can reach any other agent with equal average cost.

• The maximum output of the city community is assumed to follow Metcalfe's law, which estimates the productivity of a network proportionally to the maximum number of links that can be made [14]. This has been criticized theoretically (for a sample see [15, 16]). However, recently this conjecture has received empirical support from social media studies [17]. I will derive this result, and its limitations, from promise theory in section 3.6.3, and also show that this cannot be the only measure of productivity needed to reproduce the data.

• Let us measure both value and cost in the dimensions of 'money' [M]. In [1], the author uses power (energy per unit time) as the proposed currency; however, money might be easier to grasp for many readers.
Figure 2: The yield agency model is a pudding model, in which different yields may be spread about inside the city bounds, but they may all be lumped into a single equivalent volume for the purpose of scaling the ingredients.
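The Metcalfe assumption listed above, that maximum cooperative output tracks the number of possible links in the virtual mesh, amounts to quadratic growth in the connected population. A minimal sketch (the population figures are illustrative, not drawn from the city data):

```python
# Sketch of the Metcalfe assumption: in a virtual mesh of n connected agents,
# the number of possible pairwise links, and hence the assumed maximum
# cooperative output, grows quadratically. Numbers are illustrative only.
def possible_links(n_connected: int) -> int:
    """Maximum number of undirected links in a full mesh of n agents."""
    return n_connected * (n_connected - 1) // 2

# Doubling the connected population roughly quadruples the link count:
small, large = possible_links(1000), possible_links(2000)
print(large / small)   # close to 4: the quadratic (Metcalfe) growth
```

This is only an upper bound on interactivity; whether realized output actually saturates this bound is precisely what the promise-theoretic derivation in section 3.6.3 examines.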
Regarding the economic output of the city, the model is quite simple.

• The city's activities yield outputs, labelled by different Y.

• These come from groups of agents that interact and collaborate in an unspecified way. The group occupies an aggregate volume V_Y, for each Y (see figure 2). These patches are virtually contiguous even when they are physically distributed, and so they may overlap in space.

∗Metcalfe originally assumed that value creation would be proportional to N², while costs grew proportional to N. The study [17] indicates that both grow quadratically with network vertex count, though these ideas are still disputed [15, 16].

By assuming that V_Y is a fraction of the N_I interaction mesh, one implicitly allows the members of a yield producer to be located anywhere within the equal-cost network. The economic output due to a process Y may be written approximately as:

    Y = g_Y v_Y^coop N_I² / V_I,    (2)

where g_Y has units of money [M], and N_I is assumed dimensionless. In many cases we can assume that N_I ≅ N, so that the entire city is active (no dead weight or free riders), but there is no need to make this identification yet.

In the volume of the city, network links are counted using a continuum approximation, in terms of volumes and fractions of volumes. This is an unfamiliar step in computer science, but it plays an important role in deriving the scaling laws, and this case can offer valuable lessons. The minimum size of the infrastructure network can be estimated by squeezing the total sparse volume into a narrow, approximately one dimensional pipeline, with a small cross section. This is only plausible if the network utilization is really sparse, since then the total interaction can be compressed into the lower dimensional network, by multiplexing. The average distance between agents inside the city (in D dimensions) is

    d = (V/N)^(1/D).    (3)

An infrastructure network has the topology of a graph, in the mathematical sense, but it may also have sufficient structure to pervade space. It is embedded in a real world volume, and needs to reach the agents distributed within. If the agents cluster around the network, the system will remain largely one dimensional; however, if the network penetrates the space homogeneously (either by wiring or by the movement of inhabitants who use it), then (again, in the spirit of generality) the effect of this 'space filling', or fractal invasion, may be captured by an effective (Hausdorff) dimension H < D, so that we may write the order of magnitude estimate for the infrastructure volume:

    V_I ≥ g_I (V/N)^(H/D) L^(D−H) N_I,    (4)

where g_I < 1 is a dimensionless constant that indicates the fraction of nodes in N_I spanned by the particular infrastructure being considered. L is some fixed scale with the dimensions of length [L], so that

    [V] = [L]^D,    (5)

and L^D ≪ V_I ≪ V. In other words, the volume is the effective average linear volume swept out by a fixed cross section L^(D−H), as it feeds into the N_I nodes connected by the infrastructure.

∗This was an important argument in deriving the biological scaling laws [8].

∗The model cannot formally distinguish between the intricacy of the infrastructure itself and the movement of agents around it, but it makes sense to assume that it is the motion of people and mobile agents that is complex, rather than the system of roads and wires of the city. I'm grateful to Luis Bettencourt for this comment.

∗It is well known that the scaling of ad hoc communications networks, where agents are distributed randomly, is like √N.
This is easily understood from the spatial geometry: mobile phones occupy some approximately two dimensional area, so the diameter is of order √N; alternatively, they have average separation d ≅ (V/N)^(1/2), so the distance across the group is of the order √N d. The linearity of the process gets mixed up with the geometry of the embedding space.

This has the structure of N_I queues that are serialized paths of dimension V^(1/D). For H = 0 the nodes are unconnected, for H = 1 roads are serial or linear, and for D > H > 1, the roads or channels have an effective fractal 'thickness', from a coarse-grain perspective (see fig. 3). It turns out that we only need to look at H = 1, as it is serialization rather than physical dimension that is important.

The assumption that we can squash a volume V into a smaller linearized volume V_I is the key process, or universal mechanism for comparison between cities. It expresses what we mean by a 'sparsely utilized' infrastructure, i.e. the gas of inhabitants is somewhat compressible. One thinks first of physical channels, such as roads, cables, and transportation costs etc; however, any serial stream of work could constitute a work process from a number of agents within the volume V.

Figure 3: The volume of the sparse infrastructure network is negligible compared to the ball of the city.

Thus we write the infrastructure volume approximately as (rewriting (4)):

    V_I ≥ g_I V^(H/D) L^(D−H) N_I N^(−H/D),    (6)

representing N_I serial queues of supporting services, being fed from a D dimensional region. The V^(1/D) scaling represents a serialization of the work from across the homogeneous region.

Although the city population and volume are presumed to accrete, as people come, attracted by the promises of the city, a given population has to be sustainable. The total population and volume of the city must satisfy an equation of state, which explains the average balance of these payments. The scaling of these payments is the same, but in reverse, so the balance ends up only as a sign to the coefficients. From the structure, there must be three main parts:

• Agent sustenance: the existence constraint on agent survival, individually, represents a separate concern that does not play directly into the scaling of the city (every agent for itself). It is represented, implicitly, through the assumption of constant N. This has two aspects: the attractive force that brings people to the city, and the resources that feed and supply the balance of payments. Few cities are self-sufficient within their bounds. These aspects remain formally unexplained in [1], and thus do not play a detailed role in the data described.

• Work output sustenance: output yields Y, produced by the various agents of the city, whether individuals or factories, are assumed to make the city cash flow positive, together with the input of external resources, making the city economically viable. This feedback is not modelled directly, only assumed by the positive signs of the coupling coefficients along links, and the assumption of steady state. Thus, cities might be profitable, or borrowing money to reach this steady state.
∗There is no way to capture these factors in the mean field approximation.

• City volume sustenance: what we can model is that the supply of resources must balance the outputs of the city:

    Running cost ≤ Resource supply in − Transport cost.    (7)

The running and transport costs are assumed to be low, else the agents would not be able to sustain their existence. The costs are in the links and in the linearized supply through the infrastructure. The feeding of resources (perhaps from outside the city) through the linearized infrastructure (RHS) powers interactions within the partial city volumes, over many activities (LHS), and effectively inflates the volume of the city by placing a scale V over which resources must be transported. This yields a simple (non-detailed) balance condition:

    Cost of interaction links ≤ Supply through infrastructure
    g_Y v_Y N_I²/V ≤ c_Y V^(H/D) N,    (8)

where we assume that work cost is proportional to the distance travelled (analogous to W = F · dl), and c_Y is the cost per unit length of path transport along the infrastructure (dimensions [M][L]^(−H)). The positive coupling g_Y includes the balance of payments to keep the city viable. Rearranging this inequality gives the implied constraint on the sustainable volume:

    V ≥ a (N_I²/N)^(D/(D+H)),    (9)

where a = (g_Y v_Y^maintain / c_Y)^(D/(D+H)).

Substituting the steady state volume of the city into the expression for the infrastructure volume, the model now predicts the three kinds of scaling from the introduction:
1. Sublinear. For the scaling of the infrastructure itself, substituting the steady state volume (9) into (6), we get:

    V_I ≥ g_I a^(H/D) L^(D−H) (N_I²/N)^(H/(D+H)) N_I N^(−H/D).    (10)

Setting D = 2 and H = 1,

    V_I ≅ N_I^(5/3) N^(−5/6).    (11)

For pervasive N ≅ N_I, this yields

    V_I ≅ N^(5/6),    (12)

giving the sublinear scaling observed by [1]. When the infrastructure cost is basically absent, this scales like 'every man for himself', like an ideal gas of non-interacting agents.

2. Linear. For individual agent consumption, the scaling is trivially linear, by assumption, for both inputs (consumption C⁻) and outputs C⁺:

    C⁻ = e⁻ N,    (13)
    C⁺ = e⁺ N,    (14)

where the dimensions [e^∓] = [M] are of money.

3. Superlinear. The positive economic yield of a process in the city, due to interactions, may be written as a fraction of the possible N_I² output that can be channelled through the volume V_I, with a different constant of proportionality for each output:

    Y_Y⁺ = G_Y N_I²/V_I,    (15)

where G_Y is assumed positive, absorbing the costs of interaction, and from (6) we have an expression for the infrastructure volume, up to invariant unknowns, which are simply constants that do not depend on the state variables N or V. So, substituting (6) into (15),

    Y_Y⁺ = (g_I⁻¹ G_Y L^(H−D)) V^(−H/D) N_I N^(H/D),    (16)

and substituting for the volume from (9), since it also depends on N_I:

    Y_Y⁺ = (g_I⁻¹ G_Y L^(H−D) a^(−H/D)) N^(H(2D+H)/(D(D+H))) N_I^((D−H)/(D+H)).    (17)

If we substitute D = 2 and H = 1, this scales as

    Y_Y⁺ ≅ N_I^(7/6) (N/N_I)^(5/6).    (18)

And if we further assume that the infrastructure network is pervasive, so that N ≅ N_I, then

    Y_Y⁺ ≅ N_I^(7/6) ≅ N^(7/6).    (19)

This reproduces the superlinear scaling identified theoretically in [1], and this result matches about half of the superlinear scaling data quite well. Other data show significantly higher values for the scaling. If the connected population is insignificantly small, N_I ≪ N, then there would be binomial corrections to the N^(7/6) scaling. The 'dead population' reduces the power law slightly (in binomial corrections), so we could expect processes, for which most of the city cannot contribute, to scale below the 7/6 ≅ 1.17th power; thus this cannot explain the higher scaling exponents. An enhanced explanation is proposed in section 3.7.

In spite of a smattering of city related narrative, this calculation is based on a very simple and universal argument about resources exchanged between a D dimensional volume and a one dimensional supply network. There must be sufficient spacetime volume associated with ensemble-standard infrastructure to accommodate growth in N_I, or an increase in density of the users, without saturating it. Multiplexing in space and time is the key to this.

• Measures relating to the output of self-sufficient agents scale linearly, as one would expect. If we double the number of suppliers, there is twice as much availability.

• Shared resources may exhibit economies of scale if they become relatively fewer per person, as cities grow, without impairing output. These economies of scale seem to be quite well captured by the model, using a value of β = 5/6 ≅ 0.83 to match data, which suggests that the result makes sense both for N interpreted as an ensemble average and as a growth parameter over time.

• Finally, there are superlinear outputs, whose behaviour is more subtle. A production output Y may be a fraction, not of N, but of N², because the maximum output can depend on the interactions between agents, and not just the agents working alone. Output may or may not depend on the volume of the city, i.e.
perhaps only the number of agents, or currants in the pudding, and perhaps the space they occupy in the course of their interactions; this depends on the kinematics of the detailed processes.

For every scaling benefit, there may also be scaling costs, with the opposite sign: contention for shared resources, spiralling costs of equilibration, etc. Some general comments about the mean field approximation:

• People have different jobs, capabilities, and habits. In the mean field representation, such detail is not represented explicitly, but this does not imply that it doesn't exist. Indeed, it must be present to account for different flavours of output. However, this is not a one-to-one association. Formally, the outputs are fractions of the total amount of possible produced work, averaged over all specializations, mechanisms, and functions. To understand why this is plausible, we need to understand that outputs are the result of combinations of jobs, in collaboration (see section 3.5). They are assumed equally likely on average across cities of the ensemble. The diversity of a city may increase with size, but this does not matter as long as we assume that all jobs are the same. The data suggest that this does matter (see section 3.7).

• A city must have 'potential barriers', or entry costs for construction and community building, that depend on the level of technology available. This is also not represented in the model. There might be significant debt associated with building, which is invisible in this picture. These are transient responses, invisible in steady state behaviour.

• The technology at the time of building ought to affect the density of the city too: as transport improves and gets cheaper, it enables greater distances to be covered in the same amount of time, i.e. lower density. Conversely, it enables more efficient transport of provisions to sustain a high population density. One might not expect old cities like Rome or London to be immediately comparable with Brasilia or Shenzhen. Similar questions could apply to the pace of life. Does this depend only on the size of the city, or also on cultural norms? What if we compared Dakar with Seoul, for example? The scaling depends only on relative density, however.

• To be comparable in an ensemble average, one could expect that cities would need to be comparable in productive technology, composition, and wastage. A machine could do many times the work of a manual labourer, for instance, so it would not be fair to compare a city with mechanization to a city of horses and carts. The data in [2] were taken from mainly American cities, which have a relatively uniform level of technological infrastructure, and arise from similar epochs. There is no particular reason why they would be comparable to the slums of Mumbai. But this remains to be discovered.

• By distinguishing between N and N_I, it is possible to see how the productive capacity of the city could be altered by the presence of unproductive regions. From the expansion of (18), it would seem that the higher the proportion of the population that cannot contribute to a process, the lower the scaling power of outputs might become. It is known, for instance, that a typical '80/20 rule' (power law behaviour) is almost universal for networks, i.e. 80 percent of the yield typically comes from 20 percent of the agents [11, 18].

∗This could be checked by comparing the size of the yellow pages directory for a community to its white pages directory.

∗Just 30 years old, 'Shenzhen speed' is the stuff of legend in China.

The mean field approximation describes behaviour at the scale of the whole population N, i.e. at the coarsest scales. It is unable to probe the separation of such scales, by weak local interactions, or distinguish long and short range interactions. At smaller range, the semantics of interactions often become vital to the functioning of a system. In physics, 'semantics' are simple and mostly fixed by 'physical law', e.g.
they manifest as different 'charges', 'forces', or allowed interaction types. In a human-technology system, however, there are many more flavours of interactive behaviour that may be distinguished, and their number and patterns might even change over time.

The urban scaling predictions reviewed here have been partially matched to data, with N spanning several orders of magnitude [1]; this offers strong evidence that the underlying semantics of comparable cities are unimportant at these scales. However, the argument above underestimates some of the data significantly, especially where the data exhibit a high level of uncertainty, estimated in the margins for β. This suggests that there is something not captured adequately by the preceding argument. There are two possibilities: either inhomogeneities across the ensembles are significant, or the outputs do not all follow a single model, and we are looking for either an addition or a refinement. I shall try to shed some light on these issues, in the next section, by deriving the model from a more microscopic formulation, using promise theory.

Some aspects of the data, analyzed in [2], are suggestive of a fit with the single universality model in [1], but not all. The authors attributed the superlinear scaling quantities to creativity or innovation [1, 2], though the link between this explanation and the calculation of the exponent is incomplete. Below are approximate representations of a few samples of the superlinear scaling data from [2] (see reference for details of the numbers):
EASURE A PPROX . A
VERAGE β S OURCE
Wages . ± . USAGDP . − . ± . EU, Germany, ChinaPatents . ± . USAPrivate R&D employment . ± . USAR&D employment . ± . chinaR&D establishments . ± . USAAIDS cases . ± . USAAs pointed out by the authors, many of the numbers have a high level of uncertainty, due to the difficultyof fitting data from disparate sources . Even with generous margins, the single predicted value of β =7 / . does not plausibly agree with all of these measures, however, and the deviation seems tonot be attributable to a normal variation. The data in the table below, by contrast, were collected inan independent study in the UK that attempted to verify the superlinearity hypothesis in only a singlemeasure: patents (see reference [13] for details):M EASURE A PPROX . A
VERAGE β S OURCE
Patents . ± . UK all citiesPatents . ± . UK small cities My hat goes off to the authors for making this effort though! • Patents, wages, GDP, disease, are related to the promises of an output, i.e. produce that relate to aproduction process. • Jobs related to research are not outputs but occupations. They represent the state of agents. Theyare not (directly) the result of a process .If we consider the version model of [1], there are two freedoms that remain to alter (17). One is to imaginethe existence of a large fractional N population (e.g. by invoking the 80/20 rule for the distribution ofoutputs), relative to the population N I , involved in making each particular promise. However, this wouldhave the effect of reducing the value, so, while it might potentially explain the UK data, it could notexplain the SFI data. The second is to assume H > , which suggests a significant self-similarity in theinfrastructure that relates to patent production. However, this does not seem credible as the infrastructurethat enables patents is principally researchers, which do not think in paths that fill space. Something aboutthe universality assumptions need to be reconsidered. A possible resolution is provided in section 3.7.One clue to these behaviours could be that the selections represent highly specialized sub-populations,rather than homogeneous fractions. Recall that the way superlinearity arises is that an efficiency of scaleeffectively gets better at large N , leading to an effective amplification with the size of the population.Wages and GDP to involve the largest fraction of the population of a city community. All the othermeasures are based on highly specialized populations to which few contribute. Another point is thatpatents are not merely the result of an economic process, but are sometimes weapons used politicallyand strategically as part of non-collaborative economic warfare between companies. 
This suggests that semantics would play a role in their explanation, and potentially skew some data (cities with companies like Apple and Samsung, well known for software patent fights, might appear differently).

In section 3.7, a generic possible explanation is proposed, based on the generalized semantics of scaling agents into clusters (superagents). If one distinguishes agents from their promises, then staged efficiencies, arising from functional dependency, can be compounded or reduced, altering the β value hierarchically:

    Infrastructure --(× . )--> Superagents --(× . )--> Produce/Output    (20)

To derive this plausibly, we need to formulate a simple promise theory of communities.
(Research occupations could conceivably be related to a training process, but should they then be counted as population or output? Software patents, similarly, usually have low production costs, as they are often trivial and frivolous inventions.)

Before looking at a deeper model, I want to comment on another scaling model, from the world of information technology (IT). Superlinear scaling has also been observed in high performance cluster computing [20], albeit for a different reason. Most IT models are essentially one dimensional in nature, describing serial processes, divided into parallel one-dimensional threads. The amplification of output in this kind of flow network is based on queueing results, which may be approximated quite well by a simple rational function of the number of N_L local, parallel worker threads; then this takes the general parametric form:

    S(N_L) = N_L / (1 + α(N_L − 1) + β N_L(N_L − 1))    (21)

The output rate for a collective of N_L agents is S(N_L) times that of a single agent (see appendix D). The two terms in the denominator represent two kinds of network process, at close range. The linear term is a bottleneck term, where all the agents contend over a limited shared resource. As long as α > 0, scaling will be sublinear. The quadratic term is the cost of making the mesh of agents agree about something at each stage of the process. This is a cooperation, or 'homogenization of information' (coherency) term, which puts a sharp brake on scaling. For example, in database or cache replication, information has to be replicated consistently, which is expensive. This term reduces the speedup to the point where performance can actually grow worse with increasing N (see figure 4).

Figure 4: The Gunther universal scaling law for its control parameters: (a) α = 0, β = 0, linear scaling; (b) α > 0, β = 0, cost of sharing resources and diminishing returns from contention; (c) α > 0, β > 0, negative returns from incoherency and the cost of equilibration.

The expression (21) cannot exhibit a superlinear result for positive coefficients. Nonetheless, superlinear scaling has been observed in computing clusters [20]. How can these facts be resolved? Gunther argues that superlinear scaling is not possible without violating work conservation; but, by artificially making α < 0, as a parametric fit, it is possible to simulate higher dimensional effects in this one dimensional projection. This is a one dimensional view of a process that is actually two or three dimensional. Datacentre clusters, with dense interconnect networks, approach mesh levels that fill the effective volume of the datacentre during their internal processes. As this trend continues, we can expect datacentres to behave a lot like the model of cities discussed. The appearance of superlinear scaling came as a shock in IT, because there is no way to understand α < 0 from a microscopic model.
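A minimal numeric sketch of (21) illustrates the three regimes in figure 4. The function name and the α, β values here are assumptions for illustration, not taken from the text:

```python
def usl_speedup(n, alpha, beta):
    """Gunther's Universal Scalability Law, eq. (21):
    S(N) = N / (1 + alpha*(N - 1) + beta*N*(N - 1))."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))

# (a) ideal linear scaling: no contention, no incoherency
assert usl_speedup(64, 0.0, 0.0) == 64.0

# (b) contention alone (alpha > 0): sublinear but still monotone
s32 = usl_speedup(32, 0.02, 0.0)
s64 = usl_speedup(64, 0.02, 0.0)
assert s32 < 32 and s32 < s64 < 64

# (c) incoherency (beta > 0): throughput peaks, then grows worse with N
peak_n = max(range(1, 1025), key=lambda n: usl_speedup(n, 0.02, 0.0005))
assert usl_speedup(2 * peak_n, 0.02, 0.0005) < usl_speedup(peak_n, 0.02, 0.0005)
```

Note that for any positive α, β the denominator exceeds 1, so S(N) < N: the formula cannot produce superlinearity without a negative coefficient.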
However, it can be understood as a renormalization effect, i.e. an effective parametric representation of the projected output. (The dense internal interconnect traffic referred to here is called East-West traffic in the IT industry.) Qualitatively, the serial limit on scaling could be cheated as follows, in a sparsely utilized system. When the average state of the system is highly underutilized to begin with, then some of the efficiency can be regained by close packing of the utilization. If dense packing can absorb the increased size of a system (like a city or computational process), then that economy of scale can enable greater output for a relatively smaller infrastructure (renormalized negative growth). Could this lead to a superlinear speedup? The phenomenon of packing is related to queue parallelization, where, for example, a single G/G/N queue is provably more efficient than N separate G/G/1 queues [4, 24]. This is because wasted idle time can be eliminated by an efficient packing of work. This cannot persist for ever, as eventually the sparse utilization of the limiting work agents must fill up to capacity. As it does so, the amount of contention (α for shared bottlenecks, and β for equilibration of state [25]) must increase rapidly. The response time R of a queue (proportional to its average length E(n) divided by the dispatch rate µ) takes the form:

    R ≃ E(n)/µ = 1/(µ − λ)    (22)

This queue is only stable when λ ≪ µ, indicating sparse usage. The scaling indicated in the city results suggests cities that have not yet peaked in their infrastructure utilization. Superlinear scaling sounds like a good thing, but it is unstable, as the queueing projection illustrates.

The rational queueing expression in Gunther's Universal Scaling Law (21) can never explain the fractional scaling exponents seen in cities, but it can demonstrate some projected superlinearity with α < 0. To feed superlinearity, we need something more than parallel serial processes where the work is done by N point sources. Only if the work is done by N² interactions can a partial efficiency make the exponent greater than unity. To get this, we need to feed higher dimensional volumes into lower dimensional volumes. This is what is going on in the mean field theory of [1]:

    Interaction output = Maximum output(N) × Fraction of infrastructure involved(N)    (23)

From a graph theoretical perspective, this is a change in the average connectivity of the infrastructure network (i.e. the average degree of nodes k [4]). If the fraction is a fraction of a volume, rather than a line or a number, there are dimensional exponents involved, which represent the contact efficiency from close packing the city population. With more dimensions, a larger surface area can be used for interaction.
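The instability in (22) can be sketched numerically. The rates below are illustrative, and `response_time` is a name invented here:

```python
def response_time(lam, mu):
    """Mean response time of a single-server queue, eq. (22): R ≈ 1/(mu - lam).
    Only meaningful in the stable regime lam < mu."""
    if lam >= mu:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (mu - lam)

mu = 10.0  # dispatch (service) rate, jobs per unit time

# sparse usage (lam << mu): response time stays near the bare service time
assert abs(response_time(1.0, mu) - 1.0 / 9.0) < 1e-12

# response time grows without bound as utilisation approaches capacity
times = [response_time(lam, mu) for lam in (5.0, 9.0, 9.9, 9.99)]
assert times == sorted(times)      # monotonically increasing
assert times[-1] > 50 * times[0]   # 1/0.01 = 100 versus 1/5 = 0.2
```

The blow-up as λ → µ is the queueing analogue of the claim above: packing gains are only available while utilization remains sparse.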
If we assume that the infrastructure network is sufficiently dense that it reaches almost everyone, then this continuum approximation is reasonable:

    density of infrastructure users = N_I / V_I = n_I N^δ(D) ,    (24)

where δ(D) = 1/(D(D+1)), for H = 1. Recalling that this volume is really a continuum approximation of a network of nodes, this translates into an average node degree utilization (or locally used connectivity) within the infrastructure channel:

    k(N) = N_I / V_I = n_I N^δ(D) .    (25)

(Note that this is not a real connectivity, which has to do with the number of nodes, but a kind of close-packing of the sparse interactions that occur between the nodes into the infrastructure stream. This packing is also the motivation for packetization (atomization) of networks, and for context switching in time sharing operating systems.) Assuming the infrastructure is pervasive, so that N ≃ N_I, the equivalent serialized infrastructure volume, for a single process, is something like:

    V_I = (V/N)^δ × N_I ≃ N^(1−δ) × cross section ,    (26)
        = capture volume per agent × span of agents × cross section .

The efficiency comes from being able to use more of a fixed-cost but sparsely utilized resource, along with other economies of scale. The net result is an amplification of the output by δ:

    Interaction related output Y = const (v / V_I) N_I² ≃ Y_0 N^(1+δ(D)) .    (27)

The numerator v is effectively constant, but the infrastructure volume scales sublinearly, so the net output appears superlinear, under these assumptions. The question is: how do we know whether these are the same assumptions as those used for the measurements?

Modern datacentres and networks at scale have multiple redundant paths that make their interconnection networks space filling (e.g. Clos structures [26]). This brings higher dimensional scaling issues into the picture. The general principle is one of close packing of utilization.
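The exponents can be checked with exact arithmetic, taking δ(D) = 1/(D(D+1)) for H = 1 as above; the function name `delta` is a convenience invented here:

```python
from fractions import Fraction

def delta(D, H=1):
    """Scaling exponent delta(D) = H / (D*(D+1)), eq. (24) with H = 1."""
    return Fraction(H, D * (D + 1))

# two-dimensional city surface: delta = 1/6
assert delta(2) == Fraction(1, 6)

# infrastructure volume scales sublinearly, V_I ~ N^(1 - delta)
assert 1 - delta(2) == Fraction(5, 6)

# interaction-related output scales superlinearly, Y ~ N^(1 + delta)
assert 1 + delta(2) == Fraction(7, 6)   # beta = 7/6 ≈ 1.17

# a three-dimensional, space-filling interconnect, still with H = 1
assert delta(3) == Fraction(1, 12)
```

Using `Fraction` keeps the exponents exact, so the D = 2 prediction β = 7/6 falls out without floating point noise.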
When dependencies scale more favorably than the contended processes that rely on them (relatively speaking), each process gets a larger share of the shared resource, and is accelerated for a while, provided the total utilization remains low.

I now want to show that we can re-derive the scaling, and indeed the model of [1], by using a quasi-atomic theory of a network, taking into account some of the more important semantics, to see how universality emerges and where its limits might lie. Promise theory strikes a balance between semantics and dynamics, and thus between coarse graining and the chemistry of different functional agents.
Promise theory is a formalism that describes the dynamics of atomic, black-box agents, alongside their broad functional semantics, i.e. their intentional behaviour. The default or ground state of agents is one in which they make no promises, and are independent or 'autonomous', i.e. each agent is self-sufficient and controls its own internal resources, and each agent can make promises only about its own capabilities and intentions. Assuming that an agent is not deliberately deceiving, a promise may be considered the best available local information about the likelihood of an outcome.

Promise theory can be understood in a number of ways. For the present purpose, it can be thought of as a labelled graph theory, with some rules and constraints on interpretation. A promise from an agent S to one or more agents R takes the form

    S --(b)--> R    (28)

where b is the body of the promise. The promise body explains what is being promised, and has polarity (+ to offer, − to accept), as well as type τ(b) and constraint χ(b). (A simple analogy for the fixed infrastructure cross section of the previous section is a tube of toothpaste: the toothpaste comes out in a one dimensional stream of fixed width, but we are forcing the output of a three dimensional tube through this portal, and asking how the amount that comes out increases with the size of the tube if we squeeze it in the same way. By fixing the cross section, we can compare different tubes, or different cities. From a physics perspective, promises look a lot like a rich array of charge flavours, for exotic forces, and the network looks a lot like a force field.)

Figure 5: What is an agent? Agents aggregate to make superagents, with new promises possible at each scale of agency. So what we consider to be an agent depends on the scale of the networks under consideration.

All agents are considered to be equivalent a priori, distinguished only by their promises. In summary:

1. Agent types are distinguished by the promises they make.

2. The way that agents make use of one anothers' intent through promises is what we mean by agent semantics.

3. The outcome of a promise is deduced by observation; this is called an assessment in promise theory. An assessment by agent i about a promise π is written α_i(π).

4. A link between two agents requires a promise by both parties: one to make a service offer (+), and the other to accept it (−):

       S --(+b)--> R
       R --(−b)--> S
       } = unidirectional transfer    (29)

This kind of binding is the basis for determining the coupling strength g_Y for the output yields. Promise theory is a theory of incomplete information, and embodies controlled coarse graining over semantic scales.

Consider a set of agents A_i, where i runs over the population of the system (city, datacentre, etc.), and all machines and proxies for human intent. In order to collaborate, agents need to make some basic promises [9]. Every promise may be either kept or not kept, and the average value needs to be self-sustaining. Each autonomous agent thus has a balance of payments to consider. It needs to accept fuel, food, energy, money, etc. Dependencies also include raw materials, which have to come from outside. If agents can cache resources, they can maintain weak coupling; otherwise they are strongly dependent on their environments. This applies to energy and supplies, and also to inputs of information and ideas. Promises made directly between agents are called short range.

Definition 1 (Short range interaction)
A binding between adjacent agents S and R of the form

    S --(+τ, χ+)--> R    (30)
    R --(−τ, χ−)--> S    (31)

where τ is the same in both promises, and χ+ ∩ χ− ≠ ∅.

A promise may be long range if it is non-local, i.e. it couples several agents together, or employs intermediaries.
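Definition 1 can be sketched as a small matching rule. The `Promise` class and `binds` function are names invented here, and the promise types and constraints are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Promise:
    giver: str
    receiver: str
    polarity: str          # '+' to offer, '-' to accept
    ptype: str             # promise type tau
    constraint: frozenset  # constraint body chi

def binds(offer, accept):
    """Short-range binding (Definition 1): complementary polarity, matching
    endpoints, the same type tau, and overlapping constraints chi+ ∩ chi- != ∅."""
    return bool(offer.polarity == '+' and accept.polarity == '-'
                and offer.giver == accept.receiver
                and offer.receiver == accept.giver
                and offer.ptype == accept.ptype
                and offer.constraint & accept.constraint)

offer    = Promise('S', 'R', '+', 'water', frozenset({'daily', 'metered'}))
accept   = Promise('R', 'S', '-', 'water', frozenset({'daily'}))
mismatch = Promise('R', 'S', '-', 'power', frozenset({'daily'}))

assert binds(offer, accept)        # same type, overlapping constraint
assert not binds(offer, mismatch)  # different promise type: no binding
```

Only the (+)/(−) pair with common type and non-empty constraint overlap forms a link; everything else leaves the agents disconnected, which is what makes specialist networks sparse.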
Definition 2 (Long range interaction)
A binding between adjacent agents S and R, through an intermediate agent I:

    S --(+τ, χ+ | d)--> R    (32)
    R --(−τ, χ−)--> S    (33)
    I --(+d)--> S    (34)
    S --(−d)--> I    (35)

where τ is the same in both promises, and χ+ ∩ χ− ≠ ∅.

Note that there is no a priori notion of distance in a graph, other than the number of interactions or hops between agent nodes. The familiar notion of distance comes about from embedding a graph in a metric space, which in turn is related to a continuum approximation.

At any scale, a promising agency that plays a functional systemic role makes promises of the following general forms:

    A_i --(−f)--> A_ext , ∀i.    (36)

There must be an external agency acting as a source of this fuel f, providing

    A_ext --(+f)--> A_i , ∀i.    (37)

Any agent A_i may depend on a pre-requisite promise of dependency d, provided by another, in order to provide a service +S, according to the assisted promise law [9]:

    A_i --(+S | d)--> A_j , A_i --(−d)--> A_k ≃ A_i --(+S)--> A_j    (38)

provided

    A_k --(+d)--> A_i .    (39)

In this way, agents have probes, services, skills (+), and needs or receptors (−) that can unlock or catalyze their functionality. The agent expresses its exterior promises outwardly, e.g. a door handle's function is recognized by its shape, just as a car and its promises are recognizable by its exterior structure. Interior promises might be involved in making the exterior ones, but these are not generally visible at superagent scale. The basics of scaling semantics and agency are laid out in [7].

3.3 Promise networks: functional interactions

Functional networks have two aspects to their productivity, which can be described by:

• Replication of agent output: dynamics, economics.

• Combination of used services, ideation by mixing: intent, semantics, fitness for purpose.

It is the combination of these that leads to the understanding of fully functional scaling.
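The assisted promise law (38)-(39) — a conditional promise becomes effective only when its dependency is actually supplied — can be sketched as a toy reduction. All names and the string encoding of promise bodies here are invented for illustration:

```python
def effective_promises(offers, acceptances):
    """Toy reduction of the assisted promise law, eq. (38): a conditional
    offer '+S|d' by agent A_i reduces to an effective '+S' only if A_i also
    accepts d ('-d') and some other agent offers '+d' to A_i.

    offers, acceptances: sets of (giver, receiver, body) triples.
    """
    effective = set()
    for giver, receiver, body in offers:
        if '|' not in body:
            effective.add((giver, receiver, body))  # unconditional offer
            continue
        service, dep = body.lstrip('+').split('|')
        needs_dep = any(g == giver and b == '-' + dep
                        for g, r, b in acceptances)
        has_source = any(r == giver and b == '+' + dep
                         for g, r, b in offers)
        if needs_dep and has_source:
            effective.add((giver, receiver, '+' + service))
    return effective

offers = {('Ai', 'Aj', '+S|d'), ('Ak', 'Ai', '+d')}
acceptances = {('Ai', 'Ak', '-d')}

# with the dependency supplied, A_i effectively promises +S to A_j
assert ('Ai', 'Aj', '+S') in effective_promises(offers, acceptances)

# without the upstream source A_k, the conditional promise stays latent
assert effective_promises({('Ai', 'Aj', '+S|d')}, acceptances) == set()
```

This is the mechanism, used below, by which conditional dependencies chain short range promises into longer range interactions.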
Basic communication and supply infrastructure networks enable interactions between any pair of agencies, but specialist functional networks are typically small and disconnected [27]. They relate specific promises or services that are constrained by operational semantics. The scaling of such networks has been examined in [7]. It has a few aspects:

• Agents keep promises at scale by individually promising similar capabilities in parallel, e.g. Amdahl's and Gunther's laws of scaling [20, 22, 23]. A promise that is considered kept by a single agency scales by having multiple sub-agents form a 'superagent'. Now Bettencourt's insight reveals how output can also scale in a non-parallel fashion [1].

• When one agent depends on a promise being kept by another, in order to keep a promise, this creates a serial dependency, introducing a queue and a handling rate scale. The agents must be able to understand one another, with a common language, else they are partitioned. Partitioning cuts off long range interaction, and promotes long range diversity, like cultural and specialist diversity. (The lack of a common language is effectively a channel separation, disconnecting networks into separate branches. In communication networks, channel width is sometimes shared between different partitioned agencies by using non-overlapping frequency ranges. This is called Frequency/Wavelength Division Multiplexing (FDM). It is a form of multi-tenancy.)

Figure 6: No single type of promise binding (dark lines) leads to percolation of value in the promise graph. However, with conditional dependency, and sufficient diversity and homogeneity, there can still be effectively close to N² links whose value converts into a common currency.

A hypothesis of promise theory is that one may define a notion of force for agents, which is attractive when there is economic advantage, and repulsive for economic disadvantage. (It does not matter here whether we consider the force to be a Newtonian deterministic force, or a probabilistic susceptibility for drifting closer, as in stochastic systems. We can think of a generalized energy-momentum tensor [28], for generalized agent 'fields'.) The formation of superagent structures then follows two tendencies:

• Agents which make the same kinds of promises, of the same polarity, tend to repel one another.

• Agents which make complementary (binding) promises, of opposite polarity, tend to attract one another.

Applied to the city problem, this suggests that the basic attraction to condense the city out of a surrounding gas of agents comes from the common supply promises, which are predominantly of positive sign, and which all agents share: the survival infrastructure. This is held in check by the repulsion of agents making similar promises, except where there are promises to cooperate. One may expect structures as follows:

• Clusters of professionals bound together by cooperation promises.

• Chained transport agents, bound together by conditional promises.

• Distributed competitors, perhaps clustering around shared infrastructure hubs, e.g. malls, districts.

The embedded spacetime structure of the city should be an equilibrium configuration between the attraction to needs and desires, and the repulsion from competition with similar skills.

In promise theory, a specialized role characterizes a pattern of agents that make similar promises. By specializing specific tasks to specific agents, each agent can be more focused in learning and adapting, but acquires an additional cost of cooperation, proportional to some positive power of the cluster size. Superagency collaboration is a short range interaction (between interior promises) [7]. It can form long range strong interactions, with associated cost, if it has the internal resources to support these. Limiting interactions to short range leads to stability. The long range interactions drive superlinear effects, but also promote 'chaos'.

Figure 7: The geometry of superagents may fill space in different ways. Infrastructure that interconnects other agencies is a superagent in its own right, involving linear or approximately linear cooperation between member agents. Under preferential attachment, agents N_I tend to cluster around the infrastructure agency, leaving the remaining agents padding out the spatial volume. The circles around the subagents may be considered infrastructure binding the agents together.

The notions of attraction and repulsion are wired into our imaginations in terms of spatial concepts. Even without an embedding spacetime, we can speak of agent affinities, like the interactions described in molecular chemistry, where spacetime plays no real role. With a physical volume to embed a graph of promise-keeping interactions, geometry ties range to distance; but in a virtual network (which includes transport of messages by intermediate carrier), short range interactions can also be disseminated over a longer effective range, by adding cost or latency (such as in telecommunications).
In order to justify Metcalfe's law, there needs to be an average level of interaction that spans the complete graph and propagates value. This doesn't necessarily require a single promise type to dominate the entire graph, because it is the value graph, not the promise graph, that needs to link up all the nodes. However, from the previous section, we would expect basic infrastructure to dominate. The main requirement for this is the presence of a common (or at least interchangeable) currency between all the nodes.

Specialized exterior service promises naturally lead to small molecular clusters of component 'atoms' (superagents) that make specific interior promises. They seldom span large areas, because promises act like short range interactions (which is also why superagents can be considered quasi-atomic black box agents, see figure 6), so they do not easily bring about percolation of value in an economic zone, like a city. The promises which are ubiquitous are associated with the survival of agents, and rely on the most general kind of infrastructure in systems: power, food, air, water, etc. These are likely responsible for the interconnection of the many smaller microcosms of value creation (small businesses in cities, and microservices in IT) to bring about a unified community with its economies of scale.

If we let N_τ be the number of agents that consume a promise of type τ, then we expect the class of τ related to survival to be of the order N_survival ≃ N_I, in the meaning of the city model. For all other types, N_other ≪ N_I. However, we'll see in section 3.7 that long range interactions are also needed to explain the scaling exponents for cities. The size of the effective network is not therefore given by the adjacency matrix of the underlying infrastructure network, but rather by the typed promise graph. It is useful to recall the definition of a promise network (see [7]).

Definition 3 (Promise adjacency matrix)
The directed graph adjacency matrix which records a link if there is a promise of any type τ, with body b_ij(τ), between the labelled agents:

    Π^(τ)_ij = 1 iff A_i --(b_ij(τ))--> A_j , ∀ b_ij(τ) ≠ ∅    (40)

The adjacency is the effective topology of the spacetime network, as far as the agents are concerned. The link-occupancy of this matrix, for a given promise type, is a linear sum whose value is generally much lower than that of the total possible mesh of interactions. Thus, for any promise type τ,

    Σ_{i,j=1..N_I} Π^(τ)_ij = N_τ(N_τ − 1) ≪ N_I² .    (41)

Note that an agent can make a promise to itself too, so the upper limit could be written N_I². The value-percolating connectivity or degree of a node is given by

    Π_ij = Σ_τ Π^(τ)_ij ,    (42)
    k_i ≃ Σ_j Π_ij .    (43)

We can also write this in terms of the direct valuation of the promises, in terms of the actual matrix of promises π_ij [7]:

    k_i ≃ Σ_j v_C(π_ij) ,    (44)

where v_C is the value of the promise as calibrated and assessed by a common central agency (see appendix A). Agents can keep multiple promises, and multiple types of promise, 'simultaneously' over a given timescale T, by multiplexing their time at a rate much faster than 1/T, to avoid the queueing instability. On the assumption of sufficiently sparse packing:

    Σ_τ Σ_{i,j=1..N} Π^(τ)_ij ≤ N(N − 1) .    (45)

For the economic output of a promise network, we care more about the assessments of which promises were kept than about the number of promises that were made (see appendix A). Each agent assesses promises individually, and they may not agree. However, to compare with city statistics, we may assume that a statistical bureau agency has been appointed by the city to calibrate these assessments α_official(π_τ) according to a single scale. Promise-keeping is an average over time.
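The typed matrices (40) and the degree sums (42)-(43) can be sketched directly. The agent indices, promise types, and list layout below are hypothetical:

```python
# agents indexed 0..3; promises listed as (giver i, receiver j, type tau)
promises = [
    (0, 1, 'food'), (1, 0, 'money'),
    (0, 2, 'food'), (2, 0, 'money'),
    (1, 2, 'advice'),
]
N = 4
types = {t for _, _, t in promises}

# typed adjacency matrices Pi^(tau)_ij, eq. (40)
Pi = {t: [[0] * N for _ in range(N)] for t in types}
for i, j, t in promises:
    Pi[t][i][j] = 1

# aggregate adjacency, eq. (42), and node degree k_i, eq. (43)
agg = [[sum(Pi[t][i][j] for t in types) for j in range(N)] for i in range(N)]
k = [sum(row) for row in agg]

assert k == [2, 2, 1, 0]   # agent 3 makes no promises at all

# each typed matrix is far sparser than the full mesh of N*(N-1) links
assert all(sum(map(sum, Pi[t])) < N * (N - 1) for t in types)
```

Even in this toy graph, no single promise type comes close to filling the mesh, which is the sparsity assumed in (41) and (45).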
Provided the total time to keep a promise, for all τ, for each agent, is much less than each time interval of the assessments, we can reduce α(π) to a frequency 'probability'. Another way of saying this is: provided the cost of keeping the promises is less than the budget of each agent.

These estimates are maximal. The size of a functional cluster is not really related to any of these graphs, because there are semantic constraints. Specific functional behaviour, in a single promise type, is a strict limitation, which leads to very sparse subgraphs. To gauge an average measure of the total economic impact of all functional interactions, we have to assume:

• The functions are successful in driving an economy.

• The density of implicit interactions is quite high, else a given output Y will not be represented by an average mesh density.

• There are some long range interactions that make the partially connected graph totally connected on average, even if only at a low level. The survival promises probably fulfill this role.

In reality, a city or community might be partitioned into quite independent regions, leading to a modular reducible form [30]. If one imagines the specific network which delivers output Y, it may be some maximally quadratic polynomial of N_I, related to the structure function of the network, but it may also be significantly less than this. What will tend to lead to percolation of interactions, which bind in a mesh, is the existence of long range, pre-conditional promises, i.e. dependencies. Then the sum will be some polynomial of N_I, such that

    Σ_τ Σ_{i,j=1..N} Π^(Y)_ij ≃ c_1 N + c_2 N² .    (46)

If i, j run over all the individual agents within city limits, then these matrices are sparse and fragmented for each τ. Suppose we assume that there is no dominant infrastructure, only small clusters of voluntary cooperation, as in [27].
Then, the aggregate graph for a city output Y , would need to have sufficientrandom cooperative connectivity to form a process that generates output algorithmically. The density canbe estimated, if we assume that, for promise type τ , an agent has an average valency V , then we wouldneed τ max (cid:88) τ =1 V τ (cid:88) ij α C (cid:16) Π ( τ ) ij (cid:17) (cid:39) N V τ max ≤ N . (47)22oreover, since the diversity of promise types is unlikely to be greater than the population it stems from,one estimates N V ≤ τ max ≤ N , which does not seem realistic. Taking these estimates into account, itseems most likely that a few types of promise relationships dominate the connectivity and value creation,i.e. the basic ‘survival’ (food, water, sanitation, communications) promises, and the specialized outputscontribute relatively little to the total output, except when directly dependent on the survival promises.This is a further suggestion that the attraction to urban life stems from core interconnection infrastructure,rather than from independent diversity. Consider now the role of dependence in keeping promises as a determinant for scaling.
For promises that are requirements for survival, every agent more or less deterministically depends on some source infrastructure agent S, and the number of promises is one to one:

    Number consumed = Σ_{i=1..N_I−1} Π^(τ)_iS ≃ N_I − 1 .    (48)

If the source is distributed over several agents, then this is still true:

    Number consumed = Σ_{s=1..S} Σ_{i=1..N_I−1} Π^(τ)_is ≃ N_I − 1 .    (49)

Thus, the number of promises needed to supply this demand is also proportional to N_I:

    N_infra ≡ N_τ ≃ N_I .    (50)

When agents interact through links, on a small scale, the chemistry of their interactions may be based on simple counting. Normally, in promise theory, we count by agent or by link. However, when the numbers of converging links become so great that counting is impractical, there is no way to liken a process to a simple Poisson arrival queue, and we resort to flow counting based on density arguments. Let's now show that this is equivalent to the volume of the infrastructure V_I in [1].

Consider a number of agents N_infra who provide infrastructure (gas stations, supermarkets, etc.) for a number of clients N_client, which in turn offer a service conditionally, based on the dependence:

    π_infra : A_infra --(+infra, valency V)--> {A_client}    (51)
    {A_client} --(−infra)--> A_infra    (52)
    {A_client} --(+service | infra)--> A_infra .    (53)

Suppose that each infrastructure agent A_infra can promise to service V clients simultaneously; then, using a simple valency argument, we have a detailed balance equation for the interactions at steady state:

    α_+ N_infra V ≥ α_− N_client .    (54)

Thus, for simple counting of distinguishable agents, we may estimate the number of infrastructure agents needed to support a number of clients:

    N_infra ≥ (α_− / α_+ V) × N_client ,    (55)

where the intrinsic factor α_−/α_+ may be interpreted as the affinity for the service, or the reciprocal compressibility.
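A numeric sketch of the sizing bound (55), with illustrative populations, valency, and affinities that are not taken from the text:

```python
import math

def min_infrastructure(n_client, valency, alpha_minus, alpha_plus):
    """Minimum number of infrastructure agents from eq. (55):
    N_infra >= (alpha-/alpha+) * (1/V) * N_client,
    where V is how many clients one infrastructure agent serves at once."""
    return math.ceil((alpha_minus / alpha_plus) * n_client / valency)

# illustrative numbers: 10,000 clients, each infrastructure agent serves
# 50 clients at a time, demand/supply affinity ratio 1.5
assert min_infrastructure(10_000, 50, 1.5, 1.0) == 300

# the requirement scales linearly with the client population
assert min_infrastructure(20_000, 50, 1.5, 1.0) == 2 * min_infrastructure(10_000, 50, 1.5, 1.0)
```

The second assertion is the point made in the text: with simple valency counting there is no economy of scale beyond queue multiplexing, so provisioning grows linearly with the catchment population.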
This scales linearly with the number of clients in the catchment area of the infrastructure. Moreover, there is no way, in this detailed formulation, that we can count otherwise. The only economy of scale in this arrangement is the standard linear multiplexing result for the marshalling of V queues into a single queue with V servers, noted in section 2.10.

However, if we now ask how to count the number of clients that can be fed into a single infrastructure agent, in a spatial volume, with dimensional multiplexing, then the best estimate is to serialize the counting, as before:

    N_client = (V_catchment / N_users)^(1/D) C(D) × N_users ,    (56)

where we imagine a catchment volume V_catchment, containing any number of agents N_users who are interested in the infrastructure service, and we serialize them along a tube of constant cross section C(D) (see figure 8). Although these numbers only apply to a small mesoscopic volume of space, in a homogeneous

Figure 8:
How spacetime involvement compresses serialized agent links into an effective flow of fixed cross section.

city, this will apply to the entire city, so we are justified in taking

    N_use ≃ N_I ,    (57)

which, combined with an equation of state for the volume, reproduces the earlier result

    N_infra ≃ V^(1/D) N^(1 − 1/D) .    (58)

In this argument, it is clear that only the active agents N_I play a role in the counting and process flow; hence this also justifies why we can assume N_I → N in [1].

A key assumption in the scaling argument for city outputs in [1] is Metcalfe's law [14–16], which proposes that the value of a network scales like the square of the total number of nodes, i.e. value is generated by the number of possible links between agents. This has received some empirical support in [17], but has also been criticized in [15, 16]. More importantly, interaction value generation is not the same as output. Agents can also produce wealth without interacting, if they have all the prerequisites (short range interactions). In section 3.7, we'll see that long range interaction forms the basis for one kind of value creation, but not the only one.

Promise theory predicts that links represent value in the following way. Consider the sum of all impartial promise valuations by a third party C. If we assume that all agents assess the value of interactions as strictly positive, then:

    Mean value = Σ_τ Σ_{i,j=1..N_I} v_C(π^τ_ij) ≤ c ⟨α_i α_j⟩ N_I(N_I − 1) ,    (59)

where N_I = max_τ(dim(τ)) (see appendix A about promise valuations). In spite of the quadratic appearance of the result, this is a linear sum, so it acts automatically as a linear averaging measure.
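The quadratic value bound in (59) can be sketched with a toy density factor ρ standing in for the average level of sustained interaction; the constants and populations below are illustrative, not the paper's data:

```python
def network_value(n, rho, c=1.0):
    """Mean value of a promise network: c * rho * N * (N - 1),
    the quadratic form bounded in eq. (59)."""
    return c * rho * n * (n - 1)

# for large N the value is effectively quadratic (Metcalfe's law)
n = 100_000
rho = 1e-4          # sparse but percolating density of kept promises
exact = network_value(n, rho)
metcalfe = rho * n ** 2
assert abs(exact - metcalfe) / metcalfe < 1e-4   # N(N-1) ≈ N² within 1/N

# doubling the population roughly quadruples the value
assert 3.9 < network_value(2 * n, rho) / exact < 4.1
```

Note that ρ multiplies the whole sum linearly: the quadratic growth comes entirely from the number of possible pairings, not from any single agent's output.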
Also, for any given specialization τ, the filling fraction of the promise network is likely low; thus, a key assumption is that, when properly documented, agents' specialized promises in fact depend on many others conditionally, forming a wide-reaching network of progressively weak coupling. Conditional promises propagate the range of value interaction [9, 31]. This is the ecosystem effect.

The weakness of coupling is not a problem, provided the city is reasonably homogeneous in density over the timescales of the ensemble parameters, because of the assumption of overall sparseness. If we define an effective density for the network, which describes some probable average level ρ ∈ [0, 1] of 'intercourse' between agents (any kind of sustained relationship), then it is fair to write the value of a network of promises:

    Mean value = Σ_τ Σ_{i,j=1}^{N_I} v_C(π^τ_ij) = c ρ N_I (N_I − 1),    (60)

provided the total density of promises forms an SCC of order N_I members. This value can be distorted from the quadratic form by significant inhomogeneity. Now, for most cities, N_I ≫ 1 and ρ, α_i ≪ 1, so for strictly positive value interactions:

    v ≃ c ρ N_I².    (61)

This is Metcalfe's law. It depends on the assumption of strictly positive value (i.e. no non-profitable interactions), and sufficient density of promises to involve everyone in the city who belongs to the infrastructure. Why is this plausible, when most specialization leads to modularity? One reason is that modularity is only a separation of scales, not an elimination of dependency: dependencies form an ecosystem. Nearest neighbours might hold the greatest semantic importance to a given function, but this reductionist viewpoint is not independently sustained without the eigenstability of the entire web [30].

We can note briefly why certain system processes (or occupations, in a city) scale differently.
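Before turning to those cases, the quadratic estimate in (60) and (61) is easy to check numerically. The following is a minimal sketch (the names are my own, for illustration only), assuming unit valuation c = 1 and a random promise network of density ρ: the summed value of strictly positive pairwise valuations converges on c ρ N_I (N_I − 1) ≈ c ρ N_I².

```python
import random

def network_value(n_agents, rho, c=1.0, seed=0):
    """Sum strictly positive valuations v_C over a random promise network.

    Each ordered pair (i, j), i != j, holds a promise with probability
    rho; an impartial observer C assesses every such promise at the
    constant value c.
    """
    rng = random.Random(seed)
    total = 0.0
    for i in range(n_agents):
        for j in range(n_agents):
            if i != j and rng.random() < rho:
                total += c
    return total

n, rho = 400, 0.05
v = network_value(n, rho)
metcalfe = rho * n * (n - 1)   # c * rho * N_I (N_I - 1), with c = 1
print(v / metcalfe)            # close to 1 for large N_I
```

For N_I ≫ 1 the ratio tends to 1, recovering the Metcalfe form v ≃ c ρ N_I², even though the sum itself is linear in the number of promises actually made.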
In a specialization society, singular individuals or agencies rarely have all the prerequisites to complete their work without assistance. They need to collaborate and depend on others. Thus other agents take on the role of effective infrastructure for one another. It is the accessibility of this dependency on one another that throttles output, and can modulate scaling behaviour.

(Note that Metcalfe's law does not refer to the cost or value of physical connectivity, which, once again, can be much sparser than a mesh at low utilization. Indeed, that is important, else the net profit approaches zero. Rather, it refers to a correlation of agents' promised activities, linking their behaviours and generating value by interaction.)

The question is whether we should count the maximum output due to interactions, of order N² (as in Metcalfe's law), or consider the output to come directly from the agents, as a fraction of N, as in Amdahl's law. This offers an explanation for the anomalous superlinear exponents in the data for [2]. The superlinear scaling was initially associated with 'innovation' activities (in [32] an explanation based on spanning tree branching processes was postulated, but was not credible). However, promise theory shows how one does not need to invoke a process of innovation to explain the scaling. To drive the long range cohesion of the whole community network, specialists come to depend on specialized services (e.g. patents depend on lawyers). This leads to a number of cases:

1. Interaction scaling: as proposed in [1], for interactive value creation.

       A_Lab −(+patent)→ A_observer    (62)
       A_Lab −(±interact)→ A_services    (63)

   Patent agencies are interacting at arbitrary range with a significant fraction of the total promise graph, as a part of the ecosystem.

       Y ≃ Y_0 (v/V_I) N → N^{1+δ} ≃ N^{7/6} = N^{1.17},    (64)

   with δ = 1/(D(D+1)) for D = 2, H = 1. The amplified value relies on the interplay between long range mixing and short range isolation.

2. Scarce agent scaling: skilled specialist experts' output is proportional to the number of skilled agents, since their queue is sparse, and not filled by a wide volume of demand. However, the same economy of scale applies to their services when they are depended on, as 'infrastructure', by others.

       Y ≃ Y_0 (v/V_I) N → N^{1−δ} ≃ N^{5/6} = N^{0.83}.    (65)

3. Interaction promises with a scarce dependency: such as the case of a service that depends on a source of agents to fulfill a dependency, e.g. patents can only be produced by labs that depend on the outputs of specialized R&D employees and lawyers, working in private relationships, or in secrecy. The expression in (76) assumes a promise configuration like that of the assisted promise law [9], with a main output based on a number of agents that provide input. The dependencies produce raw output, and the 'lab' agency collates and represents the collaborative mixing, e.g.

       A_Lab −(+patent | research, legal)→ A_observer    (66)
       A_Lab −(±interact)→ A_services    (67)
       A_Lab −(−research)→ A_staff    (68)
       A_Lab −(−legal)→ A_lawyer    (69)
       A_staff −(+research)→ A_Lab    (70)
       A_lawyer −(+legal)→ A_Lab    (71)

   More generically, with two stages in the process of promise keeping, each experiencing scaling (see figure 9),
Figure 9: A two-stage (long range) dependency has two economies of scale, when fed by a spacetime workflow. The probability of promises kept is multiplicative, like the logical 'AND' of the promises.

    S −(+Y | d)→ R    (72)
    R −(−Y)→ S    (73)
    S −(−d)→ D    (74)
    D −(+d)→ S    (75)

the total process picks up two 'economies of scale': the delivery of Y conditionally AND the delivery of the conditional dependence.

    Y ≃ Y_0 (v/V_I)(v/V_depend) N → N^{1+2δ} ≃ N^{4/3} = N^{1.33},    (76)

where D = 2 is used for the numerical values. These values accord better with the cited data in [2], and tie in with the story about queueing.

What characterizes this interaction is the high level of specialization required to fulfill the dependencies. If the network is sparse, this is more difficult than if it is dense and diverse. This is the specialization gamble. With specialization comes individual efficiency, but also a risk of instability by disconnection from key dependencies [33]. The dependencies are a throttle on the process, because their absence could stop it altogether. Hence, we are justified in using the product 'AND' for combining the probability values in (76). This is not a

Figure 10:
Structural recursion in an ecosystem is not like a branching process of containers (a); rather, the agents overlap with other regions of the same network to access their virtual functions. Thus their outputs are not concealed as interior substructure, but exposed as part of the flat internetwork (b). The result is that a second order recursion picks up a second economy of scale, in turn increasing the superlinearity of the derived output.

hierarchical system interaction, because the services are not necessarily hidden from the long range dynamics by internal components of the superagent 'lab' (see figure 10 (b)). But in organizational theory, one normally assumes that all organizations are hierarchically organized (see figure 10 (a)).

4.
Recursive promise dependency: let's consider what happens when the ecosystem network is based on a hierarchy of interaction ranges, i.e. promises are made recursively in fully protected shells. The agency produces a service using the full community infrastructure, but also has some specialist dependency contained entirely within (see figure 10 (a)).

    {A_company + A_specialist} −(+solution)→ A_client    (77)

would be viewed as a recursive operation on the infrastructure, and the economies of scale would apply to both of the times the (different) infrastructures were used.

    Y_patent = g N_I (v / V_patent)    (78)
    V_patent = g_R&D (V_R&D / N_R&D)^{1/D} N_R&D    (79)
    V_R&D = (V / N_I)^{1/D}    (80)

Substituting for V_R&D / N_R&D from the last expression into the former,

    V_patent = g_R&D (V / N_I)^{1/D²} N_R&D^{1−1/D}.    (81)

Inserting this into the output expression,

    Y_patent ≃ N^{1+δ²} ≃ N^{1.03}.    (82)

This value is almost linear, which is what we might expect of a self-contained specialization, since the outside world would not be able to tell the difference between a single agent and a single superagent.

The value is also smaller than case (1) above, not larger, so short range hierarchical scaling cannot explain the anomalously large exponents measured in cities. On the other hand, there are some smaller exponents in this range. It is interesting to examine these measures from the perspective of the promises represented, and their range in the embedding space.

There is a simple prediction here: long range interaction via dependency seems to increase output superlinearity, by compounding economies of scale, i.e. dependency brings strong long range coupling and activates a larger amount of the N² mesh. A similar effect can be obtained artificially in [1], by slightly increasing the Hausdorff dimension of the infrastructure H > 1. This would correspond to a more pervasive generic infrastructure network, which is an opaque explanation at best.
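It is worth tabulating the exponents of the cases above in one place. A small sketch, assuming the mean-field form δ = H/(D(D+1)) with D = 2 and H = 1, as in Bettencourt's model (the function and case names are mine, for illustration):

```python
from fractions import Fraction

def delta(d, h=1):
    """Mean-field scaling increment: delta = H / (D(D+1))."""
    return Fraction(h, d * (d + 1))

D = 2
cases = {
    "interaction scaling":       1 + delta(D),      # case 1: N^(7/6) ~ N^1.17
    "scarce agent scaling":      1 - delta(D),      # case 2: N^(5/6) ~ N^0.83
    "scarce dependency (AND)":   1 + 2 * delta(D),  # case 3: N^(4/3) ~ N^1.33
    "recursive (hierarchical)":  1 + delta(D) ** 2, # case 4: almost linear
}
for name, beta in cases.items():
    print(f"{name}: N^({beta}) = N^{float(beta):.2f}")
```

The ordering makes the argument visible at a glance: exposed long range dependency compounds economies of scale (case 3, the largest exponent), while a fully contained hierarchical dependency (case 4) is indistinguishable from a single agent, and so is almost linear.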
It is unclear how such an increase in H could be justified. Short range dependency is basically invisible at larger scales. This observation might help to explain superlinear scaling seen in technological contexts too, through coordination [20], but we have to be careful not to mix together effects that come about due to higher dimensionality with other mechanisms for increasing the utilization of dependent resources.

The ability of agents to discover and match with other agents that make complementary promises is the basis of functional scaling, and of the semantics of cooperation and innovation. It depends on either physical or virtual mobility of the agents. Kinetically, agents may follow a random walk, as in ballistic discovery. A second possibility is that cheap intermediaries perform the discovery of specialized roles. We can say that two agents are either

• Physically close.
• Virtually close.

Promise theory also suggests that they may be close in two ways: dynamically close or semantically close (such as when related meanings are similar). The former depends on the length scales of the system (e.g. city) and its structure. The latter can be assumed approximately independent of these scales, because the carriers are very light, cheap, or fast (or, in the case of semantic distance, purely cognitive). If the cost of discovery can be neglected, the cost equation is different: collaboration can be cheaper, and the value of being in close proximity for a particular specialization is reduced.

Directories, maps, and indices [7] are the keys for agents to virtualize the discovery of dependencies, and locate one another without physical search in spacetime. Telephone directories map names to coordinate addresses. Yellow pages map specializations to coordinates. Similar specializations are grouped. Shopping malls and industrial estates act as physical directories, where clients can expect to find services in a small volume. Directories may be discovered themselves, or formed by voluntary registration. The value of new bindings overcomes the tendency for similar specializations to repel one another: similar agents may be attracted implicitly (covalently) by their intermediate attraction to clients. Apart from predicting the importance of directories for smart cities and organizations, this also predicts that the availability of directory information could affect the productivity of a city as a function of size. This effect might not be clearly visible in the ensemble data, since we would need to compare cities at the same value of N.

The scaling estimates of the city are based on infrastructure where physical motion of the population carries the cost of traversing some fraction of the length of the city. We can repeat the output calculation neglecting this cost, as is the case for services that do not require physical transport.

• Physical interaction (transport/mobility): people move around using transport infrastructure to experience their environment. There is a promise for people to observe their surroundings, for something related to subject τ, and this promise is kept fractionally α_τ ∈ [0, 1] during their walk. Let the linear range of the agent A_i be some dimensionless fraction per unit time r T_explore of the size of the city V^{1/D}, where r is the speed in units of city size. If the density of impulses per unit length of city region I is assumed constant relative to the transport rate (because this is the basis of commerce, i.e. what the city is trying to optimize for people's finite time), then the number of impulses I_τ of type τ, experienced on such a walk, may be written:

      I_τ ∝ r_i T_explore V^{1/D} I α_τ,    (83)

  where α_τ is the probability that the person or agent will be receptive to impulses in its environment that are relevant to promises of type τ. Although there is room for inhomogeneous variations in the city regions, in the transport rate r, and in the density of offerings I, this will not change the average scaling argument much, as long as N is large.
I make the assumption here that the density of experiences I is constant, even though the density of people is related to the city size. This is because the size of a city is constrained by time rather than by distance (and we are suppressing explicit time).

(Footnotes: Electrons play this role in molecular chemistry, as telecommunications do in the human realm. A dependency does not just have to be discovered, but also maintained in a persistent relationship, which accumulates cost over time. If the person's path is detailed, one could include the Hausdorff dimension of the path and use v_i^{H/D} as the range, as Bettencourt suggests; I'll ignore this for now, as humans do not tend to move in fractal paths, as his data suggest.)

The cost of physically fishing for ideas thus takes the form

    C ≃ c_Y N_I V^{1/D},    (84)

in agreement with the work model of [1]. This applies for physical city interactions, and leads to the same output scaling expression as in [1]:

    Y ≃ N^{1+δ} N_I^{−δ}    (85)
      ≃ N_I (N / N_I)^{1+δ} ≃ N_I (N / N_I)^{1.17},    (86)

with δ = H/(D(D+1)).

• Virtual interaction (teleport/messaging): people are immobile and send messages to one another, watch entertainment, browse, read, talk, etc. These activities occupy an increasing amount of people's time, not least because they can easily be interleaved with work time. The rate is no longer related to the size of the city, nor is there any obvious boundary to what can be discovered online (since the range of the Internet is even more diverse than a city). In this case, the impulses are more likely to be related to the availability of the fountain itself (e.g. 'bandwidth' B) multiplied by the time spent:

      I_τ ∝ B T_explore I α_τ.    (87)

  Discovery of information is the main issue. Before search engines, there were only directories such as white pages (by person) and yellow pages (by promise type). However, delivery of what is discovered might still involve spatial constraints: e.g. locating a new car online does not allow it to be teleported to the buyer's location. However, 3D printing technology might change this, for a class of problems, soon.

  Here it is not the locations that matter, but the rate at which impulses are absorbed. Once again, this is constant. When friends, books, or movies are communicating ideas to us, this happens at a rate that depends only on how quickly we can get hold of a stream. How users discover locations online, or by telephone, is a separate question. Directories [7], advertisements, and chance all play a role here. The cost of fishing for ideas is thus now independent of the city size. For a community of multiple super-agents, the analogous expression is:

      C ≃ c N_I B T_explore.    (88)

  If we imagine a community with no other infrastructure except its telecommunications network, and substitute (88) into the detailed balance equation:

      g_Y (v_Y / V) N_I ≥ c N_I B T_explore.    (89)

  Following through the calculation for the yield estimate identically, we find the scaling is no longer superlinear (D = 2, H = 1):

      Y ≃ N^{H/D} N_I^{1−H/D} ≃ N_I (N / N_I)^{1/2}.    (90)

(Footnote: Because telecommunications networks are global, it does not make sense to relate their cost to the size of the city (though this depends on exactly how we model the costs), so the cost depends more on its usage than on its extent. We simply assume that it exists and has sufficient capacity for the N_I connected residents.)

For a workplace using both channels,

    I^work_τ = N_D (c_phys I^phys_τ + c_virt I^virt_τ),    (91)

and for the entire city of N_W workplaces:

    I_city = Σ_τ I^work_τ ≃ N_W I^work.    (92)

The physical channel agrees with Bettencourt's expression for mixing volume in (4), where N_I = N_D N_W, and the virtual channel accords with mobile ad hoc networks. It would be interesting to examine communities physically remote from services to see how these predictions match reality.
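The difference between the physical cost (84) and the virtual cost (88) is just whether a V^{1/D} factor survives. A minimal sketch (constants and names illustrative, not from the text) makes the contrast explicit:

```python
def physical_cost(n_i, v, d=2, c=1.0):
    """Discovery cost with physical transport: C ~ c N_I V^(1/D)  (cf. eq. 84)."""
    return c * n_i * v ** (1.0 / d)

def virtual_cost(n_i, bandwidth, t_explore, c=1.0):
    """Discovery cost over a messaging channel: C ~ c N_I B T  (cf. eq. 88)."""
    return c * n_i * bandwidth * t_explore

# Doubling the city volume raises the physical cost by 2^(1/2) for D = 2,
# but leaves the virtual cost unchanged:
p1, p2 = physical_cost(100, 1.0), physical_cost(100, 2.0)
v1, v2 = virtual_cost(100, 1.0, 1.0), virtual_cost(100, 1.0, 1.0)
print(p2 / p1, v2 / v1)
```

Because the virtual channel carries no imprint of V, the yield calculation that follows from it loses the superlinear exponent, as in (90).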
Cities are just one form of smart adaptive space, for which we now have new and fascinating insight in the form of statistical scaling data. Remarkably little data are available for computer installations, software development, or smart warehouses, since these reside in the private sector; nonetheless, there may well be insights to gain from studying the relationship between theory and practice, where we can. Some results may have sufficient dynamical similarity to other cases to infer valuable lessons. Although there are qualitative differences between biological organisms and cities, the main features that make computer systems different from cities are the sizes and timescales involved. Datacentres are still tiny compared to cities (in terms of active agents N). Structural changes take place on the order of seconds rather than weeks, and the rates increase as technology advances (see figure 11). The chief lesson we can derive from cities, which might be applied to other smart infrastructure, is the involvement of spacetime relationships in counting, at large N.

In [6, 7], a generalized abstraction of functional spaces was developed, starting from 'atomic' irreducible considerations. By developing the city scaling theory in this framework, one could hope to bridge the gap between disciplines, and promote future analysis of the effect of changing costs and technology. In information technology, dynamical infrastructure, known as cloud computing, offering co-located shared compute, storage, and caching, as well as providing facilities for community software development, shared repository models, and finally the pervasive Internet of Things, represents both present and upcoming challenges for infrastructure modelling. We must be clear about the difference between online communities and physical networks.

It has not escaped the notice of [3] that the universality of cities implies network scaling in technology-assisted communities, and some data concerning online communities have been examined, showing interesting and large superlinear behaviour:
Figure 11: Comparing cities with IT infrastructure.

    IT INFRASTRUCTURE                       (MEASURE)                CITY
    ------------------------------------------------------------------------------
    Seconds                                 TRANSPORT TIME           Hours
    software                                INNOVATION               hardware
    code, services                          PROMISES                 goods, services
    code, memes, habits                     EPIDEMIC TRANSMISSION    replicants, memes, habits
    Membership                              SEMANTIC EDGE            Membership
    Latency threshold                       DYNAMIC EDGE             Density threshold
    Servers, storage                        TENANCY UNIT             Homes, offices, storage
    Containers, hosts, private nets         PARTITIONING             cubicles, rooms, buildings
    Process groups, clusters, datacentres   SUPER-AGENCIES           cubicles, rooms, buildings
    H = 1-N                                 TRAWL PATH DIMENSION H   H = 1-2

    Common to both: functional modules; long and short range interactions;
    embedding dimension D = 2. The scales N, ∆t_infra, and ∆t_interaction differ
    by orders of magnitude (structural change in seconds for IT, weeks for cities).

    MEASURE             AVERAGE β    SOURCE
    ---------------------------------------
    DNS hosts           · ± ·        Internet
    Total web pages     · ± ·        Internet
    Active web pages    · ± ·        Internet
    Contributors        · ± ·        Wikipedia
    External links      · ± ·        Wikipedia
    Internal links      · ± ·        Wikipedia

Insufficient details are provided to suggest an explanation for the numbers; however, a promise theory approach like the one begun here may easily play a role in understanding the structural dependences.

Online communities are sociologically interesting, but more practical, from an engineering viewpoint, is to understand how scaling behaviour modulates productivity in smart infrastructure that mixes physical and virtual mobility. Such 'smart spaces' enact a kind of computation to optimize their conditions, and even exhibit some algorithmic qualities. Users interact with hosted services, which are themselves a community of (software) agents, acting as proxies for human intentions. This client interaction behaves like a long-range weak coupling: an autonomous 'gas' of visitors, and malls and directories act as catalysts for value generating interactions. Applications and companies, on the other hand, consume unique and shared resources as residents of the infrastructure. They have superagent boundaries around company and functional concerns, which limit internal processes to short-range (and typically strong coupling) interactions, in the manner of a tenancy [7]. Some brief remarks below concern what one might expect to study and find in software agents that are proxies for human intent, and are enhanced with significant automation.

(Footnotes: [35] has made some comments on the distribution of community sizes for physical communities. In [36, 37], I showed how statistical behaviours in computers behave like a gas at equilibrium, with periodic boundary conditions.)

The productivity of agents in human-machine systems

Productive output, in agent communities, is driven either by output from agents working independently, in parallel, or from the mesh of interactions between them.
It is throttled by the contention for shared resources (serialization or queueing) and the high cost of long range coherence (i.e. equilibration of state, or mutual calibration). These are the main features captured in Gunther's Universal Scalability Law [22].

Automation allows individual agents to generate much greater outputs, without the cost of cooperation, so unequal automation might skew the measured performance outputs in empirical studies, making it difficult to compare cities and other systems. When forming a statistical ensemble of systems (cities, organisms, or cloud infrastructure), we have to be sure to compare similar systems. A fundamentally different technology base would undermine the universality of data for individual data points, and could influence the interpretation of the scaling.

As pointed out in section 3.8, the impact of using messenger channels, like telecommunications rather than physical mixing, is that the process of information and service 'discovery' is fundamentally different, altering the costs by removing the imprint of physical scales from the output scaling. Caching becomes a crucial enabler here, because it converts quadratic N² costs into linear N outputs from single agents (thanks to smart behaviour).

For technology infrastructure, the analogue of discovery by wandering around a town browsing store fronts, in the high street or shopping mall, is browsing a database or directory (e.g. yellow pages, or an online shopping catalogue). Information spread by rumour and reputation, in pre-telecommunications communities, is supplemented by targeted recommendations (e.g. Amazon, Google, and Facebook ads and search).

Productivity is not only increased by automation. The cost of transporting and equilibrating goods and services remains. In information technology, larger and larger amounts of data are now stored in physical and data warehouses.
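The two throttles named above, contention and coherence, are exactly the parameters of Gunther's Universal Scalability Law. A sketch (the parameter values are illustrative, not fitted to any data): relative capacity is C(N) = N / (1 + σ(N−1) + κN(N−1)), where σ models contention for shared resources and κ the cost of pairwise equilibration.

```python
def usl_capacity(n, sigma, kappa):
    """Gunther's USL: relative capacity of n cooperating agents.

    sigma: contention (serialization/queueing on shared resources)
    kappa: coherence (pairwise equilibration, i.e. long range calibration)
    """
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

# With a nonzero coherence cost, capacity peaks and then falls;
# the peak lies near n* = sqrt((1 - sigma) / kappa), here ~31.
caps = [usl_capacity(n, sigma=0.05, kappa=0.001) for n in (1, 10, 50, 200)]
print(caps)
```

The quadratic κN(N−1) term is the same long range equilibration cost that appears throughout this section: it is the reason pure scale-out eventually yields negative returns.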
Distribution channels for products and services are mirrored by technologies for data replication, such as Paxos [38] and Raft [39]. These are now being adopted widely, based on the sense that data equilibrium is important to avoid the inconsistency of 'many worlds' viewpoints when interacting with distributed systems. However, the scaling of equilibration software is necessarily poor, since the cost of maintaining consistent states scales superlinearly (typically like N²) [20]. Resource 'latency', or response time, is a key issue when dependencies are not local. This too can be improved by caching. The cost of transporting data and materials from remote suppliers to a city is like the cost of reaching across the planet to rent a virtual machine platform. Ultimately, no community wants to rely on a fragile remote resource. A cheap non-scalable solution is better than an expensive long term one. This has driven the centralization of large cloud installations, where economies of scale can be argued locally, as transport costs and latencies are the responsibility of the clients.

Caching and replication are strategies that decouple long-range dependences and restore agent autonomy. It is cheap to cache and replicate many technology functions today, offering a kind of 'smart behaviour', like a brain, to keep interactions local. The mobility of small data is high, as transferring small data is cheap. The mobility of computation has traditionally been low, as computers were large and cumbersome machinery. But that is reversing, thanks to encapsulation methods (superagent technologies), like machine virtualization and so-called 'containerization' of software. Moreover, there is computational processing capacity everywhere today, including on the smartphones in our pockets, whereas the sheer size of data being collected is growing to impractical levels for transportation. Computation is progressing from being a shared resource to a ubiquitous agent capability.
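The claim that caching decouples long range dependence can be made concrete with a toy directory cache (everything here is my own illustration, not a real service): N local requests generate only one long range fetch per distinct name, converting a per-request remote cost into a per-name one.

```python
class CachedDirectory:
    """Minimal sketch: a local cache in front of a remote directory.

    Repeated lookups by local agents hit the cache, so the remote
    (long range) channel is used once per distinct name instead of
    once per request.
    """
    def __init__(self, remote):
        self._remote = remote        # stands in for a distant service
        self._cache = {}
        self.remote_calls = 0

    def lookup(self, name):
        if name not in self._cache:
            self.remote_calls += 1   # the costly long range fetch
            self._cache[name] = self._remote[name]
        return self._cache[name]

remote = {"svc-a": "10.0.0.1", "svc-b": "10.0.0.2"}   # hypothetical names
d = CachedDirectory(remote)
for _ in range(1000):                # 2000 client requests in total
    d.lookup("svc-a")
    d.lookup("svc-b")
print(d.remote_calls)                # long range channel used only twice
```

This is the sense in which caching restores agent autonomy: the local community keeps its own copy of what it needs, and the fragile remote dependency is touched only to fill or refresh the cache.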
(Some notes to this effect were remarked in [10].)

Separation of long and short range interactions (modules)

Limiting unnecessary mixing due to long-range or strong-coupling interactions is essential for establishing modular functionality in networks. Modularity promotes specialization, and the quiet isolation needed for the gestation of lengthy processes like learning and innovation. It makes economic sense, provided the modules do not themselves become bottlenecks. Where humans are coupled strongly, the Dunbar limits for human valency [40] intrude on scalable design, even with the help of automation.

What does this tell us about cloud computing, microservice communities, and the Internet of Things? Suppose the economy of scale were only of the order of 85%, as in cities; this would not be a huge incentive to centralize, if local resources were available too. The economy would have to be offset against effects like latency, long range dependence, equilibration, etc. As our interest in data from embedded sources increases, and we equip environments with smart ecosystems [26], the idea that cloud computing will remain localized in large datacentres seems unrealistic. Latency costs suggest a greater delocalization of the workforce, to eliminate long range interaction.

Managing specialization, or separation of concerns, is a human-technology issue that we are only just starting to grapple with at scale [41]. Modularity has long been a part of system doctrine, but the evolution of so-called 'microservices', or small, specialized software services, is now being motivated empirically by the limitation of Dunbar valency for human agents [40, 42]. Breaking a system into many small specialist parts, each associated with a different human owner, incurs a new cost of service interaction, but this cost may be paid cheaply with automation, to alleviate a more expensive human burden: the hard limit of what a human brain can cope with.
The density of information services in modern society is now huge, and the complexity of interactions is significantly more than humans can manage unassisted. Technological agents can handle the greater numbers of interactions, but only if they are sufficiently simple that a human collaboration can understand how to design the promises they should keep. Coordination services for replicating siloed resource clusters are being developed, based on the experiences of large industrial providers like Google [43]. These currently rely on long-range data equilibration methods, which are a serious throttle on their performance, and must become worse when spacetime dimension plays a role.

Scaling issues pose the question: at what point should systems (cities, datacentres, communities) break up into decentralized regions? Cloud datacentres grew up from the economies of scale that can be achieved through specialized expertise in infrastructure management. We have already witnessed cloud datacentres multiply like power stations, placed at strategic geographic locations to cover distances. The next logical step is ubiquitous infrastructure. At that point, the economic advantages of physical clustering (indeed of cities themselves) may disappear altogether, leaving only the value of centralized meeting places, clubs, conferences, etc., for human contact.
Silos that encapsulate specializations are isolated semantic functions. The appearance of silos in human organizations is often thought of as a negative phenomenon, because it represents an exclusion of outside interests. However, it also has a necessary and positive effect, implying a separation of short range interactions. This is not the same as a separation of dynamical scales. Voluntary partitioning to mitigate contention has clear advantages, if resources permit. Conversely, confusing semantic separation with physical separation may have dynamical consequences. An excellent example of this may be seen from town planning: for a while, 'garden cities' were a trend, designed with the aim of tidy separation of functions, separated by green spaces. The principle backfired, however, e.g. in Brasilia [44], where different city functions were so widely distributed that residents had to travel considerable distances to reach town facilities. This led to severe traffic congestion.

When the cost of collaboration between partitioned agencies grows [33], business networks can degenerate and become non-viable, as a larger part of the infrastructure becomes non-contributing. This is a lesson for microservices. IT architects are starting to realize this now, and are opting for ubiquitous 'hyperconverged' architectures, i.e. a return to total package servers, to reduce latency and congestion. Sometimes regions form by themselves, from dynamical and semantic principles, with partitions based on language, business, geographic centrality, eigenvector centrality [45], etc.
In the past, sparse machine resources were immobile, and inputs were brought to the machine for processing: it was cheaper to move inputs to machinery. Today, mobile devices with significant processing capability are commonplace, and encompass computers and manufacturing (aka 3D printing). Particularly in information infrastructure, the ubiquity of stationary and mobile processing capacity reduces the need for data to be sent over large distances. Conversely, the amount of data expected to come from domestic and industrial sources will grow massively, decreasing the relative mobility of data. Mobility of processing resources is now a crucial issue, and is enabled by containment wrapper technologies that form intentional superagent modules around specific functional roles.

The ubiquity and miniaturization of information technology suggest that the role of spatial dimension may well be a temporary phenomenon, at least for local interactions. Even with ubiquitous information infrastructure, there will still be some role for large datacentres, particularly in the realm of storage archives, since disaster redundancy is one of the critical issues. Similarly, 'data gravity' proposes moving the smaller resource to the larger one, e.g. moving computational power to large data, instead of assuming that computational power has to be at a fixed location and transporting data to it.
Garbage collection, sewage, and drainage services are amongst the most important dependencies for a system. They are sinks for pushed output: scaling results for these services would be interesting indeed. One would expect them to be more susceptible to flash floods and long tail behaviours than the corresponding supply networks, as they fail only catastrophically, when a threshold is reached [31]. It would be very interesting to know how the converse of 'push' methods, or imposition services, scale with city size: pull-on-demand and voluntary cooperation are appearing in society to replace mass broadcasting, e.g. video on demand replacing television broadcasts.
The perception of time has an interesting relationship with long range order. Time is a local concept, as both physicists and distributed systems specialists know only too well. Without long range data synchronization, clocks may not tell the same time; not just human clocks, but the pulses that drive and clock processes. For example, imagine an orchestra of musicians. The orchestra can play without a conductor if everyone has enough to do all the time, but when every instrument plays a sparse role in the whole, coordination is difficult, as the agents are mostly dormant. Multiplexing allows better utilization of a network. When all agents are busy, i.e. utilization is high, process time passes at a faster rate, relative to its surroundings, and coordination becomes easier. Busy agents are ready to receive new inputs and respond with services, suggesting that a highly utilized system could be more efficient, provided that it lies below the queueing instability. Each change in an agent is a tick of the network clock, and each partitioned network has its own sense of time: time only moves forward according to that clock when something happens (this is essentially Einsteinian relativity [6]).
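The idea that time only moves forward when something happens is exactly how logical clocks work in distributed systems. The sketch below (a minimal Lamport-style clock; the agent names and event sequence are my own illustration) shows two agents whose clocks advance only on local events and message exchanges, so an idle agent literally experiences less time until an interaction drags it forward.

```python
# Minimal Lamport-style logical clock: each event is a 'tick',
# and clocks only synchronize when agents interact.
# Agent names and the event sequence are illustrative.

class Agent:
    def __init__(self, name):
        self.name = name
        self.clock = 0  # local time: number of ticks witnessed

    def local_event(self):
        self.clock += 1  # something happened: time moves forward

    def send(self):
        self.clock += 1
        return self.clock  # the timestamp travels with the message

    def receive(self, timestamp):
        # Merge the sender's notion of time with our own.
        self.clock = max(self.clock, timestamp) + 1

busy, idle = Agent("busy"), Agent("idle")
for _ in range(5):
    busy.local_event()     # a busy agent's clock races ahead
idle.receive(busy.send())  # an interaction drags the idle clock forward

print(busy.clock, idle.clock)  # 6 7
```

Until the message arrives, the idle agent's clock stands still: partitioned agents each keep their own time, as the text argues.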
At high speed, the world is dominated by time, and is essentially one dimensional. In a world dominated by space there is multi-dimensionality, and volumes can quickly become relevant in a continuum limit, while time is suppressed by equilibration. The higher dimensionality promotes an increased average connectivity for the promise graph, k(N) ≃ N^δ(D), that depends on the dimension D of the embedding space. In the complex interaction between space and time, all possibilities are on the table.
Universality and scaling are powerful notions in science. Having data about the scaling of functional processes, at large and small N, offers an invaluable insight into what we can expect of technological systems at scale, and their increasing intrusion into human society. Understanding social sciences in terms of laws, analogous to physical law, is an area where progress has been made over the past century. Understanding such patterns in 'smart' admixtures of humans and technology seems an even more relevant challenge today [4]. An obvious question becomes: is there something that can be transferred from the study of cities to other network and community systems, e.g.

• Online user communities: human communities in a virtual space.
• Human-software communities: interacting processes in a virtual space.
• Microservice communities: collaborating agents in a virtual space, interacting with humans.
• Cloud computing: a shared infrastructure community of contending processes.

Datacentres and software systems share a lot of similarities with cities, but there are differences too. Information infrastructure is largely one dimensional, in most locations; in datacentres, infrastructure is only just in the process of becoming two and even three dimensional. It is also quite inhomogeneous, and one needs to know when and where dynamical similarity can legitimately be argued [26].
As N becomes large, issues of space and time become much more entwined with the more common one-dimensional algorithmic behaviours generally studied in computer science. Universality reveals emergent laws, on broad scales. However, a fuller understanding of systems, whether human cities, smart cities, computers, or any other human structure, is only achieved by describing both dynamics and semantics at micro- and mesoscopic scales. Just as we cannot understand medicine without understanding the functional roles of structures inside organisms, so the functional organs in a city are key to what it does. The universal scaling arguments for urban areas, in [1, 2], are exciting discoveries, as they point to the involvement of spacetime in functional systems, which is still poorly understood in information and computer science.
Datacentres are still tiny compared to cities, but the two are also growing together thanks to the pervasive spread of the Internet. By applying promise theory to expose some short range interactions, we find more possible exponents that match quite well with the anomalous results in [1] (see section 3.7). The immediate applications of this approach lead to some basic observations:

• Value creation in communities comes from a mesh of promises whose outcomes are funnelled, filtered, and powered by serialized processes for supply and harvesting.
• The sparse probabilistic utilization of shared infrastructure allows agents to interleave their efforts and achieve economies of scale. This keeps a fluctuating city in an approximately stable, steady state over short timescales, and admits limited long term growth.
• Specialization of tasks into modular services allows systems to focus their time and capabilities without the cost of context switching. The strategy of specialization also brings fragility: if the cost of reconnecting the specialists grows too large, the community can fail to keep promises for essential functions [33].
• Increasing autonomy of a system population, due to the availability of personal assistants and localized capabilities (smartphones, 3d printers, etc.), will undermine the need for transport, and the involvement of city scales in many processes.
• Superlinear scaling results from a dependence on exterior specialist agencies. When remote dependencies are involved, staged economies of scale can accumulate, bringing superlinear effects for each remote dependence of the ensemble. If these external dependencies could be redistributed to a purely autonomous attribute of each agent (e.g. when phone boxes are replaced by mobile phones), then scaling would at best be linear, and the artefact of superlinearity would disappear.
• The ability to discover promised information, to mix it and select new combinations, is the basis of innovation and collaboration. A highly discoverable ecosystem is 'smart' in the sense that it can adapt and invent. Any space can, in principle, be made smart in this way, if its agencies make the necessary infrastructure promises.

There are two main lessons to take from these notes, which may be surprising to technologists: (i) the behaviour of a system can involve more than one spatial dimension, and (ii) we can measure and describe just how smart and productive functional spaces actually are, across a range of scales from local to global.
Acknowledgments
Many thanks to Luis Bettencourt for patiently reading and refining my understanding of his model. I'm grateful to Noah Brier for an invitation to Transition in late September 2015, where I met Geoffrey West and learned about scaling in cities. This work could not have happened but for that chance meeting in a remote city. Also thanks to John D. Cook for carefully reading a draft.
References

[1] L.M.A. Bettencourt. The origins of scaling in cities (with supplements). Science, 340:1438–1441, 2013.
[2] L.M.A. Bettencourt, J. Lobo, D. Helbing, C. Kühnert, and G.B. West. Growth, innovation, scaling and the pace of life in cities. Proceedings of the National Academy of Sciences, 104(17):7301–7306, 2007.
[3] L.M.A. Bettencourt. Impact of changing technology on the evolution of complex informational networks. Proc. IEEE, 102(12):1878–1891, 2014.
[4] M. Burgess. Analytical Network and System Administration — Managing Human-Computer Systems. J. Wiley & Sons, Chichester, 2004.
[5] M. Burgess. On the theory of system administration. Science of Computer Programming, 49:1, 2003.
[6] M. Burgess. Spacetimes with semantics (i). http://arxiv.org/abs/1411.5563, 2014.
[7] M. Burgess. Spacetimes with semantics (ii). http://arxiv.org/abs/1505.01716, 2015.
[8] G.B. West. The origin of universal scaling laws in biology. Physica A, 263:104–113, 1999.
[9] J.A. Bergstra and M. Burgess. Promise Theory: Principles and Applications. χtAxis Press, 2014.
[10] L.M.A. Bettencourt. Urban scaling in Europe. arXiv:1510.00902 [physics.soc-ph], 2015.
[11] A.L. Barabási. Linked. Perseus, Cambridge, Massachusetts, 2002.
[12] M. Burgess, H. Haugerud, T. Reitan, and S. Straumsnes. Measuring host normality. ACM Transactions on Computing Systems, 20:125–160, 2001.
[13] E. Arcaute, E. Hatna, P. Ferguson, H. Youn, A. Johansson, and M. Batty. Constructing cities, deconstructing scaling laws. Journal of The Royal Society Interface, 12(102), 2014.
[14] B. Metcalfe. Metcalfe's law: A network becomes more valuable as it reaches more users. Infoworld, October 1995.
[15] A. Odlyzko and B. Tilly. A refutation of Metcalfe's law and a better estimate for the value of networks and network interconnections. IEEE Spectrum, 2006.
[16] B. Briscoe. Metcalfe's law is wrong. IEEE Spectrum, 2006. http://spectrum.ieee.org/computing/networks/metcalfes-law-is-wrong (retrieved 1.1.2016).
[17] X.Z. Zhang, J.J. Liu, and Z.W. Xu. Tencent and Facebook data validate Metcalfe's law. Journal of Computer Science and Technology, 30(2):246–251, 2015.
[18] R. Albert and A. Barabási. Statistical mechanics of complex networks. Reviews of Modern Physics, 74:47, 2002.
[19] G. Canright and K. Engø-Monsen. Handbook of Network and System Administration, chapter Some Aspects of Network Analysis and Graph Theory. Elsevier, 2007.
[20] N.J. Gunther, P. Puglia, and K. Tomasette. Hadoop superlinear scalability: The perpetual motion of parallel performance. ACM Queue, 13(5), 2015.
[21] M. Burgess and G. Canright. Scaling behaviour of peer configuration in logically ad hoc networks. IEEE eTransactions on Network and Service Management, 1:1, 2004.
[22] N.J. Gunther. A simple capacity model of massively parallel transaction systems. In CMG National Conference, 1993.
[23] N.J. Gunther. A general theory of computational scalability based on rational functions. Technical report, arXiv:0808.1431, 2008.
[24] L. Kleinrock. Queueing Systems: Computer Applications, volume 2. John Wiley & Sons, Inc., 1976.
[25] M. Burgess and S. Fagernes. Pervasive computing management: A model of network policy with local autonomy. IEEE Transactions on Software Engineering, (submitted).
[26] M. Burgess. In Search of Certainty: the science of our information infrastructure. Xtaxis Press, 2013.
[27] M. Burgess and S. Fagernes. Autonomic pervasive computing: A smart mall scenario using promise theory. In Proceedings of the 1st IEEE International Workshop on Modelling Autonomic Communications Environments (MACE), Multicon Verlag, ISBN 3-930736-05-5, pages 133–160, 2006.
[28] M. Burgess. Classical Covariant Fields. Cambridge University Press, Cambridge, 2002.
[29] M. Burgess and S. Fagernes. Laws of systemic organization and collective behaviour in ensembles. In Proceedings of MACE 2007, volume 6 of Multicon Lecture Notes. Multicon Verlag, 2007.
[30] J. Bjelland, M. Burgess, G. Canright, and K. Engø-Monsen. Eigenvectors of directed graphs and importance scores: dominance, T-rank, and sink remedies. Data Mining and Knowledge Discovery, 20(1):98–151, 2010.
[31] M. Burgess. A promise theory approach to understanding resilience: faults, errors, and tolerance within systems. Technical report, available at markburgess.org, 2015–2016.
[32] S. Arbesman, J.M. Kleinberg, and S.H. Strogatz. Superlinear scaling for innovation in cities. Technical report, arXiv:0809.4994 [physics.soc-ph], 2008.
[33] J. Tainter. The Collapse of Complex Societies. Cambridge, 1988.
[34] M. Burgess and G. Canright. Scalability of peer configuration management in partially reliable and ad hoc networks. In Proceedings of the VIII IFIP/IEEE IM Conference on Network Management, page 293, 2003.
[35] Y.G. Chen. The spatial meaning of Pareto's scaling exponent of city-size distribution. arXiv:1309.4862, 2013.
[36] M. Burgess. The kinematics of distributed computer transactions. International Journal of Modern Physics C.
[37] Physical Review E, 62:1738, 2000.
[38] L. Lamport. Paxos made simple. SIGACT News, 32(4):51–58, December 2001.
[39] D. Ongaro and J. Ousterhout. In search of an understandable consensus algorithm. In Proceedings of the 2014 USENIX Annual Technical Conference, USENIX ATC '14, pages 305–320, Berkeley, CA, USA, 2014. USENIX Association.
[40] W.X. Zhou, S. Sornette, R.A. Hill, and R.I.M. Dunbar. Discrete hierarchical organization of social group sizes. Proc. Royal Soc., 272:439–444, 2004.
[41] P. Borrill, M. Burgess, M. Dvorkin, and H. Wildfeuer. Workspaces. Technical report, 2015.
[42] R. Dunbar. Grooming, Gossip and the Evolution of Language. Faber and Faber, London, 1996.
[43] B. Burns. How Kubernetes changes operations. ;login:, 40(5), 2015.
[44] F. de Holanda, R. Ribeiro, and V. Medeiros. Brasilia, Brazil: economic and social costs of dispersion. 2008.
[45] T.H. Stang, F. Pourbayat, M. Burgess, G. Canright, K. Engø, and Å. Weltzien. Archipelago: A network security analysis tool. In Proceedings of The 17th Annual Large Installation Systems Administration Conference (LISA 2003), San Diego, California, USA, October 2003.
[46] J.A. Bergstra and M. Burgess. Local and global trust based on the concept of promises. Technical report, arXiv:0912.4637 [cs.MA], 2006.
A The value of promises
The value of links in a network depends on the promises they make. The value of a promise is a form of assessment [9] that any agent can make independently. We write an assessment of whether a promise was kept as

α_i( A_j −b→ A_k ) ∈ [0,1]   (93)

to mean the assessment by agent A_i that the promise of b from A_j to A_k was kept. A valuation is an estimate of what a promise is worth to an agent. This may or may not depend on the assessment of the extent to which the promise is kept. Every agent assesses on its own calibrated scale. If we want a common currency valuation for all parties, this has to be calibrated by a single agent according to its scale. The interpretation of value is also an individual judgement that relies on trust, and may be based on an accounting of the assessments over time (reputation) [46]:

My reputation ∝ Σ_you ( you −(−b)→ me )   (94)

In words, my reputation is proportional to the number of 'you' who (publicly) promise to accept my promised service. Even unilateral promises may have some value:

Valuation by X | About promise      | Reason for value to X
Me             | me −(+b)→ you      | A reputation-building investment
Me             | you −(+b)→ me      | A service that might help me
You            | me −(+b)→ you      | A service that might help you
You            | you −(−b)→ me      | You need the service now

Cooperative relationships are usually based on conditional assistance [9], and take the form of a conditional equilibrium:

S −(+S | M)→ R   (95)
R −(+M | S)→ S   (96)
S −(−M)→ R   (97)
R −(−S)→ S.   (98)

In words, S promises R a service, if it receives payment M; and R promises to pay M if it receives service S. In a network without trust, this is a deadlock; but if either agent trusts the other enough to go first, it is a cyclic generator of a long term relationship. Such relationships imply lasting value, as known from game theory (for a review see [26]). Both agents also promise that they will take (−) what the other is offering, unconditionally. This is a signal of trust.
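The conditional equilibrium (95)–(98) can be sketched as a tiny simulation (my own illustration, not a promise-theory library): each agent acts only when its condition is already satisfied, so nothing happens until one agent extends trust and goes first, after which the cycle sustains itself.

```python
# Two agents bound by conditional promises: S serves if paid,
# R pays if served. Illustrative sketch, not a real promise-theory API.

def run(rounds, trusting=False):
    served = paid = False  # what each agent has received so far
    exchanges = 0
    for _ in range(rounds):
        acted = False
        if paid or (trusting and exchanges == 0):
            served = True  # S keeps (+S | M), or goes first on trust
            acted = True
        if served:
            paid = True    # R keeps (+M | S)
            acted = True
        if acted:
            exchanges += 1
    return exchanges

print(run(10))                 # 0: without trust, the conditions deadlock
print(run(10, trusting=True))  # 10: one trusting move starts the cycle
```

A single unilateral act of trust converts a deadlocked pair of conditional promises into a self-sustaining exchange, which is the 'cyclic generator' described above.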
Valuations are not necessarily rational to anyone but the agent that makes them, and are unrelated to cost. The economic value is that something is exchanged, which requires a binding of both + and − promises:

S −(+S)→ R   (99)
R −(−S)→ S   (100)

Both agents recognize the value of the other party, so the value exchanged is proportional to their assessments that the promises were kept:

v_S ∝ α_S( S −(+S)→ R ) α_S( R −(−S)→ S )   (101)
v_R ∝ α_R( S −(+S)→ R ) α_R( R −(−S)→ S ).   (102)

In a community where such transfers are made often, and between arbitrary pairs of agents, standards of valuation are equilibrated, and may be exchanged in league with a calibration agency (e.g. a bank or government). Thus, in a well-connected community, with a spanning infrastructure, we may posit that the value of a one-way transfer is simply

v_C(Π^S_ij) = c_S α_i α_j,   (103)

where c_S is the currency value of a perfect service relationship S, and α_i is an impartial assessment of the probability with which A_i will keep its promise to give or receive S.

B Defining the edge of a city or community by membership promises
So far, we've skirted around the issue of what constitutes the city limit, in the definition of a city. The size of the city plays a role in the measurements, and the model in [1] treats the city as an economically inflated balloon, with an edge, so we must understand what constitutes the scope of the city or semantic space. In fact the city is more like a club with membership than a balloon with an edge.

Consider how the network we call a city is reached from beyond. How does it communicate with the outside world? Cities have roads and other infrastructure links in and out of the city (James Blish novels notwithstanding). Should these links be treated as if they were part of the same infrastructure mesh as the city itself? If so, where does the city start and end? The use of dynamical scaling arguments for everything else suggests that there might be a dynamical 'network' answer to this question of boundary conditions, but this is not the case.

Consider a simple thought experiment, in which we start with two separate cities and join them together by increasing the density of connections (see figure 12).

Figure 12: A simple thought experiment: when do two cities merge into a single one? As we add more and more link capacity, when does the scaling change from N^β + N^β to (N + N)^β? Providing the links are not saturated, it takes only a single link. So what does the edge mean?

As soon as a city is connected to an outside network, this extends the internal network. As long as the 'external' network has the capacity to support the aggregate level of traffic from the population N_I, then it is no different from the internal infrastructure network, and we have simply extended the boundary of the city.

One proposal might be to look for the presence of a discontinuity in the network capacity, like a threshold event horizon, at which the response time for a network interaction changed:

R_external ≫ R_internal   (104)

But this doesn't quite make sense either. Some processes are fast and some are slow, even in an urban centre (a letter posted may take longer to arrive than an office worker takes to process and even reply to it, while a person can take the train across town in half that time). An ecosystem has a broad mixture of timescales. Only by trying to separate weakly coupled agents can we compare relative timescales for modular components. Moreover, the entirely local gestation or production time is usually the limiting time factor in the economic processes considered here, not the transport or delivery times.

While physically plausible, this is not how cities are defined. They are defined semantically, by the labelling of community membership. The simple explanation [6, 9, 29] is that a city is defined to be that collection of agents that mutually promise to be members of the city, and that are accepted as such by the city authorities. In practice, the population must register as residents, and they receive promises of services (including tax collection). One assumes that the transient population of any city is a small correction to this.

The autonomy of any observer counts in making the judgement of community boundary.
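The membership definition can be sketched directly: a community is the set of agents whose membership promise is matched by an acceptance promise from the authority (a toy illustration with made-up agent names; promise theory itself prescribes no particular data structure).

```python
# A community defined by promise bindings, not by geometry:
# an agent belongs iff it promises membership (+m) AND the
# authority promises to accept it (-m). Names are illustrative.

def community(membership_promises, accepted_by_authority):
    """Agents whose +membership promise is bound by the authority's -accept."""
    return membership_promises & accepted_by_authority

promised = {"alice", "bob", "carol", "commuter"}   # +m promises made
accepted = {"alice", "bob", "carol", "dave"}       # authority's -m scope

print(sorted(community(promised, accepted)))
# ['alice', 'bob', 'carol'] -- the commuter interacts but is not a member,
# and dave is accepted in principle but never promised to join.
```

The boundary is thus a semantic intersection of promises, independent of how many roads or links cross it.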
Each agent can (and will) judge independently whether it considers itself a part of a region or not. This begs the question of how collected data define communities, and whether they really have an edge or not. Can the scaling laws be made to fit any interconnected network? The structural considerations in section 3.7 suggest that, with the right understanding of functional structure, the universalities can be applied properly. These issues have been highlighted in the empirical studies in [10, 13], where it is found that the scaling is distorted by picking the 'wrong boundary' in urban regions.

As we try to apply the ideas to similar networks, such as IT infrastructure and online communities, the role of the physical city as an entity becomes unimportant. It is rather the community that resides and interacts within it that plays the major role of mixing. The details of the infrastructure play a role, but the universality lies in the notion of a community.

C The phase of a community: mobility of agents, and interaction catalysts
Agents exist a priori in an unbound state, effectively a gas phase, mixing with their changing surroundings. By promising constrained cooperation, they can voluntarily become part of a solid phase, interacting only with fixed neighbours. Mobility of promising agents, in the gas phase, remains important in all functional systems.

There are two kinds of networks in a city:

• Supply or delivery networks, which are one-way flows from source to destination. These are mainly branching processes, but may also have simple redundancy.
• Collaboration networks, with two-way interactions between communicating agents. The agents can be people, machines, companies, etc.

Collaboration links can form in two ways:

• People at fixed locations can use telephone or Internet to send messages (solid state agents).
• People travel between locations, carrying messages with them (agents as a gas).

These methods may be treated as different kinds of network in the model. Promise theory is a chemistry for generalized semantic bindings.

(An impartial approach to delimiting communities, based on actual network topology, would be to use the 'Archipelago method' of defining regions by network eigenvector centrality. Using hill climbing to define natural regions that are seeded on very central nodes leads to well defined regions [45]. The question is whether these have any semantic significance. There are two ways to do it: either based on the shared infrastructure network, or on the virtual business networks that describe the outputs Y, by defining an effective adjacency matrix based on promise bindings.)

The model proposed by [1] is close to the appearance of a kinetic theory, but the city is not a gas with random motions in this model. Its phase is not defined, because the physical realization of the network is not defined. The movement of the population could be responsible for forming links between agents, i.e. transport via the infrastructure networks (like taxis and subways); or intermediate messenger technologies could be responsible (such as post and Internet).
The phase could play a role in the scaling of the infrastructure network in general, constraining its degrees of freedom and range. A solid phase limits the effective dimensionality D.

Modern cities comprise a fixed infrastructure in a mainly 'solid state', while the agents are sometimes bound in a solid state, and sometimes free as a gas. People move around the city in the subway like water, but increasingly they use messages, like 'covalently' bonded work-molecules, rather than transport pipes. Now that distance is less costly, the ease and speed of networking is reflected in the lower density of modern cities, versus older cities where high density enabled ease of meeting, in a kind of primordial soup of mixed intercourse. (Curiously, it is also believed in biology that the cooking of food is what made humans an efficient species, supplying energy to fuel our large brains.)

Cities are not the only functional networks, of course. Any community could be completely mobile, with no fixed address, such as online communities, coming together only in meeting places that play the role of catalysts. This makes the mixing of skills and promises more efficient. Catalysts for bringing agents together, like social meeting places, play an increasingly important role. Open source software is one of the important outputs of the modern world, and it happens completely outside cities. (Github and other version-controlled repositories of virtual code libraries function as catalytic meeting places.) In biology, the infrastructure networks are in a fluid state, with functional agents (cells) transported in suspension, and their promises advertised by compatibility molecules and receptors on their surfaces, like exterior superagent promises [7], exactly analogous to small businesses. IT networks are mainly solid, or quasi-solid, even when using mobile devices, as messages move between the agents much faster than the agents move themselves, so the motion of devices is negligible for many purposes.

D Effective power law scaling from Amdahl's and Gunther's laws

The Amdahl and Gunther scaling relations are workload scalings, not in the usual form of universal scaling relations. Let's consider how we might derive an approximate power law scaling relation from these. Amdahl's law gives the execution time on N processors as T(N) = σ + π/N, where σ is the serial part of the work and π the parallelizable part. If we want to know how the time ratio (speedup) scales as a function of the number of processors N, then we can compare N with γN. For γ > 1, we can write

T(γN)/T(N) = (σ + π/(γN)) / (σ + π/N)   (105)
           = 1 + (γ⁻¹ − 1) (π/(σN)) / (1 + π/(σN)).   (106)

Defining δ = H/(D(D+H)), where D = 1 and H = π/(σN), and approximating as a binomial expansion,

T(γN)/T(N) = 1 − ((γ−1)/γ) δ   (107)
           ≃ (1 − (γ−1)/γ)^δ . . .   (108)
           ≃ γ^(−δ).   (109)

Thus we have an approximate power law fit for N large compared to π/σ, and we can write

T(N) ≃ T₀ N^(−δ),   (110)

i.e. there is a marginal relative economy of scale for small N, which decays to an essentially scale invariant constant result. If we allow, like Gunther, for the presence of equilibration, or mesh coherence effects, then we could use the form

T(N) = σ + π/N + κN,   (111)

where κ represents the linear time taken to poll each of the worker agents. This is the case where replication and consistency are required. With this extra term, we have

T(γN)/T(N) = (σ + π/(γN) + κγN) / (σ + π/N + κN)   (112)
           = 1 + [ (γ⁻¹ − 1) π/(σN) + (γ − 1) (κ/σ)N ] / [ 1 + π/(σN) + (κ/σ)N ].   (113)

Now the behaviour doesn't separate cleanly, and there are two regimes of approximate power law scaling, with something more messy in between.
Using the same procedure as before, we get an anomalous term for κ ≠ 0:

T(γN)/T(N) ≃ (1 − (γ−1)/γ)^δ + (γ−1) κN / (σ + π/N + κN)   (114)
           ≃ γ^(−δ)   (κ ≪ σ, N small)   (115)
           ≃ γ   (κ > 0, N large).   (116)

So for large N, with κ > 0, we have simply

T(N) ≃ T₀ N,   (117)

i.e. the scaling cost becomes linearly worse with increasing size.

When we compare these results to a spacetime scaling, it will become apparent that this takes the approximate form of a scaling law in a one-dimensional spacetime D = 1, with Hausdorff dimension H = π/(σN) < 1. This indicates that a serial workflow, with some parallelism, is essentially a one dimensional problem, with some fractal complexity in its trajectory due to parallelism. Interestingly, as the parallelism increases, the duration of the fractal dimensionality shrinks to nothing. Thus the large N limit for serial processing tends to squeeze the degrees of freedom in the system. (Alternatively, if we think about the problem graph theoretically, we can also say that it behaves like a D = N dimensional space, with a trajectory of Hausdorff dimension H = π/σ. In a graph, the node degree k = N is the effective dimension of spacetime at the point [6].)
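The two regimes of the anomalous scaling are easy to check numerically (a hedged sketch; the values of σ, π, κ below are arbitrary choices for illustration, not measurements): the effective local exponent d(log T)/d(log N) is negative at small N (a weak economy of scale) and tends to +1 once the coherence term κN dominates.

```python
import math

# Gunther-style time model: serial part, parallel part, coherence cost.
# Parameter values are illustrative, not taken from measurements.
SIGMA, PI, KAPPA = 1.0, 100.0, 0.001

def T(n, kappa=KAPPA):
    return SIGMA + PI / n + kappa * n

def local_exponent(n, kappa=KAPPA):
    """Effective power-law exponent d(log T)/d(log N) at N = n."""
    g = 2.0  # compare N with gamma*N, as in the text
    return math.log(T(g * n, kappa) / T(n, kappa)) / math.log(g)

print(local_exponent(10))            # negative: economy of scale
print(local_exponent(10**6))         # close to +1: coherence dominates
print(local_exponent(10**6, kappa=0.0))  # near 0: the Amdahl plateau
```

The crossover between the γ^(−δ) regime and the linear γ regime happens where κN becomes comparable to σ, which is the 'messy' middle region mentioned above.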