DDiscovering the laws of urbanisation
Filippo Simini , & Charlotte R. James Department of Engineering Mathematics, University of Bristol, Merchant Venturers Building, WoodlandRoad, BS8 1UB, Bristol, UK.
In 2012 the world’s population exceeded 7 billion, and since 2008 the number of individu-als living in urban areas has surpassed that of rural areas. This is the result of an overallincrease of life expectancy in many countries that has caused an unprecedented growth ofthe world’s total population during recent decades, combined with a net migration flow fromrural villages to urban agglomerations . While it is clear that the rate of natural increase andmigration flows are the driving forces shaping the spatial distribution of population, a gen-eral consensus on the mechanisms that characterise the urbanisation process is still lacking . Here we present two fundamental laws of urbanisation that are quantitatively supportedby empirical evidence: 1) the number of cities in a country is proportional to the country’s to-tal population, irrespective of the country’s area, and 2) the average distance between citiesscales as the inverse of the square root of the country’s population density. We study thespatio-temporal evolution of population considering two classes of models, Gravity and In-tervening Opportunities, to estimate migration flows and show that they produce differentspatial patterns of cities. Ranking cities by population, it has been observed
8, 9 that the population of the i -th largestcity of a country is approximately equal to the population of the largest city divided by i , i.e. a1 a r X i v : . [ phy s i c s . s o c - ph ] D ec ity’s rank is inversely proportional to its population. In other words, the fraction of cities withpopulation larger than X follows a Zipf law, P > ( X ) ∝ /X . Several mechanisms have beenproposed to account for this observation
3, 10–13 , and the prevailing explanation suggests that the Zipflaw results from a stochastic process based on the rule of proportionate random growth, or Gibrat’slaw : the growth rate of a city’s population is independent of its size. Although providinga convincing explanation to the Zipf law, the rule of proportionate random growth is unable toanswer the following fundamental questions about the urbanisation process: How do cities form?What determines the number of cities in a country? What is the spatial distribution of these cities?To answer these questions we analyse a comprehensive dataset on the population and location ofcities globally . We first consider the relationship between the number of cities in a geographicalregion (country or state) and other relevant quantities like the region’s area, population, and density.Each circle in Fig. 1a corresponds to one of the 210 countries of the world, and the number ofcities with more than 50,000 inhabitants in each country ( y axis) is displayed as a function of itstotal population ( x axis), population density (colour), and total area (circle size). The correlationcoefficients between the number of cities and total population (0.84), area (0.64), and populationdensity (-0.04) indicate a strong linear dependence between the number of cities in a country andits total population. This hypothesis is further confirmed by considering more homogeneous setsof regions, like the United States in Fig. 1b and 2a, as well as European countries and India’s states(Supplementary Information). In general, there is clear evidence that the number of cities growsproportionally with the state or country population, whilst there is a small or indirect relationshipbetween the number of cities and the region’s area or population density: in the US, the cities-2opulation, cities-area, cities-density correlation coefficients are 0.95, 0.04, and -0.08 respectively.Our first law of urbanisation establishes the linear dependence between the number of cities andthe population of a region: The total number of cities in a region is proportional to the region’spopulation , C ( N ) ∼ N , where C ( N ) is the number of cities within a region with population N .Combining this result with Zipf’s law we can estimate C ( N, X ) , the number of cities above a givensize, X , in a region with total population N as C ( N, X ) = C ( N ) P > ( X ) ∼ N/X. (1)In Figure 1c we plot the number of cities with more than X inhabitants in each US state, C ( N, X ) ,as a function of the ratio N/X for values of X ranging from 5,000 to 5,000,000 inhabitants. Allpoints collapse on a straight line, confirming that Eq. 1 holds for several orders of magnitudeof N and X . The first law of urbanisation is also a condition to ensure the stationarity of theZipf law for a country with a uniformly growing population. Indeed, assuming that the stationarydistribution of city sizes follows Zipf’s law, if the population of each city doubles in a time ∆ t , andso does the country’s total population, then the number of cities must also double, otherwise thecity size distribution would shift to the right and not be stationary. Thus when the total populationof a region increases, not only do existing cities increase in size but new cities are also createdin accordance with Eq. 1. We find evidence of this behaviour in Iowa, US, where historical datashows that between 1850 and 2000 the number of incorporated places (i.e. self-governing cities,towns, or villages) grew at the same rate as the state population (Figure 2b). The second lawof urbanisation describes the spatial distribution of cities: The distribution of cities in space isa statistically self-similar object with fractal dimension equal to 2 . This means that the average3umber of cities in a region with uniform population density (if measured on a length scale largerthan the average distance between cities) is proportional to the region’s area, or equivalently thatthe density of cities scales as χ ∼ C/A . Combining this result with the first law and Zipf’s law,Eq.1, and observing that the average distance to the closest city, (cid:104) d c (cid:105) , scales as the inverse of thesquare root of the density of cities, we obtain the following result: (cid:104) d c (cid:105) ∼ / √ χ ∼ (cid:112) A/C ∼ (cid:112) X ( A/N ) ∼ (cid:112) X/ρ (2)i.e. the average distance to the closest city for cities with more than X inhabitants is proportionalto the square root of X and inversely proportional to the square root of the region’s population den-sity, ρ ≡ N/A . Figure 2c shows the average distance to the closest city with more than X =5,000inhabitants for the United States as a function of the state population density, and confirms thescaling behaviour predicted by Eq.2. This finding supports some of the conclusions of the CentralPlace Theory of human geography
18, 19 , whilst disproving others. On the one hand, it is true thatfor a region with uniform population density the larger the cities are, the fewer in number they willbe, and the greater the distance, i.e. increasing X in Eq.2 results in a greater average distance (cid:104) d c (cid:105) .On the other hand, the average distance between cities of a given size X is not the same for allthe states, but depends on the state’s population density: cities of a given size are closer in denselypopulated states than in sparsely populated ones, i.e. for a fixed city size X and state area A thedistance between cities decreases as the inverse square root of the state population, N . To explainthe empirical laws of urbanisation, Eqs.1 and 2, we must understand the effect of migrations oncities’ demographic dynamics. Traditionally, two main mechanisms have been used to estimateand model migrations and other aggregated spatial flows: Gravity models
20, 21 and Intervening Op-4ortunities (IO) models . In both approaches flows are estimated as the product of two types ofvariable; one type that depends on an attribute of each individual location (the number of opportu-nities, usually identified with population), and the other type that depends on a quantity relating apair of locations (i.e. a distance). The difference between the two models pertains to the distancevariable considered: the geographical distance in Gravity models, and the number of interveningopportunities in IO models (the number of intervening opportunities between locations i and j isdefined as the sum of the opportunities of all locations that are closer to i than j is). Both mod-els are able to estimate migration flows with comparable accuracy when fitted to empirical data,and currently there is no objective quantitative criterion for selecting one modelling approach overthe other in order to infer which between geographic distance or intervening opportunities is thevariable that best describes domestic migration flows. We characterise the patterns of populationdistribution generated by each model of migration by running two sets of numerical simulationsof the spatiotemporal evolution of a population distributed in the cells of a square grid. At ev-ery time step the population of each cell can vary due to both natural increase (i.e. births-deaths)modelled with the proportionate random growth mechanism, and migrations of individuals from/toother cells modelled using a Gravity model in one case, and an IO model in the other (see Sup-plementary Information). Imposing a reflecting boundary condition to ensure that the populationin each cell cannot decrease below a minimum value n and starting from a random populationslightly above n in each cell, we observe that in both cases the total population increases with n (see insets of Fig. 3a-b) and the population distribution of the emerging peaks or clusters of highdensity, i.e. cities, is close to Zipf’s law (see main panels of Fig. 3a-b). The difference between5he two migration models is that for the Gravity model the number of clusters is independent of n , whereas for the IO model, fixing the values of all other parameters, the number of clustersgrows with n and the total population, see Fig. 3c-d. To explain this result we consider the cor-responding deterministic dynamics, using an approach similar to the ones developed in economicgeography . In this approach we represent the spatial distribution of population as a continuousdensity, and describe its time evolution using a deterministic equation that comprises of two terms:migrations and natural increase. We model migrations using a continuum version of Gravity andIO models in which the deterrence function, f , describing the decay of flows with distance, canbe any continuous function that may depend on one or more parameters denoted by R . We es-timate the overall opportunities at a location as the sum of natural resources, w , such as fertilesoil and minerals which we assume to be constant and uniformly distributed so that unpopulatedlocations are a priori equivalent, and population-driven opportunities such as jobs and serviceswhich are proportional to the population, ρ . We model natural increase using a logistic equationcharacterised by two parameters; the carrying capacity, ρ , which is equal to the stationary densityof population attainable in each location in isolation, and the growth rate, g , which determines thespeed to reach the stationary state. The resulting dynamic equation for the density of population ρ ( x, t ) is ˙ ρ ( x, t ) = gρ ( x, t )[1 − ρ ( x, t ) /ρ ] − T ρ ( x, t ) + T in ( ρ ( t ) , w, f, R ) (3)where the constant T is the migration rate, and the term T in represents the increase in densityat location x due to migrations from all other locations (see Supplementary Information). Weobserve that a uniform density of population ρ ( x ) = ρ is an equilibrium state because the growth6erm vanishes and the net migration at every location T in − T ρ is zero by symmetry. First, wedetermine the conditions for the formation of cities by linearising Eq.3 around ρ and studying itsstability. If ρ is an unstable equilibrium then an initially small perturbation can grow leading tothe formation of zones of high population density (cities), whereas if the state of uniform density ρ is a stable equilibrium no cities will form. For Gravity models in one and two dimensions,the uniform state is unstable and cities emerge when T > g ρ ( ρ + w )( ρ − w ) , which tends to T > g in the limit of large carrying capacity ρ (cid:29) w (Fig. 3e). This means that cities will form if thepopulation is sufficiently mobile, i.e. if the migration rate T is sufficiently higher than the growthrate g , and this result is independent of both the particular deterrence function f considered, andthe average distance of migrations, determined by R . For IO models in one and two dimensions,the condition for the instability does depend on the particular deterrence function considered (seeSupplementary Information), however in the large population limit ρ (cid:29) w we recover the sameresult as the Gravity model: cities form if the population is sufficiently mobile, T > Bg whereconstant B may depend on f (Fig. 3e). We can thus conjecture a zeroth law of urbanisation: The formation of cities is only possible if the migration rate is sufficiently higher than the rateof natural increase.
Second, we determine the number of cities that will form from an unstableequilibrium by computing the wavelength of the mode with the highest instability, k m , which isproportional to the number of cities per unit length (i.e. inversely proportional to the characteristicdistance between cities). The two migration models produce radically different results. For Gravitymodels the number of cities, k m , depends on the range of migrations determined by the deterrencefunction f and its parameters R , but is independent of the carrying capacity in the limit ρ (cid:29) w ;7or IO models k m depends on f , R , and grows proportionally to the total population ρ in the limit ρ (cid:29) w (see Fig.3f and Supplementary Information), in agreement with numerical simulations andthe laws of urbanisation. The fact that Gravity and IO models generate different spatial patterns ofpopulation distribution enables the possibility to assess their compatibility with the empirical lawsof urbanisation, hence providing a criterion to determine which between geographic distance ornumber of intervening opportunities is the correct variable to describe migration flows. References
1. Martine, G. & Marshall, A.
State of world population 2007: unleashing the potential of urbangrowth . State of world population 2007: unleashing the potential of urban growth (UNFPA,2007).2. Batty, M. The size, scale, and shape of cities.
Science , 769 (2008).3. Rozenfeld, H. D. et al.
Laws of population growth.
Proceedings of the National Academy ofSciences , 18702 (2008).4. Bettencourt, L., Lobo, J., Helbing, D., Kühnert, C. & West, G. B. Growth, innovation, scaling,and the pace of life in cities.
Proceedings of the National Academy of Sciences , 7301(2007).5. Bettencourt, L. M. The origins of scaling in cities.
Science (New York, N.Y.) , 1438–1441(2013). 8. Louf, R. & Barthelemy, M. Modeling the polycentric transition of cities.
Physical ReviewLetters , 198702 (2013).7. Batty, M. Cities and complexity: Understanding cities through cellular automata, agent-basedmodels and fractals (2005).8. Zipf, G. K. The psycho-biology of language. (1935).9. Ioannides, Y. M. & Overman, H. G. Zipf’s law for cities: an empirical examination.
Regionalscience and urban economics , 127–137 (2003).10. Gabaix, X. Zipf’s law for cities: an explanation. The Quarterly Journal of Economics ,739–767 (1999).11. Eeckhout, J. Gibrat’s law for (all) cities.
American Economic Review
Nature ,608–612 (1995).13. Gabaix, X. & Ioannides, Y. M. The evolution of city size distributions.
Handbook of regionaland urban economics , 2341–2378 (2004).14. Gibrat, R. Les inégalités économiques (Recueil Sirey, 1931).15. Marsili, M. & Zhang, Y.-C. Interacting individuals leading to zipf’s law.
Physical ReviewLetters , 2741 (1998).16. Hernando, A., Hernando, R., Plastino, A. & Zambrano, E. Memory-endowed us cities andtheir demographic interactions. Journal of The Royal Society Interface , 20141185 (2015).97. Vatant, B. & Wick, M. Geonames ontology (2012).18. Christaller, W.
Die zentralen Orte in Süddeutschland: eine ökonomisch-geographische Un-tersuchung über die Gesetzmässigkeit der Verbreitung und Entwicklung der Siedlungen mitstädtischen Funktionen (University Microfilms, 1933).19. Losch, A. Die raumliche ordnung der wirtschaft; eine untersuchung uber standort, wirtschafts-gebiete und internationalen handel (1943).20. Zipf, G. K. The p 1 p 2/d hypothesis: On the intercity movement of persons.
AmericanSociological Review , 677–686 (1946).21. Wilson, A. G. The use of entropy maximising models, in the theory of trip distribution, modesplit and route split. Journal of Transport Economics and Policy
AmericanSociological Review , 845–867 (1940).23. Schneider, M. Gravity models and trip distribution theory. Papers in Regional Science ,51–56 (1959).24. Simini, F., González, M. C., Maritan, A. & Barabási, A. L. A universal model for mobilityand migration patterns. Nature , 96 (2012).25. Noulas, A., Scellato, S., Lambiotte, R., Pontil, M. & Mascolo, C. A tale of many cities:universal patterns in human urban mobility.
PloS one , e37027 (2012).26. Krugman, P. A dynamic spatial model (1992).107. Wilson, A. G.
Catastrophe theory and bifurcation: applications to urban and regional systems (Univ of California Press, 1981).28. Rosser, J. B.
Complex evolutionary dynamics in urban-regional and ecologic-economic sys-tems: From catastrophe to chaos and beyond (Springer Science & Business Media, 2011).29. Favaro, J. & Pumain, D. Gibrat revisited: An urban growth model incorporating spatial inter-action and innovation cycles.
Geographical Analysis , 261–286 (2011).30. Simini, F., Maritan, A. & Néda, Z. Human mobility in a continuum approach. PloS one ,e60069 (2013). 11 , country population ( , ) , nu m be r o f c i t i e s World a D en s i t y ( k m ) , state population ( , ) , nu m be r o f c i t i e s United States b D en s i t y ( k m ) , state area [km ] ( , ) / ( , ) , nu m be r o f c i t i e s c ( , ) Figure 1
The first law of urbanisation. a , The number of cities with more than 50,000 inhabitants, C ( N, k ) ( y value), as a function of the population N ( x value), area A (circle size), and populationdensity ρ (color), for 210 countries. The correlation coefficients corr ( C, N ) = 0 . , corr ( C, A ) =0 . , and corr ( C, ρ ) = − . indicate a strong linear dependence between C and N . The weakercorrelation between a country’s area and the number of cities is due to an indirect dependence ofarea and total population, i.e. larger countries tend to be more populated. b , The number of citieswith more than 5,000 inhabitants in the Unites States is proportional to the state’s population,corr ( C, N ) = 0 . . The correlations with area ( . , see inset) and population density ( − . )are negligible, as illustrated by the following pairs of states with similar area or density and verydifferent number of cities: Alaska ("AK": A = 1 . M km , C (5 k ) = 22 ) vs Texas ("TX": A = 0 . Mkm , C (5 k ) = 392 ), and Rhode Island ("RI": ρ = 393 km − , C (5 k ) = 35 ) vs New Jersey ("NJ": ρ = 467 km − , C (5 k ) = 316 ). c , Combining the first law of urbanisation with Zipf’s law it is possibleto estimate the number of cities with more than X inhabitants in a country with population N as C ( N, X ) ∼ N/X . As a consequence, the scattered cloud of points resulting when plotting C ( N, X ) against N for various X ’s in the range · − · (inset) collapses on a straight linewhen C ( N, X ) is plotted against the ratio N/X .
840 1860 1880 1900 1920 1940 1960 1980 2000
Year I o w a popu l a t i on ( × ) ba Population N u m be r o f c i t i e s , t o w n s , v ill age s i n I o w a Cities, towns,villages
Population C i t i e s , t o w n s , v ill age s / , state population density [km ] , a v e r age d i s t an c e t o t he c l o s e s t c i t y [ k m ] c ( / ) / Iowa
Pop: 3046k, Area: 144k kmCities > Connecticut
Pop: 3574k, Area: 12k kmCities > South Dakota
Pop: 814k, Area: 196k kmCities > Figure 2 a , Illustration of the relationships between total population, number of cities, and theiraverage distance in Iowa, Connecticut, and South Dakota. In agreement with the first law ofurbanisation, Eq.1, Iowa and Connecticut have similar populations and a similar number of citieswith more than 5,000 inhabitants, despite Connecticut having one-twelfth the area of Iowa; SouthDakota instead has roughly one-fourth of both the population and number of cities of Iowa andConnecticut, despite having a slightly larger area than Iowa. In agreement with the second lawof urbanisation, Eq.2, cities in Connecticut are closer than cities in Iowa because of the higherpopulation density in Connecticut. By rescaling distances such that Connecticut’s area becomesequal to Iowa’s area, the two states would have the same population density and consequentlythe same average distance between cities. b , Historical records of the number of incorporatedplaces ( C , red triangles) and the state population ( N , blue circles) in Iowa from 1850 to 2000 source: State library of Iowa, state data center). The similar growth rates of C and N entail thevalidity of the first law of urbanisation C ∼ N during the 150-year period (inset). In a countrywith a uniformly growing population the first law of urbanisation implies the stationarity of Zipf’slaw and vice versa: If the population of each city doubles in a time ∆ t , the number of cities withmore than X inhabitants at time t + ∆ t is equal to the number of cities that had more than X/ inhabitants at time t , C (2 N ) P > ( X, t + ∆ t ) = C ( N ) P > ( X/ , t ) , then the Zipf law is the stationarydistribution of city sizes P > ( X, t + ∆ t ) = P > ( X, t ) = 1 /X , if the first law of urbanisation holds C (2 N ) = 2 C ( N ) ⇒ C ( N ) ∝ N . c , The second law of urbanisation for the United States. Theaverage distance to the closest city scales as the inverse of the square root of the state’s populationdensity (here all cities with more than 5,000 inhabitants are considered). The asymmetric errorbars denote the standard deviations above and below the average.
100 200 300 400 500 600 time steps N u m be r o f c l u s t e r s ( I O m ode l ) c === X, cluster population I n t e r v en i ng O ppo r t un i t i e s m ode l P > () a === time steps T o t a l popu l a t i on ( × ) time steps N u m be r o f c l u s t e r s ( G r a v i t y m ode l ) d === X, cluster population G r a v i t y m ode l P > () b === time steps T o t a l popu l a t i on ( × ) / / CitiesFlat e / f gravity expgravity p-lIO expIO p-l Figure 3 a-b , Main panels, counter cumulative distributions of clusters’ populations, X , at the endof the numerical simulations using Intervening Opportunities (a) and Gravity (b) models of migra-tion, for various values of minimum population n = 50 , , (see Supplementary Informationfor simulation details). The grey line is a guide for the eye depicting Zipf’s law, P > ( X ) ∼ /X .Symbols at each X denote the average fraction of clusters with population larger than X over 16realizations of the numerical simulation, and error bars represent the 25th and 75th percentiles.Insets, total population as a function of the simulation time steps. Lines with symbols denote themean total population whereas each shaded area represents the region between the 25th and75th percentiles. Simulations with larger n have a larger total population at all time steps for bothmodels, implying that the total population increases with n . c-d , Time evolution of the number ofclusters obtained in the 16 realizations of numerical simulations using Intervening Opportunities c) and Gravity (d) models of migration. Symbols denote the median number of clusters at a giventime step and error bars correspond to the 10th and 90th percentile. While for the Gravity model(d) the number of clusters at stationarity is independent of n , for the IO model (c) the number ofclusters becomes larger as n and the total population increase. e , Zeroth law of urbanisation. Ac-cording to our deterministic model of population dynamics, Eq.3, cities can form if the populationis sufficiently mobile, i.e. if the migration rate T is sufficiently higher than the population growthrate g . The three curves show the conditions for the equilibrium state of uniform population to beunstable for the exponential IO model, Gravity models, and a power-law IO model (from bottom totop). f , The wavelength of the mode with the highest instability, k m , which is proportional to thenumber of cities per unit length, as a function of the ratio ρ /w . For Gravity models (red curves) k m tends to a constant in the limit of high carrying capacity, ρ (cid:29) w , whereas in the same limit k m grows proportionally to ρ for IO models (green curves). iscovering the Laws of Urbanisation - SupplementaryInformation Contents ρ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135.1.1 1D: g=0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135.1.2 1D: g=1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135.1.3 2D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 The Dynamic Equation
A general theory aiming at describing human demographic dynamics, i.e. the spatio-temporalevolution of the population distribution in a region, must consider all factors that contribute to thepopulation change in any location. These contributions are reduced to the following two fundamentalcategories: births, deaths and external (international) migrations on one end, and internal (national)migrations or relocations on the other. The first contribution modifies the total population ofthe region, while the second is responsible for the redistribution of individuals within the region’sarea.If relocations from one point to another are to be considered, alongside population growth andinternational migrations, a deterministic equation to describe the change in population density, ρ ( x, t ) , at location x is given by: ∂ρ ( x, t ) ∂t = g ( ρ ( x, t )) − T out ( x, t, ρ ) + T in ( x, t, ρ ) . (S1) T out and T in are the number of individuals leaving and relocating to point x from other parts ofthe region respectively; these terms represent internal migrations. The form of these functions isdetermined by the mathematical model used to describe the flow of individuals who relocate fromone point to another. Here we consider a Gravity model alongside Intervening Opportunities andRadiation models. The probability that an individual will relocate to a given location is dependenton both the distance to and opportunities at the destination (Gravity models) or at all intermediatelocations (Intervening Opportunities and Radiation models). Here, an opportunity refers to anyproperty of a location that may be of interest to the relocating individual. We distinguish betweentwo classes of opportunities; those that depend on the population, such as availability of goods andservices, and those that are independent of population such as the presence of naturally occurringresources.The function g ( ρ ( x, t )) represents the contribution in population change caused by births, deaths,and external (international) migrations. It can be zero, g ( ρ ( x, t )) = 0 , if the system has con-stant population, linear g ( ρ ( x, t )) = g · ρ ( x, t ) for an exponential population growth, or Logistic g ( ρ ( x, t )) = g · ρ ( x, t )(1 − ρ ( x, t ) /ρ ) , characterizing population dynamics with a uniform stationarystate, ρ ( x, t ) = ρ .In the following we consider a logistic form of growth and study the time evolution of a continuouspopulation distribution when subjected to small perturbations about the uniform stationary state ρ ( x, t ) = ρ . While this approach neglects the inherent randomness of stochastic models required toreproduce phenomena such as Zipf’s law, it allows for a fully deterministic analysis of the migrationmodels and their parameters, particularly the parameter constraints under which cities will form.Furthermore we note that additional contributions, such as a diffusion term responsible for therelocation of individuals from highly populated areas to less populated suburbs, can easily be added,however these will not be considered here. Gravity models are derived from Newton’s Law of Gravitation;2 ny two bodies attract one another with a force that is proportional to the product oftheir masses and inversely proportional to the square of the distance between them.
In the context of migratory dynamics, the Gravity model suggests that the probability per unit timethat an individual moves between two locations, say i and j , is proportional to the product of thepopulation densities at i and j , each to some power, and a function of the distance, r ij , between i and j , ie: P i ( j ) = ρ ( i ) α ρ ( j ) β f ( r ij ) R D djρ ( i ) α ρ ( j ) β f ( r ij ) . (S2)The denominator ensures that the probability, P i ( j ) , is correctly normalised over the domain D.Exponents α and β are usually positive, and often β = 1 . f ( r ) may be any continuous function,usually an exponential, e − rR , or power law, r − γ .In Equation S2 it is assumed that individuals will relocate to another position only if there is non-zero population at the destination. In reality, the probability for an individual to relocate is alsodependent upon the availability of natural resources at the new location or more generally, anyopportunity which is not related to the population, such as the presence of fertile soil or minerals.If these resources are combined into a single term, w , then we can re-write Equation S2 as P i ( j ) = [ ρ ( j ) + w ( j )] f ( r ij ) R D dj [ ρ ( j ) + w ( j )] f ( r ij ) . (S3)where β has been replaced with and ρ ( j ) + w ( j ) denotes the total number of opportunities atlocation j . For simplicity we assume that w is stationary in time; this may be justified by the factthat the rate of recovery of such resources is greater than the rate of consumption.In order to model the change in population density at location i , ie ρ ( i ) , in time, the average numberof people both leaving from and moving to i must be included. The number of people leaving i perunit time is given by: T out ( i ) = T ( ρ ( i ) , w ( i )) ρ ( i ) Z D djP i ( j ) = T ( ρ ( i ) , w ( i )) ρ ( i ) . (S4)In the above, T ( ρ ( i ) , w ( i )) is the average migration rate, i.e. the fraction of people in i that willrelocate in a time unit. To a first approximation this is assumed to be constant and independentof location; T ( ρ ( i ) , w ( i )) = T . The average number of people relocating to i per unit time is givenby: T in ( i ) = Z D djT ( ρ ( j ) , w ( j )) ρ ( j ) P j ( i )= [ ρ ( i ) + w ( i )] Z D dj T ( ρ ( j ) , w ( j )) ρ ( j ) f ( r ij ) R D di [ ρ ( i ) + w ( i )] f ( r ij ) . (S5)Inserting these expressions into Equation S1 allows for the change in population density with time,as described by a Gravity model, to be studied. 3 .1 Pattern Formation and Growth in 1D Considering a 1-dimensional line of infinite length, we write the dynamic equation, S1, for theGravity model as: ∂ρ ( x, t ) ∂t = g (1 − ρ ( x, t ) ρ ) ρ ( x, t ) − T ρ ( x, t )+ T ( ρ ( x, t ) + w ) (cid:16) Z ∞ f ( r ) ρ ( x − r, t ) R ∞ f ( z )( ρ ( − r + x − z, t ) + ρ ( − r + x + z, t ) + 2 w ) dz + f ( r ) ρ ( r + x, t ) R ∞ f ( z )( ρ ( r + x − z, t ) + ρ ( r + x + z, t ) + 2 w ) dz dr (cid:17) . (S6)The uniform distribution ρ ( x ) = ρ is a stationary state of Equation S1 because the growth termis equal to zero and T in is equal to T out for all x , hence the time derivative on the left-hand side iszero. In order to determine the stability of the uniform steady state we linearise Equation S6 about ρ and study the time evolution of a small perturbation ˜ ρ ( x, t ) ; ∂ ˜ ρ ( x, t ) ∂t =( − g − T ) ˜ ρ ( x, t ) + T Z ∞ ρ f ( r ) ˜ ρ ( x, t ) R ∞ f ( z )(2 ρ + 2 w ) dz dr + T ( u + w ) Z ∞ − ρ f ( r ) R ∞ f ( z )( ˜ ρ ( − r + x − z, t ) + ˜ ρ ( − r + x + z, t )) dz (cid:0)R ∞ f ( z )(2 ρ + 2 w ) dz (cid:1) − uf ( r ) R ∞ f ( z )( ˜ ρ ( r + x − z, t ) + ˜ ρ ( r + x + z, t )) dz (cid:0)R ∞ f ( z )(2 ρ + 2 w ) dz (cid:1) + f ( r ) ˜ ρ ( x − r, t ) R ∞ f ( z )(2 ρ + 2 w ) dz + f ( r ) ˜ ρ ( r + x, t ) R ∞ f ( z )(2 ρ + 2 w ) dz ! dr. (S7) ˜ ρ ( x, t ) represents a small fluctuation in the stationary density, ie ρ ( x, t ) = ρ + ˜ ρ ( x, t ) , and can bedecomposed as a sum of normal modes of wave number k ∈ R and amplitude h k ( t ) .Substituting ˜ ρ ( x, t ) = h k ( t ) e ikx into Equation S7 we obtain: ∂h k ( t ) ∂t =( − g − T ) h k ( t ) + T ρ h k ( t ) (cid:0)R ∞ f ( r ) dr (cid:1)R ∞ f ( z )( ρ + w ) dz + T ( ρ + w ) h k ( t ) R ∞ f ( r ) cos( kr ) dr R ∞ f ( z )( ρ + w ) dz − ρ (cid:0)R ∞ f ( r ) cos( kr ) dr (cid:1) (cid:0)R ∞ f ( z )( ρ + w ) dz (cid:1) ! , (S8)which may be written as ∂h k ( t ) ∂t = h k ( t )Λ k ( ρ , g, T, w, f ) . (S9)Here Λ k is the growth rate of the instability of the normal mode with wavenumber k ; in order forcities to develop and patterns to emerge Λ must be greater than zero for some k . We consider thecases in which the deterrence function, f ( r ) takes an exponential or power law form.4 .1.1 Exponential Deterrence Function If f ( r ) takes an exponential form (ie. f ( r ) = e − rR ) then the growth function Λ is given by: Λ k ( ρ , g, T, w, R ) = − g − T ρ ρ + w + T w + ( k/R ) ( ρ + w )(1 + ( k/R ) ) ( ρ + w ) . (S10)This expression simplifies using rescaled variables to give: λ κ ( µ , τ ) = − − τ µ µ + 1 + τ κ [ µ + 1][1 + κ ] [ µ + 1] . (S11)Here, κ = k/R , τ = T /g , µ = ρ /w and λ = Λ /g .The curve defined by Equation S11 is displayed in Figures 1a and 1b for different values of µ and τ .As aforementioned, in order for cities to form, λ κ must be greater than zero at its maximum κ . Themaximum is found by differentiating the expression for λ κ with respect to κ . On doing this, we findthat λ κ ( µ , τ ) has a maximum in κ m =0 of µ ≤ and in κ m = p ( µ − / ( µ + 1) (S12)otherwise. This corresponds to a wavenumber k m = R p ( ρ − w ) / ( ρ + w ) which, providing Λ k m > , corresponds to the average number of cities per unit length. For the case that ρ (cid:29) w itcan be seen that k m only depends on R , therefore the density of cities is fully determined by thecharacteristic length of travels, R − , and is independent of all other variables such as growth andmigration rate.The condition on the migration parameter T in order for λ κ to be greater than zero is found bysubstituting the expression for κ m back into Equation S11 and solving for τ . On doing this, theconditions: T c ≥ gρ ( ρ + w )( ρ − w ) , τ c ≥ µ ( µ + 1)( µ − (S13)are obtained. From these relations we can conclude that if ρ (cid:29) w then cities will emerge if τ > ,or T > g . It is important to note that Equation S13 is independent of R; the characteristic traveldistance has no effect on whether or not cities will emerge. The power law form for the deterrence function is given by f ( r ) = (1 + r ) − γ with γ > . Theanalysis of Equation S8 is more complicated for this case, however the maximum of λ k ( µ , τ, γ ) willoccur at a κ m that is only dependent on µ and the function f with its parameter γ .5his is a general result for every function f due to the fact that κ m is defined as λ k m ( µ , τ, γ ) = 0 ,with λ k = ddk ( − − τ µ + 1 + τ " R ∞ f ( r ) cos ( kr ) dr R ∞ f ( z ) dz − µ µ + 1 R ∞ f ( r ) cos ( kr ) dr R ∞ f ( z ) dz ! = τ " ddk R ∞ f ( r ) cos ( kr ) dr R ∞ f ( z ) dz ! − µ µ + 1 ddk R ∞ f ( r ) cos ( kr ) dr R ∞ f ( z ) dz ! , (S14)a function of µ and f only. For the case of µ (cid:29) (that is ρ (cid:29) w ), it is only the function f ,defining the spatial range of the migration process, that the characteristic distance between citiesis dependent upon. λ k m = 0 may be solved numerically for different values of γ . On doing this we find that the criticalcurve collapses on the same universal function, independent of γ .Moreover, it turns out that the critical curve of Equation S13 determining the condition for the for-mation of cities is valid for any continuous function f , including a power law deterrence function andtherefore holds universally. We may prove this by defining A ( k ) ≡ (cid:0) R ∞ f ( r ) cos ( kr ) dr (cid:1) / (cid:0) R ∞ f ( z ) dz (cid:1) .From Equation S14 we have A ( k m ) − µ µ + 1 A ( k m ) A ( k m ) A ( k m ) = µ + 12 µ . (S15)Inserting this into λ k m ( µ , τ c ) = 0 , the critical curve defined in Equation S13 is re-obtained. If a population is described by a logistic growth with individuals relocating according to a Gravitymodel, cities will develop if the following two conditions are met simultaneously: • The average population is sufficiently large: ρ (cid:29) w • The population is sufficiently mobile:
T > g When the population density is sufficiently high, the average distance between cities depends onlyon the deterrence function f and is independent of growth rate, migration rate and average popu-lation. 6 .2 Pattern formation and Growth in 2D In 2 dimensions, Equation S1 describing the evolution of the density of population for a Gravitymodel of human migration is given by: ∂ρ ( x, y, t ) ∂t = gρ ( x, y, t ) (cid:18) − ρ ( x, y, t ) ρ (cid:19) − T ρ ( x, y, t ) + T ( ρ ( x, y, t ) + w ) . Z ∞ Z π f ( r ) ρ ( r cos( θ ) + x, r sin( θ ) + y, t ) R ∞ R π f ( z )( ρ ( r cos( θ ) + x + z cos( φ ) , r sin( θ ) + y + z sin( φ ) , t ) + w ) dφdz dθdr ! . (S16)We linearize this equation about the stationary solution, ρ , obtaining an expression for the evolutionof a small perturbation in the population density: ∂h k,l ( t ) ∂t = h k,l ( t ) n T ( ρ + w ) Z ∞ Z π (cid:16) f ( r ) e i ( kr cos( θ )+ lr sin( θ )) R ∞ πf ( z )( ρ + w ) dz − ρ f ( r ) R ∞ πf ( z ) J (cid:16) √ k + l z (cid:17) e i ( kr cos( θ )+ lr sin( θ )) dz (cid:0)R ∞ πf ( z )( ρ + w ) dz (cid:1) (cid:17) dθdr + T ρ Z ∞ f ( r ) R ∞ f ( z )( ρ + w ) dz dr − g − T o (S17)where h k,l is the amplitude of a 2 D plane wave with wavevector components k, l and J is a Besselfunction of the first kind. For the 2D case, the exponential form of f ( r ) is given by f ( r ) = e − Rr , where r = p x + y ,the classic Euclidean distance. Substituting this into S17, we obtain an expression for the growthfunction Λ k,l : Λ k,l = T ρ (cid:0) k + l (cid:1) ( ρ + w ) ( k + l + R ) + R √ k + l + R − ! − g. (S18)The function Λ k,l depends only on the magnitude of the wavevector, p = √ k + l , and substituting p for k + l in Equation S18 results in an identical expression as that for Λ k , Equation S10. Hencethe critical value of the ratio between migration and growth above which cities are able to form isthe same for both the 1D and 2D Gravity models, given by Equation S13. The power law form of the deterrence function in 2D is given by f ( r ) = (1 + r ) − γ , as for the 1Dcase, but with r equal to the Euclidean distance. Substituting this into Equation S17 we find thatan analytical expression for Λ is not obtainable. Solving the resulting equation for the formation7f cities numerically, in analogy with the 2D model with an exponential f ( r ) , the critical curve ofEquation S13 is obtained. Summarising, in both 1 and 2 dimensions, if the population dynamics aredescribed by a logistic growth with individuals relocating according to a Gravity model cities willform if T > g and ρ (cid:29) w and this is independent of the deterrence function considered. The Intervening Opportunities model of human migration was hypothesised by Stouffer [1]. Hislaw states:
The number of persons going a given distance is directly proportional to the number ofopportunities at that distance and inversely proportional to the number of interveningopportunities.
According to this model, the probability per unit time of observing an individual at location i moving to location j is given by: P i ( j ) = [ ρ ( j ) + w ( j )] f Z B i ( r ij ) dz [ ρ ( z ) + w ( z )] ! , (S19)where B i ( r ij ) is a ball of radius | j − i | centred on i and f is a continuous function. The normalisationof Equation S19 over a domain D with total population N may be imposed as a condition on f : Z D djP i ( j ) = Z ∞ drr Z π dθ [ ρ ( r, θ ) + w ( r, θ )] f (cid:18)Z r dzz Z π dθ [ ρ ( z, θ ) + w ( z, θ )] (cid:19) = Z ∞ dr da ( r ) dr f ( a ( r ))= Z N daf ( a )= F (0) − F ( N ) = 1 . (S20)Here, a ( r ) = R r dzz R π dθ [ ρ ( z, θ ) + w ( z, θ )] and F ( a ) ≡ R ∞ a duf ( u ) is a decreasing function suchthat F (0) = 1 and F ( N ) = 0 . The deterrence function, F ( a ) may be either exponential, e − aR , asis the case for the Intervening Opportunities model, or power law ( a ) which corresponds to theRadiation model.For these models, the total number of travellers leaving from i per unit time is the same as for theGravity model, described by Equation S4. The total number of people moving to i per unit time isgiven by: T in ( i ) = Z D djT ( ρ ( j ) , w ( j )) ρ ( j ) P j ( i )= [ ρ ( i ) + w ( i )] Z D djT ( ρ ( j ) , w ( j )) ρ ( j )] f Z B i ( r ij ) dz [ ρ ( z ) + w ( z )] ! . (S21)These terms combine to give the general dynamic equation for the time evolution of the populationdensity as per Equation S1. 8 .1 Pattern formation and Growth in 1D The full 1D dynamic equation for the Intervening Opportunities and Radiation models is given by ∂ρ ( x, t ) ∂t = gρ ( x, t )[1 − ρ ( x, t ) ρ ] − T ρ ( x, t ) + T [ ρ ( x, t ) + w ] . Z ∞ (cid:20) ρ ( x − r, t ) f (cid:18)Z xx − r [ ρ ( z, t ) + w ] dz (cid:19) + ρ ( x + r, t ) f (cid:18)Z x +2 rx [ ρ ( z, t ) + w ] dz (cid:19)(cid:21) dr. (S22)Using the same method as used for the Gravity model, we write this equation for small fluctuationsin ρ ( x, t ) , denoted ˜ ρ ( x, t ) , by linearising about the uniform stationary state ρ : ∂ ˜ ρ ( x, t ) ∂t = − ( g + T ) ˜ ρ ( x, t ) + 2 T ρ ˜ ρ ( x, t ) Z ∞ f (2 r [ ρ + w ]) dr + T [ ρ + w ] Z ∞ f (2 r [ ρ + w ])[ ˜ ρ ( x − r, t ) + ˜ ρ ( x + r, t )]+ ρ f (2 r [ ρ + w ]) Z x +2 rx − r ˜ ρ ( z, t ) dz ! dr. (S23)Studying the stability of this equation to a small perturbation with specific wavenumber k , ie ˜ ρ ( x, t ) = h k ( t ) e ikx , we obtain a general expression for the growth function, Λ . Λ k = − ( g + T ) + T ρ ρ + w +2 T [ ρ + w ] (cid:20)Z ∞ cos ( kr ) f (2 r [ ρ + w ]) dr + ρ Z ∞ sin (2 kr ) f (2 r [ ρ + w ]) k dr (cid:21) . (S24)Introducing the rescaled variables: τ = T /g , µ = ρ /w , κ = k/w and λ = Λ /g , the aboveexpression becomes: λ κ ( µ , τ, f ) = − (1 + τ ) + µ µ + 1 + 2 τ [ µ + 1] . (cid:20)Z ∞ cos ( κz ) f (2 z [ µ + 1]) dz + µ Z ∞ sin (2 κz ) f (2 z [ µ + 1]) κ dz (cid:21) . (S25) For the Intervening Opportunities model, the deterrence function takes the form of an exponential; f ( a ) = Re − Ra . With this, Equation S25 is evaluated exactly to give λ ˜ κ ( µ , τ ) = − (1 + τ ) + τ µ µ + 1 − τ ( µ + 1) (cid:20) µ ˜ κ + ( µ + 1) − µ + 1)˜ κ + 4( µ + 1) (cid:21) , (S26)where ˜ κ = κ/R = k/ ( wR ) . The curve described by Equation S26 is shown in Figures 1d and 1e fordifferent values of the parameters. 9he function λ ˜ κ ( µ , τ ) has a maximum in ˜ κ m = 0 if µ ≤ / or in ˜ κ m = s − µ ( µ + 2) + 6 p µ ( µ + 1) − µ + 4 (S27)otherwise. The parameter ˜ κ m corresponds to a wavenumber k m = Rw ˜ κ m . If λ k m > , cities areable to form and k m is proportional to the number of cities per unit length. From the expressionfor k m we find that, in contrast to the Gravity model, as µ → ∞ (ie ρ (cid:29) w ), k m → ∞ . Thisimplies that for a fixed R , the density of cities will be greater in more populated countries.The condition for λ ˜ κ m > is found by inserting the expression for ˜ κ m into Equation S25 and solvingfor τ , to obtain an expression for τ c ; if τ > τ c , cities will emerge. τ c is given by: τ c = 3(5 µ + 11 µ + 7 µ + 4 p µ ( µ + 1) + 1)( µ + 1)(3 µ − . (S28)It should be noted that this equation is independent of R . Furthermore, if µ (cid:29) ( ρ (cid:29) w ), thecondition for growth reduces to τ > , or T > g . Here we may draw a parallel with the Gravitymodel; in both cases cities can only emerge if the migration rate is sufficiently higher than thegrowth rate. For the Radiation model, f ( a ) = 1 / (1 + a ) and, as with a power law deterrence function in theGravity model, the evaluation of Equation S25 is more complicated. By solving λ ˜ κ ( µ , τ, f ) = 0 numerically, we find that ˜ κ m = 0 if µ ≤ , and ˜ κ m > otherwise. λ ˜ κ m ( µ , τ ) = 0 is solvednumerically to obtain critical values of the parameters that will allow the emergence of cities.The results are shown in Figures 2c and 2d where it is seen that the critical curves for the InterveningOpportunities and Radiation models are not the same; in contrast to Gravity models, for InterveningOpportunities type models these curves do depend on the deterrence function used.A distinct contrast between the Gravity model and an Intervening Opportunities model is thevariation of k m with increasing density ρ . The value of k m may be related to the number of cities, C , that will grow from a small perturbation through the equation C = Lk m / π. (S29)For the Gravity model, as µ → ∞ , k m approaches unity, whereas for the Intervening Opportunitiesmodel, k m increases proportionally to µ . This result is displayed in Figure 3f (main text). Fromthis difference is it possible to determine which model of migration best describes patterns seen indata. If a population grows following a logistic equation and relocates according to an Intervening Op-portunities or Radiation model, an initial perturbation to the steady state population distributionwill result in the development of urban settlements if the following two conditions are met simulta-neously: 10
The average population is sufficiently large: ρ > C w , with C = 1/3 and 1 for the InterveningOpportunities and Radiation models respectively. • The population is sufficiently mobile:
T > C g with C = 3 and 5 for the Intervening Oppor-tunities and Radiation models respectively. The Intervening Opportunities model in 2 dimensions is studied by considering the equation: ∂ρ ( x, y, t ) ∂t = gρ ( x, y, t ) (cid:18) − ρ ( x, y, t ) u (cid:19) − T ρ ( x, y, t ) + T ( ρ ( x, y, t ) + w ) . Z ∞ Z π ρ ( x + r cos( θ ) , y + r sin( θ ) , t ) .f (cid:20)Z r Z π ( ρ ( x + r cos( θ ) + z cos( φ ) , y + r sin( θ ) + z sin( φ ) , t ) + w ) dφdz (cid:21) dθdr ! . (S30)This is analogous to the equation in 1 dimension, Equation S22, however now we consider theopportunities within a circle of radius r , with r being the Euclidean distance between the originand destination.Performing linear stability analysis on the above equation, with a perturbation to the stationarydistribution of the form e i ( kx + ly ) , we find that the conditions for growth of cities are unchangedfrom the 1-dimensional case if the 1 dimensional wavenumber k is replaced by the magnitude of the2 dimensional wavevector p .This result is demonstrated in Figures 3a and 3b where all parameters are equal to those used togenerate Figures 1d and 1e respectively, for the 1D case. Here, ˜ p = p /R . We simulate both the Gravity and Intervening Opportunities models using numerical methods inorder to assess the accuracy of the analytical results derived above.
To simulate the Gravity model in 1-dimension, we evolve the dynamic Equation S6 in time startingwith an initial distribution at t = 0 of small random perturbations about the steady state solution ρ . The integrals are evaluated using the trapezoidal rule. By varying the model parameters, thevalidity of Equation S13 describing the condition for city formation and the relationship betweenthe wavenumber and the number of cities, Equation S29, is determined. Unless otherwise stated,we rescale variables by setting growth rate and resources, g and w respectively, to equal unity; thisallows for the model to be specified with 3 variables rather than 5.On running simulations with fixed parameters ρ = 80 , R = 0 . and varying the migration parameter T between and , we find that Equation S13 holds; we can predict whether an initial perturbation11ill grow or decay based on the ratio between the migration and growth rate. We can also simulatethe model with fixed parameters T = 10 and R = 1 . and increasing carrying capacity ≤ ρ ≤ .From this we find that increasing the carrying capacity increases the size of cities however the numberof cities remains constant, determined by Equation S29. The results of this simulation are displayedin Figure 4. In order to numerically simulate the Intervening Opportunities model, we make an adjustment tothe expression for the probability of observing an individual at i moving to j , originally formulatedin Equation S19. Rather than using the probability density function f ( a ) , we use the cumulativedistribution function, F ( a ) . On doing this, we write the dynamic equation, analogous to EquationS22, as: ∂ρ ( x i ) ∂t = gρ ( x i )(1 − ρ ( x i ) ρ ) − T ρ ( x i ) + T X j P ij = gρ ( x i )(1 − ρ ( x i ) ρ ) − T ρ ( x i ) + T. X j (cid:20) F ( a ij ) − F ( a ij + ρ j + w j ) F ( ρ i + w i ) − F ( N ) (cid:21) . ρ j + w j P k ∈ R ij ρ k + w k (S31)where R ij is the set of locations on the ring of radius r ij , centered on i , therefore the summation: X k ∈ R ij ρ k + w k (S32)is a sum over all locations k at the same distance as j from the origin i . We define D ij such thatit represents the set of locations within the disc of radius r ij centered in i and the opportunitiesbetween but not including locations i and j may be written: a ij = X k ∈ D ij ρ k + w k . (S33)Using this formulation, the dynamical equation corresponding to a population relocating accordingto an Intervening Opportunities model is simulated.We run simulations for fixed parameters R and ρ and varying T between and ; finding thatEquation S28 successfully determines whether cities will form from a perturbation about a stationarydistribution. Simulations are also carried out for fixed T and R whilst varying ρ , the results ofwhich are shown in Figure 5. From this we find that the number of cities increases with ρ foran Intervening Opportunities model as expected from Equations S27, S29 and Figure 3f (maintext). From both the analytical results and those obtained from simulations, we note that the main contrastbetween the Gravity and Intervening Opportunities models is their behaviour with increasing ρ ,12pecifically how the number of cities changes due to a change in the steady state population density.From the analysis of the Gravity model, we find that the number of cities does not vary with ρ ;this is shown in Equation S12 and supported by Figure 4. In comparison to this, for the InterveningOpportunities model, we find that increasing ρ also results in an increase in the number of cities.For this model, the expression for k m suggests a linear relationship between ρ and the numberof cities, when all other parameters are fixed, and this is supported for low values of ρ by Figure5.The dynamic equation, S1, used for each model only differs in the migration term; both modelsfollow a logistic growth. As a result, by setting this growth to equal zero the differences betweenGravity driven migration and migration driven by an Intervening Opportunities model may beeffectively analysed. Setting the growth to zero to study the number of peaks is also justified bythe fact that the number of peaks is proportional to k m ; for both models this is independent of thevalue of g . ρ Figures 6a and 6c demonstrate the difference between the two models noted above. Here we simulateboth models with an identical initial distribution and equal parameters except for R , the value ofwhich is specific to the model being used. It is seen that as ρ is increased the final distribution fromthe Gravity model displays peaks of increasing height where as that for the Intervening Opportunitiesmodel displays an increasing number of peaks alongside a less significant increase in height. The fulltime evolution of the initial perturbation is displayed in Figures 7a,c,e (Gravity model) and 7b,d,f(Intervening Opportunities model). In Figures 6b and 6d we demonstrate the effect of a small growth on the final distributions for theGravity and Intervening Opportunity models respectively. We observe that the effect of a non-zerogrowth rate on the final distribution of cities is that all cities evolve to approximately the samesize; ρ . This is a result of the logistic form of growth, g · (1 − ρ ( x, t ) /ρ ) ; as ρ ( x, t ) approaches ρ ,the growth rate tends to zero, all cities have reached the same size and have equal opportunities,therefore there is also no further migration. The full time evolution of the initial perturbation isdisplayed in Figures 8a,c,e (Gravity model) and 8b,d,f (Intervening Opportunities model). Figure 9 displays the result from 2D a simulation of an initially random distribution of populationwith a logistic growth and individuals relocating according to a Gravity model with an exponentialdeterrence function. The dynamics for this simulation are described by Equation S16. From thiswe observe that the initial population distribution, randomised about the value ρ = 25 , convergeson a stationary state with 6 cities of varying population. Figure 10 displays the correspondingevolution for individuals relocating according to an Intervening Opportunities model, described byEquation S30, when the initial distribution is randomised about ρ = 25 . From this it is evident13hat the distribution reaches a steady state with 30 cities of varying size. Both simulations wereperformed with fixed parameters L = 15 , T = 30 , R = 0 . (Gravity model) and R = 0 . (InterveningOpportunities model).In Figure 11 we display the final stationary distributions of cities obtained from simulations of boththe Gravity model (red squares) and Intervening Opportunities model (green triangles) for initialdistributions randomised about increasing values of ρ over the range [10,50]. From this we observethat, in correspondence with the 1-dimensional simulations, the number of cities for the Gravitymodel remains fixed and constant with increasing ρ where as for the Intervening Opportunitiesmodel the number of cities increases with ρ . As the initial distribution of population is randomisedabout ρ this result demonstrates that if individuals relocate according to a Gravity model thenumber of cities is unaffected by the total population of the region. In contrast, if it is an InterveningOpportunities model by which people relocate then the number of cities within a region will increasewith the total population of that region. Here we present the analysis of data sets for European countries and states within India, withthe goal of assessing the validity of the two laws of urbanisation stated in the main text: therelationship between total population and the number of cities within a region (first law, Eq. 1),and the relationship between population density and average distance between cities (second law,Eq. 2).Figure 12a displays the relationship between the number of cities with over 5,000 inhabitants ( y axis)for each individual country within Europe and the country’s total population ( x axis), populationdensity (colour) and area (size). From this we see that there is strong linear correlation (0.92)between the number of cities with more than 5,000 inhabitants and total population of a country,whilst there is a weak correlation between number of cities and area (0.53) and no correlation betweenthe number of cities and population density (-0.1), thus supporting the first law of urbanisation.Further support for the first law comes from the analysis of data from India; Figure 13a displaysthe same results as 12a for states within India; the corresponding correlation coefficients betweennumber of cities and total population (0.92), area (0.74), and density (-0.1) confirm the validity ofthe first law.In 12c and 13c we present further evidence for the validity of Eq. 1 (main text); for countriesor states with a total population N , the number of cities with more than X inhabitants, with X ranging from 5,000 to 5,000,000, is a linear function of N/X . The implication of this result is thatthe number of cities of size X within a country or state may be fully determined by the populationof the region, N .In Figures 12b and 13b we display the average distance to the closest city with more than X =5,000inhabitants for countries in Europe and more than X =1,000 inhabitants for states in India respec-tively. The strong linear relationship for both data sets provides further support for Eq. 2 (maintext); the average distance between cities of population X depends on both X and the populationdensity of the region considered.From 13b we observe that the Indian state of Goa lies away from the line; 13d shows that this statehas a significantly smaller area than other states while 13a displays that, given the total population N , there are more cities than expected by the first law of urbanisation. From this we can conclude14hat the region has a higher than average number of cities given its population and a smaller thanaverage area, thus explaining why the average distance between cities is lower than expected.The above analysis is displayed for states within the US in Figure 14 providing further support forthe first and second law.Figure 15 displays the average distance to the closest city with population larger than X rescaledby the square root of X h d c i X − . ( y axis) vs population density N/A ( x axis) for states withinthe US, countries within Europe and states within India ( X ranges from 1,000 to 1,000,000). Allpoints collapse on the line y = x − / as prescribed by the second law of urbanisation (Eq. 2 maintext) indicating that the average distance between cities with more than X inhabitants scales asthe inverse root of the population density (ie. N/A ) of the region being considered.In Figure 16 we display the correlations between total population N , area A , population density ρ and the number of cities with more than X = 5 , inhabitants (black) and X = 50 , inhabitants(red) for states within the US, countries in Europe, states within India. The total population N and the number of cities are the most strongly correlated variables for all regions and both valuesof X , demonstrating the validity of the first law of urbanisation (Eq. 1 main text).The relatively high correlation between number of cities and area is due to the correlation betweenthe total population of a region and its area; a region with a large area can easily accommodatea large population. The correlations between total population and area for the US, European andIndian regions are displayed in Figure 17. We now consider a stochastic model of population dynamics which combines the models of migrationwith Gibrat’s proportionate random growth. Such an approach is expected to reproduce both thelaws of urbanisation and Zipf’s law for city sizes.For this approach, we model the population as individuals rather than a continuous density. Spaceis discretised into cells of size ( L x /N x ) × ( L y /N y ) ; L x and L y are the sizes of the 2-dimensionalregion being considered and N x , N y are the number of cells in each dimension. For simplicity, wewill start by only considering square regions with L x = L y = L , and N x = N y = N , thus the totalnumber of cells is N and each has an area of ( L/N ) ≡ dl . Each cell may be specified by a pair ofcoordinates ( x, y ) with ≤ x < N and ≤ y < N . The population of cell ( x, y ) at time t is givenby n ( x, y, t ) .The population in any cell may change in two ways; natural increase and migrations. Here, naturalincrease refers to the difference between the number of births and the number of deaths within a celltherefore it may be both positive or negative. Migrations refers to any person relocating betweencells; both inward migrations and outward migrations must be accounted for.The mechanism that dictates the natural increase in the population of a cell is proportionate randomgrowth, the growth mechanism known to produce Zipf’s Law. If we consider the change in thepopulation of cell ( x, y ) due to births and deaths between time t and t + dt , calling this change δ g ( x, y, t ) , then by the rule of proportionate random growth we have δ g ( x, y, t ) = ξ ( t ) n ( x, y, t ) .Here, ξ ( t ) is a normally distributed random variable with mean µ and variance σ . In order forthe resulting distribution of city sizes to be a power-law [2], µ must be slightly negative, σ must be15mall (of order ≈ n min = n .We model relocations between cells using either a Gravity or an Intervening Opportunities model.To implement this, we define a probability of migration, T , which corresponds to the probability thatan individual in cell ( x, y ) will relocate to any other cell between time t and t + dt . The total numberof outgoing migrants from cell ( x, y ) in time dt , m ( x, y, t ) is extracted from a binomial distributionwith migration probability T = 0 . . We then relocate these outgoing migrants according to theGravity or Intervening Opportunities model as described in the following sections. If individuals relocate according to a Gravity model, the probability of a migrant relocating from ( x, y ) to ( i, j ) is given by P ( x, y → i, j ) = F ( r ( x, y → i, j ))[ n ( i, j, t ) + w ( i, j, t )] P i,j F ( r ( x, y → i, j ))[ n ( i, j, t ) + w ( i, j, t )] (S34)where r ( x, y → i, j ) is the euclidean distance between locations ( x, y ) and ( i, j ) . The deterrencefunction F is assumed to be exponential, e − rR . According to the Intervening Opportunities model, the probability of a migrant relocating from ( x, y ) to ( i, j ) is given by P ( x, y → i, j ) = (cid:20) F ( A x,y → i,j ) + F ( A x,y → i,j + A i,j ) F ( n ( x, y, t ) + w ( x, y )) − F ( N ) (cid:21) . n ( i, j, t ) + w ( i, j ) A i,j (S35)where A i,j corresponds to the sum of the population and opportunities in all cells at an equaldistance from ( x, y ) as ( i, j ) and F is the deterrence function that we assume to be exponential, e − AR . A x,y → i,j is the sum of the population and resources of all intervening cells; all cells that lie ata distance from ( x, y ) that is less than the distance to ( i, j ) , including the population and resourcesof cell ( x, y ) itself.Using Equations S34 or S35, we compute the probability of relocation to all cells according to themodel in question. The number of migrants moving between any two locations are then extractedfrom a multinomial distribution. The number of incoming migrants to cell ( x, y ) in between time t and t + dt , δ in ( x, y, t + dt ) , is given by the sum of the outgoing migrants from all other cells thathave destination ( x, y ) . We denote this sum Φ( x, y, t ) .Using this framework, we simulate the population dynamics. In a time dt , the population in eachcell will change due to natural increase, outgoing migrants and incoming migrants. The populationof a cell ( x, y ) after dt is therefore given by: n ( x, y, t + dt ) = n ( x, y, t ) + δ g ( x, y, d ) − δ out ( x, y, d ) + δ in ( x, y, d )= n ( x, y, t ) + ξ ( t ) n ( x, y, t ) − m ( x, y, t ) + Φ( x, y, t ) . (S36)16e start our simulation at time t = 0 from an initial condition where the population of each cellis drawn from a uniform distribution within the range n ≤ n ≤ . · n , where n is the minimumpopulation of a cell.After each time step, dt , the state of the system is assessed by counting the total number of citiesand their populations. Cities correspond to clusters of population and may therefore be defined asadjacent populated geographical spaces (cells) [3]. We define a cluster as a set of adjacent cells forwhich the Manhattan distance of a single cell from the closest cell that is a member of the clusterhas a maximum of 1. This satisfies the adjacency condition and is equivalent to a condition on thecell edges; any cell that is a member of a cluster must share at least one edge with another cellwithin the same cluster. Alongside this, the population of every cell within the cluster must begreater than a minimum value X to ensure that all cells are populated.Results of simulations with increasing n are displayed in Figure 18. Figures 18g and 18h displaythe time evolution of the number of clusters with T = 0 . , X = 2000 , µ = − . , σ = 0 . , L = 60 and N = 60 computed according to the above algorithm, for the Intervening Opportunities andGravity models respectively. From 18g, we observe that, when individuals relocate according to anIntervening Opportunities model the number of clusters (cities) at any given time increases with n .Conversely, after an initial transient, the number of clusters is independent of n when individualsrelocate according to a Gravity model (18h).In Figures 19 and 20 we display the variation in the average total population with simulation timefor the Gravity and Intervening Opportunities models respectively. The shaded regions correspondto the values of total population that fall within the 25th and 75th percentiles at each step. Forboth models we observe that simulations with larger n also have a larger total population at alltime steps, implying that the total population is an increasing function of n .Figures 21 and 22 display the average CCDF of cluster population for increasing n . The grey linerepresents the Zipf distribution. Both models of migrations successfully produce clusters with apopulation distribution fitted by Zipf’s Law, however while the Intervening Opportunities modelhas a distribution that is independent of n , the Gravity model’s distribution shifts to the right as n increases. The linear stability analysis (sections 2 and 3) performed on both models of migration is only validfor small perturbations about the stationary distribution. With regards to the simulations, thismeans that the conditions for the number of cities (Equation S29) are only valid during the initialstages of the simulation; at later times the effects of the non-linearities present in Equation S1become more pronounced.From the linear stability analysis it was concluded that the growth and migration rates affectwhether or not cities may form from the initial perturbation (Equations S13 and S28) however thenumber of cities that form depend on the parameters ρ , R and w only. Numerical simulationsindicate that on longer time scales, the number of cities that are present also depends the growthrate g and the migration parameter T . In particular the number of cities is constant for a givenratio T /g = τ , however if the ratio is varied, by changing either g or T , the number of cities alsochanges. 17e display these results in Figure 23 where it is observed that the number of cities present after along time period increases with g for both the Gravity model (a-c) and the Intervening Opportunitiesmodel (d-f). If the migration parameter T is increased, the number of cities decreases.In order to find empirical confirmation to the model’s prediction, we would need a complete seriesof historical data on both growth and migration rates. While it is possible to estimate the growthrate of each state within the US since 1790 using census data, we were not able to determine themigration rates of the states prior to 1940, the first year in which the US census included questionsabout people’s mobility [4]. Due to the lack of historical data on migration rates prior to 1940we are unable to estimate the ratio T /g of each state, hence precluding the possibility to test thepresence of a correlation between the ratio
T /g and the number of cities.
References [1] S. A. Stouffer, “Intervening opportunities: a theory relating mobility and distance,”
Americansociological review , vol. 5, no. 6, pp. 845–867, 1940.[2] X. Gabaix, “Zipf’s law for cities: an explanation,”
Quarterly journal of Economics , pp. 739–767,1999.[3] H. D. Rozenfeld, D. Rybski, J. S. Andrade, M. Batty, H. E. Stanley, and H. A. Makse, “Laws ofpopulation growth,”
Proceedings of the National Academy of Sciences , vol. 105, no. 48, pp. 18702–18707, 2008.[4] C. Potter, “New questions in the 1940 census,”
Prologue-Quarterly of the National Archives andRecords Administration , vol. 42, no. 4, pp. 46–52, 2010. κ − − Λ κ , G ( µ = , τ ) a κ − . − . − . − . . Λ κ , G ( µ , τ = . ) b κ − . − . − . − . . . Λ κ , G ( µ , τ c ) c κ/R − . − . − . − . . . . . Λ κ , I O ( µ = . , τ ) d κ/R − . − . − . − . . . . . . Λ κ , I O ( µ , τ = . ) e κ/R − . − . − . − . . . Λ κ , I O ( µ , τ c ) f Figure 1: a The function λ κ ( µ , τ ) from Equation S11 with fixed µ =10 and varying τ : 1,3,10,20.The red line corresponds to τ c = 5.4321, as described by Equation S13. b The function λ κ ( µ , τ ) from Equation S11 with fixed τ =5.4321 and varying µ : 0.5, 1, 4, 40. The red line correspondsto µ =10 for which τ = 5.4321 is critical, as described by Equation S13. c The function λ κ ( µ , τ ) from Equation S11 at the critical value τ c defined in Equation S13 for different µ : 0.5(green),2(red), 5(yellow), 10(blue). If µ < λ κ is in κ m = 0. As µ grows, the maximumapproaches the limiting value of κ m = 1. d The function λ ˜ κ ( µ ,τ ) from Equation S26 with fixed µ =0.65 and varying τ : 1,5,20,100. The red line corresponds to τ c = 46.03, as described by EquationS28. e The function λ ˜ κ ( µ ,τ ) from Equation S26 with fixed τ =46.03 and varying µ : 1/4, 1/3, 1/2,1. The red line corresponds to µ = 0.65, for which τ = 46.03 is critical (Equation S28). f Thefunction λ ˜ κ ( µ ,τ ) from Equation S26 at the critical value τ c defined in Equation S28 for different µ :0.25(green), 0.5(red), 1(yellow), 1.5(blue). If µ < λ κ is in κ m = 0. As µ grows, the maximum κ m increases indefinitely. 19
20 40 60 80 100 ρ /w T / g CitiesFlat a ρ /w T / g CitiesFlat b ρ /w T / g CitiesFlat c ρ /w T / g CitiesFlat d Figure 2: a Gravity Model - exponential f(r): Rescaled parameter space ( µ , τ ) described byEquation S13. Above the critical curve the formation of cities is possible whilst below it, anyperturbations to the steady state distribution will decay back to ρ and the landscape will be flat. b Gravity Model - power law f(r): Rescaled parameter space ( µ , τ ) described by Equation S13. Bycomparison with a we demonstrate that for the Gravity model the particular deterrence functionhas no effect on the condition for cities to form. c Intervening Opportunities Model: Rescaledparameter space ( µ , τ ) described by Equation S28. From comparison with a and b it can be seenthat for an Intervening Opportunities model cities may form when the ratio between the migrationrate and growth rate is lower than that required for Gravity models. d Radiation Model: Rescaledparameter space ( µ , τ ). By comparison with c it can be seen that if people relocate according toa Radiation model, the ratio between the migration rate and growth rate must be higher than thatfor an Intervening Opportunities model in order for cities to form.20 p/R − . − . − . − . . . . . . Λ p ( µ = . , τ ) a p/R − . − . − . − . . . . . . Λ p ( µ , τ = . ) b Figure 3: a The function λ ˜ p ( µ , τ ) for fixed µ = 0.65 and varied τ = 1,5,20,100. The red linecorresponds to τ c = 46.03, the same as the critical value for the 1D case. b The function λ ˜ p ( µ , τ ) for fixed τ = 46.03 and varied µ = 1 /4, 1 /3,1 /2,1. The red line corresponds µ = 0.65, for which τ = 46.03 is critical. x ρ ( x ) ρ = 10 . . . . . . . . x ρ ( x ) ρ = 20 . . . . . . . . x ρ ( x ) ρ = 30 . . . . . . . .
60 2 4 6 8 10 x ρ ( x ) ρ = 40 . . . . . . . . x ρ ( x ) ρ = 50 . . . . . . . . x ρ ( x ) ρ = 60 . . . . . . . .
60 2 4 6 8 10 x ρ ( x ) ρ = 70 . . . . . . . . x ρ ( x ) ρ = 80 . . . . . . . . x ρ ( x ) ρ = 90 . . . . . . . . Figure 4: Gravity Model: Increasing ρ , fixed parameters: T =10, R =1.5.21 x ρ ( x ) ρ = 10 . . . . . . . . . . . x ρ ( x ) ρ = 20 . . . . . . . . . . . x ρ ( x ) ρ = 30 . . . . . . . . . . .
00 2 4 6 8 10 x ρ ( x ) ρ = 40 . . . . . . . . . . . x ρ ( x ) ρ = 50 . . . . . . . . . . . x ρ ( x ) ρ = 60 . . . . . . . . . . .
00 2 4 6 8 10 x ρ ( x ) ρ = 70 . . . . . . . . . . . x ρ ( x ) ρ = 80 . . . . . . . . . . . x ρ ( x ) ρ = 90 . . . . . . . . . . . Figure 5: Intervening Opportunities Model: Increasing ρ , fixed parameters: T =10, R =0.5.22 ρ , popu l a t i on a ρ = 4 ρ = 20 ρ = 100 ρ , popu l a t i on c ρ = 4 ρ = 20 ρ = 100 ρ , popu l a t i on b ρ = 4 ρ = 20 ρ = 100 ρ , popu l a t i on d ρ = 4 ρ = 20 ρ = 100 Figure 6: The final distribution obtained from a Gravity model, with parameters T =10, R =1.5,g=0 b Gravity model, with parameters T =10, R =1.5, g=1 c Intervening Opportunities model, withparameters T =10, R =0.1, g=0 d Intervening Opportunities model, with parameters T =10, R =0.1,g=1. ρ is varied between 4 and 100 to demonstrate the relationship between the number of peaksand the steady state population density for each model.23
20 40 60 80 100 x ρ ( x ) a ρ = 4 x ρ ( x ) c ρ = 20 x ρ ( x ) e ρ = 100 x ρ ( x ) b ρ = 4 x ρ ( x ) d ρ = 20 x ρ ( x ) f ρ = 100 Figure 7: a, c, e
Gravity Model: Increasing ρ over the range 4, 20, 100 with fixed parameters T =10, R =1.5, g=0. It can be seen that the distributions are identical; the effect of increasing ρ increases the height of the peaks only. The corresponding final states are displayed in Figure 6a. b,d, f Intervening Opportunities Model: Increasing ρ over the range 4, 20, 100 with fixed parameters T =10, R =0.1, g=0. It can be seen that the effect of increasing ρ is to increase the density of peaksalong with the maximum height; the increase in height is not as great as that for the Gravity modelas the total population is conserved. The corresponding final states are displayed in Figure 6c.24
20 40 60 80 100 x ρ ( x ) a ρ = 4 x ρ ( x ) c ρ = 20 x ρ ( x ) e ρ = 100 x ρ ( x ) b ρ = 4 x ρ ( x ) d ρ = 20 x ρ ( x ) f ρ = 100 Figure 8: a, c, e
Gravity Model: Increasing ρ over the range 4, 20, 100 with fixed parameters T =10, R =1.5, g=1. It can be seen that the distributions are identical; the effect of increasing ρ increases the height of the peaks only. The corresponding final states are displayed in Figure 6b. b,d, f Intervening Opportunities Model: Increasing ρ over the range 4, 20, 100 with fixed parameters T =10, R =0.1, g=1. It can be seen that the effect of increasing ρ is to increase the density of peaksalong with the maximum height. The corresponding final states are displayed in Figure 6d.25 Figure 9: Gravity model: time evolution of an initial perturbation about ρ = 25 with fixedparameters L = 15 , T = 30 and R = 0 . . The spatial distribution of population converges to astationary state of 6 cities. 26 Figure 10: Intervening Opportunities model: time evolution of an initial perturbation about ρ = 25 with fixed parameters L = 15 , T = 30 and R = 0 . . The spatial distribution of population convergesto a stationary state of 30 cities. 27 IO: 12 gravity: 6 = IO: 25 gravity: 6 = IO: 34 gravity: 6
Figure 11: The final stationary distribution of cities when individuals relocate according to a Gravitymodel (red) and an Intervening Opportunities model (green) for increasing values of ρ = 10 , , with fixed parameters L = 15 , T = 30 , R = 0 . (Gravity model) and R = 0 . (InterveningOpportunities model). It is observed that the final number of cities remains constant with increasing ρ if individuals relocate according to a Gravity model whereas the number of cities increases with ρ if individuals relocate according to an Intervening Opportunities model.28 N , country population C ( N , k ) , nu m be r o f c i t i e s Europe a D en s i t y [ k m − ] N/X C ( N , X ) , nu m be r o f c i t i e s X c N C ( N , X ) − − − − N/A , Country population density − h d c i , a v e r aged i s t an c e t o t he c l o s e s t c i t y b ( N/A ) − / − − − − − − − − − − − − /A − − h d i , a v e r aged i s t an c ebe t w een c i t i e s d (1 /A ) − / Figure 12: a Number of cities with population greater than 5,000 ( y axis) vs country population( x -axis), density (colour) and area (marker size) for countries in Europe (excluding Russia). Thestrong linear correlation between the number of cities and the country population supports the firstlaw of urbanisation. b Number of cities with population larger than X ( y axis) vs the ratio N/X ( x axis); the ratio of the total population to minimum city population for countries in Europe. X ranges from 5,000 to 5,000,000; as all points collapse on a single line we can conclude that the firstlaw holds over several magnitudes of X . c The average distance to the closest city with populationgreater than 1,000 ( y axis) vs population density ( x axis) for countries in Europe. d The averagedistance between cities with population greater than 1,000 ( y axis) vs the inverse of the area ( x axis) for countries in Europe. 29 N , state population − C ( N , k ) , nu m be r o f c i t i e s India a D en s i t y [ k m − ] N/X C ( N , X ) , nu m be r o f c i t i e s X b N C ( N , X ) − − − N/A , State population density h d c i , a v e r aged i s t an c e t o t he c l o s e s t c i t y c ( N/A ) − / − − − − − − /A h d i , a v e r aged i s t an c ebe t w een c i t i e s d (1 /A ) − / Figure 13: a Number of cities with population greater than 5,000 ( y axis) vs state population( x -axis), density (colour) and area (marker size) for states in India. The strong linear correlationbetween the number of cities and the state population supports the first law of urbanisation. b Number of cities with population larger than X ( y axis) vs the ratio N/X ( x axis); the ratioof the total population to minimum city population for states in India. X ranges from 5,000 to5,000,000; as all points collapse on a single line we can conclude that the first law holds over severalmagnitudes of X . c The average distance to the closest city with population greater than 1,000 ( y axis) vs population density ( x axis) for states in India. The state of Sikkim was removed from thedata set as it only has two cities of population greater than 5,000. d The average distance betweencities with population greater than 1,000 ( y axis) vs the inverse of the area ( x axis) for states inIndia. 30 N , state population C ( N , k ) , nu m be r o f c i t i e s United States a D en s i t y [ k m − ] N/X C ( N , X ) , nu m be r o f c i t i e s X c N C ( N , X ) − − − − N/A , state population density h d c i , a v e r aged i s t an c e t o t he c l o s e s t c i t y b ( N/A ) − / − − − − − − − /A h d i , a v e r aged i s t an c ebe t w een c i t i e s d (1 /A ) − / Figure 14: a Number of cities with population greater than 5,000 ( y axis) vs state population( x -axis), density (colour) and area (marker size) for states in the US. The strong linear correlationbetween the number of cities and the state population supports the first law of urbanisation. b Number of cities with population larger than X ( y axis) vs the ratio N/X ( x axis); the ratio ofthe total population to minimum city population for states in the US. X ranges from 5,000 to5,000,000; as all points collapse on a single line we can conclude that the first law holds over severalmagnitudes of X . c The average distance to the closest city with population greater than 1,000 ( y axis) vs population density ( x axis) for states in the US. d The average distance between cities withpopulation greater than 1,000 ( y axis) vs the inverse of the area ( x axis) for states in the US.31 − N/A , state population density − − − − − h d c i ∗ X − . , a v e r aged i s t an c e t o t he c l o s e s t c i t y X a ( N/A ) − / N/A h d c i − − N/A , country population density − − − h d c i ∗ X − . , a v e r aged i s t an c e t o t he c l o s e s t c i t y X b ( N/A ) − / N/A h d c i − N/A , state population density − − − h d c i ∗ X − . , a v e r aged i s t an c e t o t he c l o s e s t c i t y X cc ( N/A ) − / N/A h d c i Figure 15: The average distance to the closest city, h d c i X − . ( y axis) vs population density N/A ( x axis) for a states within the US, b countries within Europe and c states within India. X rangesfrom 1,000 to 1,000,000; as all points collapse on a single line we can conclude that the second lawholds over several magnitudes of X . C ( X ) , C i t i e s United States a X = 5 k,r = 0 . X = 50 k,r = 0 . b r = 0 . r = 0 . − c r = − . r = − . C ( X ) , C i t i e s Europe d X = 5 k,r = 0 . X = 50 k,r = 0 . e r = 0 . r = 0 . f r = − . r = − . N,TotalPopulation C ( X ) , C i t i e s India g X = 5 k,r = 0 . X = 50 k,r = 0 . A,Area ( km ) h r = 0 . r = 0 . ρ,PopulationDensity ( km − ) i r = − . r = − . Figure 16: Correlation between total population N , area A , population density ρ and the numberof cities with more than X = 5 , inhabitants (black) and X = 50 , inhabitants (red) for a-c states within the US, d-f countries in Europe, g-i states within India.32 TotalPopulation,N A r e a , A United States a r = 0 . TotalPopulation,N A r e a , A b Europe r = 0 . TotalPopulation,N A r e a , A c India r = 0 . Figure 17: Correlation between total population, N , and area, A for a states within the US, b countries within Europe, c states within India N c ( X = ) - G r a v i t y M o d e l h n = 50 n = 200 n = 600 N c ( X = ) - I O M o d e l g n = 50 n = 200 n = 600 d n = 50 e n = 200 f n = 600 a n = 50 b n = 200 c n = 600 Figure 18: The final distribution of cities obtained from a set of stochastic simulations with pop-ulation relocating according to a-c an Intervening Opportunities model and d-f a Gravity modelwith exponential deterrence functions and fixed probability of migration T = 0 . and increasinginitial population n and parameters X = 2000 , µ = − . , σ = 0 . , L = 60 and N = 60 . Theevolution of the average number of clusters in each system, computed from 16 realisations of eachset of parameters, is displayed for g an Intervening Opportunities model and h a Gravity model.It is observed that the average (median) number of clusters increases with n when individualsrelocate according to an Intervening Opportunities model whilst it tends to the same value for all n when individuals relocate according to a Gravity model. Error bars correspond to the 10th and90th percentile. 33
100 200 300 400 500
Time step T o t a l P o p u l a t i o n n =50 n =200 n =600 Figure 19: Total population vs time - Gravity model with and exponential deterrence function,increasing n = 50,200,600 and fixed parameters R = 1 . , T = 0 . , X = 2000 , µ = − . , σ = 0 . , L = 60 and N = 60 . The central lines within each region correspond to the mean total population ateach time step where as the shaded area represents the region between the 25th and 75th percentiles.34
100 200 300 400 500
Time step T o t a l P o p u l a t i o n n =50 n =200 n =600 Figure 20: Total population vs time - Intervening Opportunities model with an exponential deter-rence function, increasing n = 50, 200, 600 and fixed parameters R = 0 . , T = 0 . , X = 2000 , µ = − . , σ = 0 . , L = 60 and N = 60 . The central lines within each region correspond to themean total population at each time step where as the shaded area represents the region betweenthe 25th and 75th percentiles. n, population -3 -2 -1 CC D F ( n ) n =50 n, population -3 -2 -1 CC D F ( n ) n =200 n, population -3 -2 -1 CC D F ( n ) n =600 Figure 21: The average CCDF obtained from Equation S36 with individuals relocating according toa Gravity model. The grey line is a guide for the eye depicting Zipf’s law,
P > ( n ) ∼ /n . Symbolsat each n denote the average fraction of clusters with population larger than n over 16 realisationsof the numerical simulation. Error bars correspond to the 25th and 75th percentiles and coloursindicate the time steps in the simulations. 35 n, population -3 -2 -1 CC D F ( n ) n =50 n, population -3 -2 -1 CC D F ( n ) n =200 n, population -3 -2 -1 CC D F ( n ) n =600 Figure 22: The average CCDF obtained from Equation S36 with individuals relocating accordingto an Intervening Opportunities model. The grey line is a guide for the eye depicting Zipf’s law,
P > ( n ) ∼ /n . Symbols at each n denote the average fraction of clusters with population largerthan n over 16 realisations of the numerical simulation. Error bars correspond to the 25th and 75thpercentiles and colours indicate the time steps in the simulations. x ρ ( x ) g = 0 . a x ρ ( x ) g = 0 . b x ρ ( x ) g = 1 c x ρ ( x ) g = 0 . d x ρ ( x ) g = 0 . e x ρ ( x ) g = 1 f Figure 23: The long-time evolution of the distribution of cities with increasing g = 0 . , . , andfixed parameters L = 100 , N = 100 , ρ = 20 , T = 10 for a-c a Gravity model ( R = 1 ) and d-f anIntervening Opportunities model ( R = 0 . ). It is seen that the number of cities after a long periodof time increases with g . This corresponds to an increase in cities with decreasing ττ