A Rank-Based Approach to Zipf's Law
Ricardo T. Fernholz, Robert Fernholz
August 27, 2018
Abstract
An Atlas model is a rank-based system of continuous semimartingales for which the steady-state values of the processes follow a power law, or Pareto distribution. For a power law, the log-log plot of these steady-state values versus rank is a straight line. Zipf’s law is a power law for which the slope of this line is −1. In this note, rank-based conditions are found under which an Atlas model will follow Zipf’s law. An advantage of this rank-based approach is that it provides information about the dynamics of systems that result in Zipf’s law.
Introduction
A family of random variables follows a power law, or Pareto distribution, if the log-log plot of their values versus rank forms (approximately) a straight line. The random variables follow Zipf’s law if the slope of this line is −1. Newman (2006) and Gabaix (2009) both present surveys of many different power laws observed in the real world. A characterization of conditions that result in Zipf’s law for the population of cities is presented in Gabaix (1999), and this characterization is based on the idea that under a stable distribution the expected change in the population of each individual city is zero, at least when the city is away from a reflecting lower barrier.

In the setting of Atlas models and other systems of rank-based continuous semimartingales (see Fernholz (2002)), we examine the conditions that give rise to Zipf’s law and consider several generalizations that are common in the real world. We shall find that this new setting is natural for an understanding of Zipf’s law and provides insight into the dynamics involved.
Atlas models
An Atlas model is a family of positive continuous semimartingales $X_1, \ldots, X_n$, with $n \ge 2$, that satisfy
$$d \log X_i(t) = \big(\gamma - g + ng\,\mathbf{1}_{\{r_t(i) = n\}}\big)\,dt + \sigma\,dW_i(t), \tag{1}$$
for $i = 1, \ldots, n$, where $\gamma$ is a constant, $g$ and $\sigma$ are positive constants, $r_t(i)$ is the rank of $X_i(t)$ (with ties resolved lexicographically), and $W_1, \ldots, W_n$ is an $n$-dimensional Brownian motion with the Brownian filtration $\mathcal{F}_t$ (see Fernholz (2002), Banner et al. (2005), and Fernholz and Karatzas (2009)). The processes $X_i$ might represent, for example, the wealth of households, the capitalizations of companies, or the population of cities. Let $X_{(1)} \ge \cdots \ge X_{(n)}$ represent the ranked processes $X_1, \ldots, X_n$, so $X_{(r_t(i))}(t) = X_i(t)$. We can define the total value process $X$ by
$$X(t) \triangleq X_1(t) + \cdots + X_n(t),$$
and the weight processes $\theta_i$ and the ranked weight processes $\theta_{(k)}$ by
$$\theta_i(t) \triangleq X_i(t)/X(t) \quad\text{and}\quad \theta_{(k)}(t) \triangleq X_{(k)}(t)/X(t),$$
for $i, k = 1, \ldots, n$.

The term $ng\,\mathbf{1}_{\{r_t(i) = n\}}$ in (1) is a device that stabilizes the model by driving the “Atlas” process $X_{(n)}$ upward at a rate that counteracts the general downward drift of $-g$. The Atlas process can be thought of as the “birth and death” of processes in the lowest ranks, as is common in the firm size and income distribution literatures in economics (see Luttmer (2011) and Gabaix et al. (2015)). It can also be thought of as a proxy for a system that extends infinitely downward, as in the infinite models of Pal and Pitman (2008) and Chatterjee and Pal (2009).

Ricardo T. Fernholz: Robert Day School of Economics and Finance, Claremont McKenna College, 500 E. Ninth St., Claremont, CA 91711. Robert Fernholz: INTECH, One Palmer Square, Princeton, NJ 08542. The authors thank the members of the INTECH/Princeton SPT seminar for their comments and suggestions regarding this research. Posted to arXiv (q-fin.EC).

The parameter $\gamma$ in (1) represents the growth rate of the entire system. Since here we are interested in relative behavior under steady-state conditions, we can assume that $\gamma = 0$, and we shall do so from here on. In this case, definition (1) reduces to
$$d \log X_i(t) = \big(-g + ng\,\mathbf{1}_{\{r_t(i) = n\}}\big)\,dt + \sigma\,dW_i(t), \tag{2}$$
for $i = 1, \ldots, n$. With this defining equation, the asymptotic growth rate of each of the $X_i$ will be zero, so
$$\lim_{t \to \infty} t^{-1} \log X_i(t) = 0, \quad\text{a.s.},$$
for $i = 1, \ldots, n$ (see, e.g., Fernholz (2002), Banner et al. (2005), or Fernholz and Karatzas (2009)).

By Itô’s rule, it follows from (2) that
$$dX_i(t) = \Big(-g + \frac{\sigma^2}{2} + ng\,\mathbf{1}_{\{r_t(i) = n\}}\Big) X_i(t)\,dt + \sigma X_i(t)\,dW_i(t), \quad\text{a.s.}, \tag{3}$$
for $i = 1, \ldots, n$. From this we see that while the asymptotic growth rate of the system (2) is zero, the local behavior of the individual processes $X_i$ is more complicated.

For the model (2), when the gap processes $\log X_{(k)} - \log X_{(k+1)}$ are in their steady-state distribution, these gaps are exponentially distributed with
$$E\big[\log X_{(k)}(t) - \log X_{(k+1)}(t)\big] = E\big[\log \theta_{(k)}(t) - \log \theta_{(k+1)}(t)\big] = \frac{\sigma^2}{2kg}, \tag{4}$$
for $k = 1, \ldots, n-1$. The ratio of this expected gap to the corresponding change in log-rank is
$$\frac{E\big[\log \theta_{(k)}(t) - \log \theta_{(k+1)}(t)\big]}{\log k - \log(k+1)} = \frac{\sigma^2}{2kg\,\big(\log k - \log(k+1)\big)} \cong -\frac{\sigma^2}{2g}, \quad\text{a.s.}, \tag{5}$$
for $k = 1, \ldots, n-1$, where the approximation uses $\log k - \log(k+1) \cong -1/k$. The log-log plot of $\theta_{(1)}(t), \ldots, \theta_{(n)}(t)$ versus rank, which is called the distribution curve of the model, is therefore approximately a straight line with (log-log) slope $-\sigma^2/2g$. Therefore, we have approximately
$$\theta_{(k)}(t) \propto k^{-\sigma^2/2g}, \quad\text{a.s.}, \tag{6}$$
for $k = 1, \ldots, n$, and we say that the Atlas model (2) has a Pareto distribution with parameter $\lambda$, where
$$\lambda = \frac{\sigma^2}{2g}.$$

Zipfian Atlas models
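As a numerical illustration of how the parameter $\lambda = \sigma^2/(2g)$ governs the distribution curve, the following sketch evaluates the exact ratio in (5) rank by rank and compares it with the limiting slope $-\lambda$. It is only a sketch: the function name `tangent_slopes` and the parameter values are assumptions chosen for illustration and do not appear in the original note.

```python
import numpy as np

def tangent_slopes(n, sigma, g):
    """Log-log slopes between consecutive ranks implied by the expected gaps (4).

    Returns an array of length n - 1 whose entry k - 1 is
    sigma^2 / (2 k g (log k - log(k + 1))), the exact ratio in (5).
    """
    k = np.arange(1, n)                       # ranks 1, ..., n - 1
    expected_gap = sigma**2 / (2.0 * k * g)   # equation (4)
    return expected_gap / (np.log(k) - np.log(k + 1))

# Illustrative (assumed) parameters with sigma^2 = 2 g, so that lambda = 1.
g = 0.02
sigma = np.sqrt(2.0 * g)
slopes = tangent_slopes(1000, sigma, g)
lam = sigma**2 / (2.0 * g)

print(f"lambda = {lam:.3f}")
print(f"slope between ranks 1 and 2:     {slopes[0]:.4f}")
print(f"slope between ranks 100 and 101: {slopes[99]:.4f}")
```

With these values the slope between the top two ranks equals $-\lambda/\log 2$, roughly $-1.44$, while the slope between ranks 100 and 101 is already within one percent of $-\lambda = -1$. The straight-line description of the distribution curve is thus least accurate at the very top ranks.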
Zipf’s law is a Pareto distribution with parameter $\lambda = 1$, so the ranked weights in (6) will satisfy, approximately,
$$\theta_{(k)}(t) \propto k^{-1}, \quad\text{a.s.}, \tag{7}$$
for $k = 1, \ldots, n$, and the distribution curve will be a straight line with slope $-1$. For $r_t(i) < n$, equation (3) becomes
$$dX_i(t) = \Big(-g + \frac{\sigma^2}{2}\Big) X_i(t)\,dt + \sigma X_i(t)\,dW_i(t), \quad\text{a.s.}, \tag{8}$$
and
$$E\big[dX_i(t) \,\big|\, \mathcal{F}_t,\ r_t(i) < n\big] = 0, \quad\text{a.s.}, \tag{9}$$
if
$$\Big(\frac{\sigma^2}{2g} - 1\Big) X_i(t) = 0, \quad\text{a.s.} \tag{10}$$
Since $X_i(t) > 0$, a.s., the necessary and sufficient condition for this to hold is that $\sigma^2/2g = 1$. This condition is equivalent to the requirement that (8) be a martingale for $r_t(i) < n$ (see also Bruggeman (2016), Section 3.6).

Definition 1. An Atlas model of the form (2) is Zipfian if $\sigma^2/2g = 1$.

For a Zipfian Atlas model,
$$\lambda = \frac{\sigma^2}{2g} = 1,$$
so it follows from (6) that (7) holds, which is exactly Zipf’s law. We shall see in the next section that although a Zipfian Atlas model is distributed according to Zipf’s law, a model that follows Zipf’s law is not necessarily Zipfian.

Weakly Zipfian Atlas models
Empirically, for observed Zipf-like distributions it is not unusual for the distribution curve to be concave, with the slope of the tangent flatter than $-1$ for the larger weights and steeper than $-1$ for the smaller weights. Concavity of this kind is consistent with variances $\sigma_k^2$ that increase with rank, as is conjectured for city size by Gabaix (2009), and is documented for stock capitalizations in Figure 5.5 of Fernholz (2002).

In order to study the case of increasing variances, let us consider a generalized Atlas model with variances that depend on rank. For $n \ge 2$, let
$$d \log X_i(t) = \big(-g + ng\,\mathbf{1}_{\{r_t(i) = n\}}\big)\,dt + \sigma_{r_t(i)}\,dW_i(t), \tag{11}$$
for $i = 1, \ldots, n$, where $g$ and $\sigma_1, \ldots, \sigma_n$ are positive constants, $r_t(i)$ is the rank of $X_i(t)$, and $W_1, \ldots, W_n$ is an $n$-dimensional Brownian motion (see, e.g., Fernholz (2002) or Banner et al. (2005)). Here all the ranks share a common reversion rate $g$, but each rank $k$ has its own variance rate $\sigma_k^2$. Banner et al. (2005) show that for a system of this form, if the gap processes $\log X_{(k)} - \log X_{(k+1)}$ are in their steady-state distribution, then
$$E\big[\log \theta_{(k)}(t) - \log \theta_{(k+1)}(t)\big] = \frac{\sigma_k^2 + \sigma_{k+1}^2}{4kg},$$
for $k = 1, \ldots, n-1$, so the tangent to the distribution curve between rank $k$ and rank $k+1$ has log-log slope of
$$\frac{E\big[\log \theta_{(k)}(t) - \log \theta_{(k+1)}(t)\big]}{\log k - \log(k+1)} = \frac{\sigma_k^2 + \sigma_{k+1}^2}{4kg\,\big(\log k - \log(k+1)\big)} \cong -\frac{\sigma_k^2 + \sigma_{k+1}^2}{4g}. \tag{12}$$
Let us note that this slope is consistent with the slope (5) for the standard Atlas model (2).

From (12) we can construct an example of a generalized Atlas model for which Zipf’s law holds, but the model is not Zipfian. For an even number $n$ and $g > 0$, let the variances alternate between the two values
$$\sigma_{2j}^2 = g \quad\text{and}\quad \sigma_{2j-1}^2 = 3g,$$
for $j = 1, \ldots, n/2$. For these values, by (12),
$$\frac{E\big[\log \theta_{(k)}(t) - \log \theta_{(k+1)}(t)\big]}{\log k - \log(k+1)} \cong -\frac{\sigma_k^2 + \sigma_{k+1}^2}{4g} = -1,$$
for $k = 1, \ldots, n-1$, so the log-log slope of the tangent to the distribution curve is approximately $-1$, as Zipf’s law requires. However, $\sigma_{r_t(i)}^2/2g \ne 1$ for any $X_i$. Hence, Zipf’s law holds for a Zipfian Atlas model, but a generalized Atlas model for which Zipf’s law holds need not be Zipfian.

For a generalized Atlas model (11), equation (3) becomes
$$dX_i(t) = \Big(-g + \frac{\sigma_{r_t(i)}^2}{2} + ng\,\mathbf{1}_{\{r_t(i) = n\}}\Big) X_i(t)\,dt + \sigma_{r_t(i)} X_i(t)\,dW_i(t), \quad\text{a.s.}, \tag{13}$$
for $i = 1, \ldots, n$. Let us assume that the gap processes $\log X_{(k)} - \log X_{(k+1)}$ are in their steady-state distribution. For variable $\sigma_k$, we cannot expect that $\sigma_{r_t(i)}^2/2g = 1$ for all $i$ with $r_t(i) < n$, so this model cannot be Zipfian. Instead, a more general definition is needed, so let us consider the adjusted total value process $\widetilde{X}$ defined by
$$d\widetilde{X}(t) \triangleq dX(t) - ng X_{(n)}(t)\,dt. \tag{14}$$
We would like to impose conditions such that
$$E\big[d\widetilde{X}(t) \,\big|\, \mathcal{F}_t\big] = 0, \quad\text{a.s.}, \tag{15}$$
which is a natural generalization of the expected change in $X_i$ when $r_t(i) < n$ in (9). We see from (13) that
$$d\widetilde{X}(t) = \sum_{i=1}^n dX_i(t) - ng X_{(n)}(t)\,dt = \sum_{i=1}^n \Big(-g + \frac{\sigma_{r_t(i)}^2}{2}\Big) X_i(t)\,dt + \sum_{i=1}^n \sigma_{r_t(i)} X_i(t)\,dW_i(t), \quad\text{a.s.}$$
Hence, condition (15) implies that
$$\sum_{i=1}^n \Big(\frac{\sigma_{r_t(i)}^2}{2g} - 1\Big) X_i(t) = \sum_{k=1}^n \Big(\frac{\sigma_k^2}{2g} - 1\Big) X_{(k)}(t) = 0, \quad\text{a.s.}, \tag{16}$$
which is a natural generalization of (10). Since $X(t) > 0$, a.s., we can divide by it in (16) and take the expectation, which gives us the condition
$$\sum_{k=1}^n \Big(\frac{\sigma_k^2}{2g} - 1\Big) E\big[\theta_{(k)}(t)\big] = 0, \tag{17}$$
where the expected weights satisfy $E\big[\theta_{(1)}(t)\big] > \cdots > E\big[\theta_{(n)}(t)\big] > 0$ and $E\big[\theta_{(1)}(t)\big] + \cdots + E\big[\theta_{(n)}(t)\big] = 1$.

Definition 2. A generalized Atlas model of the form (11) is weakly Zipfian if (17) holds.

Now suppose that a generalized Atlas model is weakly Zipfian and that the variances $\sigma_1^2 < \cdots < \sigma_n^2$ are increasing with rank. In this case the values of $\sigma_k^2/2g$ for the larger weights $\theta_{(k)}(t)$, i.e., for smaller $k$, will be less than one, and the values of $\sigma_k^2/2g$ for the smaller weights $\theta_{(k)}(t)$, i.e., for larger $k$, will be greater than one. The same will be true for $(\sigma_k^2 + \sigma_{k+1}^2)/4g$, so the distribution curve will be concave, with the slope of the tangent flatter than $-1$ for the larger weights and steeper than $-1$ for the smaller weights.

Other Zipfian systems
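Before the model is generalized further, the slope approximation (12) for rank-dependent variances can be checked with a few lines of code. The sketch below evaluates $-(\sigma_k^2 + \sigma_{k+1}^2)/(4g)$ for two variance profiles: one alternating between $g$ and $3g$, and one increasing with rank. The function name `approx_slopes`, the value of $g$, the number of ranks, and the linear variance profile are all assumptions made for illustration.

```python
import numpy as np

def approx_slopes(var, g):
    """Approximate tangent slopes -(var_k + var_{k+1}) / (4 g) from (12).

    `var` holds the rank-based variances sigma_1^2, ..., sigma_n^2.
    """
    var = np.asarray(var, dtype=float)
    return -(var[:-1] + var[1:]) / (4.0 * g)

g = 0.02   # assumed reversion rate
n = 10     # assumed (even) number of ranks

# Alternating variances: g on even ranks, 3g on odd ranks.
alternating = np.where(np.arange(1, n + 1) % 2 == 0, g, 3.0 * g)
alt_slopes = approx_slopes(alternating, g)

# Variances increasing linearly with rank from g to 3g (an arbitrary choice).
increasing = np.linspace(g, 3.0 * g, n)
inc_slopes = approx_slopes(increasing, g)

print("alternating:", alt_slopes)
print("increasing: ", inc_slopes)
```

In the alternating case every adjacent pair of variances sums to $4g$, so each approximate slope is $-1$ even though $\sigma_k^2/2g$ is never equal to one. In the increasing case the slopes become steadily more negative with rank, which is the concave shape described in the text. Whether a given increasing profile is also weakly Zipfian depends on condition (17), which this sketch does not test.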
In the models (2) and (11) all the ranks share a common reversion rate $g$. However, Atlas models can be further generalized to first-order models, which are systems of the form
$$d \log X_i(t) = g_{r_t(i)}\,dt + \sigma_{r_t(i)}\,dW_i(t), \tag{18}$$
for $i = 1, \ldots, n$, where $\sigma_1, \ldots, \sigma_n$ are positive constants, $r_t(i)$ is the rank of $X_i(t)$, $W_1, \ldots, W_n$ is an $n$-dimensional Brownian motion, and $g_1, \ldots, g_n$ are constants such that
$$g_1 + \cdots + g_n = 0, \quad\text{and}\quad g_1 + \cdots + g_m < 0 \ \text{ for } m < n \tag{19}$$
(see, e.g., Fernholz (2002) or Banner et al. (2005)). For these models,
$$E\big[\log \theta_{(k)}(t) - \log \theta_{(k+1)}(t)\big] = \frac{\sigma_k^2 + \sigma_{k+1}^2}{-4(g_1 + \cdots + g_k)},$$
for $k = 1, \ldots, n-1$, so the tangent to the distribution curve between rank $k$ and rank $k+1$ will have a log-log slope of
$$\frac{E\big[\log \theta_{(k)}(t) - \log \theta_{(k+1)}(t)\big]}{\log k - \log(k+1)} = \frac{\sigma_k^2 + \sigma_{k+1}^2}{-4(g_1 + \cdots + g_k)\big(\log k - \log(k+1)\big)} \cong \frac{k(\sigma_k^2 + \sigma_{k+1}^2)}{4(g_1 + \cdots + g_k)}. \tag{20}$$
Note that this slope is consistent with the slopes (5) and (12) for the previous more restrictive models (2) and (11), since in those models $g_1 + \cdots + g_k = -kg$ for $k < n$. From (20) we see that the generalized model (18) can be parameterized to fit an arbitrary strictly decreasing distribution curve. Nevertheless, these results do not appear to suggest any obvious further generalizations (at least not to the authors).

There is a further generalization of these models that might be worthy of mention since it would accommodate a generalization of Zipf’s law that appears in Gabaix (1999), Section III.2. In this more general case, we would consider a version of (18) where the $X_i$ depend on parameters based on both rank and index or name. These hybrid Atlas models were introduced by Ichiba et al. (2011); however, parameter estimation for these models has not been completely resolved (as far as the authors know; see Fernholz et al. (2012)).

Example: the “size effect” for stocks
It was observed by Banz (1981) that stocks of U.S. companies with smaller capitalizations can be expected to have higher returns on average than stocks of U.S. companies with larger capitalizations. The explanation for this “anomaly” was considered to be the higher risk involved in holding smaller stocks (see Fama and French (1993)). Here we present a simple structural explanation based on the weak version of Zipf’s law.

Suppose that the processes $X_i$ in (11) represent the capitalizations of U.S. companies. It follows from (13) that the relative return of the stock $X_i$ at time $t$ is
$$\frac{dX_i(t)}{X_i(t)} = \Big(-g + \frac{\sigma_{r_t(i)}^2}{2} + ng\,\mathbf{1}_{\{r_t(i) = n\}}\Big)\,dt + \sigma_{r_t(i)}\,dW_i(t), \quad\text{a.s.}, \tag{21}$$
and similarly for the more restrictive configuration (8), where $\sigma$ replaces $\sigma_{r_t(i)}$. Note that we are using relative return, since we removed the overall growth $\gamma$ from the general model (1). For simplicity, we have ignored the payment of dividends or other distributions as a source of return since it was shown in Fernholz (1998) that the difference in these payments between large and small U.S. stocks had little influence on the observed size effect.

For a Zipfian model we have $\sigma_{r_t(i)}^2/2g = \sigma^2/2g = 1$, so it follows from (21) that
$$E\bigg[\frac{dX_i(t)}{X_i(t)} \,\bigg|\, r_t(i) = k\bigg] = \Big(\frac{\sigma^2}{2g} - 1\Big) g\,dt = 0, \quad\text{a.s.}, \tag{22}$$
for $k = 1, \ldots, n-1$. Hence, the expected relative return of each stock above the bottom rank is zero. However, for a Zipfian model the distribution curve will be linear, and we know that the distribution curve for stock capitalizations is concave rather than linear (see, e.g., Ijiri and Simon (1974) or Fernholz (2002), Figure 5.1). Therefore, the model can be at most weakly Zipfian.

Suppose the model is weakly Zipfian with increasing variances $\sigma_1^2 < \cdots < \sigma_n^2$, which is consistent with a concave distribution curve. Instead of (22), we now have
$$E\bigg[\frac{dX_i(t)}{X_i(t)} \,\bigg|\, r_t(i) = k\bigg] = \Big(\frac{\sigma_k^2}{2g} - 1\Big) g\,dt, \quad\text{a.s.}, \tag{23}$$
for $k = 1, \ldots, n-1$. Condition (17) and the increasing variances $\sigma_1^2 < \cdots < \sigma_n^2$ imply that a large stock $X_i$ will have lower variance, so $\sigma_k^2/2g < 1$, and the conditional expectation in (23) will be negative, while a small stock $X_i$ will have higher variance, so $\sigma_k^2/2g > 1$, and the conditional expectation in (23) will be positive. Hence, the expected return on small stocks will be greater than the expected return on large stocks, and this provides a natural structural explanation for the size effect.
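The sign pattern in (23) can be made concrete with a small calculation. The sketch below takes a set of increasing variances and a set of expected ranked weights as given inputs, solves (17) for the reversion rate $g$, and then evaluates the drift in (23) by rank. Two points are assumptions rather than results of the paper: the linear variance profile, and the use of weights proportional to $1/k$ as a stand-in for $E[\theta_{(k)}(t)]$. In the actual model the expected weights are determined jointly by $g$ and the $\sigma_k$, so this is an illustration of the mechanism and not a calibration. The function names are hypothetical.

```python
import numpy as np

def weakly_zipfian_g(var, weights):
    """Solve condition (17) for g, given variances and expected ranked weights."""
    var = np.asarray(var, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                 # expected weights must sum to one
    # sum_k (var_k / (2 g) - 1) w_k = 0  implies  g = (1/2) sum_k var_k w_k
    return 0.5 * np.sum(var * w)

def expected_relative_return(var, g):
    """Drift (var_k / (2 g) - 1) g in (23) for ranks k = 1, ..., n - 1."""
    var = np.asarray(var, dtype=float)
    return (var[:-1] / (2.0 * g) - 1.0) * g

n = 500
ranks = np.arange(1, n + 1)
var = np.linspace(0.02, 0.10, n)    # assumed variances, increasing with rank
weights = 1.0 / ranks               # assumed stand-in for E[theta_(k)]

g = weakly_zipfian_g(var, weights)
drift = expected_relative_return(var, g)

print(f"g = {g:.5f}")
print(f"drift at rank 1:     {drift[0]:+.5f}")
print(f"drift at rank {n - 1}:   {drift[-1]:+.5f}")
```

Because $g$ is half of a weighted average of the variances, the drift $\sigma_k^2/2 - g$ is negative at the top ranks, where the variances lie below that average, and positive at the lower ranks, which matches the direction of the size effect described above.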
References
Banner, A., R. Fernholz, and I. Karatzas (2005). On Atlas models of equity markets. Annals of Applied Probability 15, 2296–2330.
Banz, R. (1981). The relationship between return and market value of common stocks. Journal of Financial Economics 9, 3–18.
Bruggeman, C. (2016). Dynamics of Large Rank-Based Systems of Interacting Diffusions. Ph.D. thesis, Columbia University.
Chatterjee, S. and S. Pal (2009). A phase transition behavior for Brownian motions interacting through their ranks. arXiv:0706.3558v2, 1–30.
Fama, E. F. and K. R. French (1993, February). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics 33(1), 3–56.
Fernholz, R. (1998, May/June). Crossovers, dividends, and the size effect. Financial Analysts Journal 54(3), 73–78.
Fernholz, R. (2002). Stochastic Portfolio Theory. New York: Springer-Verlag.
Fernholz, R., T. Ichiba, and I. Karatzas (2012). A second-order stock market model. Annals of Finance, 1–16.
Fernholz, R. and I. Karatzas (2009). Stochastic portfolio theory: an overview. In A. Bensoussan and Q. Zhang (Eds.), Mathematical Modelling and Numerical Methods in Finance: Special Volume, Handbook of Numerical Analysis, Volume XV, pp. 89–168. Amsterdam: North-Holland.
Fernholz, R. T. and C. Koch (2016, February). Why are big banks getting bigger? Federal Reserve Bank of Dallas Working Paper 1604.
Gabaix, X. (1999, August). Zipf’s law for cities: an explanation. The Quarterly Journal of Economics 114, 739–767.
Gabaix, X. (2009). Power laws in economics and finance. Annual Review of Economics 1(1), 255–294.
Gabaix, X., J.-M. Lasry, P.-L. Lions, and B. Moll (2015, July). The dynamics of inequality. NBER Working Paper 21363.
Ichiba, T., V. Papathanakos, A. Banner, I. Karatzas, and R. Fernholz (2011). Hybrid Atlas models. Annals of Applied Probability 21, 609–644.
Ijiri, Y. and H. Simon (1974). Interpretations of departures from the Pareto curve firm-size distributions. Journal of Political Economy 82, 315–331.
Luttmer, E. G. J. (2011, July). On the mechanics of firm growth. Review of Economic Studies 78(3), 1042–1068.
Newman, M. E. J. (2006, May). Power laws, Pareto distributions, and Zipf’s law. arXiv:cond-mat/0412004v3 [cond-mat.stat-mech].
Pal, S. and J. Pitman (2008). One-dimensional Brownian particle systems with rank dependent drifts. arXiv:0704.0957v2.