Dirichlet random walks of two steps

Abstract

Random walks of n steps taken into independent uniformly random directions in a d-dimensional Euclidean space (d larger than 1), are named Dirichlet when their step lengths are distributed according to a Dirichlet law. The latter continuous multivariate distribution, which depends on n positive parameters, generalizes the beta distribution (n=2). The sum of step lengths is thus fixed and equal to 1. In the present work, the probability density function of the distance from the endpoint to the origin is first made explicit for a symmetric Dirichlet random walk of two steps which depends on a single positive parameter q. It is valid for any positive q and for all d larger than 1. The latter pdf is used in turn to express the related density of a random walk of two steps whose step length is distributed according to an asymmetric beta distribution which depends on two parameters, namely q and q+s where s is a positive integer.

Two-step Dirichlet random walks

Gérard Le Caër

Institut de Physique de Rennes, UMR UR1-CNRS 6251, Université de Rennes I, Campus de Beaulieu, Bâtiment 11A, F-35042 Rennes Cedex, France

Tel :+33 2 23 23 53 77 Fax :+33 2 23 23 67 17

E-mail address : [email protected]

PACS: 05.40.Fb Random walks and Lévy flights, 02.50.Cw Probability theory

Keywords: Dirichlet random walks; random flights; Dirichlet distribution; asymmetric distributions; Pochhammer symbols; Lucas coefficients

Abstract

Random walks of $n$ steps taken into independent uniformly random directions in a $d$-dimensional Euclidean space $\mathbb{R}^d$ $(d \ge 2)$, which are characterized by a sum of step lengths which is fixed and taken to be 1 without loss of generality, are named “Dirichlet” when this constraint is realized via a Dirichlet law of step lengths. The latter continuous multivariate distribution, which depends on $n$ positive parameters, generalizes the beta distribution $(n = 2)$. It is simply obtained from $n$ independent gamma random variables with identical scale factors. Previous literature studies of these random walks dealt with symmetric Dirichlet distributions whose parameters are all equal to a value $q$ which takes half-integer or integer values. In the present work, the probability density function (pdf) of the distance from the endpoint to the origin is first made explicit for a symmetric Dirichlet random walk of two steps. It is valid for any positive value of $q$ and for all $d \ge 2$. The latter pdf is used in turn to express the related density of a random walk of two steps whose step length is distributed according to an asymmetric beta distribution which depends on two parameters, namely $q$ and $q + s$ where $s$ is a positive integer.

1. Introduction

To model the infiltration rate of a given species into possible habitats, Pearson defined in 1905 a simple planar “random walk” (RW) which is made of a sequence of $n$ steps with identical fixed lengths taken into uniformly random directions [1-2]. This idealized RW has been used recently to assess the electromagnetic compatibility of a group of $N$ identical power electronic converters [3]. The cases of five to some tens of steps ($N$) were more particularly analyzed. Pearson’s RW has been applied as well to the characterization of the cosmic microwave background (CMB) [4-5]. Complex coefficients are obtained in a spherical harmonic representation of temperature maps of the CMB. Different types of random walks, associated with the spherical harmonic mode $l$, are then performed in the phase space, one being a Pearson’s RW [4-5]. In addition, the analysis of the temperature and polarization of the CMB led Reimberg and Abramo [6-7] to define random flights made of two successive stages in spaces with different dimensions $d_1$ and $d_2$, both with deterministic step lengths. Such flights emerge from the treatment of Boltzmann equations which codify the interplay between collisional physics and free propagation. The coefficients of a multipole decomposition of the temperature and polarization of the CMB are determined from these equations. For each stage, the space dimension is determined by the order of the multipole which dominates it [7]. Variations on the theme of Pearson’s random walk involve space dimensions higher than two, changes of step length distributions, deviations of step orientations from a uniform distribution and the introduction of correlations between steps [6-34]. A frequent change consists in allowing step lengths to vary according to some continuous probability law. Such modifications find applications in diverse fields such as physics, biology and ecology ([4-7, 9-20] and references therein).
A few examples, by no means exhaustive, are given hereafter. Random walks with exponentially distributed step lengths were studied in 2D by Stadje [13] as a possible description of the motion of microorganisms on planar surfaces and in 3D by Vignolles et al. [14] to model the chemical vapor infiltration method used to prepare ceramic-matrix composites. To study the relation between the Boltzmann equation and the underlying stochastic processes, Zoia et al. [15] investigated exponential flights in $\mathbb{R}^d$. The probability density of finding a particle at position $\mathbf{r}$ at the $n$-th collision was determined for an infinite medium. The cases where $d = 1, 2$ find applications respectively in the field of electron transport in nanowires or carbon nanotubes and in the study of the dynamics of chemical and biological species on surfaces. Exponential walks are quite naturally extended to random walks with gamma distributed step lengths (appendix A). An important class of walks is that of Lévy flights whose step lengths have heavy-tailed probability distributions. Many organisms are believed to perform Lévy flights in their search for resources [16-20]. The Lévy-flight foraging hypothesis is the subject of much debate but there is presently a growing consensus that many organisms diffuse anomalously. Alternative models to Lévy flights, or the emergence of Lévy patterns from composite models, are regularly put forward and discussed controversially (see for instance [18]). We notice that a shifted gamma distribution, with a small shape factor ~0.3, was used for some time to account for flight durations of sea birds [17]. The applications of the previous random walks imply neither that the walk space is a physical space with a dimension of at most three nor that the step length distributions are limited to a few standard mathematical forms. The present work focuses on random walks in a $d$-dimensional Euclidean space $\mathbb{R}^d$ $(d \ge 2)$ whose step lengths have a Dirichlet distribution. The Dirichlet distribution is applied for instance to model fragmentation or compositional data [35]. Further, gamma and Dirichlet distributions are strongly connected as are the associated random walks (appendix A).

1.1. Dirichlet random walks of n steps

Random walks of $n$ steps in $\mathbb{R}^d$ $(d \ge 2)$ taken into independent uniformly random directions, with unequal step lengths $l_k$ $(k = 1, .., n)$ which obey the additional constraint of a constant total sum, $\sum_{k=1}^{n} l_k = S = \mathrm{cst}$, were investigated recently [21-29]. The constant $S$ is taken hereafter as equal to 1 without loss of generality (eq. 5, section 2).
The problem of the step length distribution of such walks is thus directly related to the broken stick problem, i.e. the problem of the random splitting of a unit interval. The Dirichlet distribution has been more particularly considered as an appropriate step length distribution [22-29]. Following Letac and Piccioni [29], the associated random walks are named hereunder “Dirichlet random walks” and referred to as $W(d, n, \mathbf{q}^{(n)})$. The Dirichlet distribution of the random vector $\mathbf{L}^{(n)} = (L_1, L_2, .., L_n)$ $\left(\sum_{k=1}^{n} L_k = 1\right)$, denoted here as $D(\mathbf{q}^{(n)})$, with parameters $\mathbf{q}^{(n)} = (q_1, .., q_n)$, has a multivariate probability density function (pdf) given by ([36], p. 18):

$$
f(l_1, .., l_m) = K_n \prod_{i=1}^{n} l_i^{\,q_i - 1}, \qquad l_n = 1 - \sum_{i=1}^{m} l_i, \quad l_i > 0, \quad i = 1, .., n \tag{1}
$$

where $m = n - 1$, $K_n = \Gamma(n\bar{q}) / \prod_{i=1}^{n} \Gamma(q_i)$ and $n\bar{q} = \sum_{i=1}^{n} q_i$. For $n = 2$, the Dirichlet distribution reduces to a beta distribution denoted hereafter as $Be(q_1, q_2)$ (eq. 2). For convenience, these two-step beta random walks will still be named “Dirichlet”.

In the literature, special attention has been paid to “symmetric” Dirichlet random walks, $W(d, n, q)$, i.e. to walks whose step lengths are distributed according to a symmetric Dirichlet distribution for which $\mathbf{q}^{(n)} = (q, q, .., q)$. As the notation $W(d, n, q)$ is self-explanatory, the word “symmetric” is omitted to designate them. There is a close connection between a symmetric Dirichlet random walk $W(d, n, q)$ and a gamma random walk $G(d, n, q)$ whose step lengths have identical and independent gamma distributions $\gamma(q, 1)$ (appendix A). The pdf of the endpoint position of $G(d, n, q)$ is obtained through a single integral from the pdf of the endpoint position of $W(d, n, q)$ (eq. A.8, [25-26]).

One of the initial motivations for studying constrained exponential random walks, $W(d, n, 1)$, was to answer the question as to whether it is possible to find triplets $(d, n, 1)$ for which the endpoints are uniformly distributed on the unit ball of $\mathbb{R}^d$ [21]. The latter quest was extended to walks $W(d, n, q)$ [25] and more generally to hyperuniform random walks, where the word “hyperuniform” used in [29] is preferred here to the term “hyperspherical uniform” used in [25-26]. An $n$-step random walk in $\mathbb{R}^d$ is said to be hyperuniform of type $k$ $(k > d)$ if the distribution of the endpoint of the walk in $\mathbb{R}^d$ is identical with the distribution of the projection in the walk space of a point uniformly distributed on the surface of the unit hypersphere of $\mathbb{R}^k$ [25-26,29]. The pdf of the position $\mathbf{R}$ of the endpoint of a Dirichlet random walk and that of the distance $R = |\mathbf{R}|$ from the endpoint to the starting point will be denoted respectively as $p_d^{(\mathbf{q}^{(n)})}(\mathbf{r})$ and $P_d^{(\mathbf{q}^{(n)})}(r)$. These two pdf’s, which are simply related (eq. 4), can be considered interchangeably. For hyperuniform random walks, $p_d^{(\mathbf{q}^{(n)})}(\mathbf{r})$ is proportional to $(1 - r^2)^{(k-d)/2 - 1}$. A uniform random walk on the unit ball of $\mathbb{R}^d$ defined above is thus hyperuniform of type $k = d + 2$. The pdf’s $p_d^{(\mathbf{q}^{(n)})}(\mathbf{r})$ were derived for any space dimension $d \ge d_0$ and any $n \ge 2$ for the three following families of walks $W(d, n, q)$: $q = d/2 - 1$ with $d \ge 3$, $q = d - 1$ and $q = d$, both with $d \ge 2$ [25-26]. The walks with $q = d/2 - 1$ and $q = d - 1$ were shown to be hyperuniform of type $k = n(d-2) + 2$ and $k = n(d-1) + 1$ respectively [25]. In addition, the pdf’s $p_d^{(\mathbf{q}^{(n)})}(\mathbf{r})$ were obtained for particular values of two of the three parameters of the triplet $(d, n, q)$ such as $(d, 2, 1)$ [22] and $(6, n, 1)$ [24] or for particular values of these three parameters. Altogether, $q$ takes either half-integer or integer values in all cases where $p_d^{(\mathbf{q}^{(n)})}(\mathbf{r})$ has been made explicit.

It is much simpler to study two-step Dirichlet random walks $W(d, 2, q)$ than the general Dirichlet random walks mentioned above so that the question of their relevance might even arise. However, their unequalled advantage lies in the existence of an explicit expression of the pdf of the endpoint distance, $P_d^{(q,q)}(r) = P_d^{(\mathbf{q}^{(2)})}(r)$, which is valid for any value of $q$ and for any $d$ such that $q > 0$ and $d \ge 2$. To the best of our knowledge, the latter expression has not yet been given explicitly in the literature. In the present work, we obtain first the density $P_d^{(q,q)}(r)$ (section 3). Second, the relevance and the usefulness of the permutation invariant distribution associated with an asymmetric step length distribution are discussed (section 4). The former distribution is used among others to establish that the two following families of Dirichlet random walks $W(d, n, \mathbf{q}^{(n)})$, where $\mathbf{q}^{(n)}$ is respectively $(q, q, .., q)$ and $(q+1, q, .., q)$, are indistinguishable for any $q > 0$ and $d \ge 2$ (section 4). Last, the results of the two previous steps are applied to the derivation of the pdf $P_d^{(q+s,q)}(r)$ of two-step random walks $W(d, 2, \mathbf{q}^{(2)}) = W(d, 2, (q+s, q))$ which depend on two different Dirichlet parameters, $q > 0$ and $q + s$ where $s$ is a positive integer (section 5). A second formal representation of $P_d^{(q+s,q)}(r)$ is derived. Both representations are shown to be equivalent by a method of moments.

2. Notations

As usual, upper-case letters will be used to denote random variables (r.v.) and lower-case letters the values they take. The mean of a function $f(X)$ of a continuous random variable $X$, with a pdf $p_X(x)$ whose support is $D$, will be denoted hereafter as $\langle f(X) \rangle = \int_D f(x)\, p_X(x)\, dx$. The Pochhammer symbol $(a)_k$, its duplication formula [37], the beta function $B(\alpha, \beta)$, the beta distribution, $L \sim Be(q_1, q_2)$, whose pdf is $f_L(l)$, and its moments $\langle L^k \rangle$ will be used repeatedly throughout the text:

$$
\begin{aligned}
(a)_k &= a(a+1)\cdots(a+k-1) = \frac{\Gamma(a+k)}{\Gamma(a)}, \qquad (2a)_{2k} = 4^k\, (a)_k \left(a + \tfrac{1}{2}\right)_k, \\
B(\alpha, \beta) &= \int_0^1 l^{\alpha-1} (1-l)^{\beta-1}\, dl = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}, \\
L \sim Be(q_1, q_2): \quad f_L(l) &= \frac{l^{q_1-1}(1-l)^{q_2-1}}{B(q_1, q_2)}, \quad l \in [0, 1], \\
\langle L^k \rangle &= \frac{B(q_1 + k, q_2)}{B(q_1, q_2)} = \frac{(q_1)_k}{(q_1 + q_2)_k}
\end{aligned}
\tag{2}
$$

where $\Gamma(x)$ is the classical gamma function, $\alpha, \beta > 0$, $q_1, q_2 > 0$ and $k$ is taken here to be a positive integer. The position of the endpoint of a Dirichlet random walk $W(d, n, \mathbf{q}^{(n)})$ is $(n \ge 2)$:

$$
\mathbf{R} = \sum_{i=1}^{n} L_i\, \mathbf{U}_i^{(d)} \tag{3}
$$

where the $\mathbf{U}_i^{(d)}$ are $n$ independent unit vectors uniformly distributed over the surface of the hypersphere in $\mathbb{R}^d$. The $L_i$ $(i = 1, .., n)$ follow a Dirichlet law $D(\mathbf{q}^{(n)})$. For the spherically symmetric walks described by eq. 3, the pdf of the endpoint position $\mathbf{R}$ and the pdf of the endpoint distance $R = |\mathbf{R}|$ of a $W(d, n, \mathbf{q}^{(n)})$ walk depend only on $r = |\mathbf{r}|$. Both are related through:

$$
P_d^{(\mathbf{q}^{(n)})}(r) = \frac{2\pi^{d/2}}{\Gamma(d/2)}\, r^{d-1}\, p_d^{(\mathbf{q}^{(n)})}(\mathbf{r}) \tag{4}
$$

In the following, we will most often consider the pdf $P_d^{(\mathbf{q}^{(n)})}(r)$. As we will deal essentially with two-step walks, the components $(q_1, q_2)$ of $\mathbf{q}^{(2)}$ will be given explicitly in the notations. The position and distance pdf’s associated with any walk $W(d, 2, (q_1, q_2))$ will respectively be denoted as $p_d^{(q_1,q_2)}(\mathbf{r})$ and $P_d^{(q_1,q_2)}(r)$.
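As an illustration of eq. 3, the following minimal Python sketch draws the endpoint of a Dirichlet random walk. It assumes, as recalled in the abstract, that Dirichlet step lengths can be obtained by normalising independent gamma variates of unit scale, and that a uniform direction can be obtained by normalising a Gaussian vector. The function names (`dirichlet_steps`, `random_unit_vector`, `endpoint`, `mean_square_distance_mc`) are ours and purely illustrative; they do not come from the references.

```python
import math
import random


def random_unit_vector(d, rng):
    """Uniform direction on the unit sphere of R^d (normalised Gaussian vector)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def dirichlet_steps(q, rng):
    """Step lengths following a Dirichlet law D(q): normalised gamma variates."""
    g = [rng.gammavariate(qi, 1.0) for qi in q]
    total = sum(g)
    return [x / total for x in g]


def endpoint(d, q, rng):
    """Return (step lengths, endpoint R) of one walk, R = sum_i L_i U_i (eq. 3)."""
    lengths = dirichlet_steps(q, rng)
    r = [0.0] * d
    for length in lengths:
        u = random_unit_vector(d, rng)
        for a in range(d):
            r[a] += length * u[a]
    return lengths, r


def mean_square_distance_mc(d, q, n_walks, rng):
    """Monte-Carlo estimate of <R^2> over n_walks independent walks."""
    acc = 0.0
    for _ in range(n_walks):
        _, r = endpoint(d, q, rng)
        acc += sum(x * x for x in r)
    return acc / n_walks


rng = random.Random(12345)
lengths, r = endpoint(3, [15 / 4, 15 / 4], rng)
distance = math.sqrt(sum(x * x for x in r))
```

Since the step directions are independent and isotropic, $\langle R^2 \rangle = \sum_i \langle L_i^2 \rangle$, which gives a simple sanity check of such a simulation (for a symmetric two-step walk, $\langle R^2 \rangle = (q+1)/(2q+1)$ from eq. 2).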
Finally, once pdf’s are obtained for a walk $W(d, n, \mathbf{q}^{(n)})$ of a total length of 1, then densities $p_{d,S}^{(\mathbf{q}^{(n)})}(\mathbf{r})$ and $P_{d,S}^{(\mathbf{q}^{(n)})}(r)$ are immediately calculated for an arbitrary total walk length $S$ from:

$$
p_{d,S}^{(\mathbf{q}^{(n)})}(\mathbf{r}) = \frac{1}{S^d}\, p_d^{(\mathbf{q}^{(n)})}\!\left(\frac{\mathbf{r}}{S}\right), \qquad P_{d,S}^{(\mathbf{q}^{(n)})}(r) = \frac{1}{S}\, P_d^{(\mathbf{q}^{(n)})}\!\left(\frac{r}{S}\right), \qquad 0 \le r \le S \tag{5}
$$

Similar relations were given previously as eq. 16 of [25] and as eq. 2 of [26] where $S$ is replaced by $l$. In the second relation of each equation, $l^d$ must be replaced by $l$. The results discussed in [25-26] are not affected by this error as they were obtained from the first part of eq. 5 followed by the application of eq. 4.

3. Dirichlet random walks of two steps $W(d, 2, q)$

The final position of a walk $W(d, 2, q)$ is more simply written as (eq. 3):

$$
\mathbf{R} = L\, \mathbf{U}_1^{(d)} + (1 - L)\, \mathbf{U}_2^{(d)} \tag{6}
$$

From eq. 6, we obtain the square of the final distance $R^2 = \mathbf{R} \cdot \mathbf{R}$:

$$
R^2 = 1 - 2L(1-L)\left(1 - \mathbf{U}_1^{(d)} \cdot \mathbf{U}_2^{(d)}\right) \tag{7}
$$

Defining:

$$
X = 1 - R^2, \qquad Y = 4L(1-L), \qquad Z = \frac{1 - \mathbf{U}_1^{(d)} \cdot \mathbf{U}_2^{(d)}}{2} \tag{8}
$$

eq. 7 becomes:

$$
X = YZ \tag{9}
$$

In eq. 9, the r.v.’s $Y$ and $Z$ are independent and have beta distributions, respectively $Y \sim Be(q, 1/2)$ when $L \sim Be(q, q)$ and $Z \sim Be\left((d-1)/2, (d-1)/2\right)$ [29]. Appendix B gives a derivation of the distribution of $Y$ for $L \sim Be(q_1, q_2)$. The latter distribution is relevant for the discussion presented in section 4. The distribution of $Z$ is simply found to be $Z \sim Be\left((d-1)/2, (d-1)/2\right)$ from the pdf of the polar angle $\theta$ (see for instance [36] p. 104), $p(\theta) = \sin^{d-2}\theta \,/\, B\left((d-1)/2, 1/2\right)$. The moments of $Z$ are thus $\langle Z^k \rangle_d = \left((d-1)/2\right)_k / (d-1)_k$ (eq. 2). The moments of $Y$, $\langle Y^k \rangle_{q_1,q_2} = 4^k \langle L^k (1-L)^k \rangle_{q_1,q_2}$, are in turn readily obtained from eq. 2 for a walk $W(d, 2, (q_1, q_2))$ whose length distribution is then $L \sim Be(q_1, q_2)$. Eq. 9 yields finally:

$$
\langle X^k \rangle_{d,q_1,q_2} = \langle (1 - R^2)^k \rangle_{d,q_1,q_2} = \langle Y^k \rangle_{q_1,q_2}\, \langle Z^k \rangle_d = \frac{4^k\, (q_1)_k\, (q_2)_k \left(\frac{d-1}{2}\right)_k}{(q_1 + q_2)_{2k}\, (d-1)_k} \tag{10}
$$
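The moment formula of eq. 10 is straightforward to evaluate numerically. The short Python sketch below is only an illustration under our own naming (`pochhammer`, `moment_x`, `mean_square_distance` are hypothetical helper names); it assumes $d \ge 2$ and a non-negative integer $k$.

```python
def pochhammer(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    result = 1.0
    for i in range(k):
        result *= a + i
    return result


def moment_x(k, d, q1, q2):
    """<(1 - R^2)^k> for a two-step walk W(d, 2, (q1, q2)), following eq. 10."""
    # <Y^k> = 4^k (q1)_k (q2)_k / (q1 + q2)_{2k}, with Y = 4 L (1 - L)
    y_moment = 4.0 ** k * pochhammer(q1, k) * pochhammer(q2, k) / pochhammer(q1 + q2, 2 * k)
    # <Z^k> = ((d-1)/2)_k / (d-1)_k, with Z = (1 - U1.U2) / 2
    z_moment = pochhammer((d - 1) / 2, k) / pochhammer(d - 1, k)
    return y_moment * z_moment


def mean_square_distance(d, q1, q2):
    """<R^2> = 1 - <X>, from eq. 8."""
    return 1.0 - moment_x(1, d, q1, q2)
```

For $k = 1$ the result does not depend on $d$ and reduces to $\langle R^2 \rangle = \left[q_1(q_1+1) + q_2(q_2+1)\right] / \left[(q_1+q_2)(q_1+q_2+1)\right]$, i.e. $\langle L^2 \rangle + \langle (1-L)^2 \rangle$, as expected for independent isotropic directions.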

The pdf of a r.v., which is a product of two independent beta r.v.’s, is explicitly known as a function of the parameters of the two beta laws ([38-41] and appendix C). With $Y \sim Be(q, 1/2)$ and $Z \sim Be\left((d-1)/2, (d-1)/2\right)$, the pdf of $X$ writes (eq. C.1):

$$
p_X(x) = \frac{2^{d-2}}{B(q, 1/2)}\, x^{q-1} (1-x)^{d/2-1}\, {}_2F_1\!\left(\frac{d-1}{2},\, q + 1 - \frac{d}{2};\, \frac{d}{2};\, 1 - x\right) \tag{11}
$$

The common support of the distributions of $X$ and of $R$ is $[0, 1]$. As $r$ is a monotonically decreasing function of $x$, $r = \sqrt{1-x}$, the pdf of $R$ is obtained from the pdf $p_X(x)$ to be $P_d^{(q,q)}(r) = 2r\, p_X(1 - r^2)$. From eq. 11, we get then $(d \ge 2,\ q > 0,\ r \in [0, 1])$:

$$
P_d^{(q,q)}(r) = \frac{2^{d-1}}{B(q, 1/2)}\, r^{d-1} (1 - r^2)^{q-1}\, {}_2F_1\!\left(\frac{d-1}{2},\, q + 1 - \frac{d}{2};\, \frac{d}{2};\, r^2\right) \tag{12}
$$

Transforming the Gauss hypergeometric function [42], eq. 12 can be written equally as:

$$
P_d^{(q,q)}(r) = \frac{2^{d-1}}{B(q, 1/2)}\, r^{d-1} (1 - r^2)^{(d-3)/2}\, {}_2F_1\!\left(\frac{1}{2},\, d - q - 1;\, \frac{d}{2};\, r^2\right) \tag{13}
$$

The associated pdf’s of the endpoint position, $p_d^{(q,q)}(\mathbf{r})$, are readily obtained from eq. 4. Previously known pdf’s are retrieved from eq. 12 or from eq. 13, adding eq. 4 if necessary. For instance, the pdf $p_d^{(1,1)}(\mathbf{r})$ was given in [22]. The specific parameters $q = d/2 - 1$ and $q = d - 1$ of the two families of hyperuniform Dirichlet random walks mentioned in section 1.1 are seen to appear quite naturally when the second arguments of the hypergeometric functions of eq. 12 and of eq. 13 are made equal to zero. We obtain:

$$
P_d^{(q,q)}(r) =
\begin{cases}
\dfrac{2^{d-1}}{B(d/2 - 1,\, 1/2)}\, r^{d-1} (1 - r^2)^{d/2 - 2}, & q = d/2 - 1,\ d \ge 3 \\[2ex]
\dfrac{2^{d-1}}{B(d - 1,\, 1/2)}\, r^{d-1} (1 - r^2)^{(d-3)/2}, & q = d - 1,\ d \ge 2 \\[2ex]
\dfrac{2^{d-1}}{d\, B(d,\, 1/2)}\, r^{d-1} (d - r^2) (1 - r^2)^{(d-3)/2}, & q = d,\ d \ge 2
\end{cases}
\tag{14}
$$

These pdf’s agree with those of table 1 of [25] for $n = 2$, $q = d/2 - 1$, $d - 1$ and with eq. 44 of [26] for $n = 2$, $q = d$. Finally, eqs 4 and 12 give $p_6^{(1,1)}(\mathbf{r}) = \frac{8}{3\pi^3}\left(6 - 5r^2\right)$ in agreement with eq. 13 of [24] for $S = ct = 1$. In addition, we performed Monte-Carlo simulations of $W(d, 2, q)$ random walks for arbitrary values of $q$. Figure 1 $(s = 0)$ gives an example, among many others, of a comparison between simulated and calculated results for $q = 15/4$. In the following, we will use the pdf $P_d^{(q,q)}(r)$ determined above to express the corresponding pdf’s for the random walks $W(d, 2, (q+s, q))$, where $s$ is a positive integer. The step length of a walk $W(d, 2, (q+s, q))$ is distributed according to an asymmetric beta distribution, $L \sim Be(q+s, q)$ (eq. 2, fig. 1). Before calculating the abovementioned pdf’s, we first discuss the consequences of the use of such an asymmetric distribution.
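The two equivalent forms of the symmetric two-step density (eqs 12 and 13) can be compared numerically. The Python sketch below is an illustration only: it evaluates the Gauss hypergeometric function by its defining power series, which we assume to be adequate for $r^2 < 1$ (it converges slowly as $r \to 1$), and the function names `hyp2f1`, `beta_fn`, `pdf_eq12` and `pdf_eq13` are ours.

```python
import math


def hyp2f1(a, b, c, z, tol=1e-15, max_terms=10000):
    """Gauss series for 2F1(a, b; c; z), intended for |z| < 1."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def beta_fn(a, b):
    """Beta function B(a, b) for positive arguments."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def pdf_eq12(r, d, q):
    """Endpoint-distance pdf of W(d, 2, q) in the form of eq. 12."""
    pref = 2.0 ** (d - 1) / beta_fn(q, 0.5)
    return (pref * r ** (d - 1) * (1 - r * r) ** (q - 1)
            * hyp2f1((d - 1) / 2, q + 1 - d / 2, d / 2, r * r))


def pdf_eq13(r, d, q):
    """Same pdf in the transformed form of eq. 13."""
    pref = 2.0 ** (d - 1) / beta_fn(q, 0.5)
    return (pref * r ** (d - 1) * (1 - r * r) ** ((d - 3) / 2)
            * hyp2f1(0.5, d - q - 1, d / 2, r * r))


def integrate_midpoint(f, n_points=400):
    """Midpoint rule on [0, 1]; it avoids the endpoints r = 0 and r = 1."""
    h = 1.0 / n_points
    return h * sum(f((i + 0.5) * h) for i in range(n_points))
```

For $d = 3$ and $q = 2 = d - 1$, both forms reduce to $3r^2$, the distance pdf of a point uniformly distributed in the unit ball, in line with the hyperuniform type $k = d + 2$ discussed in the introduction.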

4. Asymmetric distributions of step lengths

We consider an $n$-step random walk in $\mathbb{R}^d$ whose final position (eq. 3) is rewritten again as:

$$
\mathbf{R} = \sum_{i=1}^{n} L_i\, \mathbf{U}_i^{(d)} = \sum_{i=1}^{n} L_{\sigma(i)}\, \mathbf{U}_{\sigma(i)}^{(d)} \tag{15}
$$

where the sole constraint is still $\sum_{i=1}^{n} L_i = 1$. In eq. 15, $(\sigma(1), .., \sigma(n))$ denotes any of the $n!$ permutations of $(1, 2, .., n)$. An asymmetric distribution of the $L_i$’s means here that the multivariate distribution of lengths $f(l_1, l_2, .., l_{n-1})$, with $l_n = 1 - \sum_{i=1}^{n-1} l_i$, is not invariant under permutations of the $l_i$’s. Thus, the univariate marginal distributions are not all identical. When convenient, the multivariate distribution, which is not necessarily a Dirichlet distribution, will equally be written hereafter as $f(l_1, l_2, .., l_n)$. The sum which gives $\mathbf{R}$ is commutative (eq. 15). The $n!$ possible attributions of a set of lengths $\{l_k,\ k = 1, .., n\}$ to the steps numbered $1, 2, .., n$ result then in indistinguishable walks. Then, the principle of indifference states that each permutation should be given a probability of $1/n!$ with the consequence that all steps end up with identical length distributions independently of their (arbitrary) order. It suffices then to symmetrize the initial distribution of $\mathbf{L}$ to make it invariant under permutations. We conclude that it is this permutation invariant distribution which is the sole meaningful step length distribution. An example is discussed in appendix B for $n = 2$. The permutation invariant distribution $f_\pi(l_1, l_2, .., l_n)$ associated with $f(l_1, l_2, .., l_n)$ is:

$$
f_\pi(l_1, l_2, .., l_n) = \frac{1}{n!} \sum_{\sigma} f\left(l_{\sigma(1)}, l_{\sigma(2)}, .., l_{\sigma(n)}\right) \tag{16}
$$

where the sum runs over the $n!$ permutations $\sigma$ of $(1, 2, .., n)$. Constrained random walks with different initial length distributions are thus indistinguishable if and only if the distributions $f_\pi(l_1, l_2, .., l_n)$ associated with them are identical. We apply now the previous discussion to random walks whose step length distributions are asymmetric Dirichlet distributions in the above sense. Dirichlet random walks with such step length distributions are not really “Dirichlet” as the relevant distribution (eq. 16) is no longer a symmetric Dirichlet distribution, except in some particular cases (eq. 18 below with $q_1 = q + 1$), but a mixture of asymmetric Dirichlet distributions. To distinguish between these two types of walks, the latter will be designated in abridged form as “asymmetric Dirichlet random walks”. We consider more particularly Dirichlet distributions of step lengths whose parameters are all equal except one taken to be the first, $\mathbf{q}^{(n)} = (q_1, q, .., q)$. The associated pdf (eq. 1) is explicitly:

$$
f(l_1, l_2, .., l_m) = \frac{\Gamma\left(q_1 + (n-1)q\right)}{\Gamma(q_1)\, \Gamma(q)^{n-1}}\; l_1^{\,q_1 - 1} \prod_{i=2}^{n} l_i^{\,q-1}, \qquad l_n = 1 - \sum_{i=1}^{m} l_i, \quad l_i > 0, \quad i = 1, .., n \tag{17}
$$

with $m = n - 1$. The marginal distributions of all $L_k$ $(k \ge 2)$ are identical beta distributions $L_k \sim Be\left(q,\, q_1 + (m-1)q\right)$ while that of $L_1$ is $Be(q_1, mq)$ ([36] and eq. A.3). The permutation invariant distribution associated with eq. 17 is then:

$$
f_\pi(l_1, l_2, .., l_n) = \frac{\Gamma\left(q_1 + (n-1)q\right)}{n\, \Gamma(q_1)\, \Gamma(q)^{n-1}} \left(\prod_{i=1}^{n} l_i^{\,q-1}\right) \sum_{i=1}^{n} l_i^{\,q_1 - q} \tag{18}
$$

as all $l_i$, except one, play the same role. Thus, the permutation invariant density (eq. 18) reduces to the Dirichlet law whose parameters are all equal to $q$ when $q_1 = q + 1$ because $\sum_{i=1}^{n} l_i = 1$. In other words, the asymmetric Dirichlet random walk whose parameters are $(q+1, q, .., q)$ is identical with the symmetric Dirichlet random walk whose parameters are $(q, q, .., q)$. The latter conclusion is valid for any $q > 0$. This results in the identity of the pdf’s, $p_d^{(q+1,q,..,q)}(\mathbf{r}) = p_d^{(q,q,..,q)}(\mathbf{r})$ and $P_d^{(q+1,q,..,q)}(r) = P_d^{(q,q,..,q)}(r)$, for any value of $n \ge 2$. In section 5 of [25], it was shown that the $n$-step Dirichlet random walks in $\mathbb{R}^d$, whose parameters are $(q, q, .., q)$ and $(q+1, q, .., q)$ $(q = d/2 - 1,\ d - 1)$, both yield hyperuniform random walks of type $k$ where $k$ is $n(d-2) + 2$ for $q = d/2 - 1$ and $n(d-1) + 1$ for $q = d - 1$. The families $(q, q, .., q)$ and $(q+1, q, .., q)$ were incorrectly considered to be different [25].
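For $n = 2$, the symmetrization of eq. 16 is simply the average of the beta density and of its mirror image. The following Python sketch, in which the helper names `beta_pdf` and `symmetrized_beta_pdf` are our own, illustrates numerically that the permutation invariant density associated with $Be(q+1, q)$ coincides with the symmetric density $Be(q, q)$.

```python
import math


def beta_pdf(l, a, b):
    """Density of Be(a, b) at 0 < l < 1 (eq. 2)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(l) + (b - 1) * math.log(1 - l))


def symmetrized_beta_pdf(l, a, b):
    """Permutation invariant density (eq. 16 with n = 2) associated with Be(a, b)."""
    # The two permutations of (l, 1 - l) each receive a weight 1/2.
    return 0.5 * (beta_pdf(l, a, b) + beta_pdf(1 - l, a, b))


# Compare the symmetrized Be(q + 1, q) with Be(q, q) at a few points.
q = 15 / 4
differences = [symmetrized_beta_pdf(l, q + 1, q) - beta_pdf(l, q, q)
               for l in (0.1, 0.3, 0.5, 0.9)]
```

The differences vanish up to rounding errors, which is the two-step instance of the identity between the families $(q+1, q, .., q)$ and $(q, q, .., q)$.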

5. Asymmetric Dirichlet random walks of two steps $W(d, 2, (q+s, q))$

In the case of two steps, now with $q_1 = q + s$ where $q$ takes any positive value and $s$ is a positive integer, the permutation invariant distribution (eq. 18) writes:

$$
f_\pi(l) = \frac{\left[l^s + (1-l)^s\right] \left[l(1-l)\right]^{q-1}}{2\, B(q+s, q)} \tag{19}
$$

Two formal representations of the pdf $P_d^{(q+s,q)}(r)$ are obtained from this permutation invariant pdf by the simple methods described below. The first representation is derived from eq. 9, $X = YZ$ (section 5.1), while the second (section 5.2) is based on an expansion of $l^s + (1-l)^s$ in terms of powers of $l(1-l)$.

5.1. First representation of $P_d^{(q+s,q)}(r)$

In appendix B, the distribution of $Y = 4L(1-L)$ (eq. 8) is shown to be a weighted sum of $\lfloor s/2 \rfloor + 1$ beta distributions with weights $c_k = \binom{s}{2k}\, B(q, k + 1/2) \,/\, \left[2^{2q+s-1}\, B(q+s, q)\right]$, $k = 0, .., \lfloor s/2 \rfloor$ (eq. B.6). The distribution of $X = YZ = 1 - R^2$ (eqs 8 and 9) is consequently a mixture of $\lfloor s/2 \rfloor + 1$ distributions of r.v.’s $X_k$, $p_X(x) = \sum_{k=0}^{\lfloor s/2 \rfloor} c_k\, p_{X_k}(x)$. The distribution of each $X_k$ is that of a product of independent beta r.v.’s, $X_k = Y_k Z$, with $Y_k \sim Be(q, k + 1/2)$ (eq. B.6) and $Z \sim Be\left((d-1)/2, (d-1)/2\right)$ (section 3). It is given by the following relation (eq. C.1):

$$
p_{X_k}(x) = \frac{B\left(\frac{d-1}{2},\, k + \frac{1}{2}\right)}{B\left(q,\, k + \frac{1}{2}\right) B\left(\frac{d-1}{2},\, \frac{d-1}{2}\right)}\; x^{q-1} (1-x)^{k + d/2 - 1}\; {}_2F_1\!\left(\frac{d-1}{2},\, q + k + 1 - \frac{d}{2};\, k + \frac{d}{2};\, 1 - x\right), \qquad 0 \le x \le 1 \tag{20}
$$

The sought-after pdf is finally obtained from the relation $P_d^{(q+s,q)}(r) = 2r\, p_X(1 - r^2)$ (section 3):

$$
P_d^{(q+s,q)}(r) = \frac{r^{d-1} (1 - r^2)^{q-1}}{2^{2q+s-2}\, B(q+s, q)\, B\left(\frac{d-1}{2}, \frac{d-1}{2}\right)} \sum_{k=0}^{\lfloor s/2 \rfloor} \binom{s}{2k} B\left(\frac{d-1}{2},\, k + \frac{1}{2}\right) r^{2k}\; {}_2F_1\!\left(\frac{d-1}{2},\, q + k + 1 - \frac{d}{2};\, k + \frac{d}{2};\, r^2\right), \qquad q > 0,\ d \ge 2 \tag{21}
$$

This first representation is a sum of $\lfloor s/2 \rfloor + 1$ non-negative contributions. This is not the case for the second one which is now derived.

5.2. Second representation of $P_d^{(q+s,q)}(r)$

To express explicitly the permutation invariant density $f_\pi(l)$ (eq. 19) as a linear combination of symmetric pdf’s, we need first to expand $l^s + (1-l)^s$ in terms of powers of $l(1-l)$. To this end, we use a classical expansion [43-44]:

$$
x^s + y^s = \sum_{j=0}^{\lfloor s/2 \rfloor} (-1)^j\, C(s, j)\, (xy)^j\, (x+y)^{s-2j} \tag{22}
$$

For reasons made clear later (eq. 32, appendix D), we prefer to write it as:

$$
\frac{x^s + y^s}{(x+y)^s} = \sum_{j=0}^{\lfloor s/2 \rfloor} (-1)^j\, C(s, j) \left[\frac{xy}{(x+y)^2}\right]^j \tag{23}
$$

with $x + y \neq 0$ and $s \ge 1$. The case where $s = 1$, which reduces to $1 = 1$, is solved in section 4.2 with the result that $P_d^{(q+1,q)}(r) = P_d^{(q,q)}(r)$. This case is no longer considered even if eqs 26 and 28 below hold for $s = 1$. The numbers $C(s, j)$, known as coefficients of Lucas (or Cardan) polynomials [45], are given by:

$$
C(s, j) =
\begin{cases}
1, & j = 0,\ s \ge 0 \\[1ex]
\dfrac{s}{j} \dbinom{s-j-1}{j-1} = \dfrac{s}{s-j} \dbinom{s-j}{j}, & j = 1, 2, .., \lfloor s/2 \rfloor
\end{cases}
\tag{24}
$$
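The expansion of eq. 22 can be checked exactly with integer arithmetic. In the Python sketch below, the function names `lucas_coefficient` and `power_sum_expansion` are ours, and the alternating sign $(-1)^j$ is written explicitly in the sum, the coefficients $C(s, j)$ themselves being taken as positive; this sign convention is an assumption of the sketch.

```python
from math import comb


def lucas_coefficient(s, j):
    """C(s, j) = s/(s-j) * binom(s-j, j) for 1 <= j <= s/2, and 1 for j = 0 (eq. 24)."""
    if j == 0:
        return 1
    # The quotient is an exact integer, equal to (s/j) * binom(s-j-1, j-1).
    return s * comb(s - j, j) // (s - j)


def power_sum_expansion(x, y, s):
    """Right-hand side of eq. 22: x^s + y^s expanded in powers of xy and x + y."""
    return sum((-1) ** j * lucas_coefficient(s, j) * (x * y) ** j * (x + y) ** (s - 2 * j)
               for j in range(s // 2 + 1))


coefficients_s6 = [lucas_coefficient(6, j) for j in range(4)]
```

For $s = 6$ the coefficients are 1, 6, 9 and 2, i.e. $x^6 + y^6 = p^6 - 6p^4 t + 9p^2 t^2 - 2t^3$ with $p = x + y$ and $t = xy$.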

From eq. 22, we write:

$$
l^s + (1-l)^s = \sum_{j=0}^{\lfloor s/2 \rfloor} (-1)^j\, C(s, j) \left[l(1-l)\right]^j \tag{25}
$$

We are now ready to express explicitly the permutation invariant density $f_\pi(l)$ as a linear combination of $\lfloor s/2 \rfloor + 1$ beta distributions from which the pdf $P_d^{(q+s,q)}(r)$ can be obtained immediately from the known pdf $P_d^{(q',q')}(r)$ given in section 3 (eqs 12 and 13). First, eq. 19 writes:

$$
f_\pi(l) = \sum_{j=0}^{\lfloor s/2 \rfloor} w_j\, \frac{\left[l(1-l)\right]^{q+j-1}}{B(q+j, q+j)} \tag{26}
$$

With:

$$
w_j = (-1)^j\, \frac{C(s, j)\, B(q+j, q+j)}{2\, B(q+s, q)} = (-1)^j\, C(s, j)\, \frac{(2q)_s\, (q)_j\, (q)_j}{2\, (q)_s\, (2q)_{2j}} \tag{27}
$$

We deduce then the sought-after pdf of $R$:

$$
P_d^{(q+s,q)}(r) = \sum_{j=0}^{\lfloor s/2 \rfloor} w_j\, P_d^{(q+j,q+j)}(r), \qquad q > 0,\ s \ge 1 \tag{28}
$$

The coefficients $w_j$ of this expansion, which depend only on the distribution of $L$, remain the same for any space dimension $d$. From eq. 13 (noticing that $2^{2q-1} B(q, q) = B(q, 1/2)$), eq. 28 can be written explicitly as:

$$
P_d^{(q+s,q)}(r) = \frac{2^{d-2q-1}\, r^{d-1} (1 - r^2)^{(d-3)/2}}{B(q+s, q)} \sum_{j=0}^{\lfloor s/2 \rfloor} (-1)^j\, \frac{C(s, j)}{4^j}\; {}_2F_1\!\left(\frac{1}{2},\, d - q - j - 1;\, \frac{d}{2};\, r^2\right) \tag{29}
$$

Equivalently, from eq. 12:

$$
P_d^{(q+s,q)}(r) = \frac{2^{d-2q-1}\, r^{d-1} (1 - r^2)^{q-1}}{B(q+s, q)} \sum_{j=0}^{\lfloor s/2 \rfloor} (-1)^j\, C(s, j) \left(\frac{1 - r^2}{4}\right)^j {}_2F_1\!\left(\frac{d-1}{2},\, q + j + 1 - \frac{d}{2};\, \frac{d}{2};\, r^2\right) \tag{30}
$$

A direct proof of the identity of the two representations of the pdf $P_d^{(q+s,q)}(r)$ might be obtained from a cascade of Gauss’ transformations between contiguous hypergeometric functions applied to eq. 21, as done in appendix E for $s = 2, 3$. Indeed, the third argument $k' + d/2$ of the hypergeometric functions ${}_2F_1\!\left(\frac{d-1}{2},\, q + k' + 1 - \frac{d}{2};\, k' + \frac{d}{2};\, r^2\right)$ of eq. 21 must be transformed into $d/2$ for any $k'$ $\left(k' = 1, .., \lfloor s/2 \rfloor\right)$. However, this calculation appears to be complicated. As distributions with bounded supports are uniquely determined by their moments of positive integer order $k$, it suffices to prove that both representations yield identical moments $\langle X^k \rangle_{d,q+s,q} = \langle (1 - R^2)^k \rangle_{d,q+s,q}$ (eq. 8) for the walks $W(d, 2, (q+s, q))$. Indeed, the pdf of $R$ is obtained from the pdf $p_X(x)$ as $P_d^{(q+s,q)}(r) = 2r\, p_X(1 - r^2)$ (section 3); conversely, $p_X(x) = P_d^{(q+s,q)}\!\left(\sqrt{1-x}\right) / \left(2\sqrt{1-x}\right)$. Thus, the identification of the distribution of $X$ from its moments leads ipso facto to that of $R$ and vice versa.

5.3. Moments $\langle (1 - R^2)^k \rangle_{d,q+s,q}$ of the walks $W(d, 2, (q+s, q))$

The first representation of $P_d^{(q+s,q)}(r)$ (eq. 21) is directly derived from eq. 9 which yields $\langle X^k \rangle_{d,q+s,q} = \langle Y^k \rangle_{q+s,q}\, \langle Z^k \rangle_d$. The moments $\langle Y^k \rangle_{q+s,q}$ are for instance obtained from eq. B.8 for the mixture of distributions while $\langle Z^k \rangle_d$ is given above eq. 10. Thus:

$$
\langle (1 - R^2)^k \rangle_{d,q+s,q} = \frac{4^k\, (q+s)_k\, (q)_k \left(\frac{d-1}{2}\right)_k}{(2q+s)_{2k}\, (d-1)_k} \tag{31}
$$

8 It is equivalent to apply eq. 10 to a walk whose step length is distributed according to an asymmetric beta distribution   , Be q s q  to get eq. 31. The calculation of the moments from the second representation (eq. 28) makes use of the following relation which mirrors eq. 23 in which powers are replaced by Pochhammer symbols:                 s j j js s js j x yx y C s jx y x y             

0, 2 s x y s    . As we failed to find eq. 32 in the literature, we derive it in appendix D. Eqs 27 and 32 confirm readily that   s j jj w        . Eq. 10 applied now to the symmetric walk   , 2, W d q j  gives:                , , 2 k k k k kd q j q j k k dq j q jR q j d        . The factor      k kk q j q jq j   can be rewritten as               

22 2

22 2 2 j j jk kk j j j q q k q kq qq q q q k     . Together with the expression of j w (eq. 27), the moment   , , k d q s q R   becomes:                           , , 20 2 s k k k k s kd q s q k s kj j jj j dq q qR q q dq k q kC s j q k                   The bracketed sum in the right-hand side of eq. 34 is equal to   

22 2 ss q kq k  (eq. 32). Then:                      , , 2 k k k s s kkd q s q k s s k dq q q kR q q q q k d               The bracketed product in eq. 35 simplifies into    k k q sq s  , so that we obtain finally: 9                , , 2 k k k k kd q s q k k dq s qR q s d       which is identical with eq. 31. Without surprise, the previous calculation proves that the densities     , dq s q P r  obtained from eqs 21 and 29 are two representations of the same pdf. It is indeed the permutation invariant length distribution     f l  (eq. 19) associated with the asymmetric beta distribution   , Be q s q  which is the relevant one as it yields the exact pdf’s of the endpoint position and of the final distance.     , 2, , W d q s q  For

2, 3 s  , eq. 28 becomes:                           ,2, 1, 1,3, 1, 1 d d dq qq q q qd d dq qq q q q q qP r P r P rq qq qP r P r P rq q                                   These linear combinations were previously derived by G. Letac (personal communication, 2014). Explicit densities, which are obtained from the pdf’s of sections 5.1 and 5.2, are given in appendix E for

$s = 2, 3$, where a direct calculation proves that the two representations of $P_{d,(q+s,q)}(r)$ are identical for these two values of $s$.

We also performed Monte Carlo simulations of $W(d, 2, (q+s, q))$ random walks for $q = 15/4$ with $s$ ranging between 1 and 6 (figure 1). The beta r.v. $L \sim Be(15/4+s, 15/4)$ (fig. 1a) was simulated with a method described by Devroye (section IX.4 of [48]). As shown above, the asymmetric beta distribution $Be(15/4+s, 15/4)$ and its permutation invariant counterpart $\left[Be(15/4+s, 15/4) + Be(15/4, 15/4+s)\right]/2$ give rise to identical walks $W(d, 2, (15/4+s, 15/4))$. In numerical simulations, advantage is then taken of the fact that the simplest distribution to simulate is the former, $Be(15/4+s, 15/4)$, despite the fact that it is the latter which is the reference distribution, as confirmed by fig. 1b, which compares the simulated pdf’s $P_{3,(15/4+s,15/4)}(r)$ to those obtained from the symmetrized distribution (eq. 29).

Figure 1: Comparison of the results of Monte Carlo simulations of 8.10 Dirichlet random walks of two steps in $\mathbb{R}^3$, $W(3, 2, (15/4+s, 15/4))$, with those calculated for $q = 15/4$ and for $s$ varying from 0 to 6 as indicated. The differences between simulated and calculated results are of the order of line thicknesses. Calculated results are obtained from: a) eq. 2 for the step length pdf of $L \sim Be(15/4+s, 15/4)$; b) eq. 29 for the pdf of the final distance $R$, $P_{3,(15/4+s,15/4)}(r)$ (the pdf’s $P_{3,(15/4,15/4)}(r)$, $s = 0$, and $P_{3,(19/4,15/4)}(r)$, $s = 1$, are identical as discussed in section 4.2).

When $s$ becomes larger and larger for a given $q$, the step length distribution $Be(q+s, q)$ concentrates more and more in the vicinity of $l = 1$. Consequently, one of the step lengths decreases progressively down to 0 while the other increases concomitantly up to 1. The resulting pdf of the endpoint distance becomes steeper and steeper in the vicinity of 1, as shown by fig. 1.
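The equivalence between the asymmetric law and its symmetrized counterpart can be probed numerically. The following Python sketch, restricted to the standard library, is only illustrative: it uses `random.betavariate` in place of the Devroye method cited above, it draws uniform directions by normalizing Gaussian vectors (an assumption of this sketch), and the function names are hypothetical. It compares the simulated second moment of the final distance with the value $\langle R^2\rangle = 1 - 2ab/[(a+b)(a+b+1)]$, which follows from $R^2 = L^2 + (1-L)^2 + 2L(1-L)\cos\theta$ with $\langle\cos\theta\rangle = 0$ and $L \sim Be(a, b)$.

```python
import math
import random


def unit_vector(d, rng):
    """Uniform direction in R^d, obtained by normalizing d standard normals."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def final_distance(d, a, b, rng, symmetrize=False):
    """One draw of R = |L U1 + (1 - L) U2| with L ~ Be(a, b).

    With symmetrize=True the parameters are swapped with probability 1/2,
    i.e. L follows the mixture [Be(a, b) + Be(b, a)] / 2.
    """
    if symmetrize and rng.random() < 0.5:
        a, b = b, a
    step = rng.betavariate(a, b)
    u1 = unit_vector(d, rng)
    u2 = unit_vector(d, rng)
    return math.sqrt(sum((step * x + (1.0 - step) * y) ** 2
                         for x, y in zip(u1, u2)))


def exact_mean_r2(a, b):
    """<R^2> = 1 - 2 <L(1-L)>; the cross term vanishes by isotropy."""
    return 1.0 - 2.0 * a * b / ((a + b) * (a + b + 1.0))


def simulate_mean_r2(d, a, b, n, seed, symmetrize=False):
    """Monte Carlo estimate of <R^2> from n independent two-step walks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += final_distance(d, a, b, rng, symmetrize) ** 2
    return total / n


if __name__ == "__main__":
    q, s, d = 15.0 / 4.0, 3, 3
    print("exact      :", exact_mean_r2(q + s, q))
    print("asymmetric :", simulate_mean_r2(d, q + s, q, 20000, seed=1))
    print("symmetrized:", simulate_mean_r2(d, q + s, q, 20000, seed=2,
                                           symmetrize=True))
```

Both estimates should agree with the exact value within statistical error. A second moment is of course a much weaker check than the comparison of full densities shown in fig. 1.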

6. Conclusion

First, explicit relations (eqs 12-13) have been given for the pdf’s $p_{d,(q,q)}(\mathbf{r})$ and $P_{d,(q,q)}(r)$ of the endpoint position and of the final distance of a two-step Dirichlet random walk, $W(d, 2, q)$, whose associated step length distribution is a symmetric beta distribution $Be(q, q)$. These expressions are valid for any value of $q > 0$ and for any $d \ge 2$ such that ( q d ). The specific parameters $q = d/2 - 1$ and $q = d - 1$ of the two families of hyperuniform Dirichlet random walks are seen to emerge quite naturally from the previous pdf’s.

Second, $n$-step random walks whose step length densities are asymmetric have been considered, to conclude that the relevant distributions are actually the permutation invariant step length distributions associated with the initial distributions (eq. 16).

Last, the previous results have been applied to the case of two-step random walks, $W(d, 2, (q+s, q))$, with an asymmetric step length distribution $Be(q+s, q)$, where $q$ has any positive value and $s$ is an integer ($s \ge 1$). Two representations have been derived for the pdf $P_{d,(q+s,q)}(r)$ (eqs 21, 29-30), both as sums of $\lfloor s/2 \rfloor + 1$ terms. The first one is a sum of non-negative terms and the second is a linear combination, with coefficients of opposite signs, of the pdf’s of symmetric Dirichlet walks $W(d, 2, q+j)$ ($j = 0, \dots, \lfloor s/2 \rfloor$).

Acknowledgments:

I thank Gérard Letac (University of Toulouse, France), Emanuele Casini and Andrea Martinelli (University of Como, Italy) for fruitful discussions on asymmetric length distributions and for a critical reading of the first version of the manuscript. Gérard Letac generalized some of the results discussed in section 4 (personal communication).

Appendix A: Connections between gamma and Dirichlet distributions and between the associated random walks

A gamma distributed random variable $G$, denoted for brevity as $G \sim \gamma(\alpha, \theta)$, has a probability density function $p_G(x)$ given by [36]:

$$p_G(x) = \frac{x^{\alpha-1}\exp(-x/\theta)}{\theta^{\alpha}\,\Gamma(\alpha)}, \qquad x > 0 \qquad \text{(A.1)}$$

where $\alpha$ is the shape parameter and $\theta$ the scale parameter, while $\Gamma(\alpha)$ is the Euler gamma function. The characteristic function of $G$ is $\varphi_G(t) = \langle e^{itG}\rangle = (1 - i\theta t)^{-\alpha}$ [36]. A sum $G$ of $n$ independent gamma random variables, $G_i \sim \gamma(q_i, \theta)$, $i = 1, .., n$, with identical scale parameters and a priori different shape parameters, is a gamma random variable $G \sim \gamma(n\bar{q}, \theta)$, with $n\bar{q} = \sum_{i=1}^{n} q_i$. This is readily deduced from the characteristic function of the sum $G = \sum_{k=1}^{n} G_k$, $\varphi_G(t) = \langle e^{itG}\rangle = \prod_{k=1}^{n} (1 - i\theta t)^{-q_k} = (1 - i\theta t)^{-n\bar{q}}$. As the scale parameter is irrelevant in the present context, its value will be fixed at 1 from now on.

The Dirichlet distribution can be obtained [36] from a set of $n = m + 1$ independent gamma random variables $G_i \sim \gamma(q_i, 1)$ ($i = 1, ..., n$) by defining $G = \sum_{j=1}^{n} G_j \sim \gamma(n\bar{q}, 1)$ and $L_j = G_j / G$ ($j = 1, ..., n$). The distribution of $\mathbf{L}^{(n)} = (L_1, L_2, .., L_n)$ is then a Dirichlet distribution with parameters $\mathbf{q}^{(n)} = (q_1, ..., q_n)$, $\mathbf{L}^{(n)} \sim D(\mathbf{q}^{(n)})$, and a pdf ([36], p. 17):

$$f(l_1, .., l_m) = K_n \prod_{i=1}^{n} l_i^{q_i - 1}, \qquad l_n = 1 - \sum_{i=1}^{m} l_i, \quad l_i \ge 0, \quad i = 1, .., n \qquad \text{(A.2)}$$

where $K_n = \Gamma(n\bar{q}) / \prod_{i=1}^{n} \Gamma(q_i)$. If the $n$ components of $\mathbf{L}^{(n)} \sim D(\mathbf{q}^{(n)})$ are collected into $k$ groups and summed up to form $k$ new components $\mathbf{S}^{(k)} = (S_1, .., S_k)$, $\sum_{i=1}^{k} S_i = 1$, then the distribution of $\mathbf{S}^{(k)}$ is $D(\mathbf{q}^{*(k)})$, where each $q_i^{*}$ ($i = 1, .., k$) is the sum of the parameters $q_j$ of the components of $\mathbf{L}^{(n)}$ which add up to $S_i$. This amalgamation property [36] results directly from the characteristics of the sum of independent gamma random variables described above. The marginal distribution of any component $L_k$ ($k = 1, .., n$) is then obtained from the two components $S_1 = L_k$ and $S_2 = \sum_{i \ne k} L_i = 1 - S_1$. The marginal distribution $f(l_k)$ of $L_k$ is thus a beta distribution whose pdf is [36]:

$$f(l_k) = \frac{l_k^{q_k - 1}(1 - l_k)^{n\bar{q} - q_k - 1}}{B(q_k, n\bar{q} - q_k)}, \qquad l_k \in [0, 1] \qquad \text{(A.3)}$$

We consider the stochastic relation between the $n$-dimensional random vectors $\mathbf{G}^{(n)}$ and $\mathbf{L}^{(n)}$ (section 4 of [26]):

$$\mathbf{G}^{(n)} \triangleq S\,\mathbf{L}^{(n)} \qquad \text{(A.4)}$$

where $A \triangleq B$ means that the random variables $A$ and $B$ are identically distributed, $S$ is gamma distributed, $S \sim \gamma(nq, 1)$, $\mathbf{L}^{(n)} = (L_1, L_2, .., L_n)$ is Dirichlet distributed, $\mathbf{L}^{(n)} \sim D(\mathbf{q}^{(n)})$ with $\mathbf{q}^{(n)} = (q, q, .., q)$, and $S$ and $\mathbf{L}^{(n)}$ are independent. Then the vector $\mathbf{G}^{(n)} = (G_1 = SL_1, G_2 = SL_2, .., G_n = SL_n)$ has independent $\gamma(q, 1)$ components ([36], p. 148). From $\mathbf{G}^{(n)}$, we define a gamma random walk $G(d, n, q)$ whose endpoint position is:

$$\mathbf{R}_G = \sum_{i=1}^{n} G_i\,\mathbf{U}_i^{(d)} \triangleq S \sum_{i=1}^{n} L_i\,\mathbf{U}_i^{(d)} \qquad \text{(A.5)}$$

where the $\mathbf{U}_i^{(d)}$ ($i = 1, .., n$) are $n$ independent and identically distributed unit vectors. Similarly, $\mathbf{L}^{(n)}$ defines a Dirichlet random walk $W(d, n, q)$ whose endpoint position is:

$$\mathbf{R} = \sum_{i=1}^{n} L_i\,\mathbf{U}_i^{(d)} \qquad \text{(A.6)}$$
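The gamma construction of the Dirichlet law described above translates directly into a sampling recipe. The following Python sketch, which relies only on the standard library, draws a Dirichlet vector by normalizing independent unit-scale gamma variables and then builds the endpoint of eq. A.6. The function names are illustrative, and drawing uniform directions by normalizing Gaussian vectors is an assumption of this sketch rather than a method stated in the text.

```python
import random


def dirichlet_from_gammas(q, rng):
    """Draw (L_1, ..., L_n) ~ Dirichlet(q) as L_j = G_j / sum_i G_i.

    q   : sequence of positive shape parameters (q_1, ..., q_n)
    rng : random.Random instance
    """
    # Independent gamma variables with unit scale, G_i ~ gamma(q_i, 1).
    g = [rng.gammavariate(qi, 1.0) for qi in q]
    total = sum(g)
    return [gi / total for gi in g]


def unit_vector(d, rng):
    """Uniform direction in R^d, obtained by normalizing d standard normals."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 0.0:
            return [x / norm for x in v]


def dirichlet_walk_endpoint(d, q, rng):
    """Endpoint R = sum_i L_i U_i of one Dirichlet random walk (eq. A.6)."""
    lengths = dirichlet_from_gammas(q, rng)
    endpoint = [0.0] * d
    for li in lengths:
        u = unit_vector(d, rng)
        endpoint = [e + li * x for e, x in zip(endpoint, u)]
    return endpoint


if __name__ == "__main__":
    rng = random.Random(0)
    q = (0.5, 1.5, 2.0)
    draws = [dirichlet_from_gammas(q, rng) for _ in range(20000)]
    # The marginal of L_1 is a beta law (eq. A.3) with mean q_1 / sum(q).
    print(sum(x[0] for x in draws) / len(draws), q[0] / sum(q))
```

Because the step lengths sum to 1, every endpoint produced this way lies inside the unit ball of the walk space.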

The connection between the endpoint positions $\mathbf{R}_G$ and $\mathbf{R}$ of the random walks $G(d, n, q)$ and $W(d, n, q)$, which results from eq. A.4, can then be condensed in the following stochastic representation:

$$\mathbf{R}_G \triangleq S\,\mathbf{R} \qquad \text{(A.7)}$$

where $S$ and $\mathbf{R}$ are independent. We denote the pdf’s of the endpoint position and of the endpoint distance of the gamma walk $G(d, n, q)$ respectively as $g_d^{(n)}(\mathbf{r}; \mathbf{q})$ and $G_d^{(n)}(r; \mathbf{q})$. Translating eq. A.7 in terms of pdf’s, $g_d^{(n)}(\mathbf{r}; \mathbf{q})$ is related to the pdf of the endpoint position of the Dirichlet walk $W(d, n, q)$ by (section 4 of [26]):

$$g_d^{(n)}(\mathbf{r}; \mathbf{q}) = \frac{r^{nq-d}}{\Gamma(nq)} \int_0^{\infty} e^{-rt}\, t^{nq-d-1}\, p_d^{(n)}(1/t; \mathbf{q})\, dt$$

with $p_d^{(n)}(\mathbf{r}; \mathbf{q}) = 0$ for $r > 1$. The pdf $g_d^{(n)}(\mathbf{r}; \mathbf{q})$ is thus, up to the prefactor, the Laplace transform of $t^{nq-d-1}\, p_d^{(n)}(1/t; \mathbf{q})$. The known pdf $g_d^{(n)}(\mathbf{r}; \mathbf{q})$ of the walk $G(d, n, d)$ was used in [26] to derive $p_d^{(n)}(\mathbf{r}; \mathbf{q})$ for the walk $W(d, n, d)$. Finally, $G_d^{(n)}(r; \mathbf{q})$ is given by

$$G_d^{(n)}(r; \mathbf{q}) = \frac{2\pi^{d/2}}{\Gamma(d/2)}\, r^{d-1}\, g_d^{(n)}(\mathbf{r}; \mathbf{q})$$

(eq. 4).

Appendix B: Distribution of $Y = 4L(1-L)$ for $L \sim Be(q_1, q_2)$

The continuous r.v.’s $L$ and $Y$ have a common support, $[0, 1]$. When $n = 2$, the Dirichlet distribution (eq. 1) reduces to a beta distribution, $L \sim Be(q_1, q_2)$ (eq. 2). From the relation between probabilities:

$$P(Y \le y) = P\!\left(L \le \frac{1 - \sqrt{1-y}}{2}\right) + P\!\left(L \ge \frac{1 + \sqrt{1-y}}{2}\right) \qquad \text{(B.1)}$$

and from $\left|\dfrac{dl}{dy}\right| = \dfrac{1}{4\sqrt{1-y}}$, we get the pdf of $Y$:

$$p_Y(y) = \frac{1}{4\sqrt{1-y}} \left[ f_L\!\left(\frac{1 - \sqrt{1-y}}{2}\right) + f_L\!\left(\frac{1 + \sqrt{1-y}}{2}\right) \right] \qquad \text{(B.2)}$$

When $q_1 = q_2 = q$, $f_L(l) = l^{q-1}(1-l)^{q-1} / B(q, q)$, eq. B.2 becomes, after applying the duplication formula of the gamma function, $\Gamma(2q) = 2^{2q-1}\,\Gamma(q)\,\Gamma(q + 1/2)/\sqrt{\pi}$:

$$p_Y(y) = \frac{1}{2\sqrt{1-y}}\, \frac{(y/4)^{q-1}}{B(q, q)} = \frac{y^{q-1}(1-y)^{-1/2}}{B(q, 1/2)} \qquad \text{(B.3)}$$

which shows that $Y$ has a beta distribution, $Y \sim Be(q, 1/2)$ [29]. When the distribution of $L$ is asymmetric, $q_1 \ne q_2$, $f_L(l) = l^{q_1-1}(1-l)^{q_2-1} / B(q_1, q_2)$, we define $q = \min(q_1, q_2)$, $Q = \max(q_1, q_2)$ and $\Delta q = Q - q = |q_1 - q_2|$, to write:

$$p_Y(y) = \frac{(y/4)^{q-1}}{4\sqrt{1-y}\, B(q_1, q_2)} \left[ \left(\frac{1 + \sqrt{1-y}}{2}\right)^{\Delta q} + \left(\frac{1 - \sqrt{1-y}}{2}\right)^{\Delta q} \right] \qquad \text{(B.4)}$$

When $\Delta q \ne 0$, the pdf’s of $Y$ (eq. B.4), and consequently the pdf’s of the final distance $R$ (eqs 8 and 9), are seen to be identical when obtained either from the length distribution, $L \sim Be(q_1, q_2)$, or from the symmetric mixture, $L \sim \left[Be(q_1, q_2) + Be(q_2, q_1)\right]/2$. This point is further discussed in section 4. When $\Delta q = s$, where $s$ is a positive integer, the distribution of $Y$ becomes a mixture of beta distributions. It reduces to a beta distribution, $Y \sim Be(q, 1/2)$, for $s = 1$ (see also section 4.2). Using the fact that:

$$\left(\frac{1 + \sqrt{1-y}}{2}\right)^{s} + \left(\frac{1 - \sqrt{1-y}}{2}\right)^{s} = 2^{1-s} \sum_{k=0}^{\lfloor s/2 \rfloor} \binom{s}{2k} (1-y)^{k} \qquad \text{(B.5)}$$

the distribution of $Y$ (eq. B.4) becomes indeed:

$$p_Y(y) = \sum_{k=0}^{\lfloor s/2 \rfloor} c_k\, \frac{y^{q-1}(1-y)^{k-1/2}}{B(q, k + 1/2)}, \qquad c_k = \binom{s}{2k} \frac{B(q, k + 1/2)}{2^{2q+s-1}\, B(q+s, q)} \qquad \text{(B.6)}$$

It is a mixture of $\lfloor s/2 \rfloor + 1$ beta distributions, $p_Y(y) = \sum_{k=0}^{\lfloor s/2 \rfloor} c_k\, p_{Y_k}(y)$ with $Y_k \sim Be(q, k + 1/2)$. As only normalized distributions have been dealt with throughout the calculation, it follows that $\sum_{k=0}^{\lfloor s/2 \rfloor} c_k = 1$. Equivalently, the latter relation reads:

$$\sum_{k=0}^{\lfloor s/2 \rfloor} \binom{s}{2k} B(q, k + 1/2) = 2^{2q+s-1}\, B(q+s, q) \qquad \text{(B.7)}$$

A direct proof of eq. B.7 is straightforward, being the converse of the previous calculation. The definition of the beta function (eq. 2), the use of eq. B.5 and a change of variable $x = \sqrt{1-t}$ give indeed:

$$\sum_{k=0}^{\lfloor s/2 \rfloor} \binom{s}{2k} B(q, k + 1/2) = \int_0^1 t^{q-1}(1-t)^{-1/2} \sum_{k=0}^{\lfloor s/2 \rfloor} \binom{s}{2k} (1-t)^{k}\, dt = \int_{-1}^{1} (1+x)^{q+s-1}(1-x)^{q-1}\, dx = 2^{2q+s-1}\, B(q+s, q)$$

where the last equality results from integral 3.196.3 of [42]. Similarly, the moments $\langle Y^k \rangle_{q+s,q}$ are easily retrieved from the mixture of distributions, using eqs 2, B.6 and B.7:

$$\langle Y^k \rangle_{q+s,q} = \sum_{n=0}^{\lfloor s/2 \rfloor} c_n \langle Y_n^k \rangle = \frac{1}{2^{2q+s-1} B(q+s, q)} \sum_{n=0}^{\lfloor s/2 \rfloor} \binom{s}{2n} B(q+k, n + 1/2) = \frac{4^k\, B(q+s+k, q+k)}{B(q+s, q)} = \frac{4^k\, (q)_k\, (q+s)_k}{(2q+s)_{2k}} \qquad \text{(B.8)}$$

which is the contribution of $\langle Y^k \rangle_{q+s,q}$ in eq. 10 when $L \sim Be(q+s, q)$.

Appendix C: Distribution of the product of two independent beta r.v.’s

Let us consider a r.v. $X$ which is the product of two independent r.v.’s $Y$ and $Z$ which are distributed according to beta distributions, respectively $Y \sim Be(\alpha, \beta)$ and $Z \sim Be(\gamma, \delta)$. Then the pdf of $X$ is given by [38-41]:

$$p_X(x) = \frac{B(\beta, \delta)}{B(\alpha, \beta)\, B(\gamma, \delta)}\, x^{\alpha-1} (1-x)^{\beta+\delta-1}\; {}_2F_1(\alpha + \beta - \gamma, \delta; \beta + \delta; 1 - x), \qquad 0 < x < 1 \qquad \text{(C.1)}$$

where ${}_2F_1(a, b; c; x)$ is a Gauss hypergeometric function. The moments $\langle X^k \rangle$, where $k$ is chosen here to be a positive integer, are simply obtained from the product of the corresponding moments of $Y$ and $Z$, both given by eq. 2 (eq. C.2):

Let us consider a r.v. X which is the product of two independent r.v.’s Y and Z which are distributed according to beta distributions, respectively   , Y Be   and   , Z Be   . Then the pdf of X is given by [38-41]:              , 1- , ; ;1 C.1, , X Bp x x x F xB B                          x  , where   , ; ; F a b c x is a Gauss hypergeometric function. The moments k X , where k is chosen here to be a positive integer, are simply obtained from the product of the corresponding moments of Y and Z given both by eq. 2:          C.2 k kk k k k k

X Y Z         Appendix D: The relation               n j j j n nj j n x y x yC n j x y x y         (eq. 32) Relations, in which powers are “replaced” by Pochhammer symbols, have been known for a long time. This is for instance the case for the expansion of the power of a binomial whose analog in term of Pochhammer symbols is the Vandermonde’s identity       n k n kk n nx y x yk         (see among others [37]). The relation we consider is similarly analogous to a polynomial identity, n n x y           n j j n jj C n j xy x y        (eqs 22 and 23). Robbins [44] mentions that the latter identity would be due to Lucas and possibly to Lagrange while Gould [43] shows that it is a special case of a formula first established by Girard in 1629 and later given by Waring in the eighteen century. The validity of eq. 32 is proven here by a method which is likely only one among many others and is in no way claimed to be the simplest one. Another example of a pair of analogous relations is given by eqs D.7 and D.9 below. Here, we consider the case where n is an integer larger than 1 as the relation reduces to 1=1 for n  . We assume further that   n x y   . We define first the sum:                    , 1 1 , D.1, = 1 2 ! n j j jn j j x yS x y C n j x yn j n n jnC n j jn j n j j                       where the   , C n j are the Lucas coefficients [45]. Because nj      , the Pochhammer symbols   j n  and   j n  are different from zero for n  . The falling factorial is defined as:                   j j j a a a j a a a j a             

0 Thus:                                j j jjj j j n j n n n nn j n n n n n                      where the duplication formula (eq. 2) has been applied to   j n  . Finally, applying again the duplication formula to   j x y  , eq. D.1 becomes:                 n j j j jn j j j j n n x yS x y x y x yn j                       The latter relation can be expressed in term of a generalized hypergeometric function:     n n n x y x yS x y F x y n          Prudnikov et al. don’t give the value of the latter hypergeometric function but they give instead the following value (7.5.3.57 of [46]):          n n n x yn n x y x yF x y n x y x y               

Reversing the method used above, eq. D.6 can be converted back into the following sum, which involves Pochhammer symbols, after dividing each member by   x y  :                 n j j j n nj j n x y x yn jj x y x y x y                 The polynomial identity given as eq. 22 of [43] is: 1           n n nj n j jj n j x yx y xyj x y              Interestingly, eq. D.7 is then seen to be the analogous of the following identity obtained by dividing each member of eq. D.8 by   n x y   :           j n j j n nj nj n j x y x yj x y x y x y                  Back to eq. D.6, we use first a relation between contiguous hypergeometric functions [47]:            

1, , , ; , 1, 1, ;, , , ; , 1, , ;, , , ; , , 1, ; 0 D.10

F n zF n zF n z                                     

With n n x y x yx y n                   , eq. D.10 becomes:       n n x y x yx y n F x y nn n x y x yx y F x y nn n x y x yx y n F x y n                                

From relation D.6, the first and the third hypergeometric functions in eq. D.11 are respectively equal to        n n n x yx y x y     and        n n n x yx y x y      while the second is the hypergeometric function we wish to express. Therefore: 2                                              n n n nn nn n n nn n n x y x yF x y nx y n x y x y n x yx y x y x y x y x y x yx y x y n x yx y x y                                    Writing now:                 n n n n x y n x y x n y n x y                      n n n n x y x y y n y y x x n x                           n n n n x y x y x y         we obtain finally the relation sought for from eqs D.5 and D.12:                 n j j j n nj j n x y x yC n j x y x y         When x y  , the hypergeometric function (eq. D.5) becomes n nF x n x       which can then be obtained from relation 7.4.4.113 of [46],        n nn n n b x b nn nF x b x b n b x b              . This leads in turn to eq. D.13 with x y  . 3 Appendix E: The pdf’s     dq q P r  and     dq q P r  (eq. 37) For the first representation and for s  , eq. 21 gives:           d q qd dq q P r r rB q qd d d r d d dF q r F q rd                                 while for s  :           d q qd dq q P r r rB q qd d d r d d dF q r F q rd                                

For the second representation, eqs 12 and 37 yield respectively:                  

11 1 22, 22 1 2 22 1 qd d dq q

P r r rq d d dF q rq B qq d d dr F q rq B q                              and                  

11 1 23, 22 1 2 22 1 qd d dq q

P r r rq d d dF q rq B qq d d dr F q rq B q                             

4 The hypergeometric function

22 1

12 , ; 1;2 2 2 d d dF q r       is then expressed from a Gauss’relation for contiguous hypergeometric functions (eq. 15.2.20 of [49]) as:    

22 1 2 2 22 1 2 12

12 , ; 1;2 2 2 1 11 , ; ; 1 2 , ; ; E.52 2 2 2 2 2 d d dF q rd d d d d d dF q r r F q rr                            

From  

B q q =   q B q  we obtain:         q q q qB q q q B q q B q                 It suffices now to insert eq. E.5 into eq. E.1 and to use eq. E.6 to transform the first representation of the pdf     dq q P r  (eq. E.1) into the second (eq. E.3). Similarly:         q q q qB q q q B q q B q                 The left hypergeometric function on the right-hand side of eq. E.2 is multiplied by four once eq. E.5 has been inserted into eq. E.2. Then, eq. E.7 transforms the first representation of the pdf     dq q P r  (eq. E.2) into the second (eq. E.4). The two representations of     , dq s q P r  (sections 5.1 and 5.2) are thus proven by a direct calculation to be identical for

$s = 2, 3$.

References:

1. K. Pearson, The problem of the random walk, Nature (1905) 294; (1905) 342.
2. K. Pearson, A Mathematical Theory of Random Migration, Mathematical Contributions to the Theory of Evolution XV. Draper’s Company Research Memoirs, Biometric Series. Dulau and Co, London, 1906.
3. J. Bojarski, R. Smolenski, A. Kempski, P. Lezynski, Pearson’s random walk approach to evaluating interference generated by a group of converters, Appl. Math. Comput. (2013) 6437–6444.
4. A. Stannard, P. Coles, Random-walk statistics and the spherical harmonic representation of CMB maps, Mon. Not. R. Astron. Soc. (2005) 929–933.
5. M. Hansen, A.M. Frejsel, J. Kim, P. Naselsky, F. Nesti, Pearson’s random walk in the space of the CMB phases: Evidence for parity asymmetry, Phys. Rev. D (2011) 103508 (9 pages).
6. P.H.F. Reimberg, L.R. Abramo, CMB and random flights: temperature and polarization in position space, JCAP (2013) 043 (31 pages).
7. P.H.F. Reimberg, L.R. Abramo, Random flights through spaces of different dimensions, J. Math. Phys. (2015) 013512 (10 pages).
8. G.N. Watson, A treatise on the theory of Bessel functions, Cambridge University Press, Cambridge, 1995.
9. S. Chandrasekhar, Stochastic problems in physics and astronomy, Rev. Mod. Phys. (1943) 1–89.
10. J.E. Kiefer, G.H. Weiss, The Pearson random walk, AIP Conf. Proc. (1984) 11–32.
11. J. Dutka, On the problem of random flights, Arch. Hist. Exact Sci. (1985) 351–375.
12. E.A. Codling, M.J. Plank, S. Benhamou, Random walk models in biology, J. R. Soc. Interface (2008) 813–834.
13. W. Stadje, The exact probability distribution of a two-dimensional random walk, J. Stat. Phys. (1987) 207–216.
14. G.L. Vignoles, W. Ros, C. Mulat, O. Coindreau, C. Germain, Pearson random walk algorithms for fiber-scale modeling of Chemical Vapor Infiltration, Comp. Mater. Sci. (2011) 1157–1168.
15. A. Zoia, E. Dumonteil, A. Mazzolo, Collision densities and mean residence times for d-dimensional exponential flights, Phys. Rev. E (2011) 041137 (11 pages).
16. G.M. Viswanathan, V. Afanasyev, S.V. Buldyrev, E.J. Murphy, P.A. Prince, H.E. Stanley, Lévy flight search patterns of wandering albatrosses, Nature (1996) 413–415.
17. A.M. Edwards, R.A. Phillips, N.W. Watkins, M.P. Freeman, E.J. Murphy, V. Afanasyev, S.V. Buldyrev, M.G.E. da Luz, E.P. Raposo, H.E. Stanley, G.M. Viswanathan, Revisiting Lévy flight search patterns of wandering albatrosses, bumblebees and deer, Nature (2007) 1044–1049.
18. S. Benhamou, How many animals really do the Lévy walk?, Ecology (2007) 1962–1969.
19. G.M. Viswanathan, M.G.E. da Luz, E.P. Raposo, H.E. Stanley, The Physics of Foraging: An Introduction to Random Searches and Biological Encounters, Cambridge University Press, Cambridge, 2011.
20. N.E. Humphries, H. Weimerskirch, N. Queiroz, E.J. Southall, D.W. Sims, Foraging success of biological Lévy flights recorded in situ, PNAS (2012) 7169–7174.
21. M. Franceschetti, When a random walk of fixed length can lead uniformly anywhere inside a hypersphere, J. Stat. Phys. (2007) 813–823.
22. E. Orsingher, A. De Gregorio, Random flights in higher spaces, J. Theor. Probab. (2007) 769–806.
23. A.D. Kolesnik, Random motion at finite speed in higher dimensions, J. Stat. Phys. (2008) 1039–1065.
24. A.D. Kolesnik, The explicit probability distribution of a six-dimensional random flight, Theory Stoch. Process. (2009) 33–39.
25. G. Le Caër, A Pearson random walk with steps of uniform orientation and Dirichlet distributed lengths, J. Stat. Phys. (2010) 728–751.
26. G. Le Caër, A new family of solvable Pearson-Dirichlet random walks, J. Stat. Phys. (2011) 23–45.
27. A. De Gregorio, E. Orsingher, Flying randomly in $\mathbb{R}^d$ with Dirichlet displacements, Stoch. Proc. Appl. (2012) 676–713.
28. A. De Gregorio, A family of random walks with generalized Dirichlet steps, J. Math. Phys. (2014) 023302 (17 pages).
29. G. Letac, M. Piccioni, Dirichlet random walks, J. Appl. Probab. (2014) 1081–1099.
30. A.A. Pogorui, R.M. Rodríguez-Dagnino, Isotropic random motion at finite speed with K-Erlang distributed direction alternations, J. Stat. Phys. (2011) 102–112.
31. A.A. Pogorui, R.M. Rodríguez-Dagnino, Random motion with uniformly distributed directions and random velocity, J. Stat. Phys. (2012) 1216–1225.
32. A.A. Pogorui, R.M. Rodríguez-Dagnino, Random motion with gamma steps in higher dimensions, Stat. Probabil. Lett. (2013) 1638–1643.
33. R. García-Pelayo, Exact solution for isotropic random flights in odd dimensions, J. Math. Phys. (2012) 103504 (15 pages).
34. E. d’Eon, Rigorous asymptotic and moment-preserving diffusion approximations for generalized linear Boltzmann transport in arbitrary dimension, arXiv:1312.1412 [cs.GR], to appear in Journal of Computational and Theoretical Transport (2014).
35. J. Aitchison, The Statistical Analysis of Compositional Data, Chapman and Hall, London, 1986.
36. K.-T. Fang, S. Kotz, K.-W. Ng, Symmetric Multivariate and Related Distributions, Chapman and Hall, London, 1990.
37. http://functions.wolfram.com/GammaBetaErf/Pochhammer/introductions/FactorialBinomials/
38. S.Y. Dennis III, On the distribution of independent beta variables, Commun. Statist. Theor. Meth. (1994) 1895–1913.
39. T. Pham-Gia, N. Turkkan, The product and quotient of general beta distributions, Stat. Papers (2002) 537–550.
40. D.K. Nagar, E. Zarrazola, Distributions of the product and the quotient of independent Kummer-beta variables, Sci. Math. Jpn. (2005) 109–117.
41. S. Nadarajah, Reply to “Comments on ‘Sums, Products, and Ratios of Non-Central Beta Variables’ by Saralees Nadarajah”, Commun. Statist. Theor. Meth. (2010) 837–854.
42. I.S. Gradshteyn, I.M. Ryzhik, Tables of Integrals, Series and Products, Academic Press, New York, 1980.
43. H.W. Gould, The Girard-Waring power sum formulas for symmetric functions and Fibonacci sequences, Fibonacci Quart. (1999) 135–140.
44. N. Robbins, Vieta’s triangular array and a related family of polynomials, Internat. J. Math. & Math. Sci. 14