Posterior Probabilities for Lorenz and Stochastic Dominance of Australian Income Distributions

Abstract

Using HILDA data for the years 2001, 2006, 2010, 2014 and 2017, we compute posterior probabilities for dominance for all pairwise comparisons of income distributions in these years. The dominance criteria considered are Lorenz dominance and first and second order stochastic dominance. The income distributions are estimated using an infinite mixture of gamma density functions, with posterior probabilities computed as the proportion of Markov chain Monte Carlo draws that satisfy the inequalities that define the dominance criteria. We find welfare improvements from 2001 to 2006 and qualified improvements from 2006 to the later three years. Evidence of an ordering between 2010, 2014 and 2017 cannot be established.



David Gunawan

University of Wollongong

William E. Griffiths

University of Melbourne

Duangkamon Chotikapanich

Monash University

24 March 2020


1. Introduction

Concern over the welfare implications of income growth and inequality has led to numerous studies of the Australian income distribution and how it has changed over time. Wilkins (2013) examines the impact of using alternative data sources on trends in income inequality. The alternative data sources he considers are ABS data from the Survey of Income and Housing, HILDA data, tax records, and National Accounts. Sila and Dugain (2019) use HILDA data to examine how changes in inequality relate to income growth in different segments of the population and to labour participation. Using ABS data, Coelli and Borland (2016) investigate the effect of changes in the occupation structure on job polarization and earnings inequality. Chatterjee et al. (2016) characterize wage inequality over an individual’s life cycle using HILDA data, and examine the contributing factors to inequality. A review of labour market inequality, with an analysis using HILDA data, has been provided by Borland and Coelli (2016). These are just a few recent examples of the many studies which have appeared in the literature; others, particularly earlier studies, can be found from these sources.

A characteristic of most studies is that they use a single index such as the Gini coefficient to measure inequality. We add to the existing literature by providing evidence on Lorenz dominance, a stricter condition for ordering inequality. We also consider first and second order stochastic dominance as broader concepts for welfare comparisons. Using HILDA data, we estimate Australian income distributions for 2001, 2006, 2010, 2014 and 2017. Bayesian methods for assessing Lorenz and stochastic dominance are used to examine welfare changes over time for the whole population and for subgroups of the population: the poor, and metropolitan and non-metropolitan residents. Stochastic dominance is related to welfare changes for broad classes of social welfare functions.
First-order stochastic dominance implies greater utility for all social welfare functions that are strictly increasing. Second-order stochastic dominance provides an unambiguous welfare ranking for the class of social welfare functions that are increasing and concave. For two income distributions with the same mean income, Lorenz dominance implies greater utility with respect to all strictly increasing and concave social welfare functions. Generalized Lorenz dominance, where Lorenz curves are multiplied by mean income, is equivalent to second-order stochastic dominance. While we focus on computing posterior probabilities for Lorenz dominance, and first and second-order stochastic dominance, our computations can be extended to third and higher-order stochastic dominance criteria. Third-order stochastic dominance is relevant for ordering distributions for social welfare functions where the marginal utility function is positive, decreasing, and strictly convex. See, for example, Chakravarty (2009) and Le Breton and Peluso (2009). The higher the degree of dominance, the greater the number of restrictions that need to be imposed on the social welfare function and the weaker the dominance inequality. That is, first-order stochastic dominance implies second-order stochastic dominance, and so on, but the converse is not true.

Notes: ABS is an acronym for Australian Bureau of Statistics. HILDA is an acronym for the Household, Income and Labour Dynamics in Australia survey (Watson and Wooden, 2012). The HILDA Project was initiated and is funded by the Australian Government Department of Social Services (DSS) and is managed by the Melbourne Institute of Applied Economic and Social Research (Melbourne Institute). The findings and views reported in this paper, however, are those of the authors and should not be attributed to either DSS or the Melbourne Institute.
Assessing Lorenz or stochastic dominance involves comparing the complete range of two curves: Lorenz curves for Lorenz dominance, generalized Lorenz curves for second-order stochastic dominance, and distribution functions for first-order stochastic dominance. Because the curves, or points on those curves, are estimated from sample data, they are subject to sampling error. The existence of sampling error has led to a multitude of nonparametric statistical tests designed to test whether the difference between two curves is “significant”. These tests consider estimates at a number of points; some examine the maximum distance between curves at these points, others use the joint distribution over a number of points; some use dominance as the null hypothesis, others use no dominance as the null. An extensive list of such studies is referenced in Lander et al. (2020), hereafter LGGC. An example applied to Australian income distributions estimated with ABS data is the work of Valenzuela et al. (2014), hereafter VLA. They use tests proposed by Barrett and Donald (2003) to test for first and second order stochastic dominance, comparing several distributions over the years 1983 to 2010. The distributions they consider are gross income, disposable income, and two expenditure categories.

Note: VLA describe their test statistic for second-order stochastic dominance as $S_{STAT} = \max_i \sum_i \left[ \hat{F}(p_i) - \hat{G}(p_i) \right]$, where $\hat{F}(p_i)$ and $\hat{G}(p_i)$ are empirical distribution functions at points $p_1, p_2, \ldots, p_q$. We suspect that they mean $S_{STAT} = \max_i \left[ \bar{x}_F \hat{F}^{(1)}(p_i) - \bar{x}_G \hat{G}^{(1)}(p_i) \right]$, where $\hat{F}^{(1)}(p_i)$ and $\hat{G}^{(1)}(p_i)$ are empirical first moment distribution functions and $\bar{x}_F$ and $\bar{x}_G$ are the sample means of the distributions $F$ and $G$.

The Bayesian approach that we adopt follows that in LGGC, and differs from the sampling theory approach adopted by VLA in two main ways. First, it is parametric rather than nonparametric. We estimate parametric income distributions implying, correspondingly, parametric Lorenz curves, generalized Lorenz curves, and distribution functions. The parametric approach has both an advantage and a disadvantage. The obvious disadvantage is the extra assumption that the income distribution can be modelled using a specific functional form. The advantage is that estimation of the difference between two curves is more precise, particularly in the tails, where dominance assessment can be both sensitive and important: sensitive because the two curves being compared converge at the tails, and important because of the treatment of poverty in the left tail. We minimize the disadvantage by using a very flexible model for the income distributions, an infinite mixture of gamma densities, extending the work of LGGC who used a finite mixture of gamma densities.

A second difference between our Bayesian approach and the nonparametric sampling theory approach adopted by VLA and others is that we do not specify null and alternative hypotheses. Instead, we find three posterior probabilities: the probability of dominance in each direction, and the probability of no dominance. These probabilities are estimated using the proportion of Markov chain Monte Carlo (MCMC) parameter draws, obtained from estimation of the mixture of gamma densities, for which one curve is greater than another over the range of population proportions.
To ensure an accurate calculation over this range we consider a fine grid of population proportions, from 0.001 to 0.999, at 0.001 intervals. Using this approach, it is also possible to focus on some segments of the population, like the poor, if they are deemed to have particular importance. As explained in LGGC, providing three posterior probabilities is more informative than the outcomes of many sampling theory tests. The dilemma faced by sampling-theory tests is well illustrated by considering the results in VLA. Suppose we have two distributions $X$ and $Y$. VLA perform two tests, one where the null hypothesis is that $X$ dominates $Y$ and one where the null hypothesis is that $Y$ dominates $X$. They conclude that $X$ dominates $Y$ (say) if the first null hypothesis is not rejected, and the second is rejected. However, non-rejection of a null hypothesis does not imply the null hypothesis is true; only that there is not strong evidence against the null. The proper conclusion is that either $X$ dominates $Y$ or neither distribution is dominant. LGGC provide examples of where the VLA strategy would lead to a conclusion that one distribution is first-order stochastically dominant even when an empirical distribution function difference $\hat{F}(x) - \hat{G}(x)$ is positive for some $x$ and negative for some $x$. It is brave to conclude that one distribution dominates another when estimates of the two curves cross; intuition suggests dominance in neither direction is a more likely scenario. Reporting posterior probabilities for the three possible outcomes solves this dilemma. In Table 1 we summarize possible sampling theory test outcomes, the conclusions that correspond to those reached by VLA, and what we regard as the correct conclusions. In the rows corresponding to the first two test outcomes, dominance in one direction is a possibility, but failure to reject a null hypothesis of dominance does not constitute strong evidence that dominance exists.
The proper conclusion must allow for the possibility that neither distribution is dominant. In the third test outcome, there is strong evidence that neither distribution is dominant. There is no ambiguity about the conclusion. In the final test outcome in Table 1, where dominance in both directions is not rejected, it is not possible to accept both null hypotheses; that would imply $X$ dominates $Y$ and $Y$ dominates $X$. Faced with this awkward outcome, VLA record this result as “insignificant”. It occurs when two curves are close together relative to the standard deviation of their difference. In this instance, one curve may dominate another, but because they are close together, dominance cannot be rejected. Alternatively, the curves may cross but remain close together, preventing rejection of a null hypothesis. Acknowledging the proper test outcomes from this sampling theory approach shows that this approach can never be conclusive about the existence of dominance. No dominance is always a possibility. Finding posterior probabilities overcomes this dilemma. Having curves that are far apart will lead to a high posterior probability that one dominates the other, a conclusive outcome. Suppose that the curves cross, with a large maximum positive difference and a small maximum negative difference, an outcome that fits one of the first two test scenarios in Table 1. The greater precision in estimation afforded by the parametric specification means that the lack of dominance is more likely to be picked up by the Bayesian approach. A high posterior probability of no dominance is the likely outcome. For the fourth test scenario in Table 1, where the curves are close together, the posterior probability of dominance is likely to be slightly greater than 0.5 if the curves do not cross, and close to zero if they do cross. We are not the first set of authors to recognize that dominance can never be established when using a null hypothesis that one distribution dominates another.
To overcome this problem, Davidson and Duclos (2013) propose testing a null hypothesis of non-dominance, and devise a test based on an empirical likelihood test statistic. In this case, rejection of the null leads to a legitimate claim of dominance. However, they go on to show that, with continuous distributions, a null hypothesis of non-dominance can never be rejected unless the range of income is restricted. The problem lies in the tails, where all quantile functions converge to zero or one. Faced with this dilemma, Davidson and Duclos argue that the only empirically sensible approach to take is to test for “restricted dominance”, with the null of non-dominance and the alternative of dominance specified over a restricted range of income. By considering population proportions from 0.001 to 0.999, we are implicitly considering restricted dominance. The poorest 0.1% and the richest 0.1% of the population are being ignored. Not surprisingly, our results are sensitive to this choice. However, as we shall see, it is easy to quantify how the posterior probability of dominance changes for different ranges of population proportions.

In addition to extending the finite gamma mixture in LGGC to an infinite gamma mixture, in the current paper we combine the MCMC algorithm for estimating the infinite gamma mixture with an algorithm that accommodates the sampling weights provided with the HILDA data. The Bayesian bootstrap is used to generate pseudo random samples using an algorithm that is described in Dong et al. (2014) and evaluated further in Gunawan et al. (2020).

The format of the paper is as follows. In Section 2 we summarize the conditions for Lorenz and first and second order stochastic dominance. An outline of the estimation algorithm for the infinite mixture of gamma densities is presented in Section 3. In Section 4 we describe how the MCMC draws are used to estimate dominance probabilities.

[Footnote: Examples of this kind can be found in LGGC.]
The data are described in Section 5 and results presented in Section 6. Some concluding remarks are made in Section 7.
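
The four sampling theory test outcomes discussed above can be written as a small decision table. The sketch below is illustrative only: the function name `classify` and the wording of the returned labels are assumptions, since Table 1 itself is not reproduced in this text, and the VLA-style label for the third outcome is inferred from the statement that this case is unambiguous.

```python
def classify(reject_x_dom_y: bool, reject_y_dom_x: bool) -> dict:
    """Map two dominance-test outcomes to a VLA-style and a cautious conclusion.

    reject_x_dom_y: True if the null 'X dominates Y' is rejected.
    reject_y_dom_x: True if the null 'Y dominates X' is rejected.
    Label wording is hypothetical; it paraphrases the discussion in the text.
    """
    if not reject_x_dom_y and reject_y_dom_x:
        # Outcome 1: non-rejection is not proof, so 'neither' remains possible.
        return {"vla": "X dominates Y",
                "cautious": "X dominates Y or neither dominates"}
    if reject_x_dom_y and not reject_y_dom_x:
        # Outcome 2: mirror image of outcome 1.
        return {"vla": "Y dominates X",
                "cautious": "Y dominates X or neither dominates"}
    if reject_x_dom_y and reject_y_dom_x:
        # Outcome 3: both nulls rejected, strong evidence of no dominance.
        return {"vla": "neither dominates",
                "cautious": "neither dominates"}
    # Outcome 4: neither null rejected; nothing can be ruled out.
    return {"vla": "insignificant",
            "cautious": "X dominates Y, Y dominates X, or neither dominates"}


outcome = classify(reject_x_dom_y=False, reject_y_dom_x=True)
```

In every row of this table the cautious conclusion either is, or includes, "neither dominates", which is the sense in which the testing approach can never be conclusive about the existence of dominance.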

2. Dominance Conditions

Consider an income distribution $X$ with mean $\mu_X$, density function $f_X(x)$, distribution function $F_X(x) = \int_0^x f_X(t)\,dt$, and first moment distribution function

$$F_X^{(1)}(x) = \frac{1}{\mu_X} \int_0^x t\, f_X(t)\,dt.$$

For a given proportion of the population $u$, a convenient way to write the corresponding Lorenz curve is

$$L_X(u) = F_X^{(1)}\!\left( F_X^{-1}(u) \right), \qquad 0 \le u \le 1, \tag{1}$$

where $F_X^{-1}(u)$ is the quantile function for $X$. With these definitions, and corresponding ones for another distribution $Y$, we can write the required dominance conditions as follows.

Lorenz dominance ($X \ge_{LD} Y$):

$$L_X(u) \ge L_Y(u) \ \text{for all } 0 \le u \le 1 \quad \text{and} \quad L_X(u) > L_Y(u) \ \text{for some } 0 < u < 1. \tag{2}$$

Second-order stochastic dominance ($X \ge_{SSD} Y$):

$$\mu_X L_X(u) \ge \mu_Y L_Y(u) \ \text{for all } 0 \le u \le 1 \quad \text{and} \quad \mu_X L_X(u) > \mu_Y L_Y(u) \ \text{for some } 0 < u < 1. \tag{3}$$

First-order stochastic dominance ($X \ge_{FSD} Y$):

$$F_X^{-1}(u) \ge F_Y^{-1}(u) \ \text{for all } 0 \le u \le 1 \quad \text{and} \quad F_X^{-1}(u) > F_Y^{-1}(u) \ \text{for some } 0 < u < 1. \tag{4}$$

First-order stochastic dominance can also be stated in terms of distribution functions instead of quantile functions. Another representation of second-order stochastic dominance is in terms of integrals of distribution functions. However, (2), (3) and (4) are convenient for our purpose because the range of the argument $u$ (the proportion of population) must lie within the (0,1) interval. Further details can be found, for example, in Davidson and Duclos (2000), Lambert (2001) and Maasoumi (1997).

First-order stochastic dominance implies the level of income from distribution $X$ is greater than or equal to the level of income from distribution $Y$ at all population proportions $u$. With second-order stochastic dominance, the sum of incomes below any population proportion $u$ is at least as great for $X$ as the corresponding sum for distribution $Y$. For Lorenz dominance, the Lorenz curve for $X$ lies nowhere below that for $Y$ for all $u$. It is a stronger condition than a comparison of two inequality indices such as the Gini coefficient.
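
Conditions (2) to (4) can be checked numerically on a grid of population proportions. The sketch below is a minimal illustration that uses empirical curves from two small made-up samples rather than the parametric curves estimated in this paper; the function names `curves` and `dominates`, the tolerance `tol`, and the data are all assumptions introduced for the example.

```python
import numpy as np


def curves(sample, u):
    """Empirical quantile, Lorenz and generalized Lorenz curves on grid u."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Quantile function: smallest order statistic whose ECDF value is >= u.
    idx = np.clip(np.ceil(u * n).astype(int) - 1, 0, n - 1)
    quantile = x[idx]
    # Generalized Lorenz curve mu * L(u): cumulative income per capita,
    # linearly interpolated between the points k / n.
    cum = np.concatenate(([0.0], np.cumsum(x))) / n
    gen_lorenz = np.interp(u, np.arange(n + 1) / n, cum)
    lorenz = gen_lorenz / x.mean()
    return quantile, lorenz, gen_lorenz


def dominates(cx, cy, tol=1e-12):
    """Weak inequality at every grid point and strict inequality at some point."""
    return bool(np.all(cx >= cy - tol) and np.any(cx > cy + tol))


u = np.arange(1, 1000) / 1000.0               # 0.001, 0.002, ..., 0.999
y = np.array([10.0, 20.0, 30.0, 40.0, 100.0])  # hypothetical incomes
x = y + 5.0                                    # everyone in X has 5 more

qx, lx, gx = curves(x, u)
qy, ly, gy = curves(y, u)

fsd = dominates(qx, qy)   # condition (4)
ssd = dominates(gx, gy)   # condition (3)
ld = dominates(lx, ly)    # condition (2)
```

In this constructed case all three conditions hold for $X$ over $Y$: adding the same amount to every income raises every quantile and also reduces relative inequality. With estimated curves the outcome will typically differ across the three criteria.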

3. Estimating an Infinite Mixture of Gamma Densities

For estimating the income distributions using an infinite mixture of gamma densities, we employ the Dirichlet process mixture model proposed by Escobar and West (1995) in the context of normal distributions. In this Section we provide a brief sketch of the model and some aspects of the estimation procedure. The MCMC algorithm is described in an appendix. For a more complete description see Gelman et al. (2014, Ch. 23). An infinite mixture of gamma densities can be written as

$$p(y \mid \mathbf{w}, \boldsymbol{\mu}, \mathbf{v}) = \sum_{k=1}^{\infty} w_k\, G(y \mid v_k, v_k/\mu_k),$$

where $y$ is a random income draw from the probability density function (pdf) $p(y \mid \mathbf{w}, \boldsymbol{\mu}, \mathbf{v})$, with parameter vectors $\mathbf{w} = (w_1, w_2, \ldots)'$, $\boldsymbol{\mu} = (\mu_1, \mu_2, \ldots)'$ and $\mathbf{v} = (v_1, v_2, \ldots)'$. The pdf $G(y \mid v_k, v_k/\mu_k)$ is a gamma density with mean $\mu_k > 0$ and shape parameter $v_k > 0$,

$$G(y \mid v_k, v_k/\mu_k) = \frac{(v_k/\mu_k)^{v_k}}{\Gamma(v_k)}\, y^{v_k - 1} \exp\!\left( -\frac{v_k}{\mu_k}\, y \right).$$

Note: A similar model, using an infinite mixture of lognormal distributions to estimate income distributions, was used by Hasegawa and Kozumi (2003).

Each component of the mixture is assumed to be random, with randomness introduced by placing distributions on the parameters $v_k$ and $\mu_k$. These distributions are known as base distributions; they are denoted by $G_v$ and $G_\mu$. We assume that $G_\mu$ is an inverted gamma density, $IG(v_0, s_0)$, and $G_v$ is an exponential distribution with parameter $\lambda$. The densities are

$$p(\mu \mid v_0, s_0) = \frac{s_0^{v_0}}{\Gamma(v_0)}\, \mu^{-(v_0 + 1)} \exp\!\left( -\frac{s_0}{\mu} \right) \quad \text{and} \quad p(v \mid \lambda) = \lambda \exp(-\lambda v).$$

Further randomness is introduced through the weights $w_k$, which are assigned values according to what is known as a stick-breaking process, written as $w_k \sim SBP(\alpha)$, where

$$w_k = \eta_k \prod_{j<k} (1 - \eta_j),$$

with the $\eta_k$ being draws from the beta distribution, $\eta_k \sim Beta(1, \alpha)$. The parameters $(v_0, s_0, \lambda)$ are prior parameters whose values are set by the investigator. A hierarchical prior $\alpha \sim G(\alpha_0, \beta_0)$ is used for $\alpha$, with $(\alpha_0, \beta_0)$ being set by the investigator.

The intuition behind the stick-breaking process is as follows. We have a stick of unit length from which probabilities $w_k$ are to be allocated to each component of the mixture. The first probability $w_1 = \eta_1$ is drawn from a beta distribution with parameters $(1, \alpha)$. There is then $1 - \eta_1$ of the stick remaining to be allocated. To obtain a second value $w_2$, we break off a proportion $\eta_2 \sim Beta(1, \alpha)$ from the stick of length $1 - \eta_1$ and set $w_2 = \eta_2 (1 - \eta_1)$. As we proceed, the stick gets shorter and shorter, and the $w_k$ decrease stochastically. The effect of these assumptions is to create a posterior distribution for the weights which is a Dirichlet distribution with parameters that are a weighted average of those implied by the base distribution $G = (G_v, G_\mu)$ and the proportion of observations allocated to each component of the mixture. The prior parameter $\alpha$ is called a concentration parameter and controls the relative weight placed on $G$.

The apparent necessity to sample an infinite number of parameters at each iteration of the algorithm is avoided by using the slice sampler proposed by Walker (2007). At each iteration, this device truncates the infinite number of components using a uniformly distributed latent random variable, $\tilde{u}_i \sim \text{uniform}(0,1)$. It turns out that the joint density of $(y_i, \tilde{u}_i)$ for $i = 1, 2, \ldots, n$ is given by

$$p(y_i, \tilde{u}_i \mid \mathbf{w}, \boldsymbol{\mu}, \mathbf{v}) \propto \sum_{k=1}^{\infty} I(\tilde{u}_i < w_k)\, G(y_i \mid v_k, v_k/\mu_k),$$

where $I(\cdot)$ is an indicator function. Thus, the slice sampler truncates the infinite number of components to a finite number of components, say $K$. Let

$$w_{K+1} = \sum_{l=K+1}^{\infty} w_l$$

be the residual weight. Then, posterior sampling is with respect to the weights $\mathbf{w} = (w_1, w_2, \ldots, w_K, w_{K+1})$. Another latent variable $s_i$ identifies the component of the mixture from which $y_i$ is to be taken. That is,

$$(y_i \mid s_i = k, \mathbf{v}, \boldsymbol{\mu}) \sim G(y_i \mid v_k, v_k/\mu_k),$$

where $\Pr(s_i = k \mid \mathbf{w}) = w_k$ for $k = 1, 2, \ldots$. Including $s_i$ leads to a joint distribution of $y_i$, $\tilde{u}_i$ and $s_i$ given by

$$p(y_i, \tilde{u}_i, s_i \mid \mathbf{w}, \boldsymbol{\mu}, \mathbf{v}) \propto I(\tilde{u}_i < w_{s_i})\, G(y_i \mid v_{s_i}, v_{s_i}/\mu_{s_i}).$$

With the two sets of latent variables $\tilde{\mathbf{u}} = (\tilde{u}_1, \tilde{u}_2, \ldots, \tilde{u}_n)$ and $\mathbf{s} = (s_1, s_2, \ldots, s_n)$, the complete likelihood function for the observed income $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ is given by

$$p(\mathbf{y}, \tilde{\mathbf{u}}, \mathbf{s} \mid \mathbf{w}, \mathbf{v}, \boldsymbol{\mu}) \propto \prod_{i=1}^{n} I(\tilde{u}_i < w_{s_i})\, G(y_i \mid v_{s_i}, v_{s_i}/\mu_{s_i}).$$

With this background, the hierarchical model for an infinite mixture of gamma densities can be summarized as

$$w_k \sim SBP(\alpha)$$
$$\mu_k \sim G_\mu \quad \text{for } k = 1, 2, \ldots, \infty$$
$$v_k \sim G_v \quad \text{for } k = 1, 2, \ldots, \infty$$
$$(y_i \mid s_i = k) \sim G(y_i \mid v_k, v_k/\mu_k)$$
$$\Pr(s_i = k) = w_k$$
$$(\tilde{u}_i \mid s_i = k) \sim U(0, w_k)$$

Let $\boldsymbol{\theta}^{(m)} = \left( \mathbf{s}^{(m)}, \mathbf{v}^{(m)}, \boldsymbol{\mu}^{(m)}, \mathbf{w}^{(m)}, \tilde{\mathbf{u}}^{(m)} \right)$ be the $m$-th draw of the parameters from the posterior distribution of the parameters. The conditional predictive density for a new observation $y^*$ is

$$p(y^* \mid \mathbf{y}, \boldsymbol{\theta}^{(m)}) = \sum_{k=1}^{K^{(m)}+1} w_k^{(m)}\, G\!\left( y^* \mid v_k^{(m)}, v_k^{(m)}/\mu_k^{(m)} \right).$$

The corresponding unconditional predictive density is

$$\hat{p}(y^* \mid \mathbf{y}) = \frac{1}{M} \sum_{m=1}^{M} p(y^* \mid \mathbf{y}, \boldsymbol{\theta}^{(m)}),$$

where $M$ is the total number of MCMC draws from the posterior distribution.

To accommodate the sampling weights provided with the HILDA data, in the MCMC algorithm the original observations are replaced by a representative sample drawn at each iteration using the Bayesian bootstrap procedure described in Gunawan et al. (2020). The steps are described in the Appendix. In our empirical work we generated 200 pseudo representative samples, and, for each of these samples, we obtained a total of 6,000 MCMC draws. The first 1,000 of the MCMC draws were discarded as a burn-in, and every 100th draw of the remaining 5,000 draws was retained. This strategy gave a total of 10,000 draws for dominance assessment. The settings for the prior parameters were $v_0 = s_0 =$ [value lost in extraction] and $\lambda =$ [value lost in extraction]. For $\alpha$ we used a hierarchical gamma prior $\alpha \sim G(10, 10)$.
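
The stick-breaking construction and the mean/shape parameterization of the gamma components can be illustrated with a short sketch. This is not the paper's MCMC algorithm (which is in its appendix): it only draws prior weights for a truncated stick and evaluates a mixture density. The function names, the truncation level `K=20`, the value `alpha=1.0` and the seed are assumptions chosen for illustration.

```python
import math
import numpy as np


def stick_breaking(alpha, K, rng):
    """Draw w_1..w_K by stick-breaking and append the residual weight w_{K+1}."""
    eta = rng.beta(1.0, alpha, size=K)                 # eta_k ~ Beta(1, alpha)
    # remaining[k] is the stick length left before the (k+1)-th break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - eta)))
    w = eta * remaining[:-1]                           # w_k = eta_k * prod_{j<k}(1 - eta_j)
    return np.append(w, remaining[-1])                 # residual = prod_{j<=K}(1 - eta_j)


def gamma_pdf(y, v, mu):
    """Gamma density with shape v and mean mu, i.e. rate v / mu."""
    rate = v / mu
    log_pdf = v * math.log(rate) + (v - 1.0) * math.log(y) - rate * y - math.lgamma(v)
    return math.exp(log_pdf)


def mixture_pdf(y, w, v, mu):
    """Finite mixture density sum_k w_k G(y | v_k, v_k / mu_k)."""
    return sum(wk * gamma_pdf(y, vk, mk) for wk, vk, mk in zip(w, v, mu))


rng = np.random.default_rng(0)
w = stick_breaking(alpha=1.0, K=20, rng=rng)

# Draw accounting from the text: 200 pseudo samples, 6,000 draws each,
# 1,000 burn-in draws discarded, every 100th remaining draw retained.
n_retained = 200 * ((6000 - 1000) // 100)
```

Because the residual weight is appended, the `K + 1` weights returned by `stick_breaking` sum to one, matching the truncated weight vector $(w_1, \ldots, w_K, w_{K+1})$ used in posterior sampling.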

4. Estimating Posterior Probabilities of Dominance

Having obtained MCMC draws on complete parameter vectors from two distributions, $\boldsymbol{\theta}_X$ and $\boldsymbol{\theta}_Y$, we are in a position to compute corresponding values for Lorenz curves $L_X(u; \boldsymbol{\theta}_X)$ and $L_Y(u; \boldsymbol{\theta}_Y)$, generalized Lorenz curves $\mu_X L_X(u; \boldsymbol{\theta}_X)$ and $\mu_Y L_Y(u; \boldsymbol{\theta}_Y)$, and quantile functions $F_X^{-1}(u; \boldsymbol{\theta}_X)$ and $F_Y^{-1}(u; \boldsymbol{\theta}_Y)$, for a grid of $u$ values. We chose values from 0.001 to 0.999 at intervals of 0.001. For each MCMC draw there is a distribution function for an infinite mixture of gamma densities given by

$$F(y \mid \mathbf{v}^{(m)}, \mathbf{w}^{(m)}, \boldsymbol{\mu}^{(m)}) = \sum_{k=1}^{K^{(m)}+1} w_k^{(m)}\, F_k\!\left( y \mid v_k^{(m)}, v_k^{(m)}/\mu_k^{(m)} \right), \qquad m = 1, 2, \ldots, M.$$

Also, the first moment distribution function is given by

$$F^{(1)}(y \mid \mathbf{v}^{(m)}, \mathbf{w}^{(m)}, \boldsymbol{\mu}^{(m)}) = \frac{1}{\mu^{(m)}} \sum_{k=1}^{K^{(m)}+1} w_k^{(m)} \mu_k^{(m)}\, F_k\!\left( y \mid v_k^{(m)} + 1, v_k^{(m)}/\mu_k^{(m)} \right), \qquad m = 1, 2, \ldots, M,$$

where $F_k\!\left( \cdot \mid v_k^{(m)} + 1, v_k^{(m)}/\mu_k^{(m)} \right)$ is the distribution function of a gamma density with parameters $v_k^{(m)} + 1$ and $v_k^{(m)}/\mu_k^{(m)}$, and

$$\mu^{(m)} = \sum_{k=1}^{K^{(m)}+1} w_k^{(m)} \mu_k^{(m)}.$$

However, obtaining the quantile function, required to assess all three forms of dominance, is not straightforward. For this purpose we used an algorithm proposed in the Technical Appendix to LGGC.

Let $C_X(u; \boldsymbol{\theta}_X)$ and $C_Y(u; \boldsymbol{\theta}_Y)$ be generic curves, representing any one of the three curves whose dominance properties are being considered. Also, let $C_X(u; \boldsymbol{\theta}_X^{(m)})$ and $C_Y(u; \boldsymbol{\theta}_Y^{(m)})$, $m = 1, 2, \ldots, M$, be their values at each of the $M$ MCMC draws, for a given population proportion $u$. Following LGGC, we estimate the posterior probability of $X$ dominating $Y$ as the proportion of draws for which $C_X(u; \boldsymbol{\theta}_X^{(m)}) \ge C_Y(u; \boldsymbol{\theta}_Y^{(m)})$ for all $u$. To express this proportion mathematically, let $u_i = 0.001\,i$ and let $I[\cdot]$ denote an indicator function equal to 1 if its argument is true and zero otherwise. Then we have

$$P(X \text{ dominates } Y) = \frac{1}{M} \sum_{m=1}^{M} \prod_{i=1}^{999} I\!\left[ C_X(u_i; \boldsymbol{\theta}_X^{(m)}) \ge C_Y(u_i; \boldsymbol{\theta}_Y^{(m)}) \right]$$

$$P(Y \text{ dominates } X) = \frac{1}{M} \sum_{m=1}^{M} \prod_{i=1}^{999} I\!\left[ C_Y(u_i; \boldsymbol{\theta}_Y^{(m)}) \ge C_X(u_i; \boldsymbol{\theta}_X^{(m)}) \right]$$

$$P(\text{neither distribution dominates}) = 1 - P(X \text{ dominates } Y) - P(Y \text{ dominates } X)$$
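
The three probability estimates above reduce to simple array operations once the curve values are stored per draw. The sketch below assumes hypothetical arrays `cx` and `cy` of shape `(M, n_grid)`, with row `m` holding a curve evaluated at the grid points for draw `m`; the tiny arrays are made up for illustration. The helper `averaged_probability` is one possible reading of how an estimate could be averaged over random reorderings of one set of draws.

```python
import numpy as np


def dominance_probabilities(cx, cy):
    """Proportion of draws in which X dominates, Y dominates, or neither."""
    x_dom = np.all(cx >= cy, axis=1)   # product of indicators over the grid
    y_dom = np.all(cy >= cx, axis=1)
    p_x = float(x_dom.mean())
    p_y = float(y_dom.mean())
    return p_x, p_y, 1.0 - p_x - p_y


def averaged_probability(cx, cy, n_perm, rng):
    """Average P(X dominates Y) over random reorderings of the Y draws."""
    probs = []
    for _ in range(n_perm):
        perm = rng.permutation(cy.shape[0])
        probs.append(np.all(cx >= cy[perm], axis=1).mean())
    return float(np.mean(probs))


# Four hypothetical draws on a three-point grid.
cx = np.array([[2.0, 2.0, 2.0],
               [2.0, 2.0, 2.0],
               [1.5, 0.5, 1.5],    # curves cross in this draw
               [0.0, 0.0, 0.0]])
cy = np.ones((4, 3))

p_x, p_y, p_neither = dominance_probabilities(cx, cy)
p_avg = averaged_probability(cx, cy, n_perm=10, rng=np.random.default_rng(0))
```

With weak inequalities in both directions, a draw in which the two curves coincide at every grid point would be counted twice; for continuous posteriors this has probability zero, so the sketch ignores it.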

Since distributions $X$ and $Y$ are estimated independently, the order of the draws $\boldsymbol{\theta}_X^{(m)}$ and $\boldsymbol{\theta}_Y^{(m)}$ in $C_X(u_i; \boldsymbol{\theta}_X^{(m)}) \ge C_Y(u_i; \boldsymbol{\theta}_Y^{(m)})$ is arbitrary. To check the results, the order of one of the vectors was randomized 1,000 times and for each randomization a dominance probability was calculated. There were no substantive changes in the probabilities. To give an indication of the variation in probabilities across different orderings, in Table 2 we report the smallest, largest and average values over the orderings for some selected pairwise comparisons: two cases each for small, intermediate and large probabilities. The average of estimates over the randomizations was taken as the final estimate of the posterior probabilities.

A by-product of the estimation procedure is a plot of the curve

$$P_{X \ge Y}(u) = \frac{1}{M} \sum_{m=1}^{M} I\!\left[ C_X(u; \boldsymbol{\theta}_X^{(m)}) \ge C_Y(u; \boldsymbol{\theta}_Y^{(m)}) \right]$$

against the value of $u$. Called probability curves by LGGC, these curves give the probability of “dominance” at a given population proportion $u$. The probability of dominance over any range of $u$ will be no greater than the minimum value of $P_{X \ge Y}(u)$ within that range. This characteristic makes $P_{X \ge Y}(u)$ a valuable device for finding the population proportions which have the greatest impact on the probability of dominance. If a dominance probability is largely determined by behaviour in the tails of the distribution, we can examine the sensitivity of the probability to omission of extreme values of $u$. Also, if we are concerned with a particular segment of the population, say the poor, we can see how the probability of dominance changes if only the poor are considered.
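
The relation between the probability curve and the overall dominance probability can be seen in a small constructed example. The arrays below are hypothetical (four draws, three grid points), and the function names and the column-slice arguments `lo` and `hi` are assumptions of this sketch.

```python
import numpy as np


def probability_curve(cx, cy):
    """Pointwise proportion of draws with C_X(u_i) >= C_Y(u_i)."""
    return (cx >= cy).mean(axis=0)


def dominance_probability(cx, cy, lo=0, hi=None):
    """Proportion of draws with C_X >= C_Y at every grid column in lo:hi."""
    return float(np.all(cx[:, lo:hi] >= cy[:, lo:hi], axis=1).mean())


# Draw 1 fails only at the first grid point, draw 2 only at the last.
cx = np.array([[-1.0, 1.0, 1.0],
               [1.0, 1.0, -1.0],
               [1.0, 1.0, 1.0],
               [1.0, 1.0, 1.0]])
cy = np.zeros((4, 3))

curve = probability_curve(cx, cy)                       # 0.75, 1.0, 0.75
p_full = dominance_probability(cx, cy)                  # both tails included
p_no_right = dominance_probability(cx, cy, lo=0, hi=2)  # drop the last point
p_middle = dominance_probability(cx, cy, lo=1, hi=2)    # drop both tails
```

Here the curve never falls below 0.75, yet the full-range probability is 0.5 because different draws fail at different grid points. Dropping one tail raises the probability to the curve's minimum over the remaining range, and dropping both tails raises it to one.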

5. Data

Data were extracted from waves of the HILDA survey corresponding to years 2001, 2006, 2010, 2014 and 2017. To obtain an income variable we subtracted “total disposable income negative per household” from “total disposable income positive per household”, and then divided by the square root of the number of individuals in the household. This calculation gave us an equivalised income which was assigned to each member of the household. Values were deflated using the Consumer Price Index, treating 2000/2001 as the base. The observational units were all individuals aged 15 and above; those aged less than 15 were omitted because of the unavailability of sampling weights for these individuals. Thus, the total number of children was used in the calculation of equivalised income, but children less than 15 were not included in the samples. This approach follows that adopted by Sila and Dugain (2019). Incomes that were non-positive were omitted from the samples: 0.55% in 2001, 0.39% in 2006, 0.35% in 2010, 0.24% in 2014 and 0.30% in 2017. For examining distributions for metropolitan versus non-metropolitan areas, major urban or major city was classified as metropolitan. All other areas were classified as non-metropolitan. Summary statistics for the income series are presented in Table 3. Sampling weights provided with the HILDA data were used for calculating the means, standard deviations and Gini coefficients. Mean income is increasing throughout the period. Inequality as measured by the Gini coefficient is highest in 2006. However, the standard deviation is highest in 2017.
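
The income construction and the weighted summary statistics described above can be sketched as follows. The function names, the `cpi_ratio` argument (price level relative to the base period) and the example numbers are assumptions for illustration; they are not HILDA variable names or values, and the weighted Gini shown is one standard trapezoid-rule formula rather than necessarily the one used for Table 3.

```python
import numpy as np


def equivalised_income(pos, neg, hh_size, cpi_ratio=1.0):
    """(Positive minus negative disposable income) / sqrt(household size), deflated."""
    net = np.asarray(pos, dtype=float) - np.asarray(neg, dtype=float)
    return net / np.sqrt(hh_size) / cpi_ratio


def weighted_gini(x, w):
    """Gini coefficient from a weighted Lorenz curve using the trapezoid rule."""
    order = np.argsort(x)
    x = np.asarray(x, dtype=float)[order]
    w = np.asarray(w, dtype=float)[order]
    p = w / w.sum()                                        # population shares
    share = np.concatenate(([0.0], np.cumsum(p * x) / np.sum(p * x)))
    area = np.sum(p * (share[1:] + share[:-1]) / 2.0)      # area under Lorenz curve
    return 1.0 - 2.0 * area


# Hypothetical households: incomes, losses, sizes, and sampling weights.
income = equivalised_income(pos=[40000.0, 9000.0, 500.0],
                            neg=[0.0, 0.0, 800.0],
                            hh_size=[4, 1, 2])
weights = np.array([1.0, 1.0, 1.0])

keep = income > 0                     # non-positive incomes are omitted
income, weights = income[keep], weights[keep]

mean_income = np.average(income, weights=weights)
gini = weighted_gini(income, weights)
```

In the full data each household's equivalised income would be assigned to every member aged 15 and above, with that person's sampling weight.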

6. Results

Estimates of the income densities for each year are plotted in Figure 1. The units are hundreds of 2001 dollars, equivalized according to household size. The densities are all bimodal with a sharp peak between $10,000 - $20,000 and a lesser peak in the range $20,000 - $40,000. Prior to 2010 there have been clear shifts to the right, but the distributions for 2010, 2014 and 2017 are similar, particularly those in 2014 and 2017. The posterior means and standard deviations for mean income and the Gini coefficient are reported in Table 4. It is reassuring that these values are close to the values from the raw data, reported in Table 3. The mean income estimates are in line with our observations about the density functions: relatively large increases from 2001 to 2006 and from 2006 to 2010, and relatively small increases thereafter. The posterior densities for mean incomes and the Gini coefficients are plotted in Figures 2 and 3, respectively. The relative closeness of the mean incomes for the last three periods compared to those for the earlier two periods is also reflected in the posteriors in Figure 2. The 2001 posterior for the Gini coefficient suggests a lower level of inequality in that year compared to the others. There is considerable overlap in the densities in the other years. A more complete picture of whether or not there has been a welfare improvement is obtained by comparing distribution functions and (generalized) Lorenz curves, and computing their dominance probabilities. The distribution functions plotted in Figure 4 suggest that distributions in 2010, 2014 and 2017 all FSD the 2001 and 2006 distributions, with no clear ranking between 2010, 2014 and 2017. The dominance probabilities in Table 5 confirm the likely dominance of 2001 by all other years – the probabilities are all greater than 0.91. However, in all other pairwise comparisons, the probabilities for no dominance are all greater than 0.79. 
From examining Figure 4, this result is expected for comparisons between 2010, 2014 and 2017, but it is surprising that there was not more evidence of dominance of 2006 by these three years. Examining the probability curves, we discover that the low probabilities for dominance of 2006 by the later years can be attributed to behaviour in the tails of the distributions. As an example, in Figure 5, we plot the probability curve for 2017 FSD 2006. The low probability of dominance of 0.0051 comes from the left tails of the distributions. The upper tails also reduce the probability, although not to the same extent. If the tails are ignored by changing the range over which dominance is considered from $0.001 \le u \le 0.999$ to a narrower range [bounds lost in extraction], the dominance probability becomes one. This outcome reinforces the argument made by Davidson and Duclos (2013) that, because all curves converge at the end points, it is inevitable that dominance can only be established over a restricted range.

For generalized Lorenz dominance (GLD), we compare the generalized Lorenz curves plotted in Figure 6. Visually, we expect to find a relationship similar to that expected from Figure 4, namely, that 2001 is dominated by 2006 which in turn is dominated by 2010, 2014 and 2017, with very little difference between the last three years. Noting that the dominance probabilities for GLD will always be at least as great as those for FSD, in Table 5 we observe that the probabilities for dominance over 2001 are always greater than 0.96. It is again true that 2006 is not dominated by any of the later years; the probabilities of no dominance are all greater than 0.97. Checking the probability curves, we again find the issue is in the tails, but not both tails as was the case for FSD. In this case the left tails are the source of no dominance; an example is given in Figure 7 where the probability curve for 2017 GLD 2006 is plotted. Most of the population are better off in the later year, but the poor people are not.
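
The statement that GLD probabilities are at least as great as FSD probabilities follows draw by draw: a generalized Lorenz curve cumulates the quantile function, so a quantile difference that is non-negative everywhere has non-negative cumulative sums. The sketch below uses a hypothetical array `d` of quantile-function differences (rows are draws, columns are grid points); the grid spacing is omitted because a positive scale factor does not change signs. It also shows what can happen when the bottom of the grid is dropped, with and without restarting the cumulation at the new lower bound.

```python
import numpy as np

# Hypothetical differences F_X^{-1}(u_i) - F_Y^{-1}(u_i) for four draws.
d = np.array([[0.1, 0.2, 0.1, 0.3],     # X above Y everywhere
              [-0.5, 0.1, 0.1, 0.1],    # X far below Y at the bottom only
              [-0.3, 0.1, 0.1, 0.2],    # X below Y at the bottom only
              [0.2, -0.1, 0.2, 0.1]])   # curves cross above the bottom

gl_diff = np.cumsum(d, axis=1)          # generalized Lorenz difference (up to scale)

# Full range: FSD in a draw implies GLD in that draw.
fsd_full = np.all(d >= 0, axis=1)
gld_full = np.all(gl_diff >= 0, axis=1)
p_fsd, p_gld = fsd_full.mean(), gld_full.mean()

# Restricted range (drop the first grid point) using the ORIGINAL curves.
lo = 1
fsd_r = np.all(d[:, lo:] >= 0, axis=1)
gld_r = np.all(gl_diff[:, lo:] >= 0, axis=1)
p_fsd_r, p_gld_r = fsd_r.mean(), gld_r.mean()

# Restricted range with cumulation restarted at the lower bound
# (a stand-in for truncating and renormalizing the distributions).
gld_trunc = np.all(np.cumsum(d[:, lo:], axis=1) >= 0, axis=1)
p_gld_trunc = gld_trunc.mean()
```

In this example the full-range GLD probability exceeds the FSD probability, the restricted GLD probability computed from the original curves falls below the restricted FSD probability because losses at the bottom are still carried in the cumulative sums, and restarting the cumulation restores the ordering.
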
Restricting the range to u ≤ ≤ yields a dominance probability of 0.8918, a value less than the FSD probability of one. When a bottom segment of the population is ignored, it is no longer true that a GLD probability will necessarily be at least as great as the corresponding FSD probability unless the distribution is truncated at the lower bound and normalized accordingly.

Another point worth noting from Table 5 that is not evident from Figure 6 is that the probabilities of dominance for pairwise comparisons of the later three years are no longer negligible. No dominance is still the most likely outcome, but with $\Pr(2014 \geq_{GLD} 2010) = 0.25$ and $\Pr(2017 \geq_{GLD} 2014) = 0.21$, dominance is still a possibility. A plot of the probability curve for 2017 GLD 2014 is given in Figure 8. It reveals a relatively high probability over most population proportions. What is perhaps surprising is the overall dominance probability of 0.21, when the minimum value of the probability curve is greater than 0.4. It can be explained by the nature of the curve. It is not monotonic. It increases, decreases, increases, decreases and then increases again. If each local minimum introduces a new set of MCMC draws that do not satisfy the required inequality, then the probability of dominance can fall well below the global minimum of the curve. Restricting the range to u ≤ ≤ led to a dominance probability of 0.73, a value close to the global minimum within this range.

The Lorenz curves plotted in Figure 9, and the posterior densities for the Gini coefficient in Figure 3 suggest that inequality is less in 2001 than it is in the other years, and that it is difficult to separate the other three years. However, from Table 6 we find that no dominance is the most likely outcome for all pairwise comparisons. The probabilities of no dominance are all greater than 0.87. To illustrate the behaviour, in Figure 10 we plot the probability curves for 2001 LD 2017 and 2017 LD 2001. Because these curves are mirror images, we can say that 2001 fails to dominate 2017 because of behaviour in the left tails of the Lorenz curves, or, 2017 fails to Lorenz dominate 2001 because of behaviour in the right tails of the Lorenz curves.

It is difficult to compare our results with those of VLA. They use ABS, not HILDA data, and their dominance assessments are for different years. Also, they consider FSD and GLD, but not LD. And they consider two expenditure series as well as two income series. However, we make some attempt to compare results for their income series with our results for three cases where their years are closest to being comparable to ours.
The first case is their 2009/10 versus 1998/99, compared to our 2010 versus 2001. With FSD and GLD probabilities of 0.94 and 0.98, respectively, we find that 2010 dominates 2001. For both of their income series, VLA conclude that 1998/99 GLD 2009/10, an outcome that conflicts with the posterior probability. VLA do not provide results for FSD because both null hypotheses of dominance, one for each direction, were not rejected. In a second comparison (2009/10 versus 2003/04 for VLA and 2010 versus 2006 for us), VLA obtain similar results to their 2009/10 versus 1998/99 comparison. The earlier year dominates with respect to GLD and no results are provided for FSD. We find that neither year dominates with posterior probabilities for no dominance of 0.9995 and 0.9715 for FSD and GLD, respectively. At first glance, these contrasting results appear to be contradictory. However, when we recognize that non-rejection of a null hypothesis of dominance does not necessarily imply dominance, the two outcomes are no longer incompatible. VLA’s conclusions about dominance do not preclude overlapping distributions. If we exclude tails of the distributions we obtain results in direct contradiction of those from VLA. The later year dominates the earlier year. The probability curves in Figure 11 suggest that 2010 would FSD 2006 if only the middle 90% of the population was considered; considering only the upper 90% of the population would lead to 2010 GLD 2006. The third comparison is VLA’s 2003/04 versus 1998/99 with our 2006 versus 2001. In this case our results are consistent with those of VLA. They report FSD and GLD for 2003/04 over 1998/99 for both income series. Our posterior probabilities are 0.994 for 2006 FSD 2001 and 0.996 for 2006 GLD 2001. In Table 8 we compare the Gini coefficients from our income series with those of two other studies: Wilkins et al. (2019) and Sila and Dugain (2019). 
Our coefficients suggest lower levels of inequality compared to those of Sila and Dugain and higher levels relative to those in Wilkins et al.

For Sila and Dugain the differences are likely attributable to the treatment of negative and zero incomes. Wilkins et al. use a more sophisticated method of calculating equivalised income.
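To illustrate why the treatment of zero incomes matters for a Gini coefficient, here is a small sketch. It does not reproduce the method of either study or of this paper. The function `weighted_gini` and the toy incomes are hypothetical, and the formula used is the standard one based on the weighted mean absolute difference.

```python
import numpy as np


def weighted_gini(income, weight=None):
    """Gini coefficient as half the weighted mean absolute difference over the mean."""
    income = np.asarray(income, dtype=float)
    if weight is None:
        weight = np.ones_like(income)
    weight = np.asarray(weight, dtype=float)
    total_weight = np.sum(weight)
    mean = np.sum(weight * income) / total_weight
    # Absolute differences between every ordered pair of incomes.
    diff = np.abs(income[:, None] - income[None, :])
    mean_abs_diff = np.sum(weight[:, None] * weight[None, :] * diff) / total_weight ** 2
    return mean_abs_diff / (2.0 * mean)


# Toy incomes only; these are not HILDA values.
gini_positive_only = weighted_gini([1, 2, 3, 4])
gini_with_zero = weighted_gini([0, 1, 2, 3, 4])
```

With these toy values the coefficient rises from 0.25 to 0.40 when a single zero income is retained, so two studies using the same survey can report noticeably different coefficients depending on how such observations are handled.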

In Tables 9, 10 and 11 we report dominance probabilities for the poorest 10% of the population. For FSD (Table 9) and GLD (Table 10) the results are generally in line with those for the whole population in the sense that 2001 is dominated by all subsequent years and, for all other pairwise comparisons, no dominance is the most likely outcome. There are slight increases in the probability of dominance, as one would expect, since the number of draws satisfying dominance in a restricted range must be at least as great as the number over the complete range. Lorenz dominance in this context implies that all population proportions up to 0.1 are getting proportionately more income. It is not a comparison of inequality within the poorest 10% of the population. Whereas there was no evidence of dominance when the complete Lorenz curve was considered, in this sense there are some relatively large dominance probabilities: $\Pr(2006 \geq_{LD} 2001) = 0.76$, $\Pr(2014 \geq_{LD} 2001) = 0.67$, $\Pr(2006 \geq_{LD} 2014) = 0.46$ and $\Pr(2006 \geq_{LD} 2017) = 0.71$.

Dominance criteria can also be used to track welfare over time for subgroups of the population and to compare subgroups at a particular point in time. To illustrate, we have chosen metropolitan and non-metropolitan subgroups. The country versus city divide often receives attention in the media, particularly from disgruntled rural politicians. Mixtures of gamma densities were estimated for both subgroups for the years 2001, 2006, 2010, 2014 and 2017. Table 12 contains the sample means, standard deviations and Gini coefficients for each subgroup and year, as well as the mean incomes and Gini coefficients estimated from the mixture of gamma densities. Examining the mean incomes suggests metropolitan incomes are substantially above non-metropolitan incomes, and that changes in incomes over time are consistent with those for the complete sample. Changes from 2001 to 2006 and from 2006 to 2010 are large relative to changes from 2010 to 2014 and from 2014 to 2017. The Gini coefficients suggest inequality increased for both groups from 2001 to 2006, but has been relatively constant thereafter. There is little difference between the metropolitan and non-metropolitan Gini coefficients in most years; 2017 is an exception with metropolitan inequality being greater in this year.

When dominance criteria are applied to assess welfare changes over time for each subgroup, the conclusions reached for the metropolitan subgroup are similar to those reached for the overall population. However, for the non-metropolitan subgroup there are a few differences which are highlighted in Table 13. For FSD and GLD of 2014 over 2001 and 2010 over 2001 the non-metropolitan dominance probabilities are much less than those for the metropolitan subgroup.
Dominance probabilities for comparing the metropolitan and non-metropolitan subgroups in each year are displayed in Table 14. The probabilities for non-metropolitan dominance (FSD or GLD) are zero in every year. The probabilities for metropolitan dominance range from 0.1545 in 2017 to 0.6062 in 2010 for FSD, and from 0.2882 in 2017 to 0.9923 in 2014 for GLD. It is interesting that these probabilities are the highest in 2010 and 2014, the years in which it was more difficult to establish that the non-metropolitan distributions dominated the 2001 non-metropolitan distribution. We also note that, if the tails of the distributions are ignored, then the probabilities for metropolitan being dominant, for both FSD and GLD, are close to 1 in every year. We illustrate this fact by plotting the 2017 FSD and GLD probability curves in Figure 12. When the ranges for these two curves are restricted to u ≤ ≤ the probabilities increase from 0.1545 to 0.9968 for FSD and from 0.2882 to 0.9845 for GLD.

The Lorenz dominance probabilities in Table 14 show little evidence of dominance by either subgroup. The probabilities for no dominance are all greater than 0.9. The probability curves for non-metropolitan Lorenz dominating metropolitan for all years are plotted in Figure 13. Their shapes are quite different. Most have a relatively high maximum and a relatively low minimum, suggesting the Lorenz curves cross. That for 2017 suggests the probability of metropolitan being dominant would be relatively high if the tails were ignored, particularly the right tails. This last fact is confirmed by the LD probabilities in Table 15 where dominance probabilities for the poorest 10% are recorded. Here there are three years where the probabilities of LD by the non-metropolitan subgroup are more substantial: 0.3578 in 2001, 0.5739 in 2006 and 0.5323 in 2017.
An examination of the tails of the probability curves in Figure 13 reveals why this increase in probabilities has occurred and why there have not been similar increases in 2010 and 2014. Table 15 also contains dominance probabilities for FSD and GLD for the poorest 10% of the metropolitan and non-metropolitan subgroups. Comparing Tables 14 and 15, we note that the absence of the right tails has led to increases in every year for the probabilities of metropolitan being FSD over non-metropolitan. However, no dominance is still the most likely outcome in 2001, 2006 and 2017, and FSD continues to be the most likely outcome in 2010 and 2014. For GLD the probabilities for metropolitan dominance remain the same when only the poorest 10% are considered. The identical values can be explained by the probability curves where the minimum value is within the region u ≤ ≤ as illustrated for 2017 in Figure 12. There are small but non-zero values for non-metropolitan dominance in 2001, 2006 and 2017. The increases from zero, obtained when the whole range of u was considered in Table 14, occur because their probability curves, like that in Figure 12, have not quite reached their maximum value at u =

7. Concluding Remarks

We have used an innovative approach to assess first and second order stochastic dominance orderings and Lorenz ordering of the Australian household income distribution at different points in time, as well as a comparison of metropolitan and non-metropolitan household income distributions. Our innovations include expressing empirical information about dominance in terms of posterior probabilities, and using Bayesian estimation of an infinite mixture of gamma densities to provide flexible fits to the income distributions. Using either FSD or GLD as a metric, we find strong evidence that welfare in 2001 is less than in the subsequent years: 2006, 2010, 2014 and 2017. However, pairwise comparisons of the later four years show no evidence of welfare differences unless the range of population proportions is restricted away from the tails of the distributions. With such a restriction, we can conclude that 2006 is dominated by the later three years, but an ordering of 2010, 2014 and 2017 is still not possible. There are some apparent contradictions between these conclusions and those of VLA for roughly equivalent years, but the contradictions often disappear when it is recognized that non-rejection of a dominance null hypothesis does not establish dominance. Making the same comparisons for the poorest 10% of the population leads to similar conclusions. The poor are worse off in 2001 than in the later years, but it is hard to discriminate between the later years. If we are interested only in inequality and consider Lorenz dominance instead of FSD or GLD, examining the complete Lorenz curves reveals no evidence of any ordering over the four years, despite strong posterior evidence that the Gini coefficient is smaller in 2001. When a comparison is restricted to the poorest 10%, we find some evidence that their income shares were better in 2006 and 2014 than they were in 2001, but 2006 was a better year than both 2014 and 2017. 
Comparing the metropolitan and non-metropolitan subgroups in each year suggests the metropolitan subgroup is better off in terms of both FSD and GLD. However, to reach this conclusion with high posterior probabilities for all years, the tails had to be excluded, with population proportions restricted to u ≤ ≤. For Lorenz dominance there is some evidence, but not strong evidence, that the non-metropolitan group was better off in 2001, 2006 and 2017.

The potential for applying the tools used in this paper is enormous. We have only scratched the surface. Possible extensions include comparing before and after tax income distributions, or the resulting distributions from other government interventions, and the development of multivariate distributions for comparing multivariate welfare functions.

References

Barrett, G.F., and S.G. Donald (2003). Consistent Tests for Stochastic Dominance. Econometrica, 71–104.

Borland, J. and M. Coelli (2016). Labour Market Inequality in Australia. Economic Record 92, 517-547.

Chakravarty, S.R. (2009). Inequality, Polarization and Poverty: Advances in Distributional Analysis. New York: Springer.

Chatterjee, A., A. Singh, and T. Stone (2016). Understanding Wage Inequality in Australia. Economic Record 92, 348-360.

Coelli, M. and J. Borland (2016). Job Polarisation and Earnings Inequality in Australia. Economic Record 92, 1-27.

Davidson, R., and J.Y. Duclos (2000). Statistical Inference for Stochastic Dominance and for the Measurement of Poverty and Inequality. Econometrica, 1435-64.

Davidson, R., and J.Y. Duclos (2013). Testing for Restricted Stochastic Dominance. Econometric Reviews, 84-125.

Dong, Q., M. R. Elliott, and T. E. Raghunathan (2014). A nonparametric method to generate synthetic populations to adjust for complex sampling design features. Survey Methodology 40, 29-46.

Escobar, M. D., and M. West (1995). Bayesian Density Estimation and Inference using Mixtures. Journal of the American Statistical Association 90, 577-588.

Gelman, A., J.B. Carlin, H.S. Stern, D.B. Dunson, A. Vehtari and D.B. Rubin. Bayesian Data Analysis, third edition. Boca Raton: CRC Press.

Gunawan, D., A. Panagiotelis, W. E. Griffiths and D. Chotikapanich (2020). Bayesian Weighted Inference from Surveys. Australian and New Zealand Journal of Statistics, in press.

Hasegawa, H. and H. Kozumi (2003). Estimation of Lorenz Curves: A Bayesian Nonparametric Approach. Journal of Econometrics.

The Distribution and Redistribution of Income: A Mathematical Analysis, third edition. Manchester: Manchester University Press.

Lander, D., D. Gunawan, W. E. Griffiths, and D. Chotikapanich (2020). Bayesian Assessment of Lorenz and Stochastic Dominance. Canadian Journal of Economics, in press.

Le Breton, M. and E. Peluso (2009). Third-Degree Stochastic Dominance and Inequality Measurement. Journal of Economic Inequality 7, 249-68.

Maasoumi, E. (1997). Empirical Analyses of Inequality and Welfare. In M.H. Pesaran and P. Schmidt (Eds.), Handbook of Applied Econometrics: Volume II Microeconomics. Malden, MA: Blackwell.

Sila, U. and V. Dugain (2019). Income, Wealth and Earnings Inequality in Australia: Evidence from the HILDA Survey. OECD Economics Department Working Paper No. 1538, Paris.

Valenzuela, M. R., H. H. Lean, and G. Athanasopoulos (2014). Economic Inequality in Australia between 1983 and 2010: A Stochastic Dominance Analysis. Economic Record 90, 49-62.

Walker, S. G. (2007). Sampling the Dirichlet Mixture Model with Slices. Communications in Statistics-Simulation and Computation 36, 45-54.

Watson, N., and M. Wooden (2012). The HILDA Survey: A Case Study in the Design and Development of a Successful Household Panel Study. Longitudinal and Life Course Studies 3, 369-381.

Wilkins, R. (2013). Evaluating the Evidence on Income Inequality in Australia in the 2000s. Melbourne Institute Working Paper Series, Working Paper no.

Appendix The MCMC Algorithm
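As orientation for Step 5 of the algorithm described in this appendix, the following sketch shows one way a Metropolis update with a gamma proposal can be coded for a single shape parameter. It is a reading of the model, not the authors' implementation. The log target assumes gamma components parameterized by shape $v_k$ and mean $\mu_k$, with an exponential prior with parameter $\lambda$ on $v_k$. The function names, the toy data, and the values `lam = 1.0` and `r = 50.0` are hypothetical.

```python
import math
import random


def log_target(v, n, S, log_P, mu, lam):
    """Log conditional density of one shape parameter, up to an additive constant."""
    return (n * v * math.log(v) - n * math.lgamma(v)
            - v * (lam + S / mu + n * math.log(mu) - log_P))


def log_gamma_pdf(x, shape, rate):
    """Log density of a gamma distribution with the given shape and rate."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def metropolis_shape_update(v_prev, n, S, log_P, mu, lam, r, rng):
    """One Metropolis step using a gamma proposal with shape r and mean v_prev."""
    v_new = rng.gammavariate(r, v_prev / r)  # arguments are (shape, scale)
    log_ratio = (log_target(v_new, n, S, log_P, mu, lam)
                 - log_target(v_prev, n, S, log_P, mu, lam)
                 + log_gamma_pdf(v_prev, r, r / v_new)   # reverse proposal density
                 - log_gamma_pdf(v_new, r, r / v_prev))  # forward proposal density
    accept = rng.random() < math.exp(min(0.0, log_ratio))
    return v_new if accept else v_prev


# Hypothetical observations allocated to one component (not HILDA incomes).
y = [310.0, 450.0, 280.0, 520.0, 390.0, 610.0, 340.0, 470.0]
n, S, log_P = len(y), sum(y), sum(math.log(yi) for yi in y)
mu = S / n            # component mean held fixed for this illustration
lam, r = 1.0, 50.0    # assumed prior parameter and proposal tuning constant

rng = random.Random(0)
v, draws = 1.0, []
for _ in range(2000):
    v = metropolis_shape_update(v, n, S, log_P, mu, lam, r, rng)
    draws.append(v)
```

A larger `r` concentrates the proposal around the previous draw, which raises the acceptance rate at the cost of smaller moves. In the full algorithm this update would be applied to every component $k = 1, 2, \ldots, K$ within each sweep.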

Corresponding to the observed incomes $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ are sample weights $\boldsymbol{\tau} = (\tau_1, \tau_2, \ldots, \tau_n)$. The MCMC algorithm used to draw observations from the joint posterior density for $(\tilde{\mathbf{u}}, \mathbf{s}, \mathbf{w}, \mathbf{v}, \boldsymbol{\mu}, \alpha)$ is applied to a series of pseudo-representative samples drawn from $\mathbf{y}$ using the Bayesian bootstrap procedure described in Dong et al. (2014) and Gunawan et al. (2020). The general steps are:

1. Generate a pseudo-representative sample $\tilde{\mathbf{y}} = (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_n)$.

2. Sample $\tilde{\mathbf{u}} \mid \tilde{\mathbf{y}}, \mathbf{s}, \mathbf{w}, \mathbf{v}, \boldsymbol{\mu}, \alpha$

3. Sample $\mathbf{w} \mid \mathbf{v}, \mathbf{s}, \tilde{\mathbf{y}}, \boldsymbol{\mu}, \alpha$ and update $K$

4. Sample $\boldsymbol{\mu} \mid \mathbf{v}, \mathbf{w}, \tilde{\mathbf{u}}, \mathbf{s}, \tilde{\mathbf{y}}, \alpha$

5. Sample $\mathbf{v} \mid \boldsymbol{\mu}, \mathbf{w}, \tilde{\mathbf{u}}, \mathbf{s}, \tilde{\mathbf{y}}, \alpha$

6. Sample $\mathbf{s} \mid \tilde{\mathbf{y}}, \mathbf{w}, \mathbf{v}, \tilde{\mathbf{u}}, \boldsymbol{\mu}, \alpha$

7. Sample $\alpha \mid \mathbf{v}, \mathbf{s}, \boldsymbol{\mu}$

8. Repeat steps 2 to 7, $M$ times.

9. Repeat steps 1 to 8, $J$ times.

Details of each step follow.

Step 1 Let $N$ = the size of the population. We set $N = \ldots$ The results are not sensitive to this setting as long as $N$ is much bigger than $n$. The sample $(y_1, y_2, \ldots, y_n)$ is augmented with $N - n$ values $(y_1^*, y_2^*, \ldots, y_{N-n}^*)$ to form a pseudo population. A random sample of size $n$, $(\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_n)$, is then drawn from this pseudo population. Let $N^* = (N - n)/n$. Normalize the sampling weights such that $\sum_{i=1}^{n} \tau_i = N$. Let $\ell_i = 0$, $i = 1, 2, \ldots, n$. For $k = 1, 2, \ldots, N - n$, draw $y_k^* = y_i$ with probability

$$\frac{\tau_i - 1 + \ell_i N^*}{N - n + (k - 1) N^*}.$$

Each time a $y_i$ is used the corresponding $\ell_i$ is incremented by 1, such that $\sum_{i=1}^{n} \ell_i = k - 1$.

Step 2 Sample $\tilde{u}_i$ from a uniform density on the interval $(0, w_{s_i})$ for $i = 1, 2, \ldots, n$.

Step 3 Sample $\mathbf{w} = (w_1, w_2, \ldots, w_K, w_{K+1})$ from a Dirichlet distribution $D(n_1, n_2, \ldots, n_K, \alpha)$ with $n_k$ being the number of observations for which $s_i = k$. Then, if $1 - \sum_{k=1}^{K} w_k > \min(\tilde{u}_1, \tilde{u}_2, \ldots, \tilde{u}_n)$, repeat the following steps.

(i) Increment $K$ to $K + 1$; sample new values $\mu_{K+1}$ from its base distribution $IG(v, s)$, and $v_{K+1}$ from its base distribution, exponential with parameter $\lambda$.

(ii) The old residual weight $w_{K+1}$ is broken into two pieces. Draw $\eta \sim \text{Beta}(1, \alpha)$. Set $w_{K+2} = (1 - \eta)\, w_{K+1}$ and change $w_{K+1}$ to $\eta\, w_{K+1}$.

Step 4 Given $K$, draw values $\mu_k$, for $k = 1, 2, \ldots, K$, from $IG(v + n_k v_k,\ s + S_k v_k)$ where $S_k = \sum_{i: s_i = k} \tilde{y}_i$.

Step 5 Given $K$, draw values $v_k$, for $k = 1, 2, \ldots, K$, from

$$p(v_k \mid \boldsymbol{\mu}, \tilde{\mathbf{y}}, \mathbf{s}, \mathbf{w}) \propto \frac{v_k^{n_k v_k}}{\Gamma(v_k)^{n_k}} \exp\left\{ -v_k \left( \lambda + \frac{S_k}{\mu_k} + n_k \log \mu_k - \log P_k \right) \right\}$$

with $P_k = \prod_{i: s_i = k} \tilde{y}_i$. This density is not a recognizable form and requires a Metropolis step. A candidate $\tilde{v}_k$ is drawn from a gamma density $G(r, r / v_k)$ with mean equal to the previous draw $v_k$ and is accepted with the probability

$$\min\left\{ 1,\ \frac{p(\tilde{v}_k \mid \boldsymbol{\mu}, \tilde{\mathbf{y}}, \mathbf{s}, \mathbf{w})\, p(\tilde{v}_k, v_k)}{p(v_k \mid \boldsymbol{\mu}, \tilde{\mathbf{y}}, \mathbf{s}, \mathbf{w})\, p(v_k, \tilde{v}_k)} \right\},$$

where $p(v_k, \tilde{v}_k)$ is the gamma density used to generate $\tilde{v}_k$.

Step 6 For each $s_i$, draw a value for $k$ from $\{1, 2, \ldots, K\}$ with probability

$$\Pr(s_i = k \mid \mathbf{v}, \mathbf{w}, \tilde{\mathbf{u}}, \tilde{\mathbf{y}}, \boldsymbol{\mu}, K) \propto I(\tilde{u}_i < w_k)\, G(\tilde{y}_i \mid v_k, v_k / \mu_k)$$

for $i = 1, 2, \ldots, n$.

Step 7 Draw $\alpha$ using the following steps.

(i) Draw $x \mid \alpha$ from a Beta distribution with parameters determined by $\alpha$ and $n$.

(ii) Draw $\alpha \mid x$ from a gamma distribution with parameters determined by $K$, $\beta$ and $\log(x)$.

TABLE Possible Sampling Theory Test Outcomes

| Hypotheses | Test outcome | VLA conclusion | Proper conclusion |
|---|---|---|---|
| H: X dominates Y; H: Y dominates X | Fail to reject; Reject | X dominates Y | X dominates Y or neither is dominant |
| H: X dominates Y; H: Y dominates X | Reject; Fail to reject | Y dominates X | Y dominates X or neither is dominant |
| H: X dominates Y; H: Y dominates X | Reject; Reject | Neither distribution is dominant | Neither distribution is dominant |
| H: X dominates Y; H: Y dominates X | Fail to reject; Fail to reject | Insignificant results | Curves close together relative to standard deviation of difference |

TABLE Examples of Variation in Probabilities from Reordering Draws

Columns: Probability, Minimum Bound, Average, Maximum Bound.

Rows: $\Pr(2014 \geq_{GLD} 2017)$, $\Pr(2017 \geq_{GLD} 2014)$, $\Pr(2017 \geq_{GLD} 2010)$, $\Pr(2014 \geq_{FSD} 2010)$, $\Pr(2017 \geq_{FSD} 2001)$, $\Pr(2006 \geq_{FSD} 2010)$.

Notes: The minimum and maximum bounds are the smallest and largest probability estimates, respectively, from 1,000 random reorderings of the MCMC draws. The averages of the 1,000 estimates are reported in the “Average” column and in later tables.

TABLE Summary Statistics for Incomes

Notes: Equivalised income is calculated as [disposable income positive per household (ahifditp) − disposable income negative per household (ahifditn)] ÷ persons in household (ahhpers), and then deflated with the Consumer Price Index, with 2000/2001 base. The calculations for the mean, standard deviation and Gini coefficient have been weighted using the weight series ahhwtrps.

TABLE Posterior Means (Standard Deviations) for Mean Income and the Gini Coefficient

Means: (2.39) (4.10) (3.87) (3.79) (4.19)

Gini: (0.0035) (0.0055) (0.0047) (0.0044) (0.0048)

Notes: The values are those from the gamma mixture model. The means are in units of hundreds of equivalized 2001 dollars.

TABLE First Order Stochastic Dominance Probabilities

Columns, repeated for each pair A and B: $\Pr(A \geq_{FSD} B)$, $\Pr(B \geq_{FSD} A)$, Pr(no dominance).

TABLE Generalized Lorenz Dominance Probabilities

Columns, repeated for each pair A and B: $\Pr(A \geq_{GLD} B)$, $\Pr(B \geq_{GLD} A)$, Pr(no dominance).

TABLE Lorenz Dominance Probabilities

Columns, repeated for each pair A and B: $\Pr(A \geq_{LD} B)$, $\Pr(B \geq_{LD} A)$, Pr(no dominance).

TABLE Gini Coefficients

Wilkins et al. (2019): 0.303 0.297 0.301 0.301 0.302

Sila and Dugain (2019): 0.321 0.348 0.340 0.342 -

TABLE First Order Stochastic Dominance Probabilities for the Poorest 10% of the Population

Columns, repeated for each pair A and B: $\Pr(A \geq_{FSD} B)$, $\Pr(B \geq_{FSD} A)$, Pr(no dominance).

TABLE Generalized Lorenz Dominance Probabilities for the Poorest 10% of the Population

Columns, repeated for each pair A and B: $\Pr(A \geq_{GLD} B)$, $\Pr(B \geq_{GLD} A)$, Pr(no dominance).

TABLE Lorenz Dominance Probabilities for the Poorest 10% of the Population

Columns, repeated for each pair A and B: $\Pr(A \geq_{LD} B)$, $\Pr(B \geq_{LD} A)$, Pr(no dominance).

TABLE Summary Statistics for Metropolitan and Non-metropolitan Incomes

TABLE Selected Dominance Probabilities for Metropolitan and Non-metropolitan Subgroups over Time

Columns: Metropolitan, Non-metropolitan.

Rows, 2014 versus 2001: $\Pr(2014 \geq_{FSD} 2001)$, $\Pr(2001 \geq_{FSD} 2014)$, Pr(No FSD), $\Pr(2014 \geq_{GLD} 2001)$, $\Pr(2001 \geq_{GLD} 2014)$, Pr(No GLD).

Further rows: $\Pr(2010 \geq_{FSD} 2001)$, $\Pr(2001 \geq_{FSD} 2010)$, Pr(No FSD), $\Pr(2010 \geq_{GLD} 2001)$, $\Pr(2001 \geq_{GLD} 2010)$, Pr(No GLD).

TABLE Dominance Probabilities for Metropolitan vs Non-metropolitan Subgroups

Columns: $\Pr(\text{MET} \geq_{FSD} \text{NON-MET})$, Pr(No FSD), $\Pr(\text{MET} \geq_{GLD} \text{NON-MET})$, Pr(No GLD), $\Pr(\text{MET} \geq_{LD} \text{NON-MET})$, $\Pr(\text{NON-MET} \geq_{LD} \text{MET})$, Pr(No LD).

Note: The omitted posterior probabilities for FSD and GLD of non-metropolitan over metropolitan for each year are all zero.

TABLE Restricted Dominance Probabilities for Metropolitan vs Non-metropolitan Subgroups for the Poorest 10% of the Population

Columns: $\Pr(\text{MET} \geq_{FSD} \text{NON-MET})$, Pr(No FSD), $\Pr(\text{MET} \geq_{GLD} \text{NON-MET})$, $\Pr(\text{NON-MET} \geq_{GLD} \text{MET})$, Pr(No GLD), $\Pr(\text{MET} \geq_{LD} \text{NON-MET})$, $\Pr(\text{NON-MET} \geq_{LD} \text{MET})$, Pr(No LD).

Note: