Consistency of Bayes factor for nonnested model selection when the model dimension grows

Abstract

Zellner's g-prior is a popular prior choice for model selection problems in the context of normal regression models. Wang and Sun [J. Statist. Plann. Inference 147 (2014) 95-105] recently adopted this prior and put a special hyper-prior on g, which results in a closed-form expression of the Bayes factor for nested linear model comparisons. They have shown that, under very general conditions, the Bayes factor is consistent when the two competing models are of order $O(n^{\tau})$ for $\tau < 1$, and for $\tau = 1$ it is almost consistent except for a small inconsistency region around the null hypothesis. In this paper, we study Bayes factor consistency for nonnested linear models with a growing number of parameters. Some of the proposed results generalize those for the Bayes factor in the case of nested linear models. Specifically, we compare the asymptotic behaviors of the proposed Bayes factor and the intrinsic Bayes factor in the literature.


Bernoulli 22(4), 2016, 2080–2100. DOI: 10.3150/15-BEJ720

MIN WANG and YUZO MARUYAMA

Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, USA. E-mail: [email protected]

Center for Spatial Information Science, University of Tokyo, Bunkyo-ku, Tokyo, 113-0033, Japan. E-mail: [email protected]

Keywords:

Bayes factor; growing number of parameters; model selection consistency; nonnested linear models; Zellner's g-prior

1. Introduction

We reconsider the classical linear regression model

$$Y = 1_n \alpha + X_p \beta_p + \varepsilon, \tag{1.1}$$

where $Y = (y_1, \ldots, y_n)'$ is an $n$-vector of responses, $X_p$ is an $n \times p$ design matrix of full column rank containing all potential predictors, $1_n$ is an $n \times 1$ vector of ones, $\alpha$ is an unknown intercept, and $\beta_p$ is a $p$-vector of unknown regression coefficients. Throughout the paper, it is assumed that the random error for all models follows the multivariate normal distribution, denoted by $\varepsilon \sim N(0_n, \sigma^2 I_n)$, where $0_n$ is an $n \times 1$ vector of zeros, $\sigma^2$ is an unknown positive scalar, and $I_n$ is the $n$-dimensional identity matrix. Without loss of generality, we also assume that the columns of $X_p$ have been centered, so that each column has mean zero.

This is an electronic reprint of the original article published by the ISI/BS in Bernoulli, 2016, Vol. 22, No. 4, 2080–2100. This reprint differs from the original in pagination and typographic detail.

In the class of linear regression models, we often assume that there is an unknown subset of important predictors that contributes to the prediction of $Y$ or has an impact on the response variable $Y$. This is by nature a model selection problem, in which we would like to select a linear model by identifying the important predictors in this subset.

Suppose that we have two such linear regression models $M_j$ and $M_i$, with dimensions $j$ and $i$,

$$M_j : Y = 1_n \alpha + X_j \beta_j + \varepsilon, \tag{1.2}$$

$$M_i : Y = 1_n \alpha + X_i \beta_i + \varepsilon, \tag{1.3}$$

where $X_i$ is an $n \times i$ submatrix of $X_p$ and $\beta_i$ is an $i \times 1$ vector of unknown regression coefficients. We allow the model dimensions to grow with the sample size $n$ and consider the following three asymptotic scenarios:

Scenario 1. $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $0 \le a_1 \le a_2 < 1$.

Scenario 2. $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $0 \le a_1 < a_2 = 1$.

Scenario 3. $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $a_1 = a_2 = 1$.

When the two models $M_i$ and $M_j$ are nested, Moreno, Girón and Casella [18] study the consistency of the intrinsic Bayes factor under the three asymptotic scenarios. Later on, Wang and Sun [22] derive an explicit closed-form Bayes factor associated with Zellner's $g$-prior for comparing the two models. They show that, under very general conditions, the Bayes factor is consistent when the two models are of order $O(n^{\tau})$ for $\tau < 1$, and for $\tau = 1$ it is almost consistent except for a small inconsistency region around the null hypothesis. Such a small set of models around the null hypothesis can be characterized in terms of a pseudo-distance between models defined by Moreno and Girón [17]. Finally, Wang and Sun [22] compare the proposed results with the ones for the intrinsic Bayes factor due to [18].

It should be noted that $M_i$ and $M_j$ are not necessarily nested in many practical situations.
As commented by Pesaran and Weeks [20], “in econometric analysis, nonnested models arise naturally when rival economic theories are used to explain the same phenomenon, such as unemployment, inflation or output growth.” In fact, the problem of comparing nonnested models has been studied in a fairly large body of econometric and statistical literature from both practical and theoretical viewpoints, dating back to [10]. For instance, Cox [4] develops a likelihood ratio testing procedure and shows that, under appropriate conditions, the proposed approach and its variants have well-behaved [...] encompassing from above and encompassing from below. Later on, Moreno and Girón [17] present a comparative analysis of the intrinsic Bayes factor under the two criteria in linear regression models. Recently, Girón et al. [8] study the consistency of the intrinsic Bayes factor for the case of nonnested linear models under the first two asymptotic scenarios above. The latter two papers mainly focus on the consistency of the intrinsic Bayes factor when the model dimension grows with the sample size. Under the same asymptotic scenario, researchers should also be interested in the consistency of the Bayes factor based on Zellner's $g$-prior, which is a popular prior choice for model selection problems in linear regression models. To the best of our knowledge, the latter has received little attention over the years, even though it is of the utmost importance to address the consistency issue for nonnested models.

In this paper, we investigate Bayes factor consistency associated with Zellner's $g$-prior for the problem of comparing nonnested models under the three asymptotic scenarios above. Specifically, we compare the asymptotic results for the proposed Bayes factor with those for the intrinsic Bayes factor due to [8]. The results show that the asymptotic behaviors of the two Bayes factors are quite comparable in the first two scenarios.
It is remarkable that we also study the consistency of the proposed Bayes factor under Scenario 3, whereas such a scenario is still an open problem for the intrinsic Bayes factor, as highlighted by Girón et al. [8].

The remainder of this paper is organized as follows. In Section 2, we present an explicit closed-form expression of the Bayes factor based on the null-based approach. In Section 3, we address the consistency of the Bayes factor for nonnested models under the three asymptotic scenarios. Additionally, we compare the proposed results with the ones for the intrinsic Bayes factor. An application of the results in Section 3 to the ANOVA models is provided in Section 4. Some concluding remarks are presented in Section 5, with additional proofs given in the Appendix.

2. Bayes factor

Within a Bayesian framework, one common approach to model selection is to compare models in terms of their posterior probabilities, given by

$$P(M_j \mid Y) = \frac{p(M_j)\, p(Y \mid M_j)}{\sum_i p(M_i)\, p(Y \mid M_i)} = \frac{p(M_j)\, \mathrm{BF}[M_j : M_b]}{\sum_i p(M_i)\, \mathrm{BF}[M_i : M_b]}, \tag{2.1}$$

where $p(M_j)$ is the prior probability of model $M_j$, $p(Y \mid M_j)$ is the marginal likelihood of $Y$ given $M_j$, and $\mathrm{BF}[M_j : M_b]$ is the Bayes factor, which compares each model $M_j$ to the base model $M_b$ and is defined as

$$\mathrm{BF}[M_j : M_b] = \frac{p(Y \mid M_j)}{p(Y \mid M_b)}. \tag{2.2}$$

The Bayes factor in (2.2) depends on the base model $M_b$, which is often chosen arbitrarily in practical situations. There are two common choices for $M_b$: the null-based approach, which uses the null model ($M_0$), and the full-based approach, which uses the full model ($M_F$). This paper focuses on the null-based approach because (i) the null model is commonly used as the base model when using Zellner's $g$-priors in most of the literature [14] and (ii) unlike the full model, the dimension of the null model is independent of the sample size. This is crucial in addressing the consistency of the Bayes factor with an increasing model dimension. Accordingly, we compare the reduced model $M_j$ with $M_0$:

$$M_j : Y = 1_n \alpha + X_j \beta_j + \varepsilon, \tag{2.3}$$

$$M_0 : Y = 1_n \alpha + \varepsilon. \tag{2.4}$$

A common practice with Zellner's $g$-prior [27] is to choose the same noninformative priors for the common parameters that appear in both models and to assign Zellner's $g$-prior to the parameters that are only in the larger model. The rationale for this choice is that if the common parameters are orthogonal (i.e., the expected Fisher information matrix is diagonal) to the new parameters in the larger model, the Bayes factor is quite robust to the choice of the same (even improper) priors for the common parameters; see [12].
Since $\alpha$ and $\sigma^2$ are the common orthogonal parameters in (2.3) and (2.4), we consider the following prior distributions for $(\alpha, \sigma^2, \beta_j)$:

$$M_0 : p(\alpha, \sigma^2) \propto \frac{1}{\sigma^2}, \qquad M_j : p(\alpha, \sigma^2, \beta_j) \propto \frac{1}{\sigma^2} \ \text{ and } \ \beta_j \mid \sigma^2 \sim N\bigl(0_j,\ g\sigma^2 (X_j' X_j)^{-1}\bigr). \tag{2.5}$$

The amount of information in Zellner's $g$-prior is controlled by the scaling factor $g$, and thus the choice of $g$ is quite critical. A nice review of various choices of $g$-priors was provided by Liang et al. [14] and later discussed further by Ley and Steel [13]. In most developments of the $g$-priors, the Bayes factor does not have an analytically tractable form, so numerical approximations are generally employed, and it may not be easy for practitioners to choose an appropriate one. In particular, standard approximations, such as the Laplace approximation, become quite challenging when the number of parameters grows with the sample size.

It is remarkable that Maruyama and George [16] propose an explicit closed-form expression of the Bayes factor based on the combined use of a generalization of Zellner's $g$-prior and the beta-prime prior for $g$:

$$\pi(g) = \frac{g^{b} (1 + g)^{-a-b-2}}{B(a + 1, b + 1)}\, I_{(0,\infty)}(g), \tag{2.6}$$

where $a > -1$, $b > -1$, and $B(\cdot, \cdot)$ is the beta function. Noting that Zellner's $g$-prior is a special case of the generalization of Zellner's $g$-prior in [16], we obtain the following result; the proof follows directly from Theorem 3.1 of [16] and is thus omitted.

Theorem 1.

Under the prior in (2.6) with $b = (n - j - 1)/2 - a - 2$, the Bayes factor for comparing $M_j$ and $M_0$ can be simplified as

$$\mathrm{BF}[M_j : M_0] = \frac{\Gamma(j/2 + a + 1)\, \Gamma((n - j - 1)/2)}{\Gamma(a + 1)\, \Gamma((n - 1)/2)}\, (1 - R_j^2)^{-(n-j-1)/2 + a + 1}, \tag{2.7}$$

where $R_j^2$ is the usual coefficient of determination of model $M_j$.

The Bayes factor in (2.7) is very attractive for practitioners because of its explicit expression without an integral representation, which is not available for other choices of the hyperparameter $b$. One may argue that such an expression comes at a certain cost in interpreting the role of the prior for $g$, since this prior depends on both the sample size and the model size through the hyperparameter $b$. It is noteworthy that this type of prior has been studied in the literature. For example, Bayarri et al. [1] propose a truncated version of the beta-prime prior for $g$, such that $g > (n + 1)/(j + 3) - 1$. A similar type of prior has also been considered by Ley and Steel [13].

At this point, we provide several arguments justifying the specification of the hyperparameters as follows. (i) The choice of $b = (n - j - 1)/2 - a - 2$ corresponds to an $O(n)$ choice of $g$ [16], that is, $g = O(n)$, which prevents the hyper-$g$ prior from asymptotically dominating the likelihood function; (ii) as the sample size grows, the right tail of the beta-prime prior behaves like $g^{-(a+2)}$, leading to a very fat tail for small values of $a$, an attractive property suggested by Gustafson, Hossain and MacNab [9]; (iii) with a choice of $a = -1/2$ and $\theta = (X'X)^{1/2} \beta$, the prior makes the asymptotic tail behavior of

$$p(\theta \mid \sigma^2) = \int_0^{\infty} p(\theta \mid \sigma^2, g)\, \pi(g)\, dg \tag{2.8}$$

become the multivariate Cauchy for sufficiently large $\theta \in \mathbb{R}^p$, as recommended by Zellner [27]; (iv) the resulting Bayes factor in (2.7) enjoys nice theoretical properties and good performance in practical applications; see, for example, [16, 22, 23], among others; and (v) when the model dimension $j$ is bounded, the Bayes factor in (2.7) is asymptotically equivalent to the Schwarz approximation.

Theorem 2.

When the model dimension $j$ is fixed, for large sample sizes $n$, the Bayes factor in (2.7) is equivalent to the Schwarz approximation given by

$$\mathrm{BF}[M_j : M_0] \approx \exp\left[ -\frac{j}{2} \log n - \frac{n}{2} \log(1 - R_j^2) \right]. \tag{2.9}$$

Proof.

See the Appendix. □
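As a minimal sketch of how Theorems 1 and 2 might be evaluated numerically, the following Python snippet computes the logarithm of the closed-form Bayes factor in (2.7) and of the Schwarz approximation in (2.9). The function names `log_bf_null` and `log_bf_schwarz`, the default $a = -1/2$, and the example inputs are illustrative assumptions, not part of the paper.

```python
import math


def log_bf_null(n, j, r2, a=-0.5):
    """Log of the closed-form Bayes factor BF[M_j : M_0] in (2.7).

    n  : sample size
    j  : number of predictors in M_j
    r2 : coefficient of determination R_j^2 of M_j
    a  : hyperparameter of the beta-prime prior (a > -1)
    """
    if not (n - j - 1 > 0 and 0.0 <= r2 < 1.0 and a > -1.0):
        raise ValueError("need n > j + 1, 0 <= R^2 < 1 and a > -1")
    # Ratio of gamma functions, computed on the log scale to avoid overflow.
    log_gamma_ratio = (
        math.lgamma(j / 2 + a + 1) + math.lgamma((n - j - 1) / 2)
        - math.lgamma(a + 1) - math.lgamma((n - 1) / 2)
    )
    # Exponent of (1 - R_j^2) in (2.7).
    exponent = -(n - j - 1) / 2 + a + 1
    return log_gamma_ratio + exponent * math.log1p(-r2)


def log_bf_schwarz(n, j, r2):
    """Log of the Schwarz approximation in (2.9)."""
    return -(j / 2) * math.log(n) - (n / 2) * math.log1p(-r2)


if __name__ == "__main__":
    # Hypothetical inputs: fixed dimension j = 3 and fixed R^2 = 0.1.
    for n in (100, 1_000, 10_000, 100_000):
        exact = log_bf_null(n, j=3, r2=0.1)
        approx = log_bf_schwarz(n, j=3, r2=0.1)
        print(n, round(exact, 3), round(approx, 3))
```

Working on the log scale with `math.lgamma` keeps the computation stable for large $n$, where the gamma functions themselves would overflow.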


One of the most attractive properties of Bayesian approaches is model selection consistency, which means that the true model (assuming it exists) will be selected if enough data are provided. This property has been intensively studied under different asymptotic scenarios as the sample size approaches infinity. For example, when the model dimension is fixed, see [3, 13, 14, 16], to name just a few. Of particular note is that the consistency of the various Bayes factors in the listed references behaves very similarly, because for sufficiently large values of $n$, the intrinsic Bayes factor and Bayes factors associated with mixtures of $g$-priors (e.g., $g = n$ and the Zellner–Siow prior) can all be approximated by the Schwarz approximation in (2.9); see Theorem 2 of [19]. Also, we can show that this approximation is valid for the Bayes factor with the hyper-$g$ prior in [14].

When the model dimension grows with the sample size, Moreno, Girón and Casella [18] study the consistency of the intrinsic Bayes factors for comparing nested models, and a generalization of the consistency to nonnested models has been addressed by Girón et al. [8]. More recently, Wang and Sun [22] address the consistency of the Bayes factor associated with Zellner's $g$-prior for nested models, whereas its consistency for the case of nonnested models is also of the utmost importance. We shall be particularly interested in comparing the asymptotic behaviors of the proposed Bayes factor and the intrinsic Bayes factor under the same asymptotic scenario. The presented results provide researchers with a valuable theoretical basis for comparisons among nested and nonnested models, which arise naturally in practical situations.

3. Bayes factor consistency for nonnested linear models

In this section, we consider the model selection consistency of the Bayes factor for comparing nonnested models under the three asymptotic scenarios. The Bayes factor in (2.7) cannot be applied directly to the problem of comparing nonnested models, but we can calculate the Bayes factor between $M_j$ and $M_0$, $\mathrm{BF}[M_j : M_0]$, and the Bayes factor between $M_i$ and $M_0$, $\mathrm{BF}[M_i : M_0]$. Thereafter, the Bayes factor for comparing $M_j$ and $M_i$ can be formulated as

$$\mathrm{BF}[M_j : M_i] = \frac{\mathrm{BF}[M_j : M_0]}{\mathrm{BF}[M_i : M_0]}. \tag{3.1}$$

The Bayes factor for comparing $M_j$ and $M_i$ in (1.2) and (1.3) is thus given by

$$\mathrm{BF}[M_j : M_i] = \frac{\Gamma(j/2 + a + 1)\, \Gamma((n - j - 1)/2)}{\Gamma(i/2 + a + 1)\, \Gamma((n - i - 1)/2)} \cdot \frac{(1 - R_j^2)^{-(n-j-1)/2 + a + 1}}{(1 - R_i^2)^{-(n-i-1)/2 + a + 1}}. \tag{3.2}$$

Let $M_T$ stand for the true model

$$M_T : Y = 1_n \alpha + X_T \beta_T + \varepsilon.$$

The Bayes factor is said to be consistent if

$$\operatorname*{plim}_{n \to \infty} \mathrm{BF}[M_j : M_i] = \infty$$

if $M_j$ is the true model $M_T$, whereas

$$\operatorname*{plim}_{n \to \infty} \mathrm{BF}[M_j : M_i] = 0$$

if $M_i$ is the true model $M_T$, where ‘plim’ stands for convergence in probability and the probability distribution is the sampling distribution under $M_T$. For notational simplicity, let

$$\delta_{ji} = \frac{1}{\sigma^2}\, \beta_j'\, \frac{X_j' (I_n - H_i) X_j}{n}\, \beta_j,$$

where $H_i = X_i (X_i' X_i)^{-1} X_i'$ with $X_i$ being an $n \times i$ submatrix of $X_p$. According to [8], the value of $\delta_{ji}$ can be viewed as a pseudo-distance between $M_j$ and $M_i$, in which the two models are not necessarily nested. Such a pseudo-distance has the following properties: (i) it is always equal to 0 from any model $M_j$ to itself, that is, $\delta_{jj} = 0$; (ii) if $M_i$ is nested in $M_j$, it is also equal to 0, that is, $\delta_{ij} = 0$; and (iii) for any model $M_k$, we have $\delta_{ki} \ge \delta_{kj}$ if $M_i$ is nested in $M_j$. To study model selection consistency, it is usually assumed that when the sample size approaches infinity, the limiting value of $\delta_{ji}$, denoted by $\delta^*_{ji}$, always exists, where

$$\delta^*_{ji} = \lim_{n \to \infty} \frac{1}{\sigma^2}\, \beta_j'\, \frac{X_j' (I_n - H_i) X_j}{n}\, \beta_j. \tag{3.3}$$

In what follows, let $\lim_{n \to \infty} [M]\, Z_n$ represent the limit in probability of the random sequence $\{Z_n : n \ge 1\}$ under the assumption that we are sampling from model $M$. We present one useful lemma, which is critical for deriving the main theorems in this paper; the proof follows directly from Lemma 1 of [8] and is not shown here.

Lemma 1.

Suppose that we are interested in comparing two models $M_i$ and $M_p$ with dimensions $i$ and $p$, respectively, where $M_i$ is nested in $M_p$. As $n$ approaches infinity, both $i$ and $p$ grow with $n$ as $i = O(n^{a_1})$ and $p = O(n^{a_2})$ for $0 \le a_1 \le a_2 \le 1$. When sampling from the true model $M_T$:

(i) If $0 \le a_1 \le a_2 < 1$, it follows that

$$\lim_{n \to \infty} [M_T] \left\{ \frac{1 - R_p^2}{1 - R_i^2} \right\} = \frac{1 + \delta^*_{tp}}{1 + \delta^*_{ti}}.$$

(ii) If $0 \le a_1 < a_2 = 1$, it follows that

$$\lim_{n \to \infty} [M_T] \left\{ \frac{1 - R_p^2}{1 - R_i^2} \right\} = \frac{1 + \delta^*_{tp} - 1/r}{1 + \delta^*_{ti}},$$

where $r = \lim_{n \to \infty} n/p > 1$.

(iii) If $a_1 = a_2 = 1$, it follows that

$$\lim_{n \to \infty} [M_T] \left\{ \frac{1 - R_p^2}{1 - R_i^2} \right\} = \frac{1 + \delta^*_{tp} - 1/r}{1 + \delta^*_{ti} - 1/s},$$

where $r = \lim_{n \to \infty} n/p > 1$ and $s = \lim_{n \to \infty} n/i > 1$.

We are now in a position to characterize the consistency of the Bayes factor in (3.2) for comparing nonnested linear models. We begin with Scenario 1, that is, the dimensions of models $M_i$ and $M_j$ are $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $0 \le a_1 \le a_2 < 1$, respectively. The following theorem summarizes Bayes factor consistency when either of the two models is the true model.

Theorem 3.

Let $M_0$ be the null model nested in both nonnested models $M_i$ and $M_j$, whose dimensions are $i$ and $j$, respectively. Suppose that $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $0 \le a_1 \le a_2 < 1$ and that $\delta^*_{ij} > 0$ and $\delta^*_{ji} > 0$. The Bayes factor in (3.2) is consistent whichever the true model is.

Proof.

See the Appendix. □
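The nonnested comparison in (3.1) and (3.2) only needs the two null-based Bayes factors. The sketch below, with hypothetical function names and example values chosen purely for illustration, evaluates $\log \mathrm{BF}[M_j : M_i]$ as the difference of two log Bayes factors from (2.7).

```python
import math


def log_bf_null(n, k, r2, a=-0.5):
    """Log BF[M_k : M_0] from (2.7) for a model with k predictors."""
    return (
        math.lgamma(k / 2 + a + 1) + math.lgamma((n - k - 1) / 2)
        - math.lgamma(a + 1) - math.lgamma((n - 1) / 2)
        + (-(n - k - 1) / 2 + a + 1) * math.log1p(-r2)
    )


def log_bf_nonnested(n, j, r2_j, i, r2_i, a=-0.5):
    """Log BF[M_j : M_i] via (3.1): difference of two null-based log BFs."""
    return log_bf_null(n, j, r2_j, a) - log_bf_null(n, i, r2_i, a)


def preferred_model(n, j, r2_j, i, r2_i, a=-0.5):
    """Return the label of the model favoured by the Bayes factor."""
    return "M_j" if log_bf_nonnested(n, j, r2_j, i, r2_i, a) > 0 else "M_i"


if __name__ == "__main__":
    # Hypothetical summary statistics for two nonnested models.
    print(log_bf_nonnested(n=200, j=5, r2_j=0.40, i=3, r2_i=0.30))
    print(preferred_model(n=200, j=5, r2_j=0.40, i=3, r2_i=0.30))
```

Because the common terms $\Gamma(a+1)$ and $\Gamma((n-1)/2)$ cancel in the difference, the result agrees with (3.2) without the base model appearing explicitly.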

Under the same asymptotic scenario, Girón et al. [8] also conclude that the intrinsic Bayes factor is consistent whichever the true model is when $\delta^*_{ij} > 0$ and $\delta^*_{ji} > 0$. This agreement between the two Bayes factors arises because the dominant term in their asymptotic approximations is exactly the same under Scenario 1. It is noteworthy that Theorem 3 is also valid for any other base model nested in both models $M_i$ and $M_j$, even though the main result of the theorem is derived based on the null-based approach. Moreover, Theorem 3 can be directly applied to the case in which the dimensions of the two competing models are fixed, because it can be viewed as a limiting case with both $\lim_{n \to \infty} n/j$ and $\lim_{n \to \infty} n/i$ approaching infinity.

Corollary 1.

Suppose we are interested in comparing two models $M_i$ and $M_j$ with dimensions $i$ and $j$, respectively, and that both dimensions are fixed. The Bayes factor in (3.2) is consistent under both models provided that $\delta^*_{ij} > 0$ and $\delta^*_{ji} > 0$.

We now investigate Bayes factor consistency when the dimension of one of the nonnested models is of order $O(n)$. The main results are provided in the following theorem.

Theorem 4.

Let $M_0$ be the null model nested in both nonnested models $M_i$ and $M_j$, whose dimensions are $i$ and $j$, respectively. Suppose that $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $0 \le a_1 < a_2 = 1$ and that there exists a positive constant $r$ such that $r = \lim_{n \to \infty} n/j > 1$.

(a) The Bayes factor in (3.2) is consistent under $M_i$, provided that $\delta^*_{ij} > 0$.

(b) The Bayes factor in (3.2) is consistent under $M_j$ provided that

$$\delta^*_{ji} \in \bigl( \kappa(r, \delta^*_{j0}),\ \delta^*_{j0} \bigr], \tag{3.4}$$

and $\delta^*_{j0} > \delta(r)$, where

$$\kappa(r, s) = [r(1 + s)]^{1/r} - 1 \quad \text{and} \quad \delta(r) = r^{1/(r-1)} - 1. \tag{3.5}$$

Proof.

See the Appendix. □
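The conditions of Theorem 4(b) are simple enough to check numerically. The following sketch assumes the reading $\kappa(r, s) = [r(1+s)]^{1/r} - 1$ and $\delta(r) = r^{1/(r-1)} - 1$ of (3.5); the function names and the example values of $r$, $\delta^*_{ji}$ and $\delta^*_{j0}$ are hypothetical.

```python
def kappa(r, s):
    """Lower bound kappa(r, s) = [r(1 + s)]^(1/r) - 1 of the region (3.4)."""
    return (r * (1.0 + s)) ** (1.0 / r) - 1.0


def delta_threshold(r):
    """Threshold delta(r) = r^(1/(r-1)) - 1 from (3.5), for r > 1."""
    return r ** (1.0 / (r - 1.0)) - 1.0


def consistent_under_mj(r, delta_ji, delta_j0):
    """Check the two conditions of Theorem 4(b) for given limiting distances."""
    return (
        delta_j0 > delta_threshold(r)
        and kappa(r, delta_j0) < delta_ji <= delta_j0
    )


if __name__ == "__main__":
    r = 2.0  # hypothetical limit of n / j
    for delta_ji, delta_j0 in [(2.5, 3.0), (1.5, 3.0), (0.4, 0.5)]:
        print(delta_ji, delta_j0, consistent_under_mj(r, delta_ji, delta_j0))
```

Under this reading, the interval in (3.4) is nonempty exactly when $\kappa(r, \delta^*_{j0}) < \delta^*_{j0}$, which rearranges to $\delta^*_{j0} > \delta(r)$; this is why the two conditions appear together.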

Several interesting findings can be drawn from the theorem. First, the lower bound of $\delta^*_{j0}$, denoted by $\delta(r)$, is exactly the same as the one in Theorem 2 of [22] for comparing nested linear models. Second, Theorem 4 can be extended to the case of nested model comparisons (i.e., $M_i$ is nested in $M_j$) by assuming that $M_0 = M_i$. Third, the Bayes factor depends on the choice of the base model through the value of $\delta^*_{j0}$; therefore, to enlarge the consistency region in (3.4), we need to make $\delta^*_{j0}$ as large as possible. This justifies the null model $M_0$ as the best choice of base model. Fourth, the lower bound of $\delta^*_{ji}$, denoted by $\kappa(r, \delta^*_{j0})$, is a bounded decreasing function in $r$ and satisfies, for any $\delta^*_{j0} > 0$,

$$\lim_{r \to \infty} \kappa(r, \delta^*_{j0}) = 0.$$

Finally, under the same scenario, Girón et al. [8] consider the consistency of the intrinsic Bayes factor and conclude that the intrinsic Bayes factor is consistent under $M_i$ if $\delta^*_{ij} > 0$ and under $M_j$, provided that $\delta^*_{j0} > \xi(r)$ with

$$\xi(r) = \frac{r - 1}{(r + 1)^{(r-1)/r} - 1} - 1, \tag{3.6}$$

and

$$\delta^*_{ji} \in \bigl( \eta(r, \delta^*_{j0}),\ \delta^*_{j0} \bigr], \tag{3.7}$$

where

$$\eta(r, s) = \frac{r + s}{(1 + r)^{(r-1)/r}} - 1.$$

In both (3.4) and (3.7), $\delta^*_{ji}$ is bounded above by $\delta^*_{j0}$. Figure 1 shows that the upper bounds of their inconsistency regions tend to each other as $r$ increases. Moreover, Figure 2 provides their lower bounds for different values of $\delta^*_{j0}$. When $\delta^*_{j0}$ is small, the consistency region of the proposed Bayes factor is included in that of the intrinsic Bayes factor, although the difference between the two regions is small; see Figure 2(a). However, when $\delta^*_{j0}$ gets larger, the consistency region of the proposed Bayes factor contains that of the intrinsic Bayes factor, and the difference between the two regions becomes significant as $\delta^*_{j0}$ increases; see Figure 2(b). Thus, we may conclude that as $\delta^*_{j0}$ increases, the proposed Bayes factor outperforms the intrinsic Bayes factor from a theoretical viewpoint.
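The comparison between the two Bayes factors under Scenario 2 can be explored numerically. The sketch below assumes the readings $\xi(r) = (r-1)/[(r+1)^{(r-1)/r} - 1] - 1$ and $\eta(r, s) = (r+s)/(1+r)^{(r-1)/r} - 1$ of (3.6) and (3.7), together with $\kappa$ and $\delta$ from (3.5); the function names and the grid of values are hypothetical.

```python
def kappa(r, s):
    """Lower bound in (3.4) for the proposed Bayes factor."""
    return (r * (1.0 + s)) ** (1.0 / r) - 1.0


def delta_threshold(r):
    """Threshold delta(r) in (3.5) for the proposed Bayes factor."""
    return r ** (1.0 / (r - 1.0)) - 1.0


def eta(r, s):
    """Lower bound in (3.7) for the intrinsic Bayes factor (assumed form)."""
    return (r + s) / (1.0 + r) ** ((r - 1.0) / r) - 1.0


def xi(r):
    """Threshold xi(r) in (3.6) for the intrinsic Bayes factor (assumed form)."""
    return (r - 1.0) / ((r + 1.0) ** ((r - 1.0) / r) - 1.0) - 1.0


if __name__ == "__main__":
    # Thresholds on delta*_{j0}: both shrink as r grows.
    for r in (1.5, 2.0, 5.0, 20.0, 50.0):
        print("r =", r, "delta(r) =", delta_threshold(r), "xi(r) =", xi(r))
    # Lower bounds on delta*_{ji} for a small and a large delta*_{j0}.
    for s in (1.5, 20.0):
        print("delta_j0 =", s, "kappa =", kappa(2.0, s), "eta =", eta(2.0, s))
```

A smaller lower bound means a larger consistency region, so comparing `kappa` with `eta` at a given $\delta^*_{j0}$ indicates which Bayes factor has the wider region at that value.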

Figure 1. The inconsistency region comparisons (below the curves) for the proposed Bayes factor and the intrinsic Bayes factor under Scenario 2.

It is worth mentioning that the existence of an inconsistency region around the null hypothesis is quite reasonable from a practical point of view, because the nontrue smaller model $M_i$ is parsimonious in the large-$p$ situation and is generally selected when conducting model selection if the true larger model $M_j$ is not clearly distinguishable from $M_i$. From the prediction point of view, Maruyama [15] has demonstrated the reasonability of the inconsistency region for the one-way fixed-effect ANOVA model, which can be viewed as a special case of the classical linear model in (1.1) after some reparameterization.

Figure 2. The lower bounds of the consistency regions in (3.4) and (3.7) with different limiting values of $\delta_{j0}$ under Scenario 2. Panel (a): a small value of $\delta^*_{j0}$; panel (b): $\delta^*_{j0} = 20$.

A theoretical justification of this line of thought for a more general model is still under investigation and will be reported elsewhere.

The first two theorems mainly focus on the consistency of the Bayes factor for the case in which at least one model is of order $O(n^{\alpha})$ for $\alpha < 1$. It is worth investigating the consistency issue for the case where both models are of order $O(n)$, that is, the growth rates of the two model dimensions are as fast as $n$. Such a scenario remains an open problem for the intrinsic Bayes factor, as noted by Girón et al. [8]. We summarize the consistency of the proposed Bayes factor under this scenario in the following theorem.

Theorem 5.

Let $M_0$ be the null model nested in both nonnested models $M_i$ and $M_j$ with dimensions $i = O(n)$ and $j = O(n)$, respectively. Suppose that there exist positive constants $r$ and $s$ such that $r = \lim_{n \to \infty} n/j > 1$ and $s = \lim_{n \to \infty} n/i > 1$. Without loss of generality, we assume that $r \le s$.

(a) The Bayes factor in (3.2) is consistent under $M_i$ provided that

$$\delta^*_{ij} \in \left( \frac{r - 1}{r} \left\{ \left[ \frac{s^{1/s}}{r^{1/r}} (1 + \delta^*_{i0})^{1/s - 1/r} \right]^{r/(r-1)} - 1 \right\},\ \delta^*_{i0} \right], \tag{3.8}$$

and that $\delta^*_{i0} > 0$ satisfies

$$\left( 1 + \frac{\delta^*_{i0}}{1 - 1/r} \right)^{1 - 1/r} > \frac{(1/r)^{1/r}}{(1/s)^{1/s}} (1 + \delta^*_{i0})^{1/s - 1/r}. \tag{3.9}$$

(b) The Bayes factor in (3.2) is consistent under $M_j$ provided that

$$\delta^*_{ji} \in \bigl( \phi(r, s, \delta^*_{j0}),\ \delta^*_{j0} \bigr], \tag{3.10}$$

where

$$\phi(a, b, c) = \frac{b - 1}{b} \left\{ \left[ \frac{a^{1/a}}{b^{1/b}} (1 + c)^{1/a - 1/b} \right]^{b/(b-1)} - 1 \right\},$$

and that $\delta^*_{j0} > 0$ satisfies

$$\left( 1 + \frac{\delta^*_{j0}}{1 - 1/s} \right)^{1 - 1/s} > \frac{r^{1/r}}{s^{1/s}} (1 + \delta^*_{j0})^{1/r - 1/s}. \tag{3.11}$$

Proof.

See the Appendix. □
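The bounds in Theorem 5 can be evaluated numerically as a sanity check on their limiting behavior. The sketch below assumes the form $\phi(a, b, c) = \frac{b-1}{b}\{[a^{1/a} b^{-1/b} (1+c)^{1/a - 1/b}]^{b/(b-1)} - 1\}$, which mirrors the lower bound in (3.8) with the roles of $r$ and $s$ exchanged; the function names and numerical inputs are hypothetical.

```python
def phi(a, b, c):
    """Assumed form of phi(a, b, c), the lower bound of the region (3.10)."""
    base = a ** (1.0 / a) / b ** (1.0 / b) * (1.0 + c) ** (1.0 / a - 1.0 / b)
    return (b - 1.0) / b * (base ** (b / (b - 1.0)) - 1.0)


def lower_bound_under_mi(r, s, delta_i0):
    """Lower end of the interval (3.8): phi with r and s exchanged."""
    return phi(s, r, delta_i0)


def kappa(r, s):
    """Lower bound kappa(r, s) from (3.5), used for the s -> infinity check."""
    return (r * (1.0 + s)) ** (1.0 / r) - 1.0


if __name__ == "__main__":
    r, delta_j0 = 2.0, 3.0  # hypothetical values
    # As s grows, phi(r, s, delta_j0) approaches kappa(r, delta_j0).
    for s in (2.0, 10.0, 1e3, 1e6):
        print("s =", s, "phi =", phi(r, s, delta_j0), "kappa =", kappa(r, delta_j0))
    # A non-positive lower bound means no inconsistency region under M_i.
    print(lower_bound_under_mi(3.0, 5.0, 1.0))
```

With equal growth rates ($r = s$) the bracketed term equals one, so both lower bounds vanish, which matches the observation that the inconsistency region disappears in that case.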

Unlike the first two asymptotic scenarios, Theorem 5(a) shows that under Scenario 3 there exists an inconsistency region around the alternative hypothesis when $M_i$ is true, and that consistency under $M_i$ depends on the chosen base model $M_0$ through the distance $\delta^*_{i0}$ only. The existence of the inconsistency region is quite reasonable because there are many candidates for the base model, which could have a dimension of order $O(n^{a})$ with $a \le 1$. In particular, we observe that the inconsistency region disappears when $r = s$. This is also understandable because, with the same growth rates, the parsimonious model is typically preferred in model selection. Furthermore, it can easily be shown that the inequality in (3.9) and the lower bound of the consistency region in (3.8) are both valid for any $\delta^*_{i0} > 0$ if $s^{1/s} \le r^{1/r}$, indicating that for any $\delta^*_{i0} > 0$, the inconsistency region disappears whenever $s \ge r \ge e \approx 2.718$ [...] $\delta^*_{i0}$. Finally, when $s$ tends to infinity, the inconsistency region disappears for any $\delta^*_{i0} > 0$ and $r > 1$, which shows that Theorem 5(a) reduces to Theorem 4(a).

Theorem 5(b) shows that the consistency region under $M_j$ depends on the chosen base model through $\delta^*_{j0}$ only. Thus, the base model should be chosen as small as possible to maximize the value of $\delta^*_{j0}$. Note that when $r = s$, the inconsistency region disappears under $M_j$. Also, if the rate of growth of $M_i$ is smaller than that of $M_j$ (i.e., $s$ tends to infinity), then with $\lim_{s \to \infty} s^{1/s} = 1$, the inequality in (3.11) becomes

$$\delta^*_{j0} > r^{1/(r-1)} - 1 = \delta(r), \tag{3.12}$$

which is the inequality in (3.5) of Theorem 4, and the lower bound in (3.10) is

$$\lim_{s \to \infty} \phi(r, s, \delta^*_{j0}) = \lim_{s \to \infty} \frac{s - 1}{s} \left\{ \left[ \frac{r^{1/r}}{s^{1/s}} (1 + \delta^*_{j0})^{1/r - 1/s} \right]^{s/(s-1)} - 1 \right\} = [r(1 + \delta^*_{j0})]^{1/r} - 1 = \kappa(r, \delta^*_{j0}).$$

This illustrates that Theorem 4(b) is just a special case of Theorem 5(b) when $s$ approaches infinity. We may thus conclude that when $s$ tends to infinity, Theorem 5 reduces to Theorem 4.

We have compared the consistency of the proposed Bayes factor with that of the intrinsic Bayes factor due to [8] under the first two asymptotic scenarios above. A brief summary of comparisons between the two Bayes factors is presented in Table 1. We observe that the consistency results presented here are similar to the ones for the intrinsic Bayes factor studied by Girón et al. [8]. The similarity occurs mainly because the asymptotic behaviors of the two Bayes factors depend on a limiting value of $(1 - R_j^2)/(1 - R_i^2)$, summarized in Lemma 1. The consistency of the intrinsic Bayes factor is still an open problem under Scenario 3. We presume that under Scenario 3, the consistency of the intrinsic Bayes factor behaves similarly to that of the proposed Bayes factor, but further investigation of this presumption is required.

Table 1.

The consistency regions of the Bayes factor in (3.2) and the intrinsic Bayes factor due to [8] for different choices of $a_1$ and $a_2$

| Rate of divergence | The proposed Bayes factor | The intrinsic Bayes factor |
|---|---|---|
| $0 < a_1 = a_2 = 1$ | $M_j$: $\delta^*_{j0} > \psi(r)$ and $\delta^*_{ji} \in (\phi(r, s, \delta^*_{j0}), \delta^*_{j0}]$ | $M_j$: unknown |
| $0 \le a_1 < a_2 = 1$ | $M_j$: $\delta^*_{j0} > \delta(r)$ and $\delta^*_{ji} \in (\kappa(r, \delta^*_{j0}), \delta^*_{j0}]$ | $M_j$: $\delta^*_{j0} > \xi(r)$ and $\delta^*_{ji} \in (\eta(r, \delta^*_{j0}), \delta^*_{j0}]$ |
| $0 \le a_1 \le a_2 < 1$ | $M_j$: $\delta^*_{ij} > 0$ and $\delta^*_{ji} > 0$ | $M_j$: $\delta^*_{ij} > 0$ and $\delta^*_{ji} > 0$ |

4. Application

It is well known that ANalysis Of VAriance (ANOVA) models are extremely important in exploratory and confirmatory data analysis in various fields, including agriculture, biology, ecology, and psychology. One major difference between the ANOVA models and the classical linear model is that the matrix $[1_n, X_p]$ does not necessarily have full column rank in the ANOVA setting. Some constraints are thus required to make the model identifiable. Here, under the sum-to-zero constraint [6], the ANOVA model with constraints for uniqueness can be reparameterized into the classical linear model without constraints; see [26].

As an illustration, Maruyama [15] and Wang and Sun [21] reparameterize the ANOVA models with the sum-to-zero constraint into the classical linear model in (1.1). Thereafter, based on Zellner's $g$-prior with the beta-prime prior for $g$, they obtain an explicit closed-form Bayes factor, which can be treated as a special case of the Bayes factor in (2.7). Consequently, the asymptotic results for the proposed Bayes factor can be easily applied to various ANOVA models. The application to the one-way ANOVA model is straightforward and is thus omitted here. In this section, we mainly consider the results for the two-way balanced ANOVA model with the same number of observations per cell. It is worth mentioning that the results can also be generalized to cover the unbalanced case.

Consider a factorial design with two treatment factors $A$ and $B$ having $p$ and $q$ levels, respectively, with a total of $pq$ factorial cells. Suppose $y_{ijl}$ is the $l$th observation in the $(i, j)$th cell defined by the $i$th level of $A$ and the $j$th level of $B$, satisfying the following model

$$y_{ijl} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijl}, \qquad \varepsilon_{ijl} \sim N(0, \sigma^2), \tag{4.1}$$

for $i = 1, \ldots, p$, $j = 1, \ldots, q$, and $l = 1, \ldots, r$. The total number of observations is $n = pqr$.
We shall be interested in the following five submodels:

$M$: No effect of $A$ and no effect of $B$, that is, $\alpha_i = 0$, $\beta_j = 0$, and $\gamma_{ij} = 0$ for all $i$ and $j$.
$M$: Only effect of $A$, that is, $\beta_j = 0$ and $\gamma_{ij} = 0$ for all $i$ and $j$.
$M$: Only effect of $B$, that is, $\alpha_i = 0$ and $\gamma_{ij} = 0$ for all $i$ and $j$.
$M$: The additive model (without interaction), that is, $\gamma_{ij} = 0$ for all $i$ and $j$.
$M$: The full model (with interaction).

By using the sum-to-zero constraint, Maruyama derives an explicit closed-form Bayes factor associated with Zellner's $g$-prior for the regression coefficients of the reparameterized model (i.e., equation (4.7) of [15]) and the beta-prime distribution for the scaling factor $g$. Moreover, Maruyama studies the consistency of the Bayes factor under different asymptotic scenarios. When both $p$ and $q$ approach infinity and $r$ is fixed, Maruyama concludes that the Bayes factor is consistent except under the full model $M$, and that when sampling from $M$, the Bayes factor is consistent only if
\[
\delta^{*} > H(r, \delta^{*} + \delta^{*}), \tag{4.2}
\]
where $\delta^{*}_{ji}$ is equal to the limit of the sum of squares of the differences between the coefficients of model $M_i$ and the coefficients of model $M_j$ as $n$ tends to infinity, and $H(r, c)$ with positive $c$ is the (unique) positive solution of
\[
\frac{(x + 1)^{r}}{r} - (x + 1) - c = 0. \tag{4.3}
\]
Such an inconsistency region occurs due to the model comparison between $M$ and $M$.

Of particular note is that when comparing $M$ and $M$, we are in the case of Theorem 4 with $a_2 = 1$, and that any null hypothesis will result in a model $M_i$ with a reduced set of parameters that satisfies $a_1 < a_2$ of Theorem 4. Consequently, when sampling from the full model $M$, the Bayes factor in (3.2) is consistent only if
\[
\delta^{*}_{i} \le \delta^{*} \quad\text{and}\quad \delta^{*}_{i} > [r(1 + \delta^{*})]^{1/r} - 1. \tag{4.4}
\]
When comparing models $M$ and $M$, the consistency region in (4.4) becomes
\[
\delta^{*} > [r(1 + \delta^{*} + \delta^{*} + \delta^{*})]^{1/r} - 1,
\]
which is equivalent to
\[
\frac{(\delta^{*} + 1)^{r}}{r} - (\delta^{*} + 1) - (\delta^{*} + \delta^{*}) > 0. \tag{4.5}
\]
The boundary of this region coincides exactly with equation (4.3) provided by Maruyama [15]. It is worth mentioning that the results of the preceding section extend straightforwardly to higher-order designs.
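The threshold function $H(r, c)$ defined through (4.3) has no closed form for general $r$, so a small numerical sketch may help in reading (4.2) to (4.5). The Python code below is only an illustration under stated assumptions: the function names `f`, `H`, `lower_bound` and `in_region`, the use of bisection, and the sample values $r = 2$, $c = 1$ are all chosen here and do not come from the paper. It assumes $r > 1$ and $c > 0$, in which case the left-hand side of (4.3) is negative at $x = 0$ and increasing for $x > 0$.

```python
import math


def f(x: float, r: float, c: float) -> float:
    """Left-hand side of (4.3): (x + 1)^r / r - (x + 1) - c."""
    return (x + 1.0) ** r / r - (x + 1.0) - c


def H(r: float, c: float, tol: float = 1e-12) -> float:
    """Positive root of (4.3) found by bisection (assumes r > 1, c > 0)."""
    if r <= 1 or c <= 0:
        raise ValueError("requires r > 1 and c > 0")
    lo, hi = 0.0, 1.0
    # f(0) < 0 and f is increasing on x > 0, so doubling hi brackets the root.
    while f(hi, r, c) <= 0.0:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid, r, c) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def lower_bound(r: float, total: float) -> float:
    """Lower end of the region in (4.4): [r (1 + total)]^(1/r) - 1."""
    return (r * (1.0 + total)) ** (1.0 / r) - 1.0


def in_region(delta: float, r: float, c: float) -> bool:
    """Check delta > [r (1 + delta + c)]^(1/r) - 1, the form preceding (4.5)."""
    return delta > lower_bound(r, delta + c)


root = H(2, 1.0)  # for r = 2 and c = 1 the root of (4.3) is sqrt(3)
print(root, in_region(root + 0.1, 2, 1.0), in_region(root - 0.1, 2, 1.0))
```

Values of the pseudo-distance just above the root satisfy the inequality and values just below it do not, which is the sense in which the region derived from (4.4) and the threshold in (4.3) describe the same boundary.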

5. Concluding remarks

In this paper, we have investigated the consistency of the Bayes factor for nonnested linear models for the case in which the model dimension grows with the sample size. It has been shown that in some cases, the proposed Bayes factor is consistent whichever the true model is, and that in others, the consistency depends on the pseudo-distance between the larger model and the base model. Specifically, the pseudo-distance can be used to characterize the inconsistency region of the Bayes factor. By comparing the consistency issues between the proposed Bayes factor and the intrinsic Bayes factor, we observe that the asymptotic results presented here are similar to the ones for the intrinsic Bayes factor. It would be interesting to see the finite sample performance of the two Bayes factors, which is currently under investigation and will be reported elsewhere.

The consistency of the Bayes factor further indicates that besides the three commonly used families of hyper-$g$ priors in [14], the beta-prime prior is also a good candidate for the scaling factor $g$ in Zellner's $g$-prior. A similar observation has been made by Wang and Sun [22] when studying Bayes factor consistency for nested linear models with a growing number of parameters. From a theoretical point of view, we may conclude that [...] $g$-priors due to [14] under the three asymptotic scenarios. However, in most of the developments of the $g$-priors, the expression of the Bayes factor may not have an analytically tractable form, and some efficient approximations are required. Standard approximation techniques, such as the Laplace approximation, become quite challenging when the number of parameters grows with the sample size, because the error in approximations needs to be uniformly small over the class of all possible models. Such a situation has also been encountered by Berger, Ghosh and Mukhopadhyay [2] when studying the ANOVA models.
We plan to address these issues in our future work.

Finally, it is worth mentioning that we mainly address Bayes factor consistency based on a special choice of the hyperparameter $b$ in the beta-prime prior, which results in an explicit closed-form expression of the Bayes factor. In an ongoing project, we investigate the effects of $b$ on the consistency of the Bayes factor, especially for the case when $b$ does not actually depend on $n$.

Appendix

It is well known that the gamma function admits the asymptotic approximation given by Stirling's formula,
\[
\Gamma(\gamma_1 x + \gamma_2) \approx \sqrt{2\pi}\, e^{-\gamma_1 x} (\gamma_1 x)^{\gamma_1 x + \gamma_2 - 1/2}, \tag{A.1}
\]
when $x$ is sufficiently large. Here, "$f \approx g$" is used to indicate that the ratio of the two sides approaches one as $x$ tends to infinity, that is,
\[
\lim_{x \to \infty} \frac{\Gamma(\gamma_1 x + \gamma_2)}{\sqrt{2\pi}\, e^{-\gamma_1 x} (\gamma_1 x)^{\gamma_1 x + \gamma_2 - 1/2}} = 1.
\]

Proof of Theorem 2.

When the model dimension $j$ is bounded and the sample size $n$ is large, it follows directly from Stirling's formula that
\[
\Gamma\Bigl(\frac{n - j - 1}{2}\Bigr) \approx \sqrt{2\pi}\, e^{-n/2} \Bigl(\frac{n}{2}\Bigr)^{(n - j)/2 - 1}
\quad\text{and}\quad
\Gamma\Bigl(\frac{n - 1}{2}\Bigr) \approx \sqrt{2\pi}\, e^{-n/2} \Bigl(\frac{n}{2}\Bigr)^{n/2 - 1}.
\]
The Bayes factor in (2.7) is asymptotically equivalent to
\[
\begin{aligned}
\mathrm{BF}[M_j : M_i]
&\approx \frac{\sqrt{2\pi}\, e^{-n/2} (n/2)^{(n - j)/2 - 1}}{\sqrt{2\pi}\, e^{-n/2} (n/2)^{n/2 - 1}} \,(1 - R_j^2)^{-(n - j - 1)/2 + a + 1} \\
&\approx \Bigl(\frac{n}{2}\Bigr)^{-j/2} (1 - R_j^2)^{-n/2} \\
&\approx \exp\Bigl[-\frac{j}{2} \log n - \frac{n}{2} \log(1 - R_j^2)\Bigr].
\end{aligned}
\]

This completes the proof. □
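The two approximations used in the proof above can be checked numerically. The following Python sketch is an illustration only: the function names and the chosen values of $n$ and $j$ are assumptions made here, and `math.lgamma` from the standard library is used so that everything is computed on the log scale. It evaluates the log of the ratio in (A.1) with $\gamma_1 = 1/2$ and $\gamma_2 = -1/2$, and the error made when $\log[\Gamma((n-j-1)/2)/\Gamma((n-1)/2)]$ is replaced by $-(j/2)\log(n/2)$.

```python
import math


def log_stirling(x: float, g1: float, g2: float) -> float:
    """Logarithm of the right-hand side of (A.1)."""
    y = g1 * x
    return 0.5 * math.log(2.0 * math.pi) - y + (y + g2 - 0.5) * math.log(y)


def log_ratio(x: float, g1: float, g2: float) -> float:
    """Log of Gamma(g1 * x + g2) divided by its Stirling approximation."""
    return math.lgamma(g1 * x + g2) - log_stirling(x, g1, g2)


def log_gamma_quotient_error(n: int, j: int) -> float:
    """Error of log[Gamma((n-j-1)/2) / Gamma((n-1)/2)] against -(j/2) log(n/2)."""
    exact = math.lgamma((n - j - 1) / 2.0) - math.lgamma((n - 1) / 2.0)
    return exact - (-(j / 2.0) * math.log(n / 2.0))


# Both quantities should shrink toward zero as n grows (j = 3 held fixed).
for n in (10**2, 10**4, 10**6):
    print(n, log_ratio(n, 0.5, -0.5), log_gamma_quotient_error(n, 3))
```

A log ratio that tends to zero corresponds to a ratio that tends to one, which is the meaning given to "$\approx$" in (A.1).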

We now investigate the model selection consistency of the Bayes factor in (3.2) under the three different asymptotic scenarios mentioned above. For simplicity of notation, let $c_i$ represent a finite constant for $i = 1, 2, \ldots$. When $j/2 + a + 1$ and $(n - j - 1)/2$ are both sufficiently large, Stirling's formula gives
\[
\Gamma\Bigl(\frac{j}{2} + a + 1\Bigr) \approx \sqrt{2\pi}\, e^{-j/2} \Bigl(\frac{j}{2}\Bigr)^{j/2 + a + 1/2}
\quad\text{and}\quad
\Gamma\Bigl(\frac{n - j - 1}{2}\Bigr) \approx \sqrt{2\pi}\, e^{-(n - j)/2} \Bigl(\frac{n - j}{2}\Bigr)^{(n - j)/2 - 1}.
\]

Proof of Theorem 3.

Under Scenario 1, $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $0 \le a_1 \le a_2 < 1$. By using the two approximations above, it follows that
\[
\begin{aligned}
\mathrm{BF}[M_j : M_i]
&= \frac{\Gamma(j/2 + a + 1)\,\Gamma((n - j - 1)/2)}{\Gamma(i/2 + a + 1)\,\Gamma((n - i - 1)/2)} \, \frac{(1 - R_j^2)^{-(n - j - 1)/2 + a + 1}}{(1 - R_i^2)^{-(n - i - 1)/2 + a + 1}} \\
&= c_1\, \frac{j^{j/2 + a + 1/2} (n - j)^{(n - j)/2 - 1}}{i^{i/2 + a + 1/2} (n - i)^{(n - i)/2 - 1}} \, \frac{(1 - R_j^2)^{-(n - j)/2}}{(1 - R_i^2)^{-(n - i)/2}} \\
&= c_2\, \frac{(j/n)^{j/2}}{(i/n)^{i/2}} \Bigl(\frac{j}{i}\Bigr)^{a + 1/2} \Bigl(\frac{1 - j/n}{1 - i/n}\Bigr)^{-1} \Bigl[\frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}}\Bigr]^{n/2}.
\end{aligned}
\tag{A.2}
\]

(a) We first show the Bayes factor consistency when the true model is $M_i$. As $n$ tends to infinity, we observe that the dominant term in brackets of equation (A.2) can be approximated by
\[
\frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}} \approx \Bigl(\frac{1 - R_j^2}{1 - R_i^2}\Bigr)^{-1},
\]
because $j/n$ and $i/n$ approach zero as $n$ approaches infinity. From Lemma 1(a) and the fact that $\delta_{ii} = 0$, we observe that under $M_i$,
\[
\begin{aligned}
\mathrm{BF}[M_j : M_i]
&= c_2\, \frac{(j/n)^{j/2}}{(i/n)^{i/2}} \Bigl(\frac{j}{i}\Bigr)^{a + 1/2} \Bigl(\frac{1 - j/n}{1 - i/n}\Bigr)^{-1} \Bigl(\frac{1 + \delta_{ij}}{1 + \delta_{ii}}\Bigr)^{-n/2} \\
&= c_2\, \frac{(j/n)^{j/2}}{(i/n)^{i/2}} \Bigl(\frac{j}{i}\Bigr)^{a + 1/2} \Bigl(\frac{1 - j/n}{1 - i/n}\Bigr)^{-1} (1 + \delta_{ij})^{-n/2},
\end{aligned}
\]
which approaches zero when $\delta_{ij} > 0$, indicating that the Bayes factor in (3.2) is consistent when $M_i$ is true.

(b) The proof of consistency under model $M_j$ is provided as follows. By using Lemma 1(a), it follows that under model $M_j$, the Bayes factor in (3.2) can be further approximated by
\[
\begin{aligned}
\mathrm{BF}[M_j : M_i]
&= c_2\, \frac{(j/n)^{j/2}}{(i/n)^{i/2}} \Bigl(\frac{j}{i}\Bigr)^{a + 1/2} \Bigl(\frac{1 - j/n}{1 - i/n}\Bigr)^{-1} \Bigl(\frac{1 + \delta_{jj}}{1 + \delta_{ji}}\Bigr)^{-n/2} \\
&= c_2\, \frac{(j/n)^{j/2}}{(i/n)^{i/2}} \Bigl(\frac{j}{i}\Bigr)^{a + 1/2} \Bigl(\frac{1 - j/n}{1 - i/n}\Bigr)^{-1} (1 + \delta_{ji})^{n/2},
\end{aligned}
\]
because $\delta_{jj} = 0$. It should be noted that as $n$ tends to infinity, the fifth factor, $(1 + \delta_{ji})^{n/2}$, is the dominant term and approaches infinity if $\delta_{ji} > 0$. Therefore, the Bayes factor also approaches infinity when $\delta_{ji} > 0$, proving the consistency under $M_j$. This completes the proof of the theorem. □

Proof of Theorem 4.

Under Scenario 2, $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $0 \le a_1 < a_2 = 1$. By using the two approximations above, it follows that
\[
\begin{aligned}
\mathrm{BF}[M_j : M_i]
&= \frac{\Gamma(j/2 + a + 1)\,\Gamma((n - j - 1)/2)}{\Gamma(i/2 + a + 1)\,\Gamma((n - i - 1)/2)} \, \frac{(1 - R_j^2)^{-(n - j - 1)/2 + a + 1}}{(1 - R_i^2)^{-(n - i - 1)/2 + a + 1}} \\
&= c_3\, \frac{(j/i)^{a + 1/2}}{(i/n)^{i/2}} \Bigl(\frac{1 - j/n}{1 - i/n}\Bigr)^{-1} \\
&\quad \times \Bigl[\Bigl(\frac{j}{n}\Bigr)^{j/n} \frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}}\Bigr]^{n/2}.
\end{aligned}
\tag{A.3}
\]

(a) If the true model is $M_i$, from Lemma 1(b) and the fact that $\delta_{ii} = 0$, we observe that the dominant term in brackets of (A.3) can be approximated by
\[
\begin{aligned}
&\Bigl(\frac{j}{n}\Bigr)^{j/n} \frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}} \\
&\quad \approx \Bigl(\frac{1}{r}\Bigr)^{1/r} \Bigl(1 - \frac{1}{r}\Bigr)^{1 - 1/r} \Bigl(\frac{1 - R_j^2}{1 - R_i^2}\Bigr)^{-(1 - 1/r)} (1 - R_i^2)^{1/r} \\
&\quad \approx \Bigl(\frac{1}{r}\Bigr)^{1/r} \Bigl(\frac{1 - 1/r}{1 - 1/r + \delta_{ij}}\Bigr)^{1 - 1/r} \Bigl(\frac{1}{1 + \delta_{i}}\Bigr)^{1/r}.
\end{aligned}
\]
Accordingly, the approximation of the Bayes factor in (3.2) is given by
\[
\mathrm{BF}[M_j : M_i] \approx c_3\, \frac{(j/i)^{a + 1/2}}{(i/n)^{i/2}} \Bigl[\Bigl(\frac{1}{r}\Bigr)^{1/r} \Bigl(\frac{1 - 1/r}{1 - 1/r + \delta_{ij}}\Bigr)^{1 - 1/r} \Bigl(\frac{1}{1 + \delta_{i}}\Bigr)^{1/r}\Bigr]^{n/2},
\]
which approaches zero as $n$ tends to infinity, and therefore, the consistency under $M_i$ is proved.

(b) If the true model is $M_j$, from Lemma 1(b) and the fact that $\delta_{jj} = 0$, we observe that the dominant term in brackets of (A.3) can be approximated by
\[
\begin{aligned}
&\Bigl(\frac{j}{n}\Bigr)^{j/n} \frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}} \\
&\quad \approx \Bigl(\frac{1}{r}\Bigr)^{1/r} \Bigl(1 - \frac{1}{r}\Bigr)^{1 - 1/r} \Bigl(\frac{1 - R_j^2}{1 - R_i^2}\Bigr)^{-1} (1 - R_j^2)^{1/r} \\
&\quad \approx \Bigl(\frac{1}{r}\Bigr)^{1/r} \Bigl(1 - \frac{1}{r}\Bigr)^{1 - 1/r} \Bigl(\frac{1 - 1/r}{1 + \delta_{ji}}\Bigr)^{-1} \Bigl(\frac{1 - 1/r}{1 + \delta_{j}}\Bigr)^{1/r} \\
&\quad \approx \Bigl(\frac{1}{r}\Bigr)^{1/r} (1 + \delta_{ji}) \Bigl(\frac{1}{1 + \delta_{j}}\Bigr)^{1/r}.
\end{aligned}
\]
Therefore, the Bayes factor in (3.2) under $M_j$ turns out to be
\[
\mathrm{BF}[M_j : M_i] = c_3\, \frac{(j/i)^{a + 1/2}}{(i/n)^{i/2}} \Bigl[\Bigl(\frac{1}{r}\Bigr)^{1/r} (1 + \delta_{ji}) \Bigl(\frac{1}{1 + \delta_{j}}\Bigr)^{1/r}\Bigr]^{n/2}. \tag{A.4}
\]
To show the consistency under $M_j$, it is sufficient to show that the dominant term in brackets of (A.4) is strictly larger than one when $n$ tends to infinity. This is equivalent to
\[
\Bigl(\frac{1}{r}\Bigr)^{1/r} (1 + \delta_{ji}) \Bigl(\frac{1}{1 + \delta_{j}}\Bigr)^{1/r} > 1,
\]
which gives
\[
\delta_{ji} > [r(1 + \delta_{j})]^{1/r} - 1.
\]
On the other hand, we have $\delta_{ji} \le \delta_{j}$, which provides
\[
\delta_{j} \ge \delta_{ji} > [r(1 + \delta_{j})]^{1/r} - 1,
\]
indicating that
\[
\delta_{j} > r^{1/(r - 1)} - 1 \equiv \delta(r).
\]
In order for the interval where the distance $\delta_{ji}$ should lie,
\[
\delta_{ji} \in \bigl([r(1 + \delta_{j})]^{1/r} - 1,\ \delta_{j}\bigr],
\]
to be nonempty, a necessary and sufficient condition is that $\delta_{j} > \delta(r)$. This completes the proof. □

Proof of Theorem 5.

Under Scenario 3, $i = O(n^{a_1})$ and $j = O(n^{a_2})$ with $a_1 = a_2 = 1$. By using the two approximations above, it follows that
\[
\begin{aligned}
\mathrm{BF}[M_j : M_i]
&= \frac{\Gamma(j/2 + a + 1)\,\Gamma((n - j - 1)/2)}{\Gamma(i/2 + a + 1)\,\Gamma((n - i - 1)/2)} \, \frac{(1 - R_j^2)^{-(n - j - 1)/2 + a + 1}}{(1 - R_i^2)^{-(n - i - 1)/2 + a + 1}} \\
&= c_4 \Bigl(\frac{j}{i}\Bigr)^{a + 1/2} \Bigl(\frac{1 - j/n}{1 - i/n}\Bigr)^{-1} \\
&\quad \times \Bigl[\frac{(j/n)^{j/n}}{(i/n)^{i/n}} \, \frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}}\Bigr]^{n/2}.
\end{aligned}
\tag{A.5}
\]

(a) If the true model is $M_i$, from Lemma 1(c) and the fact that $\delta_{ii} = 0$, we observe that the dominant term in brackets of (A.5) can be approximated by
\[
\begin{aligned}
&\frac{(j/n)^{j/n}}{(i/n)^{i/n}} \, \frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}} \\
&\quad \approx \frac{(1/r)^{1/r}}{(1/s)^{1/s}} \, \frac{(1 - 1/r)^{1 - 1/r}}{(1 - 1/s)^{1 - 1/s}} \Bigl(\frac{1 - R_j^2}{1 - R_i^2}\Bigr)^{-(1 - 1/r)} (1 - R_i^2)^{1/r - 1/s} \\
&\quad \approx \frac{(1/r)^{1/r}}{(1/s)^{1/s}} \, \frac{(1 - 1/r)^{1 - 1/r}}{(1 - 1/s)^{1 - 1/s}} \Bigl(\frac{1 + \delta_{ij} - 1/r}{1 - 1/s}\Bigr)^{-(1 - 1/r)} \Bigl(\frac{1 - 1/s}{1 + \delta_{i}}\Bigr)^{1/r - 1/s} \\
&\quad \approx \frac{(1/r)^{1/r}}{(1/s)^{1/s}} \, \frac{[1 + \delta_{ij}/(1 - 1/r)]^{-(1 - 1/r)}}{(1 + \delta_{i})^{1/r - 1/s}}.
\end{aligned}
\tag{A.6}
\]
For the Bayes factor to be consistent, it is sufficient to show that the dominant term in (A.6) is strictly less than one as $n$ approaches infinity. This is equivalent to
\[
\Bigl(1 + \frac{\delta_{ij}}{1 - 1/r}\Bigr)^{1 - 1/r} > \frac{(1/r)^{1/r}}{(1/s)^{1/s}} (1 + \delta_{i})^{1/s - 1/r},
\]
which implies that
\[
\delta_{ij} > \frac{r - 1}{r} \Bigl\{\Bigl[\frac{s^{1/s}}{r^{1/r}} (1 + \delta_{i})^{1/s - 1/r}\Bigr]^{r/(r - 1)} - 1\Bigr\}.
\]
In addition, from the property of the pseudo-distance, we have $\delta_{i} \ge \delta_{ij}$. Therefore, it follows that
\[
\delta_{i} \ge \delta_{ij} > \frac{r - 1}{r} \Bigl\{\Bigl[\frac{s^{1/s}}{r^{1/r}} (1 + \delta_{i})^{1/s - 1/r}\Bigr]^{r/(r - 1)} - 1\Bigr\},
\]
indicating that the value of $\delta_{i}$ must satisfy
\[
\Bigl(1 + \frac{\delta_{i}}{1 - 1/r}\Bigr)^{1 - 1/r} > \frac{(1/r)^{1/r}}{(1/s)^{1/s}} (1 + \delta_{i})^{1/s - 1/r}.
\]
Under the conditions stated in the theorem, we take limits and obtain that the Bayes factor tends to zero, and thus, the Bayes factor is consistent under $M_i$.

(b) If the true model is $M_j$, from Lemma 1(c) and the fact that $\delta_{jj} = 0$, we observe that the dominant term in brackets of (A.5) can be approximated by
\[
\begin{aligned}
&\frac{(j/n)^{j/n}}{(i/n)^{i/n}} \, \frac{(1 - j/n)^{1 - j/n}}{(1 - i/n)^{1 - i/n}} \, \frac{(1 - R_j^2)^{-(1 - j/n)}}{(1 - R_i^2)^{-(1 - i/n)}} \\
&\quad \approx \frac{(1/r)^{1/r}}{(1/s)^{1/s}} \, \frac{(1 - 1/r)^{1 - 1/r}}{(1 - 1/s)^{1 - 1/s}} \Bigl(\frac{1 - R_j^2}{1 - R_i^2}\Bigr)^{-(1 - 1/s)} (1 - R_j^2)^{1/r - 1/s} \\
&\quad \approx \frac{(1/r)^{1/r}}{(1/s)^{1/s}} \, \frac{(1 - 1/r)^{1 - 1/r}}{(1 - 1/s)^{1 - 1/s}} \Bigl(\frac{1 - 1/r}{1 + \delta_{ji} - 1/s}\Bigr)^{-(1 - 1/s)} \Bigl(\frac{1 - 1/r}{1 + \delta_{j}}\Bigr)^{1/r - 1/s} \\
&\quad \approx \frac{(1/r)^{1/r}}{(1/s)^{1/s}} \, \frac{[1 + \delta_{ji}/(1 - 1/s)]^{1 - 1/s}}{(1 + \delta_{j})^{1/r - 1/s}}.
\end{aligned}
\tag{A.7}
\]
For the Bayes factor to be consistent, it is sufficient to show that the dominant term in (A.7) is strictly larger than one as $n$ approaches infinity. This is equivalent to
\[
\frac{(1/r)^{1/r}}{(1/s)^{1/s}} \, \frac{[1 + \delta_{ji}/(1 - 1/s)]^{1 - 1/s}}{(1 + \delta_{j})^{1/r - 1/s}} > 1.
\]
Simple algebra shows that
\[
\delta_{ji} > \frac{s - 1}{s} \Bigl\{\Bigl[\frac{r^{1/r}}{s^{1/s}} (1 + \delta_{j})^{1/r - 1/s}\Bigr]^{s/(s - 1)} - 1\Bigr\}.
\]
On the other hand, we also have $\delta_{j} \ge \delta_{ji}$, which provides that
\[
\delta_{j} \ge \delta_{ji} > \frac{s - 1}{s} \Bigl\{\Bigl[\frac{r^{1/r}}{s^{1/s}} (1 + \delta_{j})^{1/r - 1/s}\Bigr]^{s/(s - 1)} - 1\Bigr\}, \tag{A.8}
\]
indicating that
\[
\Bigl(1 + \frac{\delta_{j}}{1 - 1/s}\Bigr)^{1 - 1/s} > \frac{r^{1/r}}{s^{1/s}} (1 + \delta_{j})^{1/r - 1/s}.
\]
In order for the interval where the distance $\delta_{ji}$ should lie,
\[
\delta_{ji} \in \Bigl(\frac{s - 1}{s} \Bigl\{\Bigl[\frac{r^{1/r}}{s^{1/s}} (1 + \delta_{j})^{1/r - 1/s}\Bigr]^{s/(s - 1)} - 1\Bigr\},\ \delta_{j}\Bigr],
\]
to be nonempty, it is necessary and sufficient that $\delta_{j}$ satisfies the inequality in (A.8). This completes the proof. □

Acknowledgements

The authors thank the Editor and two referees for their helpful comments, which have led to an improvement of the manuscript.

References

[1] Bayarri, M.J., Berger, J.O., Forte, A. and García-Donato, G. (2012). Criteria for Bayesian model choice with application to variable selection. Ann. Statist.
[2] Berger, J.O., Ghosh, J.K. and Mukhopadhyay, N. (2003). Approximations and consistency of Bayes factors as model dimension grows. J. Statist. Plann. Inference.
[3] Casella, G., Girón, F.J., Martínez, M.L. and Moreno, E. (2009). Consistency of Bayesian procedures for variable selection. Ann. Statist.
[4] Cox, D.R. (1962). Further results on tests of separate families of hypotheses. J. Roy. Statist. Soc. Ser. B.
[5] Fernández, C., Ley, E. and Steel, M.F.J. (2001). Benchmark priors for Bayesian model averaging. J. Econometrics.
[6] Fujikoshi, Y. (1993). Two-way ANOVA models with unbalanced data. Discrete Math.
[7] Girón, F.J., Martínez, M.L., Moreno, E. and Torres, F. (2006). Objective testing procedures in linear models: Calibration of the p-values. Scand. J. Stat.
[8] Girón, F.J., Moreno, E., Casella, G. and Martínez, M.L. (2010). Consistency of objective Bayes factors for nonnested linear models and increasing model dimension. Rev. R. Acad. Cienc. Exactas Fís. Nat. Ser. A Math. RACSAM.
[9] Gustafson, P., Hossain, S. and MacNab, Y.C. (2006). Conservative prior distributions for variance parameters in hierarchical models. Canad. J. Statist.
[10] Hoel, P.G. (1947). On the choice of forecasting formulas. J. Amer. Statist. Assoc.
[11] Kass, R.E. and Raftery, A.E. (1995). Bayes factors. J. Amer. Statist. Assoc.
[12] Kass, R.E. and Vaidyanathan, S.K. (1992). Approximate Bayes factors and orthogonal parameters, with application to testing equality of two binomial proportions. J. Roy. Statist. Soc. Ser. B.
[13] Ley, E. and Steel, M.F.J. (2012). Mixtures of g-priors for Bayesian model averaging with economic applications. J. Econometrics.
[14] Liang, F., Paulo, R., Molina, G., Clyde, M.A. and Berger, J.O. (2008). Mixtures of g priors for Bayesian variable selection. J. Amer. Statist. Assoc.
[15] Maruyama, Y. (2013). A Bayes factor with reasonable model selection consistency for ANOVA model. Available at arXiv:0906.4329v2 [stat.ME].
[16] Maruyama, Y. and George, E.I. (2011). Fully Bayes factors with a generalized g-prior. Ann. Statist.
[17] Moreno, E. and Girón, F.J. (2008). Comparison of Bayesian objective procedures for variable selection in linear regression. TEST.
[18] Moreno, E., Girón, F.J. and Casella, G. (2010). Consistency of objective Bayes factors as the model dimension grows. Ann. Statist.
[19] Moreno, E., Girón, F.J. and Casella, G. (2014). Posterior model consistency in variable selection as the model dimension grows. Preprint.
[20] Pesaran, M.H. and Weeks, M. (1999). Non-nested hypothesis testing: An overview. Cambridge Working Papers in Economics 9918, Faculty of Economics, Univ. of Cambridge.
[21] Wang, M. and Sun, X. (2013). Bayes factor consistency for unbalanced ANOVA models. Statistics.
[22] Wang, M. and Sun, X. (2014). Bayes factor consistency for nested linear models with a growing number of parameters. J. Statist. Plann. Inference 147 95–105.
[23] Wang, M., Sun, X. and Lu, T. (2015). Bayesian structured variable selection in linear regression models. Comput. Statist.
[24] Watnik, M., Johnson, W. and Bedrick, E.J. (2001). Nonnested linear model selection revisited. Comm. Statist. Theory Methods.
[25] Watnik, M.R. and Johnson, W.O. (2002). The behaviour of linear model selection tests under globally non-nested hypotheses. Sankhyā Ser. A.
[26] Wetzels, R., Grasman, R.P.P.P. and Wagenmakers, E.-J. (2012). A default Bayesian hypothesis test for ANOVA designs. Amer. Statist.
[27] Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques. Stud. Bayesian Econometrics Statist. 233–243. Amsterdam: North-Holland. MR0881437