Empirical Likelihood Ratio Test on quantiles under a Density Ratio Model

Abstract

Population quantiles are important parameters in many applications. Enthusiasm for the development of effective statistical inference procedures for quantiles and their functions has been high for the past decade. In this article, we study inference methods for quantiles when multiple samples from linked populations are available. The research problems we consider have a wide range of applications. For example, to study the evolution of the economic status of a country, economists monitor changes in the quantiles of annual household incomes, based on multiple survey datasets collected annually. Even with multiple samples, a routine approach would estimate the quantiles of different populations separately. Such approaches ignore the fact that these populations are linked and share some intrinsic latent structure. Recently, many researchers have advocated the use of the density ratio model (DRM) to account for this latent structure and have developed more efficient procedures based on pooled data. The nonparametric empirical likelihood (EL) is subsequently employed. Interestingly, there has been no discussion in this context of the EL-based likelihood ratio test (ELRT) for population quantiles. We explore the use of the ELRT for hypotheses concerning quantiles and confidence regions under the DRM. We show that the ELRT statistic has a chi-square limiting distribution under the null hypothesis. Simulation experiments show that the chi-square distributions approximate the finite-sample distributions well and lead to accurate tests and confidence regions. The DRM helps to improve statistical efficiency. We also give a real-data example to illustrate the efficiency of the proposed method.


EMPIRICAL LIKELIHOOD RATIO TEST ON QUANTILES UNDER A DENSITY RATIO MODEL

By Archer (Gong) Zhang, Jiahua Chen

University of British Columbia


Keywords and phrases: Multiple samples, quantile estimation, density ratio model, empirical likelihood, likelihood ratio test, confidence region.

arXiv preprint, math.ST.

1. Introduction.

Suppose we have $m + 1$ independent random samples from population distributions $G_0, G_1, \ldots, G_m$. Let their respective density functions with respect to some $\sigma$-finite measure be $g_k(\cdot)$. If there exist a vector-valued function $q(x)$ and unknown vector-valued parameters $\theta_k$ such that

$$ g_k(x) = \exp\{\theta_k^\top q(x)\}\, g_0(x), \tag{1} $$

then they define a density ratio model (DRM) as introduced by Anderson (1979). By convention, we call $G_0$ the base distribution and $q(x)$ the basis function. There is a symmetry in the DRM: any one of $G_0, \ldots, G_m$ may serve as the base distribution. We require the first element of $q(x)$ to be 1 so that the corresponding coefficient is a normalization constant, and the elements of $q(x)$ must be linearly independent. The linear independence is a natural assumption: otherwise, some elements of $q(x)$ are redundant.

When data are collected from a DRM, the whole data set can be utilized to estimate $G_0$, which will lead to efficiency gain. The nonparametric $G_0$ assumption in the DRM is unrestrictive. Combined with a moderate-sized $q(x)$, a single DRM contains a broad range of parametric distribution families. Thus, the DRM has a low risk of model misspecification. There is a growing interest in the DRM in statistics (Qin, 1998; Fokianos et al., 2001; De Oliveira and Kedem, 2017; Zhuang et al., 2019) as well as in the machine learning community (Sugiyama et al., 2012).

In this paper, we study the inference problem for population quantiles under the DRM. Population quantiles and their functions are important parameters in many applications. For example, government agents gauge the overall economic status of a country based on annual surveys of household income distribution. The trend in the quantiles of the income distribution is indicative (Berger and Skinner, 2003; Muller, 2008). In forestry, the lower quantiles of the mechanical strength of wood products are vital design values (Verrill et al., 2015). Other examples include Chen and Hall (1993); Yang and He (2012); Chen and Liu (2013); Chen et al. (2016); Koenker et al. (2017); Gonçalves et al. (2020); Chen and Liu (2019).

The data from DRMs are a special type of biased sample (Vardi, 1982, 1985; Qin, 1998, 2017). The empirical likelihood (EL) of Owen (2001) is an ideal platform for statistical inference under the DRM. The EL retains the effectiveness of likelihood methods and does not impose a restrictive parametric assumption. The ELRT statistic has a neat chi-square limiting distribution, much like the parametric likelihood ratio test given independent and identically distributed (i.i.d.) observations (Owen, 1988; Qin and Lawless, 1994). The EL has already been widely used for data analysis under the DRM (Qin, 1993; Qin and Zhang, 1997; Chen and Liu, 2013; Cai et al., 2017). However, there has been limited discussion of the ELRT in the biased sampling context. Both Qin (1993) and Cai et al. (2017) permit no additional equations. Although the classical Wald method remains effective for both hypothesis tests and confidence regions (Qin, 1998; Chen and Liu, 2013; Chen et al., 2016), it must be aided by a consistent and stable variance estimate. In addition, its confidence regions are oval-shaped regardless of the shape of the data cloud. Thus, an ELRT has the potential to push the boundary of the DRM much further.

This paper establishes the limiting chi-square distribution of the ELRT for quantiles under the DRM. We prove that the ELRT statistic has a chi-square limiting distribution under certain conditions. The resulting confidence regions have data-driven shapes, more accurate coverage probabilities, and smaller volumes. In Section 2, we state the problem of interest and the proposed ELRT under the DRM. In Section 3, we study the limiting distribution of the ELRT statistic and some other useful asymptotic results. We illustrate the superiority of the ELRT and the associated confidence regions through simulated data in Section 4 and for real-world data in Section 5. Technical details and the proofs of the main theorems are given in Appendices A and B.
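To make the claim that a single DRM contains whole parametric families concrete, the following minimal sketch checks model (1) for two normal populations with basis function $q(x) = (1, x, x^2)^\top$. The function names and the parameter values $N(0, 1)$ and $N(2, 1.5^2)$ are illustrative assumptions and are not taken from the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def drm_theta_normal(mu0, sigma0, muk, sigmak):
    """Coefficients theta_k of q(x) = (1, x, x^2) when the base density is
    N(mu0, sigma0^2) and the k-th density is N(muk, sigmak^2).
    Obtained by expanding log g_k(x) - log g_0(x) in powers of x."""
    const = (math.log(sigma0 / sigmak)
             + mu0 ** 2 / (2.0 * sigma0 ** 2)
             - muk ** 2 / (2.0 * sigmak ** 2))
    lin = muk / sigmak ** 2 - mu0 / sigma0 ** 2
    quad = 1.0 / (2.0 * sigma0 ** 2) - 1.0 / (2.0 * sigmak ** 2)
    return (const, lin, quad)


def density_ratio(x, theta):
    """exp{theta^T q(x)} with q(x) = (1, x, x^2)."""
    return math.exp(theta[0] + theta[1] * x + theta[2] * x * x)


# Illustrative (hypothetical) populations: base N(0, 1), second N(2, 1.5^2).
theta = drm_theta_normal(0.0, 1.0, 2.0, 1.5)
for x in (-1.0, 0.0, 0.7, 3.0):
    lhs = normal_pdf(x, 2.0, 1.5)
    rhs = density_ratio(x, theta) * normal_pdf(x, 0.0, 1.0)
    print(f"x = {x:5.2f}   g_k(x) = {lhs:.6f}   exp(theta^T q) g_0(x) = {rhs:.6f}")
```

The two printed columns agree at every point, and the first coefficient plays the role of the normalization constant, as required of the first element of $q(x)$.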

2. Research problem and proposed approach.

Let $\{x_{kj} : 1 \le j \le n_k,\ 0 \le k \le m\}$ be $m + 1$ independent i.i.d. samples from a DRM defined by (1). Denote by $\xi_k$ the $\tau_k$ quantile of the $k$th population for some $\tau_k \in (0, 1)$ and $k = 0, 1, \ldots, m$. Let $\xi = \{\xi_k : k \in I\}$ be the quantiles at some levels of populations in an index set $I \subseteq \{0, 1, \ldots, m\}$ of size $l$. We study the ELRT under the DRM for the following hypothesis:

$$ H_0 : \xi = \xi^* \quad \text{against} \quad H_1 : \xi \neq \xi^*, \tag{2} $$

for some given $\xi^*$ of dimension $l$.

The hypothesis formulated in (2) has many applications. In socio-economic studies, when studying the distributions of household disposable incomes, economists and social scientists often divide the collected survey data into five groups. These groups are famously known as quintile groups. The first group consists of the lowest 20% of the data, the second group consists of the next 20%, and so on. Many studies have shown that the quintiles are important for explaining the economy and consumer behaviour (Castelló and Doménech, 2002; Wunder, 2012; Humphries et al., 2014; Corak, 2019). In statistics, the cut-off points of these quintile groups are the quantiles of the populations: for example, the 20% quantile separates the first and second quintile groups. Governments may, therefore, consider this 20% quantile as key for determining which families should receive a special subsidy to help society's less fortunate. Moreover, when new policies are implemented, the evolution of the quantiles of household income over time may reflect the impact of the policies. As a consequence, these quantiles are of particular interest to social scientists and politicians as a way to measure the effects of policy changes. In statistical inference, these types of tasks can most appropriately be carried out using a hypothesis testing procedure, which can be naturally extended to the construction of confidence regions. Hence, the research problem we study here is of scientific significance in many applications. In the real-data analysis, we study confidence regions for quantiles of household incomes based on US Consumer Expenditure Surveys.

We use an ELRT to test the hypothesis in (2). Let $p_{kj} = dG_0(x_{kj}) = P(X = x_{kj}; G_0)$ for all applicable $k, j$. The EL function under the DRM is given by

$$ L_n(G_0, \ldots, G_m) = \prod_{k,j} dG_k(x_{kj}) = \Big\{ \prod_{k,j} p_{kj} \Big\} \Big\{ \prod_{k,j} \exp\big( \theta_k^\top q(x_{kj}) \big) \Big\}. \tag{3} $$

For notational convenience, we have dropped the ranges of the indices in the expressions. Observe that the EL in (3) is 0 if $G_0$ is a continuous distribution. Surprisingly, this seemingly devastating property does little harm to the usefulness of the EL. Since the EL in (3) can also be regarded as a function of the parameters $\theta := \{\theta_r : 1 \le r \le m\}$ and the base distribution $G_0$, we may write its logarithm as

$$ \ell_n(\theta, G_0) = \log L_n(G_0, \ldots, G_m) = \sum_{k,j} \log p_{kj} + \sum_{k,j} \theta_k^\top q(x_{kj}), $$

where we define $\theta_0 = 0$ by convention.

Let $E_r$ be the expectation operation under $G_r$, and let

$$ h_r(x, \theta) = \exp\big( \theta_r^\top q(x) \big) $$

be the density of $G_r$ with respect to $G_0$ for $r = 0, 1, \ldots, m$. Clearly, $h_0(x, \theta) = 1$. This also implies that

$$ E_0[h_r(X, \theta)] = E_0\big[ \exp\big( \theta_r^\top q(X) \big) \big] = 1. \tag{4} $$

The $\tau_r$ population quantile $\xi_r$ of $G_r$ satisfies or is defined to be a solution of

$$ E_r\big[ \mathbb{1}(X \le \xi_r) - \tau_r \big] = E_0\big[ h_r(X, \theta) \{ \mathbb{1}(X \le \xi_r) - \tau_r \} \big] = 0. \tag{5} $$

Let

$$ \varphi_r(x, \theta, \xi) = h_r(x, \theta) [ \mathbb{1}(x \le \xi_r) - \tau_r ]. $$

Following Owen (2001) and Qin and Lawless (1994), we introduce the profile log-EL of the population quantiles $\xi$:

$$ \tilde\ell_n(\xi) = \sup_{\theta, G_0} \Big\{ \ell_n(\theta, G_0) \ \Big|\ \sum_{k,j} p_{kj} h_r(x_{kj}, \theta) = 1,\ r = 0, 1, \ldots, m;\ \ \sum_{k,j} p_{kj} \varphi_r(x_{kj}, \theta, \xi) = 0,\ r \in I \Big\} \tag{6} $$

and

$$ \sup_{\theta, G_0} \{ \ell_n(\theta, G_0) \} = \sup_{\theta, G_0} \Big\{ \ell_n(\theta, G_0) \ \Big|\ \sum_{k,j} p_{kj} h_r(x_{kj}, \theta) = 1,\ r = 0, 1, \ldots, m \Big\}. $$

An ELRT statistic for the hypothesis in (2) is defined as

$$ R_n = 2 \Big[ \sup_{\theta, G_0} \{ \ell_n(\theta, G_0) \} - \tilde\ell_n(\xi^*) \Big]. $$

We call $R_n$ the ELRT statistic hereafter. Clearly, the larger the value of $R_n$, the stronger the evidence for departure from the null hypothesis in the direction of the alternative hypothesis. We reject $H_0$ when $R_n$ exceeds some critical value that is decided based on the distributional information of $R_n$ under $H_0$. The limiting distribution of $R_n$ and other related properties are given in the next section.

We observe that the approach needs no change for a set of quantiles from the same population. For notational simplicity, the presentation is given for quantiles from different populations.
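As a sanity check on the definition of $R_n$, consider the degenerate case of a single sample ($m = 0$), which is an assumption made here for illustration and is not the multi-sample setting of the paper. The DRM constraints then reduce to $\sum_j p_j = 1$, and the quantile constraint forces total mass $\tau$ on the $k$ observations at or below $\xi^*$, so that $R_n = 2[k \log\{k/(n\tau)\} + (n - k) \log\{(n - k)/(n(1 - \tau))\}]$. The function name below is hypothetical.

```python
import math


def el_ratio_quantile(sample, xi, tau):
    """ELRT statistic for H0: the tau-quantile equals xi, from one sample.

    Single-sample special case (m = 0): the constrained maximum puts mass
    tau / k on each of the k points <= xi and (1 - tau) / (n - k) on the rest,
    while the unconstrained maximum puts mass 1 / n on every point.
    """
    n = len(sample)
    k = sum(1 for x in sample if x <= xi)  # observations at or below xi
    if k == 0 or k == n:
        # No probability vector on the data can satisfy the quantile constraint.
        return math.inf
    return 2.0 * (k * math.log(k / (n * tau))
                  + (n - k) * math.log((n - k) / (n * (1.0 - tau))))


data = [float(i) for i in range(1, 11)]          # toy data 1, 2, ..., 10
print(el_ratio_quantile(data, xi=5.5, tau=0.5))  # hypothesised median splits the data evenly
print(el_ratio_quantile(data, xi=2.5, tau=0.5))  # hypothesised median far from the centre
```

The statistic is zero when the hypothesised quantile matches the empirical split and grows as the split becomes more lopsided, which is the behaviour described for $R_n$ above.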

3. Asymptotic properties of $R_n$ and other quantities.

The distributional information of $R_n$ is vital to the implementation of the ELRT in applications. In this section, we show that it is asymptotically chi-square distributed. We also present some secondary but useful asymptotic results.

3.1. A dual function.

The profile log-EL function $\tilde\ell_n(\xi^*)$ is defined to be the solution of an optimization problem that can be solved by the Lagrange multiplier method. Let $t = (t_0, \ldots, t_m)$ and $\lambda = \{\lambda_r : r \in I\}$ be Lagrange multipliers. Define a Lagrangian as

$$ \mathcal{L}(t, \lambda, \theta, G_0) = \ell_n(\theta, G_0) + \sum_{r=0}^{m} t_r \Big\{ 1 - \sum_{k,j} p_{kj} h_r(x_{kj}, \theta) \Big\} - \sum_{r \in I} n \lambda_r \Big\{ \sum_{k,j} p_{kj} \varphi_r(x_{kj}, \theta, \xi^*) \Big\}. $$

In Appendix B, we will show that under mild conditions that are easy to verify, there always exists some $\theta$ such that a solution in $G_0$ to (4) and (5) exists. With this guarantee, according to the Karush–Kuhn–Tucker theorem (Boyd and Vandenberghe, 2004), the solution to the constrained optimization problem in (6) satisfies

$$ \frac{\partial \mathcal{L}(t, \lambda, \theta, G_0)}{\partial (t, \lambda, \theta, p_{kj})} = 0. $$

Let $(\hat t, \hat\lambda, \hat\theta, \hat p_{kj})$ be the solution. Some simple algebra gives $\hat t_r = n_r$ and

$$ \hat p_{kj} = n^{-1} \Big\{ \sum_{r=0}^{m} \rho_r h_r(x_{kj}, \hat\theta) + \sum_{r \in I} \hat\lambda_r \varphi_r(x_{kj}, \hat\theta, \xi^*) \Big\}^{-1}, $$

where $\rho_r = n_r / n$.

We now introduce another set of notation:

$$ \bar h(x, \theta) = \sum_{r=0}^{m} \rho_r h_r(x, \theta), \qquad \psi_r(x, \theta) = \varphi_r(x, \theta, \xi^*) / \bar h(x, \theta), \qquad \psi(x, \theta) = \{ \psi_r(x, \theta) : r \in I \}, $$

and let $\mathbf{h}(x, \theta)$ be the vector whose entries are $\rho_r h_r(x, \theta) / \bar h(x, \theta)$, the last entry being $\rho_m h_m(x, \theta) / \bar h(x, \theta)$. To aid our memory, we note that $\bar h(x, \theta)$ is a mixture density with mixing proportions $\rho_0, \ldots, \rho_m$; $\mathbf{h}(x, \theta)$ is a vector of density functions with respect to the mixture $\bar h(x, \theta)$ combined with the mixing proportions; and $\psi(x, \theta)$ is a vector of normalized $\varphi_r(x, \theta, \xi^*)$. With the help of this notation, we define a dual function

$$ D(\lambda, \theta) = \sum_{k,j} \theta_k^\top q(x_{kj}) - \sum_{k,j} \log \bar h(x_{kj}, \theta) - \sum_{k,j} \log \Big\{ 1 + \sum_{r \in I} \lambda_r \psi_r(x_{kj}, \theta) \Big\}. \tag{7} $$

The dual function has some easily verified mathematical properties. We can show that

$$ \tilde\ell_n(\xi^*) = D(\hat\lambda, \hat\theta) - n \log n, \tag{8} $$

and that $(\hat\lambda, \hat\theta)$ is a saddle point of $D(\lambda, \theta)$ satisfying

$$ \frac{\partial D(\lambda, \theta)}{\partial (\lambda, \theta)} = 0. \tag{9} $$

Indeed, (8) follows by writing $\hat p_{kj} = n^{-1} \{ \bar h(x_{kj}, \hat\theta) \}^{-1} \{ 1 + \sum_{r \in I} \hat\lambda_r \psi_r(x_{kj}, \hat\theta) \}^{-1}$ and substituting it into $\ell_n(\theta, G_0)$. In the following section, we study some of the properties of $\tilde\ell_n(\xi^*)$ through the dual function $D(\lambda, \theta)$.

3.2. Asymptotic properties.

We discuss the asymptotic properties under the following nonrestrictive conditions on the sampling plan and the DRM.

Conditions:
(i) The sample proportions $\rho_k = n_k / n$ have limits in $(0, 1)$ as $n \to \infty$;
(ii) The matrix $E_0[q(X) q^\top(X)]$ is positive definite;
(iii) For each $k = 0, 1, \ldots, m$ and $\theta_k$ in a neighbourhood of the true parameter value $\theta_k^*$, we have $E_0[\exp(\theta_k^\top q(X))] = E_0[h_k(X, \theta)] < \infty$.

Here are some implications of the above conditions.
1. Under Condition (iii), the moment generating function of $q(X)$ with respect to $G_k$ exists in a neighbourhood of $0$. Hence, all finite-order moments of $\|q(X)\|$ are finite.
2. When $n$ is large enough and $(\lambda, \theta)$ is in a small neighbourhood of $(0, \theta^*)$, the derivatives of the dual function $D(\lambda, \theta)$ are all bounded by some polynomials of $\|q(x)\|$. Hence, they are all integrable.
3. Under Condition (ii), the sample version of $E_0[q(X) q^\top(X)]$ is also positive definite when $n$ is very large.

We now state the main results; the proofs are given in Appendix A.

Lemma 3.1. Under Conditions (i) to (iii), as $n \to \infty$,

$$ n^{-1} \frac{\partial^2 D(\lambda, \theta)}{\partial (\lambda, \theta)\, \partial (\lambda, \theta)^\top} \Big|_{\lambda = 0,\ \theta = \theta^*} \to S $$

almost surely for some full-rank square matrix $S$ of dimension $(dm + l)$.

Unlike that of a usual log-likelihood function, the second derivative of the dual function $D(\lambda, \theta)$ is not negative definite. This is understandable because $\lambda$ is not a model parameter. However, it has full rank and plays an important role in localizing $\hat\theta$.

The next result implies that the dual function $D(\lambda, \theta)$ resembles the log-likelihood function under regularity conditions in an important way: its first derivative is an unbiased estimating function.

Lemma 3.2. Under Conditions (i) to (iii), we have

$$ E\Big[ \frac{\partial D(\lambda, \theta)}{\partial (\lambda, \theta)} \Big] \Big|_{\lambda = 0,\ \theta = \theta^*} = 0, $$

where the expectation is calculated by regarding $x_{kj}$ as a random variable with distribution $G_k$. Furthermore, as $n \to \infty$, we have

$$ n^{-1/2} \frac{\partial D(\lambda, \theta)}{\partial (\lambda, \theta)} \Big|_{\lambda = 0,\ \theta = \theta^*} \xrightarrow{d} N(0, V), $$

where $V$ is a square matrix of dimension $(dm + l)$.

A key step in the asymptotic study of $\hat\theta$ and the ELRT statistic $R_n$ is localization. That is, $\hat\theta$ is in a small neighbourhood of the true value $\theta^*$ as the sample size $n$ goes to infinity. The following lemma asserts that $\hat\theta$ is almost surely located in the $O(n^{-1/3})$-neighbourhood of $\theta^*$.

Lemma 3.3. Under Conditions (i) to (iii), as $n \to \infty$, the saddle point $(\hat\lambda, \hat\theta)$ of the dual function $D(\lambda, \theta)$ is in the $n^{-1/3}$-neighbourhood of $(0, \theta^*)$ with probability 1. In addition, $\sqrt{n} (\hat\lambda, \hat\theta - \theta^*)$ is asymptotically multivariate normal.

The results in the previous lemma shed light on the asymptotic properties of the EL under the DRM. At the same time, they pave the way for the following celebrated conclusion in the EL literature.

Theorem 3.4. Under Conditions (i) to (iii) and the null hypothesis (2), as $n \to \infty$, the ELRT statistic

$$ R_n = 2 \Big[ \sup_{\theta, G_0} \{ \ell_n(\theta, G_0) \} - \tilde\ell_n(\xi^*) \Big] \xrightarrow{d} \chi^2_l. $$

Theorem 3.4 enables us to determine an approximate rejection region for the ELRT. We reject the null hypothesis at the significance level $\alpha$ when the observed value of $R_n$ is larger than the upper $\alpha$ quantile of the chi-square distribution $\chi^2_l$. This also provides a foundation for the construction of confidence regions of $\xi$. Let

$$ R_n(\xi) = 2 \Big[ \sup_{\theta, G_0} \{ \ell_n(\theta, G_0) \} - \tilde\ell_n(\xi) \Big]. $$

An ELRT-based $(1 - \alpha)$ approximate confidence region for $\xi$ is

$$ \{ \xi : R_n(\xi) \le \chi^2_l(1 - \alpha) \}, \tag{10} $$

where $\chi^2_l(1 - \alpha)$ is the $(1 - \alpha)$ quantile of $\chi^2_l$.
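The rejection rule and the region (10) only require the quantile $\chi^2_l(1 - \alpha)$. The sketch below computes it with the Python standard library for the two cases that arise in this paper's examples, $l = 1$ and $l = 2$, using the facts that $\chi^2_1$ is the square of a standard normal variable and $\chi^2_2$ is exponential with mean 2. The function names are hypothetical, and a general-purpose routine would be needed for other $l$.

```python
import math
from statistics import NormalDist


def chi2_quantile(p, df):
    """p-quantile of the chi-square distribution, for df = 1 or df = 2 only."""
    if df == 1:
        # P(Z^2 <= c) = p  is equivalent to  P(|Z| <= sqrt(c)) = p.
        z = NormalDist().inv_cdf((1.0 + p) / 2.0)
        return z * z
    if df == 2:
        # chi-square with 2 degrees of freedom has CDF 1 - exp(-x / 2).
        return -2.0 * math.log(1.0 - p)
    raise ValueError("this sketch only covers df = 1 and df = 2")


def reject_null(r_n, df, alpha=0.05):
    """ELRT decision: reject H0 when R_n exceeds the upper-alpha quantile."""
    return r_n > chi2_quantile(1.0 - alpha, df)


def in_confidence_region(r_n_at_xi, df, alpha=0.05):
    """Membership test for region (10), given the value R_n(xi)."""
    return r_n_at_xi <= chi2_quantile(1.0 - alpha, df)


print(round(chi2_quantile(0.95, 1), 3))  # critical value for one quantile
print(round(chi2_quantile(0.95, 2), 3))  # critical value for a pair of quantiles
```

A candidate $\xi$ belongs to the 95% region for a pair of quantiles exactly when `in_confidence_region(R_n(xi), df=2)` is true, so the region can be traced by evaluating $R_n(\xi)$ on a grid.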

4. Simulation studies.

In this section, we report some simulation results. We conclude that the chi-square approximation to the sample distribution of $R_n$ is very accurate. The corresponding confidence regions have a data-driven shape and accurate coverage probabilities. In almost all cases considered, the $R_n$-based confidence regions outperform those based on the Wald method in terms of the average areas and coverage probabilities. The DRM markedly improves the statistical efficiency, and the details are as follows.

4.1. Numerical implementation and methods included.

Recall that the ELRT statistic $R_n$ is defined to be

$$ R_n = 2 \Big[ \sup_{\theta, G_0} \{ \ell_n(\theta, G_0) \} - \tilde\ell_n(\xi^*) \Big]. $$

In data analysis, we must solve the optimization problem $\sup_{\theta, G_0} \{ \ell_n(\theta, G_0) \}$. As Cai et al. (2017) suggest, it can be transformed into an optimization problem of a convex function, and it has a simple solution. We further turn this optimization problem into the problem of solving a system of equations that are formed by equating the derivatives of the induced convex function to 0. The numerical implementation can be efficiently carried out by a root solver in the R (R Core Team, 2018) package nleqslv (Hasselman, 2018) for nonlinear equations. It uses either the Newton or Broyden iterative algorithms.

To compute $\tilde\ell_n(\xi^*)$, we can solve (9), as (8) suggests. This leads to a system of $dm + l$ nonlinear equations in $(\lambda, \theta)$, with $d$ being the dimension of the vector-valued basis function $q(x)$ and $l$ the number of population quantiles of interest specified in $\xi^*$. In most applications, a $q(x)$ with dimension 4 or less is suitable. For a system of this size, the R package nleqslv for roots is very effective even when $m$ is as large as 20. The existence of the solution to (4) and (5) is proved in Appendix B. Guided by this proof, our choice of the initial $\lambda$ and $\theta$ guarantees numerical success.

As is typical for DRM examples, we simulate data from the normal and gamma distributions and examine the ELRT-based hypothesis tests and confidence regions for the population quantiles. For comparison, we include Wald-based and nonparametric inference on the same quantiles. To make the article self-contained, we now briefly review the Wald and nonparametric methods.

Wald method. The Wald method for confidence region construction of $\xi$ was given in Chen and Liu (2013). Let $(\tilde\theta, \tilde G_0)$ be the maximizer of $\sup_{\theta, G_0} \{ \ell_n(\theta, G_0) \}$, and also let

$$ \tilde G_r(x) = \sum_{k,j} \mathbb{1}(x_{kj} \le x)\, h_r(x_{kj}, \tilde\theta)\, d\tilde G_0(x_{kj}), $$

for $r = 1, \ldots, m$, where $d\tilde G_0(x) = \tilde G_0(x) - \tilde G_0(x^-)$. The maximum EL estimator (MELE) of the $\tau_r$ quantile of $G_r$ is then given by

$$ \tilde\xi_r = \inf \{ x : \tilde G_r(x) \ge \tau_r \}. $$

Let $\tilde\xi = \{ \tilde\xi_r : r \in I \}$. We have, as $n \to \infty$,

$$ \sqrt{n} (\tilde\xi - \xi^*) \xrightarrow{d} N(0, \Omega), $$

for some matrix $\Omega$ that is a function of $G_r$ and $\theta$. A plug-in estimate $\tilde\Omega$ of $\Omega$ was suggested by Chen and Liu (2013), and an R package drmdel (Cai, 2015) by the authors of Cai et al. (2017) includes the MELE $\tilde\xi$ and $\tilde\Omega$ in its output. A level $(1 - \alpha)$ approximate confidence region for $\xi$ based on the Wald method is then given by

$$ \{ \xi : n (\tilde\xi - \xi)^\top \tilde\Omega^{-1} (\tilde\xi - \xi) \le \chi^2_l(1 - \alpha) \}. \tag{11} $$

The Wald method can also be used for hypothesis tests on quantiles. We refer to the confidence region in (11) as the one based on the Wald method.

Nonparametric method. Suppose $\hat G_r(x) = n_r^{-1} \sum_{j=1}^{n_r} \mathbb{1}(x_{rj} \le x)$ is the empirical distribution based on a sample from the distribution $G_r$, and $\hat\xi_r$ is the sample quantile. The sample quantile is asymptotically normal (Serfling, 1980): $\sqrt{n} (\hat\xi_r - \xi_r)$ has asymptotic variance $\tau_r (1 - \tau_r) / (\rho_r g_r^2(\xi_r))$ as $n \to \infty$ and $n_r / n \to \rho_r$. In view of this, the Wald method remains applicable with the help of a nonparametric consistent density estimator. We follow the literature and let

$$ \hat g_r(x) = \frac{1}{n_r b_r} \sum_{j=1}^{n_r} K\Big( \frac{x_{rj} - x}{b_r} \Big), $$

for some kernel function $K(\cdot)$ and bandwidth $b_r$. Under mild conditions on $g_r(\cdot)$ and proper choices of $K(\cdot)$ and $b_r$, $\hat g_r(x)$ is consistent (Silverman, 1986). We set $K(\cdot)$ to the density function of the standard normal distribution, and we use a rule-of-thumb bandwidth suggested by Silverman (1986):

$$ b_r = 0.9 \min\{ \hat\sigma_r,\ \widehat{\mathrm{IQR}}_r / 1.34 \}\, n_r^{-1/5}, $$

where $\hat\sigma_r$ is the standard deviation of $\hat G_r$ and $\widehat{\mathrm{IQR}}_r$ is the interquartile range. With these, we obtain a plug-in estimate

$$ \hat T := \mathrm{diag}\{ \tau_r (1 - \tau_r) / (\rho_r \hat g_r^2(\hat\xi_r)) : r \in I \}, $$

and subsequently a $(1 - \alpha)$ approximate confidence region for $\xi$:

$$ \{ \xi : n (\hat\xi - \xi)^\top \hat T^{-1} (\hat\xi - \xi) \le \chi^2_l(1 - \alpha) \}, \tag{12} $$

where $\hat\xi = \{ \hat\xi_r : r \in I \}$. This nonparametric Wald method can also be employed for hypothesis tests on quantiles. We refer to the confidence region in (12) as the one based on the nonparametric method.

4.2. Data generated from normal distributions.

Normality is routinely assumed but is unlikely to be strictly valid in real-world applications. When multiple samples are available, we include all normal distributions without a normality assumption via a DRM coupled with $q(x) = (1, x, x^2)^\top$. In this simulation, we generate data from $m + 1 = 6$ normal distributions with sample sizes $n_r = 100$. Their means and standard deviations are chosen to be $(0, \ldots, 2)$ and $(1, \ldots)$. […] $n_r = 100$ and compute the $R_n$ values for the hypothesis on the medians of $G_0$ and $G_{[\ldots]}$:

$$ H_0 : \xi = \xi^* \quad \text{versus} \quad H_1 : \xi \neq \xi^*, $$

where $\xi$ is the pair of medians and $\xi^*$ holds their true values. Note that although we simulate data from normal distributions, the parametric information does not play any role in the data analysis.

Because $H_0$ is true, $R_n$ has a $\chi^2_2$ limiting distribution. Figure 1 gives a quantile-quantile (Q-Q) plot of the 1000 simulated $R_n$ values against the $\chi^2_2$ distribution. Over the range from 0 to 6 that matters in most applications, the points are close to the red 45-degree line. Clearly, the chi-square distribution is a good approximation of the sampling distribution of $R_n$, demonstrating good agreement with Theorem 3.4.

Fig 1: Q-Q plot of $R_n$ values against $\chi^2_2$ based on normal data of equal sample size $n_r = 100$. (x-axis: quantiles of chi-square, df = 2; y-axis: sample quantiles of ELRT statistics.)

In Figure 2, we depict the 95% confidence regions of $\xi$ based on the ELRT in (10), the Wald method in (11), and the nonparametric method in (12) based on a typical simulated data set with the true $\xi^*$ marked as a red diamond. The ELRT contour is not smooth because $R_n(\xi)$ is not smooth at data points. Clearly, the ELRT confidence region has the smallest area and is therefore the most efficient. In Table 1, we make direct quantitative comparisons between the three methods in terms of the coverage probabilities and areas of the 90% and 95% confidence regions. Both the ELRT and Wald methods under the DRM have empirical coverage probabilities close to the nominal levels; the nonparametric method has overcoverage. The ELRT clearly outperforms.

Fig 2: Confidence regions of $\xi$ by ELRT (solid), Wald (dashed), and nonparametric (dotted) methods, based on a simulated normal data set of equal sample size $n_r = 100$. The true quantiles are marked with a diamond. The level of confidence is 95%. (x-axis: 50% quantile of G0.)

Table 1
Empirical coverage probabilities and average areas based on normal data of equal sample size.

                   90%                     95%
Method             Coverage    Area        Coverage    Area
n_r = 100
ELRT               89.1%       0.250       95.8%       0.…
Wald               ….8%        0.266       95.4%       0.…
Nonparametric      ….7%        0.374       95.9%       0.…
n_r = 200
ELRT               89.7%       0.126       95.0%       0.…
Wald               ….5%        0.132       95.2%       0.…
Nonparametric      ….3%        0.183       95.3%       0.…

In applications, the sample sizes from different populations are unlikely to be equal. Does the superiority of the ELRT require equal sample sizes from these populations? We also simulated data from the same distributions with unequal sample sizes. We set the sizes of four of the populations to 100 and 200, and the sizes of the remaining two populations to 50 and 100, respectively. We constructed confidence regions for the 90% quantile of $G_2$ and the 95% quantile of $G_{[\ldots]}$, where both populations have the smaller sample sizes. Figure 3 shows the three 95% confidence regions based on a simulated data set; we see that the ELRT is superior. Admittedly, this is one of the more extreme cases. Table 2 gives the average areas and empirical coverage probabilities of the three confidence regions, based on 1000 repetitions. The ELRT confidence regions have the most accurate coverage probabilities, while the other two methods have low coverage. The ELRT confidence regions have larger average areas, but these are not excessive.

Fig 3: Confidence regions of $\xi$ by ELRT (solid), Wald (dashed), and nonparametric (dotted) methods, based on a simulated normal data set of unequal sample sizes. The true quantiles are marked with a diamond. The level of confidence is 95%. (x-axis: 90% quantile of G2.)
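The nonparametric method reviewed in Section 4.1 is simple enough to sketch in full for a single quantile of a single population. In that case $n \rho_r = n_r$, so the region (12) reduces to the interval $\hat\xi_r \pm z_{1 - \alpha/2} \sqrt{\tau_r (1 - \tau_r) / n_r} / \hat g_r(\hat\xi_r)$. The code below is an illustration under these assumptions, with hypothetical function names; it uses the Gaussian kernel and the rule-of-thumb bandwidth described above, and the quartiles come from Python's `statistics.quantiles`, whose default interpolation rule may differ slightly from the one used by the authors.

```python
import math
from statistics import NormalDist, quantiles, stdev


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(sample)
    q1, _, q3 = quantiles(sample, n=4)  # sample quartiles
    return 0.9 * min(stdev(sample), (q3 - q1) / 1.34) * n ** (-0.2)


def kernel_density(sample, x, b):
    """Gaussian-kernel density estimate at x with bandwidth b."""
    phi = NormalDist().pdf
    return sum(phi((xj - x) / b) for xj in sample) / (len(sample) * b)


def sample_quantile(sample, tau):
    """inf{x : empirical CDF(x) >= tau}, i.e. the ceil(n * tau)-th order statistic."""
    xs = sorted(sample)
    return xs[max(math.ceil(tau * len(xs)) - 1, 0)]


def nonparametric_wald_interval(sample, tau, alpha=0.05):
    """Wald interval for the tau-quantile with a kernel plug-in variance."""
    n = len(sample)
    xi_hat = sample_quantile(sample, tau)
    g_hat = kernel_density(sample, xi_hat, silverman_bandwidth(sample))
    se = math.sqrt(tau * (1.0 - tau) / n) / g_hat
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return xi_hat - z * se, xi_hat + z * se


# Deterministic toy sample: 200 standard-normal scores (not the paper's data).
toy = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
print(nonparametric_wald_interval(toy, tau=0.5))
```

Because the interval width is inversely proportional to the estimated density at the quantile, the method becomes unstable in the tails where $\hat g_r$ is small, which is consistent with the poorer behaviour reported for the nonparametric regions at the 90% and 95% quantiles.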

4.3. Data generated from gamma distributions.

In applications, income, lifetime, expenditure, and strength data are positive and skewed. Gamma or Weibull distributions are often used for statistical inference in such applications. In the presence of multiple samples, replacing the parametric model by a DRM with $q(x) = (1, x, \log x)^\top$ is an attractive option to reduce the risk of model misspecification. We generate 1000 sets of $m + 1 = 6$ independent samples of sizes $n_r = 100$ and 200 from gamma distributions with shape parameters $(5, \ldots, 7)$ and scale parameters $(2, \ldots)$. […] $G_1$ and $G_{[\ldots]}$:

$$ H_0 : \xi = \xi^* \quad \text{versus} \quad H_1 : \xi \neq \xi^*, $$

where $\xi^*$ holds the true medians of Gamma(5, ….9) and Gamma(6, …). […] $R_n$ values against the theoretical limiting distribution $\chi^2_2$. The points in the Q-Q plot are close to (but slightly above) the 45-degree line in the range from 0 to 6. This implies that the corresponding tests will have close to nominal levels. Overall, the chi-square approximation is satisfactory.

Table 2
Empirical coverage probabilities and average areas based on normal data of unequal sample sizes.

                   90%                     95%
Method             Coverage    Area        Coverage    Area
Two populations with n_k = 50, four with n_k = 100
ELRT               90.1%       1.307       94.5%       1.…
Wald               ….7%        1.096       88.9%       1.…
Nonparametric      ….6%        1.439       80.0%       1.…
Two populations with n_k = 100, four with n_k = 200
ELRT               90.1%       0.642       94.5%       0.…
Wald               ….7%        0.572       91.8%       0.…
Nonparametric      ….3%        0.804       86.7%       1.…

In Figure 5, we depict the 95% confidence regions of $\xi$ using the ELRT in (10), the Wald method in (11), and the nonparametric method in (12), based on a typical simulated data set with $\xi^*$ marked as a red diamond. Clearly, the ELRT-based confidence region has a smaller area and is therefore more efficient. In Table 3 we make direct quantitative comparisons of the coverage probabilities and areas. Both the ELRT and Wald methods under the DRM have empirical coverage probabilities very close to the nominal levels. The nonparametric confidence regions have overcoverage and inflated sizes. We again conclude that the ELRT is superior.

We also study the confidence regions for a pair of lower quantiles: the 5% quantile of $G_4$ and the 10% quantile of $G_{[\ldots]}$. Figure 6 shows the three 95% confidence regions based on a simulated data set. Table 4 gives the average areas and coverage probabilities of the three confidence regions, based on 1000 repetitions. The ELRT method is still the most efficient. While maintaining accurate coverage probabilities, the ELRT confidence regions still have satisfactory areas that are comparable to those of the Wald confidence regions.

Fig 4: Q-Q plot of $R_n$ values against $\chi^2_2$ based on gamma data of equal sample size $n_r = 100$. (x-axis: quantiles of chi-square, df = 2; y-axis: sample quantiles of ELRT statistics.)

Table 3
Empirical coverage probabilities and average areas based on gamma data of equal sample size.

                   90%                     95%
Method             Coverage    Area        Coverage    Area
n_r = 100
ELRT               88.3%       2.808       94.2%       3.…
Wald               ….9%        2.953       95.3%       3.…
Nonparametric      ….1%        4.264       95.2%       5.…
n_r = 200
ELRT               88.6%       1.395       94.4%       1.…
Wald               ….7%        1.451       95.3%       1.…
Nonparametric      ….3%        2.111       94.3%       2.…
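The choice $q(x) = (1, x, \log x)^\top$ for gamma data can be verified directly, since the log-density of Gamma(shape $a$, scale $s$) is linear in $x$ and $\log x$. The sketch below computes the DRM coefficients for a pair of gamma populations and checks model (1) numerically. The function names and the pair Gamma(5, 2) and Gamma(7, 1.5) are illustrative assumptions, not the full simulation design of the paper.

```python
import math


def gamma_pdf(x, shape, scale):
    """Density of the gamma distribution with the given shape and scale at x > 0."""
    return math.exp((shape - 1.0) * math.log(x) - x / scale
                    - shape * math.log(scale) - math.lgamma(shape))


def drm_theta_gamma(shape0, scale0, shapek, scalek):
    """Coefficients theta_k of q(x) = (1, x, log x) for base Gamma(shape0, scale0)
    and k-th population Gamma(shapek, scalek), from log g_k(x) - log g_0(x)."""
    const = (shape0 * math.log(scale0) + math.lgamma(shape0)
             - shapek * math.log(scalek) - math.lgamma(shapek))
    lin = 1.0 / scale0 - 1.0 / scalek   # coefficient of x
    log_coef = shapek - shape0          # coefficient of log x
    return (const, lin, log_coef)


def density_ratio(x, theta):
    """exp{theta^T q(x)} with q(x) = (1, x, log x)."""
    return math.exp(theta[0] + theta[1] * x + theta[2] * math.log(x))


# Illustrative (hypothetical) pair of populations.
theta = drm_theta_gamma(5.0, 2.0, 7.0, 1.5)
for x in (0.5, 4.0, 10.0, 25.0):
    lhs = gamma_pdf(x, 7.0, 1.5)
    rhs = density_ratio(x, theta) * gamma_pdf(x, 5.0, 2.0)
    print(f"x = {x:5.1f}   g_k(x) = {lhs:.6e}   exp(theta^T q) g_0(x) = {rhs:.6e}")
```

Only differences of shapes and of inverse scales enter the coefficients of $\log x$ and $x$, so any collection of gamma populations satisfies the same three-dimensional DRM regardless of which one is taken as the base.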

Fig 5: Confidence regions of $\xi$ by ELRT (solid), Wald (dashed), and nonparametric (dotted) methods, based on a simulated gamma data set of equal sample size $n_r = 100$. The true quantiles are marked with a diamond. The level of confidence is 95%.

Table 4

Empirical coverage probabilities and average areas based on gamma data of equal sample size.

Method          90% level                        95% level
                Coverage probability   Area      Coverage probability   Area
$n_r = 100$
ELRT            88.4%                  2.312     93.7%                  3.…
Wald            ….5%                   2.236     92.0%                  2.…
Nonparametric   ….8%                   3.250     88.7%                  4.…
$n_r = 200$
ELRT            90.8%                  1.139     95.3%                  1.…
Wald            ….4%                   1.114     95.4%                  1.…
Nonparametric   ….0%                   1.684     92.6%                  2.…

Fig 6: Confidence regions of $\xi$ by ELRT (solid), Wald (dashed), and nonparametric (dotted) methods, based on a simulated gamma data set of equal sample size $n_r = 100$. The true quantiles are marked with a diamond. The level of confidence is 95%.

5. Real-data analysis.

In the previous simulations, we chose the most suitable basis function $q(x)$ in each case because the population distributions were known to us. This is not possible in real-world applications. In this section, we create a simulation population based on the US Consumer Expenditure Surveys data concerning US expenditure, income, and demographics. The data set is available on the US Bureau of Labor Statistics website. The data are collected by the Census Bureau in the form of panel surveys, in which approximately 5000 households are contacted each quarter. After a household has been surveyed it is dropped from subsequent surveys and replaced by a new household. The response variable is the annual sum of the wages or salary income received by all household members before any deductions. Household income is a good reflection of economic well-being. The data files include some imputed values to replace missing values due to non-response.

We study a six-year period from 2013 to 2018, and we log-transform the response values to make the scale more suitable for numerical computation. Note that quantiles are equivariant under monotone transformations. We exclude households that have no recorded income even after imputation, and there remain 4919, 5304, 4641, 4606, 4475, and 4222 households from 2013 to 2018. The histograms shown in Figure 7 indicate that it is difficult to determine a suitable parametric model for these data sets, but a DRM may work well enough. We take the basis function $q(x) = (1, x, x^2)^\top$; it may not be the best choice, but as a result the simulation results for the DRM analysis are more convincing.

Fig 7: Histograms of log-transformed annual household incomes. [Six panels, Income in 2013 through Income in 2018; horizontal axis: log-transformed income; vertical axis: density.]

In this simulation, we form 6 populations based on the yearly data sets. We test hypotheses on the 20% and 50% quantiles based on independent samples of size 100. To test the size of a single quantile of a single population, the limiting distribution of $R_n$ is $\chi^2_1$.
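With a $\chi^2_1$ limiting distribution, the test decision for a single quantile only needs the upper-tail probability of the observed statistic. The sketch below uses the identity that the square of a standard normal variable is $\chi^2_1$; the function names and the two example statistic values are hypothetical and are not taken from the paper.

```python
import math


def chi2_df1_pvalue(stat):
    """Upper-tail probability P(chi2_1 > stat).

    If Z is standard normal, Z**2 is chi-square with 1 degree of freedom,
    so P(Z**2 > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    """
    if stat < 0:
        raise ValueError("statistic must be non-negative")
    return math.erfc(math.sqrt(stat / 2.0))


def reject_null(stat, alpha=0.05):
    """Reject H0 when the asymptotic p-value falls below alpha."""
    return chi2_df1_pvalue(stat) < alpha


# 3.8415 is the usual 95% critical value of chi-square with 1 df.
print(round(chi2_df1_pvalue(3.841458820694124), 4))  # 0.05
print(reject_null(5.2), reject_null(1.3))  # True False
```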
Figures 8 and 9 contain a few Q-Q plots of $R_n$ versus $\chi^2_1$ for $H_0: \xi_r = \xi_r^*$ with $\tau_r = 20\%$ or $\tau_r = 50\%$. In all the plots, the points of $R_n$ are close to the 45-degree line. Thus, the precision of the chi-square approximation is satisfactory. The plots for other levels or populations are similar and not presented.

The Wald method (11) may be regarded as being derived from an asymptotically $\chi^2_1$ distributed statistic:

$$W_n = n\,(\tilde\xi_r - \xi_r^*)^\top \tilde\Omega^{-1} (\tilde\xi_r - \xi_r^*).$$

We also obtain $W_n$ values and construct Q-Q plots, and a selected few are given in Figures 10 and 11. These plots show that the chi-square approximation is not as satisfactory. There are many possible explanations, but a major factor could be the unstable variance estimator $\tilde\Omega$ that the Wald method must rely on, especially for lower quantiles. One of the most valued properties of the likelihood ratio test approach is that there is no need to estimate a scale factor.

Fig 8: Q-Q plots of $R_n$ values against $\chi^2_1$, based on real data of equal sample size $n_r = 100$. Quantile levels are 20%. [Panels: ELRT for 20% quantile in 2017; ELRT for 20% quantile in 2018. Horizontal axis: quantiles of chi-square (df = 1); vertical axis: sample quantiles of ELRT statistics.]

Fig 9: Q-Q plots of $R_n$ values against $\chi^2_1$, based on real data of equal sample size $n_r = 100$. Quantile levels are 50%. [Panels: ELRT for 50% quantile in 2013; ELRT for 50% quantile in 2014. Axes as in Fig 8.]

Fig 10: Q-Q plots of Wald statistic values against $\chi^2_1$, based on real data of equal sample size $n_r = 100$. Quantile levels are 20%. [Panels: Wald test for 20% quantile in 2017; Wald test for 20% quantile in 2018. Horizontal axis: quantiles of chi-square (df = 1); vertical axis: sample quantiles of Wald statistics.]

Fig 11: Q-Q plots of Wald statistic values against $\chi^2_1$, based on real data of equal sample size $n_r = 100$. Quantile levels are 50%. [Panels: Wald test for 50% quantile in 2013; Wald test for 50% quantile in 2014. Axes as in Fig 10.]
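The Wald statistic $W_n$ discussed in this section is a quadratic form in the estimation error, scaled by an estimated covariance matrix. The following is a minimal sketch for the two-dimensional case, written only to show where the variance estimator enters; the function name and the numeric inputs are hypothetical, and the estimator $\tilde\Omega$ itself is taken as given.

```python
def wald_statistic(n, estimate, target, omega):
    """Return n * d^T omega^{-1} d with d = estimate - target (2-dimensional case).

    `omega` is a 2x2 covariance estimate given as ((a, b), (c, e)).
    """
    d0 = estimate[0] - target[0]
    d1 = estimate[1] - target[1]
    (a, b), (c, e) = omega
    det = a * e - b * c
    if det == 0:
        raise ValueError("omega must be invertible")
    # omega^{-1} = (1 / det) * [[e, -b], [-c, a]]
    quad = (e * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    return n * quad


# Hypothetical inputs: an unstable omega changes W_n directly,
# which the likelihood ratio statistic avoids by construction.
print(wald_statistic(50, (0.2, -0.4), (0.0, 0.0), ((2.0, 0.0), (0.0, 4.0))))
```

For a single quantile the same expression reduces to $n(\tilde\xi_r - \xi_r^*)^2 / \tilde\omega$ with a scalar variance estimate $\tilde\omega$.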

A direct consequence of the poor chi-square approximation could be undercoverage of the confidence intervals. Table 5 gives the coverage probabilities and average lengths of the confidence intervals based on three methods: ELRT in (10), Wald in (11), and nonparametric in (12). The improved efficiency of the DRM is best reflected in the average lengths of the confidence intervals. It can be seen that the DRM-based methods achieve on average about 15% and 25% improvement over the nonparametric method for the 20% and 50% quantiles respectively. Comparing the ELRT and Wald methods, both done under the DRM, we find that the ELRT is comparable to the Wald method for the 20% quantile and clearly more efficient for the 50% quantile.

As mentioned in Section 2, the ELRT approach can also be used to construct confidence regions for quantiles from the same population. In the next simulation, we focus on the first and second quintiles of the household incomes in the year 2018 jointly, which correspond to the 20% and 40% quantiles respectively. Figure 12 shows the 95% confidence regions using the three methods based on simulated real data of size $n_r = 100$. Table 6 gives the average coverages and areas of the three confidence regions, based on 1000 repetitions. The ELRT produces the most satisfactory confidence regions, with acceptable coverage probabilities and small average areas.

Fig 12: Confidence regions of 20% and 40% quantiles for the year 2018 by ELRT (solid), Wald (dashed), and nonparametric (dotted) methods, based on a simulated real data set of equal sample size $n_r = 100$. The true quantiles are marked with a diamond. The level of confidence is 95%.
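A confidence region obtained by inverting a test is the set of candidate values at which the statistic stays below the critical value, and its area can be approximated on a grid. The sketch below illustrates this with a hypothetical stand-in statistic (`toy_stat`, a simple quadratic) rather than the ELRT statistic itself, and it assumes a $\chi^2_2$ reference distribution for a pair of quantiles; the grid size is an arbitrary choice.

```python
import math


def chi2_df2_critical(level):
    # Closed-form quantile of chi-square with 2 degrees of freedom.
    return -2.0 * math.log(1.0 - level)


def region_area(stat_fn, x_range, y_range, level=0.95, steps=400):
    """Approximate the area of {(x, y): stat_fn(x, y) <= critical value} on a grid."""
    crit = chi2_df2_critical(level)
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    dx = (x_hi - x_lo) / steps
    dy = (y_hi - y_lo) / steps
    inside = 0
    for i in range(steps):
        x = x_lo + (i + 0.5) * dx  # cell midpoint
        for j in range(steps):
            y = y_lo + (j + 0.5) * dy
            if stat_fn(x, y) <= crit:
                inside += 1
    return inside * dx * dy


def toy_stat(x, y):
    # Hypothetical stand-in for a test statistic centred at (1, 2).
    return 100.0 * ((x - 1.0) ** 2 + (y - 2.0) ** 2)


area = region_area(toy_stat, (0.5, 1.5), (1.5, 2.5))
exact = math.pi * chi2_df2_critical(0.95) / 100.0  # the toy region is a disk
print(area, exact)
```

Because the toy region is a disk, the grid estimate can be compared with the exact area; for a real statistic no such closed form is available and the grid must cover the whole region.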

[Fig 12 axes: 20% quantile in 2018 (horizontal); legend: ELRT, Wald, Nonparametric.]

Table 5

Average lengths and empirical coverage probabilities of the individual confidence intervals, based on real data of equal sample size $n_r = 100$.

Year       ELRT               Wald               Nonparametric
           90%      95%       90%      95%       90%      95%

Average lengths, quantile levels all = 20%
2013       0.465    0.563     0.440    0.524     0.513    0.…
2014       0.464    0.559     0.437    0.520     0.528    0.…
2015       0.459    0.553     0.432    0.515     0.519    0.…
2016       0.461    0.558     0.435    0.519     0.527    0.…
2017       0.459    0.557     0.434    0.518     0.539    0.…
2018       0.438    0.529     0.416    0.496     0.523    0.…
Average    0.458    0.553     0.433    0.515     0.525    0.…

Average lengths, quantile levels all = 50%
2013       0.307    0.364     0.315    0.376     0.383    0.…
2014       0.306    0.366     0.316    0.376     0.379    0.…
2015       0.304    0.364     0.314    0.374     0.374    0.…
2016       0.305    0.364     0.315    0.375     0.382    0.…
2017       0.304    0.364     0.316    0.376     0.390    0.…
2018       0.300    0.357     0.311    0.371     0.373    0.…
Average    0.304    0.363     0.315    0.375     0.380    0.…

Empirical coverage probabilities, quantile levels all = 20%
2013       88.0%    94.0%     88.7%    93.2%     87.7%    92.…%
2014       ….1%     95.1%     88.7%    94.7%     87.9%    92.…%
2015       ….8%     94.6%     88.6%    93.6%     89.5%    94.…%
2016       ….7%     95.1%     88.6%    94.1%     87.7%    94.…%
2017       ….0%     94.6%     87.8%    93.3%     86.6%    91.…%
2018       ….4%     95.6%     87.5%    91.7%     89.0%    93.…%
Average    ….7%     94.8%     88.3%    93.4%     88.1%    93.…%

Empirical coverage probabilities, quantile levels all = 50%
2013       ….8%     94.2%     89.3%    95.2%     88.5%    93.…%
2014       ….2%     95.3%     90.4%    95.4%     89.4%    94.…%
2015       ….7%     96.0%     92.3%    95.7%     92.4%    95.…%
2016       ….0%     95.5%     90.9%    95.5%     90.9%    94.…%
2017       ….9%     95.2%     90.1%    96.0%     91.7%    95.…%
2018       ….6%     94.9%     89.8%    95.4%     90.0%    95.…%
Average    ….9%     95.2%     90.5%    95.5%     90.5%    95.…%

Table 6

Empirical coverage probabilities and average areas for the 20% and 40% quantiles for the year 2018, based on real data of equal sample size.

Method          90% level                        95% level
                Coverage probability   Area      Coverage probability   Area
$n_r = 100$: ELRT 87.8% 0.… ….5% 0.… ….9% 0.… ….8% 0.219 93.4% 0.…
$n_r = 200$: ELRT 85.0% 0.… ….4% 0.… ….5% 0.… ….2% 0.111 94.4% 0.…

APPENDIX A: PROOFS OF THE MAIN RESULTS

This Appendix provides the proofs of the technical results. In the following proofs, without loss of generality, we proceed as if the sample proportions $n_k/n$ do not depend on $n$ and equal their limits $\rho_k$. Our results are applicable as long as none of the populations have comparatively very small sample sizes. Also, for the sake of convenience, with a generic function $f(y)$ we use

$$\frac{\partial f(y^*)}{\partial y} = \left.\frac{\partial f(y)}{\partial y}\right|_{y = y^*}, \qquad \frac{\partial^2 f(y^*)}{\partial y\,\partial y^\top} = \left.\frac{\partial^2 f(y)}{\partial y\,\partial y^\top}\right|_{y = y^*}.$$

Finally, the DRM parameters $\theta$ are arranged in the order

$$(\theta_{11}, \theta_{21}, \ldots, \theta_{m1}, \theta_{12}, \theta_{22}, \ldots, \theta_{m2}, \ldots, \theta_{1d}, \theta_{2d}, \ldots, \theta_{md}),$$

where $\theta_{is}$ is the $s$th component of the vector-valued parameter $\theta_i$. This order is needed for the expressions of the second derivative of $D(\lambda, \theta)$ in the proof of Lemma 3.1 and for the covariance matrix of the first derivative in the proof of Lemma 3.2.

A.1. Proof of Lemma 3.1.

This lemma asserts that the second derivative matrix of $D(\lambda, \theta)$ has a finite and full-rank matrix as a limit.

Proof.

We first recognize that $D(\lambda, \theta)$ can be written as a sum of $m + 1$ sets of i.i.d. random variables:

$$D(\lambda, \theta) = \sum_{k,j} D_k(x_{kj}, \lambda, \theta), \qquad (13)$$

with

$$D_k(x, \lambda, \theta) = \theta_k^\top q(x) - \log \bar h(x, \theta) - \log\Big\{1 + \sum_{r \in I} \lambda_r \psi_r(x, \theta)\Big\}.$$

Therefore, we may write

$$n^{-1} \frac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)\,\partial(\lambda, \theta)^\top} = \sum_{k=0}^{m} \rho_k \Big\{ n_k^{-1} \sum_{j=1}^{n_k} \frac{\partial^2 D_k(x_{kj}, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)\,\partial(\lambda, \theta)^\top} \Big\}.$$

By the law of large numbers (Durrett, 2010), as $n \to \infty$,

$$n^{-1} \begin{pmatrix} \dfrac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial\lambda\,\partial\lambda^\top} & \dfrac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial\lambda\,\partial\theta^\top} \\ \dfrac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial\theta\,\partial\lambda^\top} & \dfrac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial\theta\,\partial\theta^\top} \end{pmatrix} \to \begin{pmatrix} S_{\lambda\lambda} & S_{\lambda\theta} \\ S_{\theta\lambda} & S_{\theta\theta} \end{pmatrix},$$

for some block matrix $S$ given by

$$S = \sum_{k=0}^{m} \rho_k E_k\Big[ \frac{\partial^2 D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)\,\partial(\lambda, \theta)^\top} \Big].$$

Here we remark again that we assume that the sample proportions $n_k/n$ do not change with $n$ and always equal their limits $\rho_k$.

Next, we show that $S$ has full rank. We first give the following expressions:

$$\frac{\partial^2 D_k(x, \mathbf{0}, \theta^*)}{\partial\lambda\,\partial\lambda^\top} = \psi(x, \theta^*)\,\psi^\top(x, \theta^*),$$

$$\frac{\partial^2 D_k(x, \mathbf{0}, \theta^*)}{\partial\theta\,\partial\theta^\top} = [q(x) q^\top(x)] \otimes \big[ h(x, \theta^*) h^\top(x, \theta^*) - \mathrm{diag}\{h(x, \theta^*)\} \big],$$

$$\frac{\partial^2 D_k(x, \mathbf{0}, \theta^*)}{\partial\lambda\,\partial\theta^\top} = q^\top(x) \otimes \big[ \psi(x, \theta^*) h^\top(x, \theta^*) - \mathrm{diag}\{\psi(x, \theta^*)\} (e_{I_1} \cdots e_{I_l})^\top \big],$$

where $\otimes$ is the Kronecker product, $e_i$ is a vector of length $m$ that has 1 in the $i$th entry and 0 elsewhere (we define $e_0 = \mathbf{0}$ by convention), and $I_j$ is the population index of the $j$th quantile of interest.

Based on the above expressions, we first note that

$$S_{\theta\theta} = -\sum_{k=0}^{m} \rho_k E_k\Big[ \{q(X) \otimes [e_k - h(X, \theta^*)]\}\{q(X) \otimes [e_k - h(X, \theta^*)]\}^\top \Big],$$

which is clearly negative semidefinite. We now strengthen the conclusion to negative definite. By Condition (ii), $E_0[q(X) q^\top(X)]$ is positive definite. Since $h_r(x, \theta^*) = \exp(\theta_r^{*\top} q(x))$, we have that

$$E_k\Big[ \{e_k - h(X, \theta^*)\}\{e_k - h(X, \theta^*)\}^\top \Big]$$

is positive definite. Simple algebra leads to the negative definiteness of $S_{\theta\theta}$. For the same reason, $S_{\lambda\lambda}$ is positive definite if $\psi(x, \theta^*)$ does not degenerate, which is assured because

$$\varphi_r(x, \theta, \xi) = h_r(x, \theta)[\mathbb{1}(x \le \xi_r) - \tau_r].$$

From

$$\begin{pmatrix} I & -S_{\lambda\theta} S_{\theta\theta}^{-1} \\ 0 & I \end{pmatrix} \times \begin{pmatrix} S_{\lambda\lambda} & S_{\lambda\theta} \\ S_{\theta\lambda} & S_{\theta\theta} \end{pmatrix} = \begin{pmatrix} S_{\lambda\lambda} - S_{\lambda\theta} S_{\theta\theta}^{-1} S_{\theta\lambda} & 0 \\ S_{\theta\lambda} & S_{\theta\theta} \end{pmatrix},$$

we conclude that $S$ has full rank if $S_{\lambda\lambda} - S_{\lambda\theta} S_{\theta\theta}^{-1} S_{\lambda\theta}^\top$ does. Because $S_{\lambda\lambda}$ is positive definite and $S_{\theta\theta}^{-1}$ is negative definite, $S_{\lambda\lambda} - S_{\lambda\theta} S_{\theta\theta}^{-1} S_{\lambda\theta}^\top$ must be positive definite, and so it has full rank. This completes the proof that $S$ has full rank.

A.2. Proof of Lemma 3.2.

The first conclusion of this lemma is that the first derivative of $D(\lambda, \theta)$ in (7) has zero expectation when evaluated at $(\mathbf{0}, \theta^*)$. Recall that

$$D(\lambda, \theta) = \sum_{k,j} \theta_k^\top q(x_{kj}) - \sum_{k,j} \log \bar h(x_{kj}, \theta) - \sum_{k,j} \log\Big\{1 + \sum_{r \in I} \lambda_r \psi_r(x_{kj}, \theta)\Big\}.$$

For any $r \in I$, the partial derivative of $D(\lambda, \theta)$ with respect to $\lambda_r$ is given by

$$\frac{\partial D(\lambda, \theta)}{\partial\lambda_r} = -\sum_{k,j} \frac{\psi_r(x_{kj}, \theta)}{1 + \sum_{i \in I} \lambda_i \psi_i(x_{kj}, \theta)}.$$

At $\lambda = \mathbf{0}$ and $\theta = \theta^*$, this reduces to

$$\frac{\partial D(\mathbf{0}, \theta^*)}{\partial\lambda_r} = -\sum_{k,j} \psi_r(x_{kj}, \theta^*).$$

Hence, we have

$$E\Big[\frac{\partial D(\mathbf{0}, \theta^*)}{\partial\lambda_r}\Big] = -\sum_{k,j} \int \psi_r(x, \theta^*)\,\mathrm{d}G_k(x) = -\int \psi_r(x, \theta^*) \Big\{\sum_{k=0}^{m} n_k h_k(x, \theta^*)\Big\}\,\mathrm{d}G_0(x) = -n \int \varphi_r(x, \theta^*, \xi^*)\,\mathrm{d}G_0(x) = 0. \qquad (14)$$

For each $i = 1, 2, \ldots, m$, $s = 1, 2, \ldots, d$, and at $\lambda = \mathbf{0}$ and $\theta = \theta^*$, we have

$$\frac{\partial D(\mathbf{0}, \theta^*)}{\partial\theta_{is}} = \sum_{j=1}^{n_i} q_s(x_{ij}) - \sum_{k,j} \rho_i q_s(x_{kj}) h_i(x_{kj}, \theta^*) / \bar h(x_{kj}, \theta^*).$$

For the first term, it can be seen that

$$E\Big[\sum_{j=1}^{n_i} q_s(x_{ij})\Big] = n_i \int q_s(x) h_i(x, \theta^*)\,\mathrm{d}G_0(x).$$

At the same time, for the second term, we have

$$E\Big[\sum_{k,j} \rho_i q_s(x_{kj}) h_i(x_{kj}, \theta^*) / \bar h(x_{kj}, \theta^*)\Big] = n_i \int q_s(x) h_i(x, \theta^*)\,\mathrm{d}G_0(x).$$

Therefore, we find that

$$E\Big[\frac{\partial D(\mathbf{0}, \theta^*)}{\partial\theta_{is}}\Big] = 0. \qquad (15)$$

Combining (14) and (15), we conclude that

$$E\Big[\frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big] = \mathbf{0}.$$

The second conclusion of this lemma is the asymptotic normality of the first derivative. Despite its complex expression, we can see that $\partial D(\lambda, \theta)/\partial(\lambda, \theta)$ is a sum of $m + 1$ sets of i.i.d. random variables of sizes $n_r = n\rho_r$ with mean zero and finite second moment in the matrix sense. Recall (13) from the proof of Lemma 3.1 that

$$D(\lambda, \theta) = \sum_{k,j} D_k(x_{kj}, \lambda, \theta),$$

where

$$D_k(x, \lambda, \theta) = \theta_k^\top q(x) - \log \bar h(x, \theta) - \log\Big\{1 + \sum_{r \in I} \lambda_r \psi_r(x, \theta)\Big\}.$$

At $(\mathbf{0}, \theta^*)$, where the total expectation vanishes, we may write

$$\frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} = \sum_{k=0}^{m} \Big\{ \sum_{j=1}^{n_k} \Big[ \frac{\partial D_k(x_{kj}, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} - E_k\Big( \frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} \Big) \Big] \Big\}.$$

For each $k = 0, 1, \ldots, m$, as $n_k \to \infty$,

$$T_k := n_k^{-1/2} \sum_{j=1}^{n_k} \Big[ \frac{\partial D_k(x_{kj}, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} - E_k\Big( \frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} \Big) \Big]$$

has a normal limiting distribution with mean zero and finite second moment in the matrix sense, by the multivariate central limit theorem for triangular arrays (Durrett, 2010). Because $T_0, T_1, \ldots, T_m$ are independent of each other, the targeted quantity

$$n^{-1/2} \frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} = \sum_{k=0}^{m} \rho_k^{1/2} T_k$$

is asymptotically normal with mean zero.

We now give the expression $V$ for the covariance matrix in the limiting distribution. Let $V_k$ be the asymptotic covariance matrix of $T_k$; then we have

$$V = \sum_{k=0}^{m} \rho_k V_k.$$

The expression for $V_k$ is given by

$$V_k = E_k\Big[ \Big(\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big) \Big(\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big)^\top \Big] - E_k\Big[\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big] E_k\Big[\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big]^\top.$$

After some algebra, we find that

$$\frac{\partial D_k(x, \mathbf{0}, \theta^*)}{\partial\lambda} = -\psi(x, \theta^*), \qquad \frac{\partial D_k(x, \mathbf{0}, \theta^*)}{\partial\theta} = q(x) \otimes [e_k - h(x, \theta^*)],$$

where $e_k$ is a unit vector with the $k$th element being 1 ($e_0 = \mathbf{0}$ by convention). We have

$$\sum_{k=0}^{m} \rho_k E_k\Big[ \Big(\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big) \Big(\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big)^\top \Big] = \begin{pmatrix} S_{\lambda\lambda} & 0 \\ 0 & -S_{\theta\theta} \end{pmatrix}.$$

Let

$$W = \begin{pmatrix} \rho_0^{-1} \mathbf{1}_m \mathbf{1}_m^\top + \mathrm{diag}\{\rho_1^{-1}, \ldots, \rho_m^{-1}\} & 0 \\ 0 & 0 \end{pmatrix},$$

where $\mathbf{1}_m$ is an $m$-dimensional vector of ones; we then also have

$$\sum_{k=0}^{m} \rho_k E_k\Big[\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big] E_k\Big[\frac{\partial D_k(X, \mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big]^\top = S \begin{pmatrix} 0 & 0 \\ 0 & W \end{pmatrix} S.$$

Finally, we get

$$V = \begin{pmatrix} S_{\lambda\lambda} & 0 \\ 0 & -S_{\theta\theta} \end{pmatrix} - S \begin{pmatrix} 0 & 0 \\ 0 & W \end{pmatrix} S.$$

This completes the proof that $n^{-1/2}\,\partial D(\mathbf{0}, \theta^*)/\partial(\lambda, \theta)$ is asymptotically normal.

A.3. Proof of Lemma 3.3.

Proof.

Given $\theta$, let $\lambda(\theta)$ be the solution to

$$\sum_{k,j} \frac{\psi(x_{kj}, \theta)}{1 + \lambda^\top \psi(x_{kj}, \theta)} = \mathbf{0}.$$

We first prove that uniformly for any $\theta$ in the $n^{-1/3}$-neighbourhood of $\theta^*$, $\lambda(\theta)$ is $O(n^{-1/3})$. For notational convenience, in this section we omit $\theta$ in $\lambda(\theta)$ if this does not cause any confusion.

Following the typical proof in Owen (2001), the claim is true if uniformly for $\theta$ such that $\|\theta - \theta^*\| \le n^{-1/3}$ we have

(i) $\sum_{k,j} \psi(x_{kj}, \theta) = O(n^{2/3})$;
(ii) $n^{-1} \sum_{k,j} \psi(x_{kj}, \theta) \psi^\top(x_{kj}, \theta)$ has a positive definite limit.

We omit other details but prove the above results.

We first quote a result of Owen (2001) with a minor modification: almost surely,

$$Y_{(n)} = \max\{Y_1, \ldots, Y_n\} = o(n^{1/3}), \qquad (16)$$

for i.i.d. $Y_1, \ldots, Y_n$ with finite third moment. In addition, $\sum Y_i = O(\sqrt{n \log\log n})$ by the classical law of the iterated logarithm when $E(Y_1) = 0$.

Also by the law of the iterated logarithm, we have

$$\sum_{k,j} \psi(x_{kj}, \theta^*) = O(\sqrt{n \log\log n}) = O(n^{2/3}). \qquad (17)$$

For $\theta$ in a small neighbourhood of $\theta^*$, there is a generic nonrandom constant $C$ such that

$$\sum_{k,j} \|\partial\psi(x_{kj}, \theta)/\partial\theta\| \le C \sum_{k,j} \|q(x_{kj})\| = O(n), \qquad (18)$$

with the order in the last step derived from the finite moment assumption on $q(X)$. Applying (17) and (18), with $\bar\theta$ being a value between $\theta$ and $\theta^*$, we get

$$\sum_{k,j} \psi(x_{kj}, \theta) = \sum_{k,j} \psi(x_{kj}, \theta^*) + \sum_{k,j} \frac{\partial\psi(x_{kj}, \bar\theta)}{\partial\theta} (\theta - \theta^*) = \sum_{k,j} \frac{\partial\psi(x_{kj}, \bar\theta)}{\partial\theta} (\theta - \theta^*) + O(n^{2/3}) = O(n^{2/3}).$$

This proves (i).

Recall that we focus on $\theta$ in an $n^{-1/3}$-neighbourhood of $\theta^*$. We have

$$\Big\| \sum_{j=1}^{n_k} \{\psi(x_{kj}, \theta) \psi^\top(x_{kj}, \theta) - \psi(x_{kj}, \theta^*) \psi^\top(x_{kj}, \theta^*)\} \Big\| = \Big\| \sum_{j=1}^{n_k} \big[ \psi(x_{kj}, \theta^*) \{\psi(x_{kj}, \theta) - \psi(x_{kj}, \theta^*)\}^\top + \{\psi(x_{kj}, \theta) - \psi(x_{kj}, \theta^*)\} \psi^\top(x_{kj}, \theta) \big] \Big\|$$

$$\le 2 \Big\{\max_{k,j} \sup_\theta \|\psi(x_{kj}, \theta)\|\Big\} \sum_{j=1}^{n_k} \|\psi(x_{kj}, \theta) - \psi(x_{kj}, \theta^*)\| = o(n^{1/3}) \times O(n^{2/3}) = o(n),$$

where the first order assessment above is by (16), and the second follows from (18) because $\|\theta - \theta^*\| \le n^{-1/3}$. Therefore, we have

$$n^{-1} \sum_{k,j} \{\psi(x_{kj}, \theta) \psi^\top(x_{kj}, \theta)\} = n^{-1} \sum_{k,j} \{\psi(x_{kj}, \theta^*) \psi^\top(x_{kj}, \theta^*)\} + o(1) \to S_{\lambda\lambda},$$

which is clearly positive definite. This proves (ii).

As we have remarked, the validity of (i) and (ii) implies that uniformly for $\theta - \theta^* = O(n^{-1/3})$,

$$\lambda(\theta) = O(n^{-1/3}). \qquad (19)$$

Following the same line of the proof, we also have a stronger order for $\lambda(\theta)$ when $\theta = \theta^*$:

$$\lambda(\theta^*) = o(n^{-1/3}). \qquad (20)$$

The next stage of the proof is dedicated to showing that $\hat\theta - \theta^* = O(n^{-1/3})$. We consider a function of $\theta$:

$$L(\theta) = D(\lambda(\theta), \theta).$$

It can easily be seen that $\hat\theta$ is a maximizer of $L(\theta)$. Since $L(\theta)$ is a smooth function, there must be a maximizer of $L(\theta)$ in the compact set $\{\theta : \|\theta - \theta^*\| \le n^{-1/3}\}$. We prove that this maximizer is attained in the interior of the compact set by showing that $L(\theta) < L(\theta^*)$ uniformly for $\theta$ on the boundary of the compact set. For any unit vector $a$ and $\theta = \theta^* + n^{-1/3} a$, expanding $L(\theta)$ at $\theta^*$ yields (see Folland (2002))

$$L(\theta) = L(\theta^*) + n^{-1/3} \frac{\partial L(\theta^*)}{\partial\theta} a + \frac{1}{2} n^{-2/3} a^\top \frac{\partial^2 L(\theta^*)}{\partial\theta\,\partial\theta^\top} a + \varepsilon_n, \qquad (21)$$

where $\varepsilon_n$ is the Lagrange remainder term in obvious notation:

$$\varepsilon_n = \frac{1}{6} n^{-1} \sum_{|\alpha| = 3} \partial^\alpha L(\bar\theta)\, a^\alpha,$$

for some $\bar\theta$ between $\theta^*$ and $\theta$. By the uniform boundedness of the third-order derivatives of $L(\theta)$, we have $\varepsilon_n = O(1)$ uniformly over $a$.

For the first term in the expansion, we note that $\lambda(\theta^*) = o(n^{-1/3})$ as given in (20), and this implies

$$\frac{\partial D(\lambda(\theta^*), \theta^*)}{\partial\theta} = \frac{\partial D(\mathbf{0}, \theta^*)}{\partial\theta} + O(n)(\lambda(\theta^*) - \mathbf{0}) = o(n^{2/3}),$$

with the order of $\partial D(\mathbf{0}, \theta^*)/\partial\theta$ implied by Lemma 3.2. Therefore,

$$\frac{\partial L(\theta^*)}{\partial\theta} = \frac{\partial D(\lambda(\theta^*), \theta^*)}{\partial\lambda} \frac{\partial\lambda(\theta^*)}{\partial\theta} + \frac{\partial D(\lambda(\theta^*), \theta^*)}{\partial\theta} = \mathbf{0} + \frac{\partial D(\lambda(\theta^*), \theta^*)}{\partial\theta} = o(n^{2/3}).$$

For the second term in the expansion, we proceed as follows. With $\lambda(\theta^*) = o(n^{-1/3})$ as given in (20) and Lemma 3.1, we first note that

$$\frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial(\lambda, \theta)\,\partial(\lambda, \theta)^\top} = \frac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)\,\partial(\lambda, \theta)^\top} + o(n^{2/3}) = n[S + o(1)].$$

Taking derivatives with respect to $\theta$ on both sides of the identity

$$\frac{\partial D(\lambda(\theta), \theta)}{\partial\lambda} = \mathbf{0},$$

and then setting $\theta = \theta^*$, we further have

$$\frac{\partial\lambda(\theta^*)}{\partial\theta} = -\Big[\frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\lambda\,\partial\lambda^\top}\Big]^{-1} \Big[\frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\lambda\,\partial\theta^\top}\Big].$$

Hence,

$$\frac{\partial^2 L(\theta^*)}{\partial\theta\,\partial\theta^\top} = \frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\theta\,\partial\lambda^\top} \frac{\partial\lambda(\theta^*)}{\partial\theta} + \frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\theta\,\partial\theta^\top}$$

$$= -\Big[\frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\lambda\,\partial\theta^\top}\Big]^\top \Big[\frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\lambda\,\partial\lambda^\top}\Big]^{-1} \Big[\frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\lambda\,\partial\theta^\top}\Big] + \frac{\partial^2 D(\lambda(\theta^*), \theta^*)}{\partial\theta\,\partial\theta^\top} = n[-S_{\lambda\theta}^\top S_{\lambda\lambda}^{-1} S_{\lambda\theta} + S_{\theta\theta} + o(1)].$$

Therefore, the expansion of $L(\theta)$ in (21) becomes

$$L(\theta) - L(\theta^*) = \frac{1}{2} n^{1/3} a^\top \{-S_{\lambda\theta}^\top S_{\lambda\lambda}^{-1} S_{\lambda\theta} + S_{\theta\theta}\} a + o(n^{1/3}).$$

The matrix in the quadratic form is negative definite, following the line of an argument in the proof of Lemma 3.1. Hence, as $n \to \infty$, with probability 1, $L(\theta^* + n^{-1/3} a) < L(\theta^*)$, uniformly over all unit vectors $a$. This proves

$$\hat\theta - \theta^* = O(n^{-1/3}),$$

and together with (19) further implies that

$$\hat\lambda = O(n^{-1/3}).$$

We are now ready to prove the asymptotic normality of $(\hat\lambda, \hat\theta)$. Expanding $\partial D(\hat\lambda, \hat\theta)/\partial(\lambda, \theta)$ at $(\mathbf{0}, \theta^*)$, we get

$$\mathbf{0} = \frac{\partial D(\hat\lambda, \hat\theta)}{\partial(\lambda, \theta)} = \frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} + \frac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)\,\partial(\lambda, \theta)^\top} \begin{pmatrix} \hat\lambda - \mathbf{0} \\ \hat\theta - \theta^* \end{pmatrix} + O(1).$$

By Lemmas 3.1 and 3.2, we get

$$\sqrt{n} \begin{pmatrix} \hat\lambda - \mathbf{0} \\ \hat\theta - \theta^* \end{pmatrix} = -S^{-1} \Big[ n^{-1/2} \frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} \Big] + o(1) \xrightarrow{d} N(\mathbf{0}, S^{-1} V S^{-1}), \qquad (22)$$

as $n \to \infty$.

A.4. Proof of Theorem 3.4.

Proof.

We notice that, as shown in Cai et al. (2017),

$$\sup_{\theta, G_0} \{\ell_n(\theta, G_0)\} = \sup_\theta D(\mathbf{0}, \theta) - n \log n.$$

From (8) we also have

$$\tilde\ell_n(\xi^*) = D(\hat\lambda, \hat\theta) - n \log n.$$

These relations lead to

$$R_n = 2\Big[\sup_\theta D(\mathbf{0}, \theta) - D(\hat\lambda, \hat\theta)\Big] = 2\Big[\sup_\theta D(\mathbf{0}, \theta) - D(\mathbf{0}, \theta^*)\Big] - 2\Big[D(\hat\lambda, \hat\theta) - D(\mathbf{0}, \theta^*)\Big]. \qquad (23)$$

Cai et al. (2017) show in the proof of their Theorem 1 that

$$\sup_\theta D(\mathbf{0}, \theta) - D(\mathbf{0}, \theta^*) = -\frac{1}{2} \Big[n^{-1/2} \frac{\partial D(\mathbf{0}, \theta^*)}{\partial\theta}\Big]^\top S_{\theta\theta}^{-1} \Big[n^{-1/2} \frac{\partial D(\mathbf{0}, \theta^*)}{\partial\theta}\Big] + o_p(1),$$

for the same $S_{\theta\theta}$ given in the proof of Lemma 3.1.

For the second term in (23), utilizing the expansion of $\hat\lambda$ and $\hat\theta - \theta^*$ given in (22), we have

$$D(\hat\lambda, \hat\theta) - D(\mathbf{0}, \theta^*) = \frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)} \begin{pmatrix} \hat\lambda - \mathbf{0} \\ \hat\theta - \theta^* \end{pmatrix} + \frac{1}{2} \begin{pmatrix} \hat\lambda - \mathbf{0} \\ \hat\theta - \theta^* \end{pmatrix}^\top \frac{\partial^2 D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)\,\partial(\lambda, \theta)^\top} \begin{pmatrix} \hat\lambda - \mathbf{0} \\ \hat\theta - \theta^* \end{pmatrix} + o_p(1)$$

$$= -\frac{1}{2} \Big[n^{-1/2} \frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big]^\top S^{-1} \Big[n^{-1/2} \frac{\partial D(\mathbf{0}, \theta^*)}{\partial(\lambda, \theta)}\Big] + o_p(1).$$

Let

$$\nu_1 = n^{-1/2} \Big[\frac{\partial D(\mathbf{0}, \theta^*)}{\partial\lambda}\Big], \quad \nu_2 = n^{-1/2} \Big[\frac{\partial D(\mathbf{0}, \theta^*)}{\partial\theta}\Big], \quad \Lambda = S_{\lambda\lambda} - S_{\lambda\theta} S_{\theta\theta}^{-1} S_{\lambda\theta}^\top, \quad D = (I, -S_{\lambda\theta} S_{\theta\theta}^{-1}),$$

with $D$ and the identity matrix $I$ of proper sizes. We then get

$$R_n = -\nu_2^\top S_{\theta\theta}^{-1} \nu_2 + (\nu_1^\top, \nu_2^\top) S^{-1} \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} + o_p(1) = \{\nu_1 - S_{\lambda\theta} S_{\theta\theta}^{-1} \nu_2\}^\top \Lambda^{-1} \{\nu_1 - S_{\lambda\theta} S_{\theta\theta}^{-1} \nu_2\} + o_p(1) = \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix}^\top (D^\top \Lambda^{-1} D) \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} + o_p(1),$$

where the middle step can be obtained via some typical matrix algebra or Theorem 8.5.11 in Harville (1997).

As given in the proof of Lemma 3.2, the asymptotic variance of $(\nu_1, \nu_2)$ is $V$.
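The middle step in the display for $R_n$ is a Schur-complement identity, and it can be checked numerically in the simplest setting where $\lambda$ and $\theta$ are both scalars. The sketch below is only such a sanity check: the function name and the numeric values are hypothetical, with `a`, `b`, `c` playing the roles of $S_{\lambda\lambda}$, $S_{\lambda\theta}$, $S_{\theta\theta}$ (so `a > 0` and `c < 0`).

```python
def quadratic_forms(a, b, c, v1, v2):
    """Scalar-block check of the identity used for R_n.

    Returns (lhs, rhs) where
      lhs = (v1, v2) S^{-1} (v1, v2)^T - v2^2 / c,  with S = [[a, b], [b, c]],
      rhs = (v1 - (b / c) v2)^2 / Lambda,           with Lambda = a - b^2 / c.
    """
    det = a * c - b * b
    # Quadratic form in S^{-1}, using the closed-form 2x2 inverse.
    full = (c * v1 * v1 - 2.0 * b * v1 * v2 + a * v2 * v2) / det
    lhs = full - v2 * v2 / c
    lam = a - b * b / c  # Schur complement, positive when a > 0 and c < 0
    resid = v1 - (b / c) * v2
    rhs = resid * resid / lam
    return lhs, rhs


lhs, rhs = quadratic_forms(2.0, 0.5, -1.0, 0.3, -0.7)
print(lhs, rhs)  # the two values agree up to rounding error
```

The right-hand side is a squared residual divided by a positive number, which makes the non-negativity of the limiting quadratic form visible.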
We also have

$$D V D^\top = D \Big[ \begin{pmatrix} S_{\lambda\lambda} & 0 \\ 0 & -S_{\theta\theta} \end{pmatrix} - S \begin{pmatrix} 0 & 0 \\ 0 & W \end{pmatrix} S \Big] D^\top = D \begin{pmatrix} S_{\lambda\lambda} & 0 \\ 0 & -S_{\theta\theta} \end{pmatrix} D^\top - 0 = S_{\lambda\lambda} - S_{\lambda\theta} S_{\theta\theta}^{-1} S_{\lambda\theta}^\top = \Lambda.$$

Hence,

$$V (D^\top \Lambda^{-1} D) V (D^\top \Lambda^{-1} D) V = V (D^\top \Lambda^{-1} D) V.$$

By the result on quadratic forms of the multivariate normal (section 3.5, Serfling (1980)), the limiting distribution of $R_n$ is chi-square with the degrees of freedom being the trace of $(D^\top \Lambda^{-1} D) V$, which is $l$ as claimed in this theorem. This completes the proof.

APPENDIX B: DEFINABILITY OF THE PROFILE LOG-EL

Discussions of the properties of the ELRT statistic are not meaningful if the profile log-EL $\tilde\ell_n(\xi)$ is not well defined. In fact, in some situations, the constrained maximization has no solution (Grendár and Judge, 2009). Such an "empty-set" problem can be an issue, but there are methods in the literature to overcome this obstacle (Chen et al., 2008; Liu and Chen, 2010; Tsao and Wu, 2014). In this Appendix, we show that our $\tilde\ell_n(\xi)$ does not suffer from the "empty-set" problem under two additional mild conditions. The first condition restricts our attention to quantile values $\{\xi_r : r \in I\}$ in the range

$$\min_j x_{rj} < \xi_r < \max_j x_{rj}.$$

The second requires one of the components of $q(x)$ to be monotone in $x$, in addition to a component being 1. All of our examples satisfy these conditions.

To define the profile log-EL $\tilde\ell_n(\xi)$, we must have some $p_{kj} > 0$ and $\theta_r$ such that

$$\sum_{k,j} p_{kj} \exp\big(\theta_r^\top q(x_{kj})\big) = 1, \quad r = 0, 1, \ldots, m,$$

$$\sum_{k,j} p_{kj} \exp\big(\theta_r^\top q(x_{kj})\big) [\mathbb{1}(x_{kj} \le \xi_r) - \tau_r] = 0, \quad r \in I.$$

We work on the most general case where $I$ contains all populations, and without loss of generality let $d = 2$.
The above expressions are equivalent to (including $r = 0$ and allowing $\theta_0 \ne \mathbf{0}$)

$$\sum_{k,j} p_{kj} \exp\big(\theta_r^\top q(x_{kj})\big) [\mathbb{1}(x_{kj} \le \xi_r)] = \tau_r, \qquad \sum_{k,j} p_{kj} \exp\big(\theta_r^\top q(x_{kj})\big) [\mathbb{1}(x_{kj} > \xi_r)] = 1 - \tau_r.$$

Let $\theta_r^\top = (\theta_{r1}, \theta_{r2})$, and $q^\top(x) = (q_1(x), q_2(x))$ where $q_1(x) \equiv 1$ and $q_2(x)$ is monotone in $x$. We can rewrite the equations as

$$\sum_{k,j} p_{kj} \exp\big\{\theta'_{r1} + \theta_{r2}[q_2(x_{kj}) - q_2(\xi_r)]\big\} [\mathbb{1}(x_{kj} \le \xi_r)] = \tau_r,$$

$$\sum_{k,j} p_{kj} \exp\big\{\theta'_{r1} + \theta_{r2}[q_2(x_{kj}) - q_2(\xi_r)]\big\} [\mathbb{1}(x_{kj} > \xi_r)] = 1 - \tau_r,$$

with $\theta'_{r1} = \theta_{r1} + \theta_{r2} q_2(\xi_r)$. For notational simplicity, we retain the notation $\theta_{r1}$ instead of $\theta'_{r1}$ in what follows.

Let $p^*_{kj}$ be any set of non-negative values such that $\sum_{k,j} p^*_{kj} = 1$. Define

$$A_r(\theta_{r2}) = \sum_{k,j} p^*_{kj} \exp\{\theta_{r2}[q_2(x_{kj}) - q_2(\xi_r)]\} [\mathbb{1}(x_{kj} \le \xi_r)],$$

$$B_r(\theta_{r2}) = \sum_{k,j} p^*_{kj} \exp\{\theta_{r2}[q_2(x_{kj}) - q_2(\xi_r)]\} [\mathbb{1}(x_{kj} > \xi_r)].$$

Since $q_2(x)$ is a monotone increasing function in $x$, $A_r(\theta_{r2})$ is decreasing in $\theta_{r2}$ and $B_r(\theta_{r2})$ is increasing in $\theta_{r2}$. Thus, we have

$$\lim_{\theta_{r2} \to -\infty} A_r(\theta_{r2}) = \infty, \quad \lim_{\theta_{r2} \to \infty} A_r(\theta_{r2}) = 0; \qquad \lim_{\theta_{r2} \to -\infty} B_r(\theta_{r2}) = 0, \quad \lim_{\theta_{r2} \to \infty} B_r(\theta_{r2}) = \infty.$$

These imply that the ratio $A_r(\theta_{r2})/B_r(\theta_{r2})$ is decreasing in $\theta_{r2}$ and that

$$\lim_{\theta_{r2} \to -\infty} A_r(\theta_{r2})/B_r(\theta_{r2}) = \infty, \qquad \lim_{\theta_{r2} \to \infty} A_r(\theta_{r2})/B_r(\theta_{r2}) = 0.$$

By the intermediate value theorem, there must exist a value $\theta^*_{r2}$ such that

$$A_r(\theta^*_{r2})/B_r(\theta^*_{r2}) = \tau_r/(1 - \tau_r).$$

Let $\theta^*_{r1} = -\log\{A_r(\theta^*_{r2}) + B_r(\theta^*_{r2})\}$. We note that $p^*_{kj}$ and $\theta^*_r = (\theta^*_{r1}, \theta^*_{r2})^\top$ form a solution to the system. Hence, a solution to the system always exists. We may shift the solution to set $\theta_0 = \mathbf{0}$ if required. Validity in the general case of $d > 2$ follows by setting the remaining components of $\theta_r$ to the value 0.

REFERENCES

J. Anderson. Multivariate logistic compounds. Biometrika, 66(1):17–26, 1979.

Y. G. Berger and C. J. Skinner. Variance estimation for a low income proportion. Journal of the Royal Statistical Society: Series C (Applied Statistics), 52(4):457–468, 2003.

S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

S. Cai. drmdel: Dual Empirical Likelihood Inference under Density Ratio Models in the Presence of Multiple Samples, 2015. URL https://CRAN.R-project.org/package=drmdel. R package version 1.3.1.

S. Cai, J. Chen, and J. V. Zidek. Hypothesis testing in the presence of multiple samples under density ratio models. Statistica Sinica, 27:761–783, 2017.

A. Castelló and R. Doménech. Human capital inequality and economic growth: Some new evidence. The Economic Journal, 112(478):C187–C200, 2002.

J. Chen and Y. Liu. Quantile and quantile-function estimations under density ratio model. The Annals of Statistics, 41(3):1669–1692, 2013.

J. Chen and Y. Liu. Small area quantile estimation. International Statistical Review, 87:S219–S238, 2019.

J. Chen, A. M. Variyath, and B. Abraham. Adjusted empirical likelihood and its properties. Journal of Computational and Graphical Statistics, 17(2):426–443, 2008.

J. Chen, P. Li, Y. Liu, and J. V. Zidek. Monitoring test under nonparametric random effects model. arXiv preprint arXiv:1610.05809, 2016.

S. X. Chen and P. Hall. Smoothed empirical likelihood confidence intervals for quantiles. The Annals of Statistics, 21(3):1166–1181, 1993.

M. Corak. The Canadian geography of intergenerational income mobility. The Economic Journal, 2019. URL https://doi.org/10.1093/ej/uez019.

V. De Oliveira and B. Kedem. Bayesian analysis of a density ratio model. Canadian Journal of Statistics, 45(3):274–289, 2017.

R. Durrett. Probability: Theory and Examples. Cambridge University Press, 2010.

K. Fokianos, B. Kedem, J. Qin, and D. A. Short. A semiparametric approach to the one-way layout. Technometrics, 43(1):56–65, 2001.

G. Folland. Advanced Calculus. Featured Titles for Advanced Calculus Series. Prentice Hall, 2002. ISBN 9780130652652. URL https://books.google.ca/books?id=iatzQgAACAAJ.

K. C. Gonçalves, H. S. Migon, and L. S. Bastos. Dynamic quantile linear models: A Bayesian approach. Bayesian Analysis, 15(2):335–262, 2020.

M. Grendár and G. Judge. Empty set problem of maximum empirical likelihood methods. Electronic Journal of Statistics, 3:1542–1555, 2009.

D. A. Harville. Matrix Algebra from a Statistician's Perspective, volume 1. Springer, 1997.

B. Hasselman. nleqslv: Solve Systems of Nonlinear Equations, 2018. URL https://CRAN.R-project.org/package=nleqslv. R package version 3.3.2.

D. L. Humphries, J. R. Behrman, B. T. Crookston, K. A. Dearden, W. Schott, and M. E. Penny. Households across all income quintiles, especially the poorest, increased animal source food expenditures substantially during recent Peruvian economic growth. PloS one, 9(11):e110961, 2014.

R. Koenker, V. Chernozhukov, X. He, and L. Peng. Handbook of Quantile Regression. CRC Press, 2017.

Y. Liu and J. Chen. Adjusted empirical likelihood with high-order precision. The Annals of Statistics, 38(3):1341–1362, 2010.

C. Muller. The measurement of poverty with geographical and intertemporal price dispersion: Evidence from Rwanda. Review of Income and Wealth, 54(1):27–49, 2008.

A. B. Owen. Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75(2):237–249, 1988.

A. B. Owen. Empirical Likelihood. Chapman & Hall/CRC, New York, 2001.

J. Qin. Empirical likelihood in biased sample problems. The Annals of Statistics, 21(3):1182–1196, 1993.

J. Qin. Inferences for case-control and semiparametric two-sample density ratio models. Biometrika, 85(3):619–630, 1998.

J. Qin. Biased Sampling, Over-identified Parameter Problems and Beyond. Springer, 2017.

J. Qin and J. Lawless. Empirical likelihood and general estimating equations. The Annals of Statistics, 22:300–325, 1994.

J. Qin and B. Zhang. A goodness-of-fit test for logistic regression models based on case-control data. Biometrika, 84(3):609–618, 1997.

R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2018.

R. J. Serfling. Approximation Theorems of Mathematical Statistics. Wiley, New York, 1980.

B. W. Silverman. Density Estimation for Statistics and Data Analysis, volume 26. CRC Press, 1986.

M. Sugiyama, T. Suzuki, and T. Kanamori. Density Ratio Estimation in Machine Learning. Cambridge University Press, 2012.

M. Tsao and F. Wu. Extended empirical likelihood for estimating equations. Biometrika, 101(3):703–710, 2014.

Y. Vardi. Nonparametric estimation in the presence of length bias. The Annals of Statistics, 10(2):616–620, 1982.

Y. Vardi. Empirical distributions in selection bias models. The Annals of Statistics, 13(1):178–203, 1985.

S. Verrill, D. E. Kretschmann, and J. W. Evans. Simulations of strength property monitoring tests. Unpublished manuscript. Forest Products Laboratory, Madison, Wisconsin, 2015.

T. A. Wunder. Income distribution and consumption driven growth: How consumption behaviors of the top two income quintiles help to explain the economy.

Journal ofEconomic Issues , 46(1):173–192, 2012.Y. Yang and X. He. Bayesian empirical likelihood for quantile regression.

The Annals ofStatistics , 40(2):1102–1131, 2012.W. Zhuang, B. Hu, and J. Chen. Semiparametric inference for the dominance index underthe density ratio model.

Biometrika , 106(1):229–241, 2019.