Ray Chambers
University of Wollongong
Publications
Featured research published by Ray Chambers.
Journal of the American Statistical Association | 1986
Ray Chambers
Outliers in sample data are a perennial problem for applied survey statisticians. Moreover, it is a problem for which traditional sample survey theory offers no real solution, beyond the sensible advice that such sample elements should not be weighted to their fullest extent in estimation. Sample outliers can be classified into two basic types. Here we are concerned with the first type, which may conveniently be termed representative outliers. These are sample elements with values that have been correctly recorded and that cannot be assumed to be unique; that is, there is no good reason to assume there are no more similar outliers in the nonsampled part of the target population. The remaining sample outliers, which by default are termed nonrepresentative, are sample elements whose data values are incorrect or unique in some sense. Methods for dealing with these nonrepresentative outliers lie basically within the scope of survey editing and imputation theory and are, therefore, not considered in ...
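The practical point, that a representative outlier should keep its own value but not its full expansion weight, can be illustrated with a winsorized expansion estimator. A minimal numpy sketch, with a made-up cutoff and data; this is a standard device in this literature, not necessarily the paper's own estimator:

```python
import numpy as np

def winsorized_total(y, w, cutoff):
    """Winsorized expansion estimator of a population total.

    Values up to `cutoff` are expanded by their full survey weight w_i;
    the excess (y_i - cutoff) of an outlying value enters only once
    (weight 1) instead of being expanded.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    excess = np.clip(y - cutoff, 0.0, None)
    return np.sum(w * np.minimum(y, cutoff)) + np.sum(excess)

# Illustrative data: one large but correctly recorded (representative) outlier.
y = np.array([12.0, 9.5, 11.2, 10.8, 250.0])
w = np.full(5, 20.0)                         # equal design weights
print(winsorized_total(y, w, cutoff=50.0))   # tempered estimate
print(np.sum(w * y))                         # ordinary expansion estimate
```

The outlier still contributes its full recorded value, but its excess over the cutoff is not multiplied up to the nonsampled part of the population.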
Archive | 2012
Ray Chambers; Robert Graham Clark
PART I: BASICS OF MODEL-BASED SURVEY INFERENCE
1. Introduction
2. The Model-Based Approach
3. Homogeneous Populations
4. Stratified Populations
5. Populations with Regression Structure
6. Clustered Populations
7. The General Linear Population Model
PART II: ROBUST MODEL-BASED INFERENCE
8. Robust Prediction under Model Misspecification
9. Robust Estimation of the Prediction Variance
10. Outlier Robust Prediction
PART III: APPLICATIONS OF MODEL-BASED SURVEY INFERENCE
11. Inference for Nonlinear Population Parameters
12. Survey Inference via Sub-Sampling
13. Estimation for Multipurpose Surveys
14. Inference for Domains
15. Prediction for Small Areas
16. Model-Based Inference for Distributions and Quantiles
17. Using Transformations in Sample Survey Inference
Exercises
Journal of Business & Economic Statistics | 1997
Philip Kokic; Ray Chambers; Jens Breckling; Steve Beare
Suppose there are data available on the value of business output, as measured by a single variable y, and on the values of the corresponding inputs x, with the relationship between y and x determined by an appropriately chosen production function. It is shown how M-quantile regression methods can be used to construct a performance measure that allows a meaningful comparison of the production performance of the businesses. The method is illustrated using survey data collected from farm businesses in the pastoral zone of New South Wales and southern Queensland between 1978 and 1987.
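A sketch of the core computation, assuming the usual Huber-type M-quantile machinery: each business is scored by the quantile index q at which the fitted M-quantile surface passes closest to its observed output. The grid, the tuning constant c = 1.345 and the toy data are illustrative choices, not taken from the paper:

```python
import numpy as np

def huber_psi(r, c=1.345):
    return np.clip(r, -c, c)

def mquantile_fit(X, y, q, c=1.345, tol=1e-8, max_iter=200):
    """M-quantile regression of y on X (X includes an intercept column),
    fitted by iteratively reweighted least squares with a Huber psi."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12     # robust residual scale
        rs = np.where(np.abs(r / s) < 1e-8, 1e-8, r / s)
        a = np.where(r > 0, q, 1 - q)                 # asymmetric weighting
        w = 2 * a * huber_psi(rs, c) / rs             # psi_q(r)/r weights
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def performance_scores(X, y, grid=np.linspace(0.05, 0.95, 19)):
    """Assign each unit the M-quantile index q whose fitted surface passes
    closest to its observed value: higher q = better relative performer."""
    fits = np.column_stack([X @ mquantile_fit(X, y, q) for q in grid])
    return grid[np.argmin(np.abs(fits - y[:, None]), axis=1)]

# Toy illustration: output y versus a single input x
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, 200)
y = 2.0 * x + rng.normal(0, 1, 200)
X = np.column_stack([np.ones_like(x), x])
print(performance_scores(X, y)[:10])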
International Statistical Review | 1994
J. U. Breckling; Ray Chambers; A. H. Dorfman; S. M. Tam; A. H. Welsh
In this paper we present a general theory for maximum likelihood inference based on sample survey data. Our purpose is to identify and emphasise the recurring basic concepts that arise in the application of likelihood methods, including the estimation of precision, to survey data. We discuss the problems generated by the effects of sample design, selection and response processes. We also discuss the problem of failures of the model assumptions and the role of sample inclusion probabilities in achieving robustness. We present two illustrative examples, one of which involves the use of non-Gaussian models.
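The role the abstract gives to inclusion probabilities can be made concrete with the design-weighted (pseudo-) log-likelihood, shown here only as a hedged illustration from the surrounding literature; the paper itself develops full likelihood theory rather than this shortcut:

```latex
\hat{\ell}_w(\theta) = \sum_{i \in s} \frac{1}{\pi_i}\,\log f(y_i;\theta),
\qquad
\hat{\theta}_w = \arg\max_{\theta}\, \hat{\ell}_w(\theta)
```

where s is the achieved sample, \pi_i is the inclusion probability of unit i and f is the assumed population density. Weighting by 1/\pi_i protects point estimation against informative selection when the model fails, which is one sense in which inclusion probabilities buy robustness.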
Statistical Methods and Applications | 2008
Nikos Tzavidis; Nicola Salvati; Monica Pratesi; Ray Chambers
Over the last decade there has been growing demand for estimates of population characteristics at small area level. Unfortunately, cost constraints in the design of sample surveys lead to small sample sizes within these areas, and as a result direct estimation, using only the survey data, is inappropriate since it yields estimates with unacceptable levels of precision. Small area models are designed to tackle the small sample size problem. The most popular class of models for small area estimation is random effects models that include random area effects to account for between-area variation. However, such models also depend on strong distributional assumptions, require a formal specification of the random part of the model and do not easily allow for outlier robust inference. An alternative approach to small area estimation that is based on the use of M-quantile models was recently proposed by Chambers and Tzavidis (Biometrika 93(2):255–268, 2006) and Tzavidis and Chambers (Robust prediction of small area means and distributions. Working paper, 2007). Unlike traditional random effects models, M-quantile models do not depend on strong distributional assumptions and automatically provide outlier robust inference. In this paper we illustrate for the first time how M-quantile models can be practically employed for deriving small area estimates of poverty and inequality. The methodology we propose improves on traditional poverty mapping methods in the following ways: (a) it enables estimation of the distribution function of the study variable within the small area of interest under both an M-quantile and a random effects model, (b) it provides analytical, instead of empirical, estimation of the mean squared error of the M-quantile small area mean estimates and (c) it employs an outlier-robust estimation method. The methodology is applied to data from the 2002 Living Standards Measurement Survey (LSMS) in Albania to estimate (a) the district-level incidence of poverty in Albania, (b) district-level inequality measures and (c) the distribution function of household per-capita consumption expenditure in each district. Small area estimates of poverty and inequality show that the poorest Albanian districts are in the mountainous regions (north and north east), with the wealthiest districts, which are also linked with high levels of inequality, in the coastal (south west) and southern parts of the country. We discuss the practical advantages of our methodology and note the consistency of our results with results from previous studies. We further demonstrate the usefulness of the M-quantile estimation framework through design-based simulations based on two realistic survey data sets containing small area information, and show that the M-quantile approach may be preferable when the aim is to estimate the small area distribution function.
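In outline, the M-quantile small area mean works as follows; a hedged, simplified sketch that reuses mquantile_fit() and performance_scores() from the earlier sketch, with a plain plug-in predictor rather than the paper's exact estimators:

```python
import numpy as np
# assumes mquantile_fit() and performance_scores() from the earlier sketch

def mq_small_area_means(X_s, y_s, area_s, X_r, area_r):
    """Chambers-Tzavidis style small area means (simplified sketch).

    X_s, y_s, area_s : covariates, outcome and area labels for sampled units
    X_r, area_r      : covariates and area labels for nonsampled units
    """
    q = performance_scores(X_s, y_s)            # unit-level M-quantile indices
    means = {}
    for j in np.unique(area_s):
        theta_j = q[area_s == j].mean()         # area M-quantile coefficient
        beta_j = mquantile_fit(X_s, y_s, theta_j)
        y_pred = X_r[area_r == j] @ beta_j      # predict nonsampled units
        N_j = (area_s == j).sum() + (area_r == j).sum()
        means[j] = (y_s[area_s == j].sum() + y_pred.sum()) / N_j
    return means
```

No area-level random effect is specified or estimated: the area's position in the conditional distribution of y given x (theta_j) plays that role, which is what makes the approach distribution-free and outlier robust.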
Ecological Inference : New Methodological Strategies, | 2004
David G Steel; Eric J. Beh; Ray Chambers
Ecological analysis involves using aggregate data for a set of groups to make inferences concerning individual level relationships. Typically the data available for analysis consist of the means or totals of variables of interest for geographical areas, although the groups can be organisations such as schools or hospitals. Attention has focused mainly on estimating the parameters characterising the individual level relationships across the whole population, although in some cases the relationships for each of the groups are also of interest. Applying standard methods used to analyse individual level data, such as linear or logistic regression or contingency table analysis, to aggregate data will usually produce biased estimates of individual level relationships. Thus much of the effort in ecological analysis has concentrated on developing methods of analysing aggregate data that can produce unbiased, or less biased, parameter estimates. There has been less work done on inference procedures, such as constructing confidence intervals and hypothesis testing. Fundamental to these inferential issues is the question of how much information is contained in aggregate data and what evidence such data can provide concerning important assumptions and hypotheses.
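A minimal simulation of the bias being described (all numbers invented for illustration): regressing group means on group means overstates the individual-level effect whenever group composition is correlated with a group-level effect.

```python
import numpy as np

rng = np.random.default_rng(0)
G, n = 200, 100                        # groups, individuals per group

# Individual-level model: P(y = 1) = 0.2 + 0.3 * x (true slope 0.3),
# plus a group effect that is correlated with group composition p_g.
p_g = rng.uniform(0.1, 0.9, G)         # P(x = 1) in group g
u_g = 0.3 * (p_g - 0.5)                # confounding group effect

xbar, ybar = np.empty(G), np.empty(G)
for g in range(G):
    x = rng.random(n) < p_g[g]
    y = rng.random(n) < np.clip(0.2 + 0.3 * x + u_g[g], 0, 1)
    xbar[g], ybar[g] = x.mean(), y.mean()

# Ecological regression of group means recovers roughly 0.6, not 0.3:
# the between-group confounding is absorbed into the slope.
print(f"ecological slope: {np.polyfit(xbar, ybar, 1)[0]:.2f}")
```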
Journal of The Royal Statistical Society Series A-statistics in Society | 2001
Ray Chambers; David G Steel
This paper considers inference about the individual level relationship between two dichotomous variables based on aggregated data. It is known that such analyses suffer from ‘ecological bias’, caused by the lack of homogeneity of this relationship across the groups over which the aggregation occurs. Two new methods for overcoming this bias, one based on local smoothing and the other a simple semiparametric approach, are developed and evaluated. The local smoothing approach performs best when it is used with a covariate which accounts for some of the variation in the relationships across groups. The semiparametric approach performed well in our evaluation even without such auxiliary information.
Computational Statistics & Data Analysis | 2010
Nicola Salvati; Hukum Chandra; M. Giovanna Ranalli; Ray Chambers
Nonparametric regression is widely used as a method of characterizing a non-linear relationship between a variable of interest and a set of covariates. Practical application of nonparametric regression methods in the field of small area estimation is fairly recent, and has so far focussed on the use of empirical best linear unbiased prediction under a model that combines a penalized spline (p-spline) fit and random area effects. The concept of model-based direct estimation is used to develop an alternative nonparametric approach to estimation of a small area mean. The suggested estimator is a weighted average of the sample values from the area, with weights derived from a linear regression model with random area effects extended to incorporate a smooth, nonparametrically specified trend. Estimation of the mean squared error of the proposed small area estimator is also discussed. Monte Carlo simulations based on both simulated and real datasets show that the proposed model-based direct estimator and its associated mean squared error estimator perform well. They are worth considering in small area estimation applications where the underlying population regression relationships are non-linear or have a complicated functional form.
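The smoothing component can be sketched with a penalized truncated-linear spline; a hedged illustration only, since the paper embeds the smooth in a linear mixed model with random area effects, and the knot placement and penalty below are arbitrary choices:

```python
import numpy as np

def pspline_fit(x, y, n_knots=15, lam=1.0):
    """Penalized truncated-linear spline smoother (p-spline sketch)."""
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    Z = np.clip(x[:, None] - knots[None, :], 0.0, None)   # spline basis
    C = np.column_stack([np.ones_like(x), x, Z])
    P = np.diag([0.0, 0.0] + [lam] * n_knots)             # penalize Z only
    coef = np.linalg.solve(C.T @ C + P, C.T @ y)
    return C @ coef

# Toy nonlinear relationship
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 300))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 300)
fitted = pspline_fit(x, y)
```

In the paper's setting the fit is linear in y, so each small area mean can be written as a weighted average of the sample values, which is what makes the estimator "model-based direct".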
Computational Statistics & Data Analysis | 2012
Ray Chambers
Most probability-based methods used to link records from two distinct data sets corresponding to the same target population do not lead to perfect linkage, i.e. there are linkage errors in the merged data. Further, the linkage is often incomplete, in the sense that many records in the two data sets remain unmatched at the completion of the linkage process. This paper introduces methods that correct for the biases due to linkage errors and incomplete linkage when carrying out regression analysis using linked data. In particular, it focuses on the case where one of the linked data sets is a sample from the target population and the other is a register, i.e. it covers the entire target population.
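One simple moment-based correction of the kind discussed, sketched under an assumed exchangeable-errors model within matching blocks; this is an illustration from this strand of literature, not necessarily the paper's preferred estimator:

```python
import numpy as np

def linkage_corrected_ols(X, y_star, blocks, lam):
    """Linear regression corrected for linkage errors (hedged sketch).

    Within each matching block of size m, a record is assumed linked
    correctly with probability lam and to any other record in its block
    with probability (1 - lam)/(m - 1), so E[y*] = A X beta with A
    block-diagonal. Solving X' A X beta = X' y* removes the bias that
    naive OLS on the linked data y* would suffer.
    """
    n = X.shape[0]
    A = np.zeros((n, n))
    for b in np.unique(blocks):
        idx = np.flatnonzero(blocks == b)
        m = idx.size
        if m == 1:
            A[idx, idx] = 1.0          # singleton block: link always correct
            continue
        Ab = np.full((m, m), (1 - lam) / (m - 1))
        np.fill_diagonal(Ab, lam)
        A[np.ix_(idx, idx)] = Ab
    return np.linalg.solve(X.T @ A @ X, X.T @ y_star)
```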
Journal of Computational and Graphical Statistics | 2013
Ray Chambers; Hukum Chandra
Random effects models for hierarchically dependent data, for example, clustered data, are widely used. A popular bootstrap method for such data is the parametric bootstrap based on the same random effects model as that used in inference. However, it is hard to justify this type of bootstrap when this model is known to be an approximation. In this article, we describe a random effect block bootstrap approach for clustered data that is simple to implement, free of both the distribution and the dependence assumptions of the parametric bootstrap, and is consistent when the mixed model assumptions are valid. Results based on Monte Carlo simulation show that the proposed method seems robust to failure of the dependence assumptions of the assumed mixed model. An application to a realistic environmental dataset indicates that the method produces sensible results. Supplementary materials for the article, including the data used for the application, are available online.
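A simplified sketch of the resampling step, assuming a fitted two-level model with estimated fixed effects beta_hat and empirical residuals at both levels; function and variable names are ours, and the paper's version also rescales the residuals, which is omitted here:

```python
import numpy as np

def block_bootstrap_y(X, beta_hat, u_hat, e_hat, clusters, rng):
    """Generate one bootstrap outcome vector by resampling residuals in
    whole cluster blocks.

    u_hat : estimated cluster-level residuals, indexed by integer cluster label
    e_hat : unit-level residuals, aligned with the rows of X
    """
    labels = np.unique(clusters)
    donors = rng.choice(labels, size=labels.size, replace=True)
    y_boot = np.empty(X.shape[0])
    for lab, donor in zip(labels, donors):
        idx = np.flatnonzero(clusters == lab)
        pool = np.flatnonzero(clusters == donor)
        # draw unit-level residuals from the donor block, keeping their
        # empirical (possibly non-Gaussian, dependent) distribution
        e_star = rng.choice(e_hat[pool], size=idx.size, replace=True)
        y_boot[idx] = X[idx] @ beta_hat + u_hat[donor] + e_star
    return y_boot
```

Repeating this and refitting the model on each y_boot yields bootstrap distributions for any statistic of interest without assuming Gaussian random effects.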