Publication


Featured research published by James O. Chipperfield.


Journal of Official Statistics | 2015

Using the Bootstrap to Account for Linkage Errors when Analysing Probabilistically Linked Categorical Data

James O. Chipperfield; Ray Chambers

Record linkage is the act of bringing together records that are believed to belong to the same unit (e.g., person or business) from two or more files. Record linkage is not an error-free process and can lead to linking a pair of records that do not belong to the same unit. This occurs because linking fields on the files, which ideally would uniquely identify each unit, are often imperfect. There has been an explosion of record linkage applications, particularly involving government agencies and in the field of health, yet there has been little work on making correct inference using such linked files. Naively treating a linked file as if it were linked without errors can lead to biased inferences. This article develops a method of making inferences for cross-tabulated variables when record linkage is not an error-free process. In particular, it develops a parametric bootstrap approach to estimation which can accommodate the sophisticated probabilistic record linkage techniques that are widely used in practice (e.g., 1-1 linkage). The article demonstrates the effectiveness of this method in a simulation and in a real application.


Journal of Official Statistics | 2014

Disclosure-Protected Inference with Linked Microdata Using a Remote Analysis Server

James O. Chipperfield

Large amounts of microdata are collected by data custodians in the form of censuses and administrative records. Often, data custodians will collect different information on the same individual. Many important questions can be answered by linking microdata collected by different data custodians. For this reason, there is very strong demand from analysts, within government, business, and universities, for linked microdata. However, many data custodians are legally obliged to ensure the risk of disclosing information about a person or organisation is acceptably low. Different authors have considered the problem of how to facilitate reliable statistical inference from analysis of linked microdata while ensuring that the risk of disclosure is acceptably low. This article considers the problem from the perspective of an Integrating Authority that, by definition, is trusted to link the microdata and to facilitate analysts’ access to the linked microdata via a remote server, which allows analysts to fit models and view the statistical output without being able to observe the underlying linked microdata. One disclosure risk that must be managed by an Integrating Authority is that one data custodian may use the microdata it supplied to the Integrating Authority and statistical output released from the remote server to disclose information about a person or organisation that was supplied by the other data custodian. This article considers analysis of only binary variables. The utility and disclosure risk of the proposed method are investigated both in a simulation and using a real example. This article shows that some popular protections against disclosure (dropping records, rounding regression coefficients or imposing restrictions on model selection) can be ineffective in the above setting.


Archive | 2014

Raising the Capability of Producers and Users of Official Statistics

Sharleen Forbes; John Harraway; James O. Chipperfield; Siu-Ming Tam

In both Australia and New Zealand, the National Statistics Offices have developed strong partnerships with academics to raise statistical capability. Both offices recognise the importance of good methodology to underpin official statistics. However, the main target group for Statistics New Zealand (SNZ) has been external users of official statistics, whereas the Australian Bureau of Statistics (ABS) has focused on its own statistical methodologists (producers) and on advancing research. This chapter outlines sets of initiatives from both agencies. SNZ has actively focussed on raising the statistical capability of key groups of users, including schools, small businesses, government, the media and Maori. It has established a network of academics in official statistics whose members are involved in the design, implementation, delivery and assessment of courses for qualification as well as presenting short (1- or 2-day) courses. The ABS places strong emphasis on the recruitment, training and grooming of young methodologists to become leaders in their chosen field of research, and its focus is on collaboration with the university sector and academics to help with this and to foster ABS research. Other initiatives undertaken in both organisations are also briefly mentioned, including the Census AtSchool project.


Journal of Multivariate Analysis | 2012

Multivariate random effect models with complete and incomplete data

James O. Chipperfield; David G Steel

This paper considers the problem of estimating fixed effects, random effects and variance components for the multivariate random effects model with complete and incomplete data. It also considers making inferences about fixed and random effects, a problem which requires careful consideration of the choice of degrees of freedom to use in confidence intervals. This paper uses the EM algorithm to maximise the hierarchical likelihood (HL). The HL estimates are often the same as the REML and Bayesian-justified estimates in Shah et al. (1997) [10]. A key benefit of the h-likelihood approach is its simplicity: it does not require integrating over the random effects or use of priors for its justification. Another benefit is that all inference can be made within a single framework. Extensive simulations show that the h-likelihood approach is significantly more accurate than the well-known ANOVA approach, often recovers much of the information lost through missing data, and has good coverage properties for fixed and random effects that are estimated using small samples.


Journal of Applied Statistics | 2018

Split Questionnaire Designs: collecting only the data that you need through MCAR and MAR designs

James O. Chipperfield; Margo Barr; David G Steel

We call a sample design that allows for different patterns, or sets, of data items to be collected from different sample units a Split Questionnaire Design (SQD). SQDs can be thought of as incorporating missing data into survey design. This paper examines the situation where data that are not collected by an SQD can be treated as Missing Completely At Random or Missing At Random, targets are regression coefficients in a generalised linear model fitted to binary variables, and targets are estimated using Maximum Likelihood. A key finding is that it can be easy to measure the relative contribution of a respondent to the accuracy of estimated model parameters before collecting all of the respondent's model covariates. We show empirically and theoretically that we could achieve a significant reduction in respondent burden with a negligible impact on the accuracy of estimates by not collecting model covariates from respondents who we identify as contributing little to the accuracy of estimates. We discuss the general implications for SQDs.


Privacy in Statistical Databases | 2016

A New Algorithm for Protecting Aggregate Business Microdata via a Remote System

Yue Ma; Yan-Xia Lin; James O. Chipperfield; John Newman; Victoria Leaver

Releasing business microdata is a challenging problem for many statistical agencies. Businesses with distinct continuous characteristics, such as extremely high income, could easily be identified, yet these businesses are normally included in surveys representing the population. In order to provide data users with useful statistics while maintaining confidentiality, some statistical agencies have developed online-based tools to allow users to specify and request tables created from microdata. These tools only release perturbed cell values generated from automatic output perturbation algorithms in order to protect each underlying observation against various attacks, such as differencing attacks. One such perturbation algorithm was proposed by Thompson et al. (2013). That algorithm focuses largely on reducing disclosure risks while paying little attention to data utility. As a result, it has limitations, including a limited scope of applicable cells and uncontrolled utility loss. In this paper we introduce a new algorithm for generating perturbed cell values. By comparison, the new algorithm allows more control over utility loss, could achieve better utility-disclosure trade-offs in many cases, and is conjectured to be applicable to a wider scope of cells.


Journal of Official Statistics | 2009

Design and estimation for split questionnaire surveys

James O. Chipperfield; David G Steel


International Statistical Review | 2013

A Summary of Attack Methods and Confidentiality Protection Measures for Fully Automated Remote Analysis Systems

Christine M. O'Keefe; James O. Chipperfield


Journal of Statistical Planning and Inference | 2011

Efficiency of split questionnaire surveys

James O. Chipperfield; David G Steel


Journal of the Royal Statistical Society Series A: Statistics in Society | 2010

Embedded experiments in repeated and overlapping surveys

James O. Chipperfield; Philip Bell

Collaboration


Dive into James O. Chipperfield's collaboration.

Top Co-Authors

David G Steel

University of Wollongong

Bindi Kindermann

Australian Bureau of Statistics

Christine M. O'Keefe

Commonwealth Scientific and Industrial Research Organisation

Margo Barr

University of Wollongong

Noel Hansen

Australian Bureau of Statistics

Peter Rossiter

Australian Bureau of Statistics

Victoria Leaver

Australian Bureau of Statistics

Sharleen Forbes

Victoria University of Wellington

Jeffrey Wright

Australian Bureau of Statistics
