Publication


Featured research published by B.F.M. Bakker.


Statistical journal of the IAOS | 2014

The System of social statistical datasets of Statistics Netherlands: An integral approach to the production of register-based social statistics

B.F.M. Bakker; J. van Rooijen; L. van Toor

More and more countries are using register data to replace traditional censuses. Moreover, official statistics as well as research are increasingly based on register data or combinations of survey and register data. Register-based statistics offer wonderful new opportunities. At the same time, they require a new approach to how data are processed and managed. In this article, we present the System of social statistical datasets (SSD), a system of interlinked and standardized registers and surveys. All production processes within Statistics Netherlands that pertain to social or spatial statistics converge in the SSD, which thus constitutes a shared output-oriented system. The SSD contains a wealth of information on persons, households, jobs, benefits, pensions, education, hospitalizations, crime reports, dwellings, vehicles and more. In the Netherlands it is the most important source for official social statistics and, because the data are available on request by means of remote access, also very popular in the social sciences. This article describes the contents of the SSD as well as the underlying process and organization, and demonstrates its possibilities.


The Annals of Applied Statistics | 2012

People born in the Middle East but residing in the Netherlands: invariant population size estimates and the role of active and passive covariates

P.G.M. Van der Heijden; J. Whittaker; M.J.L.F. Cruyff; B.F.M. Bakker; R. van der Vliet

Including covariates in loglinear models of population registers improves population size estimates for two reasons. First, it is possible to take heterogeneity of inclusion probabilities over the levels of a covariate into account; and second, it allows subdivision of the estimated population by the levels of the covariates, giving insight into characteristics of individuals that are not included in any of the registers. The issue of whether or not marginalizing the full table of registers by covariates over one or more covariates leaves the population size estimate invariant is intimately related to collapsibility of contingency tables [Biometrika 70 (1983) 567–578]. We show that, with information from two registers, population size invariance is equivalent to the simultaneous collapsibility of each margin consisting of one register and the covariates. We give a short path characterization of the loglinear model which describes when marginalizing over a covariate leads to different population size estimates. Covariates that are collapsible are called passive, to distinguish them from covariates that are not collapsible and are termed active. We make the case that it can be useful to include passive covariates within the estimation model, because they allow a finer description of the population in terms of these covariates. As an example we discuss the estimation of the population size of people born in the Middle East but residing in the Netherlands.
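
The invariance question in this abstract can be illustrated with a minimal sketch. It assumes two registers A and B, a single covariate with two levels, and the standard two-register estimator under independence. All counts and the names `dual_system_estimate`, `stratified_and_collapsed`, `active_like` and `passive_like` are hypothetical and are not taken from the paper. The sketch compares the sum of per-level estimates with the estimate from the table collapsed over the covariate.

```python
import math


def dual_system_estimate(both, only_a, only_b):
    """Two-register population size estimate assuming independent registers."""
    if both <= 0:
        raise ValueError("need at least one unit observed in both registers")
    missed = only_a * only_b / both  # estimated count missed by both registers
    return both + only_a + only_b + missed


def stratified_and_collapsed(strata):
    """Return (sum of per-level estimates, estimate from the collapsed table)."""
    stratified = sum(dual_system_estimate(*counts) for counts in strata.values())
    # Collapse over the covariate by summing each cell across its levels.
    collapsed_counts = [sum(column) for column in zip(*strata.values())]
    collapsed = dual_system_estimate(*collapsed_counts)
    return stratified, collapsed


# Hypothetical counts per covariate level: (in both, only in A, only in B).
# In active_like, inclusion in both registers varies with the covariate.
active_like = {"level_1": (60, 40, 20), "level_2": (30, 10, 60)}
# In passive_like, the share of A that is also in B is 0.6 at both levels.
passive_like = {"level_1": (60, 40, 20), "level_2": (30, 20, 30)}

for name, strata in (("active_like", active_like), ("passive_like", passive_like)):
    stratified, collapsed = stratified_and_collapsed(strata)
    verdict = "invariant" if math.isclose(stratified, collapsed) else "not invariant"
    print(name, round(stratified, 2), round(collapsed, 2), verdict)
```

In this toy setting the covariate in `passive_like` can be dropped without changing the total, while keeping it still gives the per-level breakdown that the abstract argues is useful.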


Journal of Official Statistics | 2015

Sensitivity of Population Size Estimation for Violating Parametric Assumptions in Log-linear Models

S.C. Gerritse; Peter G. M. van der Heijden; B.F.M. Bakker

An important quality aspect of censuses is the degree of coverage of the population. When administrative registers are available, undercoverage can be estimated via capture-recapture methodology. The standard approach uses the log-linear model that relies on the assumption that being in the first register is independent of being in the second register. In models using covariates, this assumption of independence is relaxed into independence conditional on covariates. In this article we describe, in a general setting, how sensitivity analyses can be carried out to assess the robustness of the population size estimate. We make use of log-linear Poisson regression using an offset, to simulate departure from the model. This approach can be extended to the case where we have covariates observed in both registers, and to a model with covariates observed in only one register. The robustness of the population size estimate is a function of implied coverage: when implied coverage is low, robustness is low. We conclude that it is important for researchers to investigate and report the estimated robustness of their population size estimate for quality reasons. Extensions are made to log-linear modeling with more than two registers and to the multiplier method.
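
The link between implied coverage and robustness can be sketched for the two-register case. This is an illustration under stated assumptions, not the authors' implementation: instead of fitting a Poisson regression with an offset, it uses the closed-form two-register estimate and expresses departure from independence as a fixed odds ratio `odds_ratio` between the registers (1.0 means independence). The function names and all counts are hypothetical.

```python
def population_estimate(both, only_a, only_b, odds_ratio=1.0):
    """Two-register estimate with dependence fixed at a given odds ratio.

    With odds_ratio = (both * missed) / (only_a * only_b), solving for the
    unobserved cell gives missed = odds_ratio * only_a * only_b / both.
    """
    if both <= 0 or odds_ratio <= 0:
        raise ValueError("both and odds_ratio must be positive")
    missed = odds_ratio * only_a * only_b / both
    return both + only_a + only_b + missed


def implied_coverage(both, only_a, only_b, odds_ratio=1.0):
    """Share of the estimated population that is observed in some register."""
    observed = both + only_a + only_b
    return observed / population_estimate(both, only_a, only_b, odds_ratio)


def relative_change(both, only_a, only_b, odds_ratio):
    """Relative shift of the estimate when independence is replaced by odds_ratio."""
    base = population_estimate(both, only_a, only_b)
    shifted = population_estimate(both, only_a, only_b, odds_ratio)
    return (shifted - base) / base


# Hypothetical tables: (in both, only in A, only in B).
high_coverage = (900, 50, 50)
low_coverage = (100, 400, 400)

for label, counts in (("high", high_coverage), ("low", low_coverage)):
    print(label,
          "coverage:", round(implied_coverage(*counts), 3),
          "change at odds ratio 2:", round(relative_change(*counts, 2.0), 4))
```

With these made-up counts, the same departure from independence moves the estimate far more in the low-coverage table than in the high-coverage one, which is the pattern the abstract describes.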


Statistical journal of the IAOS | 2015

Different methods to complete datasets used for capture-recapture estimation: Estimating the number of usual residents in the Netherlands

S.C. Gerritse; B.F.M. Bakker; P.G.M. Van der Heijden

We are interested in an estimate of the usual residents in the Netherlands. Capture-recapture estimation with three registers enables us to estimate the size of the total population, of which the usual residents are a part. However, usual residence cannot be used as a covariate because it is not available in one of the registers. We approach this as a missing data problem. There are different methods available to handle missing data. In this manuscript we use the Expectation Maximization (EM) algorithm and Predictive Mean Matching (PMM). The EM algorithm is often used in categorical data analysis, but PMM has the advantage of flexibility in the choice of a specific part of the observed data used for the imputation of the missing data. Four scenarios have been identified where the missing data are completed via either the EM algorithm or PMM imputation, resulting in different population size estimates for usual residence. It was found that the different scenarios lead to different population size estimates. Even small changes in the completed data lead to different population size estimates. In this study PMM imputation performs best in terms of flexibility and is theoretically better motivated.


Journal of Official Statistics | 2018

An Overview of Population Size Estimation where Linking Registers Results in Incomplete Covariates, with an Application to Mode of Transport of Serious Road Casualties

Peter G. M. van der Heijden; Paul Smith; M.J.L.F. Cruyff; B.F.M. Bakker

We consider the linkage of two or more registers in the situation where the registers do not cover the whole target population, and relevant categorical auxiliary variables (unique to one of the registers; although different variables could be present on each register) are available in addition to the usual matching variable(s). The linked registers therefore do not contain full information on either the observations (often individuals) or the variables. By treating this as a missing data problem it is possible to construct a linked data set, adjusted to estimate the part of the population missed by both registers, and containing completed covariate information for all the registers. This is achieved using an Expectation-Maximization (EM)-algorithm. We elucidate the properties of this approach where the model is appropriate and in situations corresponding with real applications in official statistics, and also where the model conditions are violated. The approach is applied to data on road accidents in the Netherlands, where the cause of the accident is denoted by the police and by the hospital. Here the cause of the accident denoted by the police is considered as missing information for the statistical units only registered by the hospital, and the other way around. The method needs to be widely applied to give a better impression of the range of problems where it can be beneficial.


Statistical journal of the IAOS | 2017

Reconciliation of inconsistent data sources by correction for measurement error: The feasibility of parameter re-use

Paulina Pankowska; B.F.M. Bakker; Daniel L. Oberski; D. Pavlopoulos

National Statistical Institutes (NSIs) often obtain information about a single variable from separate data sources. Administrative registers and surveys, in particular, often provide overlapping information on a range of phenomena of interest to official statistics. However, even though the two sources overlap, they both contain measurement error that prevents identical units from yielding identical values. Reconciling such separate data sources and providing accurate statistics, which is an important challenge for NSIs, is typically achieved through macro-integration. In this study we investigate the feasibility of an alternative method based on the application of previously obtained results from a recently introduced extension of the Hidden Markov Model (HMM) to newer data. The method allows a reconciliation of separate error-prone data sources without having to repeat the full HMM analysis, provided the estimated measurement error processes are stable over time. As we find that these processes are indeed stable over time, the proposed method can be used effectively for macro-integration, to reconcile both first-order statistics (e.g. the size of temporary employment in the Netherlands) and second-order statistics (e.g. the amount of mobility from temporary to permanent employment).


Statistical journal of the IAOS | 2014

The impact of nonresponse on survey quality

Jelke Bethlehem; B.F.M. Bakker

Almost every survey suffers from nonresponse. Nonresponse rates are particularly high for voluntary surveys. The problem of nonresponse is that it affects the representativity of the survey results, and therefore causes estimates to be biased. Theoretically, it is possible to correct these estimates, but this requires sufficient auxiliary information. Unfortunately, such information is not always available. This paper discusses a number of issues and developments.


Cahiers | 2005

Verdacht van criminaliteit: Allochtonen en autochtonen nader bekeken

M. Blom; J. Oudhof; R.V. Bijl; B.F.M. Bakker


Statistica Neerlandica | 2012

Estimating the validity of administrative variables

B.F.M. Bakker


Archive | 2005

Verdacht van criminaliteit

M. Blom; J. Oudhof; R.V. Bijl; B.F.M. Bakker

Collaboration


Dive into B.F.M. Bakker's collaboration.

Top Co-Authors

J.H. Smit

VU University Amsterdam

Aslan Zorlu

University of Amsterdam

Barbara M. Bakker

University Medical Center Groningen