Satkartar K. Kinney
Research Triangle Park
Publication
Featured research published by Satkartar K. Kinney.
Epidemiology | 2011
Satkartar K. Kinney
In a recent editorial, the editors of EPIDEMIOLOGY invited authors to make their analytic and simulation codes, questionnaires, and data used in analyses available to other researchers. Much has been written about the need for data sharing and reproducible research, and many journals and funding agencies have explicit data-sharing policies. When data are confidential, however, investigators typically cannot release them as collected, because doing so could reveal data subjects’ identities or values of sensitive attributes, thereby violating ethical and potentially legal obligations to protect confidentiality. At first glance, safely sharing confidential data seems a straightforward task: simply strip unique identifiers such as names, addresses, and identification numbers before releasing the data. However, these actions alone may not suffice when other identifying variables, such as geographic or demographic data, remain in the file. These quasi-identifiers can be used to match units in the released data to other databases. For example, Sweeney showed that 97% of the records in publicly available voter registration lists for Cambridge, MA, could be uniquely identified using birth date and 9-digit zip code. By matching the information in these lists, she was able to identify Massachusetts Governor William Weld in an anonymized medical database. As the amount of information readily available to the public continues to expand (eg, via the Internet and private companies), investigators releasing large-scale epidemiologic data run the risk of similar breaches. In this commentary, we present a primer on techniques for sharing confidential data. We classify techniques into 2 broad classes: restricted access, in which access is provided only to trusted users, and restricted data, in which the original data are somehow altered before sharing.
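The re-identification risk described in this abstract can be illustrated with a minimal sketch that counts how many records are unique on a chosen set of quasi-identifiers. The function name `unique_fraction`, the field names, and the toy records are hypothetical and are not taken from the commentary; this is one simple risk screen, not the authors' method.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Return the share of records whose quasi-identifier combination is unique."""
    if not records:
        return 0.0
    # Build one key per record from the selected quasi-identifier fields.
    keys = [tuple(rec[q] for q in quasi_identifiers) for rec in records]
    counts = Counter(keys)
    # A record is "unique" if no other record shares its key.
    n_unique = sum(1 for key in keys if counts[key] == 1)
    return n_unique / len(records)


# Hypothetical toy file with direct identifiers already stripped.
released = [
    {"birth_date": "1950-01-01", "zip": "02138", "diagnosis": "A"},
    {"birth_date": "1950-01-01", "zip": "02138", "diagnosis": "B"},
    {"birth_date": "1962-03-15", "zip": "02139", "diagnosis": "C"},
    {"birth_date": "1971-07-04", "zip": "02139", "diagnosis": "A"},
]

share = unique_fraction(released, ["birth_date", "zip"])
print(share)
```

Records that are unique on the quasi-identifiers are the ones most exposed to matching against an external list that carries the same fields together with names.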
Statistical Journal of the IAOS | 2014
Satkartar K. Kinney; Javier Miranda
In most countries, national statistical agencies do not release establishment-level business microdata, because doing so represents too large a risk to establishments’ confidentiality. Agencies potentially can manage these risks by releasing synthetic microdata, i.e., individual establishment records simulated from statistical models designed to mimic the joint distribution of the underlying observed data. Previously, we used this approach to generate a public-use version of the U.S. Census Bureau’s Longitudinal Business Database (LBD), a longitudinal census of establishments dating back to 1976; that synthetic file is now publicly available. While the synthetic LBD has proven to be a useful product, we now seek to improve and expand it by using new synthesis models and adding features. This article describes our efforts to create the second generation of the SynLBD, including synthesis procedures that we believe could be replicated in other contexts.
Journal of Statistical Theory and Practice | 2009
Satkartar K. Kinney
Multiple imputation is a common approach for handling missing data. It allows users to make valid inferences using standard complete-data methods with simple combining rules. A variation is to partition the missing data into two portions and conduct the imputation in two stages. We review two-stage multiple imputation and existing inferential methods and derive an alternative reference F-distribution for large-sample hypothesis testing for high-dimensional estimands. We also derive formulas for estimating rates of missing information.
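As a minimal sketch of the "simple combining rules" mentioned in this abstract, the following applies the standard single-stage rules (Rubin's rules) to a scalar estimand. It does not implement the two-stage rules or the reference F-distribution derived in the paper; the function name `combine_rubin` and the input numbers are hypothetical.

```python
from statistics import mean, variance


def combine_rubin(estimates, variances):
    """Combine m completed-data results for a scalar estimand (single-stage MI).

    estimates: point estimates, one per imputed data set
    variances: the corresponding completed-data variance estimates
    Returns (pooled point estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching lists with at least two imputations")
    q_bar = mean(estimates)    # pooled point estimate
    u_bar = mean(variances)    # average within-imputation variance
    b = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical results from m = 3 imputed data sets.
q_bar, total = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, total)
```

The between-imputation term is inflated by (1 + 1/m) to account for using a finite number of imputations.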
Archive | 2008
Satkartar K. Kinney; David B. Dunson
Random effects models are widely used in analyzing dependent data, which are collected routinely in a broad variety of application areas. For example, longitudinal studies collect repeated observations for each study subject, while multi-center studies collect data for patients nested within study centers. In such settings, it is natural to suppose that dependence arises due to the impact of important unmeasured predictors that may interact with measured predictors. This viewpoint naturally leads to random effects models in which the regression coefficients vary across the different subjects. In this chapter, we use the term “subject” broadly to refer to the independent experimental units. For example, in longitudinal studies, the subjects are the individuals under study, while in multi-center studies the subjects correspond to the study centers.
Biometrics | 2007
Satkartar K. Kinney; David B. Dunson
Survey Methodology | 2006
Trivellore E. Raghunathan; Satkartar K. Kinney
International Statistical Review | 2011
Satkartar K. Kinney; Arnold P. Reznek; Javier Miranda; Ron S. Jarmin; John M. Abowd
International Statistical Review | 2011
Lawrence H. Cox; Alan F. Karr; Satkartar K. Kinney
Journal of Official Statistics | 2012
Satkartar K. Kinney