Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Rina Foygel Barber is active.

Publication


Featured research published by Rina Foygel Barber.


Annals of Statistics | 2015

Controlling the false discovery rate via knockoffs

Rina Foygel Barber; Emmanuel J. Candès

In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are truly associated with the response. At the same time, we need to know that the false discovery rate (FDR) - the expected fraction of false discoveries among all discoveries - is not too high, in order to assure the scientist that most of the discoveries are indeed true and replicable. This paper introduces the knockoff filter, a new variable selection procedure controlling the FDR in the statistical linear model whenever there are at least as many observations as variables. This method achieves exact FDR control in finite sample settings no matter the design or covariates, the number of variables in the model, or the amplitudes of the unknown regression coefficients, and does not require any knowledge of the noise level. As the name suggests, the method operates by manufacturing knockoff variables that are cheap - their construction does not require any new data - and are designed to mimic the correlation structure found within the existing variables, in a way that allows for accurate FDR control, beyond what is possible with permutation-based methods. The method of knockoffs is very general and flexible, and can work with a broad class of test statistics. We test the method in combination with statistics from the Lasso for sparse regression, and obtain empirical results showing that the resulting method has far more power than existing selection rules when the proportion of null variables is high.
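As a concrete illustration, the data-dependent selection threshold at the heart of the knockoff filter can be sketched as below. The statistics W_j (large and positive when a variable appears more important than its knockoff) and the knockoff+ form of the threshold are assumptions drawn from the broader knockoff literature, not details spelled out in this abstract:

```python
def knockoff_threshold(W, q):
    """Smallest t such that the estimated false discovery proportion
    (1 + #{j : W_j <= -t}) / max(#{j : W_j >= t}, 1) is at most q."""
    candidates = sorted(abs(w) for w in W if w != 0)
    for t in candidates:
        neg = sum(1 for w in W if w <= -t)
        pos = sum(1 for w in W if w >= t)
        if (1 + neg) / max(pos, 1) <= q:
            return t
    return float("inf")  # no feasible threshold: select nothing

# Toy statistics: signals tend to have large positive W_j,
# nulls are symmetric around zero.
W = [3.0, 2.5, 2.0, 1.5, -0.5, 0.3, -0.2, 1.0, 0.8, -0.1]
t = knockoff_threshold(W, q=0.3)
selected = [j for j, w in enumerate(W) if w >= t]
```

Variables with W_j at or above the threshold are selected; the "+1" in the numerator is what makes the finite-sample FDR guarantee exact.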


Electronic Journal of Statistics | 2015

High-dimensional Ising model selection with Bayesian information criteria

Rina Foygel Barber; Mathias Drton

We consider the use of Bayesian information criteria for selection of the graph underlying an Ising model. In an Ising model, the full conditional distributions of each variable form logistic regression models, and variable selection techniques for regression allow one to identify the neighborhood of each node and, thus, the entire graph. We prove high-dimensional consistency results for this pseudo-likelihood approach to graph selection when using Bayesian information criteria for the variable selection problems in the logistic regressions. The results pertain to scenarios of sparsity, and, following related prior work, the information criteria we consider incorporate an explicit prior that encourages sparsity.
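The extended-BIC-style score behind this kind of graph selection can be sketched as follows. This uses the common Chen and Chen form of the extra log(p) penalty; the exact constants in the paper may differ, so treat the 2γ factor as an assumption:

```python
import math

def ebic(loglik, num_params, n, p, gamma=0.5):
    """Extended BIC score (smaller is better) for a fitted logistic
    neighborhood model with `num_params` selected coefficients,
    sample size n, and p candidate variables.  gamma = 0 recovers
    the classical BIC; gamma > 0 adds an extra log(p) penalty per
    parameter, encouraging sparsity when p >> n."""
    return (-2.0 * loglik
            + num_params * math.log(n)
            + 2.0 * gamma * num_params * math.log(p))

# Toy comparison: a sparser neighborhood with a slightly worse fit
# can still win under EBIC in a high-dimensional regime.
score_dense  = ebic(loglik=-100.0, num_params=10, n=50, p=1000)
score_sparse = ebic(loglik=-105.0, num_params=3,  n=50, p=1000)
```

In the pseudo-likelihood approach, a score of this kind is minimized separately for each node's logistic regression, and the selected neighborhoods are combined into a graph estimate.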


Geology | 2014

Long-term accumulation of carbonate shells reflects a 100-fold drop in loss rate

Adam Tomašových; Susan M. Kidwell; Rina Foygel Barber; Darrell S. Kaufman

Shells in modern seabeds can be thousands of years old, far older than would be extrapolated from the rapid rates of shell loss detected in short-term experiments. An extensive shell-dating program on the Southern California (USA) shelf permits rigorous modeling of the dynamics of shell loss in the mixed layer, discriminating the key rates of carbonate disintegration and sequestration for the first time. We find that bivalve shells experience an initially high disintegration rate λ1 (~decadal half-lives) but shift abruptly, within the first ~500 yr postmortem, to a 100-fold lower disintegration rate λ2 (~millennial half-lives) at sequestration rate τ (burial and/or diagenetic stabilization). This drop permits accrual of a long tail of very old shells even when sequestration is very slow, and allows only a minority (<1%) of all shells to survive the first phase. These high rates of disintegration and low rates of sequestration are consistent with independent measures of high carbonate loss and slow sedimentation on this shelf. Our two-phase model thus reveals significant spatial and temporal partitioning of carbonate loss rates within the mixed layer, and shows how shell age-frequency distributions can yield rigorous and realistic estimates of carbonate recycling on geological time scales.
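A minimal sketch of such a two-phase loss model, with a closed-form survival function, is below. The specific rate values are illustrative assumptions chosen only to echo the decadal vs. millennial half-lives described above, not estimates from the paper:

```python
import math

def survival(t, lam1, lam2, tau):
    """P(a shell is still present at age t) under a two-phase model:
    shells start in the active layer, where they disintegrate at rate
    lam1 or are sequestered at rate tau; once sequestered they
    disintegrate at the much slower rate lam2.  (Closed form of the
    corresponding two-compartment exponential model; assumes
    lam1 + tau != lam2.)"""
    a = lam1 + tau
    return (math.exp(-a * t)
            + tau / (a - lam2) * (math.exp(-lam2 * t) - math.exp(-a * t)))

# Illustrative rates: ~10 yr half-life in phase 1, ~1000 yr in phase 2.
lam1 = math.log(2) / 10
lam2 = math.log(2) / 1000
tau = 5e-4  # slow sequestration (hypothetical value)

# Fraction of shells that escape the first phase via sequestration.
frac_sequestered = tau / (lam1 + tau)
```

With rates of this order, well under 1% of shells escape the initial phase, yet the survivors persist for millennia, producing the long tail of very old shells in the age-frequency distribution.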


Physics in Medicine and Biology | 2016

An algorithm for constrained one-step inversion of spectral CT data

Rina Foygel Barber; Emil Y. Sidky; Taly Gilat Schmidt; Xiaochuan Pan

We develop a primal-dual algorithm that allows for one-step inversion of spectral CT transmission photon counts data to a basis map decomposition. The algorithm allows for image constraints to be enforced on the basis maps during the inversion. The derivation of the algorithm makes use of a local upper bounding quadratic approximation to generate descent steps for non-convex spectral CT data discrepancy terms, combined with a new convex-concave optimization algorithm. Convergence of the algorithm is demonstrated on simulated spectral CT data. Simulations with noise and anthropomorphic phantoms show examples of how to employ the constrained one-step algorithm for spectral CT data.


Paleobiology | 2016

Inferring skeletal production from time-averaged assemblages: skeletal loss pulls the timing of production pulses towards the modern period

Adam Tomašových; Susan M. Kidwell; Rina Foygel Barber

Age-frequency distributions of dead skeletal material on the landscape or seabed—information on the time that has elapsed since the death of individuals—provide decadal- to millennial-scale perspectives both on the history of production and on the processes that lead to skeletal disintegration and burial. So far, however, models quantifying the dynamics of skeletal loss have assumed that skeletal production is constant during time-averaged accumulation. Here, to improve inferences in conservation paleobiology and historical ecology, we evaluate the joint effects of temporally variable production and skeletal loss on postmortem age-frequency distributions (AFDs) to determine how to detect fluctuations in production over the recent past from AFDs. We show that, relative to the true timing of past production pulses, the modes of AFDs will be shifted to younger age cohorts, causing the true age of past pulses to be underestimated. This shift in the apparent timing of a past pulse in production will be stronger where loss rates are high and/or the rate of decline in production is slow; also, a single pulse coupled with a declining loss rate can, under some circumstances, generate a bimodal distribution. We apply these models to death assemblages of the bivalve Nuculana taphria from the Southern California continental shelf, finding that: (1) an onshore-offshore gradient in time averaging is dominated by a gradient in the timing of production, reflecting the tracking of shallow-water habitats under a sea-level rise, rather than by a gradient in disintegration and sequestration rates, which remain constant with water depth; and (2) loss-corrected model-based estimates of the timing of past production are in good agreement with likely past changes in local production based on an independent sea-level curve.


Journal of the American Statistical Association | 2017

Accumulation Tests for FDR Control in Ordered Hypothesis Testing

Ang Li; Rina Foygel Barber

Multiple testing problems arising in modern scientific applications can involve simultaneously testing thousands or even millions of hypotheses, with relatively few true signals. In this article, we consider the multiple testing problem where prior information is available (for instance, from an earlier study under different experimental conditions), which can allow us to test the hypotheses as a ranked list to increase the number of discoveries. Given an ordered list of n hypotheses, the aim is to select a data-dependent cutoff k and declare the first k hypotheses to be statistically significant while bounding the false discovery rate (FDR). Generalizing several existing methods, we develop a family of “accumulation tests” to choose a cutoff k that adapts to the amount of signal at the top of the ranked list. We introduce a new method in this family, the HingeExp method, which offers higher power to detect true signals compared to existing techniques. Our theoretical results prove that these methods control a modified FDR on finite samples, and characterize the power of the methods in the family. We apply the tests to simulated data, including a high-dimensional model selection problem for linear regression. We also compare accumulation tests to existing methods for multiple testing on a real data problem of identifying differential gene expression over a dosage gradient. Supplementary materials for this article are available online.
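The family's cutoff rule can be sketched generically as follows. The ForwardStop accumulation function used here is one well-known member of the family (the paper's HingeExp function differs), and the p-values are made up purely for illustration:

```python
import math

def accumulation_cutoff(pvals, h, alpha):
    """Largest k such that the running average (1/k) * sum_{i<=k} h(p_i)
    stays at or below alpha; returns 0 if no such k exists.  pvals must
    be in the prior ranking's order, most promising hypotheses first."""
    best, total = 0, 0.0
    for k, p in enumerate(pvals, start=1):
        total += h(p)
        if total / k <= alpha:
            best = k
    return best

# ForwardStop's accumulation function: small p-values contribute
# little, large (null-looking) p-values accumulate quickly.
h_forward = lambda p: math.log(1.0 / (1.0 - p))

# Ordered p-values: strong signals at the top, nulls below.
pvals = [0.001, 0.002, 0.01, 0.03, 0.5, 0.8, 0.6, 0.9]
k = accumulation_cutoff(pvals, h_forward, alpha=0.2)
```

The first k hypotheses in the ranking are then declared significant; choosing a different accumulation function h trades off robustness against power at the top of the list.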


Journal of The Royal Statistical Society Series B-statistical Methodology | 2017

EigenPrism: inference for high dimensional signal‐to‐noise ratios

Lucas Janson; Rina Foygel Barber; Emmanuel J. Candès

Consider the following three important problems in statistical inference, namely, constructing confidence intervals for (1) the error of a high-dimensional (p > n) regression estimator, (2) the linear regression noise level, and (3) the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the ℓ2-norm of the signal in high-dimensional linear regression. We derive a novel procedure for this, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. The procedure, called EigenPrism, is computationally fast and makes no assumptions on coefficient sparsity or knowledge of the noise level. We investigate the width of the EigenPrism confidence intervals, including a comparison with a Bayesian setting in which our interval is just 5% wider than the Bayes credible interval. We are then able to unify the three aforementioned problems by showing that the EigenPrism procedure with only minor modifications is able to make important contributions to all three. We also investigate the robustness of coverage and find that the method applies in practice and in finite samples much more widely than just the case of multivariate Gaussian covariates. Finally, we apply EigenPrism to a genetic dataset to estimate the genetic signal-to-noise ratio for a number of continuous phenotypes.


Annals of Statistics | 2018

ROCKET: Robust confidence intervals via Kendall’s tau for transelliptical graphical models

Rina Foygel Barber; Mladen Kolar

Undirected graphical models are used extensively in the biological and social sciences to encode a pattern of conditional independences between variables, where the absence of an edge between two nodes indicates that the corresponding pair of variables is conditionally independent given the remaining variables.


IEEE Transactions on Medical Imaging | 2017

A Spectral CT Method to Directly Estimate Basis Material Maps From Experimental Photon-Counting Data

Taly Gilat Schmidt; Rina Foygel Barber; Emil Y. Sidky



Electronic Journal of Statistics | 2017

The function-on-scalar LASSO with applications to longitudinal GWAS

Rina Foygel Barber; Matthew Reimherr; Thomas Schill


Collaboration


Dive into Rina Foygel Barber's collaborations.

Top Co-Authors

Aaditya Ramdas

Carnegie Mellon University


Ang Li

University of Chicago
