Publications

Featured research published by Sarah C. Emerson.


Journal of the American Society of Nephrology | 2012

Imperfect Gold Standards for Kidney Injury Biomarker Evaluation

Sushrut S. Waikar; Rebecca A. Betensky; Sarah C. Emerson; Joseph V. Bonventre

Clinicians have used serum creatinine in diagnostic testing for acute kidney injury for decades, despite its imperfect sensitivity and specificity. Novel tubular injury biomarkers may revolutionize the diagnosis of acute kidney injury; however, even if a novel tubular injury biomarker is 100% sensitive and 100% specific, it may appear inaccurate when using serum creatinine as the gold standard. Acute kidney injury, as defined by serum creatinine, may not reflect tubular injury, and the absence of changes in serum creatinine does not assure the absence of tubular injury. In general, the apparent diagnostic performance of a biomarker depends not only on its ability to detect injury, but also on disease prevalence and the sensitivity and specificity of the imperfect gold standard. Assuming that, at a certain cutoff value, serum creatinine is 80% sensitive and 90% specific and disease prevalence is 10%, a new perfect biomarker with a true 100% sensitivity may seem to have only 47% sensitivity compared with serum creatinine as the gold standard. Minimizing misclassification by using more strict criteria to diagnose acute kidney injury will reduce the error when evaluating the performance of a biomarker under investigation. Apparent diagnostic errors using a new biomarker may be a reflection of errors in the imperfect gold standard itself, rather than poor performance of the biomarker. The results of this study suggest that small changes in serum creatinine alone should not be used to define acute kidney injury in biomarker or interventional studies.
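
The 47% figure follows from straightforward conditional probability. Below is a minimal sketch in Python (not from the paper; it assumes the biomarker and the gold standard misclassify independently given true disease status) that reproduces the quoted number.

```python
# Minimal sketch (assumption: biomarker and gold standard err independently
# given true disease status); reproduces the abstract's worked example.

prevalence = 0.10   # P(true acute kidney injury)
se_gold = 0.80      # sensitivity of serum creatinine at the chosen cutoff
sp_gold = 0.90      # specificity of serum creatinine

# The perfect biomarker is positive exactly when true injury is present.
p_gold_pos = prevalence * se_gold + (1 - prevalence) * (1 - sp_gold)
p_both_pos = prevalence * se_gold   # true injury and both tests positive

apparent_sensitivity = p_both_pos / p_gold_pos
print(f"Apparent sensitivity vs. creatinine: {apparent_sensitivity:.1%}")  # ~47.1%
```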


Electronic Journal of Statistics | 2009

Calibration of the empirical likelihood method for a vector mean

Sarah C. Emerson; Art B. Owen

The empirical likelihood method is a versatile approach for testing hypotheses and constructing confidence regions in a non-parametric setting. For testing the value of a vector mean, the empirical likelihood method offers the benefit of making no distributional assumptions beyond some mild moment conditions. However, in small samples or high dimensions the method is very poorly calibrated, producing tests that generally have a much higher type I error than the nominal level, and it suffers from a limiting convex hull constraint. Methods to address the performance of the empirical likelihood in the vector mean setting have been proposed in a number of papers, including a contribution by Chen et al. (2008) that suggests supplementing the observed dataset with an artificial data point. We examine the consequences of this approach and describe a limitation of their method that we have discovered in settings where the sample size is relatively small compared with the dimension. We propose a new modification to the extra data approach that involves adding two points and changing the location of the extra points. We explore the benefits that this modification offers, and show that it results in better calibration, particularly in difficult cases. This new approach also results in a small-sample connection between the modified empirical likelihood method and Hotelling’s T-square test. We show that varying the location of the added data points creates a continuum of tests that range from the unmodified empirical likelihood statistic to Hotelling’s T-square statistic.
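
For context, the baseline statistic the paper builds on can be computed from Owen's dual formulation. The sketch below implements only the unmodified empirical likelihood ratio test for a vector mean, not the paper's two-point modification; the pseudo-logarithm and chi-square calibration are standard devices, and the helper names are my own.

```python
# Unmodified empirical likelihood ratio test for a vector mean (Owen's dual).

import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def log_star(t, n):
    """log(t) for t >= 1/n, quadratic extension below (keeps objective finite)."""
    out = np.empty_like(t)
    ok = t >= 1.0 / n
    out[ok] = np.log(t[ok])
    u = t[~ok] * n - 1.0                      # Taylor expansion of log around 1/n
    out[~ok] = np.log(1.0 / n) + u - 0.5 * u**2
    return out

def el_test(x, mu0):
    """Return (-2 log EL ratio, chi-square p-value) for H0: E[X] = mu0."""
    z = np.asarray(x) - np.asarray(mu0)       # centered observations
    n, d = z.shape
    # lambda maximizes sum(log(1 + lambda' z_i)); minimize the negative
    obj = lambda lam: -np.sum(log_star(1.0 + z @ lam, n))
    lam = minimize(obj, np.zeros(d), method="BFGS").x
    stat = 2.0 * np.sum(log_star(1.0 + z @ lam, n))
    return stat, chi2.sf(stat, df=d)

rng = np.random.default_rng(0)
x = rng.standard_normal((25, 3))              # small n, moderate d: poor calibration
stat, p = el_test(x, mu0=np.zeros(3))
print(f"-2 log R = {stat:.2f}, p = {p:.3f}")
```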


Statistics in Medicine | 2011

Comments on ‘Adaptive increase in sample size when interim results are promising: A practical guide with examples’

Scott S. Emerson; Gregory P. Levin; Sarah C. Emerson

In their paper [1], Drs. Mehta and Pocock illustrate the use of a particular approach to revising the maximal sample size of a randomized clinical trial (RCT) by using an interim estimate of the treatment effect. Slightly extending the results of Gao, Ware, and Mehta [2], the authors define conditions on an adaptive rule such that one can know that the naive statistical hypothesis test that ignores the adaptation is conservative. They then use this knowledge to define an adaptive rule for a clinical trial. In our review of this paper, however, we do not find that such an adaptive rule confers any advantage by the usual criteria for clinical trial design. Rather, we find that the designs proposed in this paper are markedly inferior to alternative designs that the authors do not (but should) consider.

By way of full disclosure, the first author of this commentary provided to the authors a signed referee’s report on an earlier version of this manuscript, and that report contained the substance (and most of the detail) of this review. In the comments to the editor accompanying that review, the first author described the dilemma that arose during that review. In essence, the methods described in the manuscript do not seem to us worthy of emulation. But on the other hand, the purpose of case studies in the statistical literature is to present an academic exposition of lessons that can be learned.

From years of recreational spelunking, we have noted parallels between research and cave exploration. In both processes, explorers spend their time in the dark exploring the maze of potential leads, most often without a clear idea of where they will end up. Because the overwhelming majority of such leads are dead ends, the most useful companions to have along with you are the ones who will willingly explore the dead ends. However, they rapidly become the least useful companions if they have a tendency to explore the dead ends and then come back and tell you the leads went somewhere. Furthermore, the most important skill that any explorers can have is the ability to recognize when they are back at the beginning, lest they believe that the promising lead took them someplace new and become hopelessly lost. According to these criteria, then, the fact that we would not adopt some approach does not necessarily detract from the importance of a paper to the statistical literature. Instead, a paper’s value relates to the extent to which it contributes to our understanding of the methods, which can often be greatly enhanced by identifying dead ends and/or leads that take us back to the beginning.

We note that there are several levels to what could be called the “recommended approach” in this paper. At the topmost level, it can be viewed merely as advocating the use of adaptive designs to assess the likelihood of future futility and efficacy of a clinical trial. But in illustrating that use, the authors seem also to advocate for adaptive methods resulting in sampling distributions that are less “heavy tailed” than analogous fixed sample designs (so that they can safely use naive analytic approaches), and they seem to fall prey to some of the difficulties in interpreting conditional power. We note that


Statistics in Medicine | 2011

Exploring the benefits of adaptive sequential designs in time-to-event endpoint settings

Sarah C. Emerson; Kyle Rudser; Scott S. Emerson

Sequential analysis is frequently employed to address ethical and financial issues in clinical trials. Sequential analysis may be performed using standard group sequential designs, or, more recently, with adaptive designs that use estimates of treatment effect to modify the maximal statistical information to be collected. In the general setting in which statistical information and clinical trial costs are functions of the number of subjects used, it has yet to be established whether there is any major efficiency advantage to adaptive designs over traditional group sequential designs. In survival analysis, however, statistical information (and hence efficiency) is most closely related to the observed number of events, while trial costs still depend on the number of patients accrued. As the number of subjects may dominate the cost of a trial, an adaptive design that specifies a reduced maximal possible sample size when an extreme treatment effect has been observed may allow early termination of accrual and therefore a more cost-efficient trial. We investigate and compare the tradeoffs between efficiency (as measured by average number of observed events required), power, and cost (a function of the number of subjects accrued and length of observation) for standard group sequential methods and an adaptive design that allows for early termination of accrual. We find that when certain trial design parameters are constrained, an adaptive approach to terminating subject accrual may improve upon the cost efficiency of a group sequential clinical trial investigating time-to-event endpoints. However, when the spectrum of group sequential designs considered is broadened, the advantage of the adaptive designs is less clear.
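
The tension the paper exploits (statistical information scales with events, while trial cost scales with subjects) can be illustrated with Schoenfeld's classic approximation for the required number of events. The sketch below is not from the paper; the hazard rate, accrual period, and hazard ratio are illustrative assumptions.

```python
# Events drive information; subjects and follow-up drive cost.

import numpy as np
from scipy.stats import norm

def required_events(hr, alpha=0.05, power=0.9):
    """Schoenfeld approximation, two-sided test, 1:1 allocation."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 4 * z**2 / np.log(hr) ** 2

def event_probability(hazard, accrual, follow_up):
    """P(event observed) with uniform accrual over [0, accrual], analysis at
    calendar time accrual + follow_up, and exponential survival."""
    a, f = accrual, follow_up
    return 1 - (np.exp(-hazard * f) - np.exp(-hazard * (a + f))) / (hazard * a)

d = required_events(hr=0.7)                   # ~330 events regardless of accrual
for follow_up in (1.0, 2.0, 4.0):
    # average event probability across control (hazard 0.2) and treatment arms
    p_event = 0.5 * (event_probability(0.2, 3.0, follow_up)
                     + event_probability(0.2 * 0.7, 3.0, follow_up))
    print(f"follow-up {follow_up}: ~{d / p_event:.0f} subjects for {d:.0f} events")
```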


Infection and Immunity | 2014

The Type II Secretion Pathway in Vibrio cholerae Is Characterized by Growth Phase-Dependent Expression of Exoprotein Genes and Is Positively Regulated by σE

Ryszard A. Zielke; Ryan S. Simmons; Bo R. Park; Mariko Nonogaki; Sarah C. Emerson; Aleksandra E. Sikora

Vibrio cholerae, an etiological agent of cholera, circulates between aquatic reservoirs and the human gastrointestinal tract. The type II secretion (T2S) system plays a pivotal role in both stages of the lifestyle by exporting multiple proteins, including cholera toxin. Here, we studied the kinetics of expression of genes encoding the T2S system and its cargo proteins. We have found that under laboratory growth conditions, the T2S complex was continuously expressed throughout V. cholerae growth, whereas there was growth phase-dependent transcriptional activity of genes encoding different cargo proteins. Moreover, exposure of V. cholerae to different environmental cues encountered by the bacterium in its life cycle induced transcriptional expression of T2S. Subsequent screening of a V. cholerae genomic library suggested that σE stress response, phosphate metabolism, and the second messenger 3′,5′-cyclic diguanylic acid (c-di-GMP) are involved in regulating transcriptional expression of T2S. Focusing on σE, we discovered that the upstream region of the T2S operon possesses both the consensus σE and σ70 signatures, and deletion of the σE binding sequence prevented transcriptional activation of T2S by RpoE. Ectopic overexpression of σE stimulated transcription of T2S in wild-type and isogenic ΔrpoE strains of V. cholerae, providing additional support for the idea that the T2S complex belongs to the σE regulon. Together, our results suggest that the T2S pathway is characterized by the growth phase-dependent expression of genes encoding cargo proteins and requires a multifactorial regulatory network to ensure appropriate kinetics of the secretory traffic and the fitness of V. cholerae in different ecological niches.


PLOS ONE | 2012

Length Bias Correction in Gene Ontology Enrichment Analysis Using Logistic Regression

Gu Mi; Yanming Di; Sarah C. Emerson; Jason S. Cumbie; Jeff H. Chang

When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called “length bias”, will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
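
The adjustment is simple to express in code. The sketch below simulates a length-confounded dataset and fits both the unadjusted and length-adjusted logistic regressions with statsmodels; the variable names, effect sizes, and data-generating model are illustrative assumptions, not the paper's.

```python
# Length bias correction via logistic regression with a length covariate.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_genes = 5000
log_len = rng.normal(7.5, 0.7, size=n_genes)          # log transcript lengths
# assumption: GO membership correlates with length (the confounding)
go_member = rng.binomial(1, 1 / (1 + np.exp(-(log_len - 8.0))))
# assumption: longer transcripts are more likely to be called DE (length bias)
de_flag = rng.binomial(1, 1 / (1 + np.exp(-(-3.5 + 0.4 * log_len))))

df = pd.DataFrame({"de": de_flag, "go": go_member, "log_len": log_len})

# Unadjusted enrichment test: subject to length bias.
unadjusted = smf.logit("de ~ go", data=df).fit(disp=False)
# Length-adjusted test: the `go` coefficient now measures enrichment
# conditional on transcript length.
adjusted = smf.logit("de ~ go + log_len", data=df).fit(disp=False)

print("unadjusted p:", unadjusted.pvalues["go"])
print("adjusted p:  ", adjusted.pvalues["go"])
```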


Environmental Entomology | 2012

The Statistical Analysis of Insect Phenology

Paul A. Murtaugh; Sarah C. Emerson; Peter B. McEvoy; Kimberley M. Higgs

We introduce two simple methods for the statistical comparison of the temporal pattern of life-cycle events between two populations. The methods are based on a translation of stage-frequency data into individual ‘times in stage’. For example, if the stage-k individuals in a set of samples consist of three individuals counted at time t1 and two counted at time t2, the observed times in stage k would be (t1, t1, t1, t2, t2). Times in stage then can be compared between two populations by performing stage-specific t-tests or by testing for equality of regression lines of time versus stage between the two populations. Simulations show that our methods perform at close to the nominal level, have good power against a range of alternatives, and have much better operating characteristics than a widely used phenology model from the literature.
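
A minimal sketch of the times-in-stage translation and a stage-specific t-test follows; the counts and sampling times are made up for illustration, and scipy supplies the two-sample test.

```python
# Expand stage-frequency data into times in stage, then compare populations.

import numpy as np
from scipy.stats import ttest_ind

def times_in_stage(counts, times):
    """Expand stage-frequency counts into individual times in stage.

    counts[j] individuals observed in the stage at sampling time times[j],
    e.g. counts=[3, 2], times=[t1, t2] -> [t1, t1, t1, t2, t2].
    """
    return np.repeat(times, counts)

# Stage-k samples for two hypothetical populations.
pop_a = times_in_stage(counts=[3, 2, 1], times=[10.0, 14.0, 18.0])
pop_b = times_in_stage(counts=[1, 4, 3], times=[10.0, 14.0, 18.0])

t_stat, p_value = ttest_ind(pop_a, pop_b)
print(f"stage-k t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
```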


Statistical Applications in Genetics and Molecular Biology | 2013

Higher order asymptotics for negative binomial regression inferences from RNA-sequencing data

Yanming Di; Sarah C. Emerson; Daniel W. Schafer; Jeffrey A. Kimbrel; Jeff H. Chang

RNA sequencing (RNA-Seq) is the current method of choice for characterizing transcriptomes and quantifying gene expression changes. This next generation sequencing-based method provides unprecedented depth and resolution. The negative binomial (NB) probability distribution has been shown to be a useful model for frequencies of mapped RNA-Seq reads and consequently provides a basis for statistical analysis of gene expression. Negative binomial exact tests are available for two-group comparisons but do not extend to negative binomial regression analysis, which is important for examining gene expression as a function of explanatory variables and for adjusted group comparisons accounting for other factors. We address the adequacy of available large-sample tests for the small sample sizes typically available from RNA-Seq studies and consider a higher-order asymptotic (HOA) adjustment to likelihood ratio tests. We demonstrate that 1) the HOA-adjusted likelihood ratio test is practically indistinguishable from the exact test in situations where the exact test is available, 2) the type I error of the HOA test matches the nominal specification in regression settings we examined via simulation, and 3) the power of the likelihood ratio test does not appear to be affected by the HOA adjustment. This work helps clarify the accuracy of the unadjusted likelihood ratio test and the degree of improvement available with the HOA adjustment. Furthermore, the HOA test may be preferable even when the exact test is available because it does not require ad hoc library size adjustments.
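
For reference, the unadjusted likelihood ratio test that the HOA correction refines can be carried out with statsmodels. The sketch below uses simulated toy counts; the HOA adjustment itself is specific to the paper and is not reproduced here.

```python
# Unadjusted likelihood ratio test for one coefficient in NB regression.

import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(2)
n = 20                                         # modest sample, as in RNA-Seq studies
group = np.repeat([0, 1], n // 2)              # two-group comparison
x_full = sm.add_constant(group.astype(float))
y = rng.negative_binomial(n=5, p=0.3, size=n)  # toy read counts

full = sm.NegativeBinomial(y, x_full).fit(disp=False)
reduced = sm.NegativeBinomial(y, np.ones((n, 1))).fit(disp=False)

lr_stat = 2.0 * (full.llf - reduced.llf)       # -2 log(likelihood ratio)
p_value = chi2.sf(lr_stat, df=1)               # large-sample chi-square reference
print(f"LRT statistic = {lr_stat:.2f}, p = {p_value:.3f}")
```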


Methods in Ecology and Evolution | 2015

Penalized likelihood methods improve parameter estimates in occupancy models

Rebecca A. Hutchinson; Jonathon J. Valente; Sarah C. Emerson; Matthew G. Betts; Thomas G. Dietterich

Occupancy models are employed in species distribution modelling to account for imperfect detection during field surveys. While this approach is popular in the literature, problems can occur when estimating the model parameters. In particular, the maximum likelihood estimates can exhibit bias and large variance for data sets with small sample sizes, which can result in estimated occupancy probabilities near 0 and 1 (‘boundary estimates’). In this paper, we explore strategies for estimating parameters based on maximizing a penalized likelihood. Penalized likelihood methods augment the usual likelihood with a penalty function that encodes information about what parameter values are undesirable. We introduce penalties for occupancy models that have analogues in ridge regression and Bayesian approaches, and we compare them to a penalty developed for occupancy models in prior work. We examine the bias, variance and mean squared error of parameter estimates obtained from each method on synthetic data. Across all of the synthetic data sets, the penalized estimation methods had lower mean squared error than the maximum likelihood estimates. We also provide an example of the application of these methods to point counts of avian species. Penalized likelihood methods show similar improvements when tested using empirical bird point count data. We discuss considerations for choosing among these methods when modelling occupancy. We conclude that penalized methods may be of practical utility for fitting occupancy models with small sample sizes, and we are releasing R code that implements these methods.
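
The penalized-likelihood idea is compact to express for the simplest occupancy model (constant occupancy and detection probabilities, no covariates). The sketch below uses a ridge-type penalty on the logit-scale parameters, one of the flavors the paper considers; the penalty weight and simulated data are illustrative assumptions.

```python
# Penalized maximum likelihood for a basic occupancy model.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def penalized_nll(theta, detections, n_visits, lam):
    """Negative log-likelihood of the basic occupancy model
    plus a ridge penalty on the logit-scale parameters."""
    psi, p = expit(theta)                  # occupancy and detection probabilities
    d = detections
    detected = d > 0
    # sites with at least one detection are certainly occupied
    ll_pos = (np.log(psi) + d[detected] * np.log(p)
              + (n_visits - d[detected]) * np.log1p(-p))
    # all-zero histories: occupied but never detected, or truly unoccupied
    ll_zero = np.log(psi * (1 - p) ** n_visits + 1 - psi)
    nll = -(ll_pos.sum() + (~detected).sum() * ll_zero)
    return nll + lam * np.sum(theta**2)    # penalty discourages boundary estimates

# Toy data: 30 sites, 4 visits each; detections[i] = visits with a detection.
rng = np.random.default_rng(3)
occupied = rng.binomial(1, 0.4, size=30)
detections = rng.binomial(4, 0.3, size=30) * occupied

fit = minimize(penalized_nll, x0=np.zeros(2),
               args=(detections, 4, 0.5), method="BFGS")
print("estimated (psi, p):", expit(fit.x))
```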


Biometrics | 2014

An evaluation of inferential procedures for adaptive clinical trial designs with pre-specified rules for modifying the sample size

Gregory P. Levin; Sarah C. Emerson; Scott S. Emerson

Many papers have introduced adaptive clinical trial methods that allow modifications to the sample size based on interim estimates of treatment effect. There has been extensive commentary on type I error control and efficiency considerations, but little research on estimation after an adaptive hypothesis test. We evaluate the reliability and precision of different inferential procedures in the presence of an adaptive design with pre-specified rules for modifying the sampling plan. We extend group sequential orderings of the outcome space based on the stage at stopping, likelihood ratio statistic, and sample mean to the adaptive setting in order to compute median-unbiased point estimates, exact confidence intervals, and P-values uniformly distributed under the null hypothesis. The likelihood ratio ordering is found to produce confidence intervals that are shorter on average, and P-values that fall below important thresholds with higher probability, than alternative approaches. The bias-adjusted mean demonstrates the lowest mean squared error among candidate point estimates. A conditional error-based approach in the literature has the benefit of being the only method that accommodates unplanned adaptations. We compare the performance of this and other methods in order to quantify the cost of failing to plan ahead in settings where adaptations could realistically be pre-specified at the design stage. We find the cost to be meaningful for all designs and treatment effects considered, and to be substantial for designs frequently proposed in the literature.
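
To make the orderings concrete, the sketch below computes a one-sided P-value under the stage-at-stopping ("stage-wise") ordering for a plain two-stage group sequential design with one early-efficacy boundary; the boundary and information fraction are illustrative assumptions, and the paper's adaptive extensions are not reproduced.

```python
# Stage-wise ordering P-value for a two-stage group sequential design.

import numpy as np
from scipy.stats import norm, multivariate_normal

def stagewise_p(stage, z, c1, info_frac):
    """One-sided p-value: outcomes at earlier stopping stages are more extreme."""
    if stage == 1:
        return norm.sf(z)                       # stopped early: P(Z1 >= z)
    # Z1, Z2 are bivariate normal with corr = sqrt(t1/t2) under H0
    rho = np.sqrt(info_frac)
    mvn = multivariate_normal(mean=[0.0, 0.0],
                              cov=[[1.0, rho], [rho, 1.0]])
    # P(Z1 < c1, Z2 >= z) = P(Z1 < c1) - P(Z1 < c1, Z2 < z)
    p_continue_and_extreme = norm.cdf(c1) - mvn.cdf([c1, z])
    return norm.sf(c1) + p_continue_and_extreme  # plus P(stop at stage 1)

# Example: O'Brien-Fleming-like early boundary c1 = 2.80 at half information.
print(stagewise_p(stage=1, z=3.1, c1=2.80, info_frac=0.5))
print(stagewise_p(stage=2, z=2.1, c1=2.80, info_frac=0.5))
```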

Collaboration

Dive into Sarah C. Emerson's collaborations.

Top Co-Authors

Yanming Di, Oregon State University
Joseph V. Bonventre, Brigham and Women's Hospital
Kyle Rudser, University of Minnesota
Luis Leon-Novelo, University of Texas Health Science Center at Houston