Publication


Featured research published by David R. Bickel.


Plant Molecular Biology | 2008

Genome-wide allele-specific expression analysis using Massively Parallel Signature Sequencing (MPSS™) Reveals cis- and trans-effects on gene expression in maize hybrid meristem tissue

Mei Guo; Sean Yang; Mary Rupe; Bin Hu; David R. Bickel; Lane Arthur; Oscar S. Smith

Allelic differences in expression are important genetic factors contributing to quantitative trait variation in various organisms. However, the extent of genome-wide allele-specific expression by different modes of gene regulation has not been well characterized in plants. In this study we developed a new methodology for allele-specific expression analysis by applying Massively Parallel Signature Sequencing (MPSS™), an open-ended, sequencing-based mRNA profiling technology. This methodology enabled a genome-wide evaluation of cis- and trans-effects on allelic expression in six meristem stages of the maize hybrid. A summary of data from nearly 400 pairs of MPSS allelic signature tags showed that 60% of the genes in the hybrid meristems exhibited differential allelic expression. Because both alleles are subjected to the same trans-acting factors in the hybrid, the data suggest an abundance of cis-regulatory differences in the genome. Comparing the same allele expressed in the hybrid versus its inbred parents showed that 40% of the genes were differentially expressed, suggesting different trans-acting effects present in different genotypes. Such trans-acting effects may result in gene expression in the hybrid that differs from allelic additive expression. With this approach we quantified gene expression in the hybrid relative to its inbred parents at the allele-specific level. Compared with measuring total transcript levels, this study provides a new level of understanding of different modes of gene regulation in the hybrid and the molecular basis of heterosis.


Computational Statistics & Data Analysis | 2002

Robust estimators of the mode and skewness of continuous data

David R. Bickel

Measures of location based on the shortest half sample, including the shorth and the location of the least median of squares, are more robust than the median to outliers, but less robust to contamination near the location. Although such measures can estimate the mode, the proposed estimator of the mode, based on densest half ranges, has a much lower bias while having similar robustness. Like the median, this mode estimator has the highest breakdown point possible: the estimator has meaning if less than half the sample consists of outliers. The mode is more robust than the median in that the mode estimates are unaffected by outliers, whereas the median is influenced by each outlier. Robustness in this sense is quantified by the rejection point, the largest absolute value that is not rejected, which is low for the mode but infinite for the median. Even though the median is changed less by contamination near the location than is the mode, outliers generally pose more of a problem to estimation than contamination near the location, so the mode is more robust for data that may have a large number of outliers. A robust estimator of skewness is based on this mode estimator.


Bioinformatics | 2003

Robust cluster analysis of microarray gene expression data with the number of clusters determined biologically

David R. Bickel

MOTIVATION: The success of each method of cluster analysis depends on how well its underlying model describes the patterns of expression. Outlier-resistant and distribution-insensitive clustering of genes is robust against violations of model assumptions. RESULTS: A measure of dissimilarity that combines advantages of the Euclidean distance and the correlation coefficient is introduced. The measure can be made robust using a rank order correlation coefficient. A robust graphical method of summarizing the results of cluster analysis and a biological method of determining the number of clusters are also presented. These methods are applied to a public data set, showing that rank-based methods perform better than log-based methods. AVAILABILITY: Software is available from http://www.davidbickel.com.


Computational Statistics & Data Analysis | 2006

On a fast, robust estimator of the mode: Comparisons to other robust estimators with applications

David R. Bickel; Rudolf Frühwirth

Advances in computing power enable more widespread use of the mode, which is a natural measure of central tendency since it is not influenced by the tails in the distribution. The properties of the half-sample mode, which is a simple and fast estimator of the mode of a continuous distribution, are studied. The half-sample mode is less sensitive to outliers than most other estimators of location, including many other low-bias estimators of the mode. Its breakdown point is one half, equal to that of the median. However, because of its finite rejection point, the half-sample mode is much less sensitive to outliers that are all either greater or less than the other values of the sample. This is confirmed by applying the mode estimator and the median to samples drawn from normal, lognormal, and Pareto distributions contaminated by outliers. It is also shown that the half-sample mode, in combination with a robust scale estimator, is a highly robust starting point for iterative robust location estimators such as Huber's M-estimator. The half-sample mode can easily be generalized to modal intervals containing more or less than half of the sample. An application of such an estimator to the finding of collision points in high-energy proton-proton interactions is presented.
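A minimal sketch of a half-sample mode follows, based on the usual description of the procedure: repeatedly keep the narrowest window containing half of the remaining points until at most three remain. The half size ceil(n/2) and the handling of the final two or three points are assumptions taken from common descriptions of the estimator, not details stated in this abstract. The function name is hypothetical.

```python
def half_sample_mode(values):
    """Estimate the mode by recursively keeping the densest half of the sample."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    while len(xs) > 3:
        n = len(xs)
        h = (n + 1) // 2  # ceil(n / 2), the assumed half-sample size
        # Narrowest window of h consecutive order statistics.
        start = min(range(n - h + 1), key=lambda i: xs[i + h - 1] - xs[i])
        xs = xs[start:start + h]
    if len(xs) == 3:
        left_gap = xs[1] - xs[0]
        right_gap = xs[2] - xs[1]
        if left_gap < right_gap:
            return (xs[0] + xs[1]) / 2
        if right_gap < left_gap:
            return (xs[1] + xs[2]) / 2
        return xs[1]  # equal gaps: take the middle point
    return sum(xs) / len(xs)  # one or two points left


if __name__ == "__main__":
    sample = [0.9, 1.0, 1.1, 1.2, 1.3, 40.0, 50.0, 60.0]
    print(half_sample_mode(sample))
```

In this toy sample the three large values are discarded in the first pass, so the estimate stays inside the cluster near 1, which illustrates the finite rejection point mentioned in the abstract.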


Journal of Statistical Computation and Simulation | 2003

Robust and efficient estimation of the mode of continuous data: the mode as a viable measure of central tendency

David R. Bickel

Although a natural measure of the central tendency of a sample of continuous data is its mode, the mean and median are the most popular measures of location due to their simplicity and ease of estimation. The median is often used instead of the mean for asymmetric data because it is closer to the mode and is insensitive to extreme values in the sample. However, the mode itself can be reliably estimated by first transforming the data into approximately normal data by raising the values to a real power, and then estimating the mean and standard deviation of the transformed data. With this method, two estimators of the mode of the original data are proposed: a simple estimator based on estimating the mean by the sample mean and the standard deviation by the sample standard deviation, and a more robust estimator based on estimating the mean by the median and the standard deviation by the standardized median absolute deviation. Both of these mode estimators were tested using simulated data drawn from normal (symmetric), lognormal (asymmetric), and Pareto (very asymmetric) distributions. The latter two distributions were chosen to test the generality of the method since they are not power transforms of the normal distribution. Each of the proposed estimators of the mode has a much lower variance than the mean and median for the two asymmetric distributions. When outliers were added to the simulations, the more robust of the two proposed mode estimators had a lower bias and variance than the median for the asymmetric distributions, especially when the level of contamination approached the 50% breakdown point. It is concluded that the mode is often a more reliable measure of location than the mean or median for asymmetric data. The proposed estimators also performed well relative to previous estimators of the mode. While different estimators are better under different conditions, the proposed robust estimator is reliable for a wide variety of distributions and contamination levels.
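The robust ingredients named in this abstract, the median and the standardized median absolute deviation, can be sketched as below. The constant 1.4826 is the usual factor that makes the MAD consistent with the standard deviation under normality, and it is assumed here rather than taken from the abstract. The sketch covers only this location and scale step, not the power transformation or the mode estimator itself. The function name is hypothetical.

```python
import statistics

# Assumed normal-consistency factor for the median absolute deviation.
MAD_SCALE = 1.4826


def robust_location_scale(values):
    """Return (median, standardized MAD) as robust stand-ins for mean and sd."""
    data = list(values)
    center = statistics.median(data)
    mad = statistics.median([abs(x - center) for x in data])
    return center, MAD_SCALE * mad


if __name__ == "__main__":
    location, scale = robust_location_scale([1, 2, 3, 4, 100])
    print(location, scale)  # the outlier 100 barely affects either value
```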


PLOS ONE | 2010

Long-chain fatty acid combustion rate is associated with unique metabolite profiles in skeletal muscle mitochondria.

Erin L. Seifert; Oliver Fiehn; Véronic Bézaire; David R. Bickel; Gert Wohlgemuth; Sean H. Adams; Mary-Ellen Harper

Background/Aim: Incomplete or limited long-chain fatty acid (LCFA) combustion in skeletal muscle has been associated with insulin resistance. Signals that are responsive to shifts in LCFA β-oxidation rate or degree of intramitochondrial catabolism are hypothesized to regulate second messenger systems downstream of the insulin receptor. Recent evidence supports a causal link between mitochondrial LCFA combustion in skeletal muscle and insulin resistance. We have used unbiased metabolite profiling of mouse muscle mitochondria with the aim of identifying candidate metabolites within or effluxed from mitochondria and that are shifted with LCFA combustion rate. Methodology/Principal Findings: Large-scale unbiased metabolomics analysis was performed using GC/TOF-MS on buffer and mitochondrial matrix fractions obtained prior to and after 20 min of palmitate catabolism (n = 7 mice/condition). Three palmitate concentrations (2, 9 and 19 µM; corresponding to low, intermediate and high oxidation rates) and 9 µM palmitate plus tricarboxylic acid (TCA) cycle and electron transport chain inhibitors were each tested and compared to zero palmitate control incubations. Paired comparisons of the 0 and 20 min samples were made by Student's t-test. False discovery rates were estimated and Type I error rates assigned. Major metabolite groups were organic acids, amines and amino acids, free fatty acids and sugar phosphates. Palmitate oxidation was associated with unique profiles of metabolites, a subset of which correlated to palmitate oxidation rate. In particular, palmitate oxidation rate was associated with distinct changes in the levels of TCA cycle intermediates within and effluxed from mitochondria.
Conclusions/Significance: This proof-of-principle study establishes that large-scale metabolomics methods can be applied to organelle-level models to discover metabolite patterns reflective of LCFA combustion, which may lead to identification of molecules linking muscle fat metabolism and insulin signaling. Our results suggest that future studies should focus on the fate of effluxed TCA cycle intermediates and on mechanisms ensuring their replenishment during LCFA metabolism in skeletal muscle.


The FASEB Journal | 2013

Muscle uncoupling protein 3 overexpression mimics endurance training and reduces circulating biomarkers of incomplete β-oxidation

Céline Aguer; Oliver Fiehn; Erin L. Seifert; Véronic Bézaire; John K. Meissen; Amanda Daniels; Kyle Scott; Jean Marc Renaud; Marta Padilla; David R. Bickel; Michael Dysart; Sean H. Adams; Mary-Ellen Harper

Exercise substantially improves metabolic health, making the elicited mechanisms important targets for novel therapeutic strategies. Uncoupling protein 3 (UCP3) is a mitochondrial inner membrane protein highly selectively expressed in skeletal muscle. Here we report that moderate UCP3 overexpression (roughly 3‐fold) in muscles of UCP3 transgenic (UCP3Tg) mice acts as an exercise mimetic in many ways. UCP3 overexpression increased spontaneous activity (~40%) and energy expenditure (~5–10%) and decreased oxidative stress (~ 15–20%), similar to exercise training in wild‐type (WT) mice. The increase in complete fatty acid oxidation (FAO; ~30% for WT and ~70% for UCP3 Tg) and energy expenditure (~8% for WT and 15% for UCP3 Tg) in response to endurance training was higher in UCP3 Tg than in WT mice, showing an additive effect of UCP3 and endurance training on these two parameters. Moreover, increases in circulating short‐chain acylcarnitines in response to acute exercise in untrained WT mice were absent with training or in UCP3 Tg mice. UCP3 overexpression had the same effect as training in decreasing long‐chain acylcarnitines. Outcomes coincided with a reduction in muscle carnitine acetyltransferase activity that catalyzes the formation of acylcarnitines. Overall, results are consistent with the conclusions that circulating acylcarnitines could be used as a marker of incomplete muscle FAO and that UCP3 is a potential target for the treatment of prevalent metabolic diseases in which muscle FAO is affected.—Aguer, C., Fiehn, O., Seifert, E. L., Bézaire, V., Meissen, J. K., Daniels, A., Scott, K., Renaud, J.‐M., Padilla, M., Bickel, D. R., Dysart, M., Adams, S. H., Harper, M.‐E. Muscle uncoupling protein 3 overexpression mimics endurance training and reduces circulating biomarkers of incomplete β‐oxidation. FASEB J. 27, 4213–4225 (2013). www.fasebj.org


Bioinformatics | 2005

Probabilities of spurious connections in gene networks: application to expression time series

David R. Bickel

MOTIVATION: The reconstruction of gene networks from gene-expression microarrays is gaining popularity as methods improve and as more data become available. The reliability of such networks could be judged by the probability that a connection between genes is spurious, resulting from chance fluctuations rather than from a true biological relationship. RESULTS: Unlike the false discovery rate and positive false discovery rate, the decisive false discovery rate (dFDR) is exactly equal to a conditional probability without assuming independence or the randomness of hypothesis truth values. This property is useful not only in the common application to the detection of differential gene expression, but also in determining the probability of a spurious connection in a reconstructed gene network. Estimators of the dFDR can estimate each of three probabilities: (1) The probability that two genes that appear to be associated with each other lack such association. (2) The probability that a time ordering observed for two associated genes is misleading. (3) The probability that a time ordering observed for two genes is misleading, either because they are not associated or because they are associated without a lag in time. The first probability applies to both static and dynamic gene networks, and the other two only apply to dynamic gene networks.


Statistical Applications in Genetics and Molecular Biology | 2012

Estimators of the local false discovery rate designed for small numbers of tests

Marta Padilla; David R. Bickel

Histogram-based empirical Bayes methods developed for analyzing data for large numbers of genes, SNPs, or other biological features tend to have large biases when applied to data with a smaller number of features such as genes with expression measured conventionally, proteins, and metabolites. To analyze such small-scale and medium-scale data in an empirical Bayes framework, we introduce corrections of maximum likelihood estimators (MLEs) of the local false discovery rate (LFDR). In this context, the MLE estimates the LFDR, which is a posterior probability of null hypothesis truth, by estimating the prior distribution. The corrections lie in excluding each feature when estimating one or more parameters on which the prior depends. In addition, we propose the expected LFDR (ELFDR) in order to propagate the uncertainty involved in estimating the prior. We also introduce an optimally weighted combination of the best of the corrected MLEs with a previous estimator that, being based on a binomial distribution, does not require a parametric model of the data distribution across features. An application of the new estimators and previous estimators to protein abundance data illustrates the extent to which different estimators lead to different conclusions about which proteins are affected by cancer. A simulation study was conducted to approximate the bias of the new estimators relative to previous LFDR estimators. Data were simulated for two different numbers of features (N), two different noncentrality parameter values or detectability levels (dalt), and several proportions of unaffected features (p0). One of these previous estimators is a histogram-based estimator (HBE) designed for a large number of features. The simulations show that some of the corrected MLEs and the ELFDR that corrects the HBE reduce the negative bias relative to the MLE and the HBE, respectively. For every method, we defined the worst-case performance as the maximum of the absolute value of the bias over the two different dalt and over various p0. The best worst-case methods represent the safest methods to be used under given conditions. This analysis indicates that the binomial-based method has the lowest worst-case absolute bias for high p0 and for N = 3, 12. However, the corrected MLE that is based on the minimum description length (MDL) principle is the best worst-case method when the value of p0 is more uncertain since it has one of the lowest worst-case biases over all possible values of p0 and for N = 3, 12. Therefore, the safest estimator considered is the binomial-based method when a high proportion of unaffected features can be assumed and the MDL-based method otherwise. A second simulation study was conducted with additional values of N. We found that HBE requires N to be at least 6-12 features to perform as well as the estimators proposed here, with the precise minimum N depending on p0 and dalt.
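To make the phrase "posterior probability of null hypothesis truth" concrete, the sketch below computes the LFDR in a two-group model where the proportion of unaffected features (p0) and the detectability level (dalt) are treated as known. That is a simplifying assumption for illustration only: the estimators in the abstract estimate the prior from data. The unit-variance normal model and the function names are also assumptions.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def lfdr(x, p0, d_alt):
    """Posterior probability that the null is true given test statistic x.

    Assumed model: x ~ N(0, 1) under the null with prior probability p0,
    and x ~ N(d_alt, 1) under the alternative with prior probability 1 - p0.
    """
    null_part = p0 * normal_pdf(x, 0.0)
    alt_part = (1.0 - p0) * normal_pdf(x, d_alt)
    return null_part / (null_part + alt_part)


if __name__ == "__main__":
    for stat in (0.0, 1.5, 3.0):
        print(stat, lfdr(stat, p0=0.9, d_alt=3.0))
```

With few features, p0 and the alternative distribution cannot be estimated precisely, which is the small-N difficulty the abstract addresses.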


Canadian Journal of Statistics-revue Canadienne De Statistique | 2011

A predictive approach to measuring the strength of statistical evidence for single and multiple comparisons

David R. Bickel

The normalized maximum likelihood (NML) is a recent penalized likelihood that has properties that justify defining the amount of discrimination information (DI) in the data supporting an alternative hypothesis over a null hypothesis as the logarithm of an NML ratio, namely, the alternative hypothesis NML divided by the null hypothesis NML. The resulting DI, like the Bayes factor but unlike the p-value, measures the strength of evidence for an alternative hypothesis over a null hypothesis such that the probability of misleading evidence vanishes asymptotically under weak regularity conditions and such that evidence can support a simple null hypothesis. Unlike the Bayes factor, the DI does not require a prior distribution and is minimax optimal in a sense that does not involve averaging over outcomes that did not occur. Replacing a (possibly pseudo-) likelihood function with its weighted counterpart extends the scope of the DI to models for which the unweighted NML is undefined. The likelihood weights leverage side information, either in data associated with comparisons other than the comparison at hand or in the parameter value of a simple null hypothesis. Two case studies, one involving multiple populations and the other involving multiple biological features, indicate that the DI is robust to the type of side information used when that information is assigned the weight of a single observation. Such robustness suggests that very little adjustment for multiple comparisons is warranted if the sample size is at least moderate.
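As a toy illustration of the logarithm of an NML ratio, the sketch below uses a binomial count with a simple null hypothesis (success probability theta0) against the unrestricted binomial family. The choice of model, the use of the standard NML normalization (maximized likelihood divided by its sum over all possible outcomes), and the function names are assumptions made for illustration. The abstract's weighted-likelihood extension is not covered.

```python
import math


def max_binomial_likelihood(k, n):
    """Binomial likelihood of k successes in n trials at the MLE theta = k / n."""
    theta = k / n
    return math.comb(n, k) * theta**k * (1.0 - theta) ** (n - k)


def discrimination_information(k, n, theta0=0.5):
    """log(NML under the alternative / NML under a simple null theta0).

    theta0 is assumed to lie strictly between 0 and 1.
    """
    # Normalizer: maximized likelihood summed over every possible outcome.
    normalizer = sum(max_binomial_likelihood(y, n) for y in range(n + 1))
    nml_alt = max_binomial_likelihood(k, n) / normalizer
    # A simple null contains one distribution, so its NML is its likelihood.
    nml_null = math.comb(n, k) * theta0**k * (1.0 - theta0) ** (n - k)
    return math.log(nml_alt / nml_null)


if __name__ == "__main__":
    print(discrimination_information(5, 10))   # negative: data favor the null
    print(discrimination_information(10, 10))  # positive: data favor the alternative
```

The negative value for 5 successes in 10 trials shows how, in this toy setting, the evidence can support a simple null hypothesis, a property the abstract contrasts with the p-value.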

Collaboration

Top Co-Authors

Bruce J. West

Johns Hopkins University School of Medicine

Erin L. Seifert

Thomas Jefferson University

Oliver Fiehn

University of California

Sean H. Adams

University of Arkansas for Medical Sciences