Yoav Benjamini
Tel Aviv University
Publication
Featured research published by Yoav Benjamini.
Bioinformatics | 2003
Anat Reiner; Daniel Yekutieli; Yoav Benjamini
MOTIVATION: DNA microarrays have recently been used to monitor the expression levels of thousands of genes simultaneously and to identify those genes that are differentially expressed. The probability that a false identification (type I error) is committed can increase sharply when the number of tested genes gets large. Correlation between the test statistics, attributed to gene co-regulation and to dependency in the measurement errors of the gene expression levels, further complicates the problem. In this paper we address this very large multiplicity problem by adopting the false discovery rate (FDR) controlling approach. To address the dependency problem, we present three resampling-based FDR controlling procedures that account for the distribution of the test statistics, and compare their performance to that of the naïve application of the linear step-up procedure of Benjamini and Hochberg (1995). The procedures are studied using simulated microarray data, and their performance is examined relative to their ease of implementation. RESULTS: Comparative simulation analysis shows that all four FDR controlling procedures control the FDR at the desired level and retain substantially more power than the family-wise error rate controlling procedures. In terms of power, resampling the marginal distribution of each test statistic substantially improves performance over the naïve procedure. The highest power is achieved, at the expense of a more sophisticated algorithm, by the resampling-based procedures that resample the joint distribution of the test statistics and estimate the level of FDR control. AVAILABILITY: An R program that adjusts p-values using FDR controlling procedures is freely available over the Internet at www.math.tau.ac.il/~ybenja.
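The linear step-up baseline that this abstract compares against, in its p-value-adjustment form, can be sketched as follows. This is an illustrative Python sketch, not the authors' R program; the function name is ours:

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg (1995) linear step-up FDR adjustment.

    Returns FDR-adjusted p-values: rejecting every hypothesis whose
    adjusted p-value is <= q controls the FDR at level q
    (for independent test statistics).
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices sorting p ascending
    ranked = p[order] * m / np.arange(1, m + 1)    # p_(i) * m / i
    # enforce monotonicity, working down from the largest p-value
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.clip(adjusted, 0.0, 1.0)
    out = np.empty(m)
    out[order] = adjusted                          # restore the input order
    return out
```

For example, `bh_adjust([0.01, 0.02, 0.03, 0.5])` yields adjusted p-values `[0.04, 0.04, 0.04, 0.5]`, so at q = 0.05 the first three hypotheses are rejected.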
Journal of Educational and Behavioral Statistics | 2000
Yoav Benjamini; Yosef Hochberg
A new approach to problems of multiple significance testing was presented in Benjamini and Hochberg (1995), which calls for controlling the expected ratio of the number of erroneous rejections to the number of rejections: the False Discovery Rate (FDR). The procedure given there was shown to control the FDR for independent test statistics. When some of the hypotheses are in fact false, that procedure is too conservative. We present here an adaptive procedure in which the number of true null hypotheses is estimated first, as in Hochberg and Benjamini (1990), and this estimate is used in the procedure of Benjamini and Hochberg (1995). The result is still a simple stepwise procedure, to which we also give a graphical companion. The new procedure is applied in several examples drawn from educational and behavioral studies, addressing problems in multi-center studies, subset analysis and meta-analysis. The examples vary in the number of hypotheses tested and in the implications of the new procedure for the conclusions. In a large simulation study of independent test statistics, the adaptive procedure is shown to control the FDR and to have substantially better power than the previously suggested FDR controlling method, which is itself more powerful than the traditional family-wise error rate controlling methods. In cases where most of the tested hypotheses are far from being true, there is hardly any penalty due to the simultaneous testing of many hypotheses.
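The two-stage idea in this abstract can be sketched as follows: estimate the number of true nulls m0 from the slopes of the sorted p-values, then rerun the linear step-up procedure with m replaced by the estimate. This is an illustrative Python sketch with details simplified, not the authors' exact procedure; the function name and test values are ours:

```python
import numpy as np

def adaptive_bh(pvals, q=0.05):
    """Two-stage adaptive FDR procedure (sketch of the Benjamini &
    Hochberg, 2000, idea).

    Stage 1 estimates the number of true null hypotheses m0 from the
    slopes (1 - p_(i)) / (m + 1 - i), stopping at their first decrease;
    stage 2 runs the linear step-up procedure with m replaced by the
    estimate, which gains power when many hypotheses are false.
    """
    p = np.sort(np.asarray(pvals, dtype=float))
    m = p.size
    slopes = (1.0 - p) / (m + 1 - np.arange(1, m + 1))
    m0 = m
    for i in range(1, m):
        if slopes[i] < slopes[i - 1]:          # first decrease in slope
            m0 = min(m, int(np.ceil(1.0 / slopes[i] + 1)))
            break
    passed = np.nonzero(p <= q * np.arange(1, m + 1) / m0)[0]
    k = int(passed.max()) + 1 if passed.size else 0
    return k, m0    # reject the hypotheses with the k smallest p-values
```

With ten very small p-values and ten clearly null ones, the slope estimate drops below m, so the stage-2 thresholds q·i/m0 are less conservative than the plain q·i/m.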
Journal of Statistical Planning and Inference | 1999
Daniel Yekutieli; Yoav Benjamini
A new false discovery rate controlling procedure is proposed for multiple hypotheses testing. The procedure makes use of resampling-based p-value adjustment, and is designed to cope with correlated test statistics. Some properties of the proposed procedure are investigated theoretically, and further properties are investigated using a simulation study. According to the results of the simulation study, the new procedure offers false discovery rate control and greater power. The motivation for developing this resampling-based procedure was an actual problem in meteorology, in which almost 2000 hypotheses are tested simultaneously using highly correlated test statistics. When applied to this problem the increase in power was evident. The same procedure can be used in many other large problems of multiple testing, for example multiple endpoints. The procedure is also extended to serve as a general diagnostic tool in model selection.
Geophysical Research Letters | 2002
Pinhas Alpert; T. Ben-Gai; Anat Baharad; Yoav Benjamini; Daniel Yekutieli; M. Colacino; L. Diodato; C. Ramis; V. Homar; R. Romero; S. Michaelides; A. Manes
Earlier reports indicated some specific isolated regions exhibiting a paradoxical increase of extreme rainfall in spite of a decrease in the totals. Here, we conduct a coherent study of the full scale of daily rainfall categories over a relatively large subtropical region, the Mediterranean, in order to assess whether this paradoxical behavior is real and its extent. We show that torrential rainfall in Italy exceeding 128 mm/d increased percentage-wise by a factor of 4 during 1951–1995, with strong peaks in El Niño years. In Spain, extreme categories at both tails of the distribution (light: 0–4 mm/d and heavy/torrential: 64 mm/d and up) increased significantly. No significant trends were found in Israel and Cyprus. The consequent redistribution of the daily rainfall categories, torrential/heavy against the moderate/light intensities, is of utmost interest, particularly in the semi-arid subtropical regions, for purposes of water management, soil erosion and flash-flood impacts.
Annals of Statistics | 2006
Felix Abramovich; Yoav Benjamini; David L. Donoho; Iain M. Johnstone
We attempt to recover an n-dimensional vector observed in white noise, where n is large and the vector is known to be sparse, but the degree of sparsity is unknown. We consider three different ways of defining sparsity of a vector: using the fraction of nonzero terms; imposing power-law decay bounds on the ordered entries; and controlling the ℓp norm for p small. We obtain a procedure which is asymptotically minimax for ℓr loss, simultaneously throughout a range of such sparsity classes. The optimal procedure is a data-adaptive thresholding scheme, driven by control of the False Discovery Rate (FDR). FDR control is a relatively recent innovation in simultaneous testing, ensuring that at most a certain fraction of the rejected null hypotheses will correspond to false rejections. In our treatment, the FDR control parameter qn also plays a determining role in asymptotic minimaxity. If q = lim qn ∈ [0, 1/2] and also qn > γ/log(n), we get sharp asymptotic minimaxity, simultaneously, over a wide range of sparse parameter spaces and loss functions. On the other hand, q = lim qn ∈ (1/2, 1] forces the risk to exceed the minimax risk by a factor growing with q. To our knowledge, this relation between ideas in simultaneous inference and asymptotic decision theory is new. Our work provides a new perspective on a class of model selection rules which has been introduced recently by several authors. These new rules impose complexity penalization of the form 2·log(potential model size / actual model size). We exhibit a close connection with FDR-controlling procedures under stringent control of the false discovery rate.
Journal of the American Statistical Association | 2005
Yoav Benjamini; Daniel Yekutieli
Often in applied research, confidence intervals (CIs) are constructed or reported only for parameters selected after viewing the data. We show that such selected intervals fail to provide the assumed coverage probability. By generalizing the false discovery rate (FDR) approach from multiple testing to selected multiple CIs, we suggest the false coverage-statement rate (FCR) as a measure of interval coverage following selection. A general procedure is then introduced, offering FCR control at level q under any selection rule. The procedure constructs a marginal CI for each selected parameter, but instead of the confidence level 1 − q being used marginally, q is divided by the number of parameters considered and multiplied by the number selected. If we further use the FDR controlling testing procedure of Benjamini and Hochberg for selecting the parameters, the newly suggested procedure offers CIs that are dual to the testing procedure and are shown to be optimal in the independent case. Under the positive regression dependency condition of Benjamini and Yekutieli, the FCR is controlled for one-sided tests and CIs, as well as for a modification for two-sided testing. Results for general dependency are also given. Finally, using the equivalence of the CIs to testing, we prove that the procedure of Benjamini and Hochberg offers directional FDR control as conjectured.
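The FCR adjustment described in this abstract has a simple mechanical form: with m parameters considered and R selected, each selected parameter gets a marginal CI at level 1 − R·q/m instead of 1 − q. A minimal sketch for normal-theory intervals follows; the function name is ours, and known standard errors are assumed for illustration:

```python
from statistics import NormalDist

def fcr_intervals(estimates, std_errs, selected, q=0.05):
    """FCR-adjusted marginal confidence intervals, sketching the
    Benjamini & Yekutieli (2005) construction.

    Of m parameters considered, R were selected for reporting; each
    selected parameter gets a marginal normal CI at confidence level
    1 - R*q/m rather than 1 - q, keeping the false coverage-statement
    rate at most q.
    """
    m, R = len(estimates), len(selected)
    alpha = R * q / m                            # adjusted miscoverage level
    z = NormalDist().inv_cdf(1 - alpha / 2)      # two-sided normal quantile
    return {i: (estimates[i] - z * std_errs[i],
                estimates[i] + z * std_errs[i])
            for i in selected}
```

With m = 100 parameters and R = 10 selected at q = 0.05, each reported interval is built at level 0.995 (z ≈ 2.81) rather than 0.95 (z ≈ 1.96), reflecting the selection.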
Journal of Statistical Planning and Inference | 1999
Yoav Benjamini; Wei Liu
For the problems of multiple hypotheses testing, Benjamini and Hochberg (1995, J. Roy. Statist. Soc. Ser. B 57, 289–300) proposed the control of the expected ratio of the number of erroneous rejections to the number of total rejections, the false discovery rate (FDR). The step-up procedure given in that paper controls the FDR when the test statistics are independent. In this paper, a new step-down procedure is presented, and it also controls the FDR when the test statistics are independent. The step-down procedure neither dominates nor is dominated by the step-up procedure. In a large simulation study of the power of the two procedures, the step-down procedure turns out to be more powerful when the number of tested hypotheses is small and many of the hypotheses are far from being true. An example is given to illustrate the step-down procedure.
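The step-up/step-down distinction in this abstract is purely mechanical and easy to show in code. The sketch below illustrates only the generic step-down mechanics; the Benjamini–Liu procedure uses its own critical constants, which are omitted here, and the familiar linear constants q·i/m are substituted purely for illustration:

```python
def stepdown_reject(pvals, q=0.05):
    """Generic step-down multiple-testing mechanics (illustrative only;
    not the Benjamini-Liu critical constants).

    A step-down procedure examines p-values from the smallest upward and
    stops at the first one that fails its critical value; everything
    before the stop is rejected. A step-up procedure instead scans from
    the largest downward and rejects everything at or below the largest
    p-value that passes, so the two can disagree on the same data.
    """
    p = sorted(pvals)
    m = len(p)
    k = 0
    for i, pv in enumerate(p, start=1):
        if pv <= q * i / m:
            k = i          # rejected; continue to the next smallest
        else:
            break          # first failure stops a step-down procedure
    return k               # number of rejected hypotheses
```

For instance, on `[0.04, 0.045]` with q = 0.05 this step-down scan stops immediately (0.04 > 0.05·1/2) and rejects nothing, whereas a step-up scan with the same constants would reject both, since 0.045 ≤ 0.05·2/2.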
Scandinavian Journal of Statistics | 1997
Yoav Benjamini; Yosef Hochberg
In this paper we offer a multiplicity of approaches and procedures for multiple testing problems with weights. Some rationales for incorporating weights in multiple hypotheses testing are discussed. Various type I error rates and different possible formulations are considered, for both the intersection hypothesis testing and the multiple hypotheses testing problems. An optimal per-family weighted error-rate controlling procedure à la Spjøtvoll (1972) is obtained. This model serves as a vehicle for demonstrating the different implications of the approaches to weighting. Alternative approaches to that of Holm (1979) for family-wise error-rate control with weights are discussed, one involving an alternative procedure for family-wise error-rate control, and the other involving the control of a weighted family-wise error rate. Extensions and modifications of the procedures based on Simes (1986) are given. These include a test of the overall intersection hypothesis with general weights, and weighted sequentially rejective procedures for testing the individual hypotheses. The false discovery rate controlling approach and procedure of Benjamini & Hochberg (1995) are extended to allow for different weights.
Computational Statistics & Data Analysis | 1996
Felix Abramovich; Yoav Benjamini
Wavelet techniques have become an attractive and efficient tool in function estimation. Given noisy data, its discrete wavelet transform is an estimator of the wavelet coefficients. It has been shown by Donoho and Johnstone (Biometrika 81 (1994) 425–455) that thresholding the estimated coefficients and then reconstructing an estimated function reduces the expected risk close to the possible minimum. They offered a global threshold λ = σ√(2 log n) for levels j > j0, while the coefficients of the first coarse j0 levels are always included. We demonstrate that the choice of j0 may strongly affect the corresponding estimators. Then, we use the connection between thresholding and hypothesis testing to construct a thresholding procedure based on the false discovery rate (FDR) approach to multiple testing of Benjamini and Hochberg (J. Roy. Statist. Soc. Ser. B 57 (1995) 289–300). The suggested procedure controls the expected proportion of incorrectly included coefficients among those chosen for the wavelet reconstruction. The resulting procedure is inherently adaptive, and responds to the complexity of the estimated function and to the noise level. Finally, comparing the proposed FDR-based procedure with the fixed global threshold by evaluating the relative mean-square error across various test functions and noise levels, we find the FDR estimator to enjoy robustness of MSE efficiency.
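The thresholding-as-testing connection this abstract describes can be sketched directly: each empirical coefficient is tested against zero via a two-sided normal p-value, the step-up rule of Benjamini and Hochberg picks the rejection boundary, and the coefficient magnitude at that boundary becomes the threshold. An illustrative Python sketch, assuming known noise level σ and ignoring the per-level details of the paper; the function name is ours:

```python
import math
from statistics import NormalDist

def fdr_threshold(coeffs, sigma, q=0.05):
    """Data-driven FDR-based threshold, sketching the thresholding-as-
    testing idea of Abramovich & Benjamini.

    Each empirical coefficient d_i is tested against H0: theta_i = 0 via
    the two-sided p-value 2*(1 - Phi(|d_i| / sigma)); the linear step-up
    rule finds the largest k with p_(k) <= q*k/m, and the threshold is
    the coefficient magnitude at that boundary. Coefficients with
    magnitude above the returned threshold are kept for reconstruction.
    """
    nd = NormalDist()
    m = len(coeffs)
    pvals = sorted(2 * (1 - nd.cdf(abs(d) / sigma)) for d in coeffs)
    k = 0
    for i, p in enumerate(pvals, start=1):
        if p <= q * i / m:
            k = i                  # step-up: remember the last success
    if k == 0:
        return math.inf            # nothing significant: zero everything
    return sigma * nd.inv_cdf(1 - pvals[k - 1] / 2)
```

Unlike the fixed global threshold, this cutoff adapts: with many large coefficients the boundary p-value grows, so the threshold drops and more structure is retained.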
Behavioural Brain Research | 1993
Ilan Golani; Yoav Benjamini; David Eilam
In the absence of an obvious reference place, rat locomotor behavior in a novel environment appears haphazard. In previous work, one or two places, termed home bases, were shown to stand out from all the other places in the environment in terms of the behaviors performed in them and in terms of their behavioral stability. We use home base location as a reference place for rat movement in locale space, by defining an excursion as a trip starting at a home base and ending at the next stop at a home base. We then establish the uniform distribution as an appropriate model for the number of stops per excursion. In this way we show that there is an intrinsic upper bound on the number of times a rat stops during an excursion. As a rat leaves the home base, home base attraction increases with every additional stop it performs, at first slowly and then rapidly. This cumulative process of attraction may be concluded after each stop, as long as the number of stops does not exceed an intrinsic upper bound; once the upper bound is reached, the rat concludes that excursion and returns to base. The session's upper bound does not increase with the size of the explored area.