Publication


Featured research published by Ali S. Hadi.


Computational Statistics & Data Analysis | 2000

BACON: blocked adaptive computationally efficient outlier nominators

Nedret Billor; Ali S. Hadi; Paul F. Velleman

Abstract Although it is customary to assume that data are homogeneous, in fact, they often contain outliers or subgroups. Methods for identifying multiple outliers and subgroups must deal with the challenge of establishing a metric that is not itself contaminated by inhomogeneities by which to measure how extraordinary a data point is. For samples of a sufficient size to support sophisticated methods, the computational cost often makes outlier detection unattractive. All multiple outlier detection methods have suffered in the past from a computational cost that escalated rapidly with the sample size. We propose a new general approach, based on the methods of Hadi (1992a, 1994) and Hadi and Simonoff (1993), that can be computed quickly, often requiring fewer than five evaluations of the model being fit to the data, regardless of the sample size. Two cases of this approach are presented in this paper (algorithms for the detection of outliers in multivariate and regression data). The algorithms, however, can be applied more broadly than to these two cases. We show that the proposed methods match the performance of more computationally expensive methods on standard test problems and demonstrate their superior performance on large simulated challenges.
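The forward-search idea described above can be sketched in code: start from a small, safely clean basic subset, then repeatedly refit and admit every point close to the current fit. The following is a minimal BACON-style iteration for the multivariate case, assuming NumPy; the fixed cutoff and the initial-subset rule are simplifications for illustration, not the authors' exact correction factors.

```python
import numpy as np

def bacon_multivariate(X, alpha=3.0, max_iter=20):
    """BACON-style outlier nomination for multivariate data (sketch).

    Start from a small 'clean' basic subset (the points closest to the
    coordinate-wise median), then iteratively grow it: recompute the mean
    and covariance from the current subset and admit every point whose
    Mahalanobis distance falls below a cutoff. Iterate to convergence.
    """
    n, p = X.shape
    # Initial basic subset: the m points nearest the coordinate-wise median.
    m = min(n, 5 * p)
    d0 = np.linalg.norm(X - np.median(X, axis=0), axis=1)
    subset = np.argsort(d0)[:m]
    for _ in range(max_iter):
        mu = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        inv = np.linalg.inv(cov)
        diff = X - mu
        dist = np.sqrt(np.einsum('ij,jk,ik->i', diff, inv, diff))
        new_subset = np.flatnonzero(dist < alpha)  # crude fixed cutoff
        if np.array_equal(np.sort(new_subset), np.sort(subset)):
            break
        subset = new_subset
    outliers = np.setdiff1d(np.arange(n), subset)
    return subset, outliers
```

On data with a few gross outliers, the basic subset grows to cover the inliers in a handful of refits, which is the source of the low computational cost the abstract emphasizes.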


Journal of the American Statistical Association | 1993

Procedures for the Identification of Multiple Outliers in Linear Models

Ali S. Hadi; Jeffrey S. Simonoff

Abstract We consider the problem of identifying and testing multiple outliers in linear models. The available outlier identification methods often do not succeed in detecting multiple outliers because they are affected by the observations they are supposed to identify. We introduce two test procedures for the detection of multiple outliers that appear to be less sensitive to this problem. Both procedures attempt to separate the data into a set of “clean” data points and a set of points that contain the potential outliers. The potential outliers are then tested to see how extreme they are relative to the clean subset, using an appropriately scaled version of the prediction error. The procedures are illustrated and compared to various existing methods, using several data sets known to contain multiple outliers. Also, the performances of both procedures are investigated by a Monte Carlo study. The data sets and the Monte Carlo indicate that both procedures are effective in the detection of multiple outliers ...
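The second stage of the idea above, testing how extreme each remaining point is relative to a clean subset via a scaled prediction error, can be sketched as follows (assuming NumPy; here the clean subset is supplied by the caller, whereas the paper's procedures construct it iteratively).

```python
import numpy as np

def prediction_error_tests(X, y, clean):
    """Scaled prediction errors for points outside a 'clean' subset (sketch).

    Fit least squares on the clean rows only, then measure how extreme each
    remaining point is via its prediction error scaled by its standard
    error, in the spirit of the clean-subset testing idea described above.
    """
    Xc, yc = X[clean], y[clean]
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    dof = len(clean) - X.shape[1]
    s2 = resid @ resid / dof                   # error variance from clean fit
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    stats = {}
    for i in set(range(len(y))) - set(clean):
        xi = X[i]
        # Variance of the prediction error for a point outside the fit.
        pred_var = s2 * (1.0 + xi @ XtX_inv @ xi)
        stats[i] = (y[i] - xi @ beta) / np.sqrt(pred_var)
    return stats
```

Because the fit uses only the clean rows, a genuine outlier cannot mask itself by dragging the fitted line toward its own position, which is exactly the failure of the single-fit methods the abstract criticizes.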


Journal of the American Statistical Association | 1997

Fitting the Generalized Pareto Distribution to Data

Enrique Castillo; Ali S. Hadi

Abstract The generalized Pareto distribution (GPD) was introduced by Pickands to model exceedances over a threshold. It has since been used by many authors to model data in several fields. The GPD has a scale parameter (σ > 0) and a shape parameter (−∞ < k < ∞). When k > 1, the maximum likelihood estimates do not exist, and when k is between 1/2 and 1, they may have problems. Furthermore, for k ≤ −1/2, second and higher moments do not exist, and hence both the method-of-moments (MOM) and the probability-weighted moments (PWM) estimates do not exist. Another and perhaps more serious problem with the MOM and PWM methods is that they can produce nonsensical estimates (i.e., estimates inconsistent with the observed data). In this article we propose a method for estimating the parameters and quantiles of the GPD. The estimators are well defined for all parameter values. They are also easy to compute. Some asymptotic results are provided ...
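A quantile-based estimator of the general kind described can be sketched as follows. The sketch solves for (σ, k) from a single pair of order statistics under the parameterization x(p) = σ(1 − (1 − p)^k)/k; it assumes NumPy, and a practical method would robustly combine many such elemental estimates rather than rely on one pair.

```python
import numpy as np

def gpd_quantile(p, sigma, k):
    """GPD quantile function: x(p) = sigma*(1-(1-p)**k)/k, for k != 0."""
    return sigma * (1.0 - (1.0 - p) ** k) / k

def elemental_gpd_estimate(x, i, j, lo=-5.0, hi=5.0, tol=1e-10):
    """One elemental estimate of (sigma, k) from two order statistics (sketch).

    Equates the i-th and j-th order statistics to the GPD quantiles at
    plotting positions i/(n+1) and j/(n+1), then solves the resulting 2x2
    system: k by bisection on the quantile ratio (which does not involve
    sigma), and sigma in closed form afterward.
    """
    xs = np.sort(x)
    n = len(xs)
    pi, pj = i / (n + 1.0), j / (n + 1.0)
    xi, xj = xs[i - 1], xs[j - 1]
    target = xi / xj

    def ratio(k):
        if abs(k) < 1e-12:                     # k -> 0 (exponential) limit
            return np.log(1 - pi) / np.log(1 - pj)
        return (1 - (1 - pi) ** k) / (1 - (1 - pj) ** k)

    # ratio(k) is monotone in k, so bisection locates the root.
    a, b = lo, hi
    while b - a > tol:
        m = 0.5 * (a + b)
        if (ratio(m) - target) * (ratio(a) - target) <= 0:
            b = m
        else:
            a = m
    k = 0.5 * (a + b)
    sigma = k * xi / (1 - (1 - pi) ** k)
    return sigma, k
```

Because the estimate comes from quantile matching rather than from moments or the likelihood, it is defined for every parameter value, which is the property the abstract claims for the proposed estimators.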


IEEE Transactions on Systems, Man, and Cybernetics | 1997

Sensitivity analysis in discrete Bayesian networks

Enrique Castillo; José Manuel Gutiérrez; Ali S. Hadi

This paper presents an efficient computational method for performing sensitivity analysis in discrete Bayesian networks. The method exploits the structure of conditional probabilities of a target node given the evidence. First, the set of parameters which is relevant to the calculation of the conditional probabilities of the target node is identified. Next, this set is reduced by removing those combinations of the parameters which either contradict the available evidence or are incompatible. Finally, using the canonical components associated with the resulting subset of parameters, the desired conditional probabilities are obtained. In this way, an important saving in the calculations is achieved. The proposed method can also be used to compute exact upper and lower bounds for the conditional probabilities, hence a sensitivity analysis can be easily performed. Examples are used to illustrate the proposed methodology.
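The bounds mentioned above can be illustrated on a two-node network. Because the target conditional probability is a ratio of functions that are linear in each conditional-probability parameter, it is monotone in each parameter, so exact upper and lower bounds over a parameter range are attained at the range's endpoints. A small NumPy illustration follows; the network and the numbers are invented for the example.

```python
import numpy as np

# Tiny discrete Bayesian network A -> B, with evidence B = 1 and target A.
p_a = 0.3            # P(A = 1)
p_b_given_a = 0.9    # P(B = 1 | A = 1)

def posterior(theta):
    """P(A=1 | B=1) as a function of the parameter theta = P(B=1 | A=0)."""
    num = p_a * p_b_given_a
    return num / (num + (1 - p_a) * theta)

# Uncertainty range for theta; the posterior is monotone in theta, so the
# exact bounds over the range are its values at the two endpoints.
lo, hi = 0.1, 0.8
bounds = sorted([posterior(lo), posterior(hi)])

# Grid evaluation confirms no interior value escapes the endpoint bounds.
grid = [posterior(t) for t in np.linspace(lo, hi, 1001)]
```

In a larger network the same monotonicity argument is applied per parameter combination, which is why the bounds can be computed exactly rather than by search.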


Computational Statistics & Data Analysis | 1992

A new measure of overall potential influence in linear regression

Ali S. Hadi

Abstract In this article we introduce a new influence measure and a graphical display for the detection of influential observations in linear regression. We demonstrate the need for such a new influence measure by giving examples where existing influence measures, numerous as they may be, may fail to detect influential observations. The existing influence measures are procedures each of which is designed to detect the influence of observations on a specific regression result. The proposed diagnostic is a measure of overall potential influence. Fitting a regression model in practice can have many purposes, and the user may prefer to begin with an overall measure. This diagnostic possesses several desirable properties that many of the frequently used diagnostics do not generally possess: (a) it is invariant to location and scale changes in the response variable, (b) it is invariant to nonsingular transformations of the explanatory variables, (c) it is an additive function of measures of leverage and of residual error, and (d) it is monotonically increasing in the leverage values and in the squared residuals. The new influence measure is also supplemented by a graphical display which shows the source of influence. This graph is suitable for both single- and multiple-case influence. Analytical and empirical comparisons between the proposed measure and the existing ones indicate that our measure is superior to existing influence measures.
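The abstract lists the measure's properties without its formula. One common statement of Hadi's influence measure from the wider literature, taken here as an assumption, is H_i = p_ii/(1 − p_ii) + ((k+1)/(1 − p_ii)) · d_i²/(1 − d_i²), where p_ii is the i-th leverage, d_i = e_i/sqrt(SSE) is the normalized residual, and k + 1 is the number of regression coefficients; note the additive leverage and residual terms that property (c) describes. A NumPy sketch:

```python
import numpy as np

def hadi_influence(X, y):
    """Hadi's overall potential-influence measure (sketch).

    H_i = p_ii/(1-p_ii) + (k+1)/(1-p_ii) * d_i**2/(1-d_i**2),
    where p_ii is the i-th diagonal of the hat matrix, d_i = e_i/sqrt(SSE)
    is the normalized residual, and k+1 is the number of columns of X
    (including the intercept). The first term measures potential (leverage);
    the second measures residual error.
    """
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix
    lev = np.diag(H)                           # leverages p_ii
    e = y - H @ y                              # residuals
    d2 = e ** 2 / (e @ e)                      # normalized squared residuals
    return lev / (1 - lev) + p / (1 - lev) * d2 / (1 - d2)
```

The additive form means a point can earn a large H_i from leverage alone, from residual error alone, or from both, which is what makes it an overall measure rather than one tied to a single regression result.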


The American Statistician | 1998

Some Cautionary Notes on the Use of Principal Components Regression

Ali S. Hadi; Robert F. Ling

Abstract Many textbooks on regression analysis include the methodology of principal components regression (PCR) as a way of treating multicollinearity problems. Although we have not encountered any strong justification of the methodology, we have encountered, through carrying out the methodology in well-known data sets with severe multicollinearity, serious actual and potential pitfalls in the methodology. We address these pitfalls as cautionary notes, numerical examples that use well-known data sets. We also illustrate by theory and example that it is possible for the PCR to fail miserably in the sense that when the response variable is regressed on all of the p principal components (PCs), the first (p − 1) PCs contribute nothing toward the reduction of the residual sum of squares, yet the last PC alone (the one that is always discarded according to PCR methodology) contributes everything. We then give conditions under which the PCR totally fails in the above sense.
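The total-failure scenario in the last sentence is easy to reproduce: construct a collinear design and a response that lies along the smallest-variance principal component, so the components PCR keeps explain nothing while the discarded one explains everything. A small constructed NumPy example (not one of the article's data sets):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
# Three nearly collinear predictors: each is z plus a little noise.
z = rng.normal(size=n)
X = np.column_stack([z + 0.01 * rng.normal(size=n) for _ in range(3)])
Xc = X - X.mean(axis=0)

# Principal components via SVD of the centered design.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
PCs = Xc @ Vt.T                      # columns are PC scores

# Response constructed along the LAST (smallest-variance) PC.
y = PCs[:, -1]

def r2(Z, y):
    """R-squared of the no-intercept least-squares fit of y on Z."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1 - resid @ resid / (y @ y)

r2_first_two = r2(PCs[:, :2], y)     # the PCs that PCR would keep
r2_last = r2(PCs[:, -1:], y)         # the PC that PCR would discard
```

Because the PC scores are orthogonal, the first two components contribute nothing to the fit while the discarded third component reproduces the response exactly, matching the failure the article proves can occur.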


Computational Statistics & Data Analysis | 1997

Maximum trimmed likelihood estimators: a unified approach, examples, and algorithms

Ali S. Hadi; Alberto Luceño

Abstract The likelihood principle is one of the most important concepts in Statistics. Among other things, it is used to obtain point estimators for the parameters of probability distributions of random variables by maximizing the likelihood function. The resulting maximum likelihood estimators usually have desirable properties such as consistency and efficiency. However, these estimators are often not robust, as there is usually a trade-off between robustness and efficiency; the more robust an estimator is, the less efficient it may be when the data come from a Gaussian distribution. In this paper we investigate how the estimators change when the likelihood function is replaced by a trimmed version of it. The idea here is to trim the likelihood function rather than directly trim the data. Because the likelihood is scalar-valued, it is always possible to order and trim univariate as well as multivariate observations according to their contributions to the likelihood function. The degree of trimming depends on some parameters to be specified by the analyst. We show how this trimmed likelihood principle produces many of the existing estimators (e.g., maximum likelihood, least squares, least trimmed squares, least median of squares, and minimum volume ellipsoid estimators) as special cases. Since the resulting estimators may be very robust, they can be used, for example, for outlier detection. In some cases the estimators can be obtained in closed form. In other cases they may require numerical solutions. In cases where the estimators cannot be obtained in closed form, we provide several algorithms for computing the estimates. The method and the algorithms are illustrated by several examples of both discrete and continuous distributions.
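The trimming idea can be sketched for the simplest case, a normal mean with known scale, where ordering points by likelihood contribution reduces to ordering by squared deviation. The concentration-style iteration below is an illustration under that assumption, not the paper's general algorithms.

```python
import numpy as np

def trimmed_mle_normal_mean(x, h, n_iter=50):
    """Maximum trimmed likelihood for a normal mean (sketch).

    Rather than trimming the data directly, trim the likelihood: keep only
    the h observations with the largest log-likelihood contributions at the
    current fit, re-estimate on those, and iterate. For the normal with
    known scale, the log-likelihood contribution of each point is
    -(x_i - mu)**2/2 + const, so the h largest contributions are simply the
    h smallest squared deviations from the current mean.
    """
    mu = np.median(x)                        # robust starting point
    for _ in range(n_iter):
        keep = np.argsort((x - mu) ** 2)[:h]
        new_mu = x[keep].mean()
        if np.isclose(new_mu, mu):
            break
        mu = new_mu
    return mu
```

With h below the number of clean points, gross outliers never enter the trimmed likelihood, so the estimate stays near the bulk of the data even when the ordinary mean is ruined.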


Technometrics | 2004

A General Method for Local Sensitivity Analysis With Application to Regression Models and Other Optimization Problems

Enrique Castillo; Ali S. Hadi; Antonio J. Conejo; Alfonso Fernández-Canteli

This article introduces a method for sensitivity analysis of general applicability. The method is based on the well-known duality property of mathematical programming, which states that the partial derivatives of the primal objective function with respect to the right-hand-side parameters of the constraints are the negatives of the optimal values of the dual problem variables. To make the parameters or data for which sensitivities are sought appear on the right-hand side, they are converted into artificial variables that are then constrained to their actual values, thus obtaining the desired constraints. The method is applicable to linear and nonlinear models, to normal and nonnormal models, and to least squares and other methods of estimation. In addition to its general applicability, the method is also computationally inexpensive, because the necessary information becomes available without extra calculations. The theoretical basis for the method is given and illustrated by its application to least squares, least absolute value, and minimax regression problems and to the estimation of a Weibull distribution from censored data.
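The duality fact the method rests on can be checked numerically on a toy equality-constrained quadratic program: the gradient of the optimal value with respect to the right-hand side equals the Lagrange multipliers (the sign depends on the convention chosen for the Lagrangian). The example below assumes NumPy and is not one of the article's applications.

```python
import numpy as np

# Toy problem:  min 0.5*||b||^2  subject to  A b = c.
# With Lagrangian L = 0.5*||b||^2 + lam.(c - A b), stationarity gives
# b* = A.T lam and A A.T lam = c, and the optimal value is
# f*(c) = 0.5 * c.T (A A.T)^{-1} c, whose gradient in c is lam.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
c = np.array([1.0, 2.0])

AAt_inv = np.linalg.inv(A @ A.T)
b_star = A.T @ AAt_inv @ c            # primal minimizer
lam = AAt_inv @ c                     # Lagrange multipliers (dual variables)

def f_star(rhs):
    """Optimal value of the problem as a function of the right-hand side."""
    return 0.5 * rhs @ AAt_inv @ rhs

# Finite-difference gradient of the optimal value with respect to c: it
# should match the dual variables, with no extra optimization runs needed.
eps = 1e-6
grad = np.array([(f_star(c + eps * e) - f_star(c - eps * e)) / (2 * eps)
                 for e in np.eye(2)])
```

This is the sense in which the sensitivities come free: once the problem is solved, the dual variables already are the derivatives of the objective with respect to the right-hand-side data.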


Technometrics | 2001

Some Applications of Functional Networks in Statistics and Engineering

Enrique Castillo; José Manuel Gutiérrez; Ali S. Hadi; Beatriz Lacruz

Functional networks are a general framework useful for solving a wide range of problems in probability, statistics, and engineering applications. In this article, we demonstrate that functional networks can be used for many general purposes including (a) solving nonlinear regression problems without the rather strong assumption of a known functional form, (b) modeling chaotic time series data, (c) finding conjugate families of distribution functions needed for the applications of Bayesian statistical techniques, (d) analyzing the problem of stability with respect to maxima operations, which are useful in the theory and applications of extreme values, and (e) modeling the reproductivity and associativity laws that have many applications in applied probability. We also give two specific engineering applications—analyzing the Ikeda map with parameters leading to chaotic behavior and modeling beam stress subject to a given load. The main purpose of this article is to introduce functional networks and to show their power and usefulness in engineering and statistical applications. We describe the steps involved in working with functional networks including structural learning (specification and simplification of the initial topology), parametric learning, and model-selection procedures. The concepts and methodologies are illustrated using several examples of applications.


International Journal of Fatigue | 1999

On fitting a fatigue model to data

Enrique Castillo; Alfonso Fernández-Canteli; Ali S. Hadi

Abstract Fatigue lifetimes are dependent on several physical constraints. Therefore, a realistic model for analyzing fatigue lifetime data should take into account these constraints. These physical considerations lead to a functional solution in the form of two five-parameter models for the analysis of fatigue lifetime data. The parameters have clear physical interpretations. However, the standard estimation methods, such as the maximum likelihood, do not produce satisfactory results because: (a) the range of the distribution depends on the parameters, (b) the parameters appear non-linearly in the likelihood gradient equations and hence their solution requires multidimensional searches which may lead to convergence problems, and (c) the maximum likelihood estimates may not exist because the likelihood can be made infinite for some values of the parameters. Castillo and Hadi [5] consider only one of the two models and use the elemental percentile method to estimate the parameters and quantiles. This paper considers the other model. The parameters and quantiles are estimated by the elemental percentile method and are easy to compute. A simulation study shows that the estimators perform well under different values of the parameters. The method is also illustrated by fitting the model to an example of real-life fatigue lifetime data.

Collaboration


Dive into Ali S. Hadi's collaborations.

Top Co-Authors

Enrique Castillo | American University in Cairo

Samprit Chatterjee | Icahn School of Medicine at Mount Sinai

José Manuel Gutiérrez | Spanish National Research Council