Nedret Billor
Auburn University
Publication
Featured research published by Nedret Billor.
Computational Statistics & Data Analysis | 2000
Nedret Billor; Ali S. Hadi; Paul F. Velleman
Although it is customary to assume that data are homogeneous, in fact they often contain outliers or subgroups. Methods for identifying multiple outliers and subgroups must deal with the challenge of establishing a metric, not itself contaminated by inhomogeneities, by which to measure how extraordinary a data point is. For samples large enough to support sophisticated methods, the computational cost often makes outlier detection unattractive. All multiple outlier detection methods have suffered in the past from a computational cost that escalated rapidly with the sample size. We propose a new general approach, based on the methods of Hadi (1992a, 1994) and Hadi and Simonoff (1993), that can be computed quickly, often requiring fewer than five evaluations of the model being fit to the data, regardless of the sample size. Two cases of this approach are presented in this paper (algorithms for the detection of outliers in multivariate and regression data). The algorithms, however, can be applied more broadly than to these two cases. We show that the proposed methods match the performance of more computationally expensive methods on standard test problems and demonstrate their superior performance on large simulated challenges.
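The abstract gives no algorithmic detail, so the following is only a rough sketch of the general idea of growing a clean "basic subset" from robust distances. The function name `forward_outlier_search`, the starting size `4 * p`, and the plain chi-square cutoff are assumptions made for illustration; the published algorithm also applies correction factors that are omitted here.

```python
import numpy as np

def forward_outlier_search(X, cutoff, m_init=None, max_iter=50):
    """Grow a clean basic subset; flag the points left outside it.

    Simplified illustration, not the published algorithm: no
    small-sample correction is applied to the cutoff.
    """
    n, p = X.shape
    m_init = m_init or 4 * p  # hypothetical starting size
    # Start from the points closest to the coordinate-wise median.
    dist0 = np.linalg.norm(X - np.median(X, axis=0), axis=1)
    subset = np.zeros(n, dtype=bool)
    subset[np.argsort(dist0)[:m_init]] = True
    for n_fits in range(1, max_iter + 1):
        # One "model evaluation": mean and covariance of the current subset.
        center = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        diff = X - center
        # Squared Mahalanobis distance of every point to the subset fit.
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        new_subset = d2 < cutoff
        if np.array_equal(new_subset, subset):
            break
        subset = new_subset
    return ~subset, n_fits

# Synthetic data: 200 clean points plus 10 planted outliers (illustrative only).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 2))
planted = rng.normal(loc=8.0, scale=0.3, size=(10, 2))
X = np.vstack([clean, planted])

# For p = 2 the chi-square quantile at level q is exactly -2 * log(1 - q).
alpha = 0.05
cutoff = -2.0 * np.log(alpha / len(X))
flagged, n_fits = forward_outlier_search(X, cutoff)
print("planted outliers flagged:", int(flagged[200:].sum()), "of 10")
print("clean points flagged:", int(flagged[:200].sum()), "| fits used:", n_fits)
```

Each pass through the loop costs one mean and covariance fit on the current subset, so the number of fits is set by how many passes the subset takes to stabilize, not directly by the sample size.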
American Journal of Mathematical and Management Sciences | 2006
Nedret Billor; Samprit Chatterjee; Ali S. Hadi
Chatterjee and Mächler (1997) propose an iteratively weighted least squares procedure as a robust fit for linear models. The weights are a function of leverage and residuals. The standard measure of leverage (the diagonal element of the projection matrix), as is well known, can be distorted by the presence of a collection of points that individually have small leverage values but collectively form a high-leverage group (“masking points”). The Chatterjee-Mächler procedure is not very effective when there is extensive masking. We present a procedure that works in the presence or absence of masking. In the proposed procedure, instead of using the diagonal elements of the projection matrix as a measure of leverage, we use a robust distance proposed by Hadi (1992a, 1994). This measure eliminates the distorting effect of masking by constructing a measure of location and dispersion for the observed points that is free from the effects of multivariate outliers and clustering in the X-space. The method is complemented by a simple diagnostic plot that displays clearly the nature of all the data points, distinguishing among outliers, leverage points, and well-fitted points. The proposed procedure is illustrated by data sets that are known to have severe masking.
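The masking effect described above can be seen in a small numerical illustration. This sketch only computes the standard leverage values (the hat-matrix diagonal) on made-up data; it does not implement the Chatterjee-Mächler weights or Hadi's robust distance, and the helper names `leverages` and `design` are hypothetical.

```python
import numpy as np

def leverages(X):
    """Diagonal of the projection (hat) matrix H = X (X'X)^{-1} X'."""
    Q, _ = np.linalg.qr(X)  # thin QR; assumes X has full column rank
    return np.sum(Q**2, axis=1)

def design(x):
    """Simple-regression design matrix: intercept column plus x."""
    return np.column_stack([np.ones_like(x), x])

bulk = np.linspace(-2.0, 2.0, 20)  # well-behaved x values

# One remote point: its leverage is large and easy to spot.
h_single = leverages(design(np.append(bulk, 10.0)))

# Five identical remote points: each one's leverage is capped at 1/5,
# so the group looks much less extreme point by point.
h_group = leverages(design(np.append(bulk, [10.0] * 5)))

print("single remote point:", round(float(h_single[-1]), 3))
print("each of five remote points:", np.round(h_group[-5:], 3))
```

Because the hat matrix is idempotent, k identical rows can each have leverage at most 1/k, which is why a tight group of remote points can slip past a diagnostic based on individual leverage values.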
Journal of Applied Statistics | 1999
Nedret Billor
In this study, the method of local influence, introduced by Cook as a general tool for assessing the influence of local departures from the underlying assumptions, is applied to ridge regression. Because this method is suited to likelihood-based models, the ridge estimator is defined as a maximum pseudo-likelihood estimator obtained using the augmentation approach. In addition, an alternative local influence approach suggested by Billor and Loynes is applied to ridge regression. A comparison of these approaches and an example are given.
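One common reading of the "augmentation approach" is that the ridge estimator equals ordinary least squares on a data set extended with p pseudo-observations. The sketch below checks that equivalence numerically on synthetic data; the ridge constant `k` and all variable names are assumptions for illustration, and the local influence analysis itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, k = 30, 3, 2.5  # k is an arbitrary ridge constant
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=n)

# Closed-form ridge estimator: (X'X + kI)^{-1} X'y.
beta_ridge = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Augmented data: append sqrt(k) * I to X and p zeros to y,
# then fit by ordinary least squares.
X_aug = np.vstack([X, np.sqrt(k) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
beta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(beta_ridge, beta_aug))
```

The two fits agree because the augmented cross-product matrix is X'X + kI while the augmented X'y is unchanged, so likelihood-based tools for least squares can be applied to the augmented problem.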
Communications in Statistics - Simulation and Computation | 2008
Nedret Billor; Gülsen Kiral
The problem of outliers in statistical data has attracted many researchers for a long time. Consequently, numerous outlier detection methods have been proposed in the statistical literature. However, no consensus has emerged as to which method is uniformly better than the others or which one is recommended for use in practical situations. In this article, we perform an extensive comparative Monte Carlo simulation study to assess the performance of multiple outlier detection methods that are either recently proposed or frequently cited in the outlier detection literature. Our simulation experiments include a wide variety of realistic and challenging regression scenarios. We give recommendations on which methods are superior under which conditions.
Sensors | 2016
Gifty E. Acquah; Brian K. Via; Nedret Billor; Oladiran Fasina; Lori G. Eckhardt
As new markets, technologies and economies evolve in the low carbon bioeconomy, forest logging residue, a largely untapped renewable resource, will play a vital role. The feedstock can, however, be variable depending on plant species and plant part component. This heterogeneity can influence the physical, chemical and thermochemical properties of the material, and thus the final yield and quality of products. Although it is challenging to control compositional variability of a batch of feedstock, it is feasible to monitor this heterogeneity and make the necessary changes in process parameters. Such a system will be a first step towards optimization, quality assurance and cost-effectiveness of processes in the emerging biofuel/chemical industry. The objective of this study was therefore to qualitatively classify forest logging residue made up of different plant parts using both near infrared spectroscopy (NIRS) and Fourier transform infrared spectroscopy (FTIRS) together with linear discriminant analysis (LDA). Forest logging residue harvested from several Pinus taeda (loblolly pine) plantations in Alabama, USA, was classified into three plant part components: clean wood, wood and bark, and slash (i.e., limbs and foliage). Five-fold cross-validated linear discriminant functions had classification accuracies of over 96% for both NIRS and FTIRS based models. An extra factor/principal component (PC) was, however, needed to achieve this in FTIRS modeling. Analysis of factor loadings of both NIR and FTIR spectra showed that the statistically different amount of cellulose in the three plant part components of logging residue contributed to their initial separation. This study demonstrated that NIR or FTIR spectroscopy coupled with PCA and LDA has the potential to be used as a high throughput tool in classifying the plant part makeup of a batch of forest logging residue feedstock. Thus, NIR/FTIR could be employed as a tool to rapidly probe/monitor the variability of forest biomass so that the appropriate online adjustments to parameters can be made in time to ensure process optimization and product quality.
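The PCA-then-LDA workflow with five-fold cross-validation can be sketched as follows. Everything here is an assumption for illustration: the spectra are synthetic, the function name `pca_lda_cv_accuracy` and the choice of three components are hypothetical, and no preprocessing from the study is reproduced.

```python
import numpy as np

def pca_lda_cv_accuracy(X, y, n_components=3, n_folds=5, seed=0):
    """Cross-validated accuracy of LDA fitted on PCA scores."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    classes = np.unique(y)
    correct = 0
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        Xtr, ytr = X[train_idx], y[train_idx]
        # PCA fitted on the training fold only.
        mu = Xtr.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xtr - mu, full_matrices=False)
        W = Vt[:n_components].T
        Ztr, Zte = (Xtr - mu) @ W, (X[test_idx] - mu) @ W
        # LDA: class means, pooled within-class covariance, priors.
        means = np.array([Ztr[ytr == c].mean(axis=0) for c in classes])
        resid = Ztr - means[np.searchsorted(classes, ytr)]
        pooled = resid.T @ resid / (len(ytr) - len(classes))
        P = np.linalg.inv(pooled)
        priors = np.array([(ytr == c).mean() for c in classes])
        # Linear discriminant score of each test point for each class.
        scores = (Zte @ P @ means.T
                  - 0.5 * np.sum(means @ P * means, axis=1)
                  + np.log(priors))
        pred = classes[np.argmax(scores, axis=1)]
        correct += int(np.sum(pred == y[test_idx]))
    return correct / len(y)

# Synthetic "spectra": three classes differing in the height of one band.
rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 40)
bump = np.exp(-((grid - 0.5) / 0.1) ** 2)
y = np.repeat([0, 1, 2], 60)
X = (np.sin(2 * np.pi * grid) + y[:, None] * bump
     + rng.normal(scale=0.1, size=(180, 40)))

accuracy = pca_lda_cv_accuracy(X, y)
print("5-fold CV accuracy:", accuracy)
```

Fitting the PCA inside each fold keeps the held-out spectra out of the dimension-reduction step, so the reported accuracy is not optimistically biased by the test data.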
Technometrics | 2014
Richard C. Jr. Bell; L. Allison Jones-Farmer; Nedret Billor
In quality control, a proper Phase I analysis is essential to the success of Phase II monitoring. A literature review reveals no distribution-free Phase I multivariate techniques in existence. This research develops a Phase I location control chart for multivariate elliptical processes. The resulting in-control reference sample can then be used to estimate the parameters for Phase II monitoring. Using Monte Carlo simulation, the proposed method is compared with Hotelling's T² Phase I chart. Although Hotelling's T² chart is preferred when the data are multivariate normal, the proposed method is shown to perform significantly better under nonnormality. This article has supplementary material online.
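For orientation, the statistic behind the comparator chart can be computed as below. This sketch covers only the classical Hotelling T² statistic for individual observations on synthetic data; control limits and the proposed distribution-free chart are not shown, and the function name `hotelling_t2` is hypothetical.

```python
import numpy as np

def hotelling_t2(X):
    """T2_i = (x_i - xbar)' S^{-1} (x_i - xbar) for every row of X."""
    center = X.mean(axis=0)
    S = np.cov(X, rowvar=False)  # sample covariance, divisor n - 1
    diff = X - center
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(S), diff)

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
X[7] += 6.0  # one shifted observation, planted for illustration

t2 = hotelling_t2(X)
print("largest T2 at index:", int(t2.argmax()))
print("sum of T2 values:", round(float(t2.sum()), 6))
```

When the mean and covariance are estimated from the same sample, the T² values always sum to p(n - 1), which gives a quick sanity check on any implementation.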
Journal of Applied Statistics | 2007
C. Caroni; Nedret Billor
Many methods have been developed for detecting multiple outliers in a single multivariate sample, but very few for the case where there may be groups in the data. We propose a method of simultaneously determining groups (as in cluster analysis) and detecting outliers, which are points that are distant from every group. Our method is an adaptation of the BACON algorithm proposed by Billor, Hadi and Velleman for the robust detection of multiple outliers in a single group of multivariate data. There are two versions of our method, depending on whether or not the groups can be assumed to have equal covariance matrices. The effectiveness of the method is illustrated by its application to two real data sets and further shown by a simulation study for different sample sizes and dimensions for 2 and 3 groups, with and without planted outliers in the data. When the number of groups is not known in advance, the algorithm could be used as a robust method of cluster analysis, by running it for various numbers of groups and choosing the best solution.
Computational Statistics & Data Analysis | 2013
Seokho Lee; Hyejin Shin; Nedret Billor
We propose a robust method for estimating principal functions based on MM estimation. Specifically, we formulate functional principal component analysis as alternating penalized M-regression with a bounded loss function. The resulting principal functions are given as M-type smoothing spline estimators. Using the properties of a natural cubic spline, we develop a fast computation algorithm even for long and dense functional data. The proposed method is efficient in that the maximal information from the whole observed curve is retained, since it partly downweights abnormally observed individual measurements in a single curve rather than removing or downweighting a whole curve. We demonstrate the performance of the proposed method on simulated and real data and compare it with conventional functional principal component analysis and other robust functional principal component analysis techniques.
Journal of Classification | 2008
Nedret Billor; Asheber Abebe; Asuman Türkmen; Sai V. Nudurupati
Suppose y, a d-dimensional (d ≥ 1) vector, is drawn from a mixture of k (k ≥ 2) populations, given by Π_1, Π_2, …, Π_k. We wish to identify the population that is the most likely source of the point y. To solve this classification problem, many classification rules have been proposed in the literature. In this study, a new nonparametric classifier based on the transvariation probabilities of data depth is proposed. We compare the performance of the newly proposed nonparametric classifier with classical and maximum depth classifiers using some benchmark and simulated data sets.
Communications in Statistics - Simulation and Computation | 2016
Melody Denhere; Nedret Billor
In this article, we discuss the estimation of the parameter function for a functional logistic regression model in the presence of outliers. We consider ways that allow the parameter estimator to be resistant to outliers, in addition to minimizing multicollinearity and reducing the high dimensionality that is inherent in functional data. To achieve this, the functional covariates and functional parameter of the model are approximated in a finite-dimensional space generated by an appropriate basis. This approach reduces the functional model to a standard multiple logistic model with highly collinear covariates and potential high-dimensionality issues. The proposed estimator tackles these issues and also minimizes the effect of functional outliers. Results from a simulation study and a real world example are also presented to illustrate the performance of the proposed estimator.
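The reduction to a multiple logistic model mentioned above can be written out in the usual way. The notation below (basis functions φ_j, coefficient vectors a_i and b, Gram matrix Ψ, basis size K) is assumed for illustration and may differ from the article's; the robust estimator itself is not shown.

```latex
% Functional logistic model for a binary response Y_i and covariate X_i(t):
\[
  \log\frac{\pi_i}{1-\pi_i} = \alpha + \int_T X_i(t)\,\beta(t)\,dt,
  \qquad \pi_i = P(Y_i = 1 \mid X_i).
\]
% Expand covariate and parameter function in the same K-dimensional basis:
\[
  X_i(t) \approx \sum_{j=1}^{K} a_{ij}\,\phi_j(t),
  \qquad
  \beta(t) \approx \sum_{k=1}^{K} b_k\,\phi_k(t).
\]
% The integral becomes a finite bilinear form in the coefficients:
\[
  \int_T X_i(t)\,\beta(t)\,dt \approx
  \sum_{j=1}^{K}\sum_{k=1}^{K} a_{ij}\,\Psi_{jk}\,b_k
  = \mathbf{a}_i^{\top}\Psi\,\mathbf{b},
  \qquad
  \Psi_{jk} = \int_T \phi_j(t)\,\phi_k(t)\,dt.
\]
% Hence a standard multiple logistic model with design matrix A\Psi:
\[
  \log\frac{\pi_i}{1-\pi_i} = \alpha + (A\Psi)_{i\cdot}\,\mathbf{b},
  \qquad A = (a_{ij})_{n \times K}.
\]
```

The columns of AΨ are typically highly correlated because neighbouring basis coefficients of smooth curves move together, which is the collinearity the abstract refers to.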