Publications


Featured research published by Tim Verdonck.


Computational Statistics & Data Analysis | 2008

Principal component regression for data containing outliers and missing elements

Sven Serneels; Tim Verdonck

Two approaches are presented to perform principal component analysis (PCA) on data which contain both outlying cases and missing elements. First, an eigendecomposition of a covariance matrix that can deal with such data is proposed, but this approach is not suitable for data where the number of variables exceeds the number of cases. Alternatively, an expectation robust (ER) algorithm is proposed to adapt the existing methodology for robust PCA to data containing missing elements. An extensive simulation study shows that the ER approach performs well for all data sizes considered. Using simulations and an example, it is shown that, by virtue of the ER algorithm, the properties of the existing methods for robust PCA carry over to data with missing elements.
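
As a rough, non-robust sketch of the missing-data side of this idea, the snippet below alternates between fitting a low-rank PCA and re-imputing the missing cells from the reconstruction (EM-style). The helper pca_impute and all parameter choices are illustrative; the robust covariance step of the paper is omitted.

    # Minimal non-robust sketch: iterative PCA imputation of missing entries.
    # The paper additionally robustifies the PCA step against outliers.
    import numpy as np
    from sklearn.decomposition import PCA

    def pca_impute(X, n_components=2, n_iter=50):
        X = np.asarray(X, dtype=float)
        missing = np.isnan(X)
        col_means = np.nanmean(X, axis=0)
        X_filled = np.where(missing, col_means, X)   # start from column means
        for _ in range(n_iter):
            pca = PCA(n_components=n_components).fit(X_filled)
            scores = pca.transform(X_filled)
            X_hat = pca.inverse_transform(scores)    # rank-k reconstruction
            X_filled[missing] = X_hat[missing]       # re-impute the missing cells only
        return X_filled, pca

    # toy usage
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    X[rng.random(X.shape) < 0.1] = np.nan
    X_completed, model = pca_impute(X)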


Computational Statistics & Data Analysis | 2009

Robust PCA for skewed data and its outlier map

Mia Hubert; Peter J. Rousseeuw; Tim Verdonck

The outlier sensitivity of classical principal component analysis (PCA) has spurred the development of robust techniques. Existing robust PCA methods like ROBPCA work best if the non-outlying data have an approximately symmetric distribution. When the original variables are skewed, too many points tend to be flagged as outlying. A robust PCA method is developed which is also suitable for skewed data. To flag the outliers a new outlier map is defined. Its performance is illustrated on real data from economics, engineering, and finance, and confirmed by a simulation study.
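
To make the outlier map concrete, the sketch below computes the two diagnostic distances such a map plots, using a classical (non-robust) PCA fit as a stand-in for the robust fit; the skew-adjusted flagging rules of the paper are not reproduced.

    # Score distance (within the PCA subspace) and orthogonal distance (to it),
    # the two axes of a PCA outlier map; classical PCA is used here for simplicity.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 6))

    pca = PCA(n_components=2).fit(X)
    scores = pca.transform(X)

    sd = np.sqrt(np.sum(scores**2 / pca.explained_variance_, axis=1))  # score distance
    od = np.linalg.norm(X - pca.inverse_transform(scores), axis=1)     # orthogonal distance
    # observations with large sd and/or od would be flagged on the outlier map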


Journal of Computational and Graphical Statistics | 2012

A Deterministic Algorithm for Robust Location and Scatter

Mia Hubert; Peter J. Rousseeuw; Tim Verdonck

Most algorithms for highly robust estimators of multivariate location and scatter start by drawing a large number of random subsets. For instance, the FASTMCD algorithm of Rousseeuw and Van Driessen starts in this way, and then takes so-called concentration steps to obtain a more accurate approximation to the MCD. The FASTMCD algorithm is affine equivariant but not permutation invariant. In this article, we present a deterministic algorithm, denoted as DetMCD, which does not use random subsets and is even faster. It computes a small number of deterministic initial estimators, followed by concentration steps. DetMCD is permutation invariant and very close to affine equivariant. We compare it to FASTMCD and to the OGK estimator of Maronna and Zamar. We also illustrate it on real and simulated datasets, with applications involving principal component analysis, classification, and time series analysis. Supplemental material (Matlab code of the DetMCD algorithm and the datasets) is available online.
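
DetMCD itself is not part of scikit-learn, but the FastMCD algorithm it is compared with is available there as MinCovDet; the snippet below only illustrates that MCD interface and is not the authors' code.

    # FastMCD (the subsampling-based algorithm) via scikit-learn's MinCovDet.
    import numpy as np
    from sklearn.covariance import MinCovDet

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 4))
    X[:10] += 8.0                        # a small cluster of outliers

    mcd = MinCovDet(support_fraction=0.75, random_state=0).fit(X)
    location, scatter = mcd.location_, mcd.covariance_   # robust estimates
    d = mcd.mahalanobis(X)               # robust distances, large for the outliers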


Advanced Data Analysis and Classification | 2010

Robust kernel principal component analysis and classification

Michiel Debruyne; Tim Verdonck

Kernel principal component analysis (KPCA) extends linear PCA from a real vector space to any high-dimensional kernel feature space. The sensitivity of linear PCA to outliers is well known and various robust alternatives have been proposed in the literature. For KPCA, such robust versions have received considerably less attention. In this article we present kernel versions of three robust PCA algorithms: spherical PCA, projection pursuit, and ROBPCA. These robust KPCA algorithms are analyzed in a classification context by applying discriminant analysis on the KPCA scores. The performance of the different robust KPCA algorithms is studied in a simulation study comparing misclassification percentages on both clean and contaminated data. An outlier map is constructed to visualize outliers in such classification problems. A real-life example from protein classification illustrates the usefulness of robust KPCA and its corresponding outlier map.
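
One of the three building blocks, spherical PCA, is easy to sketch in the input space; the kernel versions and the discriminant-analysis step of the paper are not reproduced here, and the helpers below are illustrative only.

    # Spherical PCA sketch: project observations onto the unit sphere around a
    # robust centre (spatial median via Weiszfeld iterations), then run PCA.
    import numpy as np
    from sklearn.decomposition import PCA

    def spatial_median(X, n_iter=100, tol=1e-8):
        m = np.median(X, axis=0)
        for _ in range(n_iter):
            d = np.maximum(np.linalg.norm(X - m, axis=1), 1e-12)
            m_new = (X / d[:, None]).sum(axis=0) / (1.0 / d).sum()
            if np.linalg.norm(m_new - m) < tol:
                break
            m = m_new
        return m

    def spherical_pca(X, n_components=2):
        Z = X - spatial_median(X)
        Z /= np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
        return PCA(n_components=n_components).fit(Z)  # PCA of the projected points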


Journal of Chemometrics | 2012

Robust PARAFAC for incomplete data

Mia Hubert; Johan Van Kerckhoven; Tim Verdonck

Different methods exist to explore multiway data. In this article, we focus on the widely used PARAFAC (parallel factor analysis) model, which expresses multiway data in a more compact way without ignoring the underlying complex structure. An alternating least squares procedure is typically used to fit the PARAFAC model. It is, however, well known that least squares techniques are very sensitive to outliers, and hence the PARAFAC model as a whole is a nonrobust method. Therefore, a robust alternative, which can deal with fully observed data possibly contaminated by outlying samples, has already been proposed in the literature. In this paper, we present an approach to perform PARAFAC on data that contain both outlying cases and missing elements. A simulation study shows the good performance of our methodology. In particular, we can apply our method to a dataset in which scattering is detected and replaced with missing values. This is illustrated on a real data example.
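
A minimal, non-robust sketch of the underlying machinery is given below, assuming an EM-style treatment of the missing cells (they are re-imputed from the current model after each alternating least squares sweep); this is not the authors' robust procedure and the helper names are made up.

    # CP/PARAFAC by alternating least squares with EM-style imputation of
    # missing entries; no robustness against outlying samples.
    import numpy as np

    def khatri_rao(A, B):
        # column-wise Kronecker product, result has shape (I*J, R)
        return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

    def cp_als_missing(X, rank=2, n_iter=100, seed=0):
        X = np.array(X, dtype=float)
        mask = np.isnan(X)
        X[mask] = np.nanmean(X)                      # crude initial fill
        I, J, K = X.shape
        rng = np.random.default_rng(seed)
        A, B, C = (rng.normal(size=(d, rank)) for d in (I, J, K))
        for _ in range(n_iter):
            # alternating least squares updates on the currently completed tensor
            A = X.reshape(I, J * K) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
            B = np.moveaxis(X, 1, 0).reshape(J, I * K) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
            C = np.moveaxis(X, 2, 0).reshape(K, I * J) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
            X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
            X[mask] = X_hat[mask]                    # re-impute the missing cells
        return A, B, C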


The North American Actuarial Journal | 2009

A robustification of the chain-ladder method

Tim Verdonck; M. Van Wouwe; Jan Dhaene

In a non-life insurance business, an insurer often needs to build up a reserve to be able to meet future obligations arising from claims that have been incurred but not yet (completely) reported. To forecast these claims reserves, a simple but generally accepted algorithm is the classical chain-ladder method. Recent research has focused mainly on the underlying model for the claims reserves in order to obtain appropriate bounds for the estimates of future claims reserves. Our research concentrates on scenarios with outlying data. On closer examination, it is demonstrated that the forecasts for future claims reserves depend heavily on outlying observations. The paper focuses on two approaches to robustify the chain-ladder method: the first method detects and adjusts the outlying values, whereas the second method is based on a robust generalized linear model technique. In this way insurers will be able to find a reserve that is similar to the reserve they would have found had the data contained no outliers. Because the robust method flags the outliers, these observations can be set aside for further examination. The corresponding standard errors are obtained by bootstrapping. The robust chain-ladder method is applied to several run-off triangles with and without outliers, showing its excellent performance.
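
For reference, the classical (non-robust) chain-ladder completion of a cumulative run-off triangle can be written in a few lines; the function and toy triangle below are illustrative only and are not the robust variants proposed in the paper.

    # Classical chain-ladder: estimate development factors column by column,
    # then complete the unobserved lower-right part of the cumulative triangle.
    import numpy as np

    def chain_ladder(tri):
        tri = np.array(tri, dtype=float)
        n_dev = tri.shape[1]
        f = np.ones(n_dev - 1)
        for j in range(n_dev - 1):
            rows = ~np.isnan(tri[:, j + 1])           # accident years observed at j and j+1
            f[j] = tri[rows, j + 1].sum() / tri[rows, j].sum()
        full = tri.copy()
        for j in range(1, n_dev):                     # fill left to right
            todo = np.isnan(full[:, j])
            full[todo, j] = full[todo, j - 1] * f[j - 1]
        return f, full

    # toy cumulative triangle (rows: accident years, columns: development years)
    tri = [[100.0, 150.0, 165.0],
           [110.0, 170.0, np.nan],
           [120.0, np.nan, np.nan]]
    factors, completed = chain_ladder(tri)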


Computational Statistics & Data Analysis | 2015

The DetS and DetMM estimators for multivariate location and scatter

Mia Hubert; Peter J. Rousseeuw; Tim Verdonck

New deterministic robust estimators of multivariate location and scatter are presented. They combine ideas from the deterministic DetMCD estimator with steps from the subsampling-based FastS and FastMM algorithms. The new DetS and DetMM estimators perform similarly to FastS and FastMM on low-dimensional data, whereas in high dimensions they are more robust. Their computation time is much lower than that of FastS and FastMM, which makes it feasible to compute the estimators for a range of breakdown values. Moreover, they are permutation invariant and very close to affine equivariant.


Reliability Engineering & System Safety | 2014

Precision of power-law NHPP estimates for multiple systems with known failure rate scaling

Jozef Van Dyck; Tim Verdonck

The power-law non-homogeneous Poisson process, also called the Crow-AMSAA model, is often used to model the failure rate of repairable systems. In standard applications it is assumed that the recurrence rate is the same for all systems that are observed. The estimation of the model parameters on the basis of past failure data is typically performed using maximum likelihood. If the operational period over which failures are observed differs for each system, the Fisher information matrix is numerically inverted to quantify the precision of the parameter estimates. In this paper, the extended case is considered where the recurrence rate may vary between systems with known scaling factors, and it is shown that the standard error of the parameter estimates can be quantified using analytical formulae. The scaling factors allow the model to be applied to a wider range of problems. The analytical solution for the standard error simplifies the application and makes it easier to understand how the precision of the model varies with the amount of available data. The good performance and practical use of the method are illustrated in an example.
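
For the textbook single-system, time-truncated case the maximum likelihood estimates have a closed form; the sketch below shows only that baseline and does not include the multi-system extension with known scaling factors developed in the paper.

    # MLE for one power-law NHPP (Crow-AMSAA) observed on [0, T], with
    # intensity lam * beta * t**(beta - 1) (time-truncated case).
    import numpy as np

    def crow_amsaa_mle(failure_times, T):
        t = np.asarray(failure_times, dtype=float)
        n = t.size
        beta_hat = n / np.sum(np.log(T / t))
        lam_hat = n / T**beta_hat
        return lam_hat, beta_hat

    # toy usage: five failures of one repairable system over 1000 hours
    lam, beta = crow_amsaa_mle([50.0, 180.0, 400.0, 620.0, 950.0], T=1000.0)
    # beta < 1 suggests reliability growth, beta > 1 deterioration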


Technometrics | 2016

Sparse PCA for High-Dimensional Data With Outliers

Mia Hubert; Tom Reynkens; Eric Schmitt; Tim Verdonck

A new sparse PCA algorithm is presented, which is robust against outliers. The approach is based on the ROBPCA algorithm that generates robust but nonsparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present. In comparison with a projection pursuit-based algorithm, ROSPCA demonstrates superior robustness properties and comparable sparsity estimation capability, as well as significantly faster computation time.
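
ROSPCA itself is not available in scikit-learn; the snippet below runs an ordinary (non-robust) sparse PCA simply to make the notion of sparse loadings and a sparsity parameter concrete.

    # Non-robust sparse PCA: alpha plays the role of the sparsity parameter.
    import numpy as np
    from sklearn.decomposition import SparsePCA

    rng = np.random.default_rng(3)
    X = rng.normal(size=(150, 10))

    spca = SparsePCA(n_components=3, alpha=1.0, random_state=0).fit(X)
    loadings = spca.components_        # rows are loading vectors, many entries exactly 0
    scores = spca.transform(X)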


Statistical Analysis and Data Mining | 2017

Fast robust SUR with economical and actuarial applications

Mia Hubert; Tim Verdonck; Özlem Yorulmaz

The seemingly unrelated regression (SUR) model is a generalization of a linear regression model consisting of more than one equation, where the error terms of these equations are contemporaneously correlated. The standard Feasible Generalized Least Squares (FGLS) estimator is efficient as it takes into account the covariance structure of the errors, but it is also very sensitive to outliers. The robust SUR estimator of Bilodeau and Duchesne (Canadian Journal of Statistics, 28:277–288, 2000) can accommodate outliers, but it is hard to compute. First, we propose a fast algorithm, FastSUR, for its computation and show its good performance in a simulation study. We then provide diagnostics for outlier detection and illustrate them on a real data set from economics. Next, we apply our FastSUR algorithm in the framework of stochastic loss reserving for general insurance. We focus on the General Multivariate Chain Ladder (GMCL) model that employs SUR to estimate its parameters. Consequently, this multivariate stochastic reserving method takes into account the contemporaneous correlations among run-off triangles and allows structural connections between these triangles. We plug our FastSUR algorithm into the GMCL model to obtain a robust version.
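
The classical FGLS estimator that FastSUR robustifies can be written down directly; the two-equation sketch below is illustrative only (the helper and the stacking convention are ours) and is not the FastSUR algorithm.

    # Feasible GLS for a SUR system: per-equation OLS, estimate the
    # contemporaneous error covariance, then GLS on the stacked system.
    import numpy as np

    def sur_fgls(X_list, y_list):
        n, m = y_list[0].shape[0], len(y_list)
        betas = [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(X_list, y_list)]
        resid = np.column_stack([y - X @ b for X, y, b in zip(X_list, y_list, betas)])
        Sigma = resid.T @ resid / n                   # contemporaneous error covariance
        X_big = np.zeros((n * m, sum(X.shape[1] for X in X_list)))
        col = 0
        for i, X in enumerate(X_list):                # block-diagonal stacked design
            X_big[i * n:(i + 1) * n, col:col + X.shape[1]] = X
            col += X.shape[1]
        y_big = np.concatenate(y_list)
        W = np.kron(np.linalg.inv(Sigma), np.eye(n))  # inverse covariance of stacked errors
        return np.linalg.solve(X_big.T @ W @ X_big, X_big.T @ W @ y_big)

    # toy usage with two equations sharing correlated errors
    rng = np.random.default_rng(4)
    n = 200
    e = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=n)
    X1, X2 = rng.normal(size=(n, 2)), rng.normal(size=(n, 3))
    y1 = X1 @ np.array([1.0, -2.0]) + e[:, 0]
    y2 = X2 @ np.array([0.5, 1.0, 2.0]) + e[:, 1]
    beta_hat = sur_fgls([X1, X2], [y1, y2])           # five stacked coefficients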

Collaboration


Dive into Tim Verdonck's collaborations.

Top Co-Authors

Peter J. Rousseeuw (Katholieke Universiteit Leuven)
Kris Boudt (Vrije Universiteit Brussel)
Pieter Segaert (Katholieke Universiteit Leuven)
Stefan Van Aelst (Katholieke Universiteit Leuven)
Dries Cornilly (Vrije Universiteit Brussel)
Kris Peremans (Katholieke Universiteit Leuven)