Paul H. C. Eilers
Leiden University
Publication
Featured research published by Paul H. C. Eilers.
Statistical Modelling | 2004
Iain D. Currie; María Durbán; Paul H. C. Eilers
The prediction of future mortality rates is a problem of fundamental importance for the insurance and pensions industry. We show how the method of P-splines can be extended to the smoothing and forecasting of two-dimensional mortality tables. We use a penalized generalized linear model with Poisson errors and show how to construct regression and penalty matrices appropriate for two-dimensional modelling. An important feature of our method is that forecasting is a natural consequence of the smoothing process. We illustrate our methods with two data sets provided by the Continuous Mortality Investigation Bureau, a central body for the collection and processing of UK insurance and pensions data.
British Journal of Cancer | 2006
Marcel Lombaerts; T. van Wezel; Katja Philippo; Jan Willem F Dierssen; Rhyenne Zimmerman; Jan Oosting; R. van Eijk; Paul H. C. Eilers; B van de Water; C. J. Cornelisse; A-M Cleton-Jansen
Using genome-wide expression profiling of a panel of 27 human mammary cell lines with different mechanisms of E-cadherin inactivation, we evaluated the relationship between E-cadherin status and gene expression levels. Expression profiles of cell lines with E-cadherin (CDH1) promoter methylation were significantly different from those with CDH1 expression or, surprisingly, those with CDH1 truncating mutations. Furthermore, we found no significant differentially expressed genes between cell lines with wild-type and mutated CDH1. The expression profile complied with the fibroblastic morphology of the cell lines with promoter methylation, suggestive of epithelial–mesenchymal transition (EMT). All other lines, including those with CDH1 mutations, had epithelial features. Three non-tumorigenic mammary cell lines derived from normal breast epithelium also showed CDH1 promoter methylation, a fibroblastic phenotype and expression profile. We suggest that CDH1 promoter methylation, but not mutational inactivation, is part of an entire programme, resulting in EMT and increased invasiveness in breast cancer. The molecular events that are part of this programme can be inferred from the differentially expressed genes and include genes from the TGFβ pathway, transcription factors involved in CDH1 regulation (i.e. ZFHX1B, SNAI2, but not SNAI1, TWIST), annexins, AP1/2 transcription factors and members of the actin and intermediate filament cytoskeleton organisation.
Computational Statistics & Data Analysis | 2006
Paul H. C. Eilers; Iain D. Currie; María Durbán
A framework of penalized generalized linear models and tensor products of B-splines with roughness penalties allows effective smoothing of data in multidimensional arrays. A straightforward application of the penalized Fisher scoring algorithm quickly runs into storage and computational difficulties. A novel algorithm takes advantage of the special structure of both the data as an array and the model matrix as a tensor product; the algorithm is fast, uses only a moderate amount of memory and works for any number of dimensions. Examples are given of how the method is used to smooth life tables and image data.
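The abstract above does not spell out the algorithm, but the basic identity that makes tensor-product structure cheap to exploit can be illustrated. The following is a minimal sketch, not the paper's method: for column-major vectorization, kron(B2, B1) @ vec(A) equals vec(B1 @ A @ B2.T), so the large Kronecker matrix never has to be formed. The function name `kron_matvec` and the random test matrices are assumptions made for illustration.

```python
import numpy as np

def kron_matvec(B1, B2, A):
    """Return vec(B1 @ A @ B2.T), which equals kron(B2, B1) @ vec(A).

    vec() stacks columns (Fortran order). The Kronecker product itself
    is never built, so memory stays proportional to the small factors.
    """
    return (B1 @ A @ B2.T).ravel(order="F")

# Hypothetical small example: two marginal basis matrices and a
# coefficient array (all random, for illustration only).
rng = np.random.default_rng(0)
B1 = rng.normal(size=(6, 3))   # basis along the first dimension
B2 = rng.normal(size=(5, 4))   # basis along the second dimension
A = rng.normal(size=(3, 4))    # coefficient array

full = np.kron(B2, B1) @ A.ravel(order="F")   # explicit 30 x 12 model matrix
fast = kron_matvec(B1, B2, A)                 # same result from small products
max_abs_diff = float(np.max(np.abs(full - fast)))
```

The explicit product needs storage for a matrix whose size is the product of the marginal sizes, while the array form only multiplies the small marginal matrices; this is the kind of saving the abstract alludes to.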
Bioinformatics | 2005
Paul H. C. Eilers; Renee X. de Menezes
MOTIVATION: Plots of array Comparative Genomic Hybridization (CGH) data often show special patterns: stretches of constant level (copy number) with sharp jumps between them. There can also be much noise. Classic smoothing algorithms do not work well, because they introduce too much rounding. To remedy this, we introduce a fast and effective smoothing algorithm based on penalized quantile regression. It can compute arbitrary quantile curves, but we concentrate on the median to show the trend and the lower and upper quartile curves showing the spread of the data. Two-fold cross-validation is used for optimizing the weight of the penalties. RESULTS: Simulated data and a published dataset are used to show the capabilities of the method to detect the segments of changed copy numbers in array CGH data.
Bioinformatics | 2004
Paul H. C. Eilers; Jelle J. Goeman
MOTIVATION: Scatterplots of microarray data generally contain a very large number of dots, making it difficult to get a good impression of their distribution in dense areas. RESULTS: We present a fast and simple algorithm for two-dimensional histogram smoothing, to visually enhance scatterplots. AVAILABILITY: Functions for Matlab and R are available from the corresponding author.
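The abstract does not describe the algorithm itself. One plausible way to smooth a two-dimensional histogram is sketched below, under the assumption that a difference-penalty (Whittaker-type) smoother is applied first along one axis of the count matrix and then along the other. The function names, the bin count, and the penalty weight `lam` are illustrative choices, not values from the paper.

```python
import numpy as np

def whittaker_matrix(n, lam, d=2):
    """System matrix I + lam * D'D, with D the d-th order difference matrix."""
    D = np.diff(np.eye(n), n=d, axis=0)
    return np.eye(n) + lam * D.T @ D

def smooth_hist2d(x, y, bins=50, lam=10.0):
    """Bin the points, then smooth the counts along each axis in turn."""
    H, xedges, yedges = np.histogram2d(x, y, bins=bins)
    W0 = whittaker_matrix(H.shape[0], lam)
    W1 = whittaker_matrix(H.shape[1], lam)
    Z = np.linalg.solve(W0, H)        # smooth along axis 0
    Z = np.linalg.solve(W1, Z.T).T    # smooth along axis 1
    return Z, xedges, yedges

# Hypothetical data: 2000 correlated points standing in for a dense scatterplot.
rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = 0.7 * x + 0.5 * rng.normal(size=2000)
Z, xedges, yedges = smooth_hist2d(x, y)
```

Because every row of the difference matrix sums to zero, this smoother preserves the total count, so `Z` can be displayed as a density image in place of the raw dots. Small negative values can occur in sparse regions and may need clipping before plotting.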
Chemometrics and Intelligent Laboratory Systems | 2003
Paul H. C. Eilers; Brian D. Marx
The Penalized Signal Regression (PSR) approach to multivariate calibration (MVC) assumes a smooth vector of coefficients for weighting a spectrum to predict the unknown concentration of a chemical component. B-splines and roughness penalties, based on differences, are used to estimate the coefficients. In this paper, we extend PSR to incorporate a covariate such as temperature. A smooth surface on the wavelength–temperature domain is estimated, using tensor products of B-splines and penalties along the two dimensions. A slice of this surface gives the vector of weights at an arbitrary temperature. We present the theory and apply multi-dimensional PSR to a published data set, showing good performance. We also introduce and apply a simplification based on a varying-coefficient model (VCM).
Technometrics | 2005
Brian D. Marx; Paul H. C. Eilers
We propose a general approach to regression on digitized multidimensional signals that can pose severe challenges to standard statistical methods. The main contribution of this work is to build a two-dimensional coefficient surface that allows for interaction across the indexing plane of the regressor array. We aim to use the estimated coefficient surface for reliable (scalar) prediction. We assume that the coefficients are smooth along both indices. We present a rather straightforward and rich extension of penalized signal regression using penalized B-spline tensor products, where appropriate difference penalties are placed on the rows and columns of the tensor product coefficients. Our methods are grounded in standard penalized regression, and thus cross-validation, effective dimension, and other diagnostics are accessible. Further, the model is easily transplanted into the generalized linear model framework. An illustrative example motivates our proposed methodology, and performance comparisons are made to other popular methods.
Journal of Computational and Graphical Statistics | 2002
Paul H. C. Eilers; Brian D. Marx
This article proposes a practical modeling approach that can accommodate a rich variety of predictors, united in a generalized linear model (GLM) setting. In addition to the usual ANOVA-type or covariate linear (L) predictors, we consider modeling any combination of smooth additive (G) components, varying coefficient (V) components, and (discrete representations of) signal (S) components. We assume that G is, and the coefficients of V and S are, inherently smooth, projecting each of these onto B-spline bases using a modest number of equally spaced knots. Enough knots are used to ensure more flexibility than needed; further smoothness is achieved through a difference penalty on adjacent B-spline coefficients (P-splines). This linear re-expression allows all of the parameters associated with these components to be estimated simultaneously in one large GLM through penalized likelihood. Thus, we have the advantage of avoiding both the backfitting algorithm and complex knot selection schemes. We regulate the flexibility of each component through a separate penalty parameter that is optimally chosen based on cross-validation or an information criterion.
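The P-spline building block mentioned in this abstract (a B-spline basis on equally spaced knots plus a difference penalty on adjacent coefficients) can be sketched for the simplest case. The code below assumes a single smooth component, a normal response with identity link, and a fixed penalty weight `lam`; the full model in the article combines several component types in a GLM and selects penalties by cross-validation or an information criterion, which is not attempted here. All function names and the simulated data are hypothetical.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg, deg=3):
    """Uniform B-spline basis on [xl, xr] via the Cox-de Boor recursion."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    xc = x[:, None]
    # Degree 0: indicator functions of the knot intervals.
    B = ((xc >= knots[:-1]) & (xc < knots[1:])).astype(float)
    for k in range(1, deg + 1):
        left = (xc - knots[: -k - 1]) / (k * dx) * B[:, :-1]
        right = (knots[k + 1:] - xc) / (k * dx) * B[:, 1:]
        B = left + right
    return B   # shape (len(x), nseg + deg)

def pspline_fit(x, y, nseg=20, deg=3, lam=1.0, d=2):
    """Penalized least squares: minimize |y - B a|^2 + lam * |D a|^2."""
    B = bspline_basis(x, x.min(), x.max(), nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=d, axis=0)   # d-th order differences
    a = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ a, a

# Hypothetical noisy data for illustration.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=x.size)
fitted, coef = pspline_fit(x, y, nseg=20, lam=1.0)
```

Using more knots than needed and letting `lam` control smoothness is the design choice the abstract describes: with a second-order penalty, a large `lam` pulls the fit toward a straight line rather than toward zero.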
Microarrays : optical technologies and informatics. Conference | 2001
Paul H. C. Eilers; Judith M. Boer; Gert-Jan B. van Ommen; Hans C. van Houwelingen
Classification of microarray data needs a firm statistical basis. In principle, logistic regression can provide it, modeling the probability of membership of a class with (transforms of) linear combinations of explanatory variables. However, classical logistic regression does not work for microarrays, because generally there will be far more variables than observations. One problem is multicollinearity: estimating equations become singular and have no unique and stable solution. A second problem is over-fitting: a model may fit a data set well, but perform badly when used to classify new data. We propose penalized likelihood as a solution to both problems. The values of the regression coefficients are constrained in a similar way as in ridge regression. All variables play an equal role; there is no ad-hoc selection of most relevant or most expressed genes. The dimension of the resulting systems of equations is equal to the number of variables, and generally will be too large for most computers, but it can be reduced dramatically with the singular value decomposition of some matrices. The penalty is optimized with AIC (Akaike's Information Criterion), which essentially is a measure of prediction performance. We find that penalized logistic regression performs well on a public data set (the MIT ALL/AML data).
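The dimension reduction mentioned in this abstract can be illustrated with a minimal sketch. It assumes a ridge penalty of the form (lam / 2) * |beta|^2, no intercept, and a fixed `lam`; the abstract does not give these details, and the AIC-based choice of penalty is not shown. With the thin SVD X = U S V', the problem is solved for a short vector gamma (one entry per observation) and mapped back with beta = V gamma. The function name and simulated data are hypothetical.

```python
import numpy as np

def ridge_logistic_svd(X, y, lam=1.0, n_iter=100, tol=1e-10):
    """Ridge-penalized logistic regression solved in the reduced SVD space."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    R = U * s                       # n x r reduced design, X = R @ Vt
    g = np.zeros(R.shape[1])        # reduced coefficients gamma
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(R @ g)))
        w = p * (1.0 - p)           # Newton (Fisher scoring) weights
        grad = R.T @ (y - p) - lam * g
        H = R.T @ (R * w[:, None]) + lam * np.eye(g.size)
        step = np.linalg.solve(H, grad)
        g = g + step
        if np.max(np.abs(step)) < tol:
            break
    return Vt.T @ g                 # back to one coefficient per variable

# Hypothetical data: 20 samples, 200 variables (far more variables than samples).
rng = np.random.default_rng(3)
X = rng.normal(size=(20, 200))
y = (X[:, 0] + 0.5 * rng.normal(size=20) > 0).astype(float)
lam = 1.0
beta = ridge_logistic_svd(X, y, lam=lam)

# The penalized score equations in the full space should be satisfied.
p_hat = 1.0 / (1.0 + np.exp(-(X @ beta)))
score = X.T @ (y - p_hat) - lam * beta
```

Each Newton step here solves a system with as many unknowns as observations rather than as many as variables, which is the saving the abstract refers to.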
European Journal of Obstetrics & Gynecology and Reproductive Biology | 2000
Frans Klumper; Inge L. van Kamp; F.P.H.A. Vandenbussche; Robertjan H. Meerman; Dick Oepkes; Sicco Scherjon; Paul H. C. Eilers; Humphrey H.H. Kanhai
OBJECTIVE: To compare the outcome after intrauterine transfusion (IUT) between fetuses treated before and those treated after 32 weeks' gestation. SETTING: National referral center for intrauterine treatment of red-cell alloimmunization in The Netherlands. STUDY DESIGN: Retrospective evaluation of an 11-year period, during which 209 fetuses were treated for alloimmune hemolytic disease with 609 red-cell IUTs. We compared fetal and neonatal outcome in three groups: fetuses only treated before 32 weeks' gestation (group A, n=46), those treated both before and after 32 weeks (group B, n=117), and those where IUT was started at or after 32 weeks (group C, n=46). RESULTS: Survival rate was 48% in group A, 100% in group B, and 91% in group C. Moreover, fetuses in group A were hydropic significantly more often. Short-term perinatal loss rate after IUT was 3.4% in the 409 procedures performed before 32 weeks and 1.0% in the 200 procedures performed after 32 weeks' gestation. CONCLUSION: Perinatal losses were much more common in fetuses only treated before 32 weeks' gestation. Two procedure-related perinatal losses in 200 IUTs after 32 weeks remain a matter of concern because of the good prospects of alternative extrauterine treatment.