Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Nicole Krämer is active.

Publication


Featured research published by Nicole Krämer.


Physical Review Letters | 2008

Robustly estimating the flow direction of information in complex physical systems

Guido Nolte; Andreas Ziehe; Vadim V. Nikulin; Alois Schlögl; Nicole Krämer; Tom Brismar; Klaus-Robert Müller

We propose a new measure (phase-slope index) to estimate the direction of information flux in multivariate time series. This measure (a) is insensitive to mixtures of independent sources, (b) gives meaningful results even if the phase spectrum is not linear, and (c) properly weights contributions from different frequencies. These properties are shown in extended simulations and contrasted to Granger causality which yields highly significant false detections for mixtures of independent sources. An application to electroencephalography data (eyes-closed condition) reveals a clear front-to-back information flow.
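The phase-slope index lends itself to a compact implementation. The sketch below, assuming non-overlapping Hann-windowed segments and a toy delayed-copy signal (both illustrative choices, not the paper's setup), sums the imaginary part of conj(C(f)) * C(f + df) over adjacent frequency bins, where C is the complex coherency:

```python
import numpy as np

def phase_slope_index(x, y, nperseg=256):
    """Minimal phase-slope index between two signals.

    Positive values suggest information flow from x to y.
    """
    n_seg = len(x) // nperseg
    window = np.hanning(nperseg)
    Sxx, Syy, Sxy = 0.0, 0.0, 0.0
    for k in range(n_seg):
        fx = np.fft.rfft(window * x[k * nperseg:(k + 1) * nperseg])
        fy = np.fft.rfft(window * y[k * nperseg:(k + 1) * nperseg])
        Sxx = Sxx + np.abs(fx) ** 2
        Syy = Syy + np.abs(fy) ** 2
        Sxy = Sxy + fx * np.conj(fy)
    C = Sxy / np.sqrt(Sxx * Syy)  # complex coherency
    # phase slope, accumulated over adjacent frequency bins
    return float(np.imag(np.sum(np.conj(C[:-1]) * C[1:])))

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
y = np.roll(x, 5) + 0.1 * rng.standard_normal(4096)  # y is a delayed copy of x
psi_xy = phase_slope_index(x, y)
```

With the delayed copy, the index is positive for the x-to-y direction and flips sign when the arguments are swapped, matching the antisymmetry of the measure.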


Neural Networks | 2009

2009 Special Issue: Time Domain Parameters as a feature for EEG-based Brain-Computer Interfaces

Carmen Vidaurre; Nicole Krämer; Benjamin Blankertz; Alois Schlögl

Several feature types have been used with EEG-based Brain-Computer Interfaces. Among the most popular are logarithmic band power estimates with more or less subject-specific optimization of the frequency bands. In this paper we introduce a feature called Time Domain Parameter, defined as a generalization of the Hjorth parameters. Time Domain Parameters are studied under two different conditions. The first setting is one in which no data from a subject is available. In this condition our results show that Time Domain Parameters outperform all band power features tested with all spatial filters applied. The second setting is the transition from calibration (no feedback) to feedback, in which the frequency content of the signals can change for some subjects. We compare Time Domain Parameters with logarithmic band power in subject-specific bands and show that these features are advantageous in this situation as well.
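The Hjorth parameters behind this feature are simple variance ratios. The sketch below shows the classic parameters and one plausible reading of the generalization (log-variances of the signal and its successive differences up to a chosen order); the order and the toy sinusoids are illustrative assumptions, not the paper's exact definition:

```python
import numpy as np

def hjorth(x):
    """Classic Hjorth parameters: activity, mobility, complexity."""
    dx, ddx = np.diff(x), np.diff(x, n=2)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def time_domain_parameters(x, order=5):
    """One reading of the generalization: log-variance of the signal
    and of its first `order` successive differences."""
    feats = []
    for _ in range(order + 1):
        feats.append(np.log(np.var(x)))
        x = np.diff(x)
    return np.array(feats)

t = np.arange(1000) / 250.0        # 4 s sampled at 250 Hz
slow = np.sin(2 * np.pi * 2 * t)   # 2 Hz oscillation
fast = np.sin(2 * np.pi * 30 * t)  # 30 Hz oscillation
```

As expected, the faster oscillation has higher mobility, which is what makes these parameters usable as frequency-sensitive features without explicit band selection.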


Knowledge Discovery and Data Mining | 2008

Partial least squares regression for graph mining

Hiroto Saigo; Nicole Krämer; Koji Tsuda

Attributed graphs are increasingly common in many application domains such as chemistry, biology and text processing. A central issue in graph mining is how to collect informative subgraph patterns for a given learning task. We propose an iterative mining method based on partial least squares regression (PLS). To apply PLS to graph data, a sparse version of PLS is developed first and then combined with a weighted pattern mining algorithm. The mining algorithm is called iteratively with different weight vectors, creating one latent component per mining call. Our method, graph PLS, is efficient and easy to implement, because the weight vector is updated with elementary matrix calculations. In experiments, our graph PLS algorithm showed competitive prediction accuracies on many chemical datasets, and its efficiency was significantly superior to graph boosting (gBoost) and the naive method based on frequent graph mining.


Journal of the American Statistical Association | 2011

The Degrees of Freedom of Partial Least Squares Regression

Nicole Krämer; Masashi Sugiyama

The derivation of statistical properties for partial least squares regression can be a challenging task. The reason is that the construction of latent components from the predictor variables also depends on the response variable. While this typically leads to good performance and interpretable models in practice, it makes the statistical analysis more involved. In this work, we study the intrinsic complexity of partial least squares regression. Our contribution is an unbiased estimate of its degrees of freedom. It is defined as the trace of the first derivative of the fitted values, seen as a function of the response. We establish two equivalent representations that rely on the close connection of partial least squares to matrix decompositions and Krylov subspace techniques. We show that the degrees of freedom depend on the collinearity of the predictor variables: The lower the collinearity, the higher the complexity. In particular, they are typically higher than the naive approach that defines the degrees of freedom as the number of components. Further, we illustrate how our degrees of freedom estimate can be used for the comparison of different regression methods. In the experimental section, we show that our degrees of freedom estimate in combination with information criteria is useful for model selection.


Chemometrics and Intelligent Laboratory Systems | 2008

Penalized Partial Least Squares with applications to B-spline transformations and functional data

Nicole Krämer; Anne-Laure Boulesteix; Gerhard Tutz

We propose a novel method to model nonlinear regression problems by adapting the principle of penalization to Partial Least Squares (PLS). Starting with a generalized additive model, we expand the additive component of each variable in terms of a generous number of B-spline basis functions. In order to prevent overfitting and to obtain smooth functions, we estimate the regression model by applying a penalized version of PLS. Although our motivation for penalized PLS stems from its use for B-spline-transformed data, the proposed approach is very general and can be applied to other penalty terms or to other dimension reduction techniques. It turns out that penalized PLS can be computed virtually as fast as PLS. We prove a close connection of penalized PLS to the solutions of preconditioned linear systems. In the case of high-dimensional data, the new method is shown to be an attractive competitor to other techniques for estimating generalized additive models. If the number of predictor variables is high compared to the number of examples, traditional techniques often suffer from overfitting. We illustrate that penalized PLS performs well in these situations.
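The penalized weight update can be sketched directly. In the sketch below (a minimal reading, not the paper's full method), each weight vector is computed as (I + lam * P)^(-1) X^T y with a second-difference penalty P, and deflation proceeds as in ordinary PLS; the raw Gaussian design stands in for the B-spline-expanded data of the paper:

```python
import numpy as np

def penalized_pls_fit(X, y, n_components, lam=10.0):
    """Sketch of penalized PLS1: weight vectors w = (I + lam*P)^(-1) X_res^T y
    with a second-difference penalty P, then ordinary PLS deflation
    (lam = 0 recovers plain PLS). Returns fitted values from regressing
    y on the orthonormal score vectors.
    """
    n, p = X.shape
    D = np.diff(np.eye(p), n=2, axis=0)  # second-difference operator
    M = np.linalg.inv(np.eye(p) + lam * (D.T @ D))
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    X_res, scores = Xc.copy(), []
    for _ in range(n_components):
        w = M @ (X_res.T @ yc)           # penalized weight vector
        t = X_res @ w
        t /= np.linalg.norm(t)
        X_res -= np.outer(t, t @ X_res)  # deflate
        scores.append(t)
    T = np.column_stack(scores)
    return y.mean() + T @ (T.T @ yc)     # projection onto the scores

rng = np.random.default_rng(1)
n, p = 80, 40
X = rng.standard_normal((n, p))
coef = np.sin(np.linspace(0, np.pi, p))  # smooth coefficient profile
y = X @ coef + 0.5 * rng.standard_normal(n)

fit1 = penalized_pls_fit(X, y, n_components=1)
fit4 = penalized_pls_fit(X, y, n_components=4)
```

Because the scores are orthonormal and nested across component counts, the residual sum of squares cannot increase as components are added.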


Genetics | 2014

Usefulness of Multiparental Populations of Maize (Zea mays L.) for Genome-Based Prediction

Christina Lehermeier; Nicole Krämer; Eva Bauer; Cyril Bauland; Christian Camisan; Laura Campo; Pascal Flament; Albrecht E. Melchinger; Monica A. Menz; Nina Meyer; Laurence Moreau; Jesús Moreno-González; Milena Ouzunova; Hubert Pausch; Nicolas Ranc; Wolfgang Schipprack; Manfred Schönleben; Hildrun Walter; Alain Charcosset; Chris-Carolin Schön

The efficiency of marker-assisted prediction of phenotypes has been studied intensively for different types of plant breeding populations. However, one remaining question is how to incorporate and counterbalance information from biparental and multiparental populations into model training for genome-wide prediction. To address this question, we evaluated testcross performance of 1652 doubled-haploid maize (Zea mays L.) lines that were genotyped with 56,110 single nucleotide polymorphism markers and phenotyped for five agronomic traits in four to six European environments. The lines are arranged in two diverse half-sib panels representing two major European heterotic germplasm pools. The data set contains 10 related biparental dent families and 11 related biparental flint families generated from crosses of maize lines important for European maize breeding. With this new data set we analyzed genome-based best linear unbiased prediction in different validation schemes and compositions of estimation and test sets. Further, we theoretically and empirically investigated marker linkage phases across multiparental populations. In general, predictive abilities similar to or higher than those within biparental families could be achieved by combining several half-sib families in the estimation set. For the majority of families, 375 half-sib lines in the estimation set were sufficient to reach the same predictive performance of biomass yield as an estimation set of 50 full-sib lines. In contrast, prediction across heterotic pools was not possible for most cases. Our findings are important for experimental design in genome-based prediction as they provide guidelines for the genetic structure and required sample size of data sets used for model training.


Epidemiology | 2012

Addressing the identification problem in age-period-cohort analysis: a tutorial on the use of partial least squares and principal components analysis.

Yu-Kang Tu; Nicole Krämer; Wen-Chung Lee

In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.


PLOS ONE | 2014

Enhancing Genome-Enabled Prediction by Bagging Genomic BLUP

Daniel Gianola; Kent A. Weigel; Nicole Krämer; Alessandra Stella; Chris-Carolin Schön

We examined whether or not the predictive ability of genomic best linear unbiased prediction (GBLUP) could be improved via a resampling method used in machine learning: bootstrap aggregating sampling (“bagging”). In theory, bagging can be useful when the predictor has large variance or when the number of markers is much larger than sample size, preventing effective regularization. After presenting a brief review of GBLUP, bagging was adapted to the context of GBLUP, both at the level of the genetic signal and of marker effects. The performance of bagging was evaluated with four simulated case studies including known or unknown quantitative trait loci, and an application was made to real data on grain yield in wheat planted in four environments. A metric aimed to quantify candidate-specific cross-validation uncertainty was proposed and assessed; as expected, model-derived theoretical reliabilities bore no relationship with cross-validation accuracy. It was found that bagging can ameliorate predictive performance of GBLUP and make it more robust against over-fitting. Seemingly, 25–50 bootstrap samples were enough to attain reasonable predictions as well as stable measures of individual predictive mean squared errors.
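The bagging scheme at the level of individuals can be sketched in a few lines. The sketch below writes GBLUP in its kernel-ridge form with a fixed shrinkage parameter and simulated SNP genotypes; both are illustrative simplifications (the paper estimates variance components in a mixed model rather than fixing lambda):

```python
import numpy as np

def gblup_predict(K, y_train, idx_train, idx_test, lam=1.0):
    """GBLUP in kernel-ridge form with genomic relationship matrix K."""
    K_tt = K[np.ix_(idx_train, idx_train)]
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(idx_train)),
                            y_train - y_train.mean())
    return y_train.mean() + K[np.ix_(idx_test, idx_train)] @ alpha

def bagged_gblup(K, y, idx_train, idx_test, n_bags=25, seed=0):
    """Bagging at the level of individuals: average GBLUP predictions
    over bootstrap resamples of the training set."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_bags):
        boot = rng.choice(idx_train, size=len(idx_train), replace=True)
        preds.append(gblup_predict(K, y[boot], boot, idx_test))
    return np.mean(preds, axis=0)

rng = np.random.default_rng(3)
n, p = 120, 400
M = rng.choice([0.0, 1.0, 2.0], size=(n, p))  # SNP genotype codes
M -= M.mean(axis=0)
y = M @ (0.1 * rng.standard_normal(p)) + rng.standard_normal(n)
K = M @ M.T / p                               # genomic relationship matrix
idx_train, idx_test = np.arange(80), np.arange(80, 120)
pred_bag = bagged_gblup(K, y, idx_train, idx_test)
```

Each bootstrap fit sees duplicated individuals, which the ridge term on the diagonal handles without special casing; averaging over 25 resamples mirrors the sample sizes the abstract reports as sufficient.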


International Conference on Machine Learning | 2007

Kernelizing PLS, degrees of freedom, and efficient model selection

Nicole Krämer; Mikio L. Braun

Kernelizing partial least squares (PLS), an algorithm which has been particularly popular in chemometrics, leads to kernel PLS which has several interesting properties, including a sub-cubic runtime for learning, and an iterative construction of directions which are relevant for predicting the outputs. We show that the kernelization of PLS introduces interesting properties not found in ordinary PLS, giving novel insights into the workings of kernel PLS and the connections to kernel ridge regression and conjugate gradient descent methods. Furthermore, we show how to correctly define the degrees of freedom for kernel PLS and how to efficiently compute an unbiased estimate. Finally, we address the practical problem of model selection. We demonstrate how to use the degrees of freedom estimate to perform effective model selection, and discuss how to implement cross-validation schemes efficiently.


European Conference on Machine Learning | 2009

The Feature Importance Ranking Measure

Alexander Zien; Nicole Krämer; Sören Sonnenburg; Gunnar Rätsch

Most accurate predictions are typically obtained by learning machines with complex feature spaces (as e.g. induced by kernels). Unfortunately, such decision rules are hardly accessible to humans and cannot easily be used to gain insights about the application domain. Therefore, one often resorts to linear models in combination with variable selection, thereby sacrificing some predictive power for presumptive interpretability. Here, we introduce the Feature Importance Ranking Measure (FIRM), which, by retrospective analysis of arbitrary learning machines, allows one to achieve both excellent predictive performance and superior interpretation. In contrast to standard raw feature weighting, FIRM takes the underlying correlation structure of the features into account. Thereby, it is able to discover the most relevant features, even if their appearance in the training data is entirely prevented by noise. The desirable properties of FIRM are investigated analytically and illustrated in simulations.
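FIRM's core quantity is the variability of the conditional expected score as one feature is varied. The sketch below approximates it under a feature-independence assumption by clamping one column and averaging; note this simplification drops exactly the correlation-structure handling that is the paper's main contribution, so it is only a reading of the basic definition:

```python
import numpy as np

def firm_importance(score_fn, X, n_grid=10):
    """FIRM under a feature-independence assumption: the importance of
    feature j is the standard deviation, over values v, of the conditional
    expected score E[s(x) | x_j = v], approximated here by clamping
    column j to v and averaging the score over the sample."""
    importances = []
    for j in range(X.shape[1]):
        grid = np.quantile(X[:, j], np.linspace(0.05, 0.95, n_grid))
        q = []
        for v in grid:
            Xv = X.copy()
            Xv[:, j] = v            # clamp feature j
            q.append(score_fn(Xv).mean())
        importances.append(np.std(q))
    return np.array(importances)

def score(Z):  # toy linear scoring function: feature 0 dominates
    return 2.0 * Z[:, 0] + 0.1 * Z[:, 2]

rng = np.random.default_rng(4)
X = rng.standard_normal((500, 4))
imp = firm_importance(score, X)
```

On this toy score the ranking comes out as expected: the dominant feature scores highest, the weak one next, and unused features get zero importance.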

Collaboration


Dive into Nicole Krämer's collaborations.

Top Co-Authors

Klaus-Robert Müller (Technical University of Berlin)
Andreas Ziehe (Technical University of Berlin)
Konrad Rieck (Braunschweig University of Technology)
Mikio L. Braun (Technical University of Berlin)
Stefan Haufe (Technical University of Berlin)
Alois Schlögl (Graz University of Technology)