Svante Wold
Umeå University
Publications
Featured research published by Svante Wold.
Chemometrics and Intelligent Laboratory Systems | 2001
Svante Wold; Michael Sjöström; Lennart Eriksson
PLS-regression (PLSR) is the PLS approach in its simplest and, in chemistry and technology, most used form (two-block predictive PLS). PLSR is a method for relating two data matrices, X and Y, by a linear multivariate model, but goes beyond traditional regression in that it also models the structure of X and Y. PLSR derives its usefulness from its ability to analyze data with many, noisy, collinear, and even incomplete variables in both X and Y. PLSR has the desirable property that the precision of the model parameters improves with an increasing number of relevant variables and observations. This article reviews PLSR as it has developed into a standard tool in chemometrics, widely used in chemistry and engineering. The underlying model and its assumptions are discussed, and commonly used diagnostics are reviewed together with the interpretation of the resulting parameters. Two examples are used as illustrations. First, a Quantitative Structure-Activity Relationship (QSAR)/Quantitative Structure-Property Relationship (QSPR) data set of peptides is used to outline how to develop, interpret and refine a PLSR model. Second, a data set from the manufacturing of recycled paper is analyzed to illustrate time-series modelling of process data by means of PLSR and time-lagged X-variables.
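As a rough illustration of this two-block setting (not the paper's own implementation; the data and component count below are invented), a PLS regression on synthetic, collinear data might look like this with scikit-learn:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

# Synthetic data: three latent factors generate a collinear X block and a y.
rng = np.random.default_rng(0)
N, K = 60, 40
T = rng.normal(size=(N, 3))                                    # latent scores
X = T @ rng.normal(size=(3, K)) + 0.1 * rng.normal(size=(N, K))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=N)

# PLSR handles the wide, collinear X that ordinary least squares struggles with.
pls = PLSRegression(n_components=3)
print(cross_val_score(pls, X, y, cv=5, scoring="r2").round(2))
```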
Chemometrics and Intelligent Laboratory Systems | 1987
Svante Wold; Kim H. Esbensen; Paul Geladi
Principal Component Analysis (PCA) is a multivariate exploratory analysis method, useful for separating systematic variation from noise. It makes it possible to define a space of reduced dimensions that preserves ...
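A minimal sketch of that signal/noise separation, on invented data with two systematic factors (scikit-learn's PCA, not code from the paper):

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic table: two systematic factors plus random noise.
rng = np.random.default_rng(1)
scores = rng.normal(size=(100, 2))
loadings = rng.normal(size=(2, 20))
X = scores @ loadings + 0.05 * rng.normal(size=(100, 20))

pca = PCA().fit(X)
# The explained-variance ratio drops sharply after the two systematic PCs.
print(pca.explained_variance_ratio_[:4].round(3))
```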
Technometrics | 1978
Svante Wold
By means of factor analysis (FA) or principal components analysis (PCA) a matrix Y with the elements y_ik is approximated by the model

y_ik = α_k + Σ_(a=1..A) β_ia θ_ak + ε_ik    (I)

Here the parameters α, β and θ express the systematic part of the data y_ik, "signal," and the residuals ε_ik express the "random" part, "noise." When applying FA or PCA to a matrix of real data obtained, for example, by characterizing N chemical mixtures by M measured variables, one major problem is the estimation of the rank A of the matrix Y, i.e. the estimation of how much of the data y_ik is "signal" and how much is "noise." Cross validation can be used to approach this problem. The matrix Y is partitioned and the rank A is determined so as to maximize the predictive properties of model (I) when the parameters are estimated on one part of the matrix Y and the prediction tested on another part of the matrix Y.
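The following is a simplified sketch of this cross-validatory idea, not the paper's exact partitioning scheme: random entries are hidden, predicted from a truncated SVD of the remaining data, and the rank with the best predictive error is kept. The function name and data are invented.

```python
import numpy as np

def press_for_rank(Y, rank, holdout_frac=0.1, seed=0):
    """Hide a random subset of entries, fill them with column means,
    fit a truncated SVD of the given rank, and return the squared
    prediction error (PRESS) on the hidden entries."""
    rng = np.random.default_rng(seed)
    mask = rng.random(Y.shape) < holdout_frac        # entries treated as unknown
    col_means = np.nanmean(np.where(mask, np.nan, Y), axis=0)
    Y_fill = np.where(mask, col_means, Y)
    U, s, Vt = np.linalg.svd(Y_fill - col_means, full_matrices=False)
    Y_hat = (U[:, :rank] * s[:rank]) @ Vt[:rank] + col_means
    return ((Y - Y_hat)[mask] ** 2).sum()

# Choose the rank A that best predicts the held-out entries.
rng = np.random.default_rng(1)
Y = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 8)) + 0.05 * rng.normal(size=(30, 8))
print(min(range(1, 6), key=lambda a: press_for_rank(Y, a)))   # expect rank 2
```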
Pattern Recognition | 1976
Svante Wold
Pattern recognition based on modelling each separate class by a separate principal components (PC) model is discussed. These PC models are shown to be able to approximate any continuous variation within a single class. Hence, methods based on PC models will, provided that the data are sufficient, recognize any pattern that exists in a given set of objects. In addition, fitting the objects in each class by a separate PC model will, in a simple way, provide information about such matters as the relevance of single variables, "outliers" among the objects and "distances" between different classes. Application to Fisher's classical Iris data is used as an illustration.
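A minimal sketch of this class-wise PC modelling, applied to the Iris data mentioned in the abstract; the class name and its details are invented, and scikit-learn's PCA stands in for the paper's own fitting procedure:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

class SimpleSIMCA:
    """One PCA model per class; objects are assigned to the class whose
    model reconstructs them with the smallest residual."""
    def __init__(self, n_components=2):
        self.n_components = n_components
        self.models = {}

    def fit(self, X, y):
        for label in np.unique(y):
            self.models[label] = PCA(self.n_components).fit(X[y == label])
        return self

    def predict(self, X):
        labels = list(self.models)
        # Residual sum of squares after projection onto each class model.
        R = np.column_stack([
            ((X - m.inverse_transform(m.transform(X))) ** 2).sum(axis=1)
            for m in self.models.values()
        ])
        return np.array(labels)[R.argmin(axis=1)]

X, y = load_iris(return_X_y=True)
print((SimpleSIMCA(2).fit(X, y).predict(X) == y).mean())  # resubstitution accuracy
```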
Chemometrics and Intelligent Laboratory Systems | 1998
Svante Wold; Henrik Antti; Fredrik Lindgren; Jerker Öhman
Near-infrared (NIR) spectra are often pre-processed in order to remove systematic noise such as base-line variation and multiplicative scatter effects. This is done by differentiating the spectra to first or second derivatives, by multiplicative signal correction (MSC), or by similar mathematical filtering methods. This pre-processing may, however, also remove information from the spectra regarding Y (the measured response variable in multivariate calibration applications). We here show how a variant of PLS can be used to achieve a signal correction that is as close to orthogonal as possible to a given Y-vector or Y-matrix. Thus, one ensures that the signal correction removes as little information as possible regarding Y. In the case when the number of X-variables (K) exceeds the number of observations (N), strict orthogonality is obtained. The approach is called orthogonal signal correction (OSC) and is here applied to four different data sets of multivariate calibration. The results are compared with those of traditional signal correction as well as with those of no pre-processing, and OSC is shown to give substantial improvements. Prediction sets of new data, not used in the model development, are used for the comparisons.
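The core of the correction can be sketched as follows: take a high-variance score vector of X, orthogonalize it against y, and deflate. This single-component simplification omits the paper's inner PLS step (which makes the score expressible through X-weights); the function name and data are invented:

```python
import numpy as np

def osc_one_component(X, y):
    """Remove one OSC component: a high-variance direction in X whose
    score vector is made orthogonal to y, then deflated from X."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    t = U[:, 0] * s[0]                     # first principal-component score
    t = t - y * (y @ t) / (y @ y)          # orthogonalize the score against y
    p = X.T @ t / (t @ t)                  # corresponding loading
    return X - np.outer(t, p)              # X with the OSC component removed

rng = np.random.default_rng(2)
X, y = rng.normal(size=(40, 15)), rng.normal(size=40)
X_osc = osc_one_component(X, y)
print(X_osc.shape)                         # same shape, Y-orthogonal variation removed
```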
Archive | 1984
Svante Wold; Christer Albano; William Dunn; Ulf Edlund; Kim H. Esbensen; Paul Geladi; Sven Hellberg; Erik Johansson; W. Lindberg; Michael Sjöström
Any data table produced in a chemical investigation can be analysed by bilinear projection methods, i.e. principal components and factor analysis and their extensions. Representing the table rows (objects) as points in a p-dimensional space, these methods project the point swarm of the data set, or parts of it, down onto an F-dimensional subspace (plane or hyperplane). Different questions put to the data table correspond to different projections.
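In code, such a projection amounts to a truncated SVD of the centered table: the score vectors give the coordinates of the object points on the F-dimensional hyperplane. A bare numpy sketch on invented data:

```python
import numpy as np

rng = np.random.default_rng(3)
Y = rng.normal(size=(25, 7))               # data table: 25 objects, 7 variables
Yc = Y - Y.mean(axis=0)                    # center the point swarm

F = 2                                      # dimension of the projection plane
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
scores = U[:, :F] * s[:F]                  # object coordinates on the plane
loadings = Vt[:F].T                        # orientation of the plane in variable space
print(scores.shape, loadings.shape)        # (25, 2) (7, 2)
```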
Chemometrics and Intelligent Laboratory Systems | 2001
Svante Wold; Johan Trygg; Anders Berglund; Henrik Antti
The original chemometrics partial least squares (PLS) model with two blocks of variables (X and Y), linearly related to each other, has had several enhancements/extensions since the beginning of 19 ...
Chemometrics and Intelligent Laboratory Systems | 1989
Svante Wold; Nouna Kettaneh-Wold; Bert Skagerberg
The linear two-block predictive PLS model (PPLS2) is often used to model the relation between two data matrices, X and Y. Applications include multivariate calibration, quantitative structure-activity relationships (QSAR), and process optimization. In each PPLS2 model dimension the matrices X and Y are decomposed as bilinear products plus residual matrices:

X = t p′ + E
Y = u q′ + F

In addition, a linear model is assumed to relate the score vectors t and u (h denotes residuals):

u = b t + h

This allows Y to be modeled by t and q as:

Y = t q′ b + f*

In the present work the linear PPLS2 model is extended to the case when the inner model relating the block scores u and t is nonlinear (h is a vector of residuals):

u = f(t) + h

An algorithm is outlined for the model where the inner relation is a quadratic polynomial:

u = c₀ + c₁t + c₂t² + h

This will be referred to as the QPLS2 model (standing for quadratic PLS with two blocks). Applications to cosmetics qualimetrics and a drug structure-activity relationship are used as illustrations.
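A rough way to reproduce the flavour of the quadratic inner relation with standard tools (this fits the component with ordinary linear PLS and then regresses u on t with a second-degree polynomial, whereas the paper's QPLS2 algorithm also updates the weights accordingly; data are synthetic):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Synthetic blocks whose inner relation is quadratic rather than linear.
rng = np.random.default_rng(4)
t_true = rng.normal(size=200)
X = np.outer(t_true, rng.normal(size=10)) + 0.05 * rng.normal(size=(200, 10))
Y = np.outer(t_true**2 + t_true, rng.normal(size=3)) + 0.05 * rng.normal(size=(200, 3))

pls = PLSRegression(n_components=1).fit(X, Y)
t, u = pls.x_scores_[:, 0], pls.y_scores_[:, 0]
c2, c1, c0 = np.polyfit(t, u, deg=2)       # inner relation u ≈ c0 + c1·t + c2·t²
print(round(c2, 2), round(c1, 2), round(c0, 2))
```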
Technometrics | 1974
Svante Wold
The use of spline functions in the analysis of empirical two-dimensional data (y_i, x_i) is described. The definition of spline functions as piecewise polynomials with continuity conditions gives them unique properties as empirical functions. They can represent any variation of y with x arbitrarily well over wide intervals of x. Furthermore, due to the local properties of spline functions, they are excellent tools for differentiation and integration of empirical data. Hence, spline functions can be used with advantage instead of other empirical functions, such as polynomials or exponentials. Examples of application show spline analyses of response curves in pharmacokinetics and of the local behavior of almost first-order kinetic data.
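For instance, SciPy's smoothing splines (not the paper's own code; the data and smoothing factor are invented) support exactly this smooth-differentiate-integrate use:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Noisy empirical curve (a decay) to be smoothed by a cubic spline.
rng = np.random.default_rng(5)
x = np.linspace(0.0, 10.0, 50)
y = np.exp(-0.3 * x) + 0.02 * rng.normal(size=x.size)

spl = UnivariateSpline(x, y, k=3, s=0.02)   # piecewise cubic with smoothing
print(spl(5.0))                             # smoothed value at x = 5
print(spl.derivative()(5.0))                # differentiation of the fitted curve
print(spl.integral(0.0, 10.0))              # integration over the interval
```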
Journal of Chemometrics | 1996
Svante Wold; Nouna Kettaneh; Kjell Tjessem
In multivariate PLS (partial least squares projection to latent structures) and PC (principal component) models with many variables, plots and lists of loadings, coefficients, VIPs, etc. become messy and the results are difficult to interpret. There is then a strong temptation to reduce the variables to a smaller, more manageable number. This reduction of variables, however, often removes information, makes the interpretation misleading, and seriously increases the risk of spurious models.
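For context, the VIP values mentioned above are commonly computed from the PLS weights and the per-component explained y-variance; one common recipe (not taken from this paper; data and names invented):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 12))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=50)   # only two relevant variables

pls = PLSRegression(n_components=2).fit(X, y)
W, T = pls.x_weights_, pls.x_scores_
q = pls.y_loadings_.ravel()
ss = q**2 * (T**2).sum(axis=0)             # y-variance explained per component
Wn = W / np.linalg.norm(W, axis=0)         # normalized weight vectors
vip = np.sqrt(W.shape[0] * (Wn**2 @ ss) / ss.sum())
print(vip.round(2))                        # large VIPs for the first two variables
```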