Publication


Featured research published by K. Magnus Åberg.


Analytical and Bioanalytical Chemistry | 2009

The correspondence problem for metabonomics datasets.

K. Magnus Åberg; Erik Alm; Ralf J. O. Torgrip

In metabonomics it is difficult to tell which peak is which in datasets with many samples. This is known as the correspondence problem. Data from different samples are not synchronised, i.e., the peak from one metabolite does not appear in exactly the same place in all samples. For datasets with many samples, this problem is nontrivial, because each sample contains hundreds to thousands of peaks that shift and are identified ambiguously. Statistical analysis of the data assumes that peaks from one metabolite are found in one column of a data table. For every error in the data table, the statistical analysis loses power and the risk of missing a biomarker increases. It is therefore important to solve the correspondence problem by synchronising samples, yet no method solves it once and for all. In this review, we analyse the correspondence problem, discuss current state-of-the-art methods for synchronising samples, and predict the properties of future methods.


Journal of Chromatography A | 2008

Feature detection and alignment of hyphenated chromatographic-mass spectrometric data: Extraction of pure ion chromatograms using Kalman tracking

K. Magnus Åberg; Ralf J. O. Torgrip; Johan Kolmert; Johan Lindberg

In this paper we present a new method, called TracMass, for analyzing data obtained using hyphenated chromatography-mass spectrometry (XC/MS). The method uses a Kalman filter to extract pure, noise-free ion chromatograms by exploiting the latent second-order structure in the XC/MS data. TracMass differs from current state-of-the-art methodologies, which extract chromatograms by binning along the m/z axis and then process the data further in various ways, e.g., by baseline correction, component detection algorithms, peak detection, and curve resolution, to extract molecular features. The proposed method was validated by analyzing two plasma datasets. The first was derived from 99 quality control samples, where TracMass extracted 8880 Pure Ion Chromatograms (PICs) present in ≥90 of the samples. The second dataset was spiked with two different internal standard mixtures to test differential expression analysis. Here TracMass found 20000 PICs present in 10 samples, all differentially expressed analytes, and also a previously unreported discriminating metabolite. Finding as many PICs as possible is essential in this context to ensure that even small differentiating features are found (if they exist). The resulting data representation from TracMass (PICs) can be used directly for statistical analysis, and the method is fast (approximately 5 min/sample), with few adjustable parameters.
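The idea of following an ion trace with a Kalman filter can be illustrated with a minimal sketch. This is not the TracMass implementation: the abstract does not specify the state model, so the scalar random-walk model, the function name `track_mz`, and the noise variances `q` and `r` below are assumptions made purely for illustration.

```python
def track_mz(scans, mz0, q=1e-6, r=1e-4, gate=3.0):
    """Follow one ion trace across scans with a scalar Kalman filter.

    scans: list of scans, each a list of (mz, intensity) centroids.
    mz0:   initial m/z estimate for the trace.
    q, r:  assumed process and measurement noise variances (hypothetical values).
    gate:  accept a centroid only if it lies within `gate` standard deviations
           of the predicted m/z.
    Returns a list of (scan_index, mz, intensity) assigned to the trace.
    """
    x, p = mz0, r              # state estimate (m/z) and its variance
    trace = []
    for i, scan in enumerate(scans):
        p = p + q              # predict: random-walk model, m/z assumed constant
        s = p + r              # innovation variance
        # Keep only centroids that fall inside the gate around the prediction.
        candidates = [(mz, inten) for mz, inten in scan
                      if (mz - x) ** 2 <= gate ** 2 * s]
        if not candidates:
            continue           # no measurement in this scan: keep the prediction
        mz, inten = min(candidates, key=lambda c: abs(c[0] - x))
        k = p / s              # Kalman gain
        x = x + k * (mz - x)   # update the m/z estimate
        p = (1.0 - k) * p      # update its variance
        trace.append((i, mz, inten))
    return trace


if __name__ == "__main__":
    scans = [
        [(300.001, 10.0), (300.5, 99.0)],   # second centroid is an unrelated ion
        [(299.999, 20.0)],
        [(301.0, 5.0)],                      # nothing near the tracked m/z
        [(300.002, 15.0)],
    ]
    for scan_index, mz, intensity in track_mz(scans, 300.0):
        print(scan_index, mz, intensity)
```

The intensities collected along the trace form a chromatogram for a single ion without binning the m/z axis, which is the contrast with binning-based methods that the abstract draws.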


Journal of Chemical Physics | 2004

Determination of solvation free energies by adaptive expanded ensemble molecular dynamics

K. Magnus Åberg; Alexander P. Lyubartsev; Sven P. Jacobsson; Aatto Laaksonen

A new method of calculating absolute free energies is presented. It was developed as an extension to the expanded ensemble molecular dynamics scheme and uses probability density estimation to continuously optimize the expanded ensemble parameters. The new method is much faster as it removes the time-consuming and expertise-requiring step of determining balancing factors. Its efficiency and accuracy are demonstrated for the dissolution of three qualitatively very different chemical species in water: methane, ionic salts, and benzylamine. A recently suggested optimization scheme by Wang and Landau [Phys. Rev. Lett. 86, 2050 (2001)] was also implemented and found to be computationally less efficient than the proposed adaptive expanded ensemble method.
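The Wang and Landau scheme that the abstract uses as a comparison can be sketched on a toy problem. The example below is not the adaptive expanded ensemble method of the paper and involves no molecular dynamics: it assumes a small chain of discrete subensembles with known free energies (hypothetical values) and shows how penalising visited states drives the running log-weights towards values that flatten the visit histogram.

```python
import math
import random


def wang_landau(free_energy, steps_per_stage=20000, ln_f_final=1e-4, seed=0):
    """Estimate free energy differences between discrete states (toy model).

    free_energy: true free energies F_m (in units of kT) defining the target
                 weights exp(-F_m); used here only to drive the toy sampler.
    Returns estimates of F_m - F_0 recovered from the learned log-weights.
    """
    rng = random.Random(seed)
    n = len(free_energy)
    g = [0.0] * n              # running log-weights, one per state
    state = 0
    ln_f = 1.0                 # modification factor, halved after every stage
    while ln_f > ln_f_final:
        for _ in range(steps_per_stage):
            new = state + (1 if rng.random() < 0.5 else -1)
            if 0 <= new < n:
                # Log acceptance ratio for the biased weight exp(-F_m - g_m).
                delta = (-free_energy[new] - g[new]) - (
                    -free_energy[state] - g[state])
                if delta >= 0 or rng.random() < math.exp(delta):
                    state = new
            g[state] += ln_f   # penalise the state occupied after the move
        ln_f *= 0.5
    # A flat histogram implies F_m + g_m = const, so F_m - F_0 = -(g_m - g_0).
    return [-(gm - g[0]) for gm in g]


if __name__ == "__main__":
    true_f = [0.0, 1.0, 2.5, 1.5, 0.5]
    for truth, estimate in zip(true_f, wang_landau(true_f)):
        print(f"true {truth:4.1f}   estimated {estimate:6.3f}")
```

In an expanded ensemble simulation the learned log-weights play the role of the balancing factors; the fixed number of steps per stage used here is a simplification of the usual histogram-flatness check.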


Journal of Environmental Monitoring | 2011

Organophosphate and phthalate esters in indoor air: a comparison between multi-storey buildings with high and low prevalence of sick building symptoms

Caroline Bergh; K. Magnus Åberg; Magnus Svartengren; Gunnel Emenius; Conny Östman

An extensive study has been conducted on the prevalence of organophosphorus flame retardants/plasticizers and phthalate ester plasticizers in indoor air. The targeted substances were measured in 45 multi-storey apartment buildings in Stockholm, Sweden. The apartment buildings were classified as high or low risk with regard to the reporting of sick building symptoms (SBS) within the project Healthy Sustainable Houses in Stockholm (3H). Air samples were taken from two to four apartments per building (in total 169 apartments) to facilitate comparison within and between buildings. Association with building characteristics has been examined, as well as association with specific sources, by combining chemical analysis and exploratory uni- and multivariate data analysis. The study contributes to the overall perspective on levels of organophosphate and phthalate esters in indoor air, enabling comparison with other studies. The results indicated little or no difference in the concentrations of the target substances between the two risk classifications of the buildings. The differences between the apartments sampled within (intra) buildings were greater than the differences between (inter) buildings. The concentrations measured in air ranged up to 1200 ng m⁻³ for organophosphate esters and up to 11 000 ng m⁻³ for phthalate esters. Some specific sources were discerned, e.g., PVC flooring is a major source of benzylbutyl phthalate in indoor air.


Analytical and Bioanalytical Chemistry | 2009

A solution to the 1D NMR alignment problem using an extended generalized fuzzy Hough transform and mode support

Erik Alm; Ralf J. O. Torgrip; K. Magnus Åberg; Johan Lindberg

This paper approaches the problem of intersample peak correspondence in the context of later applying statistical data analysis techniques to 1D 1H-nuclear magnetic resonance (NMR) data. Any data analysis methodology will fail to produce meaningful results if the analyzed data table is not synchronized, i.e., each analyzed variable frequency (Hz) does not originate from the same chemical source throughout the entire dataset. This is typically the case when dealing with NMR data from biological samples. In this paper, we present a new state of the art for solving this problem using the generalized fuzzy Hough transform (GFHT). This paper describes significant improvements since the method was introduced for NMR datasets of plasma in Csenki et al. (Anal Bioanal Chem 389:875-885, 2007); the method is now capable of synchronizing peaks from more complex datasets such as urine as well as plasma data. We present a novel way of globally modeling peak shifts using principal component analysis, a new algorithm for calculating the transform, and an effective peak detection algorithm. The algorithm is applied to two real metabonomic 1H-NMR datasets and the properties of the method are compared to bucketing. We implicitly prove that GFHT establishes the objectively true correspondence. Desirable features of the GFHT are: (1) intersample peak correspondence even if peaks change order on the frequency axis and (2) the method is symmetric with respect to the samples.

Figure: From chaos to order: heatmaps of a 1H-NMR spectral segment before and after sorting on one peak position. The sample order after sorting reveals that peak positions exhibit distinctive patterns, which are modeled by the GFHT to establish correspondence.


Analytical Chemistry | 2014

TracMass 2—A Modular Suite of Tools for Processing Chromatography-Full Scan Mass Spectrometry Data

Erik Tengstrand; Johan Lindberg; K. Magnus Åberg

In untargeted proteomics and metabolomics, raw data obtained with an LC/MS instrument are processed into a format that can be used for statistical analysis. Full scan MS data from chromatographic separation of biological samples are complex, and analyte concentrations need to be extracted and aligned so that they can be compared across the samples. Several computer programs and methods have been developed for this purpose. There is still a need to improve the ease of use and feedback to the user because of the advanced multiparametric algorithms used. Here, we present and make publicly available TracMass 2, a suite of computer programs that gives immediate graphical feedback to the data analyst on parameter settings and processing results, as well as producing state-of-the-art results. The main advantage of TracMass 2 is that the feedback and transparency of the processing steps generate confidence in the end result, which is a table of peak intensities. The data analyst can easily validate every step of the processing pipeline. Because the user receives feedback on how all parameter values affect the result before starting a lengthy computation, the user's learning curve is improved and the total time used for data processing can be reduced. TracMass 2 has been released as open source and is included in the Supporting Information. We anticipate that TracMass 2 will set a new standard for how chemometrical algorithms are implemented in computer programs.


Analytical and Bioanalytical Chemistry | 2012

Automated annotation and quantification of metabolites in 1H NMR data of biological origin

Erik Alm; Tove Slagbrand; K. Magnus Åberg; Erik Wahlström; Ingela Gustafsson; Johan Lindberg

In 1H NMR metabolomic datasets, there are often over a thousand peaks per spectrum, many of which change position drastically between samples. Automatic alignment, annotation, and quantification of all the metabolites of interest in such datasets have not been feasible. In this work we propose a fully automated annotation and quantification procedure which requires annotation of metabolites only in a single spectrum. The reference database built from that single spectrum can be used for any number of 1H NMR datasets with a similar matrix. The procedure is based on the generalized fuzzy Hough transform (GFHT) for alignment and on principal component analysis (PCA) for peak selection and quantification. We show that we can establish quantities of 21 metabolites in several 1H NMR datasets and that the procedure is extendable to include any number of metabolites that can be identified in a single spectrum. The procedure speeds up the quantification of previously known metabolites and also returns a table containing the intensities and locations of all the peaks that were found and aligned but not assigned to a known metabolite. This enables both biopattern analysis of known metabolites and data mining for new potential biomarkers among the unknowns.


Analytical and Bioanalytical Chemistry | 2010

Time-resolved biomarker discovery in 1H-NMR data using generalized fuzzy Hough transform alignment and parallel factor analysis

Erik Alm; Ralf J. O. Torgrip; K. Magnus Åberg; Johan Lindberg

This work addresses the subject of time-series analysis of comprehensive 1H-NMR data of biological origin. One of the problems with toxicological and efficacy studies is the confounding of correlation between the administered drug, its metabolites and the systemic changes in molecular dynamics, i.e., the flux of drug-related molecules correlates with the molecules of system regulation. This correlation poses a problem for biomarker mining since this confounding must be untangled in order to separate true biomarker molecules from dose-related molecules. One way of achieving this goal is to perform pharmacokinetic analysis. The difference in pharmacokinetic time profiles of different molecules can aid in the elucidation of the origin of the dynamics; this can be achieved even when the identity of the molecule is unknown. This mode of analysis is the basis for metabonomic studies of toxicology and efficacy. One major problem concerning the analysis of 1H-NMR data generated from metabonomic studies is that of peak positional variation and peak overlap. These phenomena induce variance in the data, obscuring the true information content, and are hence unwanted but hard to avoid. Here, we show that by using the generalized fuzzy Hough transform spectral alignment, variable selection, and parallel factor analysis, we can solve both the alignment and the confounding problem stated above. Using the outlined method, several different temporal concentration profiles can be resolved and the majority of the studied molecules and their respective fluxes can be attributed to these resolved kinetic profiles. The resolved time profiles thereby simplify finding true biomarkers and bio-patterns for early detection of biological conditions, as well as providing more detailed information about the studied biological system. The presented method represents a significant step forward in time-series analysis of biological 1H-NMR data as it provides almost full automation of the whole data analysis process and is able to analyze over 800 unique features per sample. The method is demonstrated using a 1H-NMR rat urine dataset from a toxicology study and is compared with a classical approach: COW alignment followed by bucketing.


Chemometrics and Intelligent Laboratory Systems | 2001

Pre-processing of three-way data by pulse-coupled neural networks—an imaging approach

K. Magnus Åberg; Sven P. Jacobsson

A new method for pre-processing three-dimensional data to model quantitative structure-retention relationships (QSRR) is presented. The pre-processing of three-dimensional images of molecules is done with a pulse-coupled neural network (PCNN). The PCNN is capable of transforming an image to a short time series representation of the molecule, which is more suitable for QSRR modelling with partial least squares than the original data. The method was developed and tested on a steroid data set of 24 compounds with reversed-phase high-performance liquid chromatographic retention data. The QSRR models are stable with respect to the parameters of the PCNN. Test set correlations (q²) of 0.95 and cross-validated r² of about 0.95 are readily obtained.


Journal of Chemometrics | 2018

Can we beat overfitting?—A closer look at Cloarec's PLS algorithm

Pedro F. M. Sousa; K. Magnus Åberg

Random noise has been addressed as a cause of overfitting in partial least squares regression. A previous study pinpointed that one of the sources of overfitting resides in the calculation of scores due to the accumulation of noise in the diagonal of the variance‐covariance matrix, and a modified partial least squares regression was proposed with the removal of this diagonal prior to the score calculation. Here, a further modification of the NIPALS algorithm is proposed, with the same ability to overcome overfitting due to noise, but algebraically more similar to the original NIPALS. The results indicate that it is possible to get more reliable auto‐prediction R² with a cross‐validation performance close to that of the original NIPALS algorithm.
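For orientation, the original NIPALS algorithm that the abstract takes as its baseline can be sketched for a single response (PLS1). This is the textbook procedure only, written in plain Python; it contains neither Cloarec's diagonal removal nor the modification proposed in the paper, and the function name `nipals_pls1` is chosen here for illustration.

```python
def nipals_pls1(X, y, n_components):
    """Textbook NIPALS for PLS regression with a single response (PLS1).

    X: list of rows (samples) with equal length; y: list of responses.
    Returns (fitted, scores): fitted y values on the training data and the
    list of score vectors, one per extracted component.
    """
    n, m = len(X), len(X[0])
    # Mean-centre X and y; the means are added back through `fitted`.
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    f = [v - y_mean for v in y]
    fitted = [y_mean] * n
    scores = []
    for _ in range(n_components):
        # Weight vector: direction in X space of maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm == 0.0:
            break              # nothing left in y that X can explain
        w = [v / norm for v in w]
        # Scores, X loadings and y loading for this component.
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        fitted = [fitted[i] + q * t[i] for i in range(n)]
        scores.append(t)
    return fitted, scores


if __name__ == "__main__":
    X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
    y = [2.0 * a - b + 1.0 for a, b in X]   # noise-free linear response
    fitted, scores = nipals_pls1(X, y, n_components=2)
    print([round(v, 6) for v in fitted])
```

With noisy data, the fit on the training samples (auto-prediction) keeps improving as components are added while cross-validated performance does not, which is the overfitting gap the abstract refers to.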

Collaboration


Dive into K. Magnus Åberg's collaborations.

Top Co-Authors

Erik Alm

Stockholm University
