Tomi Peltola
Aalto University
Publication
Featured research published by Tomi Peltola.
Journal of Proteome Research | 2012
Ville Petteri Mäkinen; Tuulia Tynkkynen; Pasi Soininen; Tomi Peltola; Antti J. Kangas; Carol Forsblom; Lena M. Thorn; Kimmo Kaski; Reino Laatikainen; Mika Ala-Korpela; Per-Henrik Groop
Type 1 diabetic patients with varying severity of kidney disease were investigated to create multimetabolite models of the disease process. Urinary albumin excretion rate was measured for 3358 patients with type 1 diabetes. Prospective records were available for 1051 patients, of whom 163 showed progression of albuminuria (8.3-year follow-up), and 162 were selected as stable controls. At baseline, serum lipids, lipoprotein subclasses, and low-molecular weight metabolites were quantified by NMR spectroscopy (325 samples). The data were analyzed by the self-organizing map. In cross-sectional analyses, patients with no complications had low serum lipids, less inflammation, and better glycemic control, whereas patients with advanced kidney disease had high serum cystatin-C and sphingomyelin. These phenotype extremes shared low unsaturated fatty acids (UFAs) and phospholipids. Prospectively, progressive albuminuria was associated with high UFAs, phospholipids, and IDL and LDL lipids. Progression at longer duration was associated with high HDL lipids, whereas earlier progression was associated with poor glycemic control, increased saturated fatty acids (SFAs), and inflammation. Diabetic kidney disease consists of diverse metabolic phenotypes: UFAs, phospholipids, IDL, and LDL may be important in the subclinical phase, high SFAs and low HDL suggest accelerated progression, and the sphingolipid pathway in advanced kidney injury deserves further research.
Annals of Medicine | 2012
Daniel Gordin; Johan Wadén; Carol Forsblom; Lena M. Thorn; Milla Rosengård-Bärlund; Outi Heikkilä; Markku Saraheimo; Nina Tolonen; Kustaa Hietala; Aino Soro-Paavonen; Laura Salovaara; Ville Petteri Mäkinen; Tomi Peltola; Luciano Bernardi; Per-Henrik Groop
Introduction/aims. While patients with type 1 diabetes (T1D) are known to suffer from early cardiovascular disease (CVD), we examined associations between arterial stiffness and diabetic complications in a large patient group with T1D. Methods. This study included 807 subjects (622 T1D and 185 healthy volunteers (age 40.6 ± 0.7 versus 41.6 ± 1.2 years; P = NS)). Arterial stiffness was measured by pulse wave analysis in each participant. Furthermore, information on diabetic retinopathy, nephropathy, and CVD was collected. The renal status was verified from at least two out of three urine collections. Results. Patients with T1D without signs of diabetic nephropathy had stiffer arteries measured as the augmentation index (AIx) than age-matched control subjects (17.3% ± 0.6% versus 10.0% ± 1.2%; P < 0.001). Moreover, AIx (OR 1.08; 95% CI 1.03–1.13; P = 0.002) was associated with diabetic laser-treated retinopathy in patients with normoalbuminuria in a multivariate logistic regression analysis. The same was true for AIx and diabetic nephropathy (1.04 (1.01–1.08); P = 0.004) as well as AIx and CVD (1.06 (1.00–1.12); P = 0.01) in patients with T1D. Conclusions. Arterial stiffness was associated with microvascular and macrovascular complications in patients with T1D.
PLOS ONE | 2012
Tomi Peltola; Pekka Marttinen; Antti Jula; Veikko Salomaa; Markus Perola; Aki Vehtari
Although complex diseases and traits are thought to have a multifactorial genetic basis, the common methods in genome-wide association analyses test each variant for association independently of the others. This computational simplification may lead to reduced power to identify variants with small effect sizes and requires correcting for multiple hypothesis tests with complex relationships. However, advances in computational methods and increases in computational resources are enabling the computation of models that adhere more closely to the theory of multifactorial inheritance. Here, a Bayesian variable selection and model averaging approach is formulated for searching for additive and dominant genetic effects. The approach simultaneously considers all available variants for inclusion as predictors in a linear genotype-phenotype mapping and averages over the uncertainty in the variable selection. This leads to naturally interpretable summary quantities on the significances of the variants and their contribution to the genetic basis of the studied trait. We first characterize the behavior of the approach in simulations. The results indicate a gain in the causal variant identification performance when additive and dominant variation are simulated, with a negligible loss of power in the purely additive case. An application to the analysis of high- and low-density lipoprotein cholesterol levels in a dataset of 3895 Finns is then presented, demonstrating the feasibility of the approach at the current scale of single-nucleotide polymorphism data. We describe a Markov chain Monte Carlo algorithm for the computation and give suggestions on the specification of prior parameters using commonly available prior information. Open-source software implementing the method is available at http://www.lce.hut.fi/research/mm/bmagwa/ and https://github.com/to-mi/.
PLOS ONE | 2012
Tomi Peltola; Pekka Marttinen; Aki Vehtari
High-dimensional datasets with large amounts of redundant information are nowadays available for hypothesis-free exploration of scientific questions. A particular case is genome-wide association analysis, where variations in the genome are searched for effects on disease or other traits. Bayesian variable selection has been demonstrated as a possible analysis approach, which can account for the multifactorial nature of the genetic effects in a linear regression model. Yet, the computation presents a challenge and application to large-scale data is not routine. Here, we study aspects of the computation using the Metropolis-Hastings algorithm for the variable selection: finite adaptation of the proposal distributions, multistep moves for changing the inclusion state of multiple variables in a single proposal, and multistep move size adaptation. We also experiment with a delayed rejection step for the multistep moves. Results on simulated and real data show an increase in sampling efficiency. We also demonstrate that with application-specific proposals, the approach can overcome a specific mixing problem in real data with 3822 individuals and 1,051,811 single nucleotide polymorphisms and uncover a variant pair with a synergistic effect on the studied trait. Moreover, we illustrate multimodality in the real dataset related to a restrictive prior distribution on the genetic effect sizes and advocate a more flexible alternative.
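As a rough illustration of the basic sampler the abstract above builds on, the following sketch runs a plain single-flip Metropolis-Hastings chain over binary inclusion indicators. It is not the authors' implementation: the function names, the toy log-posterior, and the `LOGITS` values are assumptions chosen so that the true inclusion probabilities are known, and the adaptive proposals, multistep moves, and delayed rejection studied in the paper are not included.

```python
import math
import random

def mh_variable_selection(log_post, n_vars, n_iter, rng):
    """Single-flip Metropolis-Hastings over inclusion indicators.

    log_post maps a 0/1 list to an unnormalized log posterior (hypothetical
    interface). Returns the fraction of iterations each variable was included.
    """
    gamma = [0] * n_vars              # start from the empty model
    current = log_post(gamma)
    counts = [0] * n_vars
    for _ in range(n_iter):
        j = rng.randrange(n_vars)     # pick one variable uniformly at random
        proposal = gamma[:]
        proposal[j] = 1 - proposal[j]  # flip its inclusion state
        candidate = log_post(proposal)
        # The flip proposal is symmetric, so the acceptance probability
        # reduces to the posterior ratio, capped at 1.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            gamma, current = proposal, candidate
        for k in range(n_vars):
            counts[k] += gamma[k]
    return [c / n_iter for c in counts]

# Toy target (assumption): independent indicators with known log-odds,
# so the exact inclusion probability of variable k is sigmoid(LOGITS[k]).
LOGITS = [2.0, 0.0, -2.0, -2.0]

def toy_log_post(gamma):
    return sum(g * l for g, l in zip(gamma, LOGITS))

inclusion = mh_variable_selection(toy_log_post, len(LOGITS), 40000, random.Random(1))
exact = [1.0 / (1.0 + math.exp(-l)) for l in LOGITS]
```

In a real genome-wide analysis the log posterior would come from the regression model's marginal likelihood and prior, and a uniform single-flip proposal would mix slowly across a million variants, which is the kind of inefficiency the proposal adaptations in the abstract are aimed at.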
Machine Learning | 2017
Pedram Daee; Tomi Peltola; Marta Soare; Samuel Kaski
Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the relevance of the covariates, or of values of the regression coefficients, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of the proposed method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert.
Intelligent User Interfaces | 2017
Luana Micallef; Iiris Sundin; Pekka Marttinen; Muhammad Ammad-ud-din; Tomi Peltola; Marta Soare; Giulio Jacucci; Samuel Kaski
Providing accurate predictions is challenging for machine learning algorithms when the number of features is larger than the number of samples in the data. Prior knowledge can improve machine learning models by indicating relevant variables and parameter values. Yet, this prior knowledge is often tacit and only available from domain experts. We present a novel approach that uses interactive visualization to elicit the tacit prior knowledge and uses it to improve the accuracy of prediction models. The main component of our approach is a user model that models the domain expert's knowledge of the relevance of different features for a prediction task. In particular, based on the expert's earlier input, the user model guides the selection of the features on which to elicit the user's knowledge next. The results of a controlled user study show that the user model significantly improves prior knowledge elicitation and prediction accuracy, when predicting the relative citation counts of scientific documents in a specific domain.
Scientific Reports | 2018
Elsa Marques; Tomi Peltola; Samuel Kaski; Juha Klefström
In metazoans, epithelial architecture provides a context that dynamically modulates most if not all epithelial cell responses to intrinsic and extrinsic signals, including growth or survival signalling and transforming oncogene action. Three-dimensional (3D) epithelial culture systems provide tractable models to interrogate the function of human genetic determinants in the establishment of context-dependency. We performed an arrayed genetic shRNA screen in mammary epithelial 3D cultures to identify new determinants of epithelial architecture, finding that the key phenotype-impacting shRNAs altered not only the data population average but even more noticeably the population distribution. The broad distributions were attributable to sporadic gene silencing actions by shRNA in unselected populations. We employed the Maximum Mean Discrepancy concept to capture similar population distribution patterns and demonstrate here the feasibility of the test in identifying an impact of shRNA in populations of 3D structures. Integration of the clustered morphometric data with protein-protein interaction data enabled hypothesis generation of novel biological pathways underlying similar 3D phenotype alterations. The results present a new strategy for 3D phenotype-driven pathway analysis, which is expected to accelerate discovery of context-dependent gene functions in epithelial biology and tumorigenesis.
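To make the Maximum Mean Discrepancy idea in the abstract above more concrete, here is a minimal sketch of the biased squared-MMD estimate between two one-dimensional samples. It is not the authors' pipeline: the Gaussian kernel, the bandwidth of 1.0, the function names, and the synthetic samples are all assumptions made for illustration. The example compares a sample against one with the same mean but a broader spread, the kind of distribution change that a comparison of averages alone would miss.

```python
import math
import random

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

rng = random.Random(0)
# Synthetic stand-ins for a morphometric feature (assumption, not real data).
control = [rng.gauss(0.0, 1.0) for _ in range(200)]
control_repeat = [rng.gauss(0.0, 1.0) for _ in range(200)]
broadened = [rng.gauss(0.0, 3.0) for _ in range(200)]  # same mean, wider spread

mmd_null = mmd_squared(control, control_repeat)   # same distribution
mmd_broad = mmd_squared(control, broadened)       # distribution has changed
```

Here `mmd_broad` comes out clearly larger than `mmd_null` even though both comparison samples are centred on zero. Turning the statistic into a test would additionally need a null distribution, for instance from permuting sample labels, which this sketch omits.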
Annals of Medicine | 2012
Sanna Kuusisto; Tomi Peltola; Maria Laitinen; Linda S. Kumpula; Ville Petteri Mäkinen; Tuire Salonurmi; Pirjo Hedberg; Matti Jauhiainen; Markku J. Savolainen; Minna L. Hannuksela; Mika Ala-Korpela
Context and objective. Lipoproteins are involved in the pathophysiology of several metabolic diseases. Here we focus on the interplay between lipoprotein metabolism and adiponectin with the extension of alcohol intake. Design and subjects. Eighty-three low-to-moderate and 80 heavy alcohol drinkers were studied. Plasma adiponectin, other biochemical and extensive lipoprotein data were measured. Self-organizing maps were applied to characterize lipoprotein phenotypes and their interrelationships with biochemical measures and alcohol consumption. Results. Alcohol consumption and plasma adiponectin had a strong positive association. Heavy alcohol consumption was associated with decreased low-density lipoprotein cholesterol (LDL-C). Nevertheless, two distinct lipoprotein phenotypes were identified, one with elevated high-density lipoprotein cholesterol (HDL-C) and decreased very-low-density lipoprotein triglycerides (VLDL-TG) together with low prevalence of metabolic syndrome, and the other vice versa. The HDL particles were enlarged in both phenotypes related to the heavy drinkers. The low-to-moderate alcohol drinkers were characterized with high LDL-C and C-enriched LDL particles. Conclusions. The analyses per se illustrated the multi-faceted and non-linear nature of lipoprotein metabolism. The heavy alcohol drinkers were characterized either by an anti-atherogenic lipoprotein phenotype (with also the highest adiponectin concentrations) or by a phenotype with pro-atherogenic and metabolic syndrome-like features. Clinically this underlines the need to distinguish the differing individual risk for lipid-related metabolic disturbances also in heavy alcohol drinkers.
Intelligent User Interfaces | 2018
Pedram Daee; Tomi Peltola; Aki Vehtari; Samuel Kaski
In human-in-the-loop machine learning, the user provides information beyond that in the training data. Many algorithms and user interfaces have been designed to optimize and facilitate this human-machine interaction; however, fewer studies have addressed the potential defects the designs can cause. Effective interaction often requires exposing the user to the training data or its statistics. The design of the system is then critical, as this can lead to double use of data and overfitting, if the user reinforces noisy patterns in the data. We propose a user modelling methodology, by assuming simple rational behaviour, to correct the problem. We show, in a user study with 48 participants, that the method improves predictive performance in a sparse linear regression sentiment analysis task, where graded user knowledge on feature relevance is elicited. We believe that the key idea of inferring user knowledge with probabilistic user models has general applicability in guarding against overfitting and improving interactive machine learning.
Intelligent Systems in Molecular Biology | 2018
Iiris Sundin; Tomi Peltola; Luana Micallef; Homayun Afrabandpey; Marta Soare; Muntasir Mamun Majumder; Pedram Daee; Chen He; Baris Serim; Aki S. Havulinna; Caroline Heckman; Giulio Jacucci; Pekka Marttinen; Samuel Kaski
Predicting the efficacy of a drug for a given individual, using high-dimensional genomic measurements, is at the core of precision medicine. However, identifying features on which to base the predictions remains a challenge, especially when the sample size is small. Incorporating expert knowledge offers a promising alternative to improve a prediction model, but collecting such knowledge is laborious to the expert if the number of candidate features is very large. We introduce a probabilistic model that can incorporate expert feedback about the impact of genomic measurements on the sensitivity of a cancer cell for a given drug. We also present two methods to intelligently collect this feedback from the expert, using experimental design and multi-armed bandit models. In a multiple myeloma blood cancer data set (n=51), expert knowledge decreased the prediction error by 8%. Furthermore, the intelligent approaches can be used to reduce the workload of feedback collection to less than 30% on average compared to a naive approach.
Motivation. Precision medicine requires the ability to predict the efficacies of different treatments for a given individual using high-dimensional genomic measurements. However, identifying predictive features remains a challenge when the sample size is small. Incorporating expert knowledge offers a promising approach to improve predictions, but collecting such knowledge is laborious if the number of candidate features is very large. Results. We introduce a probabilistic framework to incorporate expert feedback about the impact of genomic measurements on the outcome of interest and present a novel approach to collect the feedback efficiently, based on Bayesian experimental design. The new approach outperformed other recent alternatives in two medical applications: prediction of metabolic traits and prediction of sensitivity of cancer cells to different drugs, both using genomic features as predictors. Furthermore, the intelligent approach to collect feedback reduced the workload of the expert to approximately 11%, compared to a baseline approach. Availability and implementation. Source code implementing the introduced computational methods is freely available at https://github.com/AaltoPML/knowledge-elicitation-for-precision-medicine.