Publications

Featured research published by Henrik Boström.


International Conference on Computational Linguistics | 2001

Automatic Keyword Extraction Using Domain Knowledge

Anette Hulth; Jussi Karlgren; Anna Jonsson; Henrik Boström; Lars Asker

Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By including a hierarchically organised domain-specific thesaurus as a second knowledge source, the quality of such keywords was improved considerably, as measured by the match to previously manually assigned keywords. In the presented experiment, the evidence from frequency analysis and the hierarchically organised thesaurus was combined using inductive logic programming.
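
The first knowledge source described above, term frequency, is easy to illustrate; the thesaurus boost in the sketch below is a hypothetical stand-in for the second source, and the authors' ILP-based combination is not shown.

```python
import re
from collections import Counter

def frequency_keywords(text, thesaurus_terms=None, boost=2.0, top_k=5):
    """Rank candidate keywords by term frequency; terms that also occur in a
    (hypothetical) domain-specific thesaurus get their score multiplied by boost."""
    tokens = re.findall(r"[a-zåäö]+", text.lower())
    counts = Counter(tokens)
    thesaurus_terms = thesaurus_terms or set()
    scores = {t: c * (boost if t in thesaurus_terms else 1.0) for t, c in counts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(frequency_keywords(
    "keyword extraction assigns keywords to documents by analysing document terms",
    thesaurus_terms={"extraction"}))
```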


International Conference on Machine Learning and Applications | 2006

Reducing High-Dimensional Data by Principal Component Analysis vs. Random Projection for Nearest Neighbor Classification

Sampath Deegalla; Henrik Boström

The computational cost of using nearest neighbor classification often prevents the method from being applied in practice when dealing with high-dimensional data, such as images and microarrays. One possible solution to this problem is to reduce the dimensionality of the data, ideally without losing predictive performance. Two different dimensionality reduction methods, principal component analysis (PCA) and random projection (RP), are investigated for this purpose and compared w.r.t. the performance of the resulting nearest neighbor classifier on five image data sets and five microarray data sets. The experimental results demonstrate that PCA outperforms RP for all data sets used in this study. However, the experiments also show that PCA is more sensitive to the choice of the number of reduced dimensions. After reaching a peak, the accuracy degrades with the number of dimensions for PCA, while the accuracy for RP increases with the number of dimensions. The experiments also show that the use of PCA and RP may even outperform using the non-reduced feature set (in 9 and 6 cases out of 10, respectively), hence resulting not only in more efficient, but also more effective, nearest neighbor classification.
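
A rough scikit-learn sketch of the comparison described above is given below; the digits data and the choice of 20 components are placeholders, not the data sets or settings used in the paper.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.random_projection import GaussianRandomProjection

X, y = load_digits(return_X_y=True)   # stand-in for image or microarray data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, reducer in [("PCA", PCA(n_components=20)),
                      ("RP", GaussianRandomProjection(n_components=20, random_state=0))]:
    knn = make_pipeline(reducer, KNeighborsClassifier(n_neighbors=1))
    knn.fit(X_tr, y_tr)
    print(name, "accuracy:", round(knn.score(X_te, y_te), 3))
```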


European Conference on Principles of Data Mining and Knowledge Discovery | 2000

Learning First Order Logic Time Series Classifiers: Rules and Boosting

Juan José Rodríguez; Carlos J. Alonso González; Henrik Boström

A method for learning multivariate time series classifiers by inductive logic programming is presented. Two types of background predicate that are suited for this task are introduced: interval-based predicates, such as always, and distance-based predicates, such as the Euclidean distance. Special-purpose techniques are presented that allow these predicates to be handled efficiently when performing top-down induction. Furthermore, by employing boosting, the accuracy of the resulting classifiers can be improved significantly. Experiments on several different datasets show that the proposed method is highly competitive with previous approaches.
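
The two predicate types named above can be pictured as boolean tests on a series; the sketch below loosely illustrates such tests used as features for a boosted classifier, not the paper's ILP formulation, and all thresholds, intervals and data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def always_below(series, start, end, threshold):
    """Interval-based predicate: every value in [start, end) stays below threshold."""
    return bool(np.all(series[start:end] < threshold))

def close_to(series, reference, max_dist):
    """Distance-based predicate: Euclidean distance to a reference series is small."""
    return bool(np.linalg.norm(series - reference) < max_dist)

rng = np.random.default_rng(0)
X_series = rng.normal(size=(200, 50))                   # 200 synthetic series
y = (X_series[:, 10:20].mean(axis=1) > 0).astype(int)   # synthetic labels
reference = X_series[y == 1].mean(axis=0)

# Evaluate the predicates on every series and boost shallow trees over the results.
features = np.array([[always_below(s, 10, 20, 0.5), close_to(s, reference, 7.0)]
                     for s in X_series])
booster = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=20)
print(booster.fit(features, y).score(features, y))
```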


International Conference on Machine Learning and Applications | 2007

Estimating class probabilities in random forests

Henrik Boström

For both single probability estimation trees (PETs) and ensembles of such trees, commonly employed class probability estimates correct the observed relative class frequencies in each leaf to avoid anomalies caused by small sample sizes. The effect of such corrections in random forests of PETs is investigated, and the use of the relative class frequency is compared to using two corrected estimates, the Laplace estimate and the m-estimate. An experiment with 34 datasets from the UCI repository shows that estimating class probabilities using relative class frequency clearly outperforms both using the Laplace estimate and the m-estimate with respect to accuracy, area under the ROC curve (AUC) and Brier score. Hence, in contrast to what is commonly employed for PETs and ensembles of PETs, these results strongly suggest that a non-corrected probability estimate should be used in random forests of PETs. The experiment further shows that learning random forests of PETs using relative class frequency significantly outperforms learning random forests of classification trees (i.e., trees for which only an unweighted vote on the most probable class is counted) with respect to both accuracy and AUC, but that the latter is clearly ahead of the former with respect to Brier score.

Extracting association rules from large datasets typically results in a huge number of rules. One approach to tackling this problem is to filter the resulting rule set, which reduces the number of rules at the cost of also eliminating potentially interesting ones. In exploring a new dataset in search of relevant associations, it may be more useful for miners to have an overview of the space of rules obtainable from the dataset, rather than an arbitrary set satisfying high values for given interest measures. We describe a rule extraction approach that favors rule diversity, allowing miners to gain an overview of the rule space while reducing semantic redundancy within the rule set. This approach adopts itemset-driven rule generation coupled with a cluster-based filtering process. The resulting rule set provides a starting point for user-driven exploration.
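
For reference, the three class probability estimates compared in the random forest study above are standard and easy to state; the sketch below uses an arbitrary leaf count, prior and m value purely for illustration.

```python
def relative_frequency(n_c, n):
    """Uncorrected estimate: fraction of the leaf's examples that belong to class c."""
    return n_c / n

def laplace_estimate(n_c, n, num_classes):
    """Laplace correction: one pseudo-example added per class."""
    return (n_c + 1) / (n + num_classes)

def m_estimate(n_c, n, prior_c, m):
    """m-estimate: shrink the relative frequency towards the class prior with strength m."""
    return (n_c + m * prior_c) / (n + m)

# A leaf with 3 of 4 examples in class c, a uniform prior over two classes, and m = 2.
print(relative_frequency(3, 4), laplace_estimate(3, 4, 2), m_estimate(3, 4, 0.5, 2))
```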


Intelligent Data Engineering and Automated Learning | 2007

Classification of microarrays with kNN: comparison of dimensionality reduction methods

Sampath Deegalla; Henrik Boström

Dimensionality reduction can often improve the performance of the k-nearest neighbor classifier (kNN) for high-dimensional data sets, such as microarrays. The effect of the choice of dimensionality reduction method on the predictive performance of kNN for classifying microarray data is an open issue, and four common dimensionality reduction methods, Principal Component Analysis (PCA), Random Projection (RP), Partial Least Squares (PLS) and Information Gain (IG), are compared on eight microarray data sets. It is observed that all dimensionality reduction methods result in more accurate classifiers than those obtained from using the raw attributes. Furthermore, it is observed that both PCA and PLS reach their best accuracies with fewer components than the other two methods, and that RP needs far more components than the others to outperform kNN on the non-reduced dataset. None of the dimensionality reduction methods can be concluded to generally outperform the others, although PLS is shown to be superior on all four binary classification tasks, but the main conclusion from the study is that the choice of dimensionality reduction method can be of major importance when classifying microarrays using kNN.
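
A scikit-learn sketch of such a comparison is given below; the breast cancer data, the one-hot trick for obtaining PLS components, and the choice of five components are placeholder assumptions rather than the paper's setup.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.random_projection import GaussianRandomProjection

X, y = load_breast_cancer(return_X_y=True)   # stand-in for microarray data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
k = 5

def knn_accuracy(Z_tr, Z_te):
    return KNeighborsClassifier(n_neighbors=1).fit(Z_tr, y_tr).score(Z_te, y_te)

reducers = {
    "PCA": PCA(n_components=k).fit(X_tr),
    "RP": GaussianRandomProjection(n_components=k, random_state=0).fit(X_tr),
    "PLS": PLSRegression(n_components=k).fit(X_tr, np.eye(2)[y_tr]),  # one-hot targets
    "IG": SelectKBest(mutual_info_classif, k=k).fit(X_tr, y_tr),
}
for name, reducer in reducers.items():
    print(name, knn_accuracy(reducer.transform(X_tr), reducer.transform(X_te)))
print("raw", knn_accuracy(X_tr, X_te))
```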


Artificial Intelligence in Medicine in Europe | 2013

Predicting Adverse Drug Events by Analyzing Electronic Patient Records

Isak Karlsson; Jing Zhao; Lars Asker; Henrik Boström

Diagnosis codes for adverse drug events (ADEs) are sometimes missing from electronic patient records (EPRs). In the worst case this may affect patient safety, and it also lowers the number of reported ADEs, resulting in incorrect risk estimates for prescribed drugs. Large databases of EPRs are potentially valuable sources of information to support the identification of ADEs. This study investigates the use of machine learning for predicting one specific ADE based on information extracted from EPRs, including age, gender, diagnoses and drugs. Several predictive models are developed and evaluated using different learning algorithms and feature sets. The highest observed AUC is 0.87, obtained by the random forest algorithm. The resulting model can be used for screening EPRs that are not, but possibly should be, assigned a diagnosis code for the ADE under consideration. Preliminary results from using the model are presented.
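
As an illustration of this kind of model, with entirely synthetic data and hypothetical feature names rather than the study's EPR database, a random forest predicting a binary ADE label from demographic and categorical features could be evaluated with AUC as follows:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
records = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "gender": rng.choice(["F", "M"], n),
    "diagnosis": rng.choice(["I10", "E11", "F32"], n),      # hypothetical diagnosis codes
    "drug": rng.choice(["drug_a", "drug_b", "drug_c"], n),  # hypothetical drugs
})
# Synthetic ADE label: older patients on drug_a are labelled positive.
ade = ((records["age"] > 70) & (records["drug"] == "drug_a")).astype(int)

X = pd.get_dummies(records, columns=["gender", "diagnosis", "drug"])
X_tr, X_te, y_tr, y_te = train_test_split(X, ade, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, forest.predict_proba(X_te)[:, 1]))
```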


Journal of Logic Programming | 1999

Induction of logic programs by example-guided unfolding

Henrik Boström; Peter Idestam-Almquist

Resolution has been used as a specialisation operator in several approaches to top-down induction of logic programs. This operator allows the overly general hypothesis to be used as a declarative bias that restricts not only what predicate symbols can be used in produced hypotheses, but also how the predicates can be invoked. The two main strategies for top-down induction of logic programs, Covering and Divide-and-Conquer, are formalised using resolution as a specialisation operator, resulting in two strategies for performing example-guided unfolding. These strategies are compared both theoretically and experimentally. It is shown that the computational cost grows quadratically in the size of the example set for Covering, while it grows linearly for Divide-and-Conquer. This is also demonstrated by experiments, in which the amount of work performed by Covering is up to 30 times the amount of work performed by Divide-and-Conquer. The theoretical analysis shows that the hypothesis space is larger for Covering, and thus more compact hypotheses may be found by this technique than by Divide-and-Conquer. However, it is shown that for each non-recursive hypothesis that can be produced by Covering, there is an equivalent hypothesis (w.r.t. the background predicates) that can be produced by Divide-and-Conquer. A major drawback of Divide-and-Conquer, in contrast to Covering, is that it is not applicable to learning recursive definitions.
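
The quadratic versus linear growth mentioned above comes from how often the example set is traversed: Covering re-scans all remaining examples for every clause it learns, whereas Divide-and-Conquer partitions the examples so that each one is consumed once. The counting sketch below is schematic and hypothetical, not the paper's algorithm.

```python
def covering_scan_count(n_examples, covered_per_clause):
    """Count example scans in a Covering loop where each learned clause covers a
    fixed number of the remaining examples: n + (n-k) + (n-2k) + ... ~ n^2 / (2k)."""
    scans, remaining = 0, n_examples
    while remaining > 0:
        scans += remaining                               # one clause search scans the rest
        remaining -= min(covered_per_clause, remaining)  # drop the newly covered examples
    return scans

for n in (100, 200, 400):
    # Roughly quadruples each time n doubles, i.e. quadratic growth.
    print(n, covering_scan_count(n, covered_per_clause=5))
```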


Bioinformatics and Biomedicine | 2014

Detecting adverse drug events with multiple representations of clinical measurements

Jing Zhao; Aron Henriksson; Lars Asker; Henrik Boström

Adverse drug events (ADEs) are grossly under-reported in electronic health records (EHRs). This could be mitigated by methods that are able to detect ADEs in EHRs, thereby allowing missing ADE-specific diagnosis codes to be identified and added. A crucial aspect of constructing such systems is to find proper representations of the data in order to allow the predictive modeling to be as accurate as possible. One category of EHR data that can be used as indicators of ADEs is clinical measurements. However, using clinical measurements as features is not unproblematic, due to the high rate of missing values and the fact that they can be repeated a variable number of times in each patient health record. In this study, five basic representations of clinical measurements are proposed and evaluated to handle these two problems. An empirical investigation using random forests on 27 datasets from a real EHR database with different ADE targets is presented, demonstrating that the predictive performance, in terms of accuracy and area under the ROC curve, is higher when clinical measurements are represented crudely as whether they were taken, or how many times they were taken, by a patient. Furthermore, a sixth alternative, combining all five basic representations, significantly outperforms using any of the basic representations except one. A subsequent analysis of variable importance is also conducted with this fused feature set, showing that when clinical measurements have a high missing rate, the number of times they were taken by a patient is ranked as more informative than their actual values. The observations made with random forest are also confirmed empirically using other commonly employed classifiers. This study demonstrates that the way in which clinical measurements from EHRs are represented has a high impact on ADE detection, and that using multiple representations outperforms using a single basic representation.
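
To make the idea of multiple representations concrete, the sketch below builds only the three kinds of information the abstract mentions (whether a measurement was taken, how many times, and a value summary) from hypothetical long-format data; the exact five representations and column names in the paper may differ.

```python
import pandas as pd

# Hypothetical long-format clinical measurements: one row per recorded value.
measurements = pd.DataFrame({
    "patient": [1, 1, 1, 2, 3, 3],
    "test":    ["creatinine", "creatinine", "potassium", "creatinine", "potassium", "potassium"],
    "value":   [85.0, 90.0, 4.1, 110.0, 3.9, 4.4],
})

grouped = measurements.groupby(["patient", "test"])["value"]
features = pd.concat({
    "taken": grouped.size().unstack(fill_value=0).gt(0).astype(int),  # taken at all?
    "count": grouped.size().unstack(fill_value=0),                    # how many times?
    "mean":  grouped.mean().unstack(),                                # an actual-value summary
}, axis=1)
print(features)   # one row per patient, one column block per representation
```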


International Conference on Machine Learning and Applications | 2008

Calibrating Random Forests

Henrik Boström

When using the output of classifiers to calculate the expected utility of different alternatives in decision situations, the correctness of predicted class probabilities may be of crucial importance. However, even very accurate classifiers may output class probabilities of rather poor quality. One way of overcoming this problem is by means of calibration, i.e., mapping the original class probabilities to more accurate ones. Previous studies have however indicated that random forests are difficult to calibrate by standard calibration methods. In this work, a novel calibration method is introduced, which is based on a recent finding that probabilities predicted by forests of classification trees have a lower squared error compared to those predicted by forests of probability estimation trees (PETs). The novel calibration method is compared to the two standard methods, Platt scaling and isotonic regression, on 34 datasets from the UCI repository. The experiment shows that random forests of PETs calibrated by the novel method significantly outperform uncalibrated random forests of both PETs and classification trees, as well as random forests calibrated with the two standard methods, with respect to the squared error of predicted class probabilities.
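
The two standard baselines named above, Platt scaling and isotonic regression, are available in scikit-learn; the sketch below shows only those baselines on placeholder data, not the paper's novel calibration method.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
for name, method in [("uncalibrated", None), ("Platt", "sigmoid"), ("isotonic", "isotonic")]:
    model = forest if method is None else CalibratedClassifierCV(forest, method=method, cv=5)
    model.fit(X_tr, y_tr)
    probs = model.predict_proba(X_te)[:, 1]
    print(name, "Brier score:", round(brier_score_loss(y_te, probs), 4))
```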


Molecular Diversity | 2006

Discrimination between modes of toxic action of phenols using rule based methods.

Ulf Norinder; Per Lidén; Henrik Boström

Rule-based ensemble modelling has been used to develop a model with high accuracy and predictive capabilities for distinguishing between four different modes of toxic action for a set of 220 phenols. The model not only predicts the majority class (polar narcotics) well but also the other three classes (weak acid respiratory uncouplers, pro-electrophiles and soft electrophiles) of toxic action, despite the severely skewed distribution among the four investigated classes. Furthermore, the investigation also highlights the merits of using ensemble (or consensus) modelling, as an alternative to the more traditional development of a single model, in order to promote the robustness and predictive accuracy of the derived model.
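
The abstract does not specify the rule induction algorithm; as a loose, generic stand-in, an ensemble of decision trees (whose branches can be read as rules) voting over a skewed four-class problem might look like the sketch below, with entirely synthetic data in place of the 220 phenols.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical skewed four-class data standing in for the phenol descriptors.
X, y = make_classification(n_samples=220, n_classes=4, n_informative=8,
                           weights=[0.6, 0.2, 0.1, 0.1], random_state=0)

# Consensus of 50 shallow tree models via bagging.
ensemble = BaggingClassifier(DecisionTreeClassifier(max_depth=5),
                             n_estimators=50, random_state=0)
print("mean CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean().round(3))
```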

Collaboration


Dive into Henrik Boström's collaborations.

Top Co-Authors


Ulf Johansson

Information Technology University
