Hamed Valizadegan | Researchain

Publication

Featured research published by Hamed Valizadegan.

ACM Transactions on Intelligent Systems and Technology | 2013

A temporal pattern mining approach for classifying electronic health record data

Iyad Batal; Hamed Valizadegan; Gregory F. Cooper; Milos Hauskrecht

We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the Minimal Predictive Temporal Patterns framework to generate a small set of predictive and nonspurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin-induced thrombocytopenia. The results demonstrate the benefit of our approach in efficiently learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems.
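As an illustration of what a temporal abstraction is, the sketch below converts a numeric lab series into state intervals (low, normal, high) by merging consecutive readings that share a state. It is a generic value abstraction, not the paper's Minimal Predictive Temporal Patterns framework; the function name, the thresholds, and the sample platelet counts are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (state, start_time, end_time)


def value_abstraction(times: List[int], values: List[float],
                      low: float, high: float) -> List[Interval]:
    """Map each reading to a state and merge consecutive equal states."""
    intervals: List[Interval] = []
    for t, v in zip(times, values):
        # "L" = below the low threshold, "H" = above the high one, else "N"
        state = "L" if v < low else "H" if v > high else "N"
        if intervals and intervals[-1][0] == state:
            # Same state as the previous reading: extend the open interval.
            prev_state, start, _ = intervals[-1]
            intervals[-1] = (prev_state, start, t)
        else:
            # State changed: start a new interval at this time point.
            intervals.append((state, t, t))
    return intervals


# Hypothetical daily platelet counts; thresholds are illustrative only.
platelets = value_abstraction(
    times=[0, 1, 2, 3, 4],
    values=[250, 240, 140, 120, 200],
    low=150, high=400,
)
print(platelets)
```

Intervals of this kind are the raw material from which temporal patterns (for example, "platelets low after heparin on") can then be mined.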

European Conference on Machine Learning | 2008

Semi-supervised boosting for multi-class classification

Hamed Valizadegan; Rong Jin; Anil K. Jain

Most semi-supervised learning algorithms have been designed for binary classification, and are extended to multi-class classification by approaches such as one-against-the-rest. The main shortcoming of these approaches is that they are unable to exploit the fact that each example is only assigned to one class. Additional problems with extending semi-supervised binary classifiers to multi-class problems include imbalanced classification and different output scales of different binary classifiers. We propose a semi-supervised boosting framework, termed Multi-Class Semi-Supervised Boosting (MCSSB), that directly solves the semi-supervised multi-class learning problem. Compared to the existing semi-supervised boosting methods, the proposed framework is advantageous in that it exploits both classification confidence and similarities among examples when deciding the pseudo-labels for unlabeled examples. Empirical study with a number of UCI datasets shows that the proposed MCSSB algorithm performs better than the state-of-the-art boosting algorithms for semi-supervised learning.

Bioinformatics and Biomedicine | 2011

A Pattern Mining Approach for Classifying Multivariate Temporal Data

Iyad Batal; Hamed Valizadegan; Gregory F. Cooper; Milos Hauskrecht

We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the minimal predictive temporal patterns framework to generate a small set of predictive and non-spurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin-induced thrombocytopenia. The results demonstrate the benefit of our approach in learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems.

Journal of Biomedical Informatics | 2013

Learning classification models from multiple experts

Hamed Valizadegan; Quang Nguyen; Milos Hauskrecht

Building classification models from clinical data using machine learning methods often relies on labeling of patient examples by human experts. The standard machine learning framework assumes the labels are assigned by a homogeneous process. However, in reality the labels may come from multiple experts and it may be difficult to obtain a set of class labels everybody agrees on; it is not uncommon that different experts have different subjective opinions on how a specific patient example should be classified. In this work we propose and study a new multi-expert learning framework that assumes the class labels are provided by multiple experts and that these experts may differ in their class label assessments. The framework explicitly models different sources of disagreements and lets us naturally combine labels from different human experts to obtain: (1) a consensus classification model representing the model the group of experts converge to, and (2) individual expert models. We test the proposed framework by building a model for the problem of detecting heparin-induced thrombocytopenia (HIT), where examples are labeled by three experts. We show that our framework is superior to multiple baselines (including the standard machine learning framework in which expert differences are ignored) and that our framework leads to both improved consensus and individual expert models.

Journal of the American Medical Informatics Association | 2014

Learning classification models with soft-label information

Quang Nguyen; Hamed Valizadegan; Milos Hauskrecht

OBJECTIVE: Learning of classification models in medicine often relies on data labeled by a human expert. Since labeling of clinical data may be time-consuming, finding ways of alleviating the labeling costs is critical for our ability to automatically learn such models. In this paper we propose a new machine learning approach that is able to learn improved binary classification models more efficiently by refining the binary class information in the training phase with soft labels that reflect how strongly the human expert feels about the original class labels.

MATERIALS AND METHODS: Two types of methods that can learn improved binary classification models from soft labels are proposed. The first relies on probabilistic/numeric labels, the other on ordinal categorical labels. We study and demonstrate the benefits of these methods for learning an alerting model for heparin induced thrombocytopenia. The experiments are conducted on the data of 377 patient instances labeled by three different human experts. The methods are compared using the area under the receiver operating characteristic curve (AUC) score.

RESULTS: Our AUC results show that the new approach is capable of learning classification models more efficiently compared to traditional learning methods. The improvement in AUC is most remarkable when the number of examples we learn from is small.

CONCLUSIONS: A new classification learning framework that lets us learn from auxiliary soft-label information provided by a human expert is a promising new direction for learning classification models from expert labels, reducing the time and cost needed to label data.

International Conference on Data Mining | 2011

Conditional Anomaly Detection with Soft Harmonic Functions

Michal Valko; Branislav Kveton; Hamed Valizadegan; Gregory F. Cooper; Milos Hauskrecht

In this paper, we consider the problem of conditional anomaly detection that aims to identify data instances with an unusual response or a class label. We develop a new non-parametric approach for conditional anomaly detection based on the soft harmonic solution, with which we estimate the confidence of the label to detect anomalous mislabeling. We further regularize the solution to avoid the detection of isolated examples and examples on the boundary of the distribution support. We demonstrate the efficacy of the proposed method on several synthetic and UCI ML datasets in detecting unusual labels when compared to several baseline approaches. We also evaluate the performance of our method on a real-world electronic health record dataset where we seek to identify unusual patient-management decisions.

Knowledge Discovery and Data Mining | 2011

Learning to trade off between exploration and exploitation in multiclass bandit prediction

Hamed Valizadegan; Rong Jin; Shijun Wang

We study multi-class bandit prediction, an online learning problem where the learner receives only partial feedback in each trial, indicating whether the predicted class label is correct. The exploration vs. exploitation tradeoff strategy is a well-known technique for online learning with incomplete feedback (i.e., the bandit setup). Banditron [8], a multi-class online learning algorithm for the bandit setting, maximizes the run-time gain by balancing exploration and exploitation with a fixed tradeoff parameter. The performance of Banditron can be quite sensitive to the choice of the tradeoff parameter, and therefore effective algorithms to automatically tune this parameter are desirable. In this paper, we propose three learning strategies to automatically adjust the tradeoff parameter for Banditron. Our extensive empirical study with multiple real-world data sets verifies the efficacy of the proposed approach in learning the exploration vs. exploitation tradeoff parameter.
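To make the role of the fixed tradeoff parameter concrete, here is a minimal sketch of one Banditron round as the algorithm is usually described: with probability mass gamma spread uniformly over all classes the learner explores, otherwise it plays its greedy prediction, and it updates using only the correct/incorrect bit. The function name and data layout are assumptions for illustration, and the sketch does not implement the three tuning strategies proposed in the paper.

```python
import random
from typing import List


def banditron_step(W: List[List[float]], x: List[float], true_label: int,
                   gamma: float, rng: random.Random) -> int:
    """Run one round; W (k classes x d features) is updated in place.

    Returns the label that was actually played. true_label is used only
    to compute the one-bit bandit feedback.
    """
    k = len(W)
    scores = [sum(w_j * x_j for w_j, x_j in zip(row, x)) for row in W]
    y_hat = max(range(k), key=scores.__getitem__)  # greedy prediction

    # Exploration distribution: (1 - gamma) on y_hat, gamma spread uniformly.
    probs = [(1.0 - gamma) * (1.0 if r == y_hat else 0.0) + gamma / k
             for r in range(k)]
    y_tilde = rng.choices(range(k), weights=probs)[0]

    correct = (y_tilde == true_label)  # the only feedback the learner sees

    for r in range(k):
        # Update coefficient: importance-weighted reward minus greedy penalty.
        coef = -1.0 if r == y_hat else 0.0
        if correct and r == y_tilde:
            coef += 1.0 / probs[r]
        if coef != 0.0:
            for j, x_j in enumerate(x):
                W[r][j] += coef * x_j
    return y_tilde


# Tiny usage example with two classes and pure exploitation (gamma = 0).
W = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(0)
first = banditron_step(W, [1.0, 0.0], true_label=1, gamma=0.0, rng=rng)
second = banditron_step(W, [1.0, 0.0], true_label=1, gamma=0.0, rng=rng)
```

A larger gamma yields more exploratory plays (and more informative feedback) at the cost of more deliberately non-greedy predictions, which is the sensitivity the abstract refers to.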

International Conference on Data Mining | 2011

Learning Classification with Auxiliary Probabilistic Information

Quang Nguyen; Hamed Valizadegan; Milos Hauskrecht

Finding ways of incorporating auxiliary information or auxiliary data into the learning process has been the topic of active data mining and machine learning research in recent years. In this work we study and develop a new framework for the classification learning problem in which, in addition to class labels, the learner is provided with auxiliary (probabilistic) information that reflects how strongly the expert feels about the class label. This approach can be extremely useful for many practical classification tasks that rely on subjective label assessment and where the cost of acquiring additional auxiliary information is negligible when compared to the cost of the example analysis and labelling. We develop classification algorithms capable of using the auxiliary information to make the learning process more efficient in terms of the sample complexity. We demonstrate the benefit of the approach on a number of synthetic and real-world data sets by comparing it to learning with class labels only.
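One simple way to use probabilistic label information is to train a logistic regression against the expert's probabilities instead of hard 0/1 labels, minimizing cross-entropy with soft targets. The sketch below shows that generic technique only; it is not the algorithm developed in the paper, and the function names, learning rate, and toy data are assumptions.

```python
import math
from typing import List, Tuple


def predict_proba(w: List[float], b: float, x: List[float]) -> float:
    """Logistic model: probability of the positive class for one example."""
    z = b + sum(w_j * x_j for w_j, x_j in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_soft_logistic(X: List[List[float]], p: List[float],
                        lr: float = 0.5,
                        epochs: int = 2000) -> Tuple[List[float], float]:
    """Batch gradient descent on cross-entropy with soft targets p in [0, 1]."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x_i, p_i in zip(X, p):
            # Gradient of the cross-entropy w.r.t. the logit is (q - p).
            err = predict_proba(w, b, x_i) - p_i
            for j in range(d):
                grad_w[j] += err * x_i[j]
            grad_b += err
        w = [w_j - lr * g / n for w_j, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data: one feature, with expert confidence rising as the feature grows.
X = [[0.0], [1.0], [2.0], [3.0]]
soft_labels = [0.1, 0.3, 0.7, 0.9]  # hypothetical expert probabilities
w, b = train_soft_logistic(X, soft_labels)
```

With hard labels the four examples above would collapse to 0, 0, 1, 1; the soft targets additionally tell the learner that the two middle examples are the uncertain ones.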

Computational Intelligence and Data Mining | 2007

A Prototype-driven Framework for Change Detection in Data Stream Classification

Hamed Valizadegan; Pang Ning Tan

This paper presents a prototype-driven framework for classifying evolving data streams. Our framework uses cluster prototypes to summarize the data and to determine whether the current model is outdated. This strategy of rebuilding the model only when significant changes are detected helps to reduce the computational overhead and the number of labeled examples needed. To improve its accuracy, we also propose a selective sampling strategy to acquire more labeled examples from regions where the model's predictions are unreliable. Our experimental results demonstrate the effectiveness of the proposed framework, both in terms of reducing the number of model updates and maintaining high accuracy.

International Conference on Tools with Artificial Intelligence | 2007

A Probabilistic Substructure-Based Approach for Graph Classification

H. D. K. Moonesinghe; Hamed Valizadegan; Samah Jamal Fodeh; Pang Ning Tan

Graph classification is an important data mining task that has attracted considerable attention recently. This paper presents a probabilistic substructure-based approach for classifying graph-based data. More specifically, we use a frequent subgraph mining algorithm to extract substructure based descriptors and apply the maximum entropy principle to build a classification model from the frequent subgraphs. We perform extensive experiments to compare the performance of the proposed approach against existing feature vector methods using AdaBoost and support vector machine.
