Charmgil Hong | Researchain

Publication

Featured research published by Charmgil Hong.

Conference on Information and Knowledge Management | 2014

A Mixtures-of-Trees Framework for Multi-Label Classification

Charmgil Hong; Iyad Batal; Milos Hauskrecht

We propose a new probabilistic approach for multi-label classification that aims to represent the class posterior distribution P(Y|X). Our approach uses a mixture of tree-structured Bayesian networks, which can leverage the computational advantages of conditional tree-structured models and the abilities of mixtures to compensate for tree-structured restrictions. We develop algorithms for learning the model from data and for performing multi-label predictions using the learned model. Experiments on multiple datasets demonstrate that our approach outperforms several state-of-the-art multi-label classification methods.
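
In symbols, a mixture of this kind can be sketched as follows. The notation is assumed here for illustration and does not come from the abstract: K mixture components, mixing weights \lambda_k, tree structures T_k over the d class variables, and \pi_k(i) for the parent of Y_i in tree T_k (the root has no parent, so its factor conditions on X only).

```latex
P(Y \mid X) \;=\; \sum_{k=1}^{K} \lambda_k \, P(Y \mid X, T_k),
\qquad
P(Y \mid X, T_k) \;=\; \prod_{i=1}^{d} P\bigl(Y_i \mid X,\, Y_{\pi_k(i)}\bigr),
\qquad
\lambda_k \ge 0,\quad \sum_{k=1}^{K} \lambda_k = 1 .
```

Each component keeps the cheap tree factorization, while the sum over components can express label dependencies that no single tree can.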

SIAM International Conference on Data Mining | 2015

A Generalized Mixture Framework for Multi-label Classification

Charmgil Hong; Iyad Batal; Milos Hauskrecht

We develop a novel probabilistic ensemble framework for multi-label classification that is based on the mixtures-of-experts architecture. In this framework, we combine multi-label classification models in the classifier chains family that decompose the class posterior distribution P(Y1, …, Yd |X) using a product of posterior distributions over components of the output space. Our approach captures different input-output and output-output relations that tend to change across data. As a result, we can recover a rich set of dependency relations among inputs and outputs that a single multi-label classification model cannot capture due to its modeling simplifications. We develop and present algorithms for learning the mixtures-of-experts models from data and for performing multi-label predictions on unseen data instances. Experiments on multiple benchmark datasets demonstrate that our approach achieves highly competitive results and outperforms the existing state-of-the-art multi-label classification methods.
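
The product decomposition mentioned above is, in its fullest form, the chain rule over the class variables. A mixtures-of-experts combination can then be sketched as a weighted sum of such models. The gating weights g_k(X) and the expert labels M_k are notation assumed here from the generic mixtures-of-experts architecture, not quoted from the paper, and individual members of the classifier chains family may condition each factor on only a subset of the preceding labels.

```latex
P(Y_1, \dots, Y_d \mid X) \;=\; \prod_{i=1}^{d} P\bigl(Y_i \mid X,\, Y_1, \dots, Y_{i-1}\bigr),
\qquad
P(Y \mid X) \;=\; \sum_{k=1}^{K} g_k(X)\, P(Y \mid X, M_k),
\qquad
\sum_{k=1}^{K} g_k(X) = 1 .
```

Because the weights depend on X, different experts can dominate in different regions of the input space, which matches the abstract's point that dependency relations change across data.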

Conference on Information and Knowledge Management | 2013

An efficient probabilistic framework for multi-dimensional classification

Iyad Batal; Charmgil Hong; Milos Hauskrecht

The objective of multi-dimensional classification is to learn a function that accurately maps each data instance to a vector of class labels. Multi-dimensional classification appears in a wide range of applications including text categorization, gene functionality classification, semantic image labeling, etc. Usually, in such problems, the class variables are not independent, but rather exhibit conditional dependence relations among them. Hence, the key to the success of multi-dimensional classification is to effectively model such dependencies and use them to facilitate the learning. In this paper, we propose a new probabilistic approach that represents class conditional dependencies in an effective yet computationally efficient way. Our approach uses a special tree-structured Bayesian network model to represent the conditional joint distribution of the class variables given the feature variables. We develop and present efficient algorithms for learning the model from data and for performing exact probabilistic inferences on the model. Extensive experiments on multiple datasets demonstrate that our approach achieves highly competitive results when it is compared to existing state-of-the-art methods.
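
To illustrate why a tree structure makes exact inference cheap, here is a minimal Python sketch of max-product inference over binary class variables arranged in a tree. It is not the authors' implementation. It assumes that, for one fixed input x, every conditional P(Y_i | x, Y_parent) has already been evaluated into a small table; the names `joint_prob`, `map_assignment`, `parent`, and `cpt` are hypothetical.

```python
from itertools import product

def joint_prob(y, parent, cpt):
    """Probability of label vector y under the tree factorization.

    parent[i] is the parent index of label i (None for the root).
    cpt[i][pa][v] is P(Y_i = v | x, Y_parent = pa); the root uses row 0 only.
    """
    p = 1.0
    for i, v in enumerate(y):
        pa = 0 if parent[i] is None else y[parent[i]]
        p *= cpt[i][pa][v]
    return p

def map_assignment(parent, cpt):
    """Most probable label vector via max-product message passing on the tree."""
    d = len(parent)
    children = [[] for _ in range(d)]
    for i, pa in enumerate(parent):
        if pa is not None:
            children[pa].append(i)
    root = parent.index(None)

    # Breadth-first order: every parent appears before its children.
    order = [root]
    for node in order:
        order.extend(children[node])

    # best[i][pa]: best score of the subtree rooted at i, given the parent's value.
    # arg[i][pa]: value of Y_i that attains that score.
    best = [[0.0, 0.0] for _ in range(d)]
    arg = [[0, 0] for _ in range(d)]
    for i in reversed(order):  # leaves first
        pa_values = (0,) if parent[i] is None else (0, 1)
        for pa in pa_values:
            scores = []
            for v in (0, 1):
                s = cpt[i][pa][v]
                for c in children[i]:
                    s *= best[c][v]
                scores.append(s)
            arg[i][pa] = 0 if scores[0] >= scores[1] else 1
            best[i][pa] = max(scores)

    # Top-down pass: read off the maximizing values.
    y = [0] * d
    for i in order:
        pa = 0 if parent[i] is None else y[parent[i]]
        y[i] = arg[i][pa]
    return tuple(y), best[root][0]

# Toy example with made-up numbers: Y0 is the root, Y1 and Y2 are its children.
parent = [None, 0, 0]
cpt = [
    [[0.6, 0.4]],              # P(Y0 | x)
    [[0.9, 0.1], [0.2, 0.8]],  # P(Y1 | x, Y0)
    [[0.7, 0.3], [0.4, 0.6]],  # P(Y2 | x, Y0)
]
y_map, p_map = map_assignment(parent, cpt)
```

The two passes touch each node a constant number of times, so the cost grows linearly with the number of class variables, whereas brute-force enumeration with `itertools.product` grows exponentially.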

Journal of Biomedical Informatics | 2016

Outlier-based detection of unusual patient-management actions: An ICU study

Milos Hauskrecht; Iyad Batal; Charmgil Hong; Quang Nguyen; Gregory F. Cooper; Shyam Visweswaran; Gilles Clermont

Medical errors remain a significant problem in healthcare. This paper investigates a data-driven, outlier-based monitoring and alerting framework that uses data in electronic medical record (EMR) repositories of past patient cases to identify unusual clinical actions in the EMR of a current patient. Our conjecture is that these unusual clinical actions correspond to medical errors often enough to justify their detection and alerting. Our approach works by using EMR repositories to learn statistical models that relate patient states to patient-management actions. We evaluated this approach on the EMR data for 24,658 intensive care unit (ICU) patient cases. A total of 16,500 cases were used to train statistical models for ordering medications and laboratory tests given the patient state summarizing the patient's clinical history. The models were applied to a separate test set of 8,158 ICU patient cases and used to generate alerts. A subset of 240 alerts generated by the models was evaluated and assessed by eighteen ICU clinicians. The overall true positive rates for the alerts (TPARs) ranged from 0.44 to 0.71. The TPAR for medication order alerts specifically ranged from 0.31 to 0.61 and for laboratory order alerts from 0.44 to 0.75. These results support outlier-based alerting as a promising new approach to data-driven clinical alerting that is generated automatically based on past EMR data.

SIAM International Conference on Data Mining | 2014

An Optimization-based Framework to Learn Conditional Random Fields for Multi-label Classification

Mahdi Pakdaman Naeini; Iyad Batal; Zitao Liu; Charmgil Hong; Milos Hauskrecht

This paper studies the multi-label classification problem, in which data instances are associated with multiple, possibly high-dimensional, label vectors. This problem is especially challenging when labels are dependent and one cannot decompose the problem into a set of independent classification problems. To address the problem and properly represent label dependencies, we propose and study a pairwise conditional random field (CRF) model. We develop a new approach for learning the structure and parameters of the CRF from data. The approach maximizes the pseudo-likelihood of observed labels and relies on fast proximal gradient descent for learning the structure and limited-memory BFGS for learning the parameters of the model. Empirical results on several datasets show that our approach outperforms several multi-label classification baselines, including recently published state-of-the-art methods.
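
For orientation, a pairwise CRF over labels and its pseudo-likelihood objective can be written in the generic form below. This is the textbook formulation, offered as a sketch; the potentials \phi_i and \psi_{ij}, the edge set E, and the training-set indexing are notation assumed here, and the exact parameterization used in the paper may differ.

```latex
P(y \mid x) \;=\; \frac{1}{Z(x)} \exp\Bigl( \sum_{i=1}^{d} \phi_i(y_i, x) \;+\; \sum_{(i,j) \in E} \psi_{ij}(y_i, y_j, x) \Bigr),
\qquad
\mathrm{PL} \;=\; \sum_{n=1}^{N} \sum_{i=1}^{d} \log P\bigl(y_i^{(n)} \mid y_{-i}^{(n)},\, x^{(n)}\bigr).
```

Each conditional in the pseudo-likelihood normalizes over the values of a single label only, so the objective avoids computing the global partition function Z(x), which sums over all joint label assignments.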

National Conference on Artificial Intelligence | 2015