Kun Deng | Researchain

Publication

Featured research published by Kun Deng.

international conference on data mining | 2010

Active Learning from Multiple Noisy Labelers with Varied Costs

Yaling Zheng; Stephen D. Scott; Kun Deng

In active learning, where a learning algorithm has to purchase the labels of its training examples, it is often assumed that there is only one labeler available to label examples, and that this labeler is noise-free. In reality, it is possible that there are multiple labelers available (such as human labelers in the online annotation tool Amazon Mechanical Turk) and that each such labeler has a different cost and accuracy. We address the active learning problem with multiple labelers where each labeler has a different (known) cost and a different (unknown) accuracy. Our approach uses the idea of "adjusted cost", which allows labelers with different costs and accuracies to be directly compared. This allows our algorithm to find low-cost combinations of labelers that result in high-accuracy labelings of instances. Our algorithm further reduces costs by pruning underperforming labelers from the set under consideration, and by halting the process of estimating the accuracy of the labelers as early as it can. We found that our algorithm often outperforms, and is always competitive with, other algorithms in the literature.
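
To make the idea of a "low-cost combination of labelers" concrete, the following is a minimal sketch and not the paper's adjusted-cost algorithm. It assumes binary labels, independent labelers, accuracies that are already known (the abstract treats them as unknown and estimates them), and majority voting with coin-flip tie-breaking. The labeler names, costs, and accuracies are hypothetical.

```python
from itertools import combinations, product


def majority_vote_accuracy(accuracies):
    """Probability that a majority vote of independent binary labelers is correct.

    Ties (possible with an even number of labelers) are assumed to be broken
    by a fair coin flip, so they count as half correct.
    """
    n = len(accuracies)
    total = 0.0
    # Enumerate every pattern of correct (True) / wrong (False) votes.
    for pattern in product([True, False], repeat=n):
        prob = 1.0
        for correct, acc in zip(pattern, accuracies):
            prob *= acc if correct else 1.0 - acc
        n_correct = sum(pattern)
        if 2 * n_correct > n:
            total += prob
        elif 2 * n_correct == n:
            total += 0.5 * prob
    return total


def cheapest_combination(labelers, target_accuracy):
    """Return (cost, names, accuracy) of the cheapest subset meeting the target, or None."""
    best = None
    for size in range(1, len(labelers) + 1):
        for subset in combinations(labelers, size):
            acc = majority_vote_accuracy([lab["accuracy"] for lab in subset])
            cost = sum(lab["cost"] for lab in subset)
            if acc >= target_accuracy and (best is None or cost < best[0]):
                best = (cost, tuple(lab["name"] for lab in subset), acc)
    return best


# Hypothetical labelers: all values below are made up for illustration.
LABELERS = [
    {"name": "expert", "cost": 10.0, "accuracy": 0.95},
    {"name": "worker_a", "cost": 2.0, "accuracy": 0.80},
    {"name": "worker_b", "cost": 2.0, "accuracy": 0.80},
    {"name": "worker_c", "cost": 2.0, "accuracy": 0.80},
]

if __name__ == "__main__":
    print(cheapest_combination(LABELERS, target_accuracy=0.89))
```

With these made-up numbers, three inexpensive labelers voting together reach the target accuracy at a lower total cost than the single expert. The exhaustive search is exponential in the number of labelers and is only meant for small illustrative sets.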

Machine Learning | 2008

On reoptimizing multi-class classifiers

Chris Bourke; Kun Deng; Stephen D. Scott; Robert E. Schapire; N. V. Vinodchandran

Significant changes in the instance distribution or associated cost function of a learning problem require one to reoptimize a previously-learned classifier to work under new conditions. We study the problem of reoptimizing a multi-class classifier based on its ROC hypersurface and a matrix describing the costs of each type of prediction error. For a binary classifier, it is straightforward to find an optimal operating point based on its ROC curve and the relative cost of true positive to false positive error. However, the corresponding multi-class problem (finding an optimal operating point based on a ROC hypersurface and cost matrix) is more challenging and until now, it was unknown whether an efficient algorithm existed that found an optimal solution. We answer this question by first proving that the decision version of this problem is NP-complete. As a complementary positive result, we give an algorithm that finds an optimal solution in polynomial time if the number of classes n is a constant. We also present several heuristics for this problem, including linear, nonlinear, and quadratic programming formulations, genetic algorithms, and a customized algorithm. Empirical results suggest that under both uniform and non-uniform cost models, simple greedy methods outperform more sophisticated methods.

Proceedings of SPIE | 2011

Machine learning-based automatic detection of pulmonary trunk

Hong Wu; Kun Deng; Jianming Liang

Pulmonary embolism (PE) is a common cardiovascular emergency with about 600,000 cases occurring annually and causing approximately 200,000 deaths in the US. CT pulmonary angiography (CTPA) has become the reference standard for PE diagnosis, but the interpretation of these large image datasets is made complex and time-consuming by the intricate branching structure of the pulmonary vessels, a myriad of artifacts that may obscure or mimic PEs, and suboptimal bolus of contrast and inhomogeneities with the pulmonary arterial blood pool. To meet this challenge, several approaches for computer-aided diagnosis of PE in CTPA have been proposed. However, none of these approaches is capable of detecting central PEs, distinguishing the pulmonary artery from the vein to effectively remove any false positives from the veins, and dynamically adapting to suboptimal contrast conditions associated with the CTPA scans. Overcoming these shortcomings requires highly efficient and accurate identification of the pulmonary trunk. For this purpose, we present a machine learning based approach for automatically detecting the pulmonary trunk. Our idea is to train a cascaded AdaBoost classifier with a large number of Haar features extracted from CTPA image samples, so that the pulmonary trunk can be automatically identified by sequentially scanning the CTPA images and classifying each encountered sub-image with the trained classifier. Our approach outperforms an existing anatomy-based approach, requiring no explicit representation of anatomical knowledge and achieving nearly 100% accuracy when tested on a large number of cases.

Machine Learning | 2013

New algorithms for budgeted learning

Kun Deng; Yaling Zheng; Chris Bourke; Stephen D. Scott; Julie Masciale

We explore the problem of budgeted machine learning, in which the learning algorithm has free access to the training examples’ class labels but has to pay for each attribute that is specified. This learning model is appropriate in many areas, including medical applications. We present new algorithms for choosing which attributes to purchase for which examples, based on algorithms for the multi-armed bandit problem. We also evaluate a group of algorithms based on the idea of incorporating second-order statistics into decision making. Most of our algorithms are competitive with the current state of the art and performed better when the budget was highly limited (in particular, our new algorithm AbsoluteBR2). Finally, we present new heuristics for selecting an instance to purchase after the attribute is selected, instead of selecting an instance uniformly at random, which is typically done. While experimental results showed some performance improvements when using the new instance selectors, there was no consistent winner among these methods.

ieee symposium on adaptive dynamic programming and reinforcement learning | 2011

Active learning for personalizing treatment

Kun Deng; Joelle Pineau; Susan A. Murphy

The personalization of treatment via genetic biomarkers and other risk categories has drawn increasing interest among clinical researchers and scientists. A major challenge here is to construct individualized treatment rules (ITRs), which recommend the best treatment for each of the different categories of individuals. In general, ITRs can be constructed using data from clinical trials; however, these are generally very costly to run. In order to reduce the cost of learning an ITR, we explore active learning techniques designed to carefully decide whom to recruit, and which treatment to assign, throughout the online conduct of the clinical trial. As an initial investigation, we focus on simple ITRs that utilize a small number of subpopulation categories to personalize treatment. To minimize the maximal uncertainty regarding the treatment effects for each subpopulation, we propose the use of a minimax bandit model and provide an active learning policy for solving it. We evaluate our active learning policy using simulated data and data modeled after a clinical trial involving treatments for depressed individuals. We contrast this policy with other plausible active learning policies. The techniques presented in the paper may be generalized to tackle problems of efficient exploration in other domains.

international conference on machine learning and applications | 2014

Budgeted Learning for Developing Personalized Treatment

Kun Deng; Russell Greiner; Susan A. Murphy

There is increased interest in using patient-specific information to personalize treatment. Personalized treatment decision rules can be learned using data from standard clinical trials, but such trials are very costly to run. This paper explores the use of budgeted learning techniques to design more efficient clinical trials, by effectively determining which type of patients to recruit, at each time, throughout the duration of the trial. We propose a Bayesian bandit model and discuss the computational challenges and issues pertaining to this approach. Using both simulated data and data modeled after a clinical trial for treating depressed individuals, we compare our budgeted learning algorithm, which approximately minimizes the Bayes risk, with other plausible algorithms. We show that our budgeted learning algorithm demonstrated excellent performance across a wide variety of situations.

international conference on data mining | 2007

Bandit-Based Algorithms for Budgeted Learning

Kun Deng; Chris Bourke; Stephen D. Scott; Julie Sunderman; Yaling Zheng

Archive | 2012

Systems, Methods, and Media for Detecting an Anatomical Object in a Medical Device Image

Hong Wu; Kun Deng; Jianming Liang
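
The detection scheme described in the pulmonary trunk abstract above (a cascaded classifier over Haar features, applied to each sub-image in turn) can be illustrated with a toy sketch. This is not the authors' implementation: it works on a small 2D grid rather than CTPA data, each cascade stage is a single hand-thresholded feature instead of a trained AdaBoost ensemble, and every name, threshold, and pixel value below is hypothetical.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, in constant time via the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def window_sum(ii, top, left, size):
    """Cheap first-stage feature: total intensity of the square window."""
    return box_sum(ii, top, left, size, size)


def haar_left_right(ii, top, left, size):
    """Two-rectangle Haar-like feature: left half minus right half of the window."""
    half = size // 2
    return box_sum(ii, top, left, size, half) - box_sum(ii, top, left + half, size, half)


def passes_cascade(ii, top, left, size, cascade):
    """A window is accepted only if every stage's feature reaches its threshold."""
    for feature, threshold in cascade:
        if feature(ii, top, left, size) < threshold:
            return False  # early rejection: later stages are never evaluated
    return True


def scan(img, size, cascade):
    """Slide a square window over the image and return the accepted (top, left) positions."""
    ii = integral_image(img)
    h, w = len(img), len(img[0])
    return [
        (top, left)
        for top in range(h - size + 1)
        for left in range(w - size + 1)
        if passes_cascade(ii, top, left, size, cascade)
    ]


# Hypothetical 8x8 image: a bright 4x2 block at rows 2-5, columns 3-4.
IMAGE = [[10 if 2 <= y <= 5 and 3 <= x <= 4 else 0 for x in range(8)] for y in range(8)]

# Hypothetical hand-picked stages, cheapest first; a trained cascade would learn these.
CASCADE = [(window_sum, 60), (haar_left_right, 70)]

if __name__ == "__main__":
    print(scan(IMAGE, 4, CASCADE))
```

The cheap first stage discards most windows before the more selective second stage runs, which is the reason for arranging classifiers in a cascade.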

uncertainty in artificial intelligence | 2011