Is this you? Create Your Porfile

Dianhong Wang

China University of Geosciences

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Dianhong Wang is active.

Explore More

Publication

Featured researches published by Dianhong Wang.

fuzzy systems and knowledge discovery | 2007

Survey of Improving K-Nearest-Neighbor for Classification

Liangxiao Jiang; Zhihua Cai; Dianhong Wang; Siwei Jiang

KNN (k-nearest-neighbor) has been widely used as an effective classification model. In this paper, we summarize three main shortcomings confronting KNN and single out three main methods for overcoming its three shortcomings. Keeping to these methods, we try our best to survey some improved algorithms and experimentally tested their effectiveness. Besides, we discuss some directions for future study on KNN.

Knowledge Based Systems | 2012

Improving Tree augmented Naive Bayes for class probability estimation

Liangxiao Jiang; Zhihua Cai; Dianhong Wang; Harry Zhang

Numerous algorithms have been proposed to improve Naive Bayes (NB) by weakening its conditional attribute independence assumption, among which Tree Augmented Naive Bayes (TAN) has demonstrated remarkable classification performance in terms of classification accuracy or error rate, while maintaining efficiency and simplicity. In many real-world applications, however, classification accuracy or error rate is not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihood (class probability) of buying some products. Thus, accurate class probability estimation is often required to make optimal decisions. In this paper, we investigate the class probability estimation performance of TAN in terms of conditional log likelihood (CLL) and present a new algorithm to improve its class probability estimation performance by the spanning TAN classifiers. We call our improved algorithm Averaged Tree Augmented Naive Bayes (ATAN). The experimental results on a large number of UCI datasets published on the main web site of Weka platform show that ATAN significantly outperforms TAN and all the other algorithms used to compare in terms of CLL.

advanced data mining and applications | 2007

Survey of Improving Naive Bayes for Classification

Liangxiao Jiang; Dianhong Wang; Zhihua Cai; Xuesong Yan

The attribute conditional independence assumption of naive Bayes essentially ignores attribute dependencies and is often violated. On the other hand, although a Bayesian network can represent arbitrary attribute dependencies, learning an optimal Bayesian network classifier from data is intractable. Thus, learning improved naive Bayes has attracted much attention from researchers and presented many effective and efficient improved algorithms. In this paper, we review some of these improved algorithms and single out four main improved approaches: 1) Feature selection; 2) Structure extension; 3) Local learning; 4) Data expansion. We experimentally tested these approaches using the whole 36 UCI data sets selected by Weka, and compared them to naive Bayes. The experimental results show that all these approaches are effective. In the end, we discuss some main directions for future research on Bayesian network classifiers.

Journal of Experimental and Theoretical Artificial Intelligence | 2012

Weighted average of one-dependence estimators†

Liangxiao Jiang; Harry Zhang; Zhihua Cai; Dianhong Wang

Naive Bayes (NB) is a probability-based classification model which is based on the attribute independence assumption. However, in many real-world data mining applications, its attribute independence assumption is often violated. Responding to this fact, researchers have made a substantial amount of effort to improve the classification accuracy of NB by weakening its attribute independence assumption. For a recent example, averaged one-dependence estimators (AODE) is proposed, which weakens its attribute independence assumption by averaging all models from a restricted class of one-dependence classifiers. However, all one-dependence classifiers in AODE have same weights and are treated equally. According to our observation, different one-dependence classifiers should have different weights. Therefore, in this article, we proposed an improved model called weighted average of one-dependence estimators (WAODE) by assigning different weights to these one-dependence classifiers. In our WAODE, four different weighting approaches are designed and thus four different versions are created. For simplicity, we respectively denote them by WAODE-MI, WAODE-ACC, WAODE-CLL and WAODE-AUC. The experimental results on a large number of UCI datasets published on the main website of Weka platform show that our WAODE significantly outperform AODE. †This article is an extended version of PRICAI 2006 conference paper (Jiang and Zhang 2006).

Journal of Experimental and Theoretical Artificial Intelligence | 2013

Naive Bayes text classifiers: a locally weighted learning approach

Liangxiao Jiang; Zhihua Cai; Harry Zhang; Dianhong Wang

Due to being fast, easy to implement and relatively effective, some state-of-the-art naive Bayes text classifiers with the strong assumption of conditional independence among attributes, such as multinomial naive Bayes, complement naive Bayes and the one-versus-all-but-one model, have received a great deal of attention from researchers in the domain of text classification. In this article, we revisit these naive Bayes text classifiers and empirically compare their classification performance on a large number of widely used text classification benchmark datasets. Then, we propose a locally weighted learning approach to these naive Bayes text classifiers. We call our new approach locally weighted naive Bayes text classifiers (LWNBTC). LWNBTC weakens the attribute conditional independence assumption made by these naive Bayes text classifiers by applying the locally weighted learning approach. The experimental results show that our locally weighted versions significantly outperform these state-of-the-art naive Bayes text classifiers in terms of classification accuracy.

Expert Systems With Applications | 2012

Not so greedy: Randomly Selected Naive Bayes

Liangxiao Jiang; Zhihua Cai; Harry Zhang; Dianhong Wang

Many approaches are proposed to improve Naive Bayes, among which the attribute selection approach has demonstrated remarkable performance. Algorithms for attribute selection fall into two broad categories: filters and wrappers. Filters use the general data characteristics to evaluate the selected attribute subset before the learning algorithm is run, while wrappers use the learning algorithm itself as a black box to evaluate the selected attribute subset. In this paper, we work on the attribute selection approach of wrapper and propose an improved Naive Bayes algorithm by carrying a random search through the whole space of attributes. We simply called it Randomly Selected Naive Bayes (RSNB). In order to meet the need of classification, ranking, and class probability estimation, we discriminatively design three different versions: RSNB-ACC, RSNB-AUC, and RSNB-CLL. The experimental results based on a large number of UCI datasets validate their effectiveness in terms of classification accuracy (ACC), area under the ROC curve (AUC), and conditional log likelihood (CLL), respectively.

International Journal of Machine Learning and Cybernetics | 2014

Bayesian Citation-KNN with distance weighting

Liangxiao Jiang; Zhihua Cai; Dianhong Wang; Harry Zhang

Multi-instance (MI) learning is receiving growing attention in the machine learning research field, in which learning examples are represented by a bag of instances instead of a single instance. K-nearest-neighbor (KNN) is a simple and effective classification model in the traditional supervised learning. As its two variants, Bayesian-KNN (BKNN) and Citation-KNN (CKNN) are proposed and are widely used for solving multi-instance classification problems. However, CKNN still applies the simplest majority vote approach among the references and citers to classify unseen bags. In this paper, we propose an improved algorithm called Bayesian Citation-KNN (BCKNN). For each unseen bag, BCKNN firstly finds its

International Journal on Artificial Intelligence Tools | 2012