Victor S. Sheng
University of Central Arkansas
Publications
Featured research published by Victor S. Sheng.
Knowledge Discovery and Data Mining | 2008
Victor S. Sheng; Foster Provost; Panagiotis G. Ipeirotis
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus especially on the improvement of training labels for supervised induction. With the outsourcing of small tasks becoming easier, for example via Rent-A-Coder or Amazon's Mechanical Turk, it is often possible to obtain less-than-expert labeling at low cost. With low-cost labeling, preparing the unlabeled part of the data can become considerably more expensive than labeling it. We present repeated-labeling strategies of increasing complexity, and show several main results: (i) repeated labeling can improve label quality and model quality, but not always; (ii) when labels are noisy, repeated labeling can be preferable to single labeling even in the traditional setting where labels are not particularly cheap; (iii) as soon as processing the unlabeled data is not free, even the simple strategy of labeling everything multiple times can give a considerable advantage; (iv) repeatedly labeling a carefully chosen set of points is generally preferable, and we present a robust technique that combines different notions of uncertainty to select the data points whose quality should be improved. The bottom line: the results show clearly that when labeling is not perfect, selective acquisition of multiple labels is a strategy that data miners should have in their repertoire; for certain label-quality/cost regimes, the benefit is substantial.
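The core loop the paper studies can be sketched in a few lines: integrate repeated noisy labels by majority voting, and spend the labeling budget on the items whose current labels look least trustworthy. A minimal Python sketch, with an illustrative vote-split uncertainty score standing in for the paper's combined uncertainty measures; `noisy_oracle` is a hypothetical labeler interface, not an API from the paper.

```python
from collections import Counter

def integrate(labels):
    """Majority vote over the multiset of noisy labels for one item."""
    return Counter(labels).most_common(1)[0][0]

def label_uncertainty(labels):
    """Vote-split uncertainty: 0 for unanimous labels, 1 for a perfect tie.
    (Illustrative only; the paper combines several notions of uncertainty.)"""
    pos = sum(1 for l in labels if l == 1)
    return 1.0 - abs(2 * pos - len(labels)) / len(labels)

def selective_repeated_labeling(items, noisy_oracle, budget):
    """Acquire one label per item, then spend the remaining budget on the
    items whose current integrated labels are the most uncertain."""
    items = list(items)
    acquired = {i: [noisy_oracle(i)] for i in items}
    budget -= len(items)
    while budget > 0:
        target = max(items, key=lambda i: label_uncertainty(acquired[i]))
        acquired[target].append(noisy_oracle(target))
        budget -= 1
    return {i: integrate(ls) for i, ls in acquired.items()}
```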
IEEE Transactions on Neural Networks and Learning Systems | 2015
Bin Gu; Victor S. Sheng; Keng Yeow Tay; Walter Romano; Shuo Li
Support vector ordinal regression (SVOR) is a popular method for tackling ordinal regression problems. However, no effective algorithms for incremental SVOR learning have been proposed so far, due to the complicated formulations of SVOR. Recently, an interesting accurate on-line algorithm was proposed for training ν-support vector classification (ν-SVC), which can handle a quadratic formulation with a pair of equality constraints. In this paper, we first present a modified SVOR formulation based on a sum-of-margins strategy. The formulation has multiple constraints, and each constraint includes a mixture of an equality and an inequality. Then, we extend the accurate on-line ν-SVC algorithm to the modified formulation, and propose an effective incremental SVOR algorithm. The algorithm can handle a quadratic formulation with multiple constraints, where each constraint is constituted of an equality and an inequality. More importantly, it tackles the conflicts between the equality and inequality constraints. We also provide a finite convergence analysis for the algorithm. Numerical experiments on several benchmark and real-world data sets show that the incremental algorithm converges to the optimal solution in a finite number of steps and is faster than existing batch and incremental SVOR algorithms. Meanwhile, the modified formulation is more accurate than the existing incremental SVOR algorithm, and is as accurate as the sum-of-margins formulation of Shashua and Levin.
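For orientation, the sum-of-margins idea of Shashua and Levin that the modified formulation builds on looks schematically like the following (our paraphrase, not the paper's exact modified formulation): for r ordered classes, each of the r-1 class boundaries gets a pair of thresholds whose gap is a margin to be maximized.

\[
\begin{aligned}
\max_{w,\,a,\,b,\,\xi}\quad & \sum_{j=1}^{r-1}\bigl(a_j-b_j\bigr)\;-\;C\sum_{j,\,i}\bigl(\xi_i^{(j)}+\xi_i^{*(j)}\bigr)\\
\text{s.t.}\quad & w^\top x_i^{(j)}\le b_j+\xi_i^{(j)},\qquad w^\top x_i^{(j+1)}\ge a_j-\xi_i^{*(j)},\\
& \|w\|^2\le 1,\qquad b_j\le a_j,\qquad \xi\ge 0 .
\end{aligned}
\]

The paper's modified version restructures the constraints so that each boundary contributes a mixed equality/inequality pair, which is what allows the on-line ν-SVC machinery to be extended to it.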
IEEE Transactions on Neural Networks and Learning Systems | 2017
Bin Gu; Xingming Sun; Victor S. Sheng
The minimax probability machine (MPM) is an interesting discriminative classifier based on generative prior knowledge. It can directly estimate a probabilistic accuracy bound by minimizing the maximum probability of misclassification. The structural information of data is an effective way to represent prior knowledge, and has been found to be vital for designing classifiers in real-world problems. However, MPM only considers the prior probability distribution of each class with a given mean and covariance matrix, which does not efficiently exploit the structural information of the data. In this paper, we use two finite mixture models to capture the structural information of the data in binary classification. For each subdistribution in a finite mixture model, only its mean and covariance matrix are assumed to be known. Based on the finite mixture models, we propose a structural MPM (SMPM). SMPM can be solved effectively by a sequence of second-order cone programming problems. Moreover, we extend the linear SMPM to a nonlinear model via kernelization techniques. We also show that SMPM can be interpreted as a large-margin classifier and can be transformed into the support vector machine and the maxi-min margin machine under certain special conditions. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of SMPM.
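For reference, the classical MPM that SMPM generalizes solves, for class means and covariances \((\mu_\pm,\Sigma_\pm)\),

\[
\max_{\alpha,\,w\neq 0,\,b}\ \alpha
\quad\text{s.t.}\quad
w^\top\mu_+ - b \ \ge\ \kappa(\alpha)\sqrt{w^\top\Sigma_+ w},
\qquad
b - w^\top\mu_- \ \ge\ \kappa(\alpha)\sqrt{w^\top\Sigma_- w},
\]

where \(\kappa(\alpha)=\sqrt{\alpha/(1-\alpha)}\) comes from the distribution-free Chebyshev-type bound \(\inf_{x\sim(\mu,\Sigma)}\Pr(w^\top x\ge b)\ge\alpha\). As we read the abstract, SMPM imposes a constraint of this form per mixture component, which is what yields a sequence of second-order cone programs.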
Measurement Science Review | 2013
Jian Wu; Zhiming Cui; Victor S. Sheng; Pengpeng Zhao; Dongliang Su; Shengrong Gong
SIFT is a local image feature description algorithm based on scale space. Due to its strong matching ability, SIFT has many applications in different fields, such as image retrieval, image stitching, and machine vision. Since SIFT was proposed, researchers have continually refined it; the variants that have drawn the most attention are PCA-SIFT, GSIFT, CSIFT, SURF, and ASIFT. In this paper, we first systematically analyze SIFT and its variants. Then, we evaluate their performance under different kinds of change: scale, rotation, blur, illumination, and affine transformation. The experimental results show that each has its own advantages. SIFT and CSIFT perform the best under scale and rotation change. CSIFT improves on SIFT under blur and affine change, but not illumination change. GSIFT performs the best under blur and illumination change. ASIFT performs the best under affine change. PCA-SIFT consistently ranks second across situations. SURF performs the worst overall, but runs the fastest.
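The kind of matching pipeline such evaluations rest on is easy to reproduce with OpenCV's SIFT implementation. A minimal sketch; the file names are placeholders, and the 0.75 ratio is Lowe's conventional value, not a setting from this paper:

```python
import cv2

img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()                       # opencv-python >= 4.4
kp1, des1 = sift.detectAndCompute(img1, None)  # keypoints + 128-d descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
good = []
for pair in matcher.knnMatch(des1, des2, k=2):
    # Lowe's ratio test: keep a match only if it is clearly better
    # than the second-best candidate.
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])
print(f"{len(good)} good matches")
```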
International Conference on Machine Learning | 2006
Victor S. Sheng; Charles X. Ling
In medical diagnosis, doctors often have to order sets of medical tests in sequence in order to diagnose a patient's disease accurately, trading off the cost of the tests against possible misdiagnosis. In this paper, we use cost-sensitive learning to model this process. We assume that test examples (new patients) may contain missing values, and that the actual values can be acquired at a cost (similar to ordering medical tests) in order to reduce misclassification errors (misdiagnoses). We propose a novel Sequential Batch Test algorithm that acquires sets of attribute values in sequence, similar to the sets of medical tests doctors order in sequence. The goal of the algorithm is to minimize the total cost (i.e., the trade-off) of acquiring attribute values and of misclassification. We demonstrate the effectiveness of our algorithm and show that it significantly outperforms previous methods. Our algorithm can be readily applied in real-world diagnosis tasks; a case study on heart disease is given in the paper.
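A myopic, single-test version of the trade-off can be sketched as follows. This is an illustration of the cost reasoning only, not the paper's Sequential Batch Test algorithm (which selects whole batches of tests); `predict_proba` and `acquire` are caller-supplied stand-ins for the classifier and the test procedure:

```python
def expected_misclf_cost(p_pos, fp_cost, fn_cost):
    """Expected cost of the cheaper of the two predictions, given P(+) = p_pos."""
    return min(p_pos * fn_cost,        # predict negative, wrong with prob p_pos
               (1 - p_pos) * fp_cost)  # predict positive, wrong with prob 1 - p_pos

def acquire_tests(example, predict_proba, acquire, test_costs, fp_cost, fn_cost):
    """Keep buying the cheapest missing attribute while some test costs less
    than the current expected misclassification cost; then stop and predict."""
    spent = 0.0
    while True:
        missing = [a for a, v in example.items() if v is None]
        risk = expected_misclf_cost(predict_proba(example), fp_cost, fn_cost)
        affordable = [a for a in missing if test_costs[a] < risk]
        if not affordable:
            return example, spent
        a = min(affordable, key=test_costs.get)   # cheapest worthwhile test
        example[a] = acquire(a)                   # e.g. run the medical test
        spent += test_costs[a]
```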
IEEE Transactions on Neural Networks and Learning Systems | 2013
Bin Gu; Victor S. Sheng
The ν-support vector machine (ν-SVM) for classification has the advantage of using a parameter ν to control the number of support vectors and margin errors. Recently, an accurate on-line ν-SVM algorithm (AONSVM) was proposed for training ν-SVM. AONSVM can be viewed as a special case of parametric quadratic programming techniques. Experimental analysis demonstrated that AONSVM avoids infeasible updating paths as far as possible and successfully converges to the optimal solution. However, because of the differences between AONSVM and classical parametric quadratic programming techniques, there had been no theoretical justification for these conclusions. In this paper, we prove the feasibility and finite convergence of AONSVM under two assumptions. The main results of the feasibility analysis are: 1) the inverses of the two key matrices in AONSVM always exist; 2) the rules for updating the two key inverse matrices are reliable; 3) the variable ζ can efficiently control the adjustment of the sum of all the weights; and 4) a sample cannot migrate back and forth in successive adjustment steps among the set of margin support vectors, the set of error support vectors, and the set of the remaining vectors. Moreover, the analyses of AONSVM also directly yield proofs of feasibility and finite convergence for accurate on-line C-SVM learning.
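For context, the ν-SVM primal that AONSVM trains is

\[
\min_{w,\,b,\,\rho,\,\xi}\ \tfrac{1}{2}\|w\|^2-\nu\rho+\tfrac{1}{m}\sum_{i=1}^{m}\xi_i
\quad\text{s.t.}\quad
y_i\bigl(w^\top\phi(x_i)+b\bigr)\ \ge\ \rho-\xi_i,\qquad \xi_i\ge 0,\quad \rho\ge 0,
\]

whose dual carries the pair of equality constraints \(\sum_i\alpha_i y_i=0\) and (at the optimum) \(\sum_i\alpha_i=\nu\). Maintaining both constraints along the incremental updating path is exactly where the feasibility questions analyzed in this paper arise.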
Knowledge Discovery and Data Mining | 2006
Charles X. Ling; Victor S. Sheng; Tilmann Bruckhaus; Nazim H. Madhavji
While most software defects (i.e., bugs) are corrected and tested as part of the lengthy software development cycle, enterprise software vendors often have to release software products before all reported defects are corrected, due to deadlines and limited resources. A small number of these defects will be escalated by customers, and they must be resolved immediately by the software vendors at a very high cost. In this paper, we develop an Escalation Prediction (EP) system that mines historical defect report data and predicts the escalation risk of defects for maximum net profit. More specifically, we first describe a simple and general framework that converts the maximum net profit problem into cost-sensitive learning. We then apply and compare several well-known cost-sensitive learning approaches for EP. Our experiments suggest that the cost-sensitive decision tree is the best method, producing the highest positive net profit and comprehensible results. The EP system has been deployed successfully in the product group of an enterprise software vendor.
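The expected-value reasoning behind converting net profit into a classification threshold is standard and compact. A sketch under an assumed two-cost model (the variable names and the simplification are ours, not the paper's full framework):

```python
def escalation_threshold(escalation_cost, intervention_cost):
    """Fix a defect proactively when the expected saving beats the fix cost:
        p * escalation_cost > intervention_cost
          =>  p > intervention_cost / escalation_cost."""
    return intervention_cost / escalation_cost

def defects_to_fix(defect_probs, escalation_cost, intervention_cost):
    """Indices of defects whose predicted escalation risk justifies fixing."""
    t = escalation_threshold(escalation_cost, intervention_cost)
    return [i for i, p in enumerate(defect_probs) if p > t]
```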
European Conference on Machine Learning | 2007
Victor S. Sheng; Charles X. Ling
In this paper, we propose a new and general preprocessor algorithm, called CSRoulette, which converts any cost-insensitive classification algorithm into a cost-sensitive one. CSRoulette is based on a cost-proportional roulette sampling technique (CPRS for short). CSRoulette is closely related to Costing, another cost-sensitive meta-learning algorithm, which is based on rejection sampling. Unlike rejection sampling, which produces smaller samples, CPRS can generate samples of any size. To further improve performance, we apply ensembling (bagging) on top of CPRS; the resulting algorithm is CSRoulette. Our experiments show that CSRoulette outperforms Costing and other meta-learning methods on most data sets tested. In addition, we investigate the effect of various sample sizes and conclude that the reduced sample sizes of rejection sampling cannot be compensated for by increasing the number of bagging iterations.
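A minimal sketch of the two ingredients as the abstract describes them, cost-proportional roulette sampling plus bagging; `train` is a caller-supplied cost-insensitive learner that returns a predict function:

```python
import random
from collections import Counter

def cprs_sample(dataset, costs, size):
    """Cost-proportional roulette sampling: draw `size` examples with
    replacement, each with probability proportional to its misclassification
    cost (unlike rejection sampling, any sample size can be requested)."""
    return random.choices(dataset, weights=costs, k=size)

def csroulette(dataset, costs, train, n_bags=10, size=None):
    """Bagging of a cost-insensitive learner over CPRS samples."""
    size = size or len(dataset)
    models = [train(cprs_sample(dataset, costs, size)) for _ in range(n_bags)]
    def predict(x):
        votes = Counter(m(x) for m in models)   # majority vote of the ensemble
        return votes.most_common(1)[0][0]
    return predict
```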
IEEE Transactions on Knowledge and Data Engineering | 2015
Jing Zhang; Xindong Wu; Victor S. Sheng
It is easy to collect multiple noisy labels for the same object via Internet-based crowdsourcing systems. Labelers may be biased when labeling, due to lack of expertise or dedication, or because of personal preference; this causes imbalanced multiple noisy labeling. In most cases, we have no information about the labeling qualities of labelers or the underlying class distributions, so it is important to design agnostic solutions that utilize these noisy labels for supervised learning. We first investigate how imbalanced multiple noisy labeling affects the class distributions of training sets and the performance of classification. Then, an agnostic algorithm, Positive LAbel frequency Threshold (PLAT), is proposed to deal with the imbalanced labeling issue. Simulations on eight UCI data sets with different underlying class distributions show that PLAT not only effectively deals with imbalanced multiple noisy labeling problems that off-the-shelf agnostic methods cannot cope with, but also performs nearly as well as majority voting when there is no imbalance. We also apply PLAT to eight real-world data sets with imbalanced labels collected from Amazon Mechanical Turk, and the experimental results show that PLAT is efficient and outperforms other ground-truth inference algorithms.
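The difference from plain majority voting is only the cut-off applied to each item's positive-label frequency. In the sketch below the threshold is simply the mean positive frequency over all items, an illustrative stand-in for PLAT's data-driven rule, not the published algorithm:

```python
def positive_frequency(labels):
    """Fraction of positive votes among one item's noisy binary labels."""
    return sum(labels) / len(labels)

def threshold_integrate(multi_labels):
    """Integrate each item's labels by cutting its positive frequency at a
    data-driven threshold instead of majority voting's fixed 0.5, so that
    systematically biased labelers do not skew the induced class distribution."""
    freqs = [positive_frequency(ls) for ls in multi_labels]
    t = sum(freqs) / len(freqs)          # illustrative threshold choice
    return [1 if f > t else 0 for f in freqs]
```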
International Journal of Pattern Recognition and Artificial Intelligence | 2016
Xuesong Yan; Qinghua Wu; Victor S. Sheng
Multi-label classification assigns an instance to multiple classes simultaneously. Naive Bayes (NB) is one of the most popular algorithms for pattern recognition and classification; it performs well in single-label classification and extends naturally to multi-label classification under the assumption of label independence. However, NB rests on the simple but unrealistic assumption that attributes are conditionally independent given the class. Therefore, a double-weighted NB (DWNB) is proposed to model the differing influence of each attribute on the prediction of each label. Our DWNB utilizes the niching cultural algorithm (NLA) to determine the weight configuration automatically. Our experimental results show that the proposed DWNB significantly outperforms NB and its extensions in multi-label classification.
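A sketch of the scoring rule such a weighting scheme implies, under our reading of the abstract (one weight per attribute-label pair; the weight search itself is omitted):

```python
import math

def dwnb_score(instance, label, prior, cond_prob, weight):
    """Weighted naive Bayes log-score for one (instance, label) pair:
        log P(c) + sum_i  w[c][i] * log P(a_i | c).
    With all weights equal to 1 this reduces to standard naive Bayes; in DWNB
    the weights would be tuned automatically by the niching cultural algorithm."""
    score = math.log(prior[label])
    for attr, value in instance.items():
        score += weight[label][attr] * math.log(cond_prob[label][attr][value])
    return score
```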