
Publication


Featured research published by Xiaoyuan Su.


Advances in Artificial Intelligence | 2009

A survey of collaborative filtering techniques

Xiaoyuan Su; Taghi M. Khoshgoftaar

As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenges, such as data sparsity, scalability, synonymy, gray sheep, shilling attacks, and privacy protection, along with their possible solutions. We then present three main categories of CF techniques: memory-based, model-based, and hybrid CF algorithms (which combine CF with other recommendation techniques), with examples of representative algorithms in each category and an analysis of their predictive performance and their ability to address the challenges. From basic techniques to the state of the art, we attempt to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.
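
As a concrete illustration of the memory-based category described above, the sketch below implements user-based CF with Pearson correlation in Python. It is a minimal sketch only: the toy rating matrix, function names, and neighborhood size are illustrative and not taken from the survey.

```python
import numpy as np

def pearson_similarity(ru, rv):
    """Pearson correlation over the items co-rated by two users (NaN = unrated)."""
    mask = ~np.isnan(ru) & ~np.isnan(rv)
    if mask.sum() < 2:
        return 0.0
    a, b = ru[mask], rv[mask]
    a_c, b_c = a - a.mean(), b - b.mean()
    denom = np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum())
    return float((a_c * b_c).sum() / denom) if denom > 0 else 0.0

def predict_rating(R, user, item, k=2):
    """Predict R[user, item] from the k most similar users who rated the item."""
    sims = sorted(((pearson_similarity(R[user], R[v]), v)
                   for v in range(R.shape[0])
                   if v != user and not np.isnan(R[v, item])), reverse=True)[:k]
    user_mean = np.nanmean(R[user])
    if not sims or all(s == 0 for s, _ in sims):
        return user_mean                      # no usable neighbors: fall back to the user's mean
    num = sum(s * (R[v, item] - np.nanmean(R[v])) for s, v in sims)
    den = sum(abs(s) for s, _ in sims)
    return user_mean + num / den

# Toy 4-user x 4-item rating matrix (NaN = missing)
R = np.array([[5, 3, np.nan, 1],
              [4, np.nan, np.nan, 1],
              [1, 1, np.nan, 5],
              [np.nan, 1, 5, 4]], dtype=float)
print(round(predict_rating(R, user=0, item=2), 2))
```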


International Conference on Tools with Artificial Intelligence | 2006

Collaborative Filtering for Multi-class Data Using Belief Nets Algorithms

Xiaoyuan Su; Taghi M. Khoshgoftaar

As one of the most successful types of recommender systems, collaborative filtering (CF) algorithms must deal with high data sparsity and demanding scalability requirements, among other challenges. Bayesian belief nets (BNs), one of the most frequently used classifiers, can be applied to CF tasks. Previous work applying BNs to CF tasks focused mainly on binary-class data and used simple or basic Bayesian classifiers (Miyahara and Pazzani, 2002; Breese et al., 1998). In this work, we apply advanced BN models to CF tasks instead of simple ones, and work on real-world multi-class CF data instead of synthetic binary-class data. Empirical results show that, with their ability to deal with incomplete data, the extended logistic regression on naive Bayes and tree augmented naive Bayes (NB-ELR and TAN-ELR) models (Greiner et al., 2005) consistently perform better than the state-of-the-art Pearson correlation-based CF algorithm. In addition, the ELR-optimized BN CF models are robust in their ability to make predictions, while the robustness of the Pearson correlation-based CF algorithm degrades as the sparseness of the data increases.
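
To make the classification view of CF concrete, here is a minimal sketch that treats the target item's rating as the class and the user's other ratings as features, with a plain naive Bayes model that simply skips missing features. It is not the ELR-trained NB/TAN machinery evaluated in the paper; the function names, smoothing, and toy data are illustrative assumptions.

```python
import numpy as np

RATING_CLASSES = (1, 2, 3, 4, 5)

def naive_bayes_cf_predict(R, user, item, alpha=1.0):
    """Predict the rating class of `item` for `user` with a plain naive Bayes model.

    The target item's rating is the class; the user's ratings on the other items
    are the features. Missing feature values are skipped, a crude stand-in for
    the principled incomplete-data handling (ELR) used in the paper.
    """
    log_post = {}
    for c in RATING_CLASSES:
        # users who gave `item` the rating c form the training set for class c
        members = [v for v in range(R.shape[0]) if R[v, item] == c]
        prior = (len(members) + alpha) / (R.shape[0] + alpha * len(RATING_CLASSES))
        log_p = np.log(prior)
        for j in range(R.shape[1]):
            if j == item or np.isnan(R[user, j]):
                continue
            match = sum(1 for v in members if R[v, j] == R[user, j])
            log_p += np.log((match + alpha) /
                            (len(members) + alpha * len(RATING_CLASSES)))
        log_post[c] = log_p
    return max(log_post, key=log_post.get)

R = np.array([[5, 3, np.nan, 1],
              [4, 5, 4, np.nan],
              [1, 1, np.nan, 5],
              [2, 1, 5, 4]], dtype=float)
print(naive_bayes_cf_predict(R, user=0, item=2))
```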


ACM Symposium on Applied Computing | 2008

Imputation-boosted collaborative filtering using machine learning classifiers

Xiaoyuan Su; Taghi M. Khoshgoftaar; Xingquan Zhu; Russell Greiner

As data sparsity remains a significant challenge for collaborative filtering (CF), we conjecture that predicted ratings based on imputed data may be more accurate than those based on the original, very sparse rating data. In this paper, we propose a framework of imputation-boosted collaborative filtering (IBCF), which first uses an imputation technique, or a machine-learned classifier, to fill in the sparse user-item rating matrix, and then runs a traditional Pearson correlation-based CF algorithm on this matrix to predict a novel rating. Empirical results show that IBCF using machine learning classifiers can improve the predictive accuracy of CF tasks. In particular, IBCF using a classifier that deals well with missing data, such as naïve Bayes, can outperform content-boosted CF (a representative hybrid CF algorithm) and IBCF using PMM (predictive mean matching, a state-of-the-art imputation technique), without using external content information.
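
A hedged sketch of the IBCF pipeline under stated assumptions: a Gaussian naive Bayes classifier (standing in for the imputers studied in the paper) fills each item's missing ratings from the users' other ratings, and a user-based Pearson CF then predicts from the densified matrix. The helper names and toy matrix are hypothetical.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def impute_with_classifier(R):
    """Fill each item's missing ratings with a per-item classifier trained on the
    users who rated that item, using mean-filled ratings on the other items as features."""
    R_filled = np.where(np.isnan(R), np.nanmean(R, axis=0), R)   # crude feature fill
    R_imp = R.copy()
    for j in range(R.shape[1]):
        known, missing = ~np.isnan(R[:, j]), np.isnan(R[:, j])
        if known.sum() < 2 or missing.sum() == 0:
            continue
        X = np.delete(R_filled, j, axis=1)
        clf = GaussianNB().fit(X[known], R[known, j])
        R_imp[missing, j] = clf.predict(X[missing])
    return R_imp

def pearson_cf_predict(R_dense, user, item):
    """User-based Pearson CF on the (now dense) imputed matrix."""
    sims = np.corrcoef(R_dense)[user]
    others = [v for v in range(R_dense.shape[0]) if v != user]
    num = sum(sims[v] * (R_dense[v, item] - R_dense[v].mean()) for v in others)
    den = sum(abs(sims[v]) for v in others) or 1.0
    return R_dense[user].mean() + num / den

R = np.array([[5, 3, np.nan, 1],
              [4, np.nan, 4, 1],
              [1, 1, np.nan, 5],
              [np.nan, 1, 5, 4]], dtype=float)
print(round(pearson_cf_predict(impute_with_classifier(R), user=0, item=2), 2))
```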


Web Intelligence | 2007

Hybrid Collaborative Filtering Algorithms Using a Mixture of Experts

Xiaoyuan Su; Russell Greiner; Taghi M. Khoshgoftaar; Xingquan Zhu

Collaborative filtering (CF) is one of the most successful approaches to recommendation. In this paper, we propose two hybrid CF algorithms, sequential mixture CF and joint mixture CF, each combining advice from multiple experts for effective recommendation. These hybrid CF models work particularly well in the common situation where the data are very sparse. By combining multiple experts into a mixture CF, our systems are able to cope with sparse data and obtain satisfactory performance. Empirical studies show that our algorithms outperform their peers, such as memory-based, pure model-based, and pure content-based CF algorithms, as well as content-boosted CF (a representative hybrid CF algorithm), especially when the underlying data are very sparse.
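
The blending step can be pictured with the schematic below: several CF "experts" (e.g. a memory-based, a model-based, and a content-based predictor) each emit a rating, and a weighted mixture combines them, renormalising when an expert abstains. This only illustrates the combination idea; the sequential and joint mixture models in the paper learn the combination in a more principled way, and the weights and predictions shown are hypothetical.

```python
import numpy as np

def mixture_predict(expert_preds, weights=None):
    """Blend rating predictions from several CF 'experts' into one prediction.
    Experts that abstain (return NaN) are dropped and the weights renormalised."""
    preds = np.asarray(expert_preds, dtype=float)
    w = np.ones_like(preds) if weights is None else np.asarray(weights, dtype=float)
    valid = ~np.isnan(preds)
    if not valid.any():
        return np.nan
    w = w[valid] / w[valid].sum()
    return float(np.dot(w, preds[valid]))

# Hypothetical predictions for one user-item pair from three experts
# (memory-based CF, model-based CF, content-based); the second expert abstains.
print(round(mixture_predict([4.2, np.nan, 3.6], weights=[0.5, 0.3, 0.2]), 2))
```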


Web Intelligence | 2008

Imputed Neighborhood Based Collaborative Filtering

Xiaoyuan Su; Taghi M. Khoshgoftaar; Russell Greiner

Collaborative filtering (CF) is one of the most effective types of recommender systems. As data sparsity remains a significant challenge for CF, we consider basing predictions on imputed data, and find that this often improves performance on very sparse rating data. In this paper, we propose two imputed neighborhood-based collaborative filtering (INCF) algorithms: imputed nearest neighborhood CF (INN-CF) and imputed densest neighborhood CF (IDN-CF). Each first imputes the user rating data using an imputation technique, and then applies a traditional Pearson correlation-based CF algorithm to the imputed data of the most similar or the densest neighbors to make predictions for a specific user. We compare an extension of Bayesian multiple imputation (eBMI) and mean imputation (MEI) as the imputers in these INCF algorithms, and compare the results against the commonly used neighborhood-based Pearson correlation CF as well as a densest-neighborhood-based CF. Our empirical results show that IDN-CF using eBMI significantly outperforms its rivals and takes less time to make its best predictions.
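
A rough sketch of the IDN-CF idea under simplifying assumptions: choose the neighbors with the most observed ratings, impute only their rows (plain item-mean imputation stands in for eBMI), and predict with Pearson-weighted deviations from those neighbors. Names and the toy matrix are illustrative.

```python
import numpy as np

def densest_neighbors(R, user, k=2):
    """The k users (other than `user`) with the most observed ratings."""
    counts = (~np.isnan(R)).sum(axis=1)
    others = [v for v in range(R.shape[0]) if v != user]
    return sorted(others, key=lambda v: counts[v], reverse=True)[:k]

def idn_cf_predict(R, user, item, k=2):
    """Imputed densest-neighborhood CF with item-mean imputation standing in for eBMI:
    impute only the chosen neighbors' rows, then make a Pearson-weighted prediction."""
    neighbors = densest_neighbors(R, user, k)
    item_means = np.nanmean(R, axis=0)
    R_imp = R.copy()
    R_imp[neighbors] = np.where(np.isnan(R[neighbors]), item_means, R[neighbors])
    u = np.where(np.isnan(R[user]), item_means, R[user])   # filled copy, for similarity only
    num = den = 0.0
    for v in neighbors:
        sim = np.corrcoef(u, R_imp[v])[0, 1]
        num += sim * (R_imp[v, item] - R_imp[v].mean())
        den += abs(sim)
    return np.nanmean(R[user]) + (num / den if den else 0.0)

R = np.array([[5, 3, np.nan, 1],
              [4, np.nan, 4, 1],
              [1, 1, np.nan, 5],
              [np.nan, 1, 5, 4]], dtype=float)
print(round(idn_cf_predict(R, user=0, item=2), 2))
```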


International Journal on Artificial Intelligence Tools | 2008

Collaborative Filtering for Multi-class Data Using Bayesian Networks

Xiaoyuan Su; Taghi M. Khoshgoftaar

As one of the most successful types of recommender systems, collaborative filtering (CF) algorithms must deal with high data sparsity and demanding scalability requirements, among other challenges. Bayesian networks (BNs), one of the most frequently used classifiers, can be applied to CF tasks. Previous work applying BNs to CF tasks focused mainly on binary-class data and used simple or basic Bayesian classifiers [1, 2]. In this work, we apply advanced BN models to CF tasks instead of simple ones, and work on real-world multi-class CF data instead of synthetic binary-class data. Empirical results show that, with its ability to deal with incomplete data, the extended logistic regression on tree augmented naive Bayes (TAN-ELR) [3] CF model consistently performs better than the traditional Pearson correlation-based CF algorithm for rating data that have few items or high missing rates. In addition, the ELR-optimized BN CF models are robust in their ability to make predictions, while the robustness of the Pearson correlation-based CF algorithm degrades as the sparseness of the data increases.
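
The robustness comparison mentioned above is typically run by hiding an increasing fraction of the known ratings and tracking prediction error at each sparsity level. The sketch below shows one such sweep with mean absolute error and a trivial item-mean baseline predictor; apart from the MAE metric, everything here (names, data, fractions) is an illustrative assumption, not the paper's protocol.

```python
import numpy as np

def item_mean_predictor(R_sparse, user, item):
    """Trivial baseline: the item's mean observed rating, or 3.0 if it has none."""
    col = R_sparse[:, item]
    return float(np.nanmean(col)) if (~np.isnan(col)).any() else 3.0

def mae(predict_fn, R_true, mask):
    """Mean absolute error of predict_fn over the held-out entries flagged in `mask`."""
    R_sparse = np.where(mask, np.nan, R_true)
    errs = [abs(predict_fn(R_sparse, u, i) - R_true[u, i]) for u, i in zip(*np.where(mask))]
    return float(np.mean(errs)) if errs else np.nan

def sparsity_sweep(predict_fn, R_true, fractions=(0.1, 0.3, 0.5), seed=0):
    """Hide an increasing fraction of the known ratings and report MAE at each level,
    a simple way to see how gracefully a CF predictor degrades as data grow sparser."""
    rng = np.random.default_rng(seed)
    return {f: mae(predict_fn, R_true, rng.random(R_true.shape) < f) for f in fractions}

# Fully observed toy rating matrix; any predictor with the signature
# predict_fn(R_sparse, user, item) can be swapped in for the baseline.
R_true = np.array([[5, 3, 4, 1],
                   [4, 2, 4, 1],
                   [1, 1, 2, 5],
                   [2, 1, 5, 4]], dtype=float)
print(sparsity_sweep(item_mean_predictor, R_true))
```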


International Journal of Information and Decision Sciences | 2009

Making an accurate classifier ensemble by voting on classifications from imputed learning sets

Xiaoyuan Su; Taghi M. Khoshgoftaar; Russell Greiner

Ensemble methods often produce effective classifiers by learning a set of base classifiers from a diverse collection of training sets. In this paper, we present a system, voting on classifications from imputed learning sets (VCI), that produces those diverse training sets by randomly removing a small percentage of attribute values from the original training set and then using an imputation technique to replace those values. VCI then runs a learning algorithm on each of these imputed training sets to produce a set of base classifiers. The final prediction on a novel instance is the plurality classification produced by these classifiers. We investigate various imputation techniques, including the state-of-the-art Bayesian multiple imputation (BMI) and expectation maximisation (EM). Our empirical results show that VCI predictors, especially those using BMI and EM as imputers, significantly improve classification accuracy over conventional classifiers, especially on datasets that are originally incomplete; moreover, VCI significantly outperforms bagging predictors and imputation-helped machine learners.
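
A compact sketch of the VCI recipe under stated assumptions: mean imputation stands in for the BMI/EM imputers and decision trees stand in for the unspecified base learners, so only the perturb-impute-train-vote structure follows the description above.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier

def train_vci(X, y, n_members=10, missing_ratio=0.1, seed=0):
    """Perturb the training set by injecting random missing values, impute each copy
    (mean imputation here, standing in for BMI/EM), and train one base classifier per copy."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        Xc = X.astype(float).copy()
        Xc[rng.random(X.shape) < missing_ratio] = np.nan     # inject missingness
        Xi = SimpleImputer(strategy="mean").fit_transform(Xc)
        members.append(DecisionTreeClassifier(random_state=0).fit(Xi, y))
    return members

def vci_predict(members, X):
    """Plurality vote over the base classifiers' predictions."""
    votes = np.stack([m.predict(X) for m in members])
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])

# Tiny synthetic classification problem
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
print(vci_predict(train_vci(X, y), X[:5]), y[:5])
```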


International Symposium on Visual Computing | 2007

Rule-based multiple object tracking for traffic surveillance using collaborative background extraction

Xiaoyuan Su; Taghi M. Khoshgoftaar; Xingquan Zhu; Andres Folleco

To address the challenges of occlusions and background variations, we propose a novel and effective rule-based multiple object tracking system for traffic surveillance using a collaborative background extraction algorithm. The collaborative background extraction algorithm combines multiple independent background extractions to remove spurious background pixels. Rule-based strategies are applied for thresholding, outlier removal, object consolidation, separation of neighboring objects, and shadow removal. Empirical results show that our multiple object tracking system is highly accurate for traffic surveillance under occlusion conditions.
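
The collaborative extraction step can be sketched as follows, under simplifying assumptions: several independent background estimates (temporal medians over disjoint frame groups) are combined pixel-wise so that spurious pixels surviving in only one estimate are suppressed, and a thresholded difference then flags moving-object pixels. The rule-based tracking stages are not reproduced, and the frame sequence is synthetic.

```python
import numpy as np

def collaborative_background(frames, n_groups=3):
    """Estimate a background independently from several disjoint groups of frames
    (temporal median per group), then combine the estimates pixel-wise (median again)
    so that spurious pixels present in only one estimate are suppressed."""
    groups = np.array_split(np.asarray(frames, dtype=float), n_groups)
    estimates = [np.median(g, axis=0) for g in groups]
    return np.median(np.stack(estimates), axis=0)

def foreground_mask(frame, background, threshold=25.0):
    """Thresholded absolute difference flags candidate moving-object pixels."""
    return np.abs(frame.astype(float) - background) > threshold

# Toy grayscale sequence: static background (value 100) with a bright patch
# (a "vehicle") passing through frames 3-5.
frames = np.full((9, 60, 80), 100.0)
for t in range(3, 6):
    frames[t, 20:30, 10 * t:10 * t + 15] = 220.0
bg = collaborative_background(frames)
print(int(foreground_mask(frames[4], bg).sum()))   # number of pixels flagged in frame 4
```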


International Symposium on Visual Computing | 2007

A progressive edge-based stereo correspondence method

Xiaoyuan Su; Taghi M. Khoshgoftaar

Local stereo correspondence is usually unsatisfactory because neither big-window nor small-window based methods can accurately match densely textured and textureless regions at the same time. In this paper, we present a progressive edge-based stereo matching algorithm in which big-window and small-window based matches are progressively integrated along the edges of the disparity map produced by the big-window matching. In addition, arbitrarily shaped window-based matching is used for regions where neither big nor small windows can find matches, and a novel optimization method, the progressive outlier remover, is used to effectively remove outliers and noise. Empirical results show that our method is comparable to some state-of-the-art stereo correspondence algorithms.
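
For context, the sketch below is a plain fixed-window SAD matcher of the kind the method builds on; the progressive integration of big- and small-window results along disparity edges, the arbitrarily shaped windows, and the progressive outlier remover are not reproduced. The image pair is synthetic.

```python
import numpy as np

def sad_disparity(left, right, max_disp=16, window=7):
    """Plain fixed-window stereo matching: for each pixel, pick the disparity whose
    right-image window minimises the sum of absolute differences (SAD)."""
    h, w = left.shape
    half = window // 2
    L = np.pad(left.astype(float), half, mode="edge")
    R = np.pad(right.astype(float), half, mode="edge")
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            patch_l = L[y:y + window, x:x + window]
            best, best_cost = 0, np.inf
            for d in range(min(max_disp, x) + 1):
                cost = np.abs(patch_l - R[y:y + window, x - d:x - d + window]).sum()
                if cost < best_cost:
                    best, best_cost = d, cost
            disp[y, x] = best
    return disp

# Synthetic pair: the left view is the right view shifted by 4 pixels.
rng = np.random.default_rng(0)
right = rng.integers(0, 256, size=(40, 60)).astype(float)
left = np.roll(right, 4, axis=1)
print(sad_disparity(left, right, max_disp=8)[20, 30])   # expect 4
```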


International Conference on Pattern Recognition | 2008

VoB predictors: Voting on bagging classifications

Xiaoyuan Su; Taghi M. Khoshgoftaar; Xingquan Zhu

Bagging predictors rely on bootstrap sampling to maintain a set of diverse base classifiers constituting the classifier ensemble, where diversity among the base classifiers is ensured through random sampling (with replacement) of the original data. In this paper, we propose a bootstrap sampling process based on random missing-value corruption, whose objective is to enhance the diversity of the learning sets through random missing-value injection, so that the base classifiers form an accurate classifier ensemble. Our VoB (voting on bagging classifications) predictors first generate multiple incomplete datasets from a complete base dataset by randomly injecting missing values with a small missing ratio, and then apply a bagging predictor trained on each of the incomplete datasets to produce classifications. The final class prediction is the result of voting on these classifications. Our empirical results show that VoB predictors significantly improve classification performance on complete data and perform better than bagging predictors.
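
A minimal sketch of the VoB structure under stated assumptions: each incomplete copy is mean-imputed before a stock scikit-learn bagging predictor is trained on it (the paper's handling of the injected missing values and its base learners are not specified here), and the final class is a plurality vote across the bagging predictors.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.impute import SimpleImputer

def train_vob(X, y, n_copies=5, missing_ratio=0.1, seed=0):
    """Derive several incomplete copies of a complete training set by injecting random
    missing values, then train one (imputer, bagging predictor) pair per copy."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_copies):
        Xc = X.astype(float).copy()
        Xc[rng.random(X.shape) < missing_ratio] = np.nan     # small missing ratio
        imputer = SimpleImputer(strategy="mean").fit(Xc)
        bag = BaggingClassifier(n_estimators=10, random_state=0).fit(imputer.transform(Xc), y)
        models.append((imputer, bag))
    return models

def vob_predict(models, X):
    """Final class = plurality vote over the bagging predictors' classifications."""
    votes = np.stack([bag.predict(imp.transform(X)) for imp, bag in models])
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])

# Tiny synthetic classification problem
rng = np.random.default_rng(2)
X = rng.normal(size=(80, 5))
y = (X[:, 0] * X[:, 1] > 0).astype(int)
print(vob_predict(train_vob(X, y), X[:6]), y[:6])
```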

Collaboration


Dive into Xiaoyuan Su's collaborations.

Top Co-Authors

Xingquan Zhu
Florida Atlantic University

Amri Napolitano
Florida Atlantic University

Andres Folleco
Florida Atlantic University