Evan Wei Xiang

Hong Kong University of Science and Technology

Publication

Featured research published by Evan Wei Xiang.

international joint conference on artificial intelligence | 2011

Transfer learning to predict missing ratings via heterogeneous user feedbacks

Weike Pan; Nathan Nan Liu; Evan Wei Xiang; Qiang Yang

Data sparsity due to missing ratings is a major challenge for collaborative filtering (CF) techniques in recommender systems. This is especially true for CF domains where the ratings are expressed numerically. We observe that, while we may lack the information in numerical ratings, we may have more data in the form of binary ratings. This is especially true when users can easily express themselves with their likes and dislikes for certain items. In this paper, we explore how to use the binary preference data expressed in the form of like/dislike to help reduce the impact of data sparsity of more expressive numerical ratings. We do this by transferring the rating knowledge from some auxiliary data source in binary form (that is, likes or dislikes), to a target numerical rating matrix. Our solution is to model both numerical ratings and like/dislike in a principled way, using a novel framework of Transfer by Collective Factorization (TCF). In particular, we construct the shared latent space collectively and learn the data-dependent effect separately. A major advantage of the TCF approach over previous collective matrix factorization (or bifactorization) methods is that we are able to capture the data-dependent effect when sharing the data-independent knowledge, so as to increase the overall quality of knowledge transfer. Experimental results demonstrate the effectiveness of TCF at various sparsity levels as compared to several state-of-the-art methods.

IEEE Intelligent Systems | 2012

SMS Spam Detection Using Noncontent Features

Qian Xu; Evan Wei Xiang; Qiang Yang; Jiachun Du; Jieping Zhong

Short Message Service text messages are indispensable, but they face a serious problem from spamming. This service-side solution uses graph data mining to distinguish spammers from nonspammers and detect spam without checking a message's contents.

conference on information and knowledge management | 2010

Unifying explicit and implicit feedback for collaborative filtering

Nathan Nan Liu; Evan Wei Xiang; Min Zhao; Qiang Yang

Most collaborative filtering algorithms are based on certain statistical models of user interests built from either explicit feedback (e.g., ratings, votes) or implicit feedback (e.g., clicks, purchases). Explicit feedback is more precise but more difficult to collect from users, while implicit feedback is much easier to collect though less accurate in reflecting user preferences. In the existing literature, separate models have been developed for each of these two forms of user feedback due to their heterogeneous representation. However, in most real-world recommender systems both explicit and implicit user feedback are abundant and could potentially complement each other. It is desirable to be able to unify these two heterogeneous forms of user feedback in order to generate more accurate recommendations. In this work, we developed matrix factorization models that can be trained from explicit and implicit feedback simultaneously. Experimental results on multiple datasets showed that our algorithm could effectively combine these two forms of heterogeneous user feedback to improve recommendation quality.
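As a rough illustration of training one factorization from both feedback types at once, the sketch below fits shared user and item factors to explicit ratings and implicit signals with a weighted squared-error loss. It is not the model from the paper: the function names, the weight `alpha`, the [0, 1] rating scale, and the treatment of implicit signals as 0/1 targets are all assumptions made for the example.

```python
import random


def train_joint_mf(explicit, implicit, n_users, n_items, k=4, alpha=0.5,
                   lr=0.05, reg=0.02, epochs=200, seed=0):
    """SGD over both feedback types with shared user/item factors.

    explicit: list of (user, item, rating), rating assumed scaled to [0, 1]
    implicit: list of (user, item, signal), signal assumed to be 0 or 1
    alpha:    hypothetical weight on the implicit-feedback error term
    """
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    # Merge both sources into one sample list; each sample carries its weight.
    samples = [(u, i, r, 1.0) for u, i, r in explicit]
    samples += [(u, i, s, alpha) for u, i, s in implicit]
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, i, y, w in samples:
            pred = sum(a * b for a, b in zip(U[u], V[i]))
            err = w * (y - pred)
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V


def weighted_loss(U, V, explicit, implicit, alpha=0.5):
    """Weighted squared error over both feedback sources (no regularizer)."""
    total = 0.0
    for data, w in ((explicit, 1.0), (implicit, alpha)):
        for u, i, y in data:
            pred = sum(a * b for a, b in zip(U[u], V[i]))
            total += w * (y - pred) ** 2
    return total
```

Because both sources update the same `U` and `V`, an item seen only through clicks still gets a factor vector that can be used when predicting ratings; how much the implicit data should count is controlled here by the assumed `alpha`.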

international semantic web conference | 2012

A machine learning approach for instance matching based on similarity metrics

Shu Rong; Xing Niu; Evan Wei Xiang; Haofen Wang; Qiang Yang; Yong Yu

The Linking Open Data (LOD) project is an ongoing effort to construct a global data space, i.e., the Web of Data. One important part of this project is to establish owl:sameAs links among structured data sources. Such links indicate equivalent instances that refer to the same real-world object. The problem of discovering owl:sameAs links between pairwise data sources is called instance matching. Most of the existing approaches addressing this problem rely on the quality of prior schema matching, which is not always good enough in the LOD scenario. In this paper, we propose a schema-independent instance-pair similarity metric based on several general descriptive features. We transform the instance matching problem into a binary classification problem and solve it with machine learning algorithms. Furthermore, we employ some transfer learning methods to utilize the existing owl:sameAs links in LOD to reduce the demand for labeled data. We carry out experiments on some datasets of OAEI2010. The results show that our method performs well on real-world LOD data and outperforms the participants of OAEI2010.
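The idea of casting instance matching as binary classification can be sketched as follows. Each candidate pair is turned into a small feature vector that ignores property names (so no schema alignment is needed), and a logistic regression model is trained on pairs labeled as match or non-match. The two features, the helper names, and the plain gradient-descent trainer are assumptions chosen for brevity, not the feature set or learner used in the paper.

```python
import math


def tokens(text):
    """Lowercased whitespace tokens of a literal value."""
    return set(text.lower().split())


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(x, y):
    """Features for an instance pair; x and y map property name -> literal.

    Property names are ignored on purpose: only the values are compared.
    """
    tok_x = set().union(*(tokens(v) for v in x.values()))
    tok_y = set().union(*(tokens(v) for v in y.values()))
    val_x = {v.lower() for v in x.values()}
    val_y = {v.lower() for v in y.values()}
    # [bias term, token overlap, exact value overlap]
    return [1.0, jaccard(tok_x, tok_y), jaccard(val_x, val_y)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=500):
    """Plain stochastic gradient ascent on the log-likelihood."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j in range(len(w)):
                w[j] += lr * (yi - p) * xi[j]
    return w


def match_probability(w, x, y):
    """Estimated probability that x and y denote the same real-world object."""
    feats = pair_features(x, y)
    return sigmoid(sum(wj * fj for wj, fj in zip(w, feats)))
```

A pair would be linked with owl:sameAs when `match_probability` exceeds a chosen threshold; the threshold itself is a tuning decision outside this sketch.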

IEEE Transactions on Knowledge and Data Engineering | 2010

Bridging Domains Using World Wide Knowledge for Transfer Learning

Evan Wei Xiang; Bin Cao; Derek Hao Hu; Qiang Yang

A major problem of classification learning is the lack of ground-truth labeled data. It is usually expensive to label new data instances for training a model. To solve this problem, domain adaptation in transfer learning has been proposed to classify target domain data by using some other source domain data, even when the data may have different distributions. However, domain adaptation may not work well when the differences between the source and target domains are large. In this paper, we design a novel transfer learning approach, called BIG (Bridging Information Gap), to effectively extract useful knowledge in a worldwide knowledge base, which is then used to link the source and target domains for improving the classification performance. BIG works when the source and target domains share the same feature space but different underlying data distributions. Using the auxiliary source data, we can extract a "bridge" that allows cross-domain text classification problems to be solved using standard semisupervised learning algorithms. A major contribution of our work is that with BIG, a large amount of worldwide knowledge can be easily adapted and used for learning in the target domain. We conduct experiments on several real-world cross-domain text classification tasks and demonstrate that our proposed approach can outperform several existing domain adaptation approaches significantly.

international joint conference on artificial intelligence | 2011

Source-selection-free transfer learning

Evan Wei Xiang; Sinno Jialin Pan; Weike Pan; Jian Su; Qiang Yang

Transfer learning addresses the problem that labeled training data are insufficient to produce a high-performance model. Typically, given a target learning task, most transfer learning approaches require the designers to select one or more auxiliary tasks as sources. However, how to select the right source data to enable effective knowledge transfer automatically is still an unsolved problem, which limits the applicability of transfer learning. In this paper, we take one step further and propose a novel transfer learning framework, known as source-selection-free transfer learning (SSFTL), to free users from the need to select source domains. Instead of asking the users for source and target data pairs, as traditional transfer learning does, SSFTL turns to some online information sources such as the World Wide Web or Wikipedia for help. The source data for transfer learning can be hidden somewhere within this large online information source, but the users do not know where they are. Based on the online information sources, we train a large number of classifiers. Then, given a target task, a bridge is built for labels of the potential source candidates and the target domain data in SSFTL via some large online social media with tag cloud as a label translator. An added advantage of SSFTL is that, unlike many previous transfer learning approaches, which are difficult to scale up to the Web scale, SSFTL is highly scalable and can shift much of the training work to an offline stage. We demonstrate the effectiveness and efficiency of SSFTL through extensive experiments on several real-world datasets in text classification.

conference on recommender systems | 2012

Constrained collective matrix factorization

Yu-Jia Huang; Evan Wei Xiang; Rong Pan

Transfer learning for collaborative filtering (TLCF) aims to solve the sparsity problem by transferring rating knowledge across multiple domains. Taking domain difference into account, one of the issues in cross-domain collaborative filtering is to selectively transfer knowledge from source/auxiliary domains. In particular, this paper addresses the problem of inconstant users (users with changeable preferences across different domains) when transferring knowledge about users from another auxiliary domain. We first formulate the problem of inconstant users caused by domain difference and then propose a new model that performs constrained collective matrix factorization (CCMF). Our experiments on simulated and real data show that CCMF outperforms other methods.

Proteomics | 2011

Transferring network topological knowledge for predicting protein-protein interactions

Qian Xu; Evan Wei Xiang; Qiang Yang

Protein–protein interactions (PPIs) play an important role in cellular processes within a cell. An important task is to determine the existence of interactions among proteins. Unfortunately, the existing biological experimental techniques are expensive, time‐consuming and labor‐intensive. The network structures of many such networks are sparse, incomplete and noisy. Thus, state‐of‐the‐art methods for link prediction in these networks often cannot give satisfactory prediction results, especially when some networks are extremely sparse. Noticing that we typically have more than one PPI network available, we naturally wonder whether it is possible to ‘transfer’ the linkage knowledge from some existing, relatively dense networks to a sparse network, to improve the prediction performance. Noticing that a network structure can be modeled using a matrix model, we introduce the well‐known collective matrix factorization technique to ‘transfer’ usable linkage knowledge from relatively dense interaction network to a sparse target network. Our approach is to establish a correspondence between a source network and a target network via network‐wide similarities. We test this method on two real PPI networks, Helicobacter pylori (as a target network) and human (as a source network). Our experimental results show that our method can achieve higher performance as compared with some baseline methods.

international conference on data mining | 2009

Knowledge Transfer among Heterogeneous Information Networks

Evan Wei Xiang; Nathan Nan Liu; Sinno Jialin Pan; Qiang Yang

Online recommendation systems are becoming more and more popular with the development of the web. However, a critical problem of such systems is that new users and items are continually added over time. How to overcome the data sparseness for such newly arriving entities becomes an important issue. In this paper, we try to reduce the data sparseness in the link prediction problem by using heterogeneous information networks as auxiliary information sources. We developed two models based on the Collective Matrix Factorization (CMF) framework. We also provided a detailed empirical study on how effectively different information networks could help with two real-world link prediction tasks. We will report some preliminary results of our current work and also point out several potential research issues.
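For readers unfamiliar with collective matrix factorization, the sketch below shows the core idea under simple assumptions: several relations (for example user-item and user-tag) are factorized together, and every entity type keeps one factor table that is shared by all relations it takes part in. This is a generic illustration, not either of the two models from the paper; the entity type names, the per-relation weights, and the squared-error SGD trainer are hypothetical choices.

```python
import random


def train_cmf(relations, sizes, k=4, lr=0.05, reg=0.02, epochs=300, seed=0):
    """Jointly factorize several relations that share entity types.

    relations: list of (row_type, col_type, weight, triples), where
               triples is a list of (row_index, col_index, value)
    sizes:     dict mapping entity type -> number of entities
    Returns a dict mapping entity type -> list of k-dimensional factors.
    """
    rng = random.Random(seed)
    F = {t: [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]
         for t, n in sizes.items()}
    samples = [(rt, ct, w, r, c, v)
               for rt, ct, w, triples in relations
               for r, c, v in triples]
    for _ in range(epochs):
        rng.shuffle(samples)
        for rt, ct, w, r, c, v in samples:
            a, b = F[rt][r], F[ct][c]  # shared tables, updated in place
            err = w * (v - sum(x * y for x, y in zip(a, b)))
            for f in range(k):
                af, bf = a[f], b[f]
                a[f] += lr * (err * bf - reg * af)
                b[f] += lr * (err * af - reg * bf)
    return F


def cmf_loss(F, relations):
    """Weighted squared reconstruction error summed over all relations."""
    total = 0.0
    for rt, ct, w, triples in relations:
        for r, c, v in triples:
            pred = sum(x * y for x, y in zip(F[rt][r], F[ct][c]))
            total += w * (v - pred) ** 2
    return total
```

Since the "user" table in such a setup is updated by every relation involving users, a new user with few links in the target relation can still receive informative factors from an auxiliary relation, which is the sparsity-reduction effect the abstract describes.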

national conference on artificial intelligence | 2010