Suhang Wang | Researchain

Publication

Featured researches published by Suhang Wang.

ACM Computing Surveys | 2017

Feature Selection: A Data Perspective

Jundong Li; Kewei Cheng; Suhang Wang; Fred Morstatter; Robert P. Trevino; Jiliang Tang; Huan Liu

Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data-mining and machine-learning problems. The objectives of feature selection include building simpler and more comprehensible models, improving data-mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented some substantial challenges and opportunities to feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based, and statistical-based methods. To facilitate and promote the research in this community, we also present an open source feature selection repository that consists of most of the popular feature selection algorithms (http://featureselection.asu.edu/). Also, we use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we present a discussion about some open problems and challenges that require more attention in future research.
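As a small illustration of scoring features for selection, the sketch below computes the Fisher score, a classic supervised criterion that is commonly grouped with similarity-based methods. It is not taken from the survey's repository: the function name `fisher_score`, the synthetic data, and the use of NumPy are assumptions made for this example.

```python
import numpy as np

def fisher_score(X, y):
    """Per-feature ratio of between-class scatter to within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    # Small constant guards against division by zero for constant features.
    return between / (within + 1e-12)

# Hypothetical toy data: column 0 is pure noise, column 1 tracks the label.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
informative = 3.0 * y + rng.normal(scale=0.5, size=100)
noise = rng.normal(size=100)
X = np.column_stack([noise, informative])

scores = fisher_score(X, y)
ranking = np.argsort(scores)[::-1]  # feature indices, best first
```

Features with a higher score separate the classes better, so a selector would keep the top-ranked indices in `ranking`.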

Neurocomputing | 2015

Discriminative graph regularized extreme learning machine and its application to face recognition

Yong Peng; Suhang Wang; Xianzhong Long; Bao-Liang Lu

Extreme Learning Machine (ELM) has been proposed as a new algorithm for training single-hidden-layer feedforward neural networks. The main merit of ELM lies in the fact that the input weights and hidden-layer biases are randomly generated, so the output weights can be obtained analytically; this overcomes drawbacks of gradient-based training algorithms such as local optima, improper learning rates and low learning speed. Based on the consistency property of data, which enforces similar samples to share similar properties, we propose a discriminative graph regularized Extreme Learning Machine (GELM) to further enhance its classification performance. In the proposed GELM model, the label information of the training samples is used to construct an adjacency graph, and a graph regularization term is correspondingly formulated to constrain the output weights to learn similar outputs for samples from the same class. Like the standard ELM, the proposed GELM model has a closed-form solution, and thus the output weights can be obtained efficiently. Experiments on several widely used face databases show that the proposed GELM achieves a substantial performance gain over standard ELM and regularized ELM. Moreover, GELM also performs well when compared with state-of-the-art classification methods for face recognition.
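The closed-form training described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tanh activation, the ridge term, the unnormalized same-class Laplacian, the parameter names (`ridge`, `graph_weight`), and the toy data are all choices made for this example, and the exact objective in the paper may differ.

```python
import numpy as np

def hidden_layer(X, W, b):
    # Random-feature hidden layer; tanh is an assumed activation.
    return np.tanh(X @ W + b)

def class_laplacian(labels):
    # Graph that connects samples sharing a label (assumed weighting: 0/1).
    A = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(A, 0.0)
    return np.diag(A.sum(axis=1)) - A

def train_elm(X, T, labels=None, n_hidden=20, ridge=1e-2,
              graph_weight=0.0, seed=0):
    rng = np.random.default_rng(seed)
    # Input weights and biases are drawn at random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = hidden_layer(X, W, b)
    # Minimize ||H beta - T||^2 + ridge ||beta||^2
    #          + graph_weight * tr(beta^T H^T L H beta) in closed form.
    system = H.T @ H + ridge * np.eye(n_hidden)
    if labels is not None and graph_weight > 0:
        system += graph_weight * (H.T @ class_laplacian(labels) @ H)
    beta = np.linalg.solve(system, H.T @ T)
    return W, b, beta

def predict(X, W, b, beta):
    return np.argmax(hidden_layer(X, W, b) @ beta, axis=1)

# Hypothetical toy problem: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, size=(40, 2)),
               rng.normal(2.0, 0.5, size=(40, 2))])
y = np.repeat([0, 1], 40)
T = np.eye(2)[y]  # one-hot targets

W, b, beta = train_elm(X, T)
acc_plain = np.mean(predict(X, W, b, beta) == y)

Wg, bg, beta_g = train_elm(X, T, labels=y, graph_weight=0.1)
acc_graph = np.mean(predict(Xg := X, Wg, bg, beta_g) == y)
```

Only the output weights `beta` are learned, through a single linear solve; the graph term adds a penalty that pulls the outputs of same-class samples toward each other without giving up the closed form.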

SIGKDD Explorations | 2017

User Identity Linkage across Online Social Networks: A Review

Kai Shu; Suhang Wang; Jiliang Tang; Reza Zafarani; Huan Liu

The increasing popularity and diversity of social media sites have encouraged more and more people to participate in multiple online social networks to enjoy their services. Each user may create a user identity, which can include profile, content, or network information, to represent his or her unique public figure in every social network. Thus, a fundamental question arises: can we link user identities across online social networks? User identity linkage across online social networks is an emerging task in social media and has attracted increasing attention in recent years. Advancements in user identity linkage could potentially impact various domains such as recommendation and link prediction. Due to the unique characteristics of social network data, this problem faces tremendous challenges. To tackle these challenges, recent approaches generally consist of (1) extracting features and (2) constructing predictive models from a variety of perspectives. In this paper, we review key achievements of user identity linkage across online social networks, including state-of-the-art algorithms, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for user identity linkage across online social networks.

Neural Networks | 2015

Enhanced low-rank representation via sparse manifold adaption for semi-supervised learning

Yong Peng; Bao-Liang Lu; Suhang Wang

Constructing an informative and discriminative graph plays an important role in various pattern recognition tasks such as clustering and classification. Among the existing graph-based learning models, low-rank representation (LRR) is a very competitive one, which has been extensively employed in spectral clustering and semi-supervised learning (SSL). In SSL, the graph is composed of both labeled and unlabeled samples, where the edge weights are calculated based on the LRR coefficients. However, most existing LRR-related approaches fail to consider the geometrical structure of data, which has been shown to be beneficial for discriminative tasks. In this paper, we propose an enhanced LRR via sparse manifold adaption, termed manifold low-rank representation (MLRR), to learn low-rank data representation. MLRR explicitly takes the local manifold structure of the data into consideration, which can be identified by the geometric sparsity idea; specifically, the local tangent space of each data point is sought by solving a sparse representation objective. The graph depicting the relationships among data points can therefore be built once the manifold information is obtained. We incorporate a regularizer into LRR to make the learned coefficients preserve the geometric constraints revealed in the data space. As a result, MLRR combines both the global information emphasized by the low-rank property and the local information emphasized by the identified manifold structure. Extensive experimental results on semi-supervised classification tasks demonstrate that MLRR compares favorably with several state-of-the-art graph construction approaches.

Conference on Information and Knowledge Management | 2016

Linked Document Embedding for Classification

Suhang Wang; Jiliang Tang; Charu C. Aggarwal; Huan Liu

Word and document embedding algorithms such as Skip-gram and Paragraph Vector have been proven to help various text analysis tasks such as document classification, document clustering and information retrieval. The vast majority of these algorithms are designed to work with independent and identically distributed documents. However, in many real-world applications, documents are inherently linked. For example, web documents such as blogs and online news often have hyperlinks to other web documents, and scientific articles usually cite other articles. Linked documents present new challenges to traditional document embedding algorithms. In addition, most existing document embedding algorithms are unsupervised and their learned representations may not be optimal for classification when labeling information is available. In this paper, we study the problem of linked document embedding for classification and propose a linked document embedding framework LDE, which combines link and label information with content information to learn document representations for classification. Experimental results on real-world datasets demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of link and label information in the proposed framework LDE.

SIAM International Conference on Data Mining | 2016

Exploiting emotional information for trust/distrust prediction

Ghazaleh Beigi; Jiliang Tang; Suhang Wang; Huan Liu

Trust and distrust networks are usually extremely sparse, and the vast majority of existing algorithms for trust/distrust prediction suffer from the data sparsity problem. In this paper, following research from psychology and sociology, we envision that users' emotions such as happiness and anger are strong indicators of trust/distrust relations. Meanwhile, the popularity of social media encourages an increasing number of users to freely express their emotions; hence emotional information is pervasively available and usually denser than the trust and distrust relations. Therefore, incorporating emotional information has the potential to alleviate data sparsity in the problem of trust/distrust prediction. In this study, we investigate how to exploit emotional information for trust/distrust prediction. In particular, we provide a principled way to capture emotional information mathematically and propose a novel trust/distrust prediction framework, ETD. Experimental results on a real-world social media dataset demonstrate the effectiveness of the proposed framework and the importance of emotional information in trust/distrust prediction.

Cognitive Computation | 2017

Learning Word Representations for Sentiment Analysis

Yang Li; Quan Pan; Tao Yang; Suhang Wang; Jiliang Tang; Erik Cambria

Word embedding has been proven to be a useful model for various natural language processing tasks. Traditional word embedding methods merely take into account word distributions independently of any specific task. Hence, the resulting representations could be sub-optimal for a given task. In the context of sentiment analysis, there are various types of prior knowledge available, e.g., sentiment labels of documents from available datasets or polarity values of words from sentiment lexicons. We incorporate such prior sentiment information at both the word level and the document level in order to investigate the influence each word has on the sentiment label of both the target word and the context words. By evaluating the performance of sentiment analysis in each category, we find the best way of incorporating prior sentiment information. Experimental results on real-world datasets demonstrate that the word representations learnt by DLJT2 can significantly improve sentiment analysis performance. We show that incorporating prior sentiment knowledge into the embedding process has the potential to learn better representations for sentiment analysis.

Conference on Information and Knowledge Management | 2015

Toward Dual Roles of Users in Recommender Systems

Suhang Wang; Jiliang Tang; Huan Liu

Users usually play dual roles in real-world recommender systems. One is as a reviewer who writes reviews for items with rating scores, and the other is as a rater who rates the helpfulness of reviews. Traditional recommender systems mainly consider the reviewer role and do not take the rater role into account. However, the rater role allows users to express their opinions toward reviews about items; hence it may indirectly indicate their opinions about items, which could be complementary to the reviewer role. Since most real-world recommender systems provide convenient mechanisms for the rater role, recent studies show that there are typically many more helpfulness ratings from the rater role than item ratings from the reviewer role. Therefore, incorporating the rater role of users may have the potential to mitigate the data sparsity and cold-start problems in traditional recommender systems. In this paper, we investigate how to exploit dual roles of users in recommender systems. In particular, we provide a principled way to exploit the rater role mathematically and propose a novel recommender system, DualRec, which captures both the reviewer role and the rater role of users simultaneously for recommendation. Experimental results on two real-world datasets demonstrate the effectiveness of the proposed framework, and further experiments are conducted to understand the importance of the rater role of users in recommendation.

Conference on Information and Knowledge Management | 2017

Attributed Signed Network Embedding

Suhang Wang; Charu C. Aggarwal; Jiliang Tang; Huan Liu

The major task of network embedding is to learn low-dimensional vector representations of social-network nodes. It facilitates many analytical tasks such as link prediction and node clustering and thus has attracted increasing attention. The majority of existing embedding algorithms are designed for unsigned social networks. However, many social media networks have both positive and negative links, for which unsigned algorithms have little utility. Recent findings in signed network analysis suggest that negative links have distinct properties and added value over positive links. This brings about both challenges and opportunities for signed network embedding. In addition, user attributes, which encode properties and interests of users, provide complementary information to network structures and have the potential to improve signed network embedding. Therefore, in this paper, we study the novel problem of signed social network embedding with attributes. We propose a novel framework SNEA, which exploits the network structure and user attributes simultaneously for network representation learning. Experimental results on link prediction and node clustering with real-world datasets demonstrate the effectiveness of SNEA.

Web Search and Data Mining | 2018

CrossFire: Cross Media Joint Friend and Item Recommendations

Kai Shu; Suhang Wang; Jiliang Tang; Yilin Wang; Huan Liu

Friend and item recommendation on a social media site is an important task, which not only brings convenience to users but also benefits platform providers. However, recommendation for newly launched social media sites is challenging because they often lack historical user data and encounter data sparsity and cold-start problems. Thus, it is important to exploit auxiliary information to help improve recommendation performance on these sites. Existing approaches try to utilize knowledge transferred from other mature sites, which often require overlapped users or similar items to ensure an effective knowledge transfer. However, these assumptions may not hold in practice because 1) an overlapped user set is often unavailable and costly to identify due to the heterogeneous user profile, content and network data, and 2) different schemes for showing item attributes across sites make the attribute values inconsistent, incomplete, and noisy. Thus, how to transfer knowledge when no direct bridge is given between two social media sites remains a challenge. In addition, another source of auxiliary information we can exploit is the mutual benefit between social relationships and rating preferences within the platform. User-user relationships are widely used as side information to improve item recommendation, whereas work on exploiting user-item interactions for friend recommendation is rather limited. To tackle these challenges, we propose a Cross media joint Friend and Item Recommendation framework (CrossFire), which can capture both 1) cross-platform knowledge transfer, and 2) within-platform correlations among user-user relations and user-item interactions. Empirical results on real-world datasets demonstrate the effectiveness of the proposed framework.
