Shuang-Hong Yang | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Shuang-Hong Yang is active.

Explore More

Publication

Featured researches published by Shuang-Hong Yang.

international world wide web conferences | 2011

Like like alike: joint friendship and interest propagation in social networks

Shuang-Hong Yang; Bo Long; Alexander J. Smola; Narayanan Sadagopan; Zhaohui Zheng; Hongyuan Zha

Targeting interest to match a user with services (e.g. news, products, games, advertisements) and predicting friendship to build connections among users are two fundamental tasks for social network systems. In this paper, we show that the information contained in interest networks (i.e. user-service interactions) and friendship networks (i.e. user-user connections) is highly correlated and mutually helpful. We propose a framework that exploits homophily to establish an integrated network linking a user to interested services and connecting different users with common interests, upon which both friendship and interests could be efficiently propagated. The proposed friendship-interest propagation (FIP) framework devises a factor-based random walk model to explain friendship connections, and simultaneously it uses a coupled latent factor model to uncover interest interactions. We discuss the flexibility of the framework in the choices of loss objectives and regularization penalties and benchmark different variants on the Yahoo! Pulse social networking system. Experiments demonstrate that by coupling friendship with interest, FIP achieves much higher performance on both interest targeting and friendship prediction than systems using only one source of information.

international acm sigir conference on research and development in information retrieval | 2011

Functional matrix factorizations for cold-start recommendation

Ke Zhou; Shuang-Hong Yang; Hongyuan Zha

A key challenge in recommender system research is how to effectively profile new users, a problem generally known as cold-start recommendation. Recently the idea of progressively querying user responses through an initial interview process has been proposed as a useful new user preference elicitation strategy. In this paper, we present functional matrix factorization (fMF), a novel cold-start recommendation method that solves the problem of initial interview construction within the context of learning user and item profiles. Specifically, fMF constructs a decision tree for the initial interview with each node being an interview question, enabling the recommender to query a user adaptively according to her prior responses. More importantly, we associate latent profiles for each node of the tree --- in effect restricting the latent profiles to be a function of possible answers to the interview questions --- which allows the profiles to be gradually refined through the interview process based on user responses. We develop an iterative optimization algorithm that alternates between decision tree construction and latent profiles extraction as well as a regularization scheme that takes into account of the tree structure. Experimental results on three benchmark recommendation data sets demonstrate that the proposed fMF algorithm significantly outperforms existing methods for cold-start recommendation.

Mining Text Data | 2012

Dimensionality Reduction and Topic Modeling: From Latent Semantic Indexing to Latent Dirichlet Allocation and Beyond

Steven P. Crain; Ke Zhou; Shuang-Hong Yang; Hongyuan Zha

The bag-of-words representation commonly used in text analysis can be analyzed very efficiently and retains a great deal of useful information, but it is also troublesome because the same thought can be expressed using many different terms or one term can have very different meanings. Dimension reduction can collapse together terms that have the same semantics, to identify and disambiguate terms with multiple meanings and to provide a lower-dimensional representation of documents that reflects concepts instead of raw terms. In this chapter, we survey two influential forms of dimension reduction. Latent semantic indexing uses spectral decomposition to identify a lower-dimensional representation that maintains semantic properties of the documents. Topic modeling, including probabilistic latent semantic indexing and latent Dirichlet allocation, is a form of dimension reduction that uses a probabilistic model to find the co-occurrence patterns of terms that correspond to semantic topics in a collection of documents. We describe the basic technologies in detail and expose the underlying mechanism. We also discuss recent advances that have made it possible to apply these techniques to very large and evolving text collections and to incorporate network structure or other contextual information.

european conference on machine learning | 2009

Variational Graph Embedding for Globally and Locally Consistent Feature Extraction

Shuang-Hong Yang; Hongyuan Zha; S. Kevin Zhou; Bao-Gang Hu

Existing feature extraction methods explore either global statistical or local geometric information underlying the data. In this paper, we propose a general framework to learn features that account for both types of information based on variational optimization of nonparametric learning criteria. Using mutual information and Bayes error rate as example criteria, we show that high-quality features can be learned from a variational graph embedding procedure, which is solved through an iterative EM-style algorithm where the E-Step learns a variational affinity graph and the M-Step in turn embeds this graph by spectral analysis. The resulting feature learner has several appealing properties such as maximum discrimination , maximum-relevance- minimum-redundancy and locality-preserving . Experiments on benchmark face recognition data sets confirm the effectiveness of our proposed algorithms.

conference on information and knowledge management | 2010

Language pyramid and multi-scale text analysis

Shuang-Hong Yang; Hongyuan Zha

The classical Bag-of-Word (BOW) model represents a document as a histogram of word occurrence, losing the spatial information that is invaluable for many text analysis tasks. In this paper, we present the Language Pyramid (LaP) model, which casts a document as a probabilistic distribution over the joint semantic-spatial space and motivates a multi-scale 2D local smoothing framework for nonparametric text coding. LaP efficiently encodes both semantic and spatial contents of a document into a pyramid of matrices that are smoothed both semantically and spatially at a sequence of resolutions, providing a convenient multi-scale imagic view for natural language understanding. The LaP representation can be used in text analysis in a variety of ways, among which we investigate two instantiations in the current paper: (1) multi-scale text kernels for document categorization, and (2) multi-scale language models for ad hoc text retrieval. Experimental results illustrate that: for classification, LaP outperforms BOW by (up to) 4% on moderate-length texts (RCV1 text benchmark) and 15% on short texts (Yahoo! queries); and for retrieval, LaP gains 12% MAP improvement over uni-gram language models on the OHSUMED data set.

international acm sigir conference on research and development in information retrieval | 2011