Jiangchao Yao
Shanghai Jiao Tong University
Publication
Featured research published by Jiangchao Yao.
IEEE Transactions on Multimedia | 2018
Jiangchao Yao; Yanfeng Wang; Ya Zhang; Jun Sun; Jun Zhou
Social tags, serving as a textual source of simple but useful semantic metadata that reflects user preferences or describes web objects, have been widely used in many applications. However, social tags have several unique characteristics, namely sparseness and data coupling (i.e., non-IIDness), which make existing text analysis methods such as LDA not directly applicable. In this paper, we propose a new generative algorithm for social tag analysis named joint latent Dirichlet allocation, which models the generation of tags based on both the users and the objects, and thus accounts for the coupling relationships among social tags. The model introduces two latent factors that jointly influence tag generation: the users' latent interest factor and the objects' latent topic factor, formulated as a user-topic distribution matrix and an object-topic distribution matrix, respectively. A Gibbs sampling approach is adopted to simultaneously infer these two matrices as well as a topic-word distribution matrix. Experimental results on four social tagging datasets show that our model captures more reasonable topics and achieves better performance than five state-of-the-art topic models in terms of the widely used point-wise mutual information metric. In addition, we analyze the learnt topics, showing that our model recovers more themes from social tags while LDA may lead to the topic vanishing problem, and demonstrate its advantages in social recommendation by evaluating the retrieval results with the mean reciprocal rank metric. Finally, we explore the joint procedure of our model in depth to show the non-IID characteristic of the social tagging process.
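The mean reciprocal rank metric mentioned in the abstract above can be illustrated with a small worked example. This is a minimal sketch of the standard definition (the average, over queries, of 1/rank of the first relevant item), not the paper's evaluation code; the function name `mean_reciprocal_rank` and the toy rankings are hypothetical.

```python
from typing import Hashable, Sequence, Set


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    relevant: Sequence[Set[Hashable]],
) -> float:
    """Average of 1/rank of the first relevant item per query.

    A query whose ranking contains no relevant item contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranked_items, relevant_items in zip(rankings, relevant):
        for position, item in enumerate(ranked_items, start=1):
            if item in relevant_items:
                total += 1.0 / position
                break  # only the first relevant hit counts
    return total / len(rankings)


# Toy data (hypothetical): three queries, each with a ranked list of objects.
rankings = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant = [{"b"}, {"d"}, {"z"}]

# Reciprocal ranks are 1/2, 1/1 and 0, so the mean is 0.5.
mrr = mean_reciprocal_rank(rankings, relevant)
print(mrr)
```

A higher value means the first relevant object tends to appear nearer the top of the retrieved list.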
international conference on multimedia and expo | 2015
Jiangchao Yao; Ya Zhang; Zhe Xu; Jun Sun; Jun Zhou; Xiao Gu
Topic models have been widely used for analyzing text corpora and have achieved great success in applications including content organization and information retrieval. However, unlike traditional text data, social tags in web containers are usually few in number, unordered, and non-iid, i.e., they are highly dependent on contextual information such as users and objects. Considering these specific characteristics of social tags, we introduce a new model named Joint Latent Dirichlet Allocation (JLDA) to capture the relationships among users, objects, and tags. The model assumes that the latent topics of users and those of objects jointly influence the generation of tags. The latent distributions are then inferred with Gibbs sampling. Experiments on two social tag data sets demonstrate that the model achieves a lower predictive error and generates more reasonable topics. We also present an application of this model to object recommendation.
conference on multimedia modeling | 2017
Jiangchao Yao; Ya Zhang; Ivor W. Tsang; Jun Sun
The last decades have witnessed the boom of social networks. As a result, discovering user interests from social media has gained increasing attention. While the accumulation of social media presents us with great opportunities for a better understanding of the users, the challenge lies in how to build a uniform model for the heterogeneous contents. In this article, we propose a hybrid mixture model for user interest discovery which exploits both the textual and visual content associated with social images. By modeling the features of each content source independently at the latent variable level and unifying them as latent interests, the proposed model allows the semantic interpretation of user interests from both the visual and textual perspectives. Qualitative and quantitative experiments on a Flickr dataset with 2.54 million images have demonstrated its promise for user interest analysis compared with existing methods.
conference on multimedia modeling | 2017
Huangjie Zheng; Jiangchao Yao; Ya Zhang
Images play an important role in providing a comprehensive understanding of our physical world. When thinking of a tourist city, one can immediately imagine pictures of its famous attractions. With the boom of social images, we attempt to explore the possibility of describing the geographical characteristics of different regions. We propose a Geographical Latent Attribute Model (GLAM) to mine regional characteristics from social images, which is expected to provide a comprehensive view of the regions. The model assumes that a geographical region consists of different “attributes” (e.g., infrastructures, attractions, events and activities) and that “attributes” are interpreted by different image “clusters”. Both “attributes” and image “clusters” are modeled as latent variables. The experimental analysis on a collection of 2.5M Flickr photos of Chinese provinces and cities has shown that the proposed model is promising in describing regional characteristics. Moreover, we demonstrate the usefulness of the proposed model for place recommendation.
international conference on machine learning and applications | 2015
Xiaoyu Chen; Jiangchao Yao; Yanfeng Wang; Ya Zhang
Collective Latent Dirichlet Allocation (C-LDA) was proposed as an extension of LDA to simultaneously model multiple corpora from different domains in order to overcome the bias of an individual corpus. However, with large volumes of document collections from various sources, it becomes challenging to achieve fast convergence for C-LDA. The high time complexity of C-LDA limits its application to real-world tasks. Online learning, however, has shown promise for speeding up the convergence of LDA. In this paper, we propose to explore online learning for collective LDA (OVCLDA). We first develop an efficient variational inference algorithm for collective LDA and then extend it to the online learning framework. We perform experiments with various real-world corpora. Experimental results show that OVCLDA learns topics comparable to those of C-LDA and better than those of Online LDA, while achieving computational efficiency comparable to Online LDA and much higher than C-LDA.
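To make the online learning idea in the abstract above concrete, the sketch below shows the step-size schedule and blended parameter update commonly used in online variational inference for LDA. Whether OVCLDA uses exactly this schedule is an assumption; the names `step_size`, `online_update`, `tau0`, and `kappa`, and the toy parameter values, are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def step_size(t: int, tau0: float = 1.0, kappa: float = 0.7) -> float:
    """Decaying learning rate rho_t = (tau0 + t) ** (-kappa).

    Assumption: kappa in (0.5, 1], the usual condition for convergence
    of stochastic updates of this form.
    """
    return (tau0 + t) ** (-kappa)


def online_update(lam: Matrix, lam_hat: Matrix, t: int,
                  tau0: float = 1.0, kappa: float = 0.7) -> Matrix:
    """Blend the current topic parameters with a mini-batch estimate.

    lam     : current variational topic parameters (topics x vocabulary)
    lam_hat : estimate computed from the mini-batch seen at step t
    """
    rho = step_size(t, tau0, kappa)
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row, row_hat)]
        for row, row_hat in zip(lam, lam_hat)
    ]


# Toy values (hypothetical): 2 topics over a 3-word vocabulary.
lam = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
lam_hat = [[3.0, 3.0, 3.0], [3.0, 3.0, 3.0]]

first = online_update(lam, lam_hat, t=0)     # rho = 1, so the estimate replaces lam
later = online_update(lam, lam_hat, t=1000)  # small rho, so lam barely moves
```

Because the step size shrinks as more mini-batches are processed, early batches move the parameters strongly while later ones only refine them, which is what allows a single pass over a large collection.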
neural information processing systems | 2018
Bo Han; Jiangchao Yao; Gang Niu; Mingyuan Zhou; Ivor W. Tsang; Ya Zhang; Masashi Sugiyama
arXiv: Machine Learning | 2018
Huangjie Zheng; Jiangchao Yao; Ya Zhang; Ivor W. Tsang
arXiv: Machine Learning | 2018
Huangjie Zheng; Jiangchao Yao; Ya Zhang; Ivor W. Tsang
arXiv: Learning | 2018
Bo Han; Gang Niu; Jiangchao Yao; Xingrui Yu; Miao Xu; Ivor W. Tsang; Masashi Sugiyama
arXiv: Learning | 2018
Jiangchao Yao; Ivor W. Tsang; Ya Zhang
