Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jialie Shen is active.

Publication


Featured research published by Jialie Shen.


IEEE Transactions on Image Processing | 2013

Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search

Yue Gao; Meng Wang; Zheng-Jun Zha; Jialie Shen; Xuelong Li; Xindong Wu

Due to the popularity of social media websites, extensive research efforts have been dedicated to tag-based social image search. Both visual information and tags have been investigated in this research field. However, most existing methods use tags and visual characteristics either separately or sequentially in order to estimate the relevance of images. In this paper, we propose an approach that simultaneously utilizes both visual and textual information to estimate the relevance of user-tagged images. The relevance estimation is determined with a hypergraph learning approach. In this method, a social image hypergraph is constructed, where vertices represent images and hyperedges represent visual or textual terms. Learning is achieved with the use of a set of pseudo-positive images, where the weights of hyperedges are updated throughout the learning process. In this way, the impact of different tags and visual words can be automatically modulated. Comparative results of experiments conducted on a dataset including 370+ images are presented, which demonstrate the effectiveness of the proposed approach.
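
To make the hypergraph idea concrete, here is a minimal sketch of relevance propagation on an image hypergraph, assuming a standard Zhou-style normalized hypergraph Laplacian rather than the authors' exact formulation; the incidence matrix, seed indices, and fixed hyperedge weights are placeholder assumptions.

```python
# Minimal sketch of hypergraph-based relevance propagation (not the authors'
# exact formulation): images are vertices, tags/visual words are hyperedges,
# and relevance is propagated from a few pseudo-positive seed images.
import numpy as np

def hypergraph_relevance(H, w, seeds, alpha=0.9, iters=50):
    """H: (n_images, n_edges) binary incidence matrix.
    w: (n_edges,) hyperedge weights (fixed here; the paper updates them).
    seeds: indices of pseudo-positive images."""
    n_images = H.shape[0]
    Dv = H @ w                                  # vertex degrees
    De = H.sum(axis=0)                          # hyperedge degrees
    Dv_inv = np.diag(1.0 / np.sqrt(np.maximum(Dv, 1e-12)))
    De_inv = np.diag(1.0 / np.maximum(De, 1e-12))
    W = np.diag(w)
    # Normalized hypergraph adjacency (Zhou et al.-style construction)
    Theta = Dv_inv @ H @ W @ De_inv @ H.T @ Dv_inv
    y = np.zeros(n_images)
    y[seeds] = 1.0                              # pseudo-positive labels
    f = y.copy()
    for _ in range(iters):                      # label propagation
        f = alpha * Theta @ f + (1 - alpha) * y
    return f                                    # per-image relevance scores
```

In the paper the hyperedge weights are learned jointly with the relevance scores; the sketch keeps them fixed for brevity.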


IEEE Transactions on Multimedia | 2014

Representative Discovery of Structure Cues for Weakly-Supervised Image Segmentation

Luming Zhang; Yue Gao; Yingjie Xia; Ke Lu; Jialie Shen; Rongrong Ji

Weakly-supervised image segmentation is a challenging problem with multidisciplinary applications in multimedia content analysis and beyond. It aims to segment an image by leveraging its image-level semantics (i.e., tags). This paper presents a weakly-supervised image segmentation algorithm that learns the distribution of spatially structural superpixel sets from image-level labels. More specifically, we first extract graphlets from a given image; these are small graphs consisting of superpixels that encapsulate their spatial structure. Then, an efficient manifold embedding algorithm is proposed to transfer labels from training images onto graphlets. We further observe that many graphlets are redundant and not discriminative with respect to semantic categories; these are abandoned by a graphlet selection scheme, since they make no contribution to the subsequent segmentation. Thereafter, we use a Gaussian mixture model (GMM) to learn the distribution of the selected post-embedding graphlets (i.e., the vectors output by the graphlet embedding). Finally, we propose an image segmentation algorithm, termed representative graphlet cut, which leverages the learned GMM prior in order to measure the structural homogeneity of a test image. Experimental results show that the proposed approach outperforms state-of-the-art weakly-supervised image segmentation methods on five popular segmentation data sets. Moreover, our approach performs competitively with fully-supervised segmentation models.
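
As an illustration of the GMM step only, the sketch below fits a Gaussian mixture over already-embedded graphlet vectors and scores test graphlets by likelihood; the graphlet extraction and manifold embedding are not shown, and the feature vectors, dimensions, and component count are synthetic assumptions.

```python
# Sketch of the GMM step only (assumes graphlets have already been embedded
# into fixed-length vectors; extraction/embedding/selection are not shown).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
train_graphlets = rng.normal(size=(5000, 32))   # placeholder post-embedding vectors

gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
gmm.fit(train_graphlets)

test_graphlets = rng.normal(size=(200, 32))     # graphlets from one test image
# Higher log-likelihood = structure more consistent with the learned prior;
# such scores could guide a representative-graphlet-cut-style segmentation.
scores = gmm.score_samples(test_graphlets)
print(scores.mean())
```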


Computer Vision and Pattern Recognition | 2010

Weakly-supervised hashing in kernel space

Yadong Mu; Jialie Shen; Shuicheng Yan

The explosive growth of vision data motivates recent studies on efficient data indexing methods such as locality-sensitive hashing (LSH). Most existing approaches perform hashing in an unsupervised way. In this paper we move one step forward and propose a supervised hashing method, the LAbel-regularized Max-margin Partition (LAMP) algorithm. The proposed method generates hash functions in a weakly-supervised setting, where a small portion of sample pairs are manually labeled as “similar” or “dissimilar”. We formulate the task as a Constrained Convex-Concave Procedure (CCCP), which can be relaxed into a series of convex sub-problems solvable with efficient Quadratic Programming (QP). The proposed hashing method has two further characteristics. 1) Most existing LSH approaches rely on linear feature representations; however, kernel tricks are often more natural for gauging the similarity between visual objects in vision research, which corresponds to possibly infinite-dimensional Hilbert spaces. The proposed LAMP algorithm naturally supports kernel-based feature representations. 2) Traditional hashing methods assume uniform data distributions; typically, the collision probability of two samples in hash buckets is determined only by their pairwise similarity, irrespective of the contextual data distribution. In contrast, we provide a collision bound that goes beyond pairwise data interaction, based on Markov random field theory. Extensive empirical evaluations are conducted on five widely-used benchmarks. It takes only several seconds to generate a new hash function, and the adopted random supporting-vector scheme makes the LAMP algorithm scalable to large-scale problems. Experimental results validate the superiority of the LAMP algorithm over state-of-the-art kernel-based hashing methods.
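
For intuition on hashing in kernel space, here is a toy sketch that builds binary codes from an RBF kernel over randomly drawn supporting samples; it is not the LAMP CCCP/QP solver, uses no pairwise labels, and all data, kernel parameters, and projection choices are illustrative assumptions.

```python
# Illustrative kernelized hashing (not the LAMP CCCP/QP solver):
# map data onto randomly drawn supporting samples via an RBF kernel,
# then threshold random projections of those kernel features into bits.
import numpy as np

def rbf_kernel(X, Z, gamma=0.5):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))                      # toy feature vectors
support = X[rng.choice(len(X), 64, replace=False)]   # random supporting vectors
K = rbf_kernel(X, support)                           # (1000, 64) kernel features

n_bits = 32
W = rng.normal(size=(64, n_bits))                    # random projections in kernel space
bias = np.median(K @ W, axis=0)                      # balance each bit
codes = (K @ W > bias).astype(np.uint8)              # binary hash codes
print(codes.shape)                                   # (1000, 32)
```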


IEEE Transactions on Circuits and Systems for Video Technology | 2008

Bayesian Tensor Approach for 3-D Face Modeling

Dacheng Tao; Mingli Song; Xuelong Li; Jialie Shen; Jimeng Sun; Xindong Wu; Christos Faloutsos; Stephen J. Maybank

Effectively modeling a collection of three-dimensional (3-D) faces is an important task in various applications, especially facial expression-driven ones, e.g., expression generation, retargeting, and synthesis. These 3-D faces naturally form a set of second-order tensors, with one modality for identity and the other for expression. The number of these second-order tensors is three times the number of vertices used for 3-D face modeling. As for algorithms, Bayesian data modeling, which is a natural data analysis tool, has been widely applied with great success; however, it works only for vector data. Therefore, there is a gap between tensor-based representations and vector-based data analysis tools. Aiming to bridge this gap and generalize conventional statistical tools to tensors, this paper proposes a decoupled probabilistic algorithm named Bayesian tensor analysis (BTA). Theoretically, BTA can automatically determine suitable dimensionalities for the different modalities of tensor data. With BTA, a collection of 3-D faces can be modeled well. Empirical studies on expression retargeting also confirm the advantages of BTA.
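
To show the identity-by-expression factorization idea, here is a simplified multilinear (HOSVD-style) decomposition of a stacked face tensor; this is only a deterministic stand-in, not the Bayesian Tensor Analysis model, and the tensor sizes and random data are placeholder assumptions.

```python
# Simplified multilinear (HOSVD-style) factorization of a face tensor,
# shown only to illustrate the identity x expression structure; this is
# not the Bayesian Tensor Analysis (BTA) model itself.
import numpy as np

n_id, n_expr, n_feat = 20, 7, 300               # toy sizes: identities, expressions, per-face features
rng = np.random.default_rng(0)
T = rng.normal(size=(n_id, n_expr, n_feat))     # stacked 3-D face data

def mode_unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

# Left singular vectors of each unfolding give per-mode factor matrices.
U_id, _, _ = np.linalg.svd(mode_unfold(T, 0), full_matrices=False)
U_expr, _, _ = np.linalg.svd(mode_unfold(T, 1), full_matrices=False)

# Core tensor: project out the identity and expression modes.
core = np.einsum('ijk,ia,jb->abk', T, U_id, U_expr)
print(core.shape)                               # (20, 7, 300)
```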


IEEE Transactions on Knowledge and Data Engineering | 2015

Bridging the Vocabulary Gap between Health Seekers and Healthcare Knowledge

Liqiang Nie; Yi-Liang Zhao; Mohammad Akbari; Jialie Shen; Tat-Seng Chua

The vocabulary gap between health seekers and providers has hindered cross-system operability and inter-user reusability. To bridge this gap, this paper presents a novel scheme to code medical records by jointly utilizing local mining and global learning approaches, which are tightly linked and mutually reinforcing. Local mining attempts to code an individual medical record by independently extracting the medical concepts from the record itself and then mapping them to authenticated terminologies. A corpus-aware terminology vocabulary is naturally constructed as a byproduct, which is used as the terminology space for global learning. The local mining approach, however, may suffer from information loss and lower precision, caused by the absence of key medical concepts and the presence of irrelevant ones. Global learning, on the other hand, works towards enhancing the local medical coding by collaboratively discovering missing key terminologies and filtering out irrelevant ones through analysis of social neighbors. Comprehensive experiments validate the proposed scheme and each of its components. Practically, this unsupervised scheme holds potential for large-scale data.
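
A toy sketch of the local-mining idea follows: extract candidate concepts from a free-text record and map them onto a controlled terminology. The vocabulary, record text, and simple substring matching are hypothetical simplifications; the paper's pipeline is richer and is complemented by global learning.

```python
# Toy sketch of "local mining": map lay phrases found in a free-text health
# record onto an authenticated terminology. The vocabulary and record below
# are hypothetical examples, not from the paper's dataset.
terminology = {
    "shortness of breath": "Dyspnea",
    "high blood pressure": "Hypertension",
    "heart attack": "Myocardial infarction",
}

def local_mine(record: str) -> list:
    text = record.lower()
    # Collect the standardized term for every matched lay phrase.
    return sorted({std for lay, std in terminology.items() if lay in text})

record = "Patient reports shortness of breath and a history of high blood pressure."
print(local_mine(record))   # ['Dyspnea', 'Hypertension']
```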


IEEE Transactions on Circuits and Systems for Video Technology | 2008

Modality Mixture Projections for Semantic Video Event Detection

Jialie Shen; Dacheng Tao; Xuelong Li

Event detection is one of the most fundamental components for various domain applications of video information systems. In recent years, it has gained considerable interest from practitioners and academics in different areas. While detecting video events has been the subject of extensive research efforts recently, few existing approaches have considered multimodal information and the related efficiency issues. In this paper, we achieve fast and accurate video event detection using a subspace selection technique. The approach is capable of discriminating different classes while preserving the intramodal geometry of samples within an identical class. With this method, feature vectors representing different kinds of multimodal data can be easily projected from different identities and modalities onto a unified subspace, on which the recognition process can be performed. Furthermore, the training stage is carried out only once, yielding a unified transformation matrix to project different modalities. Unlike existing multimodal detection systems, the new system works well when some modalities are not available. Experimental results based on soccer video and TRECVID news video collections demonstrate the effectiveness, efficiency, and robustness of the proposed MMP for individual recognition tasks in comparison with existing approaches.
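
The following sketch only illustrates the notion of one shared transformation projecting multimodal features into a unified subspace; it uses plain PCA over block-stacked modality vectors as a stand-in for the paper's discriminative subspace selection, and all feature sizes and data are synthetic assumptions.

```python
# Sketch of projecting multimodal features onto one shared subspace with a
# single transformation (here learned by plain PCA as a stand-in for the
# paper's discriminative subspace selection).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n, d_vis, d_aud = 500, 128, 40                  # toy sample count / feature sizes
visual = rng.normal(size=(n, d_vis))
audio = rng.normal(size=(n, d_aud))

# Lay each modality in its own block of one long vector; a missing modality
# simply leaves its block at zero, so the same projection still applies.
X = np.concatenate([visual, audio], axis=1)

pca = PCA(n_components=32).fit(X)               # one unified transformation
Z_full = pca.transform(X)

query = np.concatenate([visual[:1], np.zeros((1, d_aud))], axis=1)  # audio missing
Z_query = pca.transform(query)                  # still usable for recognition
print(Z_full.shape, Z_query.shape)              # (500, 32) (1, 32)
```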


ACM Multimedia | 2011

Tag-based social image search with visual-text joint hypergraph learning

Yue Gao; Meng Wang; Huanbo Luan; Jialie Shen; Shuicheng Yan; Dacheng Tao

Tag-based social image search has attracted great interest, and how to order the search results by relevance level is an open research problem. Both the visual content of images and their tags have been investigated. However, existing methods usually employ tags and visual content separately or sequentially to learn the image relevance. This paper proposes tag-based image search with visual-text joint hypergraph learning. We simultaneously investigate the bag-of-words and bag-of-visual-words representations of images and accomplish the relevance estimation with a hypergraph learning approach. Each textual or visual word generates a hyperedge in the constructed hypergraph. We conduct experiments with a real-world data set, and the experimental results demonstrate the effectiveness of our approach.


IEEE Transactions on Systems, Man, and Cybernetics | 2015

Content-Based Visual Landmark Search via Multimodal Hypergraph Learning

Lei Zhu; Jialie Shen; Hai Jin; Ran Zheng; Liang Xie

While content-based landmark image search has recently received a lot of attention and become a very active domain, it remains a challenging problem. Among the various reasons, highly diverse visual content is the most significant one. It is common that, for the same landmark, images with a wide range of visual appearances can be found from different sources, and different landmarks may share very similar sets of images. As a consequence, it is very hard to accurately estimate the similarities between landmarks purely based on a single type of visual feature. Moreover, the relationships between landmark images can be very complex, and how to develop an effective modeling scheme to characterize these associations remains an open question. Motivated by these concerns, we propose the multimodal hypergraph (MMHG) to characterize the complex associations between landmark images. In MMHG, images are modeled as independent vertices, and hyperedges contain several vertices corresponding to particular views. Multiple hypergraphs are first constructed independently based on different visual modalities to describe the hidden high-order relations from different aspects. Then, they are integrated to incorporate discriminative information from heterogeneous sources. We also propose a novel content-based visual landmark search system based on MMHG to facilitate effective search. Distinguished from existing approaches, we design a unified computational module to support query-specific combination weight learning. An extensive experimental study on a large-scale test collection demonstrates the effectiveness of our scheme over state-of-the-art approaches.
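
As a loose illustration of fusing evidence from several visual modalities, the sketch below combines per-modality relevance scores with nonnegative weights on the simplex; the paper learns such combination weights per query jointly with hypergraph learning, whereas here the scores and weights are synthetic placeholders.

```python
# Minimal sketch of fusing per-modality relevance scores with combination
# weights; the paper learns query-specific weights jointly with hypergraph
# learning, whereas here they are fixed placeholder values.
import numpy as np

def fuse(scores_per_modality, weights):
    """scores_per_modality: list of (n_images,) relevance vectors, one per
    visual modality; weights are clipped to be nonnegative and sum to one."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    w = w / w.sum()
    return sum(wi * s for wi, s in zip(w, scores_per_modality))

rng = np.random.default_rng(0)
color_scores = rng.random(1000)      # e.g., from a color-feature hypergraph
texture_scores = rng.random(1000)    # e.g., from a texture-feature hypergraph
fused = fuse([color_scores, texture_scores], weights=[0.7, 0.3])
ranking = np.argsort(-fused)         # final landmark image ranking
print(ranking[:5])
```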


ACM Transactions on Information Systems | 2014

Learning to Recommend Descriptive Tags for Questions in Social Forums

Liqiang Nie; Yi-Liang Zhao; Xiangyu Wang; Jialie Shen; Tat-Seng Chua

Around 40% of the questions in emerging social-oriented question answering forums have at most one manually labeled tag, which is caused by incomplete question understanding or informal tagging behaviors. The incompleteness of question tags severely hinders all tag-based manipulations, such as feeds for topic followers, ontological knowledge organization, and other basic statistics. This article presents a novel scheme that is able to comprehensively learn descriptive tags for each question. Extensive evaluations on a representative real-world dataset demonstrate that our scheme yields significant gains for question annotation; more importantly, the whole process is unsupervised and can be extended to handle large-scale data.


Pattern Recognition | 2009

Stochastic modeling western paintings for effective classification

Jialie Shen

As one of the most important cultural heritages, classical western paintings have always played a special role in human life and have been applied for many different purposes. While image classification is the subject of a plethora of publications, relatively little attention has been paid to automatic categorization of western classical paintings, which could be a key technique for modern digital libraries, museums, and art galleries. This paper studies automatic classification on large western painting image collections and proposes a novel framework to support it. With this framework, multiple visual features can be integrated effectively to significantly improve the accuracy of the identification process. We also evaluate our method and its competitors on a large image collection. A careful study of the empirical results indicates that the approach enjoys clear superiority over state-of-the-art approaches in different aspects.
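
To make the multi-feature integration concrete, here is a simple concatenate-and-classify baseline with synthetic descriptors and labels; it is only an illustration of combining several visual features, not the paper's stochastic modeling framework.

```python
# Baseline illustration of combining several visual features for painting
# classification (the paper proposes a stochastic framework; this is a simple
# concatenate-and-classify stand-in on synthetic descriptors).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
color_hist = rng.random((n, 64))                # e.g., color histogram
texture = rng.random((n, 32))                   # e.g., Gabor/texture statistics
composition = rng.random((n, 16))               # e.g., layout statistics
labels = rng.integers(0, 5, size=n)             # 5 hypothetical painting styles

X = np.hstack([color_hist, texture, composition])   # simple feature fusion
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```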

Collaboration


Dive into Jialie Shen's collaborations.

Top Co-Authors

Zhiyong Cheng
Singapore Management University

John Shepherd
University of New South Wales

Shuicheng Yan
National University of Singapore

Meng Wang
Hefei University of Technology

Xuelong Li
Chinese Academy of Sciences

Kian-Lee Tan
National University of Singapore

Hwee Hwa Pang
Singapore Management University