Publication


Featured research published by Huanbo Luan.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2016

Discrete Collaborative Filtering

Hanwang Zhang; Fumin Shen; Wei Liu; Xiangnan He; Huanbo Luan; Tat-Seng Chua

We address the efficiency problem of Collaborative Filtering (CF) by hashing users and items as latent vectors in the form of binary codes, so that user-item affinity can be efficiently calculated in a Hamming space. However, existing hashing methods for CF employ binary code learning procedures, most of which suffer from the challenging discrete constraints. Hence, those methods generally adopt a two-stage learning scheme composed of relaxed optimization via discarding the discrete constraints, followed by binary quantization. We argue that such a scheme will result in a large quantization loss, which especially compromises the performance of large-scale CF that resorts to longer binary codes. In this paper, we propose a principled CF hashing framework called Discrete Collaborative Filtering (DCF), which directly tackles the challenging discrete optimization that should have been treated adequately in hashing. The formulation of DCF has two advantages: 1) the Hamming-similarity-induced loss preserves the intrinsic user-item similarity, and 2) the balanced and uncorrelated code constraints yield compact yet informative binary codes. We devise a computationally efficient algorithm for DCF with a rigorous convergence proof. Through extensive experiments on several real-world benchmarks, we show that DCF consistently outperforms state-of-the-art CF hashing techniques; e.g., even with only 8 bits, DCF significantly outperforms other methods using 128 bits.
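
To make the Hamming-space efficiency claim concrete, here is a minimal sketch of affinity computation with bit-packed binary codes; the packing scheme and the ±1 similarity mapping are illustrative choices, not DCF's learned codes.

```python
import numpy as np

def hamming_similarity(user_code: np.ndarray, item_codes: np.ndarray) -> np.ndarray:
    """Similarity of one user code against many item codes in Hamming space.

    Codes are bit-packed uint8 arrays (np.packbits output), so XOR plus a
    popcount runs on whole bytes instead of individual bits.
    """
    r = 8 * item_codes.shape[1]                    # code length in bits
    xor = np.bitwise_xor(user_code, item_codes)    # differing bits
    dist = np.unpackbits(xor, axis=1).sum(axis=1)  # Hamming distance
    return 1.0 - 2.0 * dist / r                    # +/-1 inner product, scaled to [-1, 1]

# Toy usage: one 8-bit user code against three random item codes.
user = np.packbits(np.array([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=np.uint8), axis=1)
items = np.packbits(np.random.randint(0, 2, (3, 8), dtype=np.uint8), axis=1)
print(hamming_similarity(user, items))
```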


IEEE Transactions on Circuits and Systems for Video Technology | 2013

Detecting Group Activities With Multi-Camera Context

Zheng-Jun Zha; Hanwang Zhang; Meng Wang; Huanbo Luan; Tat-Seng Chua

Human group activity detection in multi-camera CCTV surveillance videos is a pressing demand in smart surveillance. Previous works on this topic are mainly based on camera topology inference, which is hard to apply to real-world unconstrained surveillance videos. In this paper, we propose a new approach for multi-camera group activity detection. Our approach simultaneously exploits intra-camera and inter-camera contexts without topology inference. Specifically, a discriminative graphical model with hidden variables is developed. The intra-camera and inter-camera contexts are characterized by the structure of the hidden variables; by automatically optimizing this structure, the contexts are effectively explored. Furthermore, we propose a new spatiotemporal feature, named vigilant area (VA), to characterize the quantity and appearance of the motion in an area. This feature is effective for group activity representation and is easy to extract from dynamic and crowded scenes. We evaluate the proposed VA feature and discriminative graphical model extensively on two real-world multi-camera surveillance video data sets, including a public corpus consisting of 2.5 h of videos and a 468-h video collection, which, to the best of our knowledge, is the largest video collection ever used in human activity detection. The experimental results demonstrate the effectiveness of our approach.
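
A minimal sketch of a VA-style descriptor, assuming plain frame differencing as the motion estimator; the grid size, threshold, and pooling are hypothetical simplifications of the paper's feature.

```python
import numpy as np

def vigilant_area_descriptor(prev_frame: np.ndarray, frame: np.ndarray,
                             grid: int = 4, thresh: int = 20) -> np.ndarray:
    """Per-cell motion descriptor capturing both the quantity and the
    appearance of motion. Frames are 2D grayscale uint8 arrays."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = diff > thresh                      # binary motion mask
    h, w = frame.shape
    ch, cw = h // grid, w // grid
    feats = []
    for i in range(grid):
        for j in range(grid):
            cell = moving[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
            pix = frame[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
            quantity = cell.mean()              # fraction of moving pixels
            appearance = pix[cell].mean() if cell.any() else 0.0
            feats.extend([quantity, appearance])
    return np.asarray(feats, dtype=np.float32)
```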


ACM Multimedia | 2014

Start from Scratch: Towards Automatically Identifying, Modeling, and Naming Visual Attributes

Hanwang Zhang; Yang Yang; Huanbo Luan; Shuicheng Yan; Tat-Seng Chua

Higher-level semantics such as visual attributes are crucial for fundamental multimedia applications. We present a novel attribute discovery approach that can automatically identify, model, and name attributes from an arbitrary set of image and text pairs that can be easily gathered on the Web. Unlike conventional attribute discovery methods, our approach does not rely on any pre-defined vocabularies or human labeling. Therefore, we are able to build a large visual knowledge base without any human effort. The discovery is based on a novel deep architecture, named Independent Component Multimodal Autoencoder (ICMAE), that can continually learn shared higher-level representations across the visual and textual modalities. With the help of the resultant representations, which encode strong visual and semantic evidence, we propose to (a) identify attributes and their corresponding high-quality training images, (b) iteratively model them with maximum compactness and comprehensiveness, and (c) name the attribute models with human-understandable words. To date, the proposed system has discovered 1,898 attributes over 1.3 million image-text pairs. Extensive experiments on various real-world multimedia datasets demonstrate the quality and effectiveness of the discovered attributes in facilitating multimedia applications such as image annotation and retrieval, as compared to state-of-the-art approaches.
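
As a rough illustration of the shared-representation idea behind ICMAE, here is a toy two-branch autoencoder in PyTorch; the layer sizes, the code-tying loss, and all dimensions are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedMultimodalAE(nn.Module):
    """Two encoders map image and text features into one latent space;
    two decoders reconstruct each modality from its own code."""

    def __init__(self, img_dim=4096, txt_dim=2000, shared_dim=256):
        super().__init__()
        self.enc_img = nn.Sequential(nn.Linear(img_dim, shared_dim), nn.ReLU())
        self.enc_txt = nn.Sequential(nn.Linear(txt_dim, shared_dim), nn.ReLU())
        self.dec_img = nn.Linear(shared_dim, img_dim)
        self.dec_txt = nn.Linear(shared_dim, txt_dim)

    def forward(self, img, txt):
        z_img, z_txt = self.enc_img(img), self.enc_txt(txt)
        # Reconstruct each modality and pull the two codes together so the
        # branches converge to a shared cross-modal representation.
        recon = (F.mse_loss(self.dec_img(z_img), img)
                 + F.mse_loss(self.dec_txt(z_txt), txt))
        align = F.mse_loss(z_img, z_txt)
        return recon + align

loss = SharedMultimodalAE()(torch.randn(8, 4096), torch.randn(8, 2000))
loss.backward()
```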


Information Sciences | 2011

VisionGo: Towards video retrieval with joint exploration of human and computer

Huanbo Luan; Yan-Tao Zheng; Meng Wang; Tat-Seng Chua

This paper introduces an effective interactive video retrieval system named VisionGo. It jointly exploits human and computer efforts to accomplish video retrieval with high effectiveness and efficiency. It assists the interactive video retrieval process in different aspects: (1) it maximizes the interaction efficiency between human and computer by providing a user interface that supports highly effective user annotation and an intuitive visualization of retrieval results; (2) it employs a multiple feedback technique that assists users in choosing the proper method to enhance relevance feedback performance; and (3) it facilitates users in assessing the retrieval results of motion-related queries by using motion-icons instead of static keyframes. Experimental results on over 160 hours of news video demonstrate the effectiveness of the VisionGo system.
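
To illustrate what one round of interactive feedback looks like, here is a classic Rocchio-style reranker; this is a generic stand-in, not VisionGo's specific multiple-feedback technique.

```python
import numpy as np

def rocchio_rerank(query, feats, pos_idx, neg_idx,
                   alpha=1.0, beta=0.75, gamma=0.25):
    """Move the query vector toward annotated positives and away from
    negatives, then rerank all shots by cosine similarity."""
    q = alpha * query
    if len(pos_idx):
        q = q + beta * feats[pos_idx].mean(axis=0)
    if len(neg_idx):
        q = q - gamma * feats[neg_idx].mean(axis=0)
    scores = feats @ q / (np.linalg.norm(feats, axis=1) * np.linalg.norm(q) + 1e-9)
    return np.argsort(-scores)      # shot indices, best first
```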


ACM Transactions on Information Systems | 2014

Social-Sensed Image Search

Peng Cui; Shaowei Liu; Wenwu Zhu; Huanbo Luan; Tat-Seng Chua; Shiqiang Yang

Although Web search techniques have greatly facilitated users' information seeking, many search sessions still fail to provide satisfactory results, a problem that is more serious in Web image search scenarios. How to understand user intent from observed data is a fundamental issue of paramount significance in improving image search performance. Previous research efforts mostly focus on discovering user intent either from clickthrough behavior in user search logs (e.g., Google), or from social data to facilitate vertical image search in a few limited social media platforms (e.g., Flickr). This article aims to combine the virtues of these two information sources to complement each other, that is, sensing and understanding users' interests from social media platforms and transferring this knowledge to rerank the image search results in general image search engines. Toward this goal, we first propose a novel social-sensed image search framework, where both social media and search engines are jointly considered. To effectively and efficiently leverage these two kinds of platforms, we propose an example-based user interest representation and modeling method, where we construct a hybrid graph from social media and propose a hybrid random-walk algorithm to derive the user-image interest graph. Moreover, we propose a social-sensed image reranking method to integrate the user-image interest graph from social media and the search results from general image search engines, reranking the images by fusing their social relevance and visual relevance. We conducted extensive experiments on real-world data from Flickr and Google image search, and the results demonstrate that the proposed methods can significantly improve the social relevance of image search results while maintaining visual relevance.
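
The hybrid random-walk idea can be sketched as a random walk with restart over the user-image graph; the adjacency construction, restart probability, and iteration count below are illustrative assumptions.

```python
import numpy as np

def random_walk_with_restart(adj, seed, restart=0.15, iters=50):
    """Propagate interest from a user's seed vector over a hybrid graph
    whose nodes are users and images. `adj` is a nonnegative adjacency
    matrix; the resulting scores act as social relevance for reranking."""
    P = adj / np.maximum(adj.sum(axis=0, keepdims=True), 1e-12)  # column-stochastic
    p = seed.copy()
    for _ in range(iters):
        p = (1 - restart) * (P @ p) + restart * seed
    return p
```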


ACM Multimedia | 2007

Segregated feedback with performance-based adaptive sampling for interactive news video retrieval

Huanbo Luan; Shi-Yong Neo; Hai-Kiat Goh; Yongdong Zhang; Shouxun Lin; Tat-Seng Chua

Existing video retrieval research incorporates relevance feedback based on user-dependent interpretations to improve the retrieval results. In this paper, we segregate the process of relevance feedback into two distinct facets: (a) recall-directed feedback; and (b) precision-directed feedback. The recall-directed facet employs general features such as text and high-level features (HLFs) to maximize efficiency and recall during feedback, making it very suitable for large corpora. The precision-directed facet, on the other hand, uses many other multimodal features in an active learning environment for improved accuracy. Combined with a performance-based adaptive sampling strategy, this process continuously re-ranks a subset of instances as the user annotates. Experiments on the TRECVID 2006 dataset show that our approach is efficient and effective.
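
A hedged sketch of what the performance-based adaptive sampling could look like: the size of the batch shown to the user adapts to the hit rate of recent annotations. The adaptation rule here is a guess for illustration, not the paper's formula.

```python
import numpy as np

def next_batch(scores, labels, window=200, hit_target=0.5):
    """Pick the next set of shots to annotate. `labels` maps already
    annotated shot indices to 0/1 relevance; a high recent hit rate
    shrinks the batch (ranking is working), a low one widens it."""
    hit_rate = np.mean(list(labels.values())) if labels else hit_target
    size = int(window * hit_target / max(hit_rate, 1e-3))
    candidates = [i for i in np.argsort(-scores) if i not in labels]
    return candidates[:size]
```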


ACM Multimedia | 2014

One of a Kind: User Profiling by Social Curation

Xue Geng; Hanwang Zhang; Zheng Song; Yang Yang; Huanbo Luan; Tat-Seng Chua

Social Curation Service (SCS) is a new type of emerging social media platform, where users can select, organize, and keep track of multimedia content they like. In this paper, we take advantage of this opportunity and target the very starting point in social media: user profiling, which supports fundamental applications such as personalized search and recommendation. Compared to other profiling methods in conventional Social Network Services (SNS), our work benefits from two distinguishing characteristics of SCS: a) organized multimedia user-generated content, and b) a content-centric social network. Based on these two characteristics, we are able to deploy state-of-the-art multimedia analysis techniques to establish content-based user profiles by extracting user preferences and their social relations. First, we automatically construct a content-based user preference ontology and learn the ontological models to generate comprehensive user profiles. In particular, we propose a new deep learning strategy called the multi-task convolutional neural network (mtCNN) to learn profile models and profile-related visual features simultaneously. Second, we propose to model the multi-level social relations offered by SCS to refine the user profiles in a low-rank recovery framework. To the best of our knowledge, our work is the first to explore how social curation can help content-based social media technologies, taking user profiling as an example. Extensive experiments on 1,293 users and 1.5 million images collected from Pinterest in the fashion domain demonstrate that recommendation methods based on the proposed user profiles are considerably more effective than other state-of-the-art recommendation strategies.
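
A toy PyTorch sketch of the multi-task idea behind mtCNN: one shared convolutional backbone feeds one classification head per profile attribute, so profile models and visual features are learned jointly. The layer sizes and task list are invented for illustration.

```python
import torch
import torch.nn as nn

class MultiTaskCNN(nn.Module):
    def __init__(self, tasks=None):
        super().__init__()
        tasks = tasks or {"style": 10, "category": 20}   # hypothetical attributes
        self.backbone = nn.Sequential(                   # shared visual features
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleDict({t: nn.Linear(32, n) for t, n in tasks.items()})

    def forward(self, x):
        z = self.backbone(x)
        return {task: head(z) for task, head in self.heads.items()}

logits = MultiTaskCNN()(torch.randn(4, 3, 64, 64))   # one logit tensor per task
```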


Conference on Multimedia Modeling | 2013

Social visual image ranking for web image search

Shaowei Liu; Peng Cui; Huanbo Luan; Wenwu Zhu; Shiqiang Yang; Qi Tian

Much research has focused on how to match a textual query with visual images and their surrounding texts or tags for Web image search. The returned results are often unsatisfactory due to their deviation from user intentions. In this paper, we propose a novel image ranking approach to Web image search, in which we use social data from a social media platform jointly with visual data to improve the relevance between returned images and user intentions (i.e., social relevance). Specifically, we propose a community-specific Social-Visual Ranking (SVR) algorithm to rerank Web images by taking social relevance into account. Through extensive experiments, we demonstrate the importance of both visual factors and social factors, and the effectiveness and superiority of the social-visual ranking algorithm for Web image search.
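
In its simplest form, social-visual reranking can be sketched as a normalized linear fusion of the two relevance scores; the actual SVR algorithm is community-specific and more involved than this.

```python
import numpy as np

def social_visual_rerank(visual_scores, social_scores, lam=0.6):
    """Blend min-max-normalized visual and social relevance and return
    image indices in reranked order (weight `lam` is an assumption)."""
    v = (visual_scores - visual_scores.min()) / (np.ptp(visual_scores) + 1e-9)
    s = (social_scores - social_scores.min()) / (np.ptp(social_scores) + 1e-9)
    return np.argsort(-(lam * v + (1 - lam) * s))
```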


ACM Multimedia | 2012

View-based 3D object retrieval by bipartite graph matching

Yue Wen; Yue Gao; Huanbo Luan; Qiong Liu; Jialie Shen; Rongrong Ji

Bipartite graph matching has been investigated for multiple-view matching in 3D object retrieval. However, existing methods employ a one-to-one vertex matching scheme, while in practice more than two views may share close semantic meanings. In this work, we propose a bipartite graph matching method to measure the distance between two objects based on multiple views. In the proposed method, representative views are first selected by view clustering for each object, and the corresponding weights are assigned based on the clustering results. A bipartite graph is constructed from the two groups of representative views of the two compared objects. To calculate the similarity between two objects, the bipartite graph is first partitioned into several subsets, such that views in the same subset are likely to share similar semantic meanings. The distances between the two objects within individual subsets are then assembled through the graph to obtain the final similarity. Experimental results and comparison with state-of-the-art methods demonstrate the effectiveness of the proposed algorithm.
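
A compact sketch of the pipeline, using k-means for representative-view selection and plain optimal assignment for the matching; the paper's cluster weights and subset partitioning are simplified away here.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def object_distance(views_a, views_b, k=5):
    """Distance between two 3D objects given per-view feature matrices
    (n_views x dim). Representative views come from view clustering;
    matching is one-to-one over the bipartite cost matrix."""
    reps_a = KMeans(n_clusters=k, n_init=10).fit(views_a).cluster_centers_
    reps_b = KMeans(n_clusters=k, n_init=10).fit(views_b).cluster_centers_
    cost = cdist(reps_a, reps_b)                 # pairwise view distances
    rows, cols = linear_sum_assignment(cost)     # optimal bipartite matching
    return cost[rows, cols].mean()
```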


IEEE Transactions on Multimedia | 2015

Enhancing Video Event Recognition Using Automatically Constructed Semantic-Visual Knowledge Base

Xishan Zhang; Yang Yang; Yongdong Zhang; Huanbo Luan; Jintao Li; Hanwang Zhang; Tat-Seng Chua

The task of recognizing events from video has attracted a lot of attention in recent years. However, due to the complex nature of user-defined events, purely audio-visual content analysis without domain knowledge has been found to be grossly inadequate. In this paper, we propose to construct a semantic-visual knowledge base that encodes rich event-centric concepts and their relationships from well-established lexical databases, including FrameNet, as well as concept-specific visual knowledge from ImageNet. Based on this semantic-visual knowledge base, we design an effective system for video event recognition. Specifically, in order to narrow the semantic gap between high-level complex events and low-level visual representations, we utilize the event-centric semantic concepts encoded in the knowledge base as the intermediate-level event representation, which offers both human-perceivable and machine-interpretable semantic clues for event recognition. In addition, in order to leverage the abundant ImageNet images, we propose a robust transfer learning model to learn noise-resistant concept classifiers for videos. Extensive experiments on various real-world video datasets demonstrate the superiority of our proposed system as compared to state-of-the-art approaches.
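
A minimal sketch of the intermediate concept-score representation: frame-level concept scores are pooled into one video-level vector that a downstream event classifier consumes. The classifier interface is hypothetical.

```python
import numpy as np

def event_representation(frame_feats, concept_classifiers):
    """`frame_feats` is (n_frames, dim); each classifier scores all frames
    for one event-centric concept. Max pooling over frames yields one
    score per concept as the video's intermediate representation."""
    scores = np.stack([clf(frame_feats) for clf in concept_classifiers], axis=1)
    return scores.max(axis=0)
```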

Collaboration


Dive into Huanbo Luan's collaborations.

Top Co-Authors

Tat-Seng Chua, National University of Singapore
Yongdong Zhang, Chinese Academy of Sciences
Shi-Yong Neo, National University of Singapore
Yan-Tao Zheng, National University of Singapore
Meng Wang, Hefei University of Technology
Shouxun Lin, Chinese Academy of Sciences
Yang Yang, University of Electronic Science and Technology of China
Jintao Li, Chinese Academy of Sciences