Publication


Featured research published by Weizhi Nie.


IEEE Transactions on Image Processing | 2016

Multi-Modal Clique-Graph Matching for View-Based 3D Model Retrieval

Anan Liu; Weizhi Nie; Yue Gao; Yuting Su

Multi-view matching is an important but challenging task in view-based 3D model retrieval. To address this challenge, we propose an original multi-modal clique graph (MCG) matching method. We systematically present a method for MCG generation: the graph is composed of cliques, which consist of neighbor nodes in the multi-modal feature space, and hyper-edges that link pairwise cliques. Moreover, we propose an image-set-based clique/edge-wise similarity measure to address the set-to-set distance measure, which is the core problem in MCG matching. The proposed MCG provides the following benefits: 1) it preserves the local and global attributes of a graph with the designed structure; 2) it eliminates redundant and noisy information by strengthening inliers while suppressing outliers; and 3) it avoids the difficulty of defining high-order attributes and solving hyper-graph matching. We validate MCG-based 3D model retrieval on three popular single-modal datasets and one novel multi-modal dataset. Extensive comparison experiments show the superiority of the proposed method. Moreover, we contribute a novel real-world 3D object dataset, the multi-view RGB-D object dataset. To the best of our knowledge, it is the largest real-world 3D object dataset containing multi-modal and multi-view information.
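The clique construction described above can be sketched in a few lines; this is a minimal illustration under assumptions (each clique is a node plus its k nearest neighbors in feature space, and hyper-edge weights come from a Gaussian kernel on clique centers), not the paper's implementation:

```python
import numpy as np

def build_clique_graph(features, k=3):
    """Form one clique per node from its k nearest neighbours in feature
    space, then weight the hyper-edge between every pair of cliques by
    the similarity of their mean features (illustrative choices)."""
    n = len(features)
    # pairwise Euclidean distances between node features
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    # each clique: the node itself plus its k nearest neighbours
    cliques = [np.argsort(d[i])[: k + 1] for i in range(n)]
    centers = np.array([features[c].mean(axis=0) for c in cliques])
    # hyper-edge weights: Gaussian kernel on distances between clique centres
    w = np.exp(-np.linalg.norm(centers[:, None] - centers[None, :], axis=-1))
    return cliques, w
```

In a retrieval setting, the set-to-set similarity between two such graphs would then be computed clique-wise, which is where the paper's clique/edge-wise measure comes in.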


Computer Vision and Pattern Recognition | 2015

Clique-graph matching by preserving global & local structure

Weizhi Nie; Anan Liu; Zan Gao; Yuting Su

This paper introduces the clique-graph and presents a clique-graph matching method that preserves global and local structures. Specifically, we formulate the objective function of clique-graph matching with respect to two latent variables: the clique information in the original graph and the pairwise clique correspondence constrained by one-to-one matching. Since the objective function is not jointly convex in both latent variables, we decompose it into two consecutive optimization steps: 1) a clique-to-clique similarity measure that preserves local unary and pairwise correspondences; 2) a graph-to-graph similarity measure that preserves the global clique-to-clique correspondence. Extensive experiments on synthetic data and real images show that the proposed method outperforms representative methods, especially when both noise and outliers exist.
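The two-step decomposition can be illustrated with a plain Euclidean clique-to-clique cost and a greedy stand-in for the one-to-one correspondence step (the paper alternates between the two latent variables; every choice here is illustrative):

```python
import numpy as np

def match_cliques(feat_a, feat_b):
    """Step 1: clique-to-clique cost from clique feature distances.
    Step 2: one-to-one correspondence, here via a greedy assignment
    in order of increasing cost, plus a graph-to-graph score."""
    cost = np.linalg.norm(feat_a[:, None] - feat_b[None, :], axis=-1)
    matches, used = [], set()
    for i, j in sorted(np.ndindex(cost.shape), key=lambda ij: cost[ij]):
        # enforce one-to-one matching greedily, cheapest pairs first
        if i not in {a for a, _ in matches} and j not in used:
            matches.append((i, j))
            used.add(j)
    score = sum(np.exp(-cost[i, j]) for i, j in matches)  # graph-to-graph similarity
    return matches, score
```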


Signal Processing | 2015

Coupled hidden conditional random fields for RGB-D human action recognition

Anan Liu; Weizhi Nie; Yuting Su; Li Ma; Tong Hao; Zhaoxuan Yang

This paper proposes a human action recognition method based on a coupled hidden conditional random fields (cHCRF) model that fuses RGB and depth sequential information. The model extends the standard hidden-state conditional random fields model from a single chain-structured sequential observation to multiple chain-structured sequential observations: synchronized sequences captured in multiple modalities. For model formulation, we propose a specific graph structure for the interaction among the modalities and design the corresponding potential functions. We then propose learning and inference methods to discover the latent correlation between RGB and depth data and to model the temporal context within each modality. Extensive experiments show that the proposed model boosts human action recognition performance by taking advantage of the complementary characteristics of the RGB and depth modalities. Highlights: We propose cHCRF to learn sequence-specific and sequence-shared temporal structure. We contribute a novel RGB-D human action dataset containing 1200 samples. Experiments on three popular datasets show the superiority of the proposed method.


IEEE Transactions on Multimedia | 2015

Semantic-Based Location Recommendation With Multimodal Venue Semantics

Xiangyu Wang; Yi-Liang Zhao; Liqiang Nie; Yue Gao; Weizhi Nie; Zheng-Jun Zha; Tat-Seng Chua

In recent years, we have witnessed a flourishing of location-based social networks. A well-formed representation of location knowledge is desired to cater to the needs of location sensing, browsing, navigation, and querying. In this paper, we study the semantics of points-of-interest (POIs) by exploiting the abundant heterogeneous user-generated content (UGC) from different social networks. Our idea is to explore text descriptions, photos, user check-in patterns, and venue context for location semantic similarity measurement. We argue that venue semantics play an important role in user check-in behavior. Based on this argument, a unified POI recommendation algorithm is proposed that incorporates venue semantics as a regularizer. In addition to deriving user preference from user-venue check-in information, we place special emphasis on location semantic similarity. Finally, we conduct a comprehensive performance evaluation of location semantic similarity and location recommendation on a real-world dataset collected from Foursquare and Instagram. Experimental results show that the UGC information can well characterize venue semantics, which helps improve recommendation performance.
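One concrete reading of "venue semantics as a regularizer" is matrix factorization of the user-venue check-in matrix with a graph-Laplacian penalty that pulls the latent factors of semantically similar venues together. The sketch below is an assumption-laden illustration, not the paper's algorithm; `R`, `S`, and all hyper-parameters are hypothetical:

```python
import numpy as np

def recommend_scores(R, S, k=2, lam=0.1, alpha=0.1, iters=200, seed=0):
    """Factorize the user-venue check-in matrix R ~ U @ V.T, with a
    Laplacian regularizer built from the venue similarity matrix S
    smoothing the venue factors (gradient descent on the loss)."""
    rng = np.random.default_rng(seed)
    n_users, n_venues = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_venues, k))
    L = np.diag(S.sum(axis=1)) - S            # graph Laplacian of venue similarity
    for _ in range(iters):
        E = R - U @ V.T                       # reconstruction error
        U += alpha * (E @ V - lam * U)        # descent step: fit + ridge terms
        V += alpha * (E.T @ U - lam * V - lam * L @ V)  # + semantic smoothing
    return U @ V.T                            # predicted check-in scores
```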


Information Sciences | 2015

Graph-based characteristic view set extraction and matching for 3D model retrieval

Anan Liu; Zhongyang Wang; Weizhi Nie; Yuting Su

In recent times, multi-view representation of the 3D model has led to extensive research in view-based methods for 3D model retrieval. However, most approaches focus on feature extraction from 2D images while ignoring the spatial information of the 3D model. In order to improve the effectiveness of view-based methods on 3D model retrieval, this paper proposes a novel method for characteristic view extraction and similarity measurement. First, the graph clustering method is used for view grouping and the random-walk algorithm is applied to adaptively update the weight of each view. The spatial information of the 3D object is utilized to construct a view-graph model, thus enabling each characteristic view to represent the discriminative visual feature in terms of specific spatial context. Next, by considering the view set as a graph model, the similarity measurement of two models can be converted into a graph matching problem. This problem is solved by mathematically formulating it as a Rayleigh quotient maximization with affinity constraints for similarity measurement. Extensive comparison experiments were conducted on the popular ETH, NTU, PSB, and MV-RED 3D model datasets. The results demonstrate the superiority of the proposed method.
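The random-walk weight update can be illustrated on a small view-affinity matrix; for a symmetric affinity this converges to weights proportional to each view's total affinity, so views supported by many similar views (the characteristic ones) gain weight. A minimal sketch, not the paper's exact scheme:

```python
import numpy as np

def view_weights(affinity, iters=50):
    """Redistribute view weights along the view-graph edges with a
    row-stochastic random walk until they stabilize."""
    P = affinity / affinity.sum(axis=1, keepdims=True)  # transition matrix
    w = np.full(len(affinity), 1.0 / len(affinity))     # uniform start
    for _ in range(iters):
        w = w @ P                                       # one random-walk step
    return w / w.sum()
```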


Neurocomputing | 2014

Single/cross-camera multiple-person tracking by graph matching

Weizhi Nie; Anan Liu; Yuting Su; Huanbo Luan; Zhaoxuan Yang; Liujuan Cao; Rongrong Ji

Single- and cross-camera multiple-person tracking under unconstrained conditions is an extremely challenging task in computer vision. Facing the main difficulties caused by occlusion in the single-camera scenario and transitions in the cross-camera scenario, we propose a unified framework, formulated as graph matching with affinity constraints, for both single- and cross-camera tracking tasks. To our knowledge, this is the first work to unify the two tracking problems within a single graph-matching framework. The proposed method consists of two steps: tracklet generation and tracklet association. First, we implement a modified part-based human detector and the Tracking-Modeling-Detection (TMD) method for tracklet generation. Then we associate tracklets by graph matching, which is mathematically formulated as Rayleigh quotient maximization. Comparison experiments show that the proposed method produces results competitive with state-of-the-art methods.
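Rayleigh quotient maximization for matching is commonly relaxed to the leading eigenvector of an assignment-affinity matrix, followed by greedy discretization (spectral matching). The sketch below assumes that reading; `M` is indexed by candidate assignments `(a, b) -> a * n_b + b`, and everything here is illustrative rather than the paper's solver:

```python
import numpy as np

def spectral_match(M, n_a, n_b):
    """Maximize x^T M x / x^T x over relaxed assignment indicators:
    take the leading eigenvector of the symmetric affinity matrix M,
    then greedily discretize it into a one-to-one correspondence."""
    vals, vecs = np.linalg.eigh(M)
    x = np.abs(vecs[:, np.argmax(vals)])       # leading eigenvector
    matches, used_a, used_b = [], set(), set()
    for idx in np.argsort(-x):                 # strongest assignments first
        a, b = divmod(int(idx), n_b)
        if a not in used_a and b not in used_b and x[idx] > 1e-6:
            matches.append((a, b))
            used_a.add(a)
            used_b.add(b)
    return matches
```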


Journal of Visual Communication and Image Representation | 2016

3D object retrieval based on sparse coding in weak supervision

Weizhi Nie; Anan Liu; Yuting Su

Highlights: The proposed method does not require explicit virtual model information. We utilize FDDL to learn a dictionary that reconstructs the query sample for retrieval. Our approach has high generalization ability and can be used in other applications. With the rapid development of computer vision and digital capture equipment, we can easily record the 3D information of objects. In recent years, more and more 3D data have been generated, which makes it desirable to develop effective 3D retrieval algorithms. In this paper, we apply sparse coding in a weakly supervised manner to address 3D model retrieval. First, each 3D object, represented by a set of 2D images, is used to learn a dictionary. Then, sparse coding is used to compute the reconstruction residual for each query object. Finally, the residual between the query model and the candidate model is used for 3D model retrieval. In the experiments, the ETH, NTU, and ALOL datasets are used to evaluate the performance of the proposed method. The results demonstrate the superiority of the proposed method.
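The residual-based ranking can be sketched with ordinary least squares standing in for FDDL sparse coding: each candidate model's view features form a dictionary (one column per view), and the candidate that best reconstructs the query's views wins. Purely illustrative:

```python
import numpy as np

def retrieval_residual(query_views, dictionary):
    """Reconstruct each query view from a candidate's dictionary and
    sum the residuals; least squares stands in for sparse coding."""
    coef, *_ = np.linalg.lstsq(dictionary, query_views.T, rcond=None)
    return np.linalg.norm(dictionary @ coef - query_views.T, axis=0).sum()

def retrieve(query_views, candidates):
    """Return the candidate model with the smallest residual."""
    return min(candidates, key=lambda name: retrieval_residual(query_views, candidates[name]))
```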


Neurocomputing | 2015

3D Model Retrieval with Weighted Locality-constrained Group Sparse Coding

Xiangyu Wang; Weizhi Nie

In recent years, we have witnessed a flourishing of 3D object modelling. Efficient and effective 3D model retrieval algorithms are highly desired and have attracted intensive research attention. In this work, we propose a view-based 3D model retrieval algorithm based on weighted locality-constrained group sparse coding. Representative views are first selected by clustering, and the corresponding weights are derived by considering the relationships among these views. By grouping the views from 3D models, a locality-constrained group sparse coding method is employed to find the reconstruction residual for each query view. The distance between the query model and a candidate model is taken as the weighted sum of the residuals, and the query model is matched to the model that best reconstructs it. Experimental comparisons conducted on the ETH 3D model dataset demonstrate the effectiveness of the proposed method.
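The weighted matching rule reduces to a weighted sum of per-view reconstruction residuals; in the sketch below a caller-supplied `reconstruct` function stands in for the locality-constrained group sparse coding step (an illustration, not the paper's solver):

```python
import numpy as np

def weighted_model_distance(query_views, weights, reconstruct):
    """Distance from a query model to a candidate: each view's
    reconstruction residual, weighted and summed."""
    residuals = np.array([np.linalg.norm(v - reconstruct(v)) for v in query_views])
    return float(weights @ residuals)
```

The query is then matched to the candidate minimizing this distance.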


Image and Vision Computing | 2016

Cross-view action recognition by cross-domain learning

Weizhi Nie; Anan Liu; Wenhui Li; Yuting Su

This paper proposes a novel cross-view human action recognition method that discovers and shares common knowledge among video sets captured from multiple viewpoints. We treat a specific view as the target domain and the others as source domains, and consequently formulate cross-view action recognition as a cross-domain learning problem. First, the classic bag-of-visual-words framework is used for visual feature extraction in individual viewpoints. Then, we introduce two transformation matrices to map the original action features from different views into one common feature space, and combine the original and transformed features into a new feature mapping function for the target and auxiliary domains, respectively. Finally, we propose a new method, based on a standard SVM solver, to learn the two transformation matrices during model training and to generate the final classifier for each human action. Extensive experiments are conducted on the IXMAS and TJU datasets. The experimental results demonstrate that the proposed method consistently outperforms the state of the art. Highlights: We propose a novel cross-domain learning method for action recognition. We apply a block-wise weighted kernel function to leverage cross-view information. We evaluate the proposed method on two popular cross-view human action datasets.
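The combined feature mapping can be sketched as projecting a view-specific feature into the common space with a per-view transformation matrix and concatenating it with the original feature. In the paper the matrices are learned with the SVM-based solver; here they are simply given, and all names are hypothetical:

```python
import numpy as np

def map_features(x, transforms, view):
    """Concatenate the original feature with its projection into the
    shared space, giving the augmented feature for one domain."""
    z = transforms[view] @ x          # projection into the common space
    return np.concatenate([x, z])     # [original; transformed] feature
```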


Advanced Video and Signal Based Surveillance | 2012

Multiple Person Tracking by Spatiotemporal Tracklet Association

Weizhi Nie; Anan Liu; Yuting Su

In the field of video surveillance, multiple-object tracking is a challenging problem in real applications. In this paper, we propose a multiple-object tracking method based on spatiotemporal tracklet association. First, reliable tracklets, fragments of an individual object's entire trajectory, are generated by frame-wise association of object localization results in neighboring frames. To avoid the negative influence of occlusion on reliable tracklet generation, part-based similarity computation is performed. Second, the produced tracklets are associated under both spatial and temporal constraints to output the entire trajectory of each person. In particular, we formulate spatiotemporal tracklet matching as a Maximum A Posteriori (MAP) problem in the form of a Markov chain with spatiotemporal context constraints. Experiments on the PETS 2012 dataset demonstrate the superiority of the proposed method.
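A greedy stand-in for the MAP tracklet association: score head-to-tail links by temporal gap and spatial distance, then accept the most plausible links first under one-to-one constraints (the paper solves a Markov-chain MAP problem; the tracklet fields and thresholds below are hypothetical):

```python
import numpy as np

def associate(tracklets, max_gap=10, max_dist=50.0):
    """Link tracklet i's tail to tracklet j's head when the temporal
    gap and spatial distance are plausible, closest links first."""
    links = []
    for i, a in enumerate(tracklets):
        for j, b in enumerate(tracklets):
            gap = b["start_t"] - a["end_t"]
            dist = np.linalg.norm(np.subtract(b["start_xy"], a["end_xy"]))
            if i != j and 0 < gap <= max_gap and dist <= max_dist:
                links.append((dist, i, j))
    pairs, used_tail, used_head = [], set(), set()
    for dist, i, j in sorted(links):           # most plausible links first
        if i not in used_tail and j not in used_head:
            pairs.append((i, j))
            used_tail.add(i)
            used_head.add(j)
    return pairs
```

Each chain of accepted links then yields one person's full trajectory.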

Collaboration


Dive into Weizhi Nie's collaborations.

Top Co-Authors

Zan Gao

Tianjin University of Technology

Mohan S. Kankanhalli

National University of Singapore
