Chenxia Wu | Researchain

Publication

Featured research published by Chenxia Wu.

IEEE Transactions on Knowledge and Data Engineering | 2013

Semi-Supervised Nonlinear Hashing Using Bootstrap Sequential Projection Learning

Chenxia Wu; Jianke Zhu; Deng Cai; Chun Chen; Jiajun Bu

In this paper, we study an effective semi-supervised hashing method under the framework of regularized learning-based hashing. A nonlinear hash function is introduced to capture the underlying relationship among data points. Thus, the dimensionality of the matrix used for computation is not only independent of the dimensionality of the original data space but also much smaller than that required by a linear hash function. To deal effectively with the error accumulated when converting the real-valued embeddings into binary codes after relaxation, we propose a semi-supervised nonlinear hashing algorithm using bootstrap sequential projection learning, which corrects the errors by taking into account all the previously learned bits holistically without incurring extra computational overhead. Experimental results on six benchmark data sets demonstrate that the presented method outperforms state-of-the-art hashing algorithms by a large margin.

Robotics: Science and Systems | 2014

Hierarchical Semantic Labeling for Task-Relevant RGB-D Perception

Chenxia Wu; Ian Lenz; Ashutosh Saxena

Semantic labeling of RGB-D scenes is very important in enabling robots to perform mobile manipulation tasks, but different tasks may require entirely different sets of labels. For example, when navigating to an object, we may need only a single label denoting its class, but to manipulate it, we might need to identify individual parts. In this work, we present an algorithm that produces hierarchical labelings of a scene, following is-part-of and is-type-of relationships. Our model is based on a Conditional Random Field that relates pixel-wise and pair-wise observations to labels. We encode hierarchical labeling constraints into the model while keeping inference tractable. Our model thus predicts different specificities in labeling based on its confidence: if it is not sure whether an object is Pepsi or Sprite, it will predict soda rather than making an arbitrary choice. In extensive experiments, both offline on standard datasets and in online robotic experiments, we show that our model outperforms other state-of-the-art methods in labeling performance as well as in success rate for robotic tasks.
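
The confidence-based back-off described in this abstract can be illustrated with a small sketch. This is not the paper's Conditional Random Field; it only shows the idea of returning the most specific label that is still confident. The hierarchy `PARENT`, the root name, and the `threshold` value are hypothetical choices made for illustration.

```python
# Hypothetical is-type-of hierarchy (child -> parent); not taken from the paper.
PARENT = {"pepsi": "soda", "sprite": "soda", "soda": "drink",
          "milk": "drink", "drink": None}
ROOT = "drink"


def node_probabilities(leaf_probs):
    """Propagate leaf probabilities upward: a node's mass is the sum over its leaves."""
    totals = {}
    for leaf, p in leaf_probs.items():
        node = leaf
        while node is not None:
            totals[node] = totals.get(node, 0.0) + p
            node = PARENT[node]
    return totals


def depth(node):
    """Number of edges between a node and the root of the hierarchy."""
    d = 0
    while PARENT[node] is not None:
        node = PARENT[node]
        d += 1
    return d


def predict(leaf_probs, threshold=0.7):
    """Return the most specific label whose probability reaches the threshold."""
    totals = node_probabilities(leaf_probs)
    confident = [n for n, p in totals.items() if p >= threshold]
    # Prefer deeper (more specific) nodes; fall back to the root if nothing qualifies.
    return max(confident, key=lambda n: (depth(n), totals[n]), default=ROOT)
```

With leaf probabilities split evenly between `pepsi` and `sprite`, neither leaf reaches the threshold, so the sketch backs off to their shared parent `soda`.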

Computer Vision and Pattern Recognition | 2015

Watch-n-patch: Unsupervised understanding of actions and relations

Chenxia Wu; Jiemi Zhang; Silvio Savarese; Ashutosh Saxena

We focus on modeling human activities comprising multiple actions in a completely unsupervised setting. Our model learns the high-level action co-occurrence and temporal relations between the actions in the activity video. We consider the video as a sequence of short-term action clips, called action-words, and an activity is about a set of action-topics indicating which actions are present in the video. We then propose a new probabilistic model relating the action-words and the action-topics. It allows us to model the long-range action relations that commonly exist in complex activities, which were challenging to capture in previous work. We apply our model to unsupervised action segmentation and recognition, and also to a novel application that detects forgotten actions, which we call action patching. For evaluation, we also contribute a new challenging RGB-D activity video dataset recorded by the new Kinect v2, which contains several human daily activities as compositions of multiple actions interacting with different objects. Extensive experiments show the effectiveness of our model.

ACM Multimedia | 2012

Unsupervised face-name association via commute distance

Jiajun Bu; Bin Xu; Chenxia Wu; Chun Chen; Jianke Zhu; Deng Cai; Xiaofei He

Recently, the task of unsupervised face-name association has received considerable interest in the multimedia and information retrieval communities. It is quite different from the generic facial image annotation problem because of its unsupervised and ambiguous assignment properties. Specifically, the task of face-name association should obey the following three constraints: (1) a face can only be assigned to a name appearing in its associated caption or to null; (2) a name can be assigned to at most one face; and (3) a face can be assigned to at most one name. Many conventional methods have been proposed to tackle this task, but they suffer from some common problems, e.g., many of them are computationally expensive and find it hard to make the null assignment decision. In this paper, we design a novel framework named face-name association via commute distance (FACD), which judges face-name and face-null assignments under a unified framework via the commute distance (CD) algorithm. Then, to further speed up the on-line processing, we propose a novel anchor-based commute distance (ACD) algorithm whose main idea is to use the anchor point representation structure to accelerate the eigen-decomposition of the adjacency matrix of a graph. Systematic experimental results on a large-scale, real-world image-caption database with a total of 194,046 detected faces and 244,725 names show that our proposed approach outperforms many state-of-the-art methods in performance. Our framework is appropriate for a large-scale, real-time system.
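
The three assignment constraints listed in this abstract can be written as a small validity check. This is only a sketch of the constraints, not of the FACD or ACD algorithms; the function name `is_valid_assignment` and the data layout (a dict from face id to a name or `None`) are assumptions made for illustration.

```python
def is_valid_assignment(assignment, caption_names):
    """Check a face-to-name assignment for one image-caption pair.

    assignment: dict mapping each face id to a name, or to None for the null assignment.
    caption_names: set of names that appear in the associated caption.
    """
    used_names = set()
    for face, name in assignment.items():
        if name is None:
            # A face may be left unassigned (null); this never violates a constraint.
            continue
        if name not in caption_names:
            # Constraint (1): only names from the caption (or null) are allowed.
            return False
        if name in used_names:
            # Constraint (2): a name can be assigned to at most one face.
            return False
        used_names.add(name)
    # Constraint (3) holds by construction: a dict maps each face to a single value.
    return True
```

For example, assigning the same caption name to two faces is rejected, while leaving the second face as `None` is accepted.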

Computer Vision and Pattern Recognition | 2012

A content-based video copy detection method with randomly projected binary features

Chenxia Wu; Jianke Zhu; Jiemi Zhang

Video copy detection has been actively studied in a wide range of multimedia applications. This paper presents a novel content-based video copy detection method using randomly projected binary features. A very efficient sparse random projection method is employed to encode the image features while retaining their discrimination capability. By taking advantage of the extremely fast similarity computation of binary features using Hamming distance, we present a keyframe-based copy retrieval method that exhaustively searches the copy candidates from a large video database without indexing. Moreover, an effective scoring and localization algorithm is proposed to further refine the retrieved copies and accurately locate the video segments. An experimental evaluation has been performed to show the efficacy of the proposed randomly projected binary features. The promising results in the TRECVID2011 [14] content-based copy detection task demonstrate the effectiveness of our proposed approach.
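
The general idea of sign-binarized sparse random projections compared by Hamming distance can be sketched as follows. This is a minimal illustration, not the paper's method: the Achlioptas-style entry distribution (+1, 0, -1 with probabilities 1/6, 2/3, 1/6), the bit count, and all function names are assumptions. The usual scaling factor of the projection is omitted because only the sign of each projection is kept.

```python
import random


def sparse_projection_matrix(n_bits, dim, seed=0):
    """Build an n_bits x dim matrix with entries +1, 0, -1 (prob. 1/6, 2/3, 1/6)."""
    rng = random.Random(seed)
    return [[rng.choice((1, 0, 0, 0, 0, -1)) for _ in range(dim)]
            for _ in range(n_bits)]


def binary_code(feature, matrix):
    """Project the feature and keep only the sign of each projection as one bit."""
    code = 0
    for row in matrix:
        dot = sum(w * x for w, x in zip(row, feature))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def nearest(query_code, database_codes):
    """Exhaustive, index-free search: return the index of the closest code."""
    return min(range(len(database_codes)),
               key=lambda i: hamming(query_code, database_codes[i]))
```

Packing each code into a single integer keeps the comparison to one XOR and a bit count, which is what makes exhaustive search over many keyframes plausible without an index.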

European Conference on Computer Vision | 2012

A convolutional treelets binary feature approach to fast keypoint recognition

Chenxia Wu; Jianke Zhu; Jiemi Zhang; Chun Chen; Deng Cai

Fast keypoint recognition is essential to many vision tasks. In contrast to the classification-based approaches [1,2], we directly formulate the keypoint recognition as an image patch retrieval problem, which enjoys the merit of finding the matched keypoint and its pose simultaneously. A novel convolutional treelets approach is proposed to effectively extract the binary features from the patches. A corresponding sub-signature-based locality sensitive hashing scheme is employed for the fast approximate nearest neighbor search in patch retrieval. Experiments on both synthetic data and real-world images have shown that our method performs better than state-of-the-art descriptor-based and classification-based approaches.

International Conference on Big Data | 2013

Sparse Poisson coding for high dimensional document clustering

Chenxia Wu; Haiqin Yang; Jianke Zhu; Jiemi Zhang; Irwin King; Michael R. Lyu

Document clustering plays an important role in large-scale textual data analysis, which generally faces the great challenge of high-dimensional textual data. One remedy is to learn a high-level sparse representation using sparse coding techniques. In contrast to traditional Gaussian noise-based sparse coding methods, in this paper, we employ a Poisson distribution model to represent the word-count frequency feature of a text for sparse coding. Moreover, a novel sparse-constrained Poisson regression algorithm is proposed to solve the induced optimization problem. Unlike previous Poisson regression methods, which use the family of ℓ1-regularization to encourage a sparse solution, we introduce a sparsity ratio measure which makes use of both the ℓ1-norm and the ℓ2-norm of the learned weight. An important advantage of the sparsity ratio is that it is bounded between 0 and 1. This makes it easy to set for practical applications. To further make the algorithm tractable for high-dimensional textual data, a projected gradient descent algorithm is proposed to solve the regression problem. Extensive experiments have been conducted to show that our proposed approach can achieve an effective representation for document clustering compared with state-of-the-art regression methods.
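
The abstract does not state the formula for its sparsity ratio. One well-known measure that combines the ℓ1-norm and ℓ2-norm and is bounded between 0 and 1 is Hoyer's sparseness; the sketch below assumes that measure purely for illustration, and the paper's definition may differ. The function name `sparsity_ratio` is hypothetical.

```python
import math


def sparsity_ratio(w):
    """Hoyer-style sparseness: (sqrt(n) - ||w||_1 / ||w||_2) / (sqrt(n) - 1).

    Returns 0 when all entries have equal magnitude and 1 when exactly
    one entry is non-zero.
    """
    n = len(w)
    if n < 2:
        raise ValueError("need at least two weights")
    l1 = sum(abs(x) for x in w)
    l2 = math.sqrt(sum(x * x for x in w))
    if l2 == 0.0:
        raise ValueError("ratio is undefined for the zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)
```

Because the value lies in a fixed range regardless of the vector's length or scale, a target such as 0.8 means the same thing for any weight vector, which is one reason a bounded ratio is easier to set than an ℓ1 penalty weight.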

International Conference on Robotics and Automation | 2016

Watch-Bot: Unsupervised learning for reminding humans of forgotten actions

Chenxia Wu; Jiemi Zhang; Bart Selman; Silvio Savarese; Ashutosh Saxena

We present a robotic system that watches a human using a Kinect v2 RGB-D sensor, detects what he forgot to do while performing an activity, and if necessary reminds the person using a laser pointer to point out the related object. Our simple setup can be easily deployed on any assistive robot. Our approach is based on a learning algorithm trained in a purely unsupervised setting, which does not require any human annotations. This makes our approach scalable and applicable to a variety of scenarios. Our model learns the action/object co-occurrence and action temporal relations in the activity, and uses the learned rich relationships to infer the forgotten action and the related object. We show that our approach not only improves the unsupervised action segmentation and action cluster assignment performance, but also effectively detects the forgotten actions on a challenging human activity RGB-D video dataset. In robotic experiments, we show that our robot is able to remind people of forgotten actions successfully.

IEEE Transactions on Systems, Man, and Cybernetics | 2015

Treelets Binary Feature Retrieval for Fast Keypoint Recognition

Jianke Zhu; Chenxia Wu; Chun Chen; Deng Cai

Fast keypoint recognition is essential to many vision tasks. In contrast to the classification-based approaches, we directly formulate the keypoint recognition as an image patch retrieval problem, which enjoys the merit of finding the matched keypoint and its pose simultaneously. To effectively extract the binary features from each patch surrounding the keypoint, we make use of the treelets transform, which can group highly correlated data together and reduce noise through local analysis. Treelets is a multiresolution analysis tool, which provides an orthogonal basis to reflect the geometry of the noise-free data. To facilitate real-world applications, we have proposed two novel approaches. One is the convolutional treelets that capture the image patch information locally and globally while reducing the computational cost. The other is the higher-order treelets that reflect the relationship between the rows and columns within an image patch. An efficient sub-signature-based locality sensitive hashing scheme is employed for fast approximate nearest neighbor search in patch retrieval. Experimental evaluations on both synthetic data and the real-world Oxford dataset have shown that our proposed treelets binary feature retrieval methods outperform the state-of-the-art feature descriptors and classification-based approaches.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2018

Watch-n-Patch: Unsupervised Learning of Actions and Relations

Chenxia Wu; Jiemi Zhang; Ozan Sener; Bart Selman; Silvio Savarese; Ashutosh Saxena

There is a large variation in the activities that humans perform in their everyday lives. We consider modeling these composite human activities, which comprise multiple basic-level actions, in a completely unsupervised setting. Our model learns high-level co-occurrence and temporal relations between the actions. We consider the video as a sequence of short-term action clips, which contain human-words and object-words. An activity is about a set of action-topics and object-topics indicating which actions are present and which objects are being interacted with. We then propose a new probabilistic model relating the words and the topics. It allows us to model the long-range action relations that commonly exist in composite activities, which was challenging in previous work. We apply our model to unsupervised action segmentation and clustering, and to a novel application that detects forgotten actions, which we call action patching. For evaluation, we contribute a new challenging RGB-D activity video dataset recorded by the new Kinect v2, which contains several human daily activities as compositions of multiple actions interacting with different objects. Moreover, we develop a robotic system that watches and reminds people using our action patching algorithm. Our robotic setup can be easily deployed on any assistive robot.
