Xue Mei
Toyota
Publication
Featured research published by Xue Mei.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011
Xue Mei; Haibin Ling
In this paper, we propose a robust visual tracking method by casting tracking as a sparse approximation problem in a particle filter framework. In this framework, occlusion, noise, and other challenging issues are addressed seamlessly through a set of trivial templates. Specifically, to find the tracking target in a new frame, each target candidate is sparsely represented in the space spanned by target templates and trivial templates. The sparsity is achieved by solving an ℓ1-regularized least-squares problem. Then, the candidate with the smallest projection error is taken as the tracking target. After that, tracking is continued using a Bayesian state inference framework. Two strategies are used to further improve the tracking performance. First, target templates are dynamically updated to capture appearance changes. Second, nonnegativity constraints are enforced to filter out clutter which negatively resembles tracking targets. We test the proposed approach on numerous sequences involving different types of challenges, including occlusion and variations in illumination, scale, and pose. The proposed approach demonstrates excellent performance in comparison with previously proposed trackers. We also extend the method for simultaneous tracking and recognition by introducing a static template set which stores target images from different classes. The recognition result at each frame is propagated to produce the final result for the whole video. The approach is validated on a vehicle tracking and classification task using outdoor infrared video sequences.
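The sparse-representation step described in this abstract can be illustrated with a minimal sketch. It assumes a target template matrix `T` (one template per column), stacks it with positive and negative trivial templates (identity matrices), and solves the nonnegative ℓ1-regularized least-squares problem with a plain projected proximal-gradient loop. The solver choice, the function names, and the parameters `lam` and `n_iter` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nonneg_l1_ls(B, y, lam=0.01, n_iter=500):
    """Minimize 0.5*||B c - y||^2 + lam*sum(c) subject to c >= 0.

    Projected proximal-gradient (ISTA-style) loop; an illustrative solver,
    not necessarily the one used in the paper.
    """
    step = 1.0 / (np.linalg.norm(B, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    c = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ c - y)
        # For c >= 0 the l1 penalty is linear, so the prox is a shifted projection.
        c = np.maximum(0.0, c - step * (grad + lam))
    return c

def candidate_error(T, y, lam=0.01, n_iter=500):
    """Residual of candidate y w.r.t. the target templates (hypothetical helper).

    T: (d, n) target templates, one per column. The trivial templates I and -I
    absorb occlusion and noise, so only the target coefficients are used
    when measuring the error.
    """
    d, n = T.shape
    B = np.hstack([T, np.eye(d), -np.eye(d)])
    c = nonneg_l1_ls(B, y, lam, n_iter)
    a = c[:n]  # coefficients on the target templates
    return np.linalg.norm(y - T @ a), a

def pick_target(T, candidates, lam=0.01):
    """Index of the candidate (column of `candidates`) with the smallest error."""
    errors = [candidate_error(T, candidates[:, j], lam)[0]
              for j in range(candidates.shape[1])]
    return int(np.argmin(errors)), errors
```

In a full tracker the candidates would be image patches drawn by the particle filter, and the error would feed the observation likelihood; here `pick_target` simply returns the best-matching column.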
International Conference on Computer Vision | 2009
Xue Mei; Haibin Ling
In this paper we propose a robust visual tracking method by casting tracking as a sparse approximation problem in a particle filter framework. In this framework, occlusion, corruption and other challenging issues are addressed seamlessly through a set of trivial templates. Specifically, to find the tracking target at a new frame, each target candidate is sparsely represented in the space spanned by target templates and trivial templates. The sparsity is achieved by solving an ℓ1-regularized least squares problem. Then the candidate with the smallest projection error is taken as the tracking target. After that, tracking is continued using a Bayesian state inference framework in which a particle filter is used for propagating sample distributions over time. Two additional components further improve the robustness of our approach: 1) the nonnegativity constraints that help filter out clutter that is similar to tracked targets in reversed intensity patterns, and 2) a dynamic template update scheme that keeps track of the most representative templates throughout the tracking procedure. We test the proposed approach on five challenging sequences involving heavy occlusions, drastic illumination changes, and large pose variations. The proposed approach shows excellent performance in comparison with three previously proposed trackers.
Computer Vision and Pattern Recognition | 2015
Zhibin Hong; Zhe Chen; Chaohui Wang; Xue Mei; Danil V. Prokhorov; Dacheng Tao
Variations in the appearance of a tracked object, such as changes in geometry/photometry, camera viewpoint, illumination, or partial occlusion, pose a major challenge to object tracking. Here, we adopt cognitive psychology principles to design a flexible representation that can adapt to changes in object appearance during tracking. Inspired by the well-known Atkinson-Shiffrin Memory Model, we propose MUlti-Store Tracker (MUSTer), a dual-component approach consisting of short- and long-term memory stores to process target appearance memories. A powerful and efficient Integrated Correlation Filter (ICF) is employed in the short-term store for short-term tracking. The integrated long-term component, which is based on keypoint matching-tracking and RANSAC estimation, can interact with the long-term memory and provide additional information for output control. MUSTer was extensively evaluated on the CVPR2013 Online Object Tracking Benchmark (OOTB) and ALOV++ datasets. The experimental results demonstrated the superior performance of MUSTer in comparison with other state-of-the-art trackers.
Computer Vision and Pattern Recognition | 2011
Xue Mei; Haibin Ling; Yi Wu; Erik Blasch; Li Bai
Recently, sparse representation has been applied to visual tracking to find the target with the minimum reconstruction error from the target template subspace. Though effective, these L1 trackers incur high computational costs due to numerous calculations for ℓ1 minimization. In addition, the inherent occlusion insensitivity of the ℓ1 minimization has not been fully utilized. In this paper, we propose an efficient L1 tracker with minimum error bound and occlusion detection, which we call the Bounded Particle Resampling (BPR)-L1 tracker. First, the minimum error bound is quickly calculated from a linear least squares equation, and serves as a guide for particle resampling in a particle filter framework. Without loss of precision during resampling, most insignificant samples are removed before solving the computationally expensive ℓ1 minimization function. The BPR technique enables us to speed up the L1 tracker without sacrificing accuracy. Second, we perform occlusion detection by investigating the trivial coefficients in the ℓ1 minimization. These coefficients, by design, contain rich information about image corruptions including occlusion. Detected occlusions enhance the template updates to effectively reduce the drifting problem. The proposed method shows good performance as compared with several state-of-the-art trackers on challenging benchmark sequences.
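The idea of a minimum error bound from a linear least-squares problem can be sketched as follows. The unconstrained least-squares residual of a candidate against the target templates can never exceed the residual obtained with any other coefficient vector, so it gives an upper bound on an exponential observation likelihood. The sketch below is a simplification with a single threshold; the likelihood form `exp(-alpha * error**2)`, the parameters `alpha` and `tau`, and the function names are assumptions for illustration and do not reproduce the paper's resampling procedure.

```python
import numpy as np

def ls_error_bound(T, Y):
    """Least-squares residual norm for each candidate (column of Y).

    T: (d, n) target templates; Y: (d, m) candidates. Since least squares
    minimizes ||y - T a|| over all a, this is a lower bound on the error
    that any constrained or regularized fit can achieve.
    """
    A, *_ = np.linalg.lstsq(T, Y, rcond=None)
    return np.linalg.norm(Y - T @ A, axis=0)

def prune_candidates(T, Y, alpha=50.0, tau=1e-3):
    """Keep candidates whose likelihood upper bound reaches tau.

    Assumes a likelihood of the form exp(-alpha * error**2); candidates that
    fail the test could be skipped before the expensive l1 solve.
    """
    bound = ls_error_bound(T, Y)
    upper = np.exp(-alpha * bound ** 2)  # likelihood cannot exceed this value
    keep = np.flatnonzero(upper >= tau)
    return keep, upper
```

Only the candidates indexed by `keep` would then be passed to the ℓ1 solver, which is where the speedup in this kind of scheme comes from.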
IEEE Transactions on Image Processing | 2013
Xue Mei; Haibin Ling; Yi Wu; Erik Blasch; Li Bai
Recently, sparse representation has been applied to visual tracking to find the target with the minimum reconstruction error from a target template subspace. Though effective, these L1 trackers incur high computational costs due to numerous calculations for ℓ1 minimization. In addition, the inherent occlusion insensitivity of the ℓ1 minimization has not been fully characterized. In this paper, we propose an efficient L1 tracker, named bounded particle resampling (BPR)-L1 tracker, with a minimum error bound and occlusion detection. First, the minimum error bound is calculated from a linear least squares equation and serves as a guide for particle resampling in a particle filter (PF) framework. Most of the insignificant samples are removed before solving the computationally expensive ℓ1 minimization in a two-step testing procedure. The first step, named τ testing, compares the sample observation likelihood to an ordered set of thresholds to remove insignificant samples without loss of resampling precision. The second step, named max testing, identifies the largest sample probability relative to the target to further remove insignificant samples without altering the tracking result of the current frame. Though it sacrifices minimal precision during resampling, max testing achieves a significant speedup on top of τ testing. The BPR-L1 technique can also be beneficial to other trackers that have minimum error bounds in a PF framework, especially for trackers based on sparse representations. After the error-bound calculation, BPR-L1 performs occlusion detection by investigating the trivial coefficients in the ℓ1 minimization. These coefficients, by design, contain rich information about image corruptions, including occlusion. Detected occlusions are then used to enhance the template updating.
For evaluation, we conduct experiments on three video applications: biometrics (head movement, hand holding object, singers on stage), pedestrians (urban travel, hallway monitoring), and cars in traffic (wide area motion imagery, ground-mounted perspectives). The proposed BPR-L1 method demonstrates excellent performance compared with nine state-of-the-art trackers on eleven challenging benchmark sequences.
International Conference on Computer Vision | 2013
Zhibin Hong; Xue Mei; Danil V. Prokhorov; Dacheng Tao
Combining multiple observation views has proven beneficial for tracking. In this paper, we cast tracking as a novel multi-task multi-view sparse learning problem and exploit the cues from multiple views including various types of visual features, such as intensity, color, and edge, where each feature observation can be sparsely represented by a linear combination of atoms from an adaptive feature dictionary. The proposed method is integrated in a particle filter framework where every view in each particle is regarded as an individual task. We jointly consider the underlying relationship between tasks across different views and different particles, and tackle it in a unified robust multi-task formulation. In addition, to capture the frequently emerging outlier tasks, we decompose the representation matrix into two collaborative components, which enable a more robust and accurate approximation. We show that the proposed formulation can be efficiently solved using the Accelerated Proximal Gradient method with a small number of closed-form updates. The presented tracker is implemented using four types of features and is tested on numerous benchmark video sequences. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared to several state-of-the-art trackers.
International Conference on Computer Vision | 2011
Yi Wu; Haibin Ling; Jingyi Yu; Feng Li; Xue Mei; Erkang Cheng
Visual tracking plays an important role in many computer vision tasks. A common assumption in previous methods is that the video frames are blur free. In reality, motion blurs are pervasive in real videos. In this paper we present a novel BLUr-driven Tracker (BLUT) framework for tracking motion-blurred targets. BLUT actively uses the information from blurs without performing deblurring. Specifically, we integrate the tracking problem with the motion-from-blur problem under a unified sparse approximation framework. We further use the motion information inferred by blurs to guide the sampling process in the particle-filter-based tracking. To evaluate our method, we have collected a large number of video sequences with significant motion blurs and compared BLUT with state-of-the-art trackers. Experimental results show that, while many previous methods are sensitive to motion blurs, BLUT can robustly and reliably track severely blurred targets.
IEEE Transactions on Neural Networks | 2017
Jun Li; Xue Mei; Danil V. Prokhorov; Dacheng Tao
Hierarchical neural networks have been shown to be effective in learning representative image features and recognizing object classes. However, most existing networks combine the low/middle level cues for classification without accounting for any spatial structures. For applications such as understanding a scene, how the visual cues are spatially distributed in an image becomes essential for successful analysis. This paper extends the framework of deep neural networks by accounting for the structural cues in the visual signals. In particular, two kinds of neural networks are proposed. First, we develop a multitask deep convolutional network, which simultaneously detects the presence of the target and the geometric attributes (location and orientation) of the target with respect to the region of interest. Second, a recurrent neuron layer is adopted for structured visual detection. The recurrent neurons can deal with the spatial distribution of visible cues belonging to an object whose shape or structure is difficult to explicitly define. Both networks are demonstrated on the practical task of detecting lane boundaries in traffic scenes. The multitask convolutional neural network provides auxiliary geometric information to help the subsequent modeling of the given lane structures. The recurrent neural network automatically detects lane boundaries, including those areas containing no marks, without any explicit prior knowledge or secondary modeling.
International Conference on Computer Vision | 2009
Xue Mei; Haibin Ling; David W. Jacobs
Scenes with cast shadows can produce complex sets of images. These images cannot be well approximated by low-dimensional linear subspaces. However, in this paper we show that the set of images produced by a Lambertian scene with cast shadows can be efficiently represented by a sparse set of images generated by directional light sources. We first model an image with cast shadows as composed of a diffusive part (without cast shadows) and a residual part that captures cast shadows. Then, we express the problem in an ℓ1-regularized least squares formulation, with nonnegativity constraints. This sparse representation enjoys an effective and fast solution, thanks to recent advances in compressive sensing. In experiments on both synthetic and real data, our approach performs favorably in comparison to several previously proposed methods.