Publication


Featured research published by Xin Sun.


Pattern Recognition | 2013

Sparse coding based visual tracking: Review and experimental comparison

Shengping Zhang; Hongxun Yao; Xin Sun; Xiusheng Lu

Recently, sparse coding has been successfully applied in visual tracking. The goal of this paper is to review the state-of-the-art tracking methods based on sparse coding. We first analyze the benefits of using sparse coding in visual tracking and then categorize these methods into appearance modeling based on sparse coding (AMSC) and target searching based on sparse representation (TSSR), as well as their combination. For each category, we introduce the basic framework and subsequent improvements, with emphasis on their advantages and disadvantages. Finally, we conduct extensive experiments to compare the representative methods on a total of 20 test sequences. The experimental results indicate that: (1) AMSC methods significantly outperform TSSR methods. (2) For AMSC methods, both a discriminative dictionary and spatial-order-preserving pooling operators are important for achieving high tracking accuracy. (3) For TSSR methods, the widely used identity pixel basis degrades performance when the target or candidate images are not well aligned or severe occlusion occurs. (4) For TSSR methods, ℓ1-norm minimization is not necessary; ℓ2-norm minimization obtains comparable performance at lower computational cost. The open questions and future research topics are also discussed.
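Finding (4) of the review can be illustrated with a small sketch: represent a candidate patch over a template dictionary with ℓ2 regularization (ridge, one closed-form solve) versus ℓ1 regularization (here a plain ISTA loop), and compare reconstruction errors. All data, dimensions and the solvers are illustrative stand-ins, not the trackers compared in the paper.

```python
import numpy as np

def l2_coefficients(D, y, lam=0.1):
    """Ridge (l2-regularized) coefficients: a single closed-form solve."""
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ y)

def l1_coefficients(D, y, lam=0.1, steps=200):
    """l1-regularized coefficients via ISTA: iterative, costlier."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(steps):
        g = D.T @ (D @ x - y)              # gradient of the quadratic term
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 8))           # 8 template atoms, 20-dim patches
y = D[:, 2] * 1.5 + 0.01 * rng.standard_normal(20)  # candidate near atom 2

e2 = np.linalg.norm(y - D @ l2_coefficients(D, y))
e1 = np.linalg.norm(y - D @ l1_coefficients(D, y))
```

On a toy problem like this the two residuals are close, while the ℓ2 solve is a single linear system, mirroring the review's observation that ℓ2 minimization can substitute for ℓ1 in TSSR-style trackers at lower cost.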


Neurocomputing | 2013

Robust visual tracking based on online learning sparse representation

Shengping Zhang; Hongxun Yao; Huiyu Zhou; Xin Sun; Shaohui Liu

Handling appearance variations is a very challenging problem for visual tracking. Existing methods usually solve this problem by relying on an effective appearance model with two features: (1) being capable of discriminating the tracked target from its background, and (2) being robust to the target's appearance variations during tracking. Instead of integrating the two requirements into one appearance model, in this paper we propose a tracking method that deals with these problems separately, based on sparse representation in a particle filter framework. Each target candidate defined by a particle is linearly represented by the target and background templates with an additive representation error. Discriminating the target from its background is achieved by activating the target templates or the background templates in the linear system in a competitive manner. The target's appearance variations are directly modeled as the representation error. An online algorithm is used to learn the basis functions that sparsely span the representation error. The linear system is solved via ℓ1 minimization. The candidate with the smallest reconstruction error using the target templates is selected as the tracking result. We test the proposed approach on four sequences with heavy occlusions, large pose variations, drastic illumination changes and low foreground-background contrast. The proposed approach shows excellent performance in comparison with two recent state-of-the-art trackers.
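The selection rule in this abstract — score each particle's candidate by how well the target templates alone reconstruct it — can be sketched as follows. The template matrices, dimensions, and the plain least-squares stand-in for the paper's sparse solver are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 30
T = rng.standard_normal((d, 5))    # target templates (hypothetical)
B = rng.standard_normal((d, 5))    # background templates (hypothetical)

def target_error(y, T):
    """Reconstruction error of a candidate using the target templates only
    (least-squares stand-in for the sparse solver in the paper)."""
    c, *_ = np.linalg.lstsq(T, y, rcond=None)
    return np.linalg.norm(y - T @ c)

# one candidate drawn from the target subspace, two from the background
cands = [T @ rng.standard_normal(5),
         B @ rng.standard_normal(5),
         B @ rng.standard_normal(5)]
best = min(range(3), key=lambda i: target_error(cands[i], T))
```

The candidate lying in the span of the target templates reconstructs with near-zero error and is selected, which is the discrimination-by-competition idea in miniature.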


Information Sciences | 2014

Action recognition based on overcomplete independent components analysis

Shengping Zhang; Hongxun Yao; Xin Sun; Kuanquan Wang; Jun Zhang; Xiusheng Lu; Yanhao Zhang

Existing works on action recognition rely on two separate stages: (1) designing hand-crafted features or learning features from video data; and (2) classifying the features using a classifier such as SVM or AdaBoost. Motivated by two observations: (1) independent component analysis (ICA) is capable of encoding the intrinsic features underlying video data; and (2) videos of different actions can be easily distinguished by their intrinsic features, we propose a simple but effective action recognition framework based on the recently proposed overcomplete ICA model. After a set of overcomplete ICA basis functions is learned from densely sampled 3D patches from the training videos of each action, a test video is assigned to the class whose basis functions reconstruct the sampled 3D patches from the test video with the smallest reconstruction error. Experimental results on five benchmark datasets demonstrate that the proposed approach outperforms several state-of-the-art works.
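The classification rule described above — assign a test video to the action whose learned basis reconstructs its sampled 3D patches with the smallest error — can be sketched as follows. Random matrices stand in for the learned overcomplete ICA bases, and least squares stands in for the ICA inference step; none of this is the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
# hypothetical per-action bases, learned beforehand in the real method
bases = {a: rng.standard_normal((50, 10)) for a in ("walk", "wave")}

def recon_error(patches, W):
    """Total least-squares reconstruction error of patches under basis W
    (a simplified stand-in for overcomplete ICA inference)."""
    C, *_ = np.linalg.lstsq(W, patches.T, rcond=None)
    return np.linalg.norm(patches.T - W @ C)

def classify(patches, bases):
    """Minimum-reconstruction-error classification over action bases."""
    return min(bases, key=lambda a: recon_error(patches, bases[a]))

# synthetic test video: 40 "3D patches" drawn from the "walk" basis
patches = (bases["walk"] @ rng.standard_normal((10, 40))).T
```

Because the synthetic patches lie in the span of the "walk" basis, that class reconstructs them with near-zero error and wins.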


ACM Transactions on Intelligent Systems and Technology | 2012

Robust Visual Tracking Using an Effective Appearance Model Based on Sparse Coding

Shengping Zhang; Hongxun Yao; Xin Sun; Shaohui Liu

Intelligent video surveillance is currently one of the most active research topics in computer vision, especially when facing the explosion of video data captured by a large number of surveillance cameras. As a key step of an intelligent surveillance system, robust visual tracking is very challenging for computer vision. However, it is a basic functionality of the human visual system (HVS). Psychophysical findings have shown that the receptive fields of simple cells in the visual cortex can be characterized as spatially localized, oriented, and bandpass, forming a sparse, distributed representation of natural images. In this article, motivated by these findings, we propose an effective appearance model based on sparse coding and apply it to visual tracking. Specifically, we consider the responses of general basis functions extracted by independent component analysis on a large set of natural image patches as features, and model the appearance of the tracked target as the probability distribution of these features. In order to make the tracker more robust to partial occlusion, camouflage environments, pose changes, and illumination changes, we further select the features that are related to the target based on an entropy-gain criterion and ignore those that are not. The target is finally represented by the probability distribution of those related features. The target search is performed by minimizing the Matusita distance between the distributions of the target model and a candidate using Newton-style iterations. The experimental results validate that the proposed method is more robust and effective than three state-of-the-art methods.
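The Matusita distance used for the target search above has a simple closed form for discrete feature distributions; a minimal sketch follows (the histograms here are illustrative, not the ICA-feature distributions of the paper).

```python
import numpy as np

def matusita(p, q):
    """Matusita distance between two discrete probability distributions:
    sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

# identical distributions -> 0; disjoint distributions -> sqrt(2)
d_same = matusita([0.5, 0.5], [0.5, 0.5])
d_far = matusita([1.0, 0.0], [0.0, 1.0])
```

It relates to the Bhattacharyya coefficient BC(p, q) by M = sqrt(2 - 2·BC), so minimizing it drives a candidate's feature distribution toward the target model, which is what the Newton-style search exploits.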


Computer Vision and Pattern Recognition | 2011

A novel supervised level set method for non-rigid object tracking

Xin Sun; Hongxun Yao; Shengping Zhang

We present a novel approach to non-rigid object tracking based on a supervised level set model (SLSM). In contrast with conventional level set models, which emphasize intensity consistency only and consider no priors, the curve evolution of the proposed SLSM is object-oriented and supervised by specific knowledge of the target we want to track. Therefore, the SLSM can ensure a more accurate convergence to the target in tracking applications. In particular, we first construct the appearance model for the target in an online boosting manner, owing to its strong discriminative power between objects and background. Then the probability of the contour is modeled by considering both region and edge cues in a Bayesian manner, leading the curve to converge to the candidate region with the maximum likelihood of being the target. Finally, the accurate target region qualifies the samples fed to the boosting procedure, as well as the target model prepared for the next time step. A positive decrease rate is used to adjust the learning pace over time, enabling tracking to continue under partial and total occlusion. Experimental results on a number of challenging sequences validate the effectiveness of the technique.


IEEE Transactions on Image Processing | 2015

Non-Rigid Object Contour Tracking via a Novel Supervised Level Set Model

Xin Sun; Hongxun Yao; Shengping Zhang; Dong Li

We present a novel approach to non-rigid object contour tracking in this paper, based on a supervised level set model (SLSM). In contrast to most existing trackers, which use a bounding box to specify the tracked target, the proposed method extracts the accurate contour of the target as the tracking output, which achieves a better description of non-rigid objects while reducing background pollution of the target model. Moreover, conventional level set models only emphasize regional intensity consistency and consider no priors. In contrast, the curve evolution of the proposed SLSM is object-oriented and supervised by specific knowledge of the targets we want to track. Therefore, the SLSM can ensure a more accurate convergence to the exact targets in tracking applications. In particular, we first construct the appearance model for the target in an online boosting manner, owing to its strong discriminative power between the object and the background. Then, the learnt target model is incorporated to model the probabilities of the level set contour in a Bayesian manner, leading the curve to converge to the candidate region with the maximum likelihood of being the target. Finally, the accurate target region qualifies the samples fed to the boosting procedure, as well as the target model prepared for the next time step. We first describe the proposed mechanism of the two-phase SLSM for single-target tracking, and then give its generalized multi-phase version for multi-target tracking. A positive decrease rate is used to adjust the learning pace over time, enabling tracking to continue under partial and total occlusion. Experimental results on a number of challenging sequences validate the effectiveness of the proposed method.


International Conference on Image Processing | 2011

Robust visual tracking via context objects computing

Zhongqian Sun; Hongxun Yao; Shengping Zhang; Xin Sun

Occlusion is a challenging issue for robust visual tracking. In this paper, motivated by the fact that a tracked object is usually embedded in a context that provides useful information for estimating the target, we propose a novel tracking algorithm named Tracking with Context Prediction (TCP). The context here includes neighboring objects and specific parts of the target. The proposed method simultaneously tracks the target and the context objects using existing tracking methods. The positions of the context objects are used to predict the position of the target, so the target can be stably tracked even when it is partially or fully occluded. By computing the probability of each prediction being the target, our algorithm tolerates the drifting of context objects during tracking and does not require that the predictions from all context objects be correct. Experiments on challenging sequences show significant improvements, especially in the case of occlusions and appearance changes.
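The prediction step described above can be sketched as a weighted vote: each tracked context object proposes a target position through its learned offset, and the votes are fused by confidence weights. The positions, offsets, and weights below are hypothetical, and the fusion rule is a simplification of the probabilistic weighting in the paper.

```python
import numpy as np

def predict_target(context_positions, offsets, weights):
    """Fuse per-context-object predictions of the target position.
    Each context object votes with its position plus a learned offset;
    votes are averaged with confidence weights."""
    votes = np.asarray(context_positions, float) + np.asarray(offsets, float)
    w = np.asarray(weights, float)
    return (w[:, None] * votes).sum(axis=0) / w.sum()

# two context objects whose offsets both point at the (occluded) target
ctx = [[10.0, 10.0], [50.0, 55.0]]
off = [[5.0, 0.0], [-35.0, -45.0]]
pred = predict_target(ctx, off, [1.0, 1.0])
```

Down-weighting a drifting context object keeps its vote from pulling the fused estimate away, which is how the method tolerates incorrect predictions from some context objects.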


Visual Communications and Image Processing | 2010

Robust object tracking based on sparse representation

Shengping Zhang; Hongxun Yao; Xin Sun; Shaohui Liu

In this paper, we propose a novel and robust object tracking algorithm based on sparse representation. Object tracking is formulated as an object recognition problem rather than a traditional search problem. All target candidates are considered as training samples, and the target template is represented as a linear combination of all training samples. The combination coefficients are obtained by solving for the minimum ℓ1-norm solution. The final tracking result is the target candidate associated with the non-zero coefficient. Experimental results on two challenging test sequences show that the proposed method is more effective than the widely used mean shift tracker.
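The minimum ℓ1-norm step in this formulation can be sketched with a basic ISTA loop: stack the candidates as columns, sparsely code the template over them, and read off the candidate carrying the dominant coefficient. The candidate matrix, dimensions, and solver are illustrative; the paper's actual optimizer is not specified here.

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator for the l1 proximal step."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code(A, y, lam=0.05, steps=300):
    """Minimum-l1-norm coefficients of y over the columns of A via ISTA
    (a simple stand-in for the l1 solver used in the paper)."""
    L = np.linalg.norm(A, 2) ** 2          # step-size bound (spectral norm^2)
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x = soft(x - A.T @ (A @ x - y) / L, lam / L)
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((25, 12))          # columns = target candidates
y = A[:, 7].copy()                         # template matches candidate 7
x = sparse_code(A, y)
best = int(np.argmax(np.abs(x)))           # candidate with dominant weight
```

Sparsity concentrates the representation on the matching candidate, so the recovered coefficient vector directly identifies the tracking result.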


International Conference on Internet Multimedia Computing and Service | 2011

Robust object tracking via inertial potential based mean shift

Xin Sun; Hongxun Yao; Shengping Zhang

We present a novel mean shift approach in this paper for robust object tracking, based on an inertial potential model. Conventional mean shift based trackers exploit only the appearance information of the observation to determine the target location, which usually cannot effectively distinguish the foreground from the background in complex scenes. In contrast, by constructing an inertial potential model, the proposed algorithm adaptively makes use of the motion information of previous frames to track the target. The probability of each candidate is then modeled by considering both photometric and motion cues in a Bayesian manner, leading the mean shift vector to converge to the location with the maximum likelihood of being the target. Experimental results on several challenging video sequences verify that the proposed method is more robust and effective than traditional mean shift based trackers in many complicated scenes.
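For a discrete set of candidate locations, the Bayesian fusion of photometric and motion cues described above reduces to multiplying the appearance likelihood by the inertial-motion prior and renormalizing. The numbers below are hypothetical and only illustrate how the motion cue breaks an appearance tie.

```python
import numpy as np

def combine(appearance_lik, motion_prior):
    """Posterior over candidate locations: appearance likelihood times the
    inertial-motion prior, renormalized (a minimal Bayesian-fusion sketch)."""
    post = np.asarray(appearance_lik, float) * np.asarray(motion_prior, float)
    return post / post.sum()

# ambiguous appearance (two similar candidates); motion favors the second
app = np.array([0.48, 0.50, 0.02])
mot = np.array([0.20, 0.70, 0.10])
post = combine(app, mot)
```

With appearance alone the first two candidates are nearly tied; the inertial prior resolves the ambiguity toward the motion-consistent candidate, which is the failure mode of appearance-only mean shift that the paper targets.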


Proceedings of the 4th International SenseCam & Pervasive Imaging Conference | 2013

Eating activity detection from images acquired by a wearable camera

Xin Sun; Hongxun Yao; Wenyan Jia; Mingui Sun

We present an eating activity detection method that automatically detects dining plates in images acquired chronically by a wearable camera. Convex edge segments and their combinations within each input image are modeled with respect to their probabilities of belonging to candidate ellipses. A dining plate is then determined according to a confidence score. Finally, the presence or absence of an eating event in an image sequence is determined by analyzing successive frames. Our experimental results verify the effectiveness of this method.

Collaboration

Top co-authors of Xin Sun:

- Hongxun Yao (Harbin Institute of Technology)
- Shengping Zhang (Harbin Institute of Technology)
- Xiusheng Lu (Harbin Institute of Technology)
- Shaohui Liu (Harbin Institute of Technology)
- Yanhao Zhang (Harbin Institute of Technology)
- Zhongqian Sun (Harbin Institute of Technology)
- Mingui Sun (University of Pittsburgh)
- Dong Li (Harbin Institute of Technology)
- Julong Pan (China Jiliang University)