Xinyi Cui | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Xinyi Cui is active.

Explore More

Publication

Featured researches published by Xinyi Cui.

computer vision and pattern recognition | 2011

Abnormal detection using interaction energy potentials

Xinyi Cui; Qingshan Liu; Mingchen Gao; Dimitris N. Metaxas

A new method is proposed to detect abnormal behaviors in human group activities. This approach effectively models group activities based on social behavior analysis. Different from previous work that uses independent local features, our method explores the relationships between the current behavior state of a subject and its actions. An interaction energy potential function is proposed to represent the current behavior state of a subject, and velocity is used as its actions. Our method does not depend on human detection or segmentation, so it is robust to detection errors. Instead, tracked spatio-temporal interest points are able to provide a good estimation of modeling group interaction. SVM is used to find abnormal events. We evaluate our algorithm in two datasets: UMN and BEHAVE. Experimental results show its promising performance against the state-of-art methods.

european conference on computer vision | 2012

Background subtraction using low rank and group sparsity constraints

Xinyi Cui; Junzhou Huang; Shaoting Zhang; Dimitris N. Metaxas

Background subtraction has been widely investigated in recent years. Most previous work has focused on stationary cameras. Recently, moving cameras have also been studied since videos from mobile devices have increased significantly. In this paper, we propose a unified and robust framework to effectively handle diverse types of videos, e.g., videos from stationary or moving cameras. Our model is inspired by two observations: 1) background motion caused by orthographic cameras lies in a low rank subspace, and 2) pixels belonging to one trajectory tend to group together. Based on these two observations, we introduce a new model using both low rank and group sparsity constraints. It is able to robustly decompose a motion trajectory matrix into foreground and background ones. After obtaining foreground and background trajectories, the information gathered on them is used to build a statistical model to further label frames at the pixel level. Extensive experiments demonstrate very competitive performance on both synthetic data and real videos.

acm multimedia | 2009

Temporal spectral residual: fast motion saliency detection

Xinyi Cui; Qingshan Liu; Dimitris N. Metaxas

Saliency detection has attracted much attention in recent years. It aims at locating semantic regions in images for further image understanding. In this paper, we address the issue of motion saliency detection for video content analysis. Inspired by the idea of Spectral Residual for image saliency detection, we propose a new method Temporal Spectral Residual on video slices along X-T and Y-T planes, which can automatically separate foreground motion objects from backgrounds, also with the help of threshold selection and voting schemes. Different from conventional background modeling methods with complex mathematical model, the proposed method is only based on Fourier spectrum analysis, so it is simple and fast. The power of our proposed method is demonstrated in the experiments of four typical videos with different dynamic background.

computer vision and pattern recognition | 2008

Facial expression recognition using encoded dynamic features

Peng Yang; Qingshan Liu; Xinyi Cui; Dimitris N. Metaxas

In this paper, we propose a novel framework for video-based facial expression recognition, which can handle the data with various time resolution including a single frame. We first use the haar-like features to represent facial appearance, due to their simplicity and effectiveness. Then we perform K-Means clustering on the facial appearance features to explore the intrinsic temporal patterns of each expression. Based on the temporal pattern models, we further map the facial appearance variations into dynamic binary patterns. Finally, boosting learning is performed to construct the expression classifiers. Compared to previous work, the dynamic binary patterns encode the intrinsic dynamics of expression, and our method makes no assumption on the time resolution of the data. Extensive experiments carried on the Cohn-Kanade database show the promising performance of the proposed method.

Computer Vision and Image Understanding | 2013

3D anatomical shape atlas construction using mesh quality preserved deformable models

Shaoting Zhang; Yiqiang Zhan; Xinyi Cui; Mingchen Gao; Junzhou Huang; Dimitris N. Metaxas

3D anatomical shape atlas construction has been extensively studied in medical image analysis research, owing to its importance in model-based image segmentation, longitudinal studies and populational statistical analysis, etc. Among multiple steps of 3D shape atlas construction, establishing anatomical correspondences across subjects, i.e., surface registration, is probably the most critical but challenging one. Adaptive focus deformable model (AFDM) [1] was proposed to tackle this problem by exploiting cross-scale geometry characteristics of 3D anatomy surfaces. Although the effectiveness of AFDM has been proved in various studies, its performance is highly dependent on the quality of 3D surface meshes, which often degrades along with the iterations of deformable surface registration (the process of correspondence matching). In this paper, we propose a new framework for 3D anatomical shape atlas construction. Our method aims to robustly establish correspondences across different subjects and simultaneously generate high-quality surface meshes without removing shape details. Mathematically, a new energy term is embedded into the original energy function of AFDM to preserve surface mesh qualities during deformable surface matching. More specifically, we employ the Laplacian representation to encode shape details and smoothness constraints. An expectation-maximization style algorithm is designed to optimize multiple energy terms alternatively until convergence. We demonstrate the performance of our method via a set of diverse applications, including a population of sparse cardiac MRI slices with 2D labels, 3D high resolution CT cardiac images and rodent brain MRIs with multiple structures. The constructed shape atlases exhibit good mesh qualities and preserve fine shape details. The constructed shape atlases can further benefit other research topics such as segmentation and statistical analysis.

Neurocomputing | 2012

Temporal Spectral Residual for fast salient motion detection

Xinyi Cui; Qingshan Liu; Shaoting Zhang; Fei Yang; Dimitris N. Metaxas

Motion saliency detection aims at finding the semantic regions in a video sequence. It is an important pre-processing step in many vision applications. In this paper, we propose a new algorithm, Temporal Spectral Residual, for fast motion saliency detection. Different from conventional motion saliency detection algorithms that use complex mathematical models, our goal is to find a good tradeoff between the computational efficiency and accuracy. The basic observation for salient motions is that on the cross section along the temporal axis of a video sequence, the regions of moving objects contain distinct signals while the background area contains redundant information. Thus our focus in this paper is to extract the salient information on the cross section, by utilizing the off-the-shelf method Spectral Residual, which is a 2D image saliency detection method. Majority voting strategy is also introduced to generate reliable results. Since the proposed method only involves Fourier spectrum analysis, it is computationally efficient. We validate our algorithm on two applications: background subtraction in outdoor video sequences under dynamic background and left ventricle endocardium segmentation in MR sequences. Compared with some state-of-art algorithms, our algorithm achieves both good accuracy and fast computation, which satisfies the need as a pre-processing method.

international conference on image processing | 2012

Robust face tracking with a consumer depth camera

Fei Yang; Junzhou Huang; Xiang Yu; Xinyi Cui; Dimitris N. Metaxas

We address the problem of tracking human faces under various poses and lighting conditions. Reliable face tracking is a challenging task. The shapes of the faces may change dramatically with various identities, poses and expressions. Moreover, poor lighting conditions may cause a low contrast image or cast shadows on faces, which will significantly degrade the performance of the tracking system. In this paper, we develop a framework to track face shapes by using both color and depth information. Since the faces in various poses lie on a nonlinear manifold, we build piecewise linear face models, each model covering a range of poses. The low-resolution depth image is captured by using Microsoft Kinect, and is used to predict head pose and generate extra constraints at the face boundary. Our experiments show that, by exploiting the depth information, the performance of the tracking system is significantly improved.

Archive | 2013

Towards accurate group activity analysis in videos: robust saliency detection and effective feature modeling

Dimitris N. Metaxas; Xinyi Cui

Human activity analysis is an important area of computer vision research today. The goal of human activity analysis is to automatically analyze ongoing activities from an unknown video. The ability to analyze complex human activities from videos has many important applications, such as smart camera system, video surveillance, etc. However, it is still far from an off-the-shelf system. There are many challenging problems and it is still an active research area. This dissertation focuses on addressing two problems: various camera motions and effective modeling of group behaviors. We propose a unified and robust framework to detect salient motions from diverse types of videos. Given a video sequence that is recorded from either a stationary or moving camera, our algorithm is able to detect the salient motion regions. The model is inspired by two observations: 1) background motion caused by orthographic cameras lies in a low rank subspace, and 2) pixels belonging to one trajectory tend to group together. Based on these two observations, we introduce a new model using both low rank and group sparsity constraints. It is able to robustly decompose a motion trajectory matrix into foreground and background ones. Extensive experiments demonstrate very competitive performance on both synthetic data and real videos. After salient motion detection, a new method is proposed to model group behaviors in video sequences. This approach effectively models group activities based on social behavior analysis. Different from previous work that uses independent local features, our method explores the relationships between the current behavior state of a subject and its actions. An interaction energy potential function is proposed to represent the current behavior state of a subject, and velocity is used as its actions. Our method does not depend on human detection, so it is robust to detection errors. Instead, tracked salient points are able to provide a good estimation of modeling group interaction. We evaluate our algorithm in two datasets: UMN and BEHAVE. Experimental results show its promising performance against the state-of-art methods.

international symposium on biomedical imaging | 2012

Left endocardium segmentation using spatio-temporal Metamorphs

Xinyi Cui; Shaoting Zhang; Junzhou Huang; Xiaolei Huang; Dimitris N. Metaxas; Leon Axel

The Metamorphs model is a robust segmentation method which integrates both shape and appearance in a unified space. The standard Metamorphs model does not encode temporal information. Thus it is not effective in segmenting time series data, such as a cardiac cycle from MRI. Furthermore, it needs manual interaction to initialize the model, which is time consuming for temporal data. In this paper, we proposed a model to seamlessly couple both spatial and temporal information together in the Metamorphs method. It is also able to automatically initialize the model instead of manual initialization. We model energy terms as probability maps, then different energy terms can be easily fused by multiplying them together. Temporal Spectral Residual (TSR) is employed to rapidly generate a probability map in temporal data. Compared to traditional Metamorphs, the computational overhead of our model is very light due to the efficiency of the TSR method and the ease of coupling different energy functions by using probability maps. We validate this algorithm in a task of segmenting the left ventricle endocardium from 2D MR sequences, and our method shows performance superior to the traditional Metamorphs.

Archive | 2014