Shuixian Chen
Wuhan University
Publications
Featured research published by Shuixian Chen.
Multimedia Tools and Applications | 2010
Shuixian Chen; Naixue Xiong; Jong Hyuk Park; Min Chen; Ruimin Hu
We use the Modified Discrete Cosine Transform (MDCT) to analyze and synthesize spatial parameters. MDCT by itself lacks the phase information and energy conservation needed to represent spatial parameters. Completing MDCT with the Modified Discrete Sine Transform (MDST) into “MDCT - j*MDST” overcomes this and enables a representation similar to that of the DFT. Moreover, owing to overlap-add in the time domain, an MDST spectrum can be built perfectly from the MDCT spectra of neighboring frames through a matrix-vector multiplication. The conversion matrix is highly diagonal, and keeping only a small number of its sub-diagonals suffices for a good approximation. When an MDCT-based core coder, such as Advanced Audio Coding (AAC), is used in spatial audio coding, no separate transform is needed for spatial processing, which significantly reduces computational complexity. Subjective listening tests also show that MDCT-domain spatial processing causes no quality impairment.
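The combination of MDCT and MDST into a DFT-like complex spectrum can be illustrated with a small numerical sketch. The code below is not the paper's implementation; it simply computes MDCT and MDST coefficients of one sine-windowed stereo frame directly from their definitions, forms the complex spectrum MDCT - j*MDST, and derives two common spatial cues (inter-channel level and phase differences). The frame length, window, toy signal and cue choices are assumptions for illustration.

```python
import numpy as np

def mdct_mdst(frame, window):
    """Compute MDCT and MDST of one 2N-sample frame directly from the definitions (O(N^2))."""
    two_n = len(frame)
    n = two_n // 2
    x = frame * window
    k = np.arange(n)[:, None]                      # spectral bin index
    t = np.arange(two_n)[None, :] + 0.5 + n / 2    # time index with the MDCT phase offset
    arg = np.pi / n * t * (k + 0.5)
    return np.cos(arg) @ x, np.sin(arg) @ x        # MDCT, MDST coefficients

N = 256                                            # half frame length (assumed)
window = np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))   # sine window

# toy stereo frame: the right channel is a delayed, attenuated copy of the left
t = np.arange(2 * N)
left = np.sin(2 * np.pi * 0.03 * t)
right = 0.7 * np.sin(2 * np.pi * 0.03 * (t - 4))

spec = {}
for name, ch in (("L", left), ("R", right)):
    mdct, mdst = mdct_mdst(ch, window)
    spec[name] = mdct - 1j * mdst                  # DFT-like complex spectrum

# spatial cues per bin, computed as one would from a DFT spectrum
ild = 10 * np.log10((np.abs(spec["L"]) ** 2 + 1e-12) /
                    (np.abs(spec["R"]) ** 2 + 1e-12))   # inter-channel level difference (dB)
ipd = np.angle(spec["L"] * np.conj(spec["R"]))          # inter-channel phase difference (rad)

k0 = int(0.03 * 2 * N)                             # bin closest to the test tone
print(f"ILD at tone bin: {ild[k0]:.1f} dB, IPD: {ipd[k0]:.2f} rad")
```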
international conference on multimedia and expo | 2009
Shuixian Chen; Ruimin Hu; Shuhua Zhang
Although widely used elsewhere, MDCT is excluded from the current scheme for spatial cue representation because it lacks phase information and energy conservation. Combining MDCT with MDST overcomes these difficulties. Moreover, MDST spectra can be built perfectly from neighboring MDCT spectra, and the MDCT-to-MDST conversion, in matrix form, can be approximated by a banded sparse matrix. When applied to spatial audio coding with MDCT-based core coders, this method avoids a separate transform for cue representation and saves significant computation. Listening tests also show that it achieves the same audio quality as other complex-transform-based methods.
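A rough numerical way to see both claims is to fit the linear map from the stacked MDCT spectra of three consecutive frames to the MDST spectrum of the middle frame on a random signal, then measure how much of the current-frame block's energy lies near its main diagonal. The sketch below does this with a small frame size and a sine window; it is an illustration of the banded-matrix claim, not the paper's derivation, and the frame size and fitting procedure are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 32                                             # half frame length (assumed, small for speed)
window = np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))
k = np.arange(N)[:, None]
t = np.arange(2 * N)[None, :] + 0.5 + N / 2
C = np.cos(np.pi / N * t * (k + 0.5)) * window     # windowed MDCT analysis matrix (N x 2N)
S = np.sin(np.pi / N * t * (k + 0.5)) * window     # windowed MDST analysis matrix (N x 2N)

# MDCT/MDST spectra of overlapping frames of a long random signal
frames = 800
x = rng.standard_normal((frames + 1) * N)
mdct = np.stack([C @ x[m * N: m * N + 2 * N] for m in range(frames)])
mdst = np.stack([S @ x[m * N: m * N + 2 * N] for m in range(frames)])

# least-squares fit: MDST of frame m from the MDCT of frames m-1, m, m+1
features = np.hstack([mdct[:-2], mdct[1:-1], mdct[2:]])   # (frames-2, 3N)
targets = mdst[1:-1]                                       # (frames-2, N)
T = np.linalg.lstsq(features, targets, rcond=None)[0].T    # (N, 3N) conversion matrix

residual = targets - features @ T.T
print("relative fit error:", np.linalg.norm(residual) / np.linalg.norm(targets))

# how much of the current-frame block's energy sits within +/- B of the main diagonal?
T_cur = T[:, N:2 * N]
i, j = np.indices(T_cur.shape)
total = np.sum(T_cur ** 2)
for B in (1, 2, 4, 8):
    frac = np.sum(T_cur[np.abs(i - j) <= B] ** 2) / total
    print(f"band +/-{B}: {frac:.3f} of current-frame block energy")
```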
Neurocomputing | 2016
Shiming Ge; Rui Yang; Yuqing He; Kaixuan Xie; Hongsong Zhu; Shuixian Chen
Accurate eye localization plays a key role in many face analysis related applications. In this paper, we propose a novel statistic-based eye localization framework built on a group of trained filter arrays called a multi-channel correlation filter bank (MCCFB). Each filter array in the bank is suited to a different face condition, so combining these filter arrays locates eyes more precisely under varying poses, appearances and illuminations than single-filter-based or single-filter-array-based methods. To demonstrate the performance of our framework, we compare MCCFB with other statistic-based eye localization methods; experimental results show the superiority of our method in detection ratio, localization accuracy and robustness.
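As a rough illustration of correlation-filter-based localization (not the trained MCCFB itself), the sketch below correlates a face image with each filter in a small bank via the FFT and takes the strongest peak over all filters as the predicted eye position. The filters here are random placeholders; in the paper they would be learned, one array per face condition.

```python
import numpy as np

def correlate_fft(image, template):
    """Circular cross-correlation of image with template via the FFT (same size as image)."""
    F_img = np.fft.fft2(image)
    F_tpl = np.fft.fft2(template, s=image.shape)
    return np.real(np.fft.ifft2(F_img * np.conj(F_tpl)))

def localize_with_bank(image, filter_bank):
    """Return the (row, col) of the strongest correlation peak over all filters in the bank."""
    best_peak, best_pos = -np.inf, None
    for filt in filter_bank:
        response = correlate_fft(image, filt)
        pos = np.unravel_index(np.argmax(response), response.shape)
        if response[pos] > best_peak:
            best_peak, best_pos = response[pos], pos
    return best_pos

# toy example: a 64x64 "face" with a bright 5x5 blob standing in for an eye
rng = np.random.default_rng(0)
face = rng.standard_normal((64, 64)) * 0.1
face[20:25, 30:35] += 1.0

# placeholder bank: in practice each filter would be trained for one face condition
bank = [np.ones((5, 5)), np.eye(5)]
print("predicted eye position:", localize_with_bank(face, bank))
```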
international conference on image processing | 2015
Hui Wen; Shiming Ge; Shuixian Chen; Hongtao Wang; Limin Sun
Detecting abnormal events plays an essential role in video content analysis and has received increasing attention in surveillance systems. One major problem in abnormal event detection is the imbalanced classification issue caused by the rarity of abnormal samples. Another is the difficulty of detecting anomalies within a reasonable amount of computation time. To address these problems, we propose an adaptive cascade dictionary learning framework for detecting anomalies. The framework treats anomaly detection as a one-class classification problem with a cascade of dictionaries. Each stage of the cascade constructs an adaptive dictionary and detects anomalies with an inexpensive least-squares optimization solution. Experiments on benchmark datasets demonstrate that the proposed method performs better than several state-of-the-art methods.
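One way to read the "cascade of dictionaries with a least-squares solution" is as a sequence of stages that each reconstruct a test feature from a dictionary of normal samples via closed-form ridge regression and pass it on only if the reconstruction error stays high; samples surviving every stage are flagged as anomalies. The sketch below implements that reading with random data; the dictionary construction, thresholds and features are placeholders, not the paper's adaptive scheme.

```python
import numpy as np

def ridge_codes(D, X, lam=0.1):
    """Closed-form ridge codes A minimizing ||X - D A||^2 + lam * ||A||^2."""
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ X)

def reconstruction_error(D, X, lam=0.1):
    """Per-sample reconstruction error of the columns of X under dictionary D."""
    return np.linalg.norm(X - D @ ridge_codes(D, X, lam), axis=0)

def cascade_detect(X, dictionaries, thresholds):
    """A sample well explained by any stage's dictionary is declared normal and dropped early;
    samples whose error stays above every stage's threshold are flagged as anomalies."""
    candidates = np.arange(X.shape[1])
    for D, thr in zip(dictionaries, thresholds):
        err = reconstruction_error(D, X[:, candidates])
        candidates = candidates[err > thr]          # keep only the still-suspicious samples
        if candidates.size == 0:
            break
    flags = np.zeros(X.shape[1], dtype=bool)
    flags[candidates] = True
    return flags

# toy data: normal samples lie near a 5-dimensional subspace of a 100-dimensional feature space
rng = np.random.default_rng(1)
basis = rng.standard_normal((100, 5))
normal = basis @ rng.standard_normal((5, 200)) + 0.05 * rng.standard_normal((100, 200))
test = np.hstack([basis @ rng.standard_normal((5, 20)) + 0.05 * rng.standard_normal((100, 20)),
                  rng.standard_normal((100, 5))])    # 20 normal samples, then 5 anomalies

# two stages: a small, cheap dictionary first, a larger one for the survivors
dictionaries = [normal[:, :15], normal[:, :60]]
thresholds = [3.0, 3.0]
print("flagged as anomalous:", np.flatnonzero(cascade_detect(test, dictionaries, thresholds)))
```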
international conference on multimedia and expo | 2014
Shiming Ge; Rui Yang; Hui Wen; Shuixian Chen; Limin Sun
Eye localization is a key step in many face analysis related applications. In this paper, we present a novel eye localization method based on a group of trained filters called a correlation filter bank (CFB). We formulate eye localization as an optimization problem with a well-defined cost function based on the CFB. The CFB is trained with an EM-like adaptive clustering approach. The trained filter bank includes several discriminative filter templates, each suited to a face condition different from the others, and thus provides accurate eye localization under varying poses, appearances and illuminations. Simulation comparisons with a cascade classifier-based method [1], traditional single correlation filter based methods [2][3] and a pictorial structure model based method [4] demonstrate the superiority of the proposed method in both detection ratio and localization accuracy.
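The "EM-like adaptive clustering" of filters can be pictured as alternating two steps: assign each training image to the filter in the bank whose correlation peak on it is strongest, then re-train each filter on its assigned images with a closed-form, MOSSE-style frequency-domain update. The sketch below is a generic stand-in under those assumptions, with random toy images and Gaussian target responses; it is not the paper's cost function or training data.

```python
import numpy as np

def gaussian_target(shape, center, sigma=2.0):
    """Desired correlation output: a Gaussian peak at the labelled eye position."""
    r, c = np.indices(shape)
    return np.exp(-((r - center[0]) ** 2 + (c - center[1]) ** 2) / (2 * sigma ** 2))

def train_mosse(images, targets, eps=1e-2):
    """Closed-form correlation filter (MOSSE-style); the returned array is the conjugate
    filter, applied by elementwise multiplication with an image spectrum."""
    num = np.zeros(images[0].shape, dtype=complex)
    den = np.zeros(images[0].shape, dtype=complex)
    for img, tgt in zip(images, targets):
        F, G = np.fft.fft2(img), np.fft.fft2(tgt)
        num += G * np.conj(F)
        den += F * np.conj(F)
    return num / (den + eps)

def peak_response(H, img):
    return np.real(np.fft.ifft2(H * np.fft.fft2(img))).max()

def train_filter_bank(images, targets, n_filters=2, n_iters=5, seed=0):
    rng = np.random.default_rng(seed)
    assign = rng.integers(n_filters, size=len(images))       # random initial clustering
    bank = [train_mosse(images, targets)] * n_filters         # start from a shared filter
    for _ in range(n_iters):
        # M-step: re-train each filter on its cluster (keep the old one if the cluster is empty)
        for f in range(n_filters):
            members = [i for i, a in enumerate(assign) if a == f]
            if members:
                bank[f] = train_mosse([images[i] for i in members],
                                      [targets[i] for i in members])
        # E-step: re-assign each image to the filter with the strongest peak on it
        assign = np.array([np.argmax([peak_response(H, im) for H in bank])
                           for im in images])
    return bank, assign

# toy data: two "face conditions" with differently shaped eye patterns at different positions
rng = np.random.default_rng(1)
shape = (32, 32)
images, targets = [], []
for center, pattern in zip([(10, 8), (20, 24)], [np.ones((3, 3)), np.eye(3)]):
    for _ in range(10):
        img = 0.1 * rng.standard_normal(shape)
        img[center[0] - 1:center[0] + 2, center[1] - 1:center[1] + 2] += pattern
        images.append(img)
        targets.append(gaussian_target(shape, center))

bank, assign = train_filter_bank(images, targets)
print("cluster assignments:", assign)
```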
Journal of Computer Science and Technology | 2006
Hao-Jun Ai; Shuixian Chen; Ruimin Hu
This paper describes a general audio coding algorithm that has recently been standardized by AVS, China. The algorithm is based on a perceptual coding technique. The codec delivers near CD-quality audio at 128 kb/s. This paper describes the coder structure in detail and discusses the reasons for specific design choices. A summary of the subjective test results is presented for the prototype codec. A Comparison Mean Opinion Score (CMOS) test indicates that the quality of the AVS audio coder is comparable with the MPEG Layer-3 audio coder. A real-time decoder based on a 16-bit fixed-point DSP was used for the characterization test. The performance of the DSP solution, including computational complexity and storage characteristics, is demonstrated.
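For context on the 128 kb/s figure: CD audio is 2 channels at 44.1 kHz with 16-bit samples, about 1411 kb/s, so the codec operates at roughly an 11:1 compression ratio. A one-line check, assuming those standard CD parameters:

```python
# CD PCM rate vs. the 128 kb/s operating point quoted above (standard CD parameters assumed)
cd_rate_kbps = 2 * 44_100 * 16 / 1000      # ~1411.2 kb/s
print(f"compression ratio ~= {cd_rate_kbps / 128:.1f}:1")
```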
Eurasip Journal on Wireless Communications and Networking | 2010
Shuixian Chen; Ruimin Hu; Naixue Xiong
Multimedia data usually have to be compressed before transmission; a higher compression rate, or equivalently a lower bitrate, relieves the load on communication channels but degrades quality. We investigate the lower bitrate bound for perceptually lossless compression of a major type of multimedia, multichannel audio signals. This bound equals the perceptible information rate of the signals. Traditionally, Perceptual Entropy (PE), based primarily on monaural hearing, measures the perceptual information rate of individual channels. But PE cannot measure the spatial information captured by binaural hearing and is therefore not suitable for estimating the Spatial Audio Coding (SAC) bitrate bound. To measure this spatial information, we build a Binaural Cue Physiological Perception Model (BCPPM) grounded in binaural hearing, which represents spatial information in the physical and physiological layers. This model enables computing Spatial Perceptual Entropy (SPE), the lower bitrate bound for SAC. For real-world stereo audio signals of various types, our experiments indicate that SPE reliably estimates their spatial information rate. Therefore, SPE plus PE gives lower bitrate bounds for communicating multichannel audio signals with transparent quality.
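The idea that a perceptual-entropy-style quantity lower-bounds the bitrate can be sketched with a simplified per-band estimate: treat each spectral band as a Gaussian source whose allowed coding distortion is its masking threshold, so each band needs at most max(0, 0.5 * log2(energy / threshold)) bits per coefficient. This is a generic rate-distortion approximation for illustration, not the paper's PE or SPE definition, and the band energies and thresholds below are made up.

```python
import numpy as np

def pe_like_bits(band_energy, masking_threshold):
    """Simplified perceptual-entropy-style estimate: Gaussian rate-distortion bound per band,
    0.5 * log2(energy / threshold) bits per coefficient, floored at zero for masked bands."""
    ratio = np.asarray(band_energy) / np.asarray(masking_threshold)
    return np.maximum(0.0, 0.5 * np.log2(ratio))

# made-up example: 6 bands with their coefficient counts, energies and masking thresholds
coeffs_per_band = np.array([4, 4, 8, 8, 16, 16])
energy = np.array([1e4, 5e3, 2e3, 4e2, 5e1, 1e0])
threshold = np.array([1e1, 2e1, 4e1, 5e1, 4e1, 2e1])   # the last band is fully masked

bits_per_coeff = pe_like_bits(energy, threshold)
frame_bits = np.sum(coeffs_per_band * bits_per_coeff)
print("bits per coefficient per band:", np.round(bits_per_coeff, 2))
print("estimated bits for this frame:", round(float(frame_bits), 1))
```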
multimedia and ubiquitous engineering | 2009
Shuixian Chen; Ruimin Hu; Jong Hyuk Park; Naixue Xiong; Sang-Soo Yeo
We present a method of analyzing and synthesizing spatial parameters using only the Modified Discrete Cosine Transform (MDCT). Combining it with the Modified Discrete Sine Transform (MDST) enables spatial parameter representation. Instead of transforming directly, MDST spectra can be built perfectly from neighboring MDCT spectra by a conversion matrix, which is highly diagonal and can be approximated by a small number of its sub-diagonals. With MDCT-based core coders, such as Advanced Audio Coding (AAC), no separate transform is needed for spatial coding, which significantly reduces computational complexity.
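The saving from the banded approximation can be pictured by keeping only a few diagonals of a conversion matrix and comparing the result of the truncated product with the full one; a banded product costs O(N*B) rather than O(N^2) when implemented with a banded routine. The matrix below is an arbitrary stand-in for the true MDCT-to-MDST conversion, and the bandwidth and decay model are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
N, B = 64, 4                                    # spectrum size and half-bandwidth (placeholders)

# stand-in for the conversion matrix: strong main diagonal, fast decay away from it
i, j = np.indices((N, N))
T = rng.standard_normal((N, N)) / (1.0 + np.abs(i - j)) ** 2

T_band = np.where(np.abs(i - j) <= B, T, 0.0)   # keep only the 2B+1 central diagonals

x = rng.standard_normal(N)                      # an MDCT spectrum (random placeholder)
exact, approx = T @ x, T_band @ x
print("kept entries:", np.count_nonzero(T_band), "of", N * N)
print("relative error of banded product:", np.linalg.norm(exact - approx) / np.linalg.norm(exact))
```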
conference on industrial electronics and applications | 2014
Hui Wen; Shiming Ge; Rui Yang; Shuixian Chen; Limin Sun
This paper presents a discriminative sparse point matching method (DSPM) for tracking generic objects in vision applications. Unlike conventional tracking methods that involve constructing high-level or self-learned features, DSPM focuses on an optical flow based point matching optimization that copes with object deformation during motion. The algorithm addresses two key issues: a stable point matching method based on a global smoothing constraint with optical flow correspondence, and a discriminative sparse point selection strategy that distinguishes the object from its surrounding background. Owing to the efficient sparse point matching, the algorithm can track objects that undergo fast motion and considerable shape or appearance variations. The proposed tracking method has been thoroughly evaluated on challenging benchmark video sequences and achieves excellent experimental results.
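A generic flavour of flow-based sparse point tracking (not the authors' DSPM, and without its discriminative selection step) can be sketched with OpenCV: pick corner points inside the object box, track them with pyramidal Lucas-Kanade flow, and keep only points that pass a forward-backward consistency check as a crude stand-in for the smoothing constraint. The function names below are standard OpenCV; the video file, box and threshold are assumptions.

```python
import cv2
import numpy as np

def track_points(prev_gray, next_gray, box, fb_thresh=1.0, max_pts=100):
    """Track sparse points inside `box` (x, y, w, h) from prev_gray to next_gray,
    keeping only points that pass a forward-backward consistency check."""
    x, y, w, h = box
    mask = np.zeros_like(prev_gray)
    mask[y:y + h, x:x + w] = 255
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_pts,
                                  qualityLevel=0.01, minDistance=5, mask=mask)
    if pts is None:
        return np.empty((0, 2)), np.empty((0, 2))
    # forward flow, then backward flow from the tracked positions
    fwd, st_f, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    bwd, st_b, _ = cv2.calcOpticalFlowPyrLK(next_gray, prev_gray, fwd, None)
    fb_err = np.linalg.norm(pts - bwd, axis=2).ravel()
    ok = (st_f.ravel() == 1) & (st_b.ravel() == 1) & (fb_err < fb_thresh)
    return pts[ok].reshape(-1, 2), fwd[ok].reshape(-1, 2)

# usage sketch (assumed file name and box): estimate the object's shift between two frames
cap = cv2.VideoCapture("video.mp4")
ok1, frame1 = cap.read()
ok2, frame2 = cap.read()
if ok1 and ok2:
    g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    src, dst = track_points(g1, g2, box=(100, 80, 60, 60))
    if len(src):
        print("median displacement:", np.median(dst - src, axis=0))
```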
chinese conference on pattern recognition | 2014
Rui Yang; Shiming Ge; Kaixuan Xie; Shuixian Chen
Accurate eye localization plays a key role in many face analysis related applications. In this paper, we propose a novel eye localization framework built on a group of trained filter arrays called a multi-channel correlation filter bank (MCCFB). Each filter array in the bank is suited to a different face condition, so combining these filter arrays locates eyes more precisely under varying poses, appearances and illuminations than a single filter or a single filter array. To demonstrate the performance of our strategy, we compare MCCFB with other eye localization methods; experimental results show the superiority of our method in detection ratio, localization accuracy and robustness.