Publication


Featured research published by Xiang Sean Zhou.


Multimedia Systems | 2003

Relevance Feedback in Image Retrieval: A Comprehensive Review

Xiang Sean Zhou; Thomas S. Huang

Abstract. We analyze the nature of the relevance feedback problem in a continuous representation space in the context of content-based image retrieval. Emphasis is put on exploring the uniqueness of the problem and comparing the assumptions, implementations, and merits of various solutions in the literature. An attempt is made to compile a list of critical issues to consider when designing a relevance feedback algorithm. With a comprehensive review as the main portion, this paper also offers some novel solutions and perspectives throughout the discussion.


International Conference on Image Processing | 2001

One-class SVM for learning in image retrieval

Yunqiang Chen; Xiang Sean Zhou; Thomas S. Huang

Relevance feedback schemes using linear/quadratic estimators have been applied in content-based image retrieval to improve retrieval performance significantly. One major difficulty in relevance feedback is to estimate the support of target images in high dimensional feature space with a relatively small number of training samples. We develop a novel scheme based on one-class SVM, which fits a tight hyper-sphere in the nonlinearly transformed feature space to include most of the target images based on positive examples. The use of a kernel provides us an elegant way to deal with nonlinearity in the distribution of the target images, while the regularization term in SVM provides good generalization ability. To validate the efficacy of the proposed approach, we test it on both synthesized data and real-world images. Promising results are achieved in both cases.
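
The hyper-sphere idea in the abstract above can be illustrated with a simplified sketch. This is not the authors' one-class SVM: it assumes an RBF kernel, places the sphere centre at the plain kernel-space mean of the positive examples (uniform weights instead of learned support-vector weights), and sets the radius from a quantile of the training distances. The names `KernelSphere`, `gamma`, and `keep` are hypothetical.

```python
import math
from typing import List, Sequence


def rbf(a: Sequence[float], b: Sequence[float], gamma: float) -> float:
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


class KernelSphere:
    """Sphere in kernel feature space fitted to positive examples only.

    Simplification (assumption): the centre is the uniform mean of the
    mapped positives, whereas a one-class SVM learns per-example weights.
    """

    def __init__(self, positives: Sequence[Sequence[float]],
                 gamma: float = 0.5, keep: float = 0.9) -> None:
        self.pos: List[List[float]] = [list(p) for p in positives]
        self.gamma = gamma
        n = len(self.pos)
        # <c, c> term of the squared distance, where c is the kernel-space mean.
        self.center_norm2 = sum(
            rbf(a, b, gamma) for a in self.pos for b in self.pos) / (n * n)
        # Radius chosen so that roughly a fraction `keep` of positives fall inside.
        dists = sorted(self.dist2(p) for p in self.pos)
        self.radius2 = dists[min(n - 1, math.ceil(keep * n) - 1)]

    def dist2(self, x: Sequence[float]) -> float:
        """Squared kernel-space distance from x to the centre."""
        cross = sum(rbf(x, p, self.gamma) for p in self.pos) / len(self.pos)
        return 1.0 - 2.0 * cross + self.center_norm2  # k(x, x) = 1 for RBF

    def is_target(self, x: Sequence[float]) -> bool:
        return self.dist2(x) <= self.radius2


# Hypothetical 2-D feature vectors of images the user marked as relevant.
positives = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1)]
sphere = KernelSphere(positives)

# Rank a small hypothetical database: smaller distance means more relevant.
database = [(5.0, 5.0), (0.0, 0.0), (1.0, 1.0)]
ranked = sorted(database, key=sphere.dist2)
```

For retrieval, database images would be ranked by `dist2`, so only positive feedback is needed to order the results.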


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

Total variation models for variable lighting face recognition

Terrence Chen; Wotao Yin; Xiang Sean Zhou; Dorin Comaniciu; Thomas S. Huang

In this paper, we present the logarithmic total variation (LTV) model for face recognition under varying illumination, including natural lighting conditions, where we rarely know the strength, direction, or number of light sources. The proposed LTV model has the ability to factorize a single face image and obtain the illumination invariant facial structure, which is then used for face recognition. Our model is inspired by the SQI model but has better edge-preserving ability and simpler parameter selection. The merit of this model is that it requires neither a lighting assumption nor any training. The LTV model reaches very high recognition rates in the tests using both Yale and CMU PIE face databases as well as a face database containing 765 subjects under outdoor lighting conditions.


ACM Multimedia | 2007

Feature selection using principal feature analysis

Yijuan Lu; Ira Cohen; Xiang Sean Zhou; Qi Tian

Dimensionality reduction of a feature set is a common preprocessing step used for pattern recognition and classification applications. Principal Component Analysis (PCA) is one of the popular methods used, and can be shown to be optimal using different optimality criteria. However, it has the disadvantage that measurements from all the original features are used in the projection to the lower dimensional space. This paper proposes a novel method for dimensionality reduction of a feature set by choosing a subset of the original features that contains most of the essential information, using the same criteria as PCA. We call this method Principal Feature Analysis (PFA). The proposed method is successfully applied for choosing the principal features in face tracking and content-based image retrieval (CBIR) problems.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

An information fusion framework for robust shape tracking

Xiang Sean Zhou; Alok Gupta; Dorin Comaniciu

Existing methods for incorporating subspace model constraints in shape tracking use only partial information from the measurements and model distribution. We propose a unified framework for robust shape tracking, optimally fusing heteroscedastic uncertainties or noise from measurement, system dynamics, and a subspace model. The resulting nonorthogonal subspace projection and fusion are natural extensions of the traditional model constraint using orthogonal projection. We present two motion measurement algorithms and introduce alternative solutions for measurement uncertainty estimation. We build shape models offline from training data and exploit information from the ground truth initialization online through a strong model adaptation. Our framework is applied for tracking in echocardiograms where the motion estimation errors are heteroscedastic in nature, each heart has a distinct shape, and the relative motions of epicardial and endocardial borders reveal crucial diagnostic features. The proposed method significantly outperforms the existing shape-space-constrained tracking algorithm. Due to the complete treatment of heteroscedastic uncertainties, the strong model adaptation, and the coupled tracking of double-contours, robust performance is observed even on the most challenging cases.
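
The fusion of heteroscedastic uncertainties mentioned above can be sketched in its simplest form. This is a minimal illustration, not the paper's formulation: it assumes independent Gaussian sources with diagonal covariances, so fusion reduces to adding precisions per coordinate, and it leaves out the nonorthogonal subspace projection. The function name `fuse_diagonal` and the sample numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

Estimate = Tuple[Sequence[float], Sequence[float]]  # (mean, variance) per coordinate


def fuse_diagonal(sources: Sequence[Estimate]) -> Tuple[List[float], List[float]]:
    """Fuse independent Gaussian estimates that have diagonal covariances.

    Fusion happens in information space: precisions (1 / variance) add up,
    and the fused mean is the precision-weighted average of the source means.
    """
    dim = len(sources[0][0])
    fused_mean: List[float] = []
    fused_var: List[float] = []
    for k in range(dim):
        precision = sum(1.0 / var[k] for _, var in sources)
        weighted = sum(mean[k] / var[k] for mean, var in sources)
        fused_var.append(1.0 / precision)
        fused_mean.append(weighted / precision)
    return fused_mean, fused_var


# Hypothetical contour point: the measurement is uncertain in x but confident in y,
# while the shape-model prediction has the same uncertainty in both directions.
measurement: Estimate = ([10.0, 4.0], [4.0, 0.25])
model_prediction: Estimate = ([12.0, 6.0], [1.0, 1.0])

mean, var = fuse_diagonal([measurement, model_prediction])
```

In this toy case the fused x coordinate moves toward the model prediction and the fused y coordinate stays close to the measurement, which is the behaviour a per-coordinate uncertainty treatment is meant to give.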


IEEE Transactions on Medical Imaging | 2004

Robust real-time myocardial border tracking for echocardiography: an information fusion approach

Dorin Comaniciu; Xiang Sean Zhou; Sriram Krishnan

Ultrasound is a main noninvasive modality for the assessment of the heart function. Wall tracking from ultrasound data is, however, inherently difficult due to weak echoes, clutter, poor signal-to-noise ratio, and signal dropouts. To cope with these artifacts, pretrained shape models can be applied to constrain the tracking. However, existing methods for incorporating subspace shape constraints in myocardial border tracking use only partial information from the model distribution, and do not exploit spatially varying uncertainties from feature tracking. In this paper, we propose a complete fusion formulation in the information space for robust shape tracking, optimally resolving uncertainties from the system dynamics, heteroscedastic measurement noise, and subspace shape model. We also exploit information from the ground truth initialization where this is available. The new framework is applied for tracking of myocardial borders in very noisy echocardiography sequences. Numerous myocardium tracking experiments validate the theory and show the potential of very accurate wall motion measurements. The proposed framework outperforms the traditional shape-space-constrained tracking algorithm by a significant margin. Due to the optimal fusion of different sources of uncertainties, robust performance is observed even for the most challenging cases.


International Conference on Computer Vision | 2005

Image based regression using boosting method

Shaohua Kevin Zhou; Bogdan Georgescu; Xiang Sean Zhou; Dorin Comaniciu

We present a general algorithm for image based regression that is applicable to many vision problems. The proposed regressor, which targets a multiple-output setting, is learned using a boosting method. We formulate a multiple-output regression problem in such a way that overfitting is decreased and an analytic solution is admitted. Because we represent the image via a set of highly redundant Haar-like features that can be evaluated very quickly and select relevant features through boosting to absorb the knowledge of the training data, during testing we require no storage of the training data and evaluate the regression function in almost no time. We also propose an efficient training algorithm that breaks the computational bottleneck in the greedy feature selection process. We validate the efficiency of the proposed regressor using three challenging tasks of age estimation, tumor detection, and endocardial wall localization and achieve the best performance at a dramatic speed, e.g., more than 1000 times faster than conventional data-driven techniques such as support vector regression in the experiment of endocardial wall localization.
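
The idea of selecting features through boosting so that no training data is needed at test time can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: it uses a single scalar output, generic numeric features in place of Haar-like features, regression stumps as weak learners, and plain least-squares boosting with a shrinkage factor. All names (`fit_stump`, `boost`, `predict`, `shrink`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, float, float]   # feature index, threshold, left value, right value
Model = Tuple[float, float, List[Stump]]  # base prediction, shrinkage, selected stumps


def fit_stump(X: Sequence[Sequence[float]], r: Sequence[float]) -> Stump:
    """Greedy search for the single-feature split that best fits residuals r."""
    mean_r = sum(r) / len(r)
    best_err = float("inf")
    best: Stump = (0, float("inf"), mean_r, mean_r)  # fallback: constant prediction
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = (sum((ri - lv) ** 2 for ri in left)
                   + sum((ri - rv) ** 2 for ri in right))
            if err < best_err:
                best_err, best = err, (j, t, lv, rv)
    return best


def boost(X: Sequence[Sequence[float]], y: Sequence[float],
          rounds: int = 50, shrink: float = 0.5) -> Model:
    """Each round fits one stump to the current residuals and adds it to the model."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        pred = [p + shrink * (lv if row[j] <= t else rv) for p, row in zip(pred, X)]
    return base, shrink, stumps


def predict(model: Model, x: Sequence[float]) -> float:
    """Evaluation needs only the selected stumps, not the training data."""
    base, shrink, stumps = model
    return base + shrink * sum(lv if x[j] <= t else rv for j, t, lv, rv in stumps)


# Toy data: the target depends on feature 0 only; feature 1 is a distractor.
X = [[0.0, 5.0], [1.0, 2.0], [2.0, 7.0], [3.0, 1.0]]
y = [1.0, 1.0, 3.0, 3.0]
model = boost(X, y)
```

The exhaustive threshold search in `fit_stump` is the kind of greedy selection step that becomes a bottleneck with a large feature pool; the sketch makes no attempt to speed it up.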


Conference on Image and Video Retrieval | 2003

The state of the art in image and video retrieval

Nicu Sebe; Michael S. Lew; Xiang Sean Zhou; Thomas S. Huang; E. Bakker

Image and video retrieval continues to be one of the most exciting and fastest-growing research areas in the field of multimedia technology. What are the main challenges in image and video retrieval? Despite the sustained efforts in recent years, we think that the paramount challenge remains bridging the semantic gap. By this we mean that low level features are easily measured and computed, but the starting point of the retrieval process is typically the high level query from a human. Translating or converting the question posed by a human to the low level features seen by the computer illustrates the problem in bridging the semantic gap. However, the semantic gap is not merely translating high level features to low level features. The essence of a semantic query is understanding the meaning behind the query. This can involve understanding both the intellectual and emotional sides of the human, not merely the distilled logical portion of the query but also the personal preferences and emotional subtones of the query and the preferential form of the results.


Computer Vision and Pattern Recognition | 2005

Illumination normalization for face recognition and uneven background correction using total variation based image models

Terrence Chen; Wotao Yin; Xiang Sean Zhou; Dorin Comaniciu; Thomas S. Huang

We present a new algorithm for illumination normalization and uneven background correction in images, utilizing the recently proposed TV+L1 model: minimizing the total variation of the output cartoon subject to an L1-norm fidelity term. We give intuitive proofs of its main advantages, including the well-known edge preserving capability, minimal signal distortion, and scale-dependent but intensity-independent foreground extraction. We then propose a novel TV-based quotient image model (TVQI) for illumination normalization, an important preprocessing step for face recognition under different lighting conditions. Using this model, we achieve a 100% face recognition rate on Yale face database B if the reference images are under good lighting conditions and 99.45% if not. These results, compared to the average 65% recognition rate of the quotient image model and the average 95% recognition rate of the more recent self quotient image model, show a clear improvement. In addition, this model requires no training data, no assumption on the light source, and no alignment between different images for illumination normalization. We also present the results of the related applications: uneven background correction for cDNA microarray films and digital microscope images. We believe the proposed work can play an important role in the related fields.
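
The minimization described in the abstract above (total variation of the cartoon output plus an L1-norm fidelity term) can be written out as a hedged sketch. The symbols are assumptions, not taken from the source: f is the input image on domain Ω, u is the cartoon output, and λ > 0 weights the fidelity term. The second line is an assumed quotient-image form, by analogy with the quotient image models the abstract compares against.

```latex
% TV + L1 model: total variation of u plus an L1 fidelity term to the input f
u^{*} = \arg\min_{u} \int_{\Omega} \lvert \nabla u \rvert \, dx
        + \lambda \int_{\Omega} \lvert f - u \rvert \, dx

% Assumed quotient form: divide the input by its large-scale cartoon
\mathrm{TVQI}(x) = \frac{f(x)}{u^{*}(x)}
```

Under this reading, u* would capture the large-scale lighting component and the quotient would keep the small-scale facial structure.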


IEEE Signal Processing Magazine | 2006

Semantic retrieval of video - review of research on video retrieval in meetings, movies and broadcast news, and sports

Ziyou Xiong; Xiang Sean Zhou; Qi Tian; Yong Rui; Thomas S. Huang

This paper reviews research on three types of video: video of meetings, movies and broadcast news, and sports video. The paper puts them into a general framework of video summarization, browsing, and retrieval. It also reviews different video representation techniques for these three types of video content within this general framework. Finally, the challenges facing the video retrieval research community are presented.
