Ziyou Xiong

University of Illinois at Urbana–Champaign

Publication

Featured research published by Ziyou Xiong.

International Conference on Acoustics, Speech, and Signal Processing | 2003

Audio events detection based highlights extraction from baseball, golf and soccer games in a unified framework

Ziyou Xiong; Regunathan Radhakrishnan; Ajay Divakaran; Thomas S. Huang

We developed a unified framework to extract highlights from three sports: baseball, golf and soccer by detecting some of the common audio events that are directly indicative of highlights. We used MPEG-7 audio features and entropic prior hidden Markov models (HMM) as the audio features and classifier respectively to recognize these common audio events. Together with pre- and post-processing techniques using general sports knowledge, we have been able to generate promising results dealing with the audio track that is dominated by audio mixtures and noisy background.

International Conference on Multimedia and Expo | 2005

Highlights extraction from sports video based on an audio-visual marker detection framework

Ziyou Xiong; Regunathan Radhakrishnan; Ajay Divakaran; Thomas S. Huang

We propose to use a visual object (e.g., the baseball catcher) detection algorithm to find local, semantic objects in video frames in addition to an audio classification algorithm to find semantic audio objects in the audio track for sports highlights extraction. The highlight candidates are then further grouped into finer-resolution highlight segments, using color or motion information. During the grouping phase, many of the false alarms can be correctly identified and eliminated. Our experimental results with baseball, soccer and golf video are promising.

International Conference on Acoustics, Speech, and Signal Processing | 2003

Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification

Ziyou Xiong; Regunathan Radhakrishnan; Ajay Divakaran; Thomas S. Huang

We present a comparison of 6 methods for classification of sports audio. For feature extraction, we have two choices: MPEG-7 audio features and Mel-scale frequency cepstrum coefficients (MFCC). For classification, we also have two choices: maximum likelihood hidden Markov models (ML-HMM) and entropic prior HMMs (EP-HMM). EP-HMMs, in turn, have two variations: with and without trimming of the model parameters. We thus have 6 possible methods, each of which corresponds to a combination. Our results show that all the combinations achieve classification accuracy of around 90% with the best and the second best being, respectively, MPEG-7 features with EP-HMM and MFCC with ML-HMM.
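The count of six follows from pairing two feature sets with three classifier variants. A minimal sketch of that enumeration is shown here; the label strings are illustrative names chosen for this example, not identifiers from the paper.

```python
from itertools import product

# Illustrative labels (assumed for this sketch, not taken from the paper's code).
FEATURES = ["MPEG-7", "MFCC"]
CLASSIFIERS = [
    "ML-HMM",
    "EP-HMM (with trimming)",
    "EP-HMM (without trimming)",
]


def enumerate_methods():
    """Return every (feature, classifier) pairing as a list of tuples."""
    return list(product(FEATURES, CLASSIFIERS))


methods = enumerate_methods()
for feature, classifier in methods:
    print(f"{feature} + {classifier}")
```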

A Unified Framework for Video Summarization, Browsing and Retrieval: With Applications to Consumer and Surveillance Video | 2006

A Unified Framework for Video Summarization, Browsing, and Retrieval

Ziyou Xiong; Regunathan Radhakrishnan; Ajay Divakaran; Yong Rui; Thomas S. Huang

This chapter reviews and discusses recent research progress in multimodal analysis, representation, summarization, browsing, and retrieval. It introduces the video table of contents (ToC), the highlights, and the index, and presents techniques for constructing them. It further proposes a unified framework for video summarization, browsing, and retrieval that lets a user go back and forth between browsing and retrieval. An essential part of the unified framework is the set of weighted links. The links can be established between index entities and scenes, groups, shots, and key frames in the ToC structure for scripted content, and between index entities and finer-resolution highlights, highlight candidates, audio-visual markers, and plays/breaks for unscripted content. For scripted content, the focus is on the links between index entities and shots, since shots are the building blocks of the ToC. For unscripted content, an example of going from the visual index to the highlights is shown. The chapter also recapitulates the key components of video highlights extraction and video retrieval. Video retrieval is concerned with how to return similar video clips to a user given a video query.

Journal of Electronic Imaging | 2005

On the security of the visual hash function

Regunathan Radhakrishnan; Ziyou Xiong; Nasir D. Memon

Robust hash functions are central to the security of multimedia content authentication systems. Such functions are sensitive to a key but are robust to many allowed signal processing operations on the underlying content. The robustness of the hash function to changes in the original content implies the existence of a cluster in the feature space around the original content's feature vector, any point within which is hashed to the same output. The shape and size of the cluster determine the trade-off between the robustness offered and the security of the authentication system based on the robust hash function. The clustering itself is based on a secret key and is hence unknown to the attacker. However, we show that the specific clustering arrived at by the robust visual hash function (VHF) may be possible to learn. Given just an input and its hash bits, we show how to construct a statistical model of the hash function, without any knowledge of the secret key used to compute the hash. We also show how to use this model to engineer arbitrary and malicious collisions. Finally, we propose one possible modification to VHF so that constructing a model that mimics its behavior becomes difficult.

International Conference on Image Processing | 2002

Wavelet-based texture features can be extracted efficiently from compressed-domain for JPEG2000 coded images

Ziyou Xiong; Thomas S. Huang

The contribution of this paper is the development of a fast, subband-based JPEG2000 image indexing system in the compressed domain which achieves high memory efficiency. This work extends a previous block-based indexing system. The feature extracted is the variance of each wavelet subband in the compressed domain, with the emphasis that subbands are not buffered, in order to maintain memory efficiency. Retrieval performance on the VisTex image database has shown the effectiveness and execution speedup of the proposed features.
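To make the "variance of each wavelet subband" feature concrete, here is a minimal sketch under simplifying assumptions. It applies a one-level Haar transform to raw pixels and holds the subbands in memory, whereas the paper works in the JPEG2000 compressed domain without buffering subbands. The function names and the LL/LH/HL/HH labels (whose naming convention varies between texts) are choices made for this example.

```python
from statistics import pvariance


def haar_2d(image):
    """One-level 2D Haar transform of a grayscale image.

    `image` is a list of equal-length rows with even height and width.
    Returns a dict of four subbands, each a list of rows.
    """
    bands = {"LL": [], "LH": [], "HL": [], "HH": []}
    for r in range(0, len(image), 2):
        rows = {name: [] for name in bands}
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows["LL"].append((a + b + d + e) / 4)  # local average
            rows["LH"].append((a - b + d - e) / 4)  # difference across columns
            rows["HL"].append((a + b - d - e) / 4)  # difference across rows
            rows["HH"].append((a - b - d + e) / 4)  # diagonal difference
        for name in bands:
            bands[name].append(rows[name])
    return bands


def subband_variances(image):
    """Population variance of the coefficients in each subband."""
    return {
        name: pvariance([v for row in band for v in row])
        for name, band in haar_2d(image).items()
    }


# Toy image: column-wise texture on the left half, flat on the right half.
toy = [[0, 1, 5, 5] for _ in range(4)]
features = subband_variances(toy)
print(features)
```

In this toy image only the subbands sensitive to the left/right difference (LL and LH in this labeling) have nonzero variance, which is the kind of directional texture cue such a feature vector captures.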

Handbook of Face Recognition | 2011

Face Recognition Applications

Thomas S. Huang; Ziyou Xiong; ZhenQiu Zhang

As one of the most nonintrusive biometrics, face recognition technology is becoming ever closer to people’s daily lives. Evidence of this is that in 2000 the International Civil Aviation Organization endorsed facial recognition as the most suitable biometric for air travel. To our knowledge, no review papers are available on the newly enlarged application scenarios since then. We hope this chapter will be an extension of the previous studies. We review many applications that already use face recognition technologies. This set of applications is a much larger superset of those previously reviewed. We also review some new scenarios that will potentially utilize face recognition technologies in the near future.

International Geoscience and Remote Sensing Symposium | 2001

Fast retrieval of multi- and hyperspectral images using relevance feedback

Irwin E. Alber; Ziyou Xiong; Nancy Yeager; Morton S. Farber; William M. Pottenger

A high speed of retrieval is very important to developing an effective image cube search algorithm for the remote sensing community. Following the work of Berman and Shapiro (1999), it is shown that a triangle inequality search technique applied to a relevance feedback retrieval algorithm can significantly speed up the search for and retrieval of physical events of interest in large remote-sensing databases. An improvement in retrieval speed is illustrated using hurricane queries applied to the multispectral GOES database.
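The pruning idea behind a triangle inequality search can be sketched as follows. This is a generic illustration, not the algorithm of Berman and Shapiro or of this paper: it assumes Euclidean distance on small feature vectors, and the names `build_index`, `nearest`, and `keys` (a set of reference points) are hypothetical. For any reference point k, |d(q, k) - d(x, k)| is a lower bound on d(q, x), so an item whose bound already exceeds the best distance found so far can be skipped without computing its true distance.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(database, keys):
    """Precompute the distance from every database item to every key."""
    return [[euclidean(item, k) for k in keys] for item in database]


def nearest(query, database, keys, index):
    """Return (best index, best distance, number of full distances computed)."""
    q_to_keys = [euclidean(query, k) for k in keys]
    best_i, best_d, computed = -1, math.inf, 0
    for i, item in enumerate(database):
        # Triangle inequality: d(query, item) >= |d(query, k) - d(item, k)|.
        lower = max(abs(qk - ik) for qk, ik in zip(q_to_keys, index[i]))
        if lower >= best_d:
            continue  # cannot beat the current best, skip the full distance
        d = euclidean(query, item)
        computed += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, computed


# Toy data (assumed for illustration only).
database = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (9, 1)]
keys = [(0, 0), (10, 10)]
index = build_index(database, keys)
result = nearest((9.5, 0), database, keys, index)
print(result)
```

The key distances are computed once when the index is built, so each query pays only for its distances to the keys plus the full distances that survive pruning.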

Electronic Imaging | 2003

Security of visual hash function

Regunathan Radhakrishnan; Ziyou Xiong; Nasir D. Memon

Robust hash functions are central to the security of multimedia content authentication systems. Such functions are sensitive to a key but robust to many allowed signal processing operations on the underlying content. Robustness of the hash function to changes in the original content implies the existence of a cluster in the feature space around the original content's feature vector, any point within which is hashed to the same output. The shape and size of the cluster determine the trade-off between the robustness offered and the security of the authentication system based on the robust hash function. The clustering itself is based on a secret key and is hence unknown to the attacker. However, we show in this paper that the specific clustering arrived at by a robust hash function may be possible to learn. Specifically, we look at a well-known robust hash function for image data called the Visual Hash Function (VHF). Given just an input and its hash value, we show how to construct a statistical model of the hash function, without any knowledge of the secret key used to compute the hash. We also show how to use this model to engineer arbitrary and malicious collisions. Finally, we propose one possible modification to VHF so that constructing a model that mimics its behavior becomes difficult.

International Conference on Multimodal Interfaces | 2002

Improved information maximization based face and facial feature detection from real-time video and application in a multi-modal person identification system

Ziyou Xiong; Yunqiang Chen; Roy Wang; Thomas S. Huang

This paper presents an improved face detection method, based on our previous information-based maximum discrimination approach, that maximizes the discrimination between face and non-face examples in a training set without using color or motion information. A short review of our previous method is given, together with a description of a recent improvement in its detection speed. A person identification system has been developed that performs multi-modal person identification in real-time video, combining this improved face detection method with speaker identification.
