Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Alexander C. Loui is active.

Publication


Featured research published by Alexander C. Loui.


Multimedia Information Retrieval | 2007

Large-scale multimodal semantic concept detection for consumer video

Shih-Fu Chang; Daniel P. W. Ellis; Wei Jiang; Keansub Lee; Akira Yanagawa; Alexander C. Loui; Jiebo Luo

In this paper, we present a systematic study of the automatic classification of consumer videos into a large set of diverse semantic concept classes, which have been carefully selected based on user studies and extensively annotated over 1300+ videos from real users. Our goals are to assess the state of the art of multimedia analytics (including both audio and visual analysis) in consumer video classification and to discover new research opportunities. We investigated several statistical approaches built upon global/local visual features, audio features, and audio-visual combinations. Three multimodal fusion frameworks (ensemble, context fusion, and joint boosting) are also evaluated. Experimental results show that visual and audio models perform best for different sets of concepts. Both provide significant contributions to multimodal fusion, via expansion of the classifier pool for context fusion and of the feature bases for feature sharing. The fused multimodal models are shown to significantly reduce detection errors (compared to single-modality models), resulting in a promising accuracy of 83% over diverse concepts. To the best of our knowledge, this is the first systematic investigation of multimodal classification using a large-scale ontology and a realistic video corpus.


Multimedia Information Retrieval | 2007

Kodak's consumer video benchmark data set: concept definition and annotation

Alexander C. Loui; Jiebo Luo; Shih-Fu Chang; Daniel P. W. Ellis; Wei Jiang; Lyndon Kennedy; Keansub Lee; Akira Yanagawa

Semantic indexing of images and videos in the consumer domain has become a very important issue for both research and actual applications. In this work we developed Kodak's consumer video benchmark data set, which includes (1) a significant number of videos from actual users, (2) a rich lexicon that accommodates consumers' needs, and (3) the annotation of a subset of concepts over the entire video data set. To the best of our knowledge, this is the first systematic work in the consumer domain aimed at the definition of a large lexicon, the construction of a large benchmark data set, and the annotation of videos in a rigorous fashion. This effort will have significant impact by providing a sound foundation for developing and evaluating large-scale learning-based semantic indexing/annotation techniques in the consumer domain.


International Conference on Image Processing | 2008

Cross-domain learning methods for high-level visual concept classification

Wei Jiang; Eric Zavesky; Shih-Fu Chang; Alexander C. Loui

Exploding amounts of multimedia data increasingly require automatic indexing and classification, e.g., training classifiers to produce high-level features, or semantic concepts, chosen to represent image content, like car, person, etc. When the application domain changes (e.g., from news video to consumer home videos), classifiers trained in one domain often perform poorly in the other due to changes in feature distributions. Additionally, classifiers trained on the new domain alone may suffer from too few positive training samples. Appropriately adapting data/models from an old domain to help classify data in a new domain is therefore an important issue. In this work, we develop a new cross-domain SVM (CDSVM) algorithm for adapting previously learned support vectors from one domain to help classification in another domain. Better precision is obtained with almost no additional computational cost. We also give a comprehensive summary and comparative study of the state-of-the-art SVM-based cross-domain learning methods. Evaluation over the latest large-scale TRECVID benchmark data set shows that our CDSVM method can improve mean average precision over 36 concepts by 7.5%. For further performance gain, we also propose an intuitive selection criterion to determine which cross-domain learning method to use for each concept.
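The abstract gives no implementation details. As a rough, hypothetical illustration of the CDSVM idea (filter old-domain support vectors by their distance to the new-domain data, then retrain on the union), a NumPy sketch under simplified assumptions could look like the following. The Pegasos-style linear solver without a bias term, the distance threshold `sigma`, and the treatment of all old-domain points as candidate support vectors are all assumptions, not the paper's actual method:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Minimal linear SVM via sub-gradient descent on the hinge loss
    (Pegasos-style stand-in for a full SVM solver; labels are +/-1,
    no bias term, so the data is assumed roughly centered)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            if y[i] * (X[i] @ w) < 1:      # margin violation
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:
                w = (1 - eta * lam) * w
    return w

def select_old_support_vectors(sv_old, X_new, sigma=1.0):
    """CDSVM-style filter: keep old-domain support vectors whose
    nearest new-domain sample lies within distance sigma."""
    d2 = ((sv_old[:, None, :] - X_new[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1) <= sigma ** 2

# Synthetic domain shift: old and new domains share a decision
# boundary but have slightly shifted class means.
rng = np.random.default_rng(42)
X_old = np.vstack([rng.normal([2, 2], 0.5, size=(40, 2)),
                   rng.normal([-2, -2], 0.5, size=(40, 2))])
y_old = np.array([1] * 40 + [-1] * 40)
X_new = np.vstack([rng.normal([2.5, 1.5], 0.5, size=(10, 2)),
                   rng.normal([-1.5, -2.5], 0.5, size=(10, 2))])
y_new = np.array([1] * 10 + [-1] * 10)

# Adapt: keep nearby old-domain points, retrain on the combined set.
keep = select_old_support_vectors(X_old, X_new, sigma=3.0)
X_comb = np.vstack([X_old[keep], X_new])
y_comb = np.concatenate([y_old[keep], y_new])
w = train_linear_svm(X_comb, y_comb)
```

The retrained classifier benefits from the old-domain points without being dominated by those far from the new distribution; in the real algorithm the filtered points are the trained SVM's support vectors, not raw samples.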


International Conference on Acoustics, Speech, and Signal Processing | 2007

Context-Based Concept Fusion with Boosted Conditional Random Fields

Wei Jiang; Shih-Fu Chang; Alexander C. Loui

The contextual relationships among different semantic concepts provide important information for automatic concept detection in images/videos. We propose a new context-based concept fusion (CBCF) method for semantic concept detection. Our work is twofold. (1) We model the inter-conceptual relationships with a conditional random field (CRF) that improves detection results from independent detectors by taking into account the inter-correlation among concepts. The CRF directly models the posterior probability of concept labels and is more accurate for discriminative concept detection than previous statistical inference techniques. A boosted CRF framework is incorporated to further enhance performance by combining the power of boosting with the CRF. (2) We develop an effective criterion to predict which concepts may benefit from CBCF. As reported in previous works, CBCF shows inconsistent performance gains on different concepts. With accurate prediction, computational and data resources can be allocated to the concepts that are likely to gain performance. Evaluation on the TRECVID 2005 development set demonstrates the effectiveness of our algorithm.


ACM Multimedia | 2009

Short-term audio-visual atoms for generic video concept classification

Wei Jiang; Courtenay Valentine Cotton; Shih-Fu Chang; Daniel P. W. Ellis; Alexander C. Loui

We investigate the challenging issue of joint audio-visual analysis of generic videos, targeting semantic concept detection. We propose to extract a novel representation, the Short-term Audio-Visual Atom (S-AVA), for improved concept detection. An S-AVA is defined as a short-term region track associated with regional visual features and background audio features. An effective algorithm, named Short-Term Region tracking with joint Point Tracking and Region Segmentation (STR-PTRS), is developed to extract S-AVAs from generic videos under challenging conditions such as uneven lighting, clutter, occlusions, and complicated motions of both objects and camera. Discriminative audio-visual codebooks are constructed on top of S-AVAs using Multiple Instance Learning. Codebook-based features are generated for semantic concept detection. We extensively evaluate our algorithm over Kodak's consumer benchmark video set from real users. Experimental results confirm significant performance improvements: over 120% MAP gain compared to alternative approaches using static region segmentation without temporal tracking. The joint audio-visual features also outperform visual features alone by an average of 8.5% (in terms of AP) over 21 concepts, with many concepts achieving more than 20%.


ACM Multimedia | 2010

Towards aesthetics: a photo quality assessment and photo selection system

Congcong Li; Alexander C. Loui; Tsuhan Chen

Automatic photo quality assessment and selection systems are helpful for managing the large number of consumer photos. In this paper, we present such a system based on evaluating the aesthetic quality of consumer photos. The proposed system focuses on photos with faces, which constitute an important part of consumer photo albums. The system makes three contributions: (1) we propose an aesthetics-based photo assessment algorithm that considers different aesthetics-related factors, including the technical characteristics of the photo and specific features related to faces; (2) based on the aesthetic measurement, we propose a cropping-based photo editing algorithm, which differs from prior works by eliminating unimportant faces before optimizing photo composition; and (3) we combine the aesthetic evaluation with other metrics to select quintessential photos from a large collection. The entire system is delivered through a web interface, which allows users to submit images or albums, and returns promising results for photo evaluation, editing recommendation, and photo selection.


IEEE MultiMedia | 2003

Using genetic algorithms for album page layouts

Joe Geigel; Alexander C. Loui

We describe a system that uses a genetic algorithm to interactively generate personalized album pages for visual content collections on the Internet. The system has three modules: preprocessing, page creation, and page layout. We focus on the details of the genetic algorithm used in the page-layout task.
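The article describes the genetic algorithm only at a high level. A toy sketch of a page-layout GA in that spirit might look like the following; the chromosome encoding (one position per fixed-size photo), the fitness function (total overlapping area), and the truncation selection, one-point crossover, and Gaussian mutation operators are all illustrative assumptions, not the authors' actual design:

```python
import random

PAGE_W, PAGE_H = 800, 600     # hypothetical page size in pixels
PHOTO_W, PHOTO_H = 200, 150   # all photos assumed the same size
N_PHOTOS = 4

def random_layout(rng):
    # Chromosome: one (x, y) top-left corner per photo, inside the page.
    return [(rng.uniform(0, PAGE_W - PHOTO_W),
             rng.uniform(0, PAGE_H - PHOTO_H)) for _ in range(N_PHOTOS)]

def overlap(a, b):
    # Overlapping area of two axis-aligned photo rectangles.
    ox = max(0.0, min(a[0], b[0]) + PHOTO_W - max(a[0], b[0]))
    oy = max(0.0, min(a[1], b[1]) + PHOTO_H - max(a[1], b[1]))
    return ox * oy

def fitness(layout):
    # Lower is better: total overlapping area over all photo pairs.
    return sum(overlap(layout[i], layout[j])
               for i in range(N_PHOTOS) for j in range(i + 1, N_PHOTOS))

def evolve(generations=60, pop_size=40, seed=1):
    rng = random.Random(seed)
    pop = [random_layout(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, N_PHOTOS)     # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.3:               # mutation: nudge one photo
                i = rng.randrange(N_PHOTOS)
                x, y = child[i]
                child[i] = (min(max(x + rng.gauss(0, 40), 0), PAGE_W - PHOTO_W),
                            min(max(y + rng.gauss(0, 40), 0), PAGE_H - PHOTO_H))
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)
```

A real layout fitness would also reward alignment, spacing, and user preferences, which is what makes the interactive GA in the paper useful beyond simple overlap avoidance.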


International Conference on Image Processing | 2010

Aesthetic quality assessment of consumer photos with faces

Congcong Li; Andrew C. Gallagher; Alexander C. Loui; Tsuhan Chen

Automatically assessing the subjective quality of a photo is a challenging area in visual computing. Previous works study aesthetic quality assessment on general sets of photos regardless of the photos' content, and mainly use features extracted from the entire image. In this work, we focus on a specific genre of photos: consumer photos with faces. This group of photos constitutes an important part of consumer photo collections. We first conduct an online study on Mechanical Turk to collect ground truth and subjective opinions for a database of consumer photos with faces. We then extract technical features, perceptual features, and social relationship features to represent the aesthetic quality of a photo, focusing on face-related regions. Experiments show that our features perform well for categorizing or predicting aesthetic quality.


International Conference on Multimedia and Expo | 2010

Automatic aesthetic value assessment in photographic images

Wei Jiang; Alexander C. Loui; Cathleen D. Cerosaletti

The automatic assessment of aesthetic value in consumer photographic images is an important issue for content management, organizing and retrieving images, and building digital image albums. This paper explores automatic aesthetic estimation in two different tasks: (1) to estimate fine-granularity aesthetic scores ranging from 0 to 100, a novel regression method, namely Diff-RankBoost, is proposed based on RankBoost and support vector techniques; and (2) to predict coarse-granularity aesthetic categories (e.g., visually "very pleasing" or "not pleasing"), multi-category classifiers are developed. A set of visual features describing various characteristics related to image quality and aesthetic value is used to generate multidimensional feature spaces for aesthetic estimation. Experiments over a consumer photographic image collection with user ground truth indicate that the proposed algorithms provide promising results for automatic image aesthetic assessment.


IEEE Transactions on Circuits and Systems for Video Technology | 2003

Finding structure in home videos by probabilistic hierarchical clustering

Daniel Gatica-Perez; Alexander C. Loui; Ming-Ting Sun

Accessing, organizing, and manipulating home videos present technical challenges due to their unrestricted content and lack of storyline. We present a methodology to discover cluster structure in home videos, which uses video shots as the unit of organization, and is based on two concepts: (1) the development of statistical models of visual similarity, duration, and temporal adjacency of consumer video segments and (2) the reformulation of hierarchical clustering as a sequential binary Bayesian classification process. A Bayesian formulation allows for the incorporation of prior knowledge of the structure of home video and offers the advantages of a principled methodology. Gaussian mixture models are used to represent the class-conditional distributions of intra- and inter-segment visual and temporal features. The models are then used in the probabilistic clustering algorithm, where the merging order is a variation of highest confidence first, and the merging criterion is maximum a posteriori. The algorithm does not need any ad-hoc parameter determination. We present extensive results on a 10-h home-video database with ground truth which thoroughly validate the performance of our methodology with respect to cluster detection, individual shot-cluster labeling, and the effect of prior selection.
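No code accompanies the paper. A much-simplified sketch of the merging idea (sequential binary Bayesian classification, a highest-confidence-first merge order, and a maximum-a-posteriori stopping rule) is shown below. The 1-D "shot features" and hand-set Gaussian class-conditional densities are illustrative stand-ins for the Gaussian mixture models the paper learns over visual and temporal features:

```python
import math

def gauss(x, mu, var):
    # Univariate Gaussian density.
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior_same(dist, prior_same=0.5):
    """P(same cluster | inter-segment distance). The class-conditional
    densities here are hand-set for illustration; the paper learns
    GMMs for the intra- and inter-segment feature distributions."""
    p_same = gauss(dist, mu=0.0, var=1.0)   # same cluster: small distance
    p_diff = gauss(dist, mu=5.0, var=4.0)   # different clusters: large distance
    num = prior_same * p_same
    return num / (num + (1 - prior_same) * p_diff)

def bayes_cluster(shots):
    """Agglomerative clustering of 1-D shot features: repeatedly merge
    the pair with the highest posterior of 'same cluster' (highest
    confidence first); stop when the MAP decision flips to 'different'."""
    clusters = [[s] for s in shots]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                mi = sum(clusters[i]) / len(clusters[i])  # cluster mean feature
                mj = sum(clusters[j]) / len(clusters[j])
                p = posterior_same(abs(mi - mj))
                if best is None or p > best[0]:
                    best = (p, i, j)
        p, i, j = best
        if p < 0.5:              # MAP criterion says 'different': stop merging
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Because the stopping rule is probabilistic rather than a tuned distance threshold, the number of clusters falls out of the model, which mirrors the paper's claim that no ad-hoc parameter determination is needed.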

Collaboration


Dive into Alexander C. Loui's collaborations.

Top Co-Authors

Wei Jiang
Eastman Kodak Company