Publication


Featured research published by Lexing Xie.


Government Information Quarterly | 2012

Social media use by government: From the routine to the critical

Andrea L. Kavanaugh; Edward A. Fox; Steven D. Sheetz; Seungwon Yang; Lin Tzy Li; Donald J. Shoemaker; Apostol Natsev; Lexing Xie

Social media and online services with user-generated content (e.g., Twitter, Facebook, Flickr, YouTube) have made a staggering amount of information (and misinformation) available. Government officials seek to leverage these resources to improve services and communication with citizens. Significant potential exists to identify issues in real time, so emergency managers can monitor and respond to issues concerning public safety. Yet, the sheer volume of social data streams generates substantial noise that must be filtered in order to detect meaningful patterns and trends. Important events can then be identified as spikes in activity, while event meaning and consequences can be deciphered by tracking changes in content and public sentiment. This paper presents findings from an exploratory study we conducted between June and December 2010 with government officials in Arlington, VA (and the greater National Capital Region around Washington, D.C.), with the broad goal of understanding social media use by government officials as well as community organizations, businesses, and the public at large. A key objective was also to understand social media use specifically for managing crisis situations, from the routine (e.g., traffic, weather crises) to the critical (e.g., earthquakes, floods).
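
The abstract's idea of identifying important events as spikes in activity can be illustrated with a short sketch. The rolling-window baseline and z-score threshold below are illustrative assumptions, not parameters from the study.

import numpy as np

def detect_spikes(counts, window=24, z_thresh=3.0):
    """Flag hours whose post count exceeds the trailing mean by z_thresh std devs."""
    counts = np.asarray(counts, dtype=float)
    spikes = []
    for t in range(window, len(counts)):
        baseline = counts[t - window:t]
        mu, sigma = baseline.mean(), baseline.std() + 1e-9
        if (counts[t] - mu) / sigma > z_thresh:
            spikes.append(t)
    return spikes

# Toy data (assumed): two quiet days of hourly counts with one burst at hour 30.
hourly = np.random.poisson(5, 48)
hourly[30] = 60
print(detect_spikes(hourly))  # expected to report the burst hour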


international conference on multimedia and expo | 2001

Algorithms and system for segmentation and structure analysis in soccer video

Peng Xu; Lexing Xie; Shih-Fu Chang; Ajay Divakaran; Anthony Vetro; Huifang Sun

In this paper, we present a novel system and effective algorithms for soccer video segmentation. The output, indicating whether the ball is in play, reveals the high-level structure of the content. The first step is to classify each sample frame into three kinds of view using a unique domain-specific feature, the grass-area-ratio. The grass value and classification rules are learned and automatically adjusted to each new clip. Heuristic rules are then used to process the view-label sequence and obtain the play/break status of the game. The results provide a good basis for the detailed content analysis in the next step. We also show that low-level features and mid-level view classes can be combined to extract more information about the game, via the example of detecting grass orientation in the field. The results are evaluated under different metrics intended for different applications; the best segmentation result is 86.5%.
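
A minimal sketch of the grass-area-ratio view classification described above. The fixed HSV grass bounds and thresholds are illustrative assumptions; in the paper the dominant grass color and classification rules are learned and adapted per clip.

import numpy as np

def grass_area_ratio(frame_hsv, hue_range=(35, 85), min_sat=0.3):
    """Fraction of pixels in an assumed 'grass' hue band; H in degrees, S in [0, 1]."""
    h, s = frame_hsv[..., 0], frame_hsv[..., 1]
    grass = (h >= hue_range[0]) & (h <= hue_range[1]) & (s >= min_sat)
    return grass.mean()

def classify_view(ratio, global_thresh=0.5, zoom_thresh=0.1):
    # Thresholds are illustrative; the paper adapts the grass model per clip.
    if ratio >= global_thresh:
        return "global"    # wide shot of the field
    if ratio >= zoom_thresh:
        return "zoom-in"   # medium shot with some grass visible
    return "close-up"      # players, crowd, or graphics

frame = np.random.rand(240, 320, 3) * np.array([360.0, 1.0, 1.0])  # fake HSV frame
print(classify_view(grass_area_ratio(frame)))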


Pattern Recognition Letters | 2004

Structure analysis of soccer video with domain knowledge and hidden Markov models

Lexing Xie; Peng Xu; Shih-Fu Chang; Ajay Divakaran; Huifang Sun

In this paper, we present statistical techniques for parsing the structure of produced soccer programs. The problem is important for applications such as personalized video streaming and browsing systems, in which videos are segmented into different states and important states are selected based on user preferences. While prior work focuses on the detection of special events such as goals or corner kicks, this paper is concerned with generic structural elements of the game. We define two mutually exclusive states of the game, play and break, based on the rules of soccer. Automatic detection of such generic states is a challenging open issue due to the high appearance diversity and temporal dynamics of these states across different videos. We select a salient feature set from the compressed domain, dominant color ratio and motion intensity, based on the special syntax and content characteristics of soccer videos. We then model the stochastic structure of each state of the game with a set of hidden Markov models. Finally, higher-level transitions are taken into account and dynamic programming techniques are used to obtain the maximum-likelihood segmentation of the video sequence. The system achieves a promising classification accuracy of 83.5%, with lightweight computation for feature extraction and model inference, as well as satisfactory accuracy in boundary timing.
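
The final stage, dynamic programming over per-window state likelihoods, can be sketched as a two-state Viterbi pass. In this sketch each state is reduced to a single Gaussian over one feature and the switching penalty is an assumed constant, standing in for the paper's HMM likelihoods and learned transitions.

import numpy as np

def viterbi_two_state(loglik, switch_penalty=2.0):
    """loglik: (T, 2) per-window log-likelihoods for [play, break]; returns labels."""
    T = loglik.shape[0]
    score = loglik[0].copy()
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        prev = score.copy()
        for s in (0, 1):
            stay, switch = prev[s], prev[1 - s] - switch_penalty
            back[t, s] = s if stay >= switch else 1 - s
            score[s] = max(stay, switch) + loglik[t, s]
    labels = np.zeros(T, dtype=int)
    labels[-1] = int(score.argmax())
    for t in range(T - 1, 0, -1):
        labels[t - 1] = back[t, labels[t]]
    return labels  # 0 = play, 1 = break

# Toy feature: high dominant-color ratio during play, low during break (assumed values).
feature = np.concatenate([np.random.normal(0.7, 0.05, 50), np.random.normal(0.2, 0.05, 30)])
means, sigma = np.array([0.7, 0.2]), 0.1
loglik = -((feature[:, None] - means) ** 2) / (2 * sigma ** 2)
print(viterbi_two_state(loglik))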


international conference on acoustics, speech, and signal processing | 2002

Structure analysis of soccer video with hidden Markov models

Lexing Xie; Shih-Fu Chang; Ajay Divakaran; Huifang Sun

In this paper, we present algorithms for parsing the structure of produced soccer programs. The problem is important in the context of a personalized video streaming and browsing system. While prior work focuses on the detection of special events such as goals or corner kicks, this paper is concerned with generic structural elements of the game. We begin by defining two mutually exclusive states of the game, play and break, based on the rules of soccer. We select a domain-tuned feature set, dominant color ratio and motion intensity, based on the special syntax and content characteristics of soccer videos. Each state of the game has a stochastic structure that is modeled with a set of hidden Markov models. Finally, standard dynamic programming techniques are used to obtain the maximum-likelihood segmentation of the game into the two states. The system works well, with 83.5% classification accuracy and good boundary timing in extensive tests over diverse data sets.
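
The "one model per game state" idea can be sketched with an off-the-shelf HMM library (hmmlearn, used here as an assumed stand-in rather than the authors' code): fit one Gaussian HMM on feature windows taken from play and one from break, then label a new window by whichever model scores it higher. The toy feature values are assumptions.

import numpy as np
from hmmlearn.hmm import GaussianHMM  # assumed stand-in library, not the authors' code

rng = np.random.default_rng(0)
# Toy 2-D features per frame: [dominant color ratio, motion intensity] (assumed values).
play_windows = [rng.normal([0.7, 0.6], 0.05, (20, 2)) for _ in range(30)]
break_windows = [rng.normal([0.2, 0.1], 0.05, (20, 2)) for _ in range(30)]

def fit_state_model(windows, n_states=3):
    X = np.vstack(windows)
    lengths = [len(w) for w in windows]
    return GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50).fit(X, lengths)

play_hmm = fit_state_model(play_windows)
break_hmm = fit_state_model(break_windows)

new_window = rng.normal([0.68, 0.55], 0.05, (20, 2))
print("play" if play_hmm.score(new_window) > break_hmm.score(new_window) else "break")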


acm multimedia | 2007

Semantic concept-based query expansion and re-ranking for multimedia retrieval

Apostol Natsev; Alexander Haubold; Jelena Tesic; Lexing Xie; Rong Yan

We study the problem of semantic concept-based query expansion and re-ranking for multimedia retrieval. In particular, we explore the utility of a fixed lexicon of visual semantic concepts for automatic multimedia retrieval and re-ranking purposes. In this paper, we propose several new approaches for query expansion, in which textual keywords, visual examples, or initial retrieval results are analyzed to identify the most relevant visual concepts for the given query. These concepts are then used to generate additional query results and/or to re-rank an existing set of results. We develop both lexical and statistical approaches for text query expansion, as well as content-based approaches for visual query expansion. In addition, we study several other recently proposed methods for concept-based query expansion. In total, we compare 7 different approaches for expanding queries with visual semantic concepts. They are evaluated using a large video corpus and 39 concept detectors from the TRECVID-2006 video retrieval benchmark. We observe consistent improvement over the baselines for all 7 approaches, leading to an overall performance gain of 77% relative to a text retrieval baseline, and a 31% improvement relative to a state-of-the-art multimodal retrieval baseline.
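
A toy sketch of concept-based query expansion and re-ranking, loosely following the abstract: the tiny concept lexicon, the word-overlap matching rule, and the fusion weight are illustrative assumptions, not the paper's lexical/statistical mappings or its 39 TRECVID concept detectors.

concept_lexicon = {"boat": {"boat", "ship", "harbor"},      # assumed toy lexicon
                   "road": {"road", "car", "traffic"},
                   "crowd": {"crowd", "people", "protest"}}

def expand_query(query_terms):
    """Pick concepts whose trigger words overlap the query text."""
    terms = set(query_terms)
    return [c for c, words in concept_lexicon.items() if words & terms]

def rerank(results, concept_scores, concepts, alpha=0.7):
    """Blend the original retrieval score with the mean score of the selected concepts."""
    def fused(doc_id, base):
        c = sum(concept_scores[doc_id].get(k, 0.0) for k in concepts) / max(len(concepts), 1)
        return alpha * base + (1 - alpha) * c
    return sorted(results, key=lambda r: fused(r[0], r[1]), reverse=True)

results = [("shot1", 0.9), ("shot2", 0.8), ("shot3", 0.4)]       # (shot id, text score)
concept_scores = {"shot1": {"boat": 0.1}, "shot2": {"boat": 0.9}, "shot3": {"boat": 0.8}}
print(rerank(results, concept_scores, expand_query({"ship", "at", "sea"})))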


international acm sigir conference on research and development in information retrieval | 2013

Improving LDA topic models for microblogs via tweet pooling and automatic labeling

Rishabh Mehrotra; Scott Sanner; Wray L. Buntine; Lexing Xie

Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and academic abstracts, they are often less coherent when applied to microblog content like Twitter. In this paper, we investigate methods to improve topics learned from Twitter content without modifying the basic machinery of LDA; we achieve this through various pooling schemes that aggregate tweets in a data preprocessing step for LDA. We empirically establish that a novel method of pooling tweets by hashtag leads to a vast improvement in a variety of measures of topic coherence across three diverse Twitter datasets, in comparison to an unmodified LDA baseline and a variety of other pooling schemes. An additional contribution, automatic hashtag labeling, further improves on the hashtag-pooling results for a subset of metrics. Overall, these two novel schemes lead to significantly improved LDA topic models on Twitter content.
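
A small sketch of the hashtag-pooling scheme described above: tweets sharing a hashtag are concatenated into one pseudo-document before running LDA. scikit-learn's LDA is used as a stand-in implementation, and the toy tweets and parameter values are assumptions for illustration.

import re
from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = ["new #nba trade rumor tonight",            # assumed toy tweets
          "great defense in the #nba finals",
          "#election debate starts at nine",
          "polls tighten before the #election"]

# Pooling: one pseudo-document per hashtag (in unpooled LDA, each tweet is its own document).
pools = defaultdict(list)
for t in tweets:
    for tag in re.findall(r"#\w+", t) or ["<none>"]:
        pools[tag].append(t)
docs = [" ".join(ts) for ts in pools.values()]

counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
print(lda.components_.shape)  # (topics, vocabulary terms)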


acm multimedia | 2005

Physics-motivated features for distinguishing photographic images and computer graphics

Tian-Tsong Ng; Shih-Fu Chang; Jessie Hsu; Lexing Xie; Mao-Pei Tsui

The increasing photorealism of computer graphics has made computer graphics a convincing form of image forgery. Therefore, distinguishing photographic images from photorealistic computer graphics has become an important problem for image forgery detection. In this paper, we propose a new geometry-based image model, motivated by the physical image generation process, to tackle this problem. The proposed model reveals certain physical differences between the two image categories, such as the gamma correction in photographic images and the sharp structures in computer graphics. For the problem of image forgery detection, we propose two levels of image authenticity definition, i.e., imaging-process authenticity and scene authenticity, and analyze our technique against these definitions. Such definitions are important for making the concept of image authenticity computable. Apart from offering physical insights, our technique, with a classification accuracy of 83.5%, outperforms prior work, i.e., wavelet features at 80.3% and cartoon features at 71.0%. We also consider a recapturing attack scenario and propose a counter-attack measure. In addition, we constructed a publicly available benchmark dataset with photographic images of diverse content and computer graphics of high photorealism.
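
A generic evaluation scaffold, not the paper's geometry-based model: it only illustrates comparing two candidate feature sets for the photographic-versus-graphics task under the same SVM and train/test protocol. The random "features" are placeholders; the geometry, wavelet, and cartoon features themselves are not reproduced here.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
labels = np.array([0] * 200 + [1] * 200)   # 0 = photographic image, 1 = computer graphics

def evaluate(features, labels):
    X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.3,
                                              random_state=0, stratify=labels)
    return SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te)

# Placeholder feature matrices (assumed); a real system would plug in the extracted features.
feats_a = np.vstack([rng.normal(0.0, 1.0, (200, 192)), rng.normal(0.8, 1.0, (200, 192))])
feats_b = np.vstack([rng.normal(0.0, 1.0, (200, 216)), rng.normal(0.3, 1.0, (200, 216))])
print("feature set A:", evaluate(feats_a, labels))
print("feature set B:", evaluate(feats_b, labels))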


IEEE Transactions on Multimedia | 2012

Semantic Model Vectors for Complex Video Event Recognition

Michele Merler; Bert Huang; Lexing Xie; Gang Hua; Apostol Natsev

We propose semantic model vectors, an intermediate-level semantic representation, as a basis for modeling and detecting complex events in unconstrained real-world videos, such as those from YouTube. The semantic model vectors are extracted using a set of discriminative semantic classifiers, each being an ensemble of SVM models trained from thousands of labeled web images, for a total of 280 generic concepts. Our study reveals that the proposed semantic model vectors representation outperforms, and is complementary to, other low-level visual descriptors for video event modeling. We hence present an end-to-end video event detection system, which combines semantic model vectors with other static or dynamic visual descriptors, extracted at the frame, segment, or full clip level. We perform a comprehensive empirical study on the 2010 TRECVID Multimedia Event Detection task (http://www.nist.gov/itl/iad/mig/med10.cfm), which validates the semantic model vectors representation not only as the best individual descriptor, outperforming state-of-the-art global and local static features as well as spatio-temporal HOG and HOF descriptors, but also as the most compact. We also study early and late feature fusion across the various approaches, leading to a 15% performance boost and an overall system performance of 0.46 mean average precision. In order to promote further research in this direction, we made our semantic model vectors for the TRECVID MED 2010 set publicly available for the community to use (http://www1.cs.columbia.edu/~mmerler/SMV.html).
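
A condensed sketch of the semantic model vector idea: run a bank of concept classifiers on sampled frames, pool their scores into one fixed-length vector per video, and train an event classifier on top. The random linear "classifier bank" and the max-pooling choice are assumptions standing in for the paper's 280 SVM-based concept detectors.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
N_CONCEPTS, N_FRAMES, FEAT_DIM = 280, 30, 64   # 280 concepts as in the abstract; the rest assumed

def semantic_model_vector(frame_features, concept_weights):
    """Score every frame with every concept model, then max-pool over frames."""
    scores = frame_features @ concept_weights.T          # (frames, concepts)
    return scores.max(axis=0)                            # one vector per video

concept_weights = rng.normal(size=(N_CONCEPTS, FEAT_DIM))  # toy linear concept bank
videos = [rng.normal(size=(N_FRAMES, FEAT_DIM)) for _ in range(40)]
X = np.stack([semantic_model_vector(v, concept_weights) for v in videos])
y = rng.integers(0, 2, size=40)                            # toy event labels

event_clf = LinearSVC().fit(X, y)
print(X.shape, event_clf.score(X, y))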


IEEE Transactions on Multimedia | 2013

Learning to Distribute Vocabulary Indexing for Scalable Visual Search

Rongrong Ji; Ling-Yu Duan; Jie Chen; Lexing Xie; Hongxun Yao; Wen Gao

In recent years, there has been an ever-increasing research focus on the Bag-of-Words based near-duplicate visual search paradigm with inverted indexing. One fundamental yet under-explored challenge is how to maintain the large indexing structures within a single server, subject to its memory constraint, which is extremely hard to scale up to millions or even billions of images. In this paper, we propose to parallelize the near-duplicate visual search architecture to index millions of images over multiple servers, including the distribution of both the visual vocabulary and the corresponding indexing structure. We optimize the distribution of vocabulary indexing from a machine learning perspective, which provides a “memory light” search paradigm that leverages the computational power across multiple servers to reduce search latency. In particular, our solution addresses two essential issues: “What to distribute” and “How to distribute”. “What to distribute” is addressed by a “lossy” vocabulary Boosting, which discards both frequent and indiscriminating words prior to distribution. “How to distribute” is addressed by learning an optimal distribution function, which maximizes the uniformity of assigning the words of a given query to multiple servers. We validate the distributed vocabulary indexing scheme in a real-world location search system over 10 million landmark images. Compared to the state-of-the-art alternatives of single-server search [5], [6], [16] and distributed search [23], our scheme yields a significant gain of about 200% speedup at comparable precision while distributing only 5% of the words. We also report excellent robustness even when some servers crash.
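
An illustrative sketch of the “What to distribute” / “How to distribute” split described above: drop the most frequent (assumed least discriminative) visual words, then assign the rest to servers so that the expected per-query load stays balanced. The frequency cutoff and greedy balancing rule are assumptions, not the paper's Boosting-based selection or learned distribution function.

import heapq
import numpy as np

rng = np.random.default_rng(0)
n_words, n_servers = 10_000, 8
word_freq = rng.zipf(1.3, n_words).astype(float)   # toy word-frequency distribution

# "What to distribute": discard the most frequent (assumed least discriminative) 5% of words.
keep = np.argsort(word_freq)[: int(0.95 * n_words)]

# "How to distribute": greedily assign each kept word to the currently least-loaded server.
loads = [(0.0, s) for s in range(n_servers)]
heapq.heapify(loads)
assignment = {}
for w in sorted(keep, key=lambda i: -word_freq[i]):
    load, server = heapq.heappop(loads)
    assignment[int(w)] = server
    heapq.heappush(loads, (load + word_freq[w], server))

print("per-server load:", sorted(round(l) for l, _ in loads))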


acm multimedia | 2002

A utility framework for the automatic generation of audio-visual skims

Hari Sundaram; Lexing Xie; Shih-Fu Chang

In this paper, we present a novel algorithm for generating audio-visual skims from computable scenes. Skims are useful for browsing digital libraries and for on-demand summaries in set-top boxes. A computable scene is a chunk of data that exhibits consistencies with respect to chromaticity, lighting, and sound. There are three key aspects to our approach: (a) visual complexity and grammar, (b) robust audio segmentation, and (c) a utility model for skim generation. We define a measure of the visual complexity of a shot, and map complexity to the minimum time needed to comprehend the shot. Then, we analyze the underlying visual grammar, since it makes the shot sequence meaningful. We segment the audio data into four classes, and then detect significant phrases in the speech segments. The utility functions are defined in terms of the complexity and duration of each segment. The target skim is created using a general constrained utility maximization procedure that maximizes the information content and the coherence of the resulting skim. The objective function is constrained by multimedia synchronization constraints, visual syntax, and penalty functions on the audio and video segments. The user study results indicate that, at compression rates of up to 90%, the optimal skims show statistically significant differences from other skims.
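
A toy sketch of the constrained utility maximization described above: choose a subset of segments whose total duration fits the target skim length while maximizing summed utility. The utilities, durations, and the plain 0/1-knapsack formulation are illustrative assumptions; the paper's objective also encodes comprehension-time limits, visual syntax, and synchronization constraints that are omitted here.

def best_skim(segments, target_seconds):
    """segments: list of (name, duration_seconds, utility); returns the chosen names."""
    # Dynamic program over used seconds (0/1 knapsack on segment durations).
    best = {0: (0.0, [])}
    for name, dur, util in segments:
        for used in sorted(best, reverse=True):
            new_used = used + dur
            if new_used > target_seconds:
                continue
            cand = (best[used][0] + util, best[used][1] + [name])
            if new_used not in best or cand[0] > best[new_used][0]:
                best[new_used] = cand
    return max(best.values())[1]

# Assumed toy segments: (name, seconds, utility).
segments = [("shot1", 8, 0.9), ("shot2", 5, 0.4), ("speech1", 12, 1.2),
            ("shot3", 6, 0.7), ("speech2", 10, 0.6)]
print(best_skim(segments, target_seconds=20))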

Collaboration


Dive into Lexing Xie's collaboration.

Top Co-Authors

Hari Sundaram, Arizona State University
Marian-Andrei Rizoiu, Australian National University
Ajay Divakaran, Mitsubishi Electric Research Laboratories
Huifang Sun, Mitsubishi Electric Research Laboratories