Publication


Featured research published by Eric Chang.


Knowledge Discovery and Data Mining | 2014

Inferring gas consumption and pollution emission of vehicles throughout a city

Jingbo Shang; Yu Zheng; Wenzhu Tong; Eric Chang; Yong Yu

This paper instantly infers the gas consumption and pollution emission of vehicles traveling on a city's road network in a current time slot, using GPS trajectories from a sample of vehicles (e.g., taxicabs). The knowledge can be used to suggest cost-efficient driving routes as well as to identify road segments where gas is significantly wasted. The instant estimation of vehicle emissions can enable pollution alerts and, in the long run, help diagnose the root cause of air pollution. In our method, we first compute the travel speed of each road segment using recently received GPS trajectories. As many road segments are not traversed by trajectories (i.e., data sparsity), we propose a Travel Speed Estimation (TSE) model based on a context-aware matrix factorization approach. TSE leverages features learned from other data sources, e.g., map data and historical trajectories, to deal with the data sparsity problem. We then propose a Traffic Volume Inference (TVI) model to infer the number of vehicles passing each road segment per minute. TVI is an unsupervised Bayesian network that incorporates multiple factors, such as travel speed, weather conditions, and geographical features of a road. Given the travel speed and traffic volume of a road segment, gas consumption and emissions can be calculated based on existing environmental theories. We evaluate our method with extensive experiments using GPS trajectories generated by over 32,000 taxis in Beijing over a period of two months. The results demonstrate the advantages of our method over baselines, validate the contribution of its components, and yield interesting discoveries that benefit society.
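To make the TSE idea concrete, below is a minimal sketch of feature-regularized matrix factorization over a sparse road-segment x time-slot speed matrix. The squared-loss objective, the regularizer pulling segment factors toward a projection of road features, and every name, size, and hyperparameter here are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Sketch: factorize a sparse (road segment x time slot) speed matrix M
# into U @ V.T, pulling segment factors toward a projection of road
# features F (e.g., length, number of lanes) to cope with sparsity.
def factorize(M, mask, F, rank=8, lam=0.1, gamma=0.1, lr=0.01, iters=2000):
    n_seg, n_slot = M.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(n_seg, rank))       # segment factors
    V = rng.normal(scale=0.1, size=(n_slot, rank))      # time-slot factors
    W = rng.normal(scale=0.1, size=(F.shape[1], rank))  # feature projection
    for _ in range(iters):
        E = mask * (U @ V.T - M)                 # error on observed entries
        U -= lr * (E @ V + lam * U + gamma * (U - F @ W))
        V -= lr * (E.T @ U + lam * V)
        W -= lr * (gamma * F.T @ (F @ W - U) + lam * W)
    return U @ V.T                               # dense speed estimates

# Toy usage: 5 segments, 4 slots; speeds in tens of km/h; 0 = unobserved.
M = np.array([[3.0, 0, 2.8, 0], [0, 4.5, 0, 5.0], [2.0, 2.2, 0, 0],
              [0, 0, 6.0, 5.8], [3.5, 0, 0, 3.3]])
mask = (M > 0).astype(float)
F = np.array([[1.0, 2], [2.0, 4], [0.5, 2], [3.0, 6], [1.2, 2]])
print(np.round(factorize(M, mask, F), 1))
```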


Pacific Rim Conference on Multimedia | 2001

Emotion Detection from Speech to Enrich Multimedia Content

Feng Yu; Eric Chang; Ying-Qing Xu; Heung-Yeung Shum

This paper describes an experimental study on the detection of emotion from speech. As computer-based characters such as avatars and virtual chat faces become more common, using emotion to drive the expression of virtual characters becomes more important. This study utilizes a corpus of emotional speech containing 721 short utterances expressing four emotions: anger, happiness, sadness, and the neutral (unemotional) state, captured manually from movies and teleplays. We introduce a new concept for evaluating emotions in speech: emotions are complex enough that most utterances cannot be precisely assigned to a single emotion category, yet most emotional states can be described as a mixture of multiple emotions. Based on this concept we have trained SVMs (support vector machines) to recognize utterances within these four categories and developed an agent that can recognize and express emotions.
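As a rough illustration of the classification step, here is a small scikit-learn sketch that trains an SVM over placeholder prosodic feature vectors and reads the calibrated class probabilities as a mixture over the four emotions. The features and data are synthetic stand-ins; the paper's actual feature set and training setup are not reproduced.

```python
import numpy as np
from sklearn.svm import SVC

# Sketch: treat each utterance as a vector of prosodic statistics
# (placeholders here) and train an SVM over the four categories.
# Calibrated class probabilities give the "mixture of emotions" reading.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))        # e.g., pitch/energy mean, std, range...
y = rng.integers(0, 4, size=200)     # 0=anger 1=happiness 2=sadness 3=neutral

clf = SVC(kernel="rbf", probability=True).fit(X, y)
mix = clf.predict_proba(X[:1])[0]    # soft assignment over the 4 emotions
print({e: round(p, 2) for e, p in
       zip(["anger", "happiness", "sadness", "neutral"], mix)})
```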


International Conference on Acoustics, Speech, and Signal Processing | 2004

Vocabulary-independent search in spontaneous speech

Frank Seide; Peng Yu; Chengyuan Ma; Eric Chang

For efficient organization of speech recordings (meetings, interviews, voice mails, lectures), the ability to search for spoken keywords is essential. Today, most spoken-document retrieval systems use large-vocabulary recognition; for the above scenarios, such systems suffer from both an unpredictable vocabulary/domain and generally high word-error rates (WER). We present a vocabulary-independent system for rapidly indexing and searching spontaneous speech. A speech recognizer generates lattices of phonetic word fragments, against which keywords are matched phonetically. We first show, on a word-based baseline, the need to use recognition alternatives (lattices) in a high-WER context. Then we introduce our new method of phonetic word-fragment lattice generation, which uses longer-span language knowledge than a phoneme recognizer. Finally, we introduce heuristics to compact the lattices to feasible sizes that can be searched efficiently. On the LDC voice mail corpus, we show that vocabulary/domain-independent phonetic search is as accurate as a vocabulary/domain-dependent word-lattice baseline for in-vocabulary keywords (FOMs of 74-75%), and nearly maintains this accuracy for out-of-vocabulary keywords.
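A toy sketch of the matching step may help: if a lattice is represented as arcs labeled with phones, keyword search amounts to finding paths that spell the keyword's phone sequence. The arc representation, the phone labels, and the omission of timing and acoustic scores are all simplifications, not the paper's index structure.

```python
from collections import defaultdict

# Sketch: a phonetic lattice as arcs (src, dst, phone); keyword search
# walks the lattice looking for a path that spells the keyword's phones.
# A real system would also track times and acoustic/LM scores.
def find_keyword(arcs, keyword_phones):
    out = defaultdict(list)
    for src, dst, ph in arcs:
        out[src].append((dst, ph))
    hits = []
    def walk(node, idx, start):
        if idx == len(keyword_phones):
            hits.append((start, node))        # (entry node, exit node)
            return
        for dst, ph in out[node]:
            if ph == keyword_phones[idx]:
                walk(dst, idx + 1, start)
    for src, dst, ph in arcs:
        if ph == keyword_phones[0]:
            walk(dst, 1, src)
    return hits

lattice = [(0, 1, "m"), (1, 2, "iy"), (2, 3, "t"), (0, 1, "n"),
           (1, 2, "ih"), (2, 3, "d")]
print(find_keyword(lattice, ["m", "iy", "t"]))   # "meet" -> [(0, 3)]
```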


International Conference on Acoustics, Speech, and Signal Processing | 2014

Deep learning of feature representation with multiple instance learning for medical image analysis

Yan Xu; Tao Mo; Qiwei Feng; Peilin Zhong; Maode Lai; Eric Chang

This paper studies how to accomplish high-level tasks with a minimum of manual annotation and good feature representations for medical images. In medical image analysis, objects like cells are characterized by significant clinical features. Previously developed features like SIFT and Haar are unable to comprehensively represent such objects, so feature representation is especially important. In this paper, we study automatic extraction of feature representations through deep neural networks (DNNs). Furthermore, detailed annotation of objects is often ambiguous and challenging, so we use the multiple instance learning (MIL) framework for classification training with deep learning features. Several interesting conclusions can be drawn from our work: (1) automatic feature learning outperforms manual features; (2) the unsupervised approach achieves performance close to that of the fully supervised approach (93.56% vs. 94.52%); and (3) MIL performance with coarse labels (96.30%) exceeds supervised performance with fine labels (95.40%) using supervised deep learning features.
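The MIL idea can be sketched compactly: each image is a bag of patch features, only the bag label is observed, and the bag score is the maximum instance score. The logistic max-pooling trainer below is an illustrative stand-in; the paper's networks and MIL formulations are not reproduced.

```python
import numpy as np

# Sketch of the MIL idea: each image is a bag of patch (instance)
# features; only bag labels are known. Bag score = max instance score,
# trained with logistic loss; gradients flow to the argmax instance.
def train_mil(bags, labels, dim, lr=0.1, epochs=200):
    w, b = np.zeros(dim), 0.0
    for _ in range(epochs):
        for X, y in zip(bags, labels):        # X: (n_instances, dim)
            s = X @ w + b
            i = np.argmax(s)                  # most positive instance
            p = 1.0 / (1.0 + np.exp(-s[i]))   # bag probability
            g = p - y                         # dLoss/dscore
            w -= lr * g * X[i]
            b -= lr * g
    return w, b

rng = np.random.default_rng(0)
pos = [rng.normal(size=(8, 4)) + np.array([2, 0, 0, 0]) for _ in range(20)]
neg = [rng.normal(size=(8, 4)) for _ in range(20)]
w, b = train_mil(pos + neg, [1] * 20 + [0] * 20, dim=4)
print(np.round(w, 2))     # first coordinate should dominate
```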


IEEE Transactions on Speech and Audio Processing | 2002

A system for spoken query information retrieval on mobile devices

Eric Chang; Frank Seide; Helen M. Meng; Zhuoran Chen; Yu Shi; Yuk-Chi Li

With the proliferation of handheld devices, information access on mobile devices is a topic of growing relevance. This paper presents a system that allows the user to search for information on mobile devices using spoken natural-language queries. We explore several issues related to the creation of this system, which combines state-of-the-art speech-recognition and information-retrieval technologies. This is the first work we are aware of that evaluates spoken-query information retrieval on a commonly available and well-researched text database, the Chinese news corpus used in the National Institute of Standards and Technology (NIST) TREC-5 and TREC-6 benchmarks. To compare spoken-query retrieval performance across relevant scenarios and recognition accuracies, the benchmark queries, read verbatim by 20 speakers, were recorded simultaneously through three channels: headset microphone, PDA microphone, and cellular phone. Our results show that for mobile devices with high-quality microphones, spoken-query retrieval based on existing technologies yields retrieval precision close to that of perfect text input (mean average precision 0.459 and 0.489, respectively, on TREC-6).
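Since the results are reported as mean average precision (MAP), a short sketch of that metric may be useful. The rankings and relevance judgments below are toy data; only the metric itself is standard.

```python
# Sketch: mean average precision (MAP), as used to compare spoken-query
# vs. text retrieval. ranked is a system's ordering of doc ids for one
# query; relevant is the ground-truth relevant set for that query.
def average_precision(ranked, relevant):
    hits, score = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            score += hits / rank      # precision at each relevant doc
    return score / max(len(relevant), 1)

def mean_average_precision(rankings, judgments):
    aps = [average_precision(r, j) for r, j in zip(rankings, judgments)]
    return sum(aps) / len(aps)

# Toy example with two queries.
print(mean_average_precision(
    [["d1", "d3", "d2"], ["d2", "d1"]],
    [{"d1", "d2"}, {"d1"}]))          # -> 0.666...
```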


International Conference on Acoustics, Speech, and Signal Processing | 2001

Selecting non-uniform units from a very large corpus for concatenative speech synthesizer

Min Chu; Hu Peng; Hong-yun Yang; Eric Chang

This paper proposes a two-module text-to-speech (TTS) system structure that bypasses the prosody model which predicts numerical prosodic parameters for synthetic speech. Instead, many instances of each basic unit from a large speech corpus are classified into categories by a classification and regression tree (CART), in which the expected weighted sum of squared regression errors of prosodic features is used as the splitting criterion. Better prosody is achieved by retaining limited diversity in the prosodic features of instances belonging to the same class. A multi-tier non-uniform unit selection method is presented. It makes the best unit-selection decision by minimizing the concatenation cost of a whole utterance. Since the largest available and suitable units are selected for concatenation, distortion caused by mismatches at concatenation points is minimized. According to informal listening tests, very natural and fluent speech is synthesized.
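The core of unit selection can be sketched as dynamic programming: choose one instance per target unit so that target cost plus join cost over the whole utterance is minimal. The cost functions below are toy placeholders; the paper's multi-tier scheme and CART-derived costs are not modeled.

```python
import numpy as np

# Sketch: unit selection reduced to its core. Pick one instance per
# target position minimizing target cost + concatenation cost over the
# utterance (Viterbi over candidate instances).
def select_units(target_costs, concat_cost):
    # target_costs: per position, an array of per-instance costs.
    n = len(target_costs)
    best = [np.asarray(target_costs[0], float)]
    back = []
    for t in range(1, n):
        cur = np.asarray(target_costs[t], float)
        # cost of arriving at instance j from each previous instance i
        trans = best[-1][:, None] + concat_cost(t - 1, t)   # (prev, cur)
        back.append(trans.argmin(axis=0))
        best.append(cur + trans.min(axis=0))
    path = [int(best[-1].argmin())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

# Toy: 3 positions, 2 candidate instances each; constant join-cost matrix.
tc = [[0.2, 0.9], [0.5, 0.1], [0.3, 0.4]]
cc = lambda a, b: np.array([[0.0, 1.0], [1.0, 0.0]])
print(select_units(tc, cc))   # -> [0, 0, 0] for these costs
```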


IEEE Automatic Speech Recognition and Understanding Workshop | 2001

Automatic accent identification using Gaussian mixture models

Tao Chen; Chao Huang; Eric Chang; Jingchun Wang

It is well known that speaker variability caused by accent is an important factor in speech recognition. Some major accents in China are so different as to make this problem very severe. We propose a Gaussian mixture model (GMM) based Mandarin accent identification method, in which a number of GMMs are trained to identify the most likely accent given test utterances. The identified accent type can then be used to select an accent-dependent model for speech recognition. A multi-accent Mandarin corpus was developed for the task, covering 4 typical accents in China with 1,440 speakers (1,200 for training, 240 for testing). We experimentally explore the effect of the number of GMM components on identification performance. We also investigate how many utterances per speaker are sufficient to reliably recognize the speaker's accent. Finally, we show the correlations among accents and provide some discussion.
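A minimal version of the method is easy to sketch with scikit-learn: fit one GMM per accent on acoustic feature vectors and pick the accent whose model scores a test utterance highest. The synthetic 13-dimensional features stand in for real MFCCs, and the accent names below are placeholders rather than the corpus's actual accent set.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch: one GMM per accent over acoustic feature vectors (MFCCs in
# practice; random placeholders here). An utterance is assigned the
# accent whose GMM gives the highest average log-likelihood.
rng = np.random.default_rng(0)
train = {a: rng.normal(loc=m, size=(500, 13))
         for a, m in [("accent_a", 0.0), ("accent_b", 0.5),
                      ("accent_c", -0.5), ("accent_d", 1.0)]}
models = {a: GaussianMixture(n_components=8, random_state=0).fit(X)
          for a, X in train.items()}

utt = rng.normal(loc=0.5, size=(120, 13))   # frames of one test utterance
scores = {a: m.score(utt) for a, m in models.items()}
print(max(scores, key=scores.get))          # -> likely "accent_b"
```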


Knowledge Discovery and Data Mining | 2015

Forecasting Fine-Grained Air Quality Based on Big Data

Yu Zheng; Xiuwen Yi; Ming Li; Ruiyuan Li; Zhangqing Shan; Eric Chang; Tianrui Li

In this paper, we forecast the readings of an air quality monitoring station over the next 48 hours, using a data-driven method that considers current meteorological data, weather forecasts, and the air quality data of the station and of other stations within a few hundred kilometers. Our predictive model comprises four major components: 1) a linear regression-based temporal predictor to model the local factors of air quality; 2) a neural network-based spatial predictor to model global factors; 3) a dynamic aggregator that combines the predictions of the spatial and temporal predictors according to meteorological data; and 4) an inflection predictor to capture sudden changes in air quality. We evaluate our model with data from 43 cities in China, surpassing the results of multiple baseline methods. We have deployed a system with the Chinese Ministry of Environmental Protection, providing 48-hour fine-grained air quality forecasts for four major Chinese cities every hour. The forecast function is also enabled on Microsoft Bing Maps and the Microsoft Azure cloud platform. Our technology is general and can be applied to other cities globally.
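As a toy illustration of component 3, the sketch below blends a temporal (local) and a spatial (global) prediction with a meteorology-dependent weight: stronger wind shifts trust toward the spatial predictor, since pollutants travel. The logistic weighting and its parameters are invented for illustration; the paper learns its aggregator from data.

```python
import numpy as np

# Sketch of the aggregation step: mix a temporal (local) and a spatial
# (global) AQI prediction with a weight driven by current wind speed.
def aggregate(temporal_pred, spatial_pred, wind_speed, w=0.8, b=-2.0):
    # Strong wind -> pollutants travel -> trust the spatial predictor more.
    alpha = 1.0 / (1.0 + np.exp(-(w * wind_speed + b)))
    return alpha * spatial_pred + (1.0 - alpha) * temporal_pred

print(aggregate(80.0, 120.0, wind_speed=1.0))   # calm: closer to 80
print(aggregate(80.0, 120.0, wind_speed=8.0))   # windy: closer to 120
```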


Medical Image Analysis | 2014

Weakly supervised histopathology cancer image segmentation and classification

Yan Xu; Jun-Yan Zhu; Eric Chang; Maode Lai; Zhuowen Tu

Labeling a histopathology image as having cancerous regions or not is a critical task in cancer diagnosis; it is also clinically important to segment the cancer tissues and cluster them into various classes. Existing supervised approaches to image classification and segmentation require detailed manual annotations of the cancer pixels, which are time-consuming to obtain. In this paper, we propose a new learning method, multiple clustered instance learning (MCIL), along the lines of weakly supervised learning, for histopathology image segmentation. The proposed MCIL method simultaneously performs image-level classification (cancer vs. non-cancer image), medical image segmentation (cancer vs. non-cancer tissue), and patch-level clustering (different classes). We embed the clustering concept into the multiple instance learning (MIL) setting and derive a principled solution to performing the above three tasks in an integrated framework. In addition, we introduce contextual constraints as a prior for MCIL, which further reduces the ambiguity in MIL. Experimental results on histopathology colon cancer images and cytology images demonstrate a significant advantage of MCIL over competing methods.
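One way to picture MCIL is as K instance-level classifiers, one per latent cancer subtype, with a bag scored by its best patch under its best cluster; patches are implicitly clustered by whichever classifier claims them. The winner-take-all trainer below is a loose illustrative analogy, not the paper's derivation or solver.

```python
import numpy as np

# Sketch of the MCIL intuition: K linear instance classifiers (rows of
# W), one per latent subtype. A bag (image) is positive if any patch
# fires under any cluster; only the winning cluster's row is updated.
def mcil_step(bags, labels, W, lr=0.05):
    for X, y in zip(bags, labels):            # X: (n_patches, dim)
        S = X @ W.T                           # (n_patches, K) scores
        i, k = np.unravel_index(S.argmax(), S.shape)
        p = 1.0 / (1.0 + np.exp(-S[i, k]))    # bag probability via max
        W[k] -= lr * (p - y) * X[i]           # winner-take-all update
    return W

rng = np.random.default_rng(0)
dim, K = 4, 2
# Two positive subtypes (shifted in different coordinates) plus negatives.
bags = ([rng.normal(size=(6, dim)) + np.eye(dim)[0] * 2 for _ in range(15)]
        + [rng.normal(size=(6, dim)) + np.eye(dim)[1] * 2 for _ in range(15)]
        + [rng.normal(size=(6, dim)) for _ in range(30)])
labels = [1] * 30 + [0] * 30
W = rng.normal(scale=0.01, size=(K, dim))
for _ in range(100):
    W = mcil_step(bags, labels, W)
print(np.round(W, 2))   # rows tend to specialize, one per subtype
```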


Computer Vision and Pattern Recognition | 2012

Unsupervised object class discovery via saliency-guided multiple class learning

Jun-Yan Zhu; Jiajun Wu; Yichen Wei; Eric Chang; Zhuowen Tu

Discovering object classes from images in a fully unsupervised way is an intrinsically ambiguous task; saliency detection approaches, however, ease the burden on unsupervised learning. We develop an algorithm for simultaneously localizing objects and discovering object classes via bottom-up (saliency-guided) multiple class learning (bMCL), and make the following contributions: (1) saliency detection is adopted to convert unsupervised learning into multiple instance learning, formulated as bottom-up multiple class learning (bMCL); (2) we utilize Discriminative EM (DiscEM) to solve our bMCL problem and show DiscEM's connection to the MIL-Boost method [34]; (3) localizing objects, discovering object classes, and training object detectors are performed simultaneously in an integrated framework; (4) significant improvements over existing methods for multi-class object discovery are observed. In addition, we show single-class localization as a special case of our bMCL framework and demonstrate the advantage of bMCL over purely data-driven saliency methods.
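Contribution (1) can be sketched as a preprocessing step: a saliency detector turns unlabeled images into MIL bags, with salient windows forming a positive bag (the object is among them) and the least salient windows forming negatives. Everything below, including the mean-intensity stand-in for a saliency detector and both helper names, is a hypothetical illustration rather than the paper's pipeline.

```python
import numpy as np

# Sketch: convert an unlabeled image into MIL bags via saliency.
# saliency(image, box) is any detector scoring a window; here a toy.
def image_to_bags(image, saliency, n_windows=20, top_k=5):
    windows = sample_windows(image, n_windows)          # candidate boxes
    scores = np.array([saliency(image, b) for b in windows])
    order = np.argsort(-scores)
    pos_bag = [windows[i] for i in order[:top_k]]       # salient -> positive
    neg_bag = [windows[i] for i in order[-top_k:]]      # least salient -> negative
    return pos_bag, neg_bag

def sample_windows(image, n, rng=np.random.default_rng(0)):
    h, w = image.shape[:2]
    boxes = []
    for _ in range(n):
        x0, y0 = rng.integers(0, w // 2), rng.integers(0, h // 2)
        boxes.append((int(x0), int(y0), int(x0) + w // 2, int(y0) + h // 2))
    return boxes

img = np.zeros((64, 64)); img[40:60, 40:60] = 1.0       # bright "object"
mean_intensity = lambda im, b: im[b[1]:b[3], b[0]:b[2]].mean()
pos, neg = image_to_bags(img, mean_intensity)
print(pos[0], neg[0])
```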

Collaboration


Dive into Eric Chang's collaborations.

Top Co-Authors

Zhuowen Tu

University of California

Jun-Yan Zhu

University of California
