Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Boon Pang Lim is active.

Publication


Featured research published by Boon Pang Lim.


International Conference on Acoustics, Speech, and Signal Processing | 2014

Strategies for Vietnamese keyword search

Nancy F. Chen; Sunil Sivadas; Boon Pang Lim; Hoang Gia Ngo; Haihua Xu; Van Tung Pham; Bin Ma; Haizhou Li

We propose strategies for a state-of-the-art Vietnamese keyword search (KWS) system developed at the Institute for Infocomm Research (I2R). The KWS system exploits acoustic features characterizing the creaky voice quality peculiar to lexical tones in Vietnamese, a minimal-resource transliteration framework to alleviate out-of-vocabulary issues from foreign loan words, and a proposed system combination scheme, FusionX. We show that the proposed creaky voice quality features complement pitch-related features, reaching fusion gains of 17.7% relative (6.9% absolute). To the best of our knowledge, the proposed transliteration framework is the first reported rule-based system for Vietnamese; it outperforms statistical-approach baselines by 14.93-36.73% relative on foreign loan word search tasks. Using FusionX to combine three sub-systems, the actual term-weighted value (ATWV) reaches 0.4742, exceeding the ATWV = 0.3 benchmark for IARPA Babel participants in the NIST OpenKWS13 Evaluation.
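
For reference, the actual term-weighted value (ATWV) cited above is the standard NIST OpenKWS metric. Below is a minimal sketch of how TWV can be computed from per-keyword miss and false-alarm counts, assuming the usual NIST convention of beta = 999.9 and one false-alarm trial per second of speech that does not contain the keyword; the function and variable names are illustrative, not taken from the paper.

# Minimal TWV sketch under the stated assumptions.
def term_weighted_value(detections, speech_duration_s, beta=999.9):
    """detections: dict keyword -> (n_correct, n_false_alarm, n_true_occurrences)."""
    twv_terms = []
    for kw, (n_corr, n_fa, n_true) in detections.items():
        if n_true == 0:
            continue  # keywords with no reference occurrences are excluded
        p_miss = 1.0 - n_corr / n_true
        p_fa = n_fa / (speech_duration_s - n_true)  # non-target trials ~ seconds of speech
        twv_terms.append(1.0 - (p_miss + beta * p_fa))
    return sum(twv_terms) / len(twv_terms)

# Example: two keywords scored over 10 hours (36000 s) of speech.
print(term_weighted_value({"hanoi": (8, 3, 10), "saigon": (4, 1, 5)}, 36000.0))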


International Conference on Acoustics, Speech, and Signal Processing | 2015

Low-resource keyword search strategies for Tamil

Nancy F. Chen; Chongjia Ni; I-Fan Chen; Sunil Sivadas; Van Tung Pham; Haihua Xu; Xiong Xiao; Tze Siong Lau; Su Jun Leow; Boon Pang Lim; Cheung-Chi Leung; Lei Wang; Chin-Hui Lee; Alvina Goh; Eng Siong Chng; Bin Ma; Haizhou Li

We propose strategies for a state-of-the-art keyword search (KWS) system developed by the SINGA team in the context of the 2014 NIST Open Keyword Search Evaluation (OpenKWS14) using conversational Tamil provided by the IARPA Babel program. To tackle low-resource challenges and the rich morphological nature of Tamil, we present highlights of our current KWS system, including: (1) Submodular optimization data selection to maximize acoustic diversity through Gaussian-component-indexed N-grams; (2) Keyword-aware language modeling; (3) Subword modeling of morphemes and homophones.
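
Point (1) above amounts to greedy maximization of a submodular coverage objective. The sketch below illustrates the idea, assuming each candidate utterance has already been mapped to a set of Gaussian-component-index n-grams; the objective and all names are illustrative simplifications, not the paper's exact recipe.

# Greedy selection under a coverage (submodular) objective: repeatedly pick the
# utterance whose Gaussian-index n-grams add the most not-yet-covered types.
def greedy_select(utt_ngrams, budget):
    """utt_ngrams: dict utt_id -> set of Gaussian-component-index n-grams."""
    covered, selected = set(), []
    remaining = dict(utt_ngrams)
    while remaining and len(selected) < budget:
        # Marginal gain = number of new n-gram types the utterance would add.
        best = max(remaining, key=lambda u: len(remaining[u] - covered))
        if not (remaining[best] - covered):
            break  # nothing left adds new coverage
        selected.append(best)
        covered |= remaining.pop(best)
    return selected

utts = {"u1": {(3, 7), (7, 2)}, "u2": {(3, 7)}, "u3": {(9, 1), (1, 4)}}
print(greedy_select(utts, budget=2))  # -> ['u1', 'u3']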


International Symposium on Chinese Spoken Language Processing | 2014

A novel keyword+LVCSR-filler based grammar network representation for spoken keyword search

I-Fan Chen; Chongjia Ni; Boon Pang Lim; Nancy F. Chen; Chin-Hui Lee

A novel spoken keyword search grammar representation framework is proposed to combine the advantages of conventional keyword-filler based keyword search (KWS) and LVCSR-based KWS systems. The proposed grammar representation allows keyword search systems to be flexible about keyword target settings, as in LVCSR-based keyword search. In low-resource scenarios it also provides the system with the ability to achieve the high keyword detection accuracies of keyword-filler based KWS systems and the low false alarm rate inherent in LVCSR-based KWS systems. In this paper the proposed grammar is realized in three ways by modifying the language models used in LVCSR-based KWS. Tested on the evalpart1 data of the IARPA Babel OpenKWS13 Vietnamese tasks, experimental results indicate that the combined approaches achieve a significant ATWV improvement of more than 50% relative (from 0.2093 to 0.3287) on the limited-language-pack task, while a 20% relative ATWV improvement (from 0.4578 to 0.5486) is observed on the full-language-pack task.
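
The paper realizes the grammar by modifying the language models used in LVCSR-based KWS. As a loose illustration of LM-level keyword awareness (not the paper's exact realization), the sketch below interpolates a background unigram distribution with a flat distribution over the keyword list so that keyword words retain probability mass during decoding; the interpolation weight and example words are invented for illustration.

# Illustrative keyword-aware unigram biasing; a loose sketch only.
def keyword_aware_unigrams(background, keywords, kw_mass=0.1):
    """background: dict word -> prob (sums to 1); keywords: list of words."""
    boosted = {w: (1.0 - kw_mass) * p for w, p in background.items()}
    per_kw = kw_mass / len(keywords)
    for kw in keywords:
        boosted[kw] = boosted.get(kw, 0.0) + per_kw  # keyword mass added on top
    return boosted

lm = {"xin": 0.4, "chao": 0.3, "ban": 0.3}
print(keyword_aware_unigrams(lm, keywords=["hanoi", "chao"]))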


International Conference on Acoustics, Speech, and Signal Processing | 2014

Discriminative score normalization for keyword search decision

Van Tung Pham; Haihua Xu; Nancy F. Chen; Sunil Sivadas; Boon Pang Lim; Eng Siong Chng; Haizhou Li

Many keyword search (KWS) systems make “hit/false alarm (FA)” decisions based on lattice-based posterior probabilities, which are not comparable across keywords. Therefore, score normalization is essential for a KWS system. In this paper, we investigate the integration of two novel features, ranking-score and relative-to-max, into a discriminative score normalization method. These features are extracted by considering all competing hypotheses of a putative detection. A metric-based normalization method is also applied as a post-processing step to further optimize the term-weighted value (TWV) evaluation metric. We report empirical improvements over standard baselines using the Vietnamese data from IARPA's Babel program in the NIST OpenKWS13 Evaluation setup.
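
The two features named above are computed from the competing hypotheses of a putative detection. The sketch below gives one plain reading of those definitions: a rank among competitors and a score relative to the strongest competitor. It is an interpretation of the abstract, not the paper's exact formulation, and all names are illustrative.

# Sketch of two normalization features for one putative keyword detection.
def normalization_features(detection_score, competitor_scores):
    scores = sorted(competitor_scores + [detection_score], reverse=True)
    rank = scores.index(detection_score) + 1          # ranking-score: 1 = best hypothesis
    relative_to_max = detection_score / max(scores)   # relative-to-max feature
    return {"ranking_score": rank, "relative_to_max": relative_to_max}

print(normalization_features(0.42, [0.65, 0.30, 0.11]))
# {'ranking_score': 2, 'relative_to_max': 0.6461...}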


International Conference on Acoustics, Speech, and Signal Processing | 2015

A keyword-aware grammar framework for LVCSR-based spoken keyword search

I-Fan Chen; Chongjia Ni; Boon Pang Lim; Nancy F. Chen; Chin-Hui Lee

In this paper, we propose a method to realize the recently developed keyword-aware grammar for LVCSR-based keyword search using weighted finite-state automata (WFSA). The approach creates a compact and deterministic grammar WFSA by inserting keyword paths into an existing n-gram WFSA. Tested on the evalpart1 data of the IARPA Babel OpenKWS13 Vietnamese and OpenKWS14 Tamil limited-language-pack tasks, the experimental results indicate that the proposed keyword-aware framework achieves a significant improvement, with about a 50% relative actual term-weighted value (ATWV) gain for both languages. Comparisons between the keyword-aware grammar and our previously proposed n-gram LM based approximation of the grammar also show that the KWS performances of the two realizations are complementary.
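
The core operation is adding a dedicated path per keyword to the decoding graph. The toy sketch below inserts one path per multi-word keyword into a word automaton represented as a plain arc list; a real system would use weighted finite-state machinery such as OpenFst, and the states, weights, and example keyword here are invented for illustration.

# Toy illustration of inserting keyword paths into a word automaton.
# Arcs are tuples (src_state, dst_state, word, weight).
def insert_keyword_paths(arcs, n_states, start, final, keywords, kw_weight=0.0):
    """Add one dedicated path start -> ... -> final per keyword phrase."""
    arcs = list(arcs)
    for kw in keywords:
        words = kw.split()
        prev = start
        for w in words[:-1]:
            arcs.append((prev, n_states, w, kw_weight))   # new intermediate state
            prev, n_states = n_states, n_states + 1
        arcs.append((prev, final, words[-1], kw_weight))  # close the path
    return arcs, n_states

arcs = [(0, 1, "xin", 1.2), (1, 2, "chao", 0.7)]
arcs, n = insert_keyword_paths(arcs, n_states=3, start=0, final=2,
                               keywords=["thanh pho ho chi minh"])
print(len(arcs), n)  # 7 arcs; next free state index is 7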


International Conference on Acoustics, Speech, and Signal Processing | 2014

Subspace Gaussian mixture model for computer-assisted language learning

Rong Tong; Boon Pang Lim; Nancy F. Chen; Bin Ma; Haizhou Li

In computer-assisted language learning (CALL), speech data from non-native speakers are usually insufficient for acoustic modeling. Subspace Gaussian Mixture Models (SGMMs) have been effective in training automatic speech recognition (ASR) systems with limited amounts of training data. Therefore, in this work, we propose to use SGMMs to improve fluency assessment performance. In particular, the contributions of this work are: (i) the proposed SGMM acoustic model trained with native data outperforms the MMI-GMM/HMM baseline by 25% relative; (ii) when a small amount of non-native training data is incorporated, the SGMM acoustic model further improves fluency assessment performance by 47% relative.
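
For context, the data efficiency of SGMMs comes from tying each HMM state's Gaussian parameters to a low-dimensional state vector; the equations below follow the standard SGMM formulation (Povey et al.), not anything specific to this paper.

% Standard SGMM parameterization: state j, shared Gaussian index i = 1..I
\mu_{ji} = \mathbf{M}_i \mathbf{v}_j, \qquad
w_{ji} = \frac{\exp(\mathbf{w}_i^{\top} \mathbf{v}_j)}{\sum_{i'=1}^{I} \exp(\mathbf{w}_{i'}^{\top} \mathbf{v}_j)}, \qquad
p(\mathbf{x} \mid j) = \sum_{i=1}^{I} w_{ji}\, \mathcal{N}(\mathbf{x};\, \mu_{ji},\, \mathbf{\Sigma}_i)

Here \mathbf{v}_j is the state vector, while \mathbf{M}_i, \mathbf{w}_i, and \mathbf{\Sigma}_i are shared across all states, which is why comparatively little training data is needed per state.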


International Conference on Human-Computer Interaction | 2015

Designing IDA - An Intelligent Driver Assistant for Smart City Parking in Singapore

Andreea I. Niculescu; Mei Quin Lim; Seno A. Wibowo; Kheng Hui Yeo; Boon Pang Lim; Michael Popow; Dan Chia; Rafael E. Banchs

A problem modern cities currently face is increased traffic flow and heavily congested parking. To reduce the time and traffic caused by searching for available parking, we propose IDA, an Intelligent Driver Assistant. The main objective of IDA is to help drivers find suitable parking places, monitor car park availability online, and redirect drivers when the number of free spots drops to a critical level. Unlike other parking applications, IDA uses speech to interact with the driver and becomes an active helper during the navigation process by dynamically adjusting parking decisions based on the traffic situation. The paper presents the current work in progress, interaction design aspects, and use cases, as well as initial user feedback received during a public event where IDA was showcased.


Conference of the International Speech Communication Association | 2016

Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages.

Van Hai Do; Nancy F. Chen; Boon Pang Lim; Mark Hasegawa-Johnson

When speech data with native transcriptions are scarce in an under-resourced language, automatic speech recognition (ASR) must be trained using other methods. Semi-supervised learning first labels the speech using ASR from other languages, then re-trains the ASR using the generated labels. Mismatched crowdsourcing asks crowd-workers unfamiliar with the language to transcribe it. In this paper, self-training and mismatched crowdsourcing are compared under exactly matched conditions. Specifically, speech data of the target language are decoded by the source language ASR systems into source language phone/word sequences. We find that (1) human mismatched crowdsourcing and cross-lingual ASR have similar error patterns, but different specific errors. (2) These two sources of information can be usefully combined in order to train a better target-language ASR. (3) The differences between the error patterns of non-native human listeners and non-native ASR are small, but when differences are observed, they provide information about the relationship between the phoneme systems of the annotator/source language (Mandarin) and the target language (Vietnamese).


International Conference on Acoustics, Speech, and Signal Processing | 2015

Tokenizing fundamental frequency variation for Mandarin tone error detection

Rong Tong; Nancy F. Chen; Boon Pang Lim; Bin Ma; Haizhou Li

Tone errors are commonly observed in tonal language acquisition. Correct tone production is especially challenging for native speakers of non-tonal languages. In this paper, we exploit the fundamental frequency variation (FFV) feature for Mandarin tone error detection. We propose to use FFV through two approaches: (1) Concatenating FFV features alongside standard speech recognition features; (2) Token FFV: characterizing pitch variation with longer temporal context through GMM tokenization and n-gram language modeling. Our results show that tone error detection improves by incorporating FFV features and that the two approaches are complementary.
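
Approach (2) can be pictured as vector quantization of frame-level pitch-variation features followed by counting token n-grams. The sketch below uses scikit-learn's GaussianMixture for the tokenization step; the FFV features themselves are assumed to come from elsewhere, and the array shapes, GMM size, and random data are placeholders.

# GMM tokenization + bigram counting over frame-level pitch-variation features.
from collections import Counter
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
ffv_frames = rng.normal(size=(500, 7))           # stand-in for real FFV features

gmm = GaussianMixture(n_components=16, random_state=0).fit(ffv_frames)
tokens = gmm.predict(ffv_frames)                 # one discrete token per frame

bigrams = Counter(zip(tokens[:-1], tokens[1:]))  # token n-gram statistics
print(bigrams.most_common(3))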


International Conference on Asian Language Processing | 2016

Speech recognition of under-resourced languages using mismatched transcriptions

Van Hai Do; Nancy F. Chen; Boon Pang Lim; Mark Hasegawa-Johnson

Mismatched crowdsourcing is a technique for deriving speech transcriptions using crowd workers unfamiliar with the language being spoken. This technique is especially useful for under-resourced languages, since it is hard to hire native transcribers. In this paper, we demonstrate that using mismatched transcriptions for adaptation improves speech recognition performance under limited matched-training-data conditions. In addition, we show that data augmentation not only improves the performance of the monolingual system but also makes mismatched-transcription adaptation more effective.

Collaboration


Dive into Boon Pang Lim's collaborations.

Top Co-Authors

Haizhou Li (National University of Singapore)
Van Hai Do (Nanyang Technological University)
Haihua Xu (Nanyang Technological University)
Van Tung Pham (Nanyang Technological University)
Chin-Hui Lee (Georgia Institute of Technology)
I-Fan Chen (Georgia Institute of Technology)