Tien-Ping Tan
Universiti Sains Malaysia
Publication
Featured research published by Tien-Ping Tan.
2009 Oriental COCOSDA International Conference on Speech Database and Assessments | 2009
Tien-Ping Tan; Xiong Xiao; Enya Kong Tang; Eng Siong Chng; Haizhou Li
This paper presents the development of the speech, text and pronunciation dictionary resources required to build a large-vocabulary speech recognizer for the Malay language. The project is a collaboration among three universities: USM and MMU from Malaysia, and NTU from Singapore. The Malay speech corpus consists of read speech (speaker independent/dependent and accent independent/dependent) and broadcast news. To date, 90 speakers have been recorded, yielding nearly 70 hours of read speech, and 10 hours of broadcast news from local TV stations in Malaysia have been transcribed. The text corpus consists of 700 MB of data extracted from Malaysia's local news web pages from 1998–2008, and a rule-based G2P tool was developed to generate the pronunciation dictionary.
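To illustrate what a rule-based grapheme-to-phoneme (G2P) step can look like, here is a minimal sketch. It is not the tool described in the paper: the rule table is a small assumed subset of Malay spelling conventions (a few digraphs and single letters), the function name `g2p` is hypothetical, and ambiguities such as the two readings of the letter "e" are ignored.

```python
# Assumed, deliberately incomplete rule tables for illustration only.
DIGRAPHS = {"ng": "ŋ", "ny": "ɲ", "sy": "ʃ", "kh": "x"}
SINGLES = {"c": "tʃ", "j": "dʒ", "y": "j"}


def g2p(word):
    """Map a word to a list of phone symbols using longest-match rules."""
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            # A two-letter rule takes priority over single-letter rules.
            phones.append(DIGRAPHS[pair])
            i += 2
        else:
            # Letters without a rule are passed through unchanged.
            phones.append(SINGLES.get(word[i], word[i]))
            i += 1
    return phones


bunga_phones = g2p("bunga")
nyanyi_phones = g2p("nyanyi")
```

A real dictionary generator would need a much larger rule set plus an exception list for loanwords and names.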
International Conference on Acoustics, Speech, and Signal Processing | 2007
Tien-Ping Tan; Laurent Besacier
This paper proposes three interpolation techniques that use the target language and the speaker's native language to improve a non-native speech recognition system. These techniques are manual interpolation, weighted least squares and eigenvoices. Each can be used under different situations and constraints. In contrast to the weighted least squares and eigenvoices methods, manual interpolation can be performed offline without any adaptation data. These methods can also be combined with MLLR to improve the recognition rate. Experiments presented in this paper show that the best non-native adaptation method, combined with MLLR, can give a 10% absolute WER reduction on a French automatic speech recognition system for both Chinese and Vietnamese native speakers.
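One simple reading of manual interpolation is a fixed-weight linear combination of corresponding Gaussian mean vectors from the target-language and native-language acoustic models. The sketch below assumes that reading; the function name, the weight and the toy vectors are hypothetical and are not taken from the paper.

```python
def interpolate_means(target_mean, native_mean, weight):
    """Linearly interpolate two Gaussian mean vectors.

    weight = 1.0 keeps the target-language mean,
    weight = 0.0 keeps the native-language mean.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if len(target_mean) != len(native_mean):
        raise ValueError("mean vectors must have the same dimension")
    return [weight * t + (1.0 - weight) * n
            for t, n in zip(target_mean, native_mean)]


# Toy three-dimensional means (made-up values).
target = [1.0, 2.0, 3.0]
native = [3.0, 2.0, 1.0]
adapted = interpolate_means(target, native, 0.5)
```

Because the weight is chosen by hand, this kind of combination needs no adaptation data, which matches the offline property mentioned in the abstract.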
International Conference on Asian Language Processing | 2012
Sarah Samson Juan; Laurent Besacier; Tien-Ping Tan
This paper explores speech recognition performance for the Malay language with multiple accents from speakers of different origins or ethnicities. Accented speech poses an accuracy problem for automatic speech recognition systems. This frequently occurs with non-native speakers of a language because of insufficient non-native data in the recognizer. In this study, we investigate this problem by building a Malay model in our recognizer and testing its performance for speakers of various ethnicities. Our Malay corpora consist of read speech and texts collected from local newspapers in Malaysia. The speakers who contributed the speech are of different ethnic backgrounds. We employ context-dependent models, applying linear discriminant analysis for our acoustic model, and a trigram-based language model. Our experiments show improved results when linear discriminant analysis was employed in our model, while our recognizer performed worst for speakers with accents that are not represented in the training data.
International Conference on Asian Language Processing | 2012
Basem H. A. Ahmed; Tien-Ping Tan
In this paper, we propose a novel approach to automatic recognition of code-switching speech. The proposed method consists of two phases: automatic speech recognition and rescoring. The framework uses parallel automatic speech recognizers for speech recognition. The lattices produced are subsequently joined and rescored to estimate the most probable word sequence. Experiments show that the proposed approach achieves a WER reduction of more than 5% when tested on English/Malay code-switching speech. In addition, the framework has been shown to be very robust. We also propose an acoustic model adaptation approach, a hybrid of interpolation and merging, to cross-adapt acoustic models of different languages to recognize code-switching speech. The adapted acoustic models show a reduction in WER when they are used for code-switching speech recognition.
International Conference on Asian Language Processing | 2011
Yin-Lai Yeong; Tien-Ping Tan
In this paper, we propose an automatic language identification approach for code-switching sentences that uses morphological structures and syllable sequences. The approach was tested on Malay-English code-switching sentences. The proposed language identification approach achieves 90.75% accuracy on the vocabularies. Our approach was further improved by combining knowledge from other levels in the sentence: word and alphabet. The additional information improves the accuracy of our language identification method to 96.36%.
Journal of Information Science | 2017
Saif A. Ahmad Alrababah; Keng Hoon Gan; Tien-Ping Tan
Online customer reviews are an important assessment tool for businesses as they contain feedback that is valuable from the customer perspective. These reviews provide a significant basis on which potential customers can select the product that best meets their preferences. In online reviews, customers describe positive or negative experiences with a product or service or any part of it (i.e. features). Consumers frequently experience difficulty finding the desired product for comparison because of the massive number of online reviews. The automatic extraction of important product features is necessary to support customers in search of relevant product features. These features are the criteria that make it possible for customers to characterise different types of products. This article proposes a domain-independent approach for identifying explicit opinionated features and attributes that are strongly related to a specific domain product using lexicographer files in WordNet. In our approach, N-gram analysis and the SentiStrength opinion lexicon have been employed to support the extraction of opinionated features. The empirical evaluation of the proposed system using online reviews of two popular datasets of supervised and unsupervised systems showed that our approach achieved competitive results for feature extraction from product reviews.
World Congress on Information and Communication Technologies | 2014
Moon Hong Wun; Li-Pei Wong; Ahamad Tajudin Khader; Tien-Ping Tan
The Sequential Ordering Problem (SOP) is a type of Combinatorial Optimization Problem (COP). Solving an SOP requires finding a feasible Hamiltonian path with minimum cost without violating the precedence constraints. The SOP models a myriad of real-world industrial applications, particularly in the fields of transportation, vehicle routing and production planning. The main objective of this research is to propose an idea for solving the SOP using the Bee Colony Optimization (BCO) algorithm. The underlying mechanism of the BCO algorithm is the bee foraging behavior in a typical bee colony. Throughout the research, the SOP benchmark problems from TSPLIB are chosen as the testbed to evaluate the performance of the BCO algorithm in terms of the solution cost and the computational time needed to obtain an optimum solution. Moreover, efforts are made to investigate the feasibility of using a Genetic Algorithm to optimally tune the parameters of the existing BCO model. On average, over the selected 40 benchmark problems, the proposed method has successfully solved 9 (22.5%) benchmark problems to optimum, 17 (42.5%) benchmark problems ≤ 1% of deviation from the known optimum, and 37 (85%) benchmark problems ≤ 5% of deviation from the known optimum. Overall, the 40 benchmark problems are solved to 2.19% from the known optimum on average.
Intelligent Systems Design and Applications | 2013
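The SOP definition above (a minimum-cost Hamiltonian path that respects precedence constraints) and the "deviation from the known optimum" metric can be made concrete with a small sketch. The cost matrix, the precedence pairs and all function names below are made up for illustration, and the brute-force search stands in for BCO only because the toy instance has four nodes.

```python
from itertools import permutations


def path_cost(path, cost):
    """Sum of edge costs along an open path (no return to the start)."""
    return sum(cost[a][b] for a, b in zip(path, path[1:]))


def is_feasible(path, n, precedences):
    """True if path visits every node once and respects (before, after) pairs."""
    if sorted(path) != list(range(n)):
        return False
    position = {node: i for i, node in enumerate(path)}
    return all(position[a] < position[b] for a, b in precedences)


def deviation_percent(found, optimum):
    """Percentage by which a solution cost exceeds the known optimum."""
    return 100.0 * (found - optimum) / optimum


# Toy asymmetric cost matrix and precedence constraints (hypothetical).
cost = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
precedences = [(0, 3), (1, 2)]  # node 0 before node 3, node 1 before node 2

# Exhaustive search is only viable for tiny instances like this one.
feasible = [p for p in permutations(range(4)) if is_feasible(p, 4, precedences)]
optimum = min(path_cost(p, cost) for p in feasible)
candidate = [0, 1, 3, 2]
gap = deviation_percent(path_cost(candidate, cost), optimum)
```

On this instance the candidate path costs 18 against an optimum of 16, a deviation of 12.5%.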
Li-Pei Wong; Ahamad Tajudin Khader; Mohammed Azmi Al-Betar; Tien-Ping Tan
The Asymmetric Traveling Salesman Problem (ATSP) is one of the Combinatorial Optimization Problems that has been intensively studied in computer science and operations research. Solving the ATSP is NP-hard, and it is harder still for large-scale instances. This paper addresses the ATSP using a hybrid approach that integrates the generic Bee Colony Optimization (BCO) framework and an insertion-based local search procedure. The generic BCO framework computationally realizes the bee foraging behaviour in a typical bee colony, where bees travel across different locations to discover new food sources and perform waggle dances to recruit more bees towards newly discovered food sources. Besides the bee foraging behaviour, the generic BCO framework is enriched with an initialization engine, a fragmented solution construction mechanism, a local search and a pruning strategy. When the proposed algorithm is tested on a set of 27 ATSP benchmark problem instances, 37% of the benchmark instances are consistently solved to optimum, and 89% of the problem instances are solved optimally at least once. On average, the proposed BCO algorithm obtains a 0.140% deviation from the known optimum across all 27 instances. In terms of average computational time, the proposed algorithm requires 48.955 s (< 1 minute) to obtain the best tour length for each instance.
International Conference on Asian Language Processing | 2011
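As a rough illustration of what an insertion-based local search on an ATSP tour can look like, the sketch below removes one city at a time and reinserts it at the first position that shortens the tour, repeating until no move helps. This is a generic first-improvement variant written for this page, not the procedure from the paper; the distance matrix and function names are hypothetical.

```python
def tour_length(tour, dist):
    """Length of a closed tour under a possibly asymmetric distance matrix."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def insertion_local_search(tour, dist):
    """Repeatedly move a single city to a better position until stuck."""
    best = list(tour)
    best_len = tour_length(best, dist)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            city = best[i]
            rest = best[:i] + best[i + 1:]
            for j in range(len(rest) + 1):
                candidate = rest[:j] + [city] + rest[j:]
                cand_len = tour_length(candidate, dist)
                if cand_len < best_len:
                    # Accept the first improving move and restart the scan.
                    best, best_len = candidate, cand_len
                    improved = True
                    break
            if improved:
                break
    return best, best_len


# Toy asymmetric instance: the cheap edges form the cycle 0 -> 1 -> 2 -> 3 -> 0.
dist = [
    [0, 1, 10, 10],
    [10, 0, 1, 10],
    [10, 10, 0, 1],
    [1, 10, 10, 0],
]
start = [0, 2, 1, 3]
best_tour, best_len = insertion_local_search(start, dist)
```

Each accepted move strictly shortens the tour, so the loop terminates; the result is a local optimum, which is why such a procedure is typically paired with a global search such as BCO.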
Basem H. A. Ahmed; Tien-Ping Tan
In this paper, we propose an approach to model the pronunciation of non-native accented speech for automatic speech recognition systems. The proposed method consists of two phases: phone adaptation and pronunciation generalization. In phone adaptation, we identify the phones used by non-native speakers compared to the standard phones, and then remove the mismatch that results from the influence of the mother tongue. In pronunciation generalization, we predict the pronunciations of words by non-native speakers. The results show that the proposed approach reduces the WER from 44.8% to 41.9%.
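For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a generic implementation of that standard definition; the function name and the example strings are illustrative and do not come from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
example_wer = word_error_rate("a b c d", "a x c")
```

Under this definition, the reported change from 44.8% to 41.9% is a 2.9-point absolute reduction, or roughly 6.5% relative to the baseline.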
Multimedia Tools and Applications | 2010
Georges Quénot; Tien-Ping Tan; Viet Bac Le; Stéphane Ayache; Laurent Besacier; Philippe Mulhem
We present in this paper an approach based on the use of the International Phonetic Alphabet (IPA) for content-based indexing and retrieval of multilingual audiovisual documents. The approach works even if the languages of the document are unknown. It has been validated in the context of the “Star Challenge” search engine competition organized by the Agency for Science, Technology and Research (A*STAR) of Singapore. Our approach includes the building of an IPA-based multilingual acoustic model and a dynamic programming based method for searching document segments by “IPA string spotting”. Dynamic programming allows for retrieving the query string in the document string even with a significant transcription error rate at the phone level. The methods that we developed ranked us as first and third on the monolingual (English) search task, as fifth on the multilingual search task and as first on the multimodal (audio and image) search task.
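The idea of finding a query phone string inside a longer, error-prone document phone string can be sketched with a standard approximate substring matching recurrence, in which a match may start anywhere in the document at no cost. The code below is a generic illustration with unit edit costs and made-up phone sequences; the actual costs and search details used in the paper are not given here, and the function name `spot` is hypothetical.

```python
def spot(query, document):
    """Find the best approximate occurrence of query inside document.

    Returns (edits, end), where edits is the minimum number of phone
    insertions, deletions and substitutions, and end is the exclusive
    end index of the best match in document.
    """
    # Row 0 is all zeros: a match may start at any document position.
    prev = [0] * (len(document) + 1)
    for i, q in enumerate(query, 1):
        curr = [i]
        for j, d in enumerate(document, 1):
            curr.append(min(
                prev[j] + 1,             # query phone missing in document
                curr[j - 1] + 1,         # extra phone in document
                prev[j - 1] + (q != d),  # substitution or match
            ))
        prev = curr
    best_end = min(range(len(prev)), key=prev.__getitem__)
    return prev[best_end], best_end


# Toy phone sequences: the document contains "k e t" where the query has "k a t".
query = ["k", "a", "t"]
document = ["s", "i", "k", "e", "t", "o"]
edits, end = spot(query, document)
```

Ranking document segments by the returned edit count tolerates phone-level transcription errors, which is the property the abstract highlights.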