Publication


Featured research published by Bo-Ren Bai.


international conference on speech image processing and neural networks | 1994

Golden Mandarin (II)-an intelligent Mandarin dictation machine for Chinese character input with adaptation/learning functions

Lin-Shan Lee; Keh-Jiann Chen; Chiu-yu Tseng; Ren-Yuan Lyu; Lee-Feng Chien; Hsin-Min Wang; Jia-Lin Shen; Sung-Chien Lin; Yen-Ju Yang; Bo-Ren Bai; Chi-ping Nee; Chun-Yi Liao; Shueh-Sheng Lin; Chung-Shu Yang; I-Jung Hung; Ming-Yu Lee; Rei-Chang Wang; Bo-Shen Lin; Yuan-Cheng Chang; Rung-Chiung Yang; Yung-Chi Huang; Chen-Yuan Lou; Tung-Sheng Lin

Golden Mandarin (II) is an intelligent single-chip based real-time Mandarin dictation machine for the Chinese language with a very large vocabulary for the input of unlimited Chinese texts into computers using voice. This dictation machine can be installed on any personal computer, in which only a single chip Motorola DSP 96002D is used, with a preliminary character correct rate around 95% at a speed of 0.6 sec per character. Various adaptation/learning functions have been developed for this machine, including fast adaptation to new speakers and on-line learning of the voice characteristics, task domains, word patterns and noise environments of the users, so the machine can be easily personalized for each user. These adaptation/learning functions are the major subjects of the paper.


IEEE Transactions on Speech and Audio Processing | 1997

Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data

Hsin-Min Wang; Tai-Hsuan Ho; Rung-Chiung Yang; Jia-Lin Shen; Bo-Ren Bai; Jenn-Chau Hong; Wei-Peng Chen; Tong-Lo Yu; Lin-Shan Lee

This correspondence presents the first known results of complete recognition of continuous Mandarin speech for the Chinese language with very large vocabulary but very limited training data. Various acoustic and linguistic processing techniques were developed, and a prototype system of a continuous speech Mandarin dictation machine has been successfully implemented. The best recognition accuracy achieved is 92.2% for finally decoded Chinese characters.


IEEE Transactions on Consumer Electronics | 1998

Intelligent retrieval of dynamic networked information from mobile terminals using spoken natural language queries

Bo-Ren Bai; Chun-Liang Chen; Lee-Feng Chien; Lin-Shan Lee

This paper presents a working architecture for intelligent retrieval of dynamic networked information from mobile terminals using spoken natural language queries. A very efficient dynamic keyword extraction technique automatically extracts a limited number of keywords to describe the huge amount of networked information, reducing the very difficult problem of recognizing unconstrained spoken natural language queries to the tractable task of spotting a limited number of keywords embedded in the voice queries. A vocabulary-flexible voice keyword spotting technique is used to handle the live, dynamic and ever-changing networked information. A client-server architecture integrates the dynamic keyword extraction and voice keyword spotter modules in a mobile network environment. Very encouraging experimental results indicating the feasibility of the proposed approach have been obtained.


international conference on acoustics, speech, and signal processing | 1995

Golden Mandarin (III)-a user-adaptive prosodic-segment-based Mandarin dictation machine for Chinese language with very large vocabulary

Ren-Yuan Lyu; Lee-Feng Chien; Shiao-Hong Hwang; Hung-Yun Hsieh; Rung-Chiuan Yang; Bo-Ren Bai; Jia-Chi Weng; Yen-Ju Yang; Shi-Wei Lin; Keh-Jiann Chen; Chiu-yu Tseng; Lin-Shan Lee

This paper presents a prototype prosodic-segment-based Mandarin dictation machine for the Chinese language with a very large vocabulary. It accepts utterances that are continuous within a prosodic segment, which is composed of one or a few words. It also possesses various on-line learning capabilities for fast adaptation to a new user at the acoustic, lexical and linguistic levels. The overall system is implemented on an IBM/PC with an additional DSP card including a Motorola DSP 96002 chip. The word accuracy reaches nearly 90% for a new user after the user produces about 10 minutes of speech to train the system, and the accuracy can be further improved with the on-line learning functions.


international conference on acoustics, speech, and signal processing | 1997

A multi-phase approach for fast spotting of large vocabulary Chinese keywords from Mandarin speech using prosodic information

Bo-Ren Bai; Chiu-yu Tseng; Lin-Shan Lee

This paper presents a multi-phase approach for fast spotting of large vocabulary Chinese keywords from a spontaneous Mandarin speech utterance using prosodic knowledge. Rather than searching through the whole utterance with a large number of keyword models, the proposed multi-phase framework, which includes some special scoring schemes, achieves very good efficiency by exploiting the monosyllable-based structure of Mandarin Chinese. The approach is very fast thanks to very good boundary estimation and the elimination of most impossible syllable and keyword candidates using context-independent models, and is also very accurate thanks to the carefully designed scoring processes. A task with 2611 keywords was tested. An inclusion rate of 85.79% for the top 10 candidates is attained, with a processing time of only 1.2 times the utterance length on a Sparc 20 workstation.


international conference on spoken language processing | 1996

Very-large-vocabulary Mandarin voice message file retrieval using speech queries

Bo-Ren Bai; Lee-Feng Chien; Lin-Shan Lee

To cope with the fast growth of audio resources on the Internet, the paper presents an approach capable of retrieving Mandarin voice message files using queries of unconstrained speech. By properly utilizing the monosyllabic structure of the Chinese language, the proposed approach performs statistical similarity estimation between the speech queries and the voice message files, and executes the complete matching process directly at the phonetic level using syllable-based statistical information. Experiments based on this approach were conducted, and encouraging results were obtained.
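As a rough illustration of syllable-level matching, the sketch below compares a query and several voice messages using counts of syllables and adjacent syllable pairs, scored by cosine similarity. Everything here is an assumption made for illustration: the abstract does not specify the statistical measure, the inputs are taken to be already-recognized syllable sequences (written in toneless pinyin), and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def syllable_features(syllables):
    """Count single syllables and adjacent syllable pairs (assumed feature set)."""
    feats = Counter(syllables)
    feats.update(zip(syllables, syllables[1:]))
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query, documents):
    """Return document names ordered from most to least similar to the query."""
    q = syllable_features(query)
    scored = [(cosine(q, syllable_features(doc)), name) for name, doc in documents.items()]
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical syllable transcripts of a spoken query and two voice messages.
query = ["tai", "bei", "tian", "qi"]
documents = {
    "weather": ["jin", "tian", "tai", "bei", "tian", "qi", "hen", "hao"],
    "stock": ["gu", "piao", "shi", "chang"],
}
print(rank(query, documents))
```

Matching on syllables rather than words means a recognition error in one syllable only removes a few features instead of an entire word match, which is one plausible reading of why syllable-level statistics suit this task.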


Journal of the Association for Information Science and Technology | 2000

A spoken-access approach for Chinese text and speech information retrieval

Lee-Feng Chien; Hsin-Min Wang; Bo-Ren Bai; Sun-Chien Lin

This paper presents an efficient spoken‐access approach for both Chinese text and Mandarin speech information retrieval. The proposed approach is developed not only to deal with the retrieval of spoken documents, but also to improve the capability of human‐computer interaction via voice input for information‐retrieval systems. Based on utilization of the monosyllabic structure of the Chinese language, the proposed approach can tolerate speech recognition errors by performing speech query recognition and approximate information retrieval at the syllable‐level. Furthermore, with the help of automatic term suggestion and relevance feedback techniques, the proposed approach is robust in enabling users using voice input to interact with IR systems at each stage of the retrieval process. Extensive experiments show that the proposed approach can improve the effectiveness of information retrieval via speech interaction. The encouraging results suggest that a Mandarin speech interface for information retrieval and digital library systems can, therefore, be developed.


international conference on acoustics, speech, and signal processing | 1997

Syllable-based relevance feedback techniques for Mandarin voice record retrieval using speech queries

Lin-Shan Lee; Bo-Ren Bai; Lee-Feng Chien

To cope with the fast growth of audio resources on the Internet, we previously presented a syllable-based approach capable of retrieving Mandarin voice records using queries of unconstrained speech. However, the performance achieved by this previously proposed approach is still not satisfactory, and one of the reasons is that very often the information provided by the speech query for the requested subject may not be sufficient. We present approaches based on the relevance feedback technique to improve the performance of the previous work. The proposed approaches include a relevance measure adjustment scheme using a relevance table for the voice database, a query expansion scheme to generate a new query including the feedback information, and a combination of these two schemes. Extensive preliminary experiments were performed and the results are presented.
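The query expansion idea can be sketched with a generic Rocchio-style update, in which the query's syllable counts are blended with the average counts of the records the user marked as relevant. This is a stand-in technique, not the scheme from the paper: the weights `alpha` and `beta`, the function name, and the toy vectors are all assumptions.

```python
from collections import Counter


def expand_query(query_vec, relevant_vecs, alpha=1.0, beta=0.5):
    """Rocchio-style expansion: alpha * query + beta * mean(relevant records).

    alpha and beta are illustrative weights, not values from the paper.
    """
    expanded = Counter({k: alpha * v for k, v in query_vec.items()})
    for vec in relevant_vecs:
        for k, v in vec.items():
            expanded[k] += beta * v / len(relevant_vecs)
    return expanded


# Hypothetical syllable counts for a query and one record judged relevant.
query_vec = Counter({"tai": 1, "bei": 1})
relevant = [Counter({"tai": 1, "qi": 2})]
new_query = expand_query(query_vec, relevant)
print(dict(new_query))
```

The expanded query now carries weight on `qi`, a syllable absent from the original query, so records sharing that syllable can be retrieved on the next pass.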


international conference on acoustics, speech, and signal processing | 1994

An initial study on a segmental probability model approach to large-vocabulary continuous Mandarin speech recognition

Jia-Lin Shen; Hsin-Min Wang; Bo-Ren Bai; Lin-Shan Lee

This paper presents an initial study to perform large-vocabulary continuous Mandarin speech recognition based on a segmental probability model (SPM) approach. SPM was first proposed for recognition of isolated Mandarin syllables, in which every syllable must be equally segmented before recognition. A concatenated syllable matching algorithm is therefore introduced in place of the conventional Viterbi search algorithm to perform the recognition process based on SPM. In addition, a training procedure is also proposed to reestimate the SPM parameters for continuous speech. Preliminary simulation results indicate that significant improvements in both recognition rates and speed can be achieved as compared to the conventional HMM-based Viterbi search approaches.


ieee region 10 conference | 1997

A word-length-dependent confidence measure for large vocabulary Chinese keyword spotting

Bo-Ren Bai; Hsin-Min Wang; Lin-Shan Lee

In this paper, a word-length-dependent confidence measure for large vocabulary Chinese keyword spotting is proposed to deal with the problem caused by the significant difference in keyword length, i.e., false alarms are likely to occur for shorter keywords, while false rejections are more likely to occur for longer keywords. The proposed confidence measure is based not only on the acoustic scores for the component sub-syllabic units of the keywords, but also on a set of word-length-dependent parameters trained with the minimum classification error criterion.
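One way to picture a word-length-dependent confidence measure is as an average of per-unit acoustic scores that is rescaled by parameters chosen according to the keyword's length in syllables. The sketch below assumes a simple scale-and-bias form and hand-set parameter values; the paper trains its parameters with a minimum classification error criterion, and neither that training nor the exact functional form is reproduced here.

```python
def confidence(unit_scores, n_syllables, params):
    """Average the sub-syllabic unit scores, then apply length-specific scale and bias.

    params maps keyword length (in syllables) to a (scale, bias) pair; lengths
    beyond the largest key reuse that key's pair. The form is an assumption.
    """
    scale, bias = params[min(n_syllables, max(params))]
    avg = sum(unit_scores) / len(unit_scores)
    return scale * avg + bias


def accept(unit_scores, n_syllables, params, threshold=0.0):
    """Accept the keyword hypothesis when its confidence reaches the threshold."""
    return confidence(unit_scores, n_syllables, params) >= threshold


# Hand-set illustrative parameters: stricter for short keywords (fewer false
# alarms), more lenient for long ones (fewer false rejections).
params = {1: (1.0, -0.5), 2: (1.0, 0.0), 3: (1.0, 0.25)}

short_kw = [0.25, 0.25]       # one syllable, two sub-syllabic units
long_kw = [0.25] * 6          # three syllables, six sub-syllabic units
print(accept(short_kw, 1, params), accept(long_kw, 3, params))
```

With identical average acoustic scores, the one-syllable hypothesis is rejected and the three-syllable one is accepted, which is the asymmetry the abstract describes.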

Collaboration


Dive into Bo-Ren Bai's collaborations.

Top Co-Authors

Lin-Shan Lee

National Taiwan University

Jia-Lin Shen

National Taiwan University

Berlin Chen

National Taiwan Normal University

Bor-Shen Lin

National Taiwan University

Ren-Yuan Lyu

National Taiwan University

Rung-Chiung Yang

National Taiwan University
