Bor-Shen Lin | Researchain

Publication

Featured research published by Bor-Shen Lin.

Pattern Recognition Letters | 2002

A hierarchical tag-graph search scheme with layered grammar rules for spontaneous speech understanding

Bor-Shen Lin; Berlin Chen; Hsin-Min Wang; Lin-Shan Lee

It has always been difficult for language understanding systems to handle spontaneous speech with satisfactory robustness, primarily due to problems such as fragments, disfluencies, out-of-vocabulary words, and ill-formed sentence structures. Also, the search schemes used are usually not flexible enough to accept different input linguistic units, so great effort is required when they are used with different acoustic front ends in different tasks, especially in multi-modal and multi-lingual systems. In this paper, a new hierarchical tag-graph-based search scheme for spontaneous speech understanding is proposed. This scheme is based on a layered hierarchy of grammar rules, and can therefore integrate all the statistical and rule-based knowledge, including acoustic scores, language model scores and grammar rules, into the search process. More robust speech understanding is thus achievable. In addition, this scheme can accept graphs of different linguistic units such as phonemes, syllables, characters, words, spotted keywords, or phrases as input; it is thus compatible with different acoustic front ends, and multi-modal and multi-lingual applications can be easily developed. This search scheme has been successfully applied to a multi-domain, multi-modal dialogue system.

International Conference on Acoustics, Speech, and Signal Processing | 2000

Fundamental performance analysis for spoken dialogue systems based on a quantitative simulation approach

Bor-Shen Lin; Lin-Shan Lee

The performance of dialogue systems is mostly measured by analyzing a large dialogue corpus. As a result, the dialogue performance cannot be obtained before the system is online, and the dialogue corpus must be collected again if the system is modified. Also, the effect of different factors, including system dialogue strategy, recognition and understanding accuracy, and user response pattern, on the dialogue performance cannot be quantitatively identified, because these factors cannot be precisely controlled in different corpora. In this paper, a fundamental performance analysis scheme for dialogue systems based on a quantitative simulation approach is proposed. With this scheme, the fundamental performance of a dialogue system can be predicted and analyzed efficiently without implementing any real spoken dialogue system or collecting any dialogue corpus. How the dialogue performance varies with respect to each factor, from recognition accuracy to dialogue strategy, can be individually identified, because all such factors can be precisely controlled in the simulation. The quality of service for the spoken dialogue system can also be flexibly defined and the design parameters easily determined. This approach is therefore very useful for the design, development and improvement of spoken dialogue systems, although the online system and a real corpus will eventually be needed for the final evaluation and analysis of the system performance in any case.
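To make the simulation idea concrete, here is a minimal sketch in Python. It is not the scheme from the paper: the slot-filling task, the two strategies (with and without explicit confirmation), the assumption that confirmation answers are always understood correctly, and every name (`simulate_dialogue`, `estimate`, `max_turns`) are illustrative assumptions. It only shows how recognition accuracy and dialogue strategy can be varied independently when no corpus is involved.

```python
import random


def simulate_dialogue(num_slots, accuracy, confirm, rng, max_turns=50):
    """Simulate one slot-filling dialogue and return (turns, success).

    Assumptions (not from the paper): one slot is requested per turn,
    each answer is recognized correctly with probability `accuracy`,
    and confirmation answers are always understood correctly.
    """
    turns = 0
    all_correct = True
    for _ in range(num_slots):
        while True:
            turns += 1  # system asks for the slot, user answers
            correct = rng.random() < accuracy
            if not confirm:
                # No confirmation: errors go unnoticed and spoil the task.
                all_correct = all_correct and correct
                break
            turns += 1  # explicit confirmation turn
            if correct:
                break
            if turns >= max_turns:
                return turns, False  # give up on overly long dialogues
    return turns, all_correct


def estimate(num_slots, accuracy, confirm, trials=10000, seed=0):
    """Estimate average dialogue length and task success rate."""
    rng = random.Random(seed)
    results = [simulate_dialogue(num_slots, accuracy, confirm, rng)
               for _ in range(trials)]
    avg_turns = sum(turns for turns, _ in results) / trials
    success_rate = sum(1 for _, ok in results if ok) / trials
    return avg_turns, success_rate
```

Calling `estimate(4, 0.8, confirm=True)` and `estimate(4, 0.8, confirm=False)` with the same accuracy exposes the trade-off between dialogue length and task success for the two strategies; sweeping `accuracy` with the strategy fixed isolates the other factor.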

Systems, Man and Cybernetics | 2002

Question type classification and its application to a question answering system

June-Jei Kuo; Kuei-Kuang Lin; Hsin-Hsi Chen; Cheng-Hsuan Kao; Bor-Shen Lin

To explore the feasibility of identifying the semantic focus of information inquiries in a specific domain, this paper proposes a question classification algorithm using the question word, the intention word and some related words. An air traffic information service corpus was used to train a question type decision tree. An experiment showed that we were able to achieve 84% accuracy. We then applied the question type classification algorithm to a question answering system for a railway information service. Compared with other training methods like unigram or bigram, the accuracy could be improved from 65% to 70%. To deal with the domain shift problem, we adapted the question type classifier with a small railway corpus, and the accuracy could be further improved to 80%. Besides the classification, we also disambiguate the semantics of some important temporal and spatial keywords. The experiments have shown that our method is promising.
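The following Python sketch illustrates the general idea of deciding a question type from a question word first and an intention word second. It is a hand-written two-level rule cascade, not the trained decision tree from the paper; the English word lists, the type labels (`TIME`, `LOCATION`, `PRICE`, `DURATION`, `OTHER`) and the function names are all hypothetical.

```python
import re

# Hypothetical lookup tables; a real system would learn these from a corpus.
QUESTION_WORDS = {
    "when": "TIME",
    "where": "LOCATION",
    "how much": "PRICE",
    "how long": "DURATION",
}
INTENTION_WORDS = {
    "depart": "TIME",
    "arrive": "TIME",
    "time": "TIME",
    "fare": "PRICE",
    "cost": "PRICE",
    "stop": "LOCATION",
}


def _normalize(text):
    """Lowercase, keep letters only, and pad with spaces for whole-word matching."""
    return " " + " ".join(re.findall(r"[a-z]+", text.lower())) + " "


def _first_match(padded, table):
    """Return the type of the longest phrase found in the text, or None."""
    for phrase in sorted(table, key=len, reverse=True):
        if f" {phrase} " in padded:
            return table[phrase]
    return None


def classify_question(text):
    """First test the question word, then fall back to the intention word."""
    padded = _normalize(text)
    by_question_word = _first_match(padded, QUESTION_WORDS)
    if by_question_word is not None:
        return by_question_word
    return _first_match(padded, INTENTION_WORDS) or "OTHER"
```

Matching is exact at the word level, so inflected forms such as "departs" are not covered; the sketch only shows the ordering of the two feature tests.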

International Conference on Consumer Electronics | 1998

A Prototype Of Mandarin Voice Memo System

Bor-Shen Lin; Hsin-Min Wang; Bo-Ren Bai; Berlin Chen

In this paper, a prototype of a Mandarin voice memo system that provides automatic notification and voice retrieval functions is implemented on a PC. The techniques include both general content-based and special date-time expression based voice retrieval approaches.
With the progress of speech recognition, voice retrieval of speech databases has become feasible [1][2]. However, a general voice retrieval approach is certainly not enough for a voice memo system. In addition to content-based voice retrieval technology, date-time expression detection and understanding is also necessary for automatic notification and for voice retrieval using speech queries containing date-time expressions. This paper presents the overall architecture of our Mandarin voice memo system and the main techniques, including both the general content-based voice retrieval approach and the special date-time expression retrieval approach based upon date-time expression detection and understanding.
In general, the contents of a memo mainly include both date-time expressions and what-to-do expressions. An example of a Mandarin memo is shown in Figure 1, where one part [Mandarin text illegible in source] belongs to a date-time expression, while the other [Mandarin text illegible in source] is a what-to-do expression. That is, the user can search the memo system using either date-time expression queries or other natural language queries, which probably contain only keywords related to the things to do. Some possible queries are shown in Figure 2.
Figure 3 shows the block diagram of the proposed Mandarin voice memo system. Given a new voice memo, it is first added to the voice memo database D. The date-time expression detection and understanding [3][4] module then performs keyword spotting to spot all possible date-time keyword candidates. Based upon these candidates, the date-time expression part is understood and transcribed into a year/month/date/time knowledge representation, which is added to the date-time representation database. The date-time representation database plays two roles in this system. The first is as the target database for retrieval using queries containing date-time expressions, while the second is for automatic notification. Then, large vocabulary continuous speech recognition is applied to the residual speech segments in the feature extraction module to extract the desired feature vector to represent the what-to-do part. The special monosyllabic structure of the Chinese language makes it possible to compare the what-to-do expressions and the speech queries directly at the syllable level [1]. The above feature extraction procedures can be performed off-line on all voice memos in database D to form the corresponding feature vector database and date-time representation database, both of which will be the target databases for retrieval. When a speech query is entered, the same procedures discussed above are first applied to obtain the corresponding feature vector V_q or date-time representation T_q. The matching module then performs feature vector matching between V_q and all V_d's in the feature vector database, or date-time representation matching between T_q and all T_d's in the date-time representation database, and the decision module finds the voice memos that best fit the speech query as the output.
The details of the date-time expression detection and understanding module are further described in this section. The keyword set for date-time expressions used in this paper consists of a total of 116 keywords, such as the Mandarin words for “today”, “Monday” and “morning”. First, keyword spotting finds all possible date-time keyword candidates and outputs a keyword graph. The A* search algorithm [5] then sequentially generates N-best paths based on the keyword graph. Each path is parsed with a set of date-time grammar rules. If any path is accepted by the syntactic parser, the parsing tree can be further transcribed into a proper date-time knowledge representation. If the parsing tree passes the semantic check, the knowledge representation is finally used as the output, and the date-time expression detection and understanding process stops. If not, the A* search algorithm generates another path and passes it to the syntactic and semantic checking processes.
In the following, the general content-based voice retrieval approach [1] will be briefly introduced. After date-time expression detection and understanding has been performed on the new voice memo or the speech query, large vocabulary continuous speech recognition is applied to the residual speech segments in the feature extraction module to extract the desired feature vector to represent the what-to-do part. The special monosyllabic structure of the Chinese language makes it possible to compare the what-to-do expressions and the speech queries directly at the syllable level. That is, each what-to-do expression is transcribed into a syllable lattice with the corresponding
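The date-time detection loop described in this entry (take the next-best path, parse it, check its semantics, stop at the first path that passes both) could be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the N-best paths are taken as given rather than generated by A* search over a keyword graph, the toy grammar accepts only the tag order MONTH DAY HOUR, and the names `GRAMMAR`, `parse`, `to_representation` and `understand` are hypothetical.

```python
from datetime import datetime

# Hypothetical toy grammar: the only accepted tag sequence.
GRAMMAR = ("MONTH", "DAY", "HOUR")


def parse(path):
    """Syntactic check: return {tag: value} if the tags match GRAMMAR, else None."""
    tags = tuple(tag for tag, _ in path)
    if tags != GRAMMAR:
        return None
    return dict(path)


def to_representation(tree, year):
    """Semantic check: build a datetime, or return None for impossible dates."""
    try:
        return datetime(year, tree["MONTH"], tree["DAY"], tree["HOUR"])
    except ValueError:
        return None


def understand(n_best_paths, year):
    """Return the representation of the first path passing both checks.

    `n_best_paths` may be any iterable ordered from best to worst score,
    including a lazy generator, so later paths are only produced on demand.
    """
    for path in n_best_paths:
        tree = parse(path)
        if tree is None:
            continue  # rejected by the syntactic parser
        representation = to_representation(tree, year)
        if representation is not None:
            return representation  # accepted: stop searching
    return None


# Example candidates, best first: the first fails the grammar,
# the second fails the semantic check (30 February does not exist).
candidate_paths = [
    [("MONTH", 2), ("HOUR", 9)],
    [("MONTH", 2), ("DAY", 30), ("HOUR", 9)],
    [("MONTH", 2), ("DAY", 13), ("HOUR", 9)],
]
result = understand(candidate_paths, year=1998)
```

Because `understand` accepts any iterable, a generator that yields paths one at a time mirrors the described behavior of requesting another path only when the current one is rejected.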

IEEE Region 10 Conference | 1997

A key-phrase understanding framework integrating real world knowledge with speech recognition with initial application in voice memo systems for Mandarin Chinese

Bor-Shen Lin; Hsin-Min Wang; Lin-Shan Lee

Automatic speech recognition by computers can provide the most natural and efficient method of communication between humans and computers. The key issue in some specific applications, such as electronic shopping on the World Wide Web (WWW), is not just to improve the syllable or word recognition accuracy but to achieve correct understanding of the speech. Thus, key-phrase understanding, which integrates real world knowledge with key-phrase recognition techniques to achieve partial speech understanding, plays an important role in such areas. A general framework for the key-phrase understanding problem is presented, and an initial application for use in voice memo systems for Mandarin Chinese has been developed.

Archive | 1999