Publication


Featured research published by Berlin Chen.


IEEE Signal Processing Magazine | 2005

Spoken document understanding and organization

Lin-Shan Lee; Berlin Chen

Spoken documents (or associated multimedia content) are in fact better understood and reorganized in a way that retrieval/browsing can be performed easily. For example, they are now in the form of short paragraphs, properly organized in some hierarchical visual presentation with titles/summaries/topic labels as references for retrieval and browsing. The retrieval can be performed based on the full content, the summaries/titles/topic labels, or both. In this article, this is referred to as spoken document understanding and organization for efficient retrieval/browsing applications. The purpose of this article is to present a concise, comprehensive, and integrated overview of related areas in a unified context of spoken document understanding and organization for efficient retrieval/browsing applications. In addition, we present an initial prototype system we developed at National Taiwan University as a new example of integrating the various technologies and functionalities.


IEEE Automatic Speech Recognition and Understanding Workshop | 2001

Generating phonetic cognates to handle named entities in English-Chinese cross-language spoken document retrieval

Helen M. Meng; Wai Kit Lo; Berlin Chen; Karen P. Tang

We have developed a technique for automatic transliteration of named entities for English-Chinese cross-language spoken document retrieval (CL-SDR). Our retrieval system integrates machine translation, speech recognition and information retrieval technologies. An English news story forms a textual query that is automatically translated into Chinese words, which are mapped into Mandarin syllables by pronunciation dictionary lookup. Mandarin radio news broadcasts form spoken documents that are indexed by word and syllable recognition. The information retrieval engine performs matching in both word and syllable scales. The English queries contain many named entities that tend to be out-of-vocabulary words for machine translation and speech recognition, and are omitted in retrieval. Names are often transliterated across languages and are generally important for retrieval. We present a technique that takes in a name spelling and automatically generates a phonetic cognate in terms of Chinese syllables to be used in retrieval. Experiments show consistent retrieval performance improvement by including the use of named entities in this way.


IEEE Transactions on Speech and Audio Processing | 2002

Discriminating capabilities of syllable-based features and approaches of utilizing them for voice retrieval of speech information in Mandarin Chinese

Berlin Chen; Hsin-Min Wang; Lin-Shan Lee

With the rapidly growing use of the audio and multimedia information over the Internet, the technology for retrieving speech information using voice queries is becoming more and more important. In this paper, considering the monosyllabic structure of the Chinese language, a whole class of syllable-based indexing features, including overlapping segments of syllables and syllable pairs separated by a few syllables, is extensively investigated based on a Mandarin broadcast news database. The strong discriminating capabilities of such syllable-based features were verified by comparing with the word- or character-based features. Good approaches for better utilizing such capabilities, including fusion with the word- and character-level information and improved approaches to obtain better syllable-based features and query expressions, were extensively investigated. Very encouraging experimental results were obtained.
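As a rough illustration of the two feature families named in this abstract (overlapping segments of syllables, and syllable pairs separated by a few syllables), the sketch below extracts both from a syllable sequence. The function names, the pinyin-style syllable strings, and the default ranges for segment length and gap are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter
from typing import List


def overlapping_segments(syllables: List[str], max_len: int = 3) -> Counter:
    """Count every overlapping run of 1..max_len consecutive syllables."""
    counts: Counter = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(syllables) - n + 1):
            counts[tuple(syllables[i:i + n])] += 1
    return counts


def separated_pairs(syllables: List[str], max_gap: int = 2) -> Counter:
    """Count syllable pairs with 1..max_gap syllables between them."""
    counts: Counter = Counter()
    for gap in range(1, max_gap + 1):
        for i in range(len(syllables) - gap - 1):
            # Key is (first syllable, second syllable, number skipped).
            counts[(syllables[i], syllables[i + gap + 1], gap)] += 1
    return counts


# Hypothetical syllable transcript of a short utterance.
doc = ["tai", "wan", "da", "xue"]
segments = overlapping_segments(doc, max_len=2)
pairs = separated_pairs(doc, max_gap=2)
```

Both counters can serve as sparse indexing vectors for a document or a query; how such vectors are weighted and fused with word- or character-level information is not covered by this sketch.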


International Conference on Acoustics, Speech, and Signal Processing | 2004

Lightly supervised and data-driven approaches to Mandarin broadcast news transcription

Berlin Chen; Jen Wei Kuo; Wen Hung Tsai

This paper investigates the use of several lightly supervised and data-driven approaches to Mandarin broadcast news transcription. First, with a consideration of the special structural properties of the Chinese language, a fast acoustic look-ahead technique for estimating the unexplored part of a speech utterance was integrated into the lexical tree search to improve the search efficiency, in conjunction with the conventional language model look-ahead technique. Then, a verification-based method for automatic acoustic training data acquisition was developed to make use of the large amount of untranscribed speech data. Finally, two alternative strategies for language model adaptation were further studied for accurate language model estimation. With the above approaches, the system yielded an 11.94% character error rate on the Mandarin broadcast news collected in Taiwan.
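For readers unfamiliar with the metric, character error rate is conventionally the character-level edit distance between the reference transcript and the recognizer output, divided by the reference length. The sketch below assumes that standard definition; the abstract does not say which scoring tool or text normalization the authors used, and the example strings are invented.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance counting substitutions, deletions, insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance over characters, normalized by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Invented example: one substituted character and one deleted character.
cer = character_error_rate("今天天氣很好", "今天天汽好")
```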


International Conference on Human Language Technology Research | 2001

Mandarin-English Information (MEI): investigating translingual speech retrieval

Helen M. Meng; Berlin Chen; Sanjeev Khudanpur; Gina-Anne Levow; Wai Kit Lo; Douglas W. Oard; Patrick Schone; Karen P. Tang; Hsin-Min Wang; Jianqiang Wang

This paper describes the Mandarin-English Information (MEI) project, where we investigated the problem of cross-language spoken document retrieval (CL-SDR), and developed one of the first English-Chinese CL-SDR systems. Our system accepts an entire English news story (text) as query, and retrieves relevant Chinese broadcast news stories (audio) from the document collection. Hence this is a cross-language and cross-media retrieval task. We applied a multi-scale approach to our problem, which unifies the use of phrases, words and subwords in retrieval. The English queries are translated into Chinese by means of a dictionary-based approach, where we have integrated phrase-based translation with word-by-word translation. Untranslatable named entities are transliterated by a novel subword translation technique. The multi-scale approach can be divided into three subtasks -- multi-scale query formulation, multi-scale audio indexing (by speech recognition) and multi-scale retrieval. Experimental results demonstrate that the use of phrase-based translation and subword translation gave performance gains, and multi-scale retrieval outperforms word-based retrieval.


International Conference on Acoustics, Speech, and Signal Processing | 2000

Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics

Berlin Chen; Hsin-Min Wang; Lin-Shan Lee

Spoken document retrieval has been extensively studied over the years because of its high potential in various applications in the near future. Considering the monosyllabic structure of the Chinese language, a whole class of indexing features for retrieval of spoken documents in Mandarin Chinese using syllable-level statistical characteristics has been studied, and very encouraging experimental results on retrieval of broadcast news speech collected in Taiwan were obtained. This paper reports some interesting initial results and findings obtained in this research.


ACM Transactions on Asian Language Information Processing | 2009

Word Topic Models for Spoken Document Retrieval and Transcription

Berlin Chen

Statistical language modeling (LM), which aims to capture the regularities in human natural language and quantify the acceptability of a given word sequence, has long been an interesting yet challenging research topic in the speech and language processing community. It also has been introduced to information retrieval (IR) problems, and provided an effective and theoretically attractive probabilistic framework for building IR systems. In this article, we propose a word topic model (WTM) to explore the co-occurrence relationship between words, as well as the long-span latent topical information, for language modeling in spoken document retrieval and transcription. The document or the search history as a whole is modeled as a composite WTM model for generating a newly observed word. The underlying characteristics and different kinds of model structures are extensively investigated, while the performance of WTM is thoroughly analyzed and verified by comparison with the well-known probabilistic latent semantic analysis (PLSA) model as well as the other models. The IR experiments are performed on the TDT Chinese collections (TDT-2 and TDT-3), while the large vocabulary continuous speech recognition (LVCSR) experiments are conducted on the Mandarin broadcast news collected in Taiwan. Experimental results seem to indicate that WTM is a promising alternative to the existing models.


ACM Transactions on Asian Language Information Processing | 2004

A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents

Berlin Chen; Hsin-Min Wang; Lin-Shan Lee

In recent years, statistical modeling approaches have steadily gained in popularity in the field of information retrieval. This article presents an HMM/N-gram-based retrieval approach for Mandarin spoken documents. The underlying characteristics and the various structures of this approach were extensively investigated and analyzed. The retrieval capabilities were verified by tests with word- and syllable-level indexing features and comparisons to the conventional vector-space model approach. To further improve the discrimination capabilities of the HMMs, both the expectation-maximization (EM) and minimum classification error (MCE) training algorithms were introduced in training. Fusion of information via indexing word- and syllable-level features was also investigated. The spoken document retrieval experiments were performed on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3). Very encouraging retrieval performance was obtained.


ACM Transactions on Asian Language Information Processing | 2009

A Comparative Study of Probabilistic Ranking Models for Chinese Spoken Document Summarization

Shih Hsiang Lin; Berlin Chen; Hsin-Min Wang

Extractive document summarization automatically selects a number of indicative sentences, passages, or paragraphs from an original document according to a target summarization ratio, and sequences them to form a concise summary. In this article, we present a comparative study of various probabilistic ranking models for spoken document summarization, including supervised classification-based summarizers and unsupervised probabilistic generative summarizers. We also investigate the use of unsupervised summarizers to improve the performance of supervised summarizers when manual labels are not available for training the latter. A novel training data selection approach that leverages the relevance information of spoken sentences to select reliable document-summary pairs derived by the probabilistic generative summarizers is explored for training the classification-based summarizers. Encouraging initial results on Mandarin Chinese broadcast news data are demonstrated.


IEEE Transactions on Audio, Speech, and Language Processing | 2009

A Probabilistic Generative Framework for Extractive Broadcast News Speech Summarization

Yi Ting Chen; Berlin Chen; Hsin-Min Wang

In this paper, we consider extractive summarization of broadcast news speech and propose a unified probabilistic generative framework that combines the sentence generative probability and the sentence prior probability for sentence ranking. Each sentence of a spoken document to be summarized is treated as a probabilistic generative model for predicting the document. Two matching strategies, namely literal term matching and concept matching, are thoroughly investigated. We explore the use of the language model (LM) and the relevance model (RM) for literal term matching, while the sentence topical mixture model (STMM) and the word topical mixture model (WTMM) are used for concept matching. In addition, the lexical and prosodic features, as well as the relevance information of spoken sentences, are properly incorporated for the estimation of the sentence prior probability. An elegant feature of our proposed framework is that both the sentence generative probability and the sentence prior probability can be estimated in an unsupervised manner, without the need for handcrafted document-summary pairs. The experiments were performed on Chinese broadcast news collected in Taiwan, and very encouraging results were obtained.
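The ranking idea in this abstract, scoring each sentence S by the product of a sentence generative probability P(D|S) and a sentence prior P(S), can be sketched with a plain unigram language model. This is only a simplified stand-in: the smoothing weight `lam`, the use of the document itself as the background model, and the function name are assumptions for illustration, and none of the paper's RM, STMM, or WTMM models or its prosodic features are implemented here.

```python
import math
from collections import Counter
from typing import List, Optional


def rank_sentences(sentences: List[List[str]],
                   priors: Optional[List[float]] = None,
                   lam: float = 0.5) -> List[int]:
    """Return sentence indices sorted by log P(D|S) + log P(S), best first.

    Sentences must be non-empty token lists and priors must be positive.
    """
    doc_words = [w for s in sentences for w in s]
    doc_counts = Counter(doc_words)
    doc_len = len(doc_words)
    if priors is None:
        priors = [1.0 / len(sentences)] * len(sentences)  # uniform prior
    scores = []
    for sent, prior in zip(sentences, priors):
        sent_counts = Counter(sent)
        log_lik = 0.0
        for word, count in doc_counts.items():
            p_sent = sent_counts[word] / len(sent)  # sentence unigram model
            p_bg = count / doc_len                  # assumed background model
            log_lik += count * math.log(lam * p_sent + (1 - lam) * p_bg)
        scores.append(log_lik + math.log(prior))
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


# Toy tokenized document: two sentences.
toy = [["a", "a", "b"], ["c"]]
uniform_rank = rank_sentences(toy)
skewed_rank = rank_sentences(toy, priors=[0.01, 0.99])
```

With a uniform prior the first sentence wins because it explains more of the document's words; a strongly skewed prior can overturn that ordering, which is the role the abstract assigns to the sentence prior.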

Collaboration

Top Co-Authors

Shih Hsiang Lin

National Taiwan Normal University

Lin-Shan Lee

National Taiwan University

Yao Ming Yeh

National Taiwan Normal University

Hsu Chun Yen

National Taiwan University

Hung Shin Lee

National Taiwan Normal University