Masaru Tomita | Researchain

Publication

Featured research published by Masaru Tomita.

Language | 1991

Generalized LR Parsing

Masaru Tomita

1 The Generalized LR Parsing Algorithm.- 2 Experiments with GLR and Chart Parsing.- 3 The Computational Complexity of GLR Parsing.- 4 GLR Parsing in Time O(n^3).- 5 GLR Parsing for ε-Grammars.- 6 Parallel GLR Parsing Based on Logic Programming.- 7 GLR Parsing with Scoring.- 8 GLR Parsing with Probability.- 9 GLR Parsing for Erroneous Input.- 10 GLR Parsing for Noisy Input.- 11 GLR Parsing in Hidden Markov Model.

international conference on computational linguistics | 1986

Parsing spoken language: a semantic caseframe approach

Philip J. Hayes; Alexander G. Hauptmann; Jaime G. Carbonell; Masaru Tomita

Parsing spoken input introduces serious problems not present in parsing typed natural language. In particular, indeterminacies and inaccuracies of acoustic recognition must be handled in an integral manner. Many techniques for parsing typed natural language do not adapt well to these extra demands. This paper describes an extension of semantic caseframe parsing to restricted-domain spoken input. The semantic caseframe grammar representation is the same as that used for earlier work on robust parsing of typed input. Due to the uncertainty inherent in speech recognition, the caseframe grammar is applied in a quite different way, emphasizing island growing from caseframe headers. This radical change in application is possible due to the high degree of abstraction in the caseframe representation. The approach presented was tested successfully in a preliminary implementation.

human language technology | 1993

Recent advances in Janus: a speech translation system

Monika Woszczyna; Noah Coccaro; Andreas Eisele; Alon Lavie; Arthur E. McNair; Thomas Polzin; Ivica Rogina; Carolyn Penstein Rosé; Tilo Sloboda; Masaru Tomita; J. Tsutsumi; Naomi Aoki-Waibel; Alex Waibel; Wayne H. Ward

We present recent advances from our efforts to increase the coverage, robustness, generality and speed of JANUS, CMU's speech-to-speech translation system. JANUS is a speaker-independent system that translates spoken utterances in English and German into German, English or Japanese. The system has been designed around the task of conference registration (CR). It was initially built on a speech database of 12 read dialogs, encompassing a vocabulary of around 500 words. We have since been expanding the system along several dimensions to improve speed, robustness and coverage and to move toward spontaneous input.

international conference on acoustics, speech, and signal processing | 1994

JANUS 93: towards spontaneous speech translation

Monika Woszczyna; Naomi Aoki-Waibel; Finn Dag Buø; Noah Coccaro; Keiko Horiguchi; Thomas Kemp; Alon Lavie; Arthur E. McNair; Thomas Polzin; Ivica Rogina; Carolyn Penstein Rosé; Tanja Schultz; Bernhard Suhm; Masaru Tomita; Alex Waibel

We present first results from our efforts toward translation of spontaneous speech. Improvements include increasing coverage, robustness, generality and speed of JANUS, the speech-to-speech translation system of Carnegie Mellon and Karlsruhe University. The recognition and machine translation engines have been upgraded to deal with requirements introduced by spontaneous human-to-human dialogs. To allow for development and evaluation of our system on adequate data, a large database of spontaneous scheduling dialogs is being gathered for English, German and Spanish.

international conference on acoustics, speech, and signal processing | 1986

An efficient word lattice parsing algorithm for continuous speech recognition

Masaru Tomita

An efficient word lattice parsing algorithm is introduced for continuous speech recognition. A word lattice is a set of hypothesized words with different starting and ending positions in the input signal. Parsing a word lattice involves much more search than typed natural language parsing, and a very efficient algorithm is desired. The algorithm is based on the context-free parsing algorithm recently developed by the author. Our algorithm (1) is fast due to utilization of LR parsing tables, (2) produces all possible parses in an efficient representation, and (3) processes an input word lattice in a strict left-to-right manner, which may allow the algorithm to pipeline with lower level processes (i.e. word hypothesizers). The algorithm has been implemented in the continuous speech recognition project at Carnegie-Mellon University, and is being tested against real speech data.
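
To make the word lattice and its left-to-right processing order concrete, here is a minimal Python sketch. It is not the LR-table-driven algorithm of the abstract: it only enumerates gap-free word sequences, and it assumes that consecutive hypotheses must abut exactly. The names `Hypothesis` and `spanning_sequences` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Hypothesis(NamedTuple):
    start: int   # position where the hypothesized word begins
    end: int     # position where it ends (assumed to be > start)
    word: str


def spanning_sequences(lattice, length):
    """Enumerate word sequences covering positions 0..length without gaps.

    Hypotheses are consumed strictly in order of their end position, so
    each one could be handled as soon as a lower-level process emits it.
    """
    ending_at = defaultdict(list)   # position -> word sequences ending there
    ending_at[0].append(())         # the empty sequence ends at position 0
    for hyp in sorted(lattice, key=lambda h: (h.end, h.start)):
        for seq in list(ending_at[hyp.start]):
            ending_at[hyp.end].append(seq + (hyp.word,))
    return ending_at[length]


lattice = [
    Hypothesis(0, 2, "recognize"),
    Hypothesis(0, 1, "wreck"),
    Hypothesis(1, 2, "a"),
    Hypothesis(2, 3, "speech"),
    Hypothesis(2, 3, "beach"),
]
sequences = spanning_sequences(lattice, 3)
```

A parser in the spirit of the abstract would replace the plain tuples with parser states, so that only grammatical sequences survive and shared prefixes are analyzed once.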

Language | 1996

Recent Advances in Parsing Technology

Masaru Tomita; Harry C. Bunt

From the Publisher: Parsing technologies are concerned with the automatic decomposition of complex structures into their constituent parts, with structures in formal or natural languages as their main, but certainly not their only, domain of application. The focus of Recent Advances in Parsing Technology is on parsing technologies for linguistic structures, but it also contains chapters concerned with parsing languages of two or more dimensions. New and improved parsing technologies are important not only for achieving better performance in terms of efficiency, robustness, coverage, etc., but also because the developments in areas related to natural language processing give rise to new requirements on parsing technologies. Ongoing research in the areas of formal and computational linguistics and artificial intelligence leads to new formalisms for the representation of linguistic knowledge, and these formalisms and their application in such areas as machine translation and language-based interfaces call for new, effective approaches to parsing. Moreover, advances in speech technology and multimedia applications cause an increasing demand for parsing technologies where language, speech, and other modalities are fully integrated. Recent Advances in Parsing Technology presents an overview of recent developments in this area with an emphasis on new approaches for parsing modern, constraint-based formalisms, on stochastic approaches to parsing, and on aspects of integrating syntactic parsing in further processing.

meeting of the association for computational linguistics | 1988

Graph-structured Stack and Natural Language Parsing

Masaru Tomita

A general device for handling nondeterminism in stack operations is described. The device, called a Graph-structured Stack, can eliminate duplication of operations throughout the nondeterministic processes. This paper then applies the graph-structured stack to various natural language parsing methods, including ATN, LR parsing, categorial grammar and principle-based parsing. The relationship between the graph-structured stack and a chart in chart parsing is also discussed.
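
The core idea of a graph-structured stack can be sketched in a few lines of Python. This is an illustrative sketch rather than the paper's formulation: the names `Node`, `push` and `pop` are hypothetical, and it assumes that a stack element may simply point to several elements beneath it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """One stack element; `preds` are the elements directly beneath it."""
    state: str
    preds: list = field(default_factory=list)


def push(tops, state):
    """Push `state` onto every stack in `tops`, sharing one new node.

    The stacks are merged at the new top, so any later operation on
    that top is performed once instead of once per stack.
    """
    return Node(state, list(tops))


def pop(top, depth):
    """Return every node that can become the top after popping `depth` elements."""
    frontier = [top]
    for _ in range(depth):
        below = []
        for node in frontier:
            for pred in node.preds:
                if pred not in below:   # identity check: shared nodes appear once
                    below.append(pred)
        frontier = below
    return frontier


bottom = Node("0")
a = push([bottom], "a")       # the stack splits into two alternatives...
b = push([bottom], "b")
merged = push([a, b], "c")    # ...which are merged again under one top
```

Popping one element from `merged` yields both `a` and `b`, while popping two yields the single shared `bottom`, which is how the structure avoids repeating work along the common parts of the stacks.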

international conference on computational linguistics | 1986

Another stride towards knowledge-based machine translation

Masaru Tomita; Jaime G. Carbonell

Building on the well-established premise that reliable machine translation requires a significant degree of text comprehension, this paper presents a recent advance in multi-lingual knowledge-based machine translation (KBMT). Unlike previous approaches, the current method provides for separate syntactic and semantic knowledge sources that are integrated dynamically for parsing and generation. Such a separation enables the system to have syntactic grammars, language specific but domain general, and semantic knowledge bases, domain specific but language general. Subsequently, grammars and domain knowledge are precompiled automatically in any desired combination to produce very efficient and very thorough real-time parsers. A pilot implementation of our KBMT architecture using functional grammars and entity-oriented semantics demonstrates the feasibility of the new approach.

Archive | 1991

Parsing 2-Dimensional Language

Masaru Tomita

2-Dimensional Context Free Grammar (2D-CFG) for 2-dimensional input text is introduced and efficient parsing algorithms for 2D-CFG are presented. In 2D-CFG, the right-hand side symbols of a grammar rule can be placed not only horizontally but also vertically. Terminal symbols in a 2-dimensional input text are combined to form a rectangular region, and regions are combined to form a larger region using a 2-dimensional phrase structure rule. The parsing algorithms presented in this chapter are the 2D-Earley algorithm and the 2D-LR algorithm, which are 2-dimensional extensions of Earley's algorithm and the Generalized LR algorithm, respectively.
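
The way regions combine under horizontal and vertical rules can be illustrated with a small Python sketch. The adjacency test used here (two regions must share a full edge so that the result is again a rectangle) is an assumption made for illustration and may differ from the chapter's exact definition. `Region`, `combine_h` and `combine_v` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    symbol: str
    x: int   # left column
    y: int   # top row
    w: int   # width in cells
    h: int   # height in cells


def combine_h(lhs: str, left: Region, right: Region) -> Optional[Region]:
    """Apply a horizontal rule lhs -> left right if the regions sit side by side."""
    if left.x + left.w == right.x and left.y == right.y and left.h == right.h:
        return Region(lhs, left.x, left.y, left.w + right.w, left.h)
    return None


def combine_v(lhs: str, upper: Region, lower: Region) -> Optional[Region]:
    """Apply a vertical rule lhs -> upper over lower if the regions are stacked."""
    if upper.y + upper.h == lower.y and upper.x == lower.x and upper.w == lower.w:
        return Region(lhs, upper.x, upper.y, upper.w, upper.h + lower.h)
    return None


a = Region("a", 0, 0, 1, 1)
b = Region("b", 1, 0, 1, 1)
c = Region("c", 0, 1, 2, 1)
row = combine_h("Row", a, b)         # a and b side by side form a 2x1 region
block = combine_v("Block", row, c)   # that row stacked on c forms a 2x2 region
```

A 2D parser would apply such combinations systematically, in the same way a 1-dimensional parser concatenates adjacent spans.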

Machine Translation | 1986

Sentence disambiguation by asking

Masaru Tomita

Our parsing algorithm produces all possible parses from an ambiguous sentence. For the algorithm to be useful in practical applications, we need a mechanism to select one intended parse out of a batch of parses. In theory, it is desirable for the system to be able to disambiguate a sentence by semantics and pragmatics, and a large number of techniques using semantic information have been developed to resolve natural language ambiguity [13, 45, 7]. In practice, however, not all ambiguity problems can be solved by those techniques at the current state of the art. Moreover, some sentences are absolutely ambiguous, that is, even a human cannot disambiguate them, unless he knows the intent of the speaker.
