Takehiko Maruyama
Senshu University
Publication
Featured research published by Takehiko Maruyama.
Language Resources and Evaluation | 2014
Kikuo Maekawa; Makoto Yamazaki; Toshinobu Ogiso; Takehiko Maruyama; Hideki Ogura; Wakako Kashino; Hanae Koiso; Masaya Yamaguchi; Makiro Tanaka; Yasuharu Den
The Balanced Corpus of Contemporary Written Japanese (BCCWJ) is Japan’s first 100-million-word balanced corpus. It consists of three subcorpora (publication subcorpus, library subcorpus, and special-purpose subcorpus) and covers a wide range of text registers, including general books, magazines, newspapers, governmental white papers, best-selling books, an internet bulletin board, a blog, school textbooks, minutes of the National Diet, publicity newsletters of local governments, laws, and poetry verses. Random sampling is used wherever possible to maximize the representativeness of the corpus. The corpus is annotated in terms of dual POS analysis, document structure, and bibliographical information. The BCCWJ is currently accessible in three different ways, including Chunagon, a web-based interface to the dual POS analysis data. Lastly, results of pilot evaluations of the corpus with respect to textual diversity are reported. The analyses include POS distribution, word-class distribution, entropy of orthography, sentence length, and variation of the adjective predicate. High textual diversity is observed in all these analyses.
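As a rough illustration of what an "entropy of orthography" measure could look like, the sketch below computes the Shannon entropy of the spelling variants observed for a single word. This is an assumption about the general idea, not the procedure used in the paper; the function name `orthographic_entropy` and the sample variants are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def orthographic_entropy(variants: Iterable[str]) -> float:
    """Shannon entropy (in bits) of the observed spellings of one word.

    Assumption: each element of `variants` is one attested written form
    of the same lexical item. This is an illustrative measure only.
    """
    counts = Counter(variants)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over the relative frequency p of each spelling.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical sample: one word written in hiragana, katakana, and kanji.
sample = ["りんご", "リンゴ", "林檎", "りんご"]
entropy_bits = orthographic_entropy(sample)
```

A word that is always written the same way scores 0 bits, and the score grows as spellings become more varied and more evenly distributed.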
Meeting of the Association for Computational Linguistics | 2006
Tomohiro Ohno; Shigeki Matsubara; Hideki Kashioka; Takehiko Maruyama; Yasuyoshi Inagaki
In applications of spoken monologue processing such as simultaneous machine interpretation and real-time caption generation, incremental language parsing is strongly required. This paper proposes a technique for incremental dependency parsing of Japanese spoken monologue on a clause-by-clause basis. The technique identifies clauses based on clause boundary analysis, analyzes their dependency structures, and tries to decide their dependency relations with other clauses, simultaneously with the monologue speech input. The dependency relations are generated before the entire monologue has been input, and therefore our technique can be used for language parsing in simultaneous Japanese speech understanding. An experiment using Japanese monologues has shown that our technique performs comparably to the usual dependency parsing for monologue sentences.
Language Resources and Evaluation | 2007
Tomohiro Ohno; Shigeki Matsubara; Hideki Kashioka; Takehiko Maruyama; Hideki Tanaka; Yasuyoshi Inagaki
Spoken monologues feature greater sentence length and structural complexity than spoken dialogues. To achieve high parsing performance for spoken monologues, simplifying the structure by dividing a sentence into suitable language units could prove effective. This paper proposes a method for dependency parsing of Japanese spoken monologues based on sentence segmentation. In this method, dependency parsing is executed in two stages: at the clause level and at the sentence level. First, dependencies within a clause are identified by dividing a sentence into clauses and executing stochastic dependency parsing for each clause. Next, dependencies across clause boundaries are identified stochastically, and the dependency structure of the entire sentence is thus completed. An experiment using a spoken monologue corpus shows the effectiveness of this method for efficient dependency parsing of Japanese monologue sentences.
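The two-stage idea in this abstract can be sketched in a toy form. The code below is a simplified illustration, not the authors' implementation: it assumes head-final dependencies (each unit depends on a later unit), non-empty clauses, that only the last unit of a clause depends across a clause boundary, and a caller-supplied `score` function standing in for the stochastic model. It also ignores constraints such as non-crossing dependencies. All names are hypothetical.

```python
from typing import Callable, Dict, List

# score(dependent_index, head_index) -> higher means more plausible.
Score = Callable[[int, int], float]


def parse_two_stage(clauses: List[List[str]], score: Score) -> Dict[int, int]:
    """Return a map from each unit's global index to its head's index."""
    # Assign global indices to the units of each (non-empty) clause.
    spans = []
    start = 0
    for clause in clauses:
        spans.append(range(start, start + len(clause)))
        start += len(clause)
    total = start

    heads: Dict[int, int] = {}

    # Stage 1 (clause level): every unit except the clause-final one
    # picks its best-scoring head among later units in the same clause.
    for span in spans:
        for i in span[:-1]:
            candidates = range(i + 1, span[-1] + 1)
            heads[i] = max(candidates, key=lambda j: score(i, j))

    # Stage 2 (sentence level): each clause-final unit, except the last
    # one in the sentence, picks a head among units in later clauses.
    for span in spans[:-1]:
        i = span[-1]
        candidates = range(i + 1, total)
        heads[i] = max(candidates, key=lambda j: score(i, j))

    return heads


# Usage with a placeholder scorer that simply prefers the nearest head.
nearest = lambda dep, head: -(head - dep)
result = parse_two_stage([["a", "b"], ["c", "d"]], nearest)
```

The sentence-final unit has no entry in the result because it is the root. Restricting stage 1 to candidates inside the clause is what keeps the search space small.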
Journal of Psycholinguistic Research | 2017
Ye Tian; Takehiko Maruyama; Jonathan Ginzburg
There is an ongoing debate over whether phenomena of disfluency (such as filled pauses) are produced communicatively. Clark and Fox Tree (Cognition 84(1):73–111, 2002) propose that filled pauses are words, and that different forms signal different lengths of delay. This paper evaluates this Filler-As-Words hypothesis by analyzing the distribution of self-addressed questions, or SAQs (such as “what’s the word”), in relation to filled pauses. We found that SAQs address different problems in different languages (most frequently memory retrieval in English and Chinese, and appropriateness in Japanese). In relation to filled pauses, British but not American English uses “um” to signal a more severe problem than “uh”. Chinese uses different filled pauses to signal the syntactic category of the problem constituent. Japanese uses different filled pauses to signal levels of interaction with the interlocutor. Overall, our data supports the Filler-As-Words hypothesis that filled pauses are used communicatively. However, the dimensions of their meanings vary across languages and dialects.
Language Resources and Evaluation | 2010
Kikuo Maekawa; Makoto Yamazaki; Takehiko Maruyama; Masaya Yamaguchi; Hideki Ogura; Wakako Kashino; Toshinobu Ogiso; Hanae Koiso; Yasuharu Den
Journal of Natural Language Processing | 2004
Takehiko Maruyama; Hideki Kashioka; Tadashi Kumano; Hideki Tanaka
Language Resources and Evaluation | 2010
Yasuharu Den; Hanae Koiso; Takehiko Maruyama; Kikuo Maekawa; Katsuya Takanashi; Mika Enomoto; Nao Yoshida
Language Resources and Evaluation | 2006
Kiyotaka Uchimoto; Ryoji Hamabe; Takehiko Maruyama; Katsuya Takanashi; Tatsuya Kawahara; Hitoshi Isahara
Journal of Psycholinguistic Research | 2016
Takehiko Maruyama; Jonathan Ginzburg; Ye Tian
DiSS | 2013
Vered Silber-Varod; Takehiko Maruyama
Collaboration
Institutions that have collaborated with Takehiko Maruyama.
National Institute of Information and Communications Technology