Publication


Featured research published by Takehiko Maruyama.


Language Resources and Evaluation | 2014

Balanced corpus of contemporary written Japanese

Kikuo Maekawa; Makoto Yamazaki; Toshinobu Ogiso; Takehiko Maruyama; Hideki Ogura; Wakako Kashino; Hanae Koiso; Masaya Yamaguchi; Makiro Tanaka; Yasuharu Den

The Balanced Corpus of Contemporary Written Japanese (BCCWJ) is Japan’s first 100-million-word balanced corpus. It consists of three subcorpora (publication subcorpus, library subcorpus, and special-purpose subcorpus) and covers a wide range of text registers, including books in general, magazines, newspapers, governmental white papers, best-selling books, an internet bulletin board, a blog, school textbooks, minutes of the National Diet, publicity newsletters of local governments, laws, and poetry verses. A random sampling technique is used whenever possible in order to maximize the representativeness of the corpus. The corpus is annotated in terms of dual POS analysis, document structure, and bibliographical information. The BCCWJ is currently accessible in three different ways, including Chunagon, a web-based interface to the dual POS analysis data. Lastly, results of a pilot evaluation of the corpus with respect to textual diversity are reported. The analyses include POS distribution, word-class distribution, entropy of orthography, sentence length, and variation of the adjective predicate. High textual diversity is observed in all these analyses.


Meeting of the Association for Computational Linguistics | 2006

Dependency Parsing of Japanese Spoken Monologue Based on Clause Boundaries

Tomohiro Ohno; Shigeki Matsubara; Hideki Kashioka; Takehiko Maruyama; Yasuyoshi Inagaki

In applications of spoken monologue processing such as simultaneous machine interpretation and real-time caption generation, incremental language parsing is strongly required. This paper proposes a technique for incremental dependency parsing of Japanese spoken monologue on a clause-by-clause basis. The technique identifies clauses through clause boundary analysis, analyzes their dependency structures, and tries to decide their dependency relations with other clauses, all simultaneously with the monologue speech input. The dependency relations are generated before the entire monologue has been input, and therefore our technique can be used for language parsing in simultaneous Japanese speech understanding. An experiment using Japanese monologues has shown that our technique achieves performance comparable to that of conventional dependency parsing for monologue sentences.


Language Resources and Evaluation | 2007

Dependency parsing of Japanese monologue using clause boundaries

Tomohiro Ohno; Shigeki Matsubara; Hideki Kashioka; Takehiko Maruyama; Hideki Tanaka; Yasuyoshi Inagaki

Spoken monologues feature greater sentence length and structural complexity than spoken dialogues. To achieve high parsing performance for spoken monologues, simplifying the structure by dividing a sentence into suitable language units could prove effective. This paper proposes a method for dependency parsing of Japanese spoken monologues based on sentence segmentation. In this method, dependency parsing is executed in two stages: at the clause level and at the sentence level. First, dependencies within a clause are identified by dividing a sentence into clauses and executing stochastic dependency parsing for each clause. Next, dependencies across clause boundaries are identified stochastically, and the dependency structure of the entire sentence is thus completed. An experiment using a spoken monologue corpus shows the effectiveness of this method for efficient dependency parsing of Japanese monologue sentences.


Journal of Psycholinguistic Research | 2017

Self Addressed Questions and Filled Pauses: A Cross-linguistic Investigation

Ye Tian; Takehiko Maruyama; Jonathan Ginzburg

There is an ongoing debate whether phenomena of disfluency (such as filled pauses) are produced communicatively. Clark and Fox Tree (Cognition 84(1):73–111, 2002) propose that filled pauses are words, and that different forms signal different lengths of delay. This paper evaluates this Filler-As-Words hypothesis by analyzing the distribution of self-addressed questions, or SAQs (such as “what’s the word”), in relation to filled pauses. We found that SAQs address different problems in different languages (most frequently memory retrieval in English and Chinese, and appropriateness in Japanese). In relation to filled pauses, British but not American English uses “um” to signal a more severe problem than “uh”. Chinese uses different filled pauses to signal the syntactic category of the problem constituent. Japanese uses different filled pauses to signal levels of interaction with the interlocutor. Overall, our data support the Filler-As-Words hypothesis that filled pauses are used communicatively. However, the dimensions of their meanings vary across languages and dialects.


Language Resources and Evaluation | 2010

Design, Compilation, and Preliminary Analyses of Balanced Corpus of Contemporary Written Japanese.

Kikuo Maekawa; Makoto Yamazaki; Takehiko Maruyama; Masaya Yamaguchi; Hideki Ogura; Wakako Kashino; Toshinobu Ogiso; Hanae Koiso; Yasuharu Den


Journal of Natural Language Processing | 2004

Development and Evaluation of Japanese Clause Boundaries Annotation Program

Takehiko Maruyama; Hideki Kashioka; Tadashi Kumano; Hideki Tanaka


Language Resources and Evaluation | 2010

Two-level Annotation of Utterance-units in Japanese Dialogs: An Empirically Emerged Scheme.

Yasuharu Den; Hanae Koiso; Takehiko Maruyama; Kikuo Maekawa; Katsuya Takanashi; Mika Enomoto; Nao Yoshida


Language Resources and Evaluation | 2006

Dependency-structure Annotation to Corpus of Spontaneous Japanese.

Kiyotaka Uchimoto; Ryoji Hamabe; Takehiko Maruyama; Katsuya Takanashi; Tatsuya Kawahara; Hitoshi Isahara


Journal of Psycholinguistic Research | 2016

Filled Pauses and Self Addressed Questions

Takehiko Maruyama; Jonathan Ginzburg; Ye Tian


DiSS | 2013

The linguistic role of hesitation disfluencies: evidence from Hebrew and Japanese.

Vered Silber-Varod; Takehiko Maruyama

Collaboration


Top Co-Authors

Hideki Kashioka

National Institute of Information and Communications Technology

Toshinobu Ogiso

Nara Institute of Science and Technology

Nao Yoshida

Tokyo University of Technology