Publication


Featured research published by Yasuharu Den.


Language Resources and Evaluation | 2014

Balanced corpus of contemporary written Japanese

Kikuo Maekawa; Makoto Yamazaki; Toshinobu Ogiso; Takehiko Maruyama; Hideki Ogura; Wakako Kashino; Hanae Koiso; Masaya Yamaguchi; Makiro Tanaka; Yasuharu Den

The Balanced Corpus of Contemporary Written Japanese (BCCWJ) is Japan’s first 100-million-word balanced corpus. It consists of three subcorpora (publication subcorpus, library subcorpus, and special-purpose subcorpus) and covers a wide range of text registers, including books in general, magazines, newspapers, governmental white papers, best-selling books, an internet bulletin board, a blog, school textbooks, minutes of the National Diet, publicity newsletters of local governments, laws, and poetry verses. A random sampling technique is utilized whenever possible in order to maximize the representativeness of the corpus. The corpus is annotated in terms of dual POS analysis, document structure, and bibliographical information. The BCCWJ is currently accessible in three different ways, including Chunagon, a web-based interface to the dual POS analysis data. Lastly, results of some pilot evaluations of the corpus with respect to textual diversity are reported. The analyses include POS distribution, word-class distribution, entropy of orthography, sentence length, and variation of the adjective predicate. High textual diversity is observed in all these analyses.


Life-Like Characters | 2004

Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents

Shinichi Kawamoto; Hiroshi Shimodaira; Tsuneo Nitta; Takuya Nishimoto; Satoshi Nakamura; Katsunobu Itou; Shigeo Morishima; Tatsuo Yotsukura; Atsuhiko Kai; Akinobu Lee; Yoichi Yamashita; Takao Kobayashi; Keiichi Tokuda; Keikichi Hirose; Nobuaki Minematsu; Atsushi Yamada; Yasuharu Den; Takehito Utsuro; Shigeki Sagayama

Galatea is a software toolkit for developing a human-like spoken dialog agent. To make it easy to integrate modules with different characteristics, including a speech recognizer, a speech synthesizer, a facial animation synthesizer, and a dialog controller, each module is modeled as a virtual machine with a simple common interface, and the modules are connected to each other through a broker (communication manager). Galatea employs model-based speech and facial animation synthesizers whose model parameters can easily be adapted to an existing person if his or her training data is given. The software toolkit, which runs on both UNIX/Linux and Windows operating systems, will be publicly available in the middle of 2003 [7, 6].
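The broker arrangement described in this abstract can be pictured with a minimal Python sketch. It is not Galatea's actual API: the `Broker` and `Module` classes, the module names, and the message strings are all hypothetical, chosen only to show modules that share one simple interface and talk to each other solely through a communication manager.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Broker:
    """Hypothetical communication manager: routes every message between modules."""

    def __init__(self) -> None:
        self._modules: Dict[str, "Module"] = {}
        self.log: List[str] = []  # record of routed messages, for inspection

    def register(self, module: "Module") -> None:
        self._modules[module.name] = module
        module.broker = self

    def send(self, sender: str, target: str, message: str) -> None:
        # Modules never call each other directly; everything passes through here.
        self.log.append(f"{sender}->{target}: {message}")
        self._modules[target].receive(sender, message)


class Module:
    """Hypothetical common interface: every module exposes only receive()."""

    def __init__(self, name: str,
                 handler: Callable[["Module", str, str], None]) -> None:
        self.name = name
        self.handler = handler
        self.broker: Optional[Broker] = None
        self.inbox: List[Tuple[str, str]] = []

    def receive(self, sender: str, message: str) -> None:
        self.inbox.append((sender, message))
        self.handler(self, sender, message)


def ignore(module: Module, sender: str, message: str) -> None:
    """Handler for modules that only consume messages in this sketch."""


def dialog_handler(module: Module, sender: str, message: str) -> None:
    # Illustrative policy: echo recognized text back through the synthesizer.
    if sender == "recognizer" and module.broker is not None:
        module.broker.send(module.name, "synthesizer", f"say: you said '{message}'")


def build_demo() -> Broker:
    broker = Broker()
    broker.register(Module("recognizer", ignore))
    broker.register(Module("dialog", dialog_handler))
    broker.register(Module("synthesizer", ignore))
    return broker
```

Calling `build_demo()` and then `broker.send("recognizer", "dialog", "hello")` routes the recognized text to the dialog controller, which in turn asks the synthesizer to speak. Because each module only knows the broker, swapping in a different recognizer or synthesizer would not require changing the others.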


Robot and Human Interactive Communication | 2004

Activities of Interactive Speech Technology Consortium (ISTC) targeting open software development for MMI systems

Tsuneo Nitta; Shigeki Sagayama; Yoichi Yamashita; Tatsuya Kawahara; Shigeo Morishima; Shizuka Nakamura; Atsushi Yamada; Koji Ito; M. Kai; A. Li; Masato Mimura; Keikichi Hirose; Takao Kobayashi; Keiichi Tokuda; Nobuaki Minematsu; Yasuharu Den; Takehito Utsuro; Tatsuo Yotsukura; Hiroshi Shimodaira; M. Araki; Takuya Nishimoto; N. Kawaguchi; H. Banno; Kouichi Katsurada

Interactive Speech Technology Consortium (ISTC), established in November 2003 after three years of activity of the Galatea project supported by the Information-technology Promotion Agency (IPA) of Japan, aims at supporting open-source free software development of multi-modal interaction (MMI) for human-like agents. The software, named Galatea-toolkit and developed by 24 researchers from 16 research institutes in Japan, includes a Japanese speech recognition engine, a Japanese speech synthesis engine, and a facial image synthesis engine used for developing an anthropomorphic agent, as well as a dialogue manager that can integrate multiple modalities, interpret them, and decide on an action, differentiating it into the multiple media of voice and facial expression. ISTC provides members with a one-day technical seminar and a one-week training course to master Galatea-toolkit, as well as a software set (CD-ROM) every year.


2011 International Conference on Speech Database and Assessments (Oriental COCOSDA) | 2011

Annotation of Japanese response tokens and preliminary analysis on their distribution in three-party conversations

Yasuharu Den; Nao Yoshida; Katsuya Takanashi; Hanae Koiso

In this paper, we propose a new annotation scheme for Japanese response tokens (RTs), which is based on strict and consistent procedures. Our scheme consists of two-stage annotation, in which RTs are first identified and classified according to their forms and then further sub-classified based on their sequential positions. Six forms are included in our class of RTs: i) responsive interjections, ii) expressive interjections, iii) lexical reactive expressions, iv) repetitions, v) completions, and vi) assessments. Some of them bear an additional tag according to their sequential position in the discourse: i) first pair parts, ii) second pair parts, iii) sequence-closing thirds, iv) other responding turns, and v) unclassifiable positions. We apply our scheme to annotate a Japanese three-party conversation corpus, and present the results of a preliminary analysis on the distribution of RTs in the corpus.
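The two-stage scheme in this abstract can be written down as a small data structure. The Python sketch below is only an illustration under stated assumptions: the six forms and five sequential positions come from the abstract, while the class names, function names, and placeholder tokens are hypothetical and are not taken from the authors' annotation tooling.

```python
from collections import Counter
from dataclasses import dataclass, replace
from enum import Enum
from typing import Iterable, Optional


class Form(Enum):
    """The six RT forms listed in the abstract."""
    RESPONSIVE_INTERJECTION = "responsive interjection"
    EXPRESSIVE_INTERJECTION = "expressive interjection"
    LEXICAL_REACTIVE_EXPRESSION = "lexical reactive expression"
    REPETITION = "repetition"
    COMPLETION = "completion"
    ASSESSMENT = "assessment"


class Position(Enum):
    """The five sequential-position tags listed in the abstract."""
    FIRST_PAIR_PART = "first pair part"
    SECOND_PAIR_PART = "second pair part"
    SEQUENCE_CLOSING_THIRD = "sequence-closing third"
    OTHER_RESPONDING_TURN = "other responding turn"
    UNCLASSIFIABLE = "unclassifiable position"


@dataclass(frozen=True)
class ResponseToken:
    speaker: str
    text: str  # placeholder strings below; real corpus text is not reproduced
    form: Form
    # Only some tokens bear a position tag, so it is optional.
    position: Optional[Position] = None


def identify(speaker: str, text: str, form: Form) -> ResponseToken:
    """Stage 1 (hypothetical helper): identify an RT and classify it by form."""
    return ResponseToken(speaker, text, form)


def sub_classify(token: ResponseToken, position: Position) -> ResponseToken:
    """Stage 2 (hypothetical helper): add the sequential-position tag."""
    return replace(token, position=position)


def form_distribution(tokens: Iterable[ResponseToken]) -> Counter:
    """Count tokens per form, the kind of tally a distribution analysis starts from."""
    return Counter(token.form for token in tokens)
```

A token would first pass through `identify` and, where the scheme calls for it, through `sub_classify`; `form_distribution` then gives per-form counts over any collection of annotated tokens.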


International Conference on Multimodal Interfaces | 2007

Simultaneous prediction of dialog acts and address types in three-party conversations

Yosuke Matsusaka; Mika Enomoto; Yasuharu Den

This paper reports on automatic prediction of dialog acts and address types in three-party conversations. In multi-party interaction, dialog structure becomes more complex than in the one-to-one case, because there is more than one hearer for an utterance. To cope with this problem, we predict dialog acts and address types simultaneously in our framework. Prediction of dialog act labels improved to 68.5% by considering both context and address types. CART decision tree analysis has also been applied to examine which features are useful for predicting those labels.


Natural Language Generation | 1993

A Chart-Based Semantic Head Driven Generation Algorithm

Masahiko Haruno; Yasuharu Den; Yuji Matsumoto

Natural language generation systems need efficient and flexible search strategies because they produce texts from abstract representations through sophisticated selections of various rules. The semantic-head-driven (SHD) algorithm resolved problems of top-down and bottom-up search methods by skilfully combining the two. However, straightforward depth-first implementations of the algorithm still suffer from the inefficiency of extensive backtracking when applied to a large-scale grammar. The backtracking leads to a large amount of recomputation of partial results that could be shared among several alternatives. In addition, the depth-first search method is generally unable to handle multiple contexts in the search space, i.e., the generator cannot compare plausible candidates at the same time.


Laboratory Phonology | 2015

Some phonological, syntactic, and cognitive factors behind phrase-final lengthening in spontaneous Japanese: A corpus-based study

Yasuharu Den

In this study, we investigated segment lengthening in spontaneous Japanese based on a quantitative analysis of a large-scale corpus, focusing on the following three locations at which lengthening frequently occurs: the final segments of (i) clause-initial preface tokens (fillers and conjunctions), (ii) clause-initial wa-marked topic phrases, and (iii) clause-final particles. Two cognitive factors, namely clause complexity and boundary depth, were precisely analyzed using statistical models that also accounted for several phonological and syntactic factors. The results showed that in addition to the reliably strong effects of some phonological factors such as the presence of a following pause and the presence of boundary pitch movement, the effects of two cognitive factors were also evident. The way in which lengthening is related to the cognitive factors, however, varies significantly by location and token type. Lengthening of clause-final particles was affected by boundary depth, while lengthening of the topic marker wa of clause-initial topic phrases was influenced by clause complexity. Lengthening of the filler e was affected by both factors. A significant interaction between the two factors was also observed for the filler ano. We discuss the implications of these results as well as agendas for improving the current analysis.


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2008

Implicit Proposal Filtering in Multi-Party Consensus-Building Conversations

Yasuhiro Katagiri; Yosuke Matsusaka; Yasuharu Den; Mika Enomoto; Masato Ishizaki; Katsuya Takanashi

An attempt was made to statistically estimate which proposals survived the discussion to be incorporated in the final agreement in an instance of a Japanese design conversation. Low-level speech and vision features of hearer behaviors corresponding to aiduti, nodding, and gaze were found to be positive predictors of survival. The result suggests that non-linguistic hearer responses work as implicit proposal filters in consensus building, and could provide promising candidate features for the recognition and summarization of meeting events.


Journal of Natural Language Processing | 2014

An Environment for the Usage of Spoken Discourse Corpora that Effectively Utilizes Existing Tools

Yasuharu Den; Hanae Koiso

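In recent years, corpus annotation has diversified, and a mechanism for the integrated use of multi-layer annotations has become indispensable. Spoken-language corpora in particular need to integrate more than ten kinds of linguistic and non-linguistic units together with their interrelations, and to support complex searches that combine multiple units. To meet these demands, this study (1) designs a general-purpose database schema that can represent multimodal, multichannel spoken-language corpora, and (2) develops tools that take as input annotations in various formats created with existing annotation tools and build a database instantiated from the general-purpose schema. In the field of spoken language, it is essential to make effective use of widely used existing annotation tools, and this study proposes a method for constructing a corpus usage environment based on existing annotation tools and corpus search tools. The proposed method has been applied to several spoken-language corpora developed by different organizations and is in operational use. Keywords: spoken-language corpus, multi-layer annotation, use of existing tools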


Speech Communication | 2008

Filled pauses as cues to the complexity of upcoming phrases for native and non-native listeners

Michiko Watanabe; Keikichi Hirose; Yasuharu Den; Nobuaki Minematsu

Collaboration


Dive into Yasuharu Den's collaborations.

Top Co-Authors

Mika Enomoto
Tokyo University of Technology

Keiichi Tokuda
Nagoya Institute of Technology

Takao Kobayashi
Tokyo Institute of Technology