Publication


Featured research published by Daichi Mochihashi.


International Conference on Acoustics, Speech, and Signal Processing | 2013

Bayesian semi-supervised audio event transcription based on Markov Indian buffet process

Yasunori Ohishi; Daichi Mochihashi; Tomoko Matsui; Masahiro Nakano; Hirokazu Kameoka; Tomonori Izumitani; Kunio Kashino

We present a novel generative model for audio event transcription that recognizes “events” in audio signals containing multiple kinds of overlapping sounds. In the proposed model, the overlapping audio events are first modeled by nonnegative matrix factorization, into which two Bayesian nonparametric approaches, the Markov Indian buffet process and the Chinese restaurant process, are incorporated. This approach allows us to automatically transcribe the events while avoiding the model selection problem by assuming a countably infinite number of possible audio events in the input signal. Then, Bayesian logistic regression annotates the audio frames with multiple event labels in a semi-supervised learning setup. Experimental results show that our model annotates an audio signal better than a baseline method. Additionally, we verify that our infinite generative model is also able to detect unknown audio events that are not included in the training data.
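The "countably infinite number of possible audio events" above relies on priors such as the Chinese restaurant process (CRP). The following is a minimal sketch of drawing a partition from a CRP, not the paper's model: the function name `sample_crp`, the concentration parameter `alpha`, and the seed are illustrative assumptions.

```python
import random


def sample_crp(n_customers, alpha, seed=0):
    """Draw one partition from a Chinese restaurant process (illustrative sketch).

    Customer i joins existing table k with probability proportional to the
    number of customers already seated there, or opens a new table with
    probability proportional to alpha (assumed > 0).
    """
    rng = random.Random(seed)
    assignments = []  # table index chosen by each customer
    counts = []       # number of customers at each table
    for _ in range(n_customers):
        # Unnormalized weights: one per existing table, plus one for a new table.
        weights = counts + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(counts):
            counts.append(1)      # a new table is opened
        else:
            counts[table] += 1    # an existing table grows
        assignments.append(table)
    return assignments, counts


if __name__ == "__main__":
    tables, sizes = sample_crp(n_customers=50, alpha=1.0)
    print("number of tables:", len(sizes))
    print("table sizes:", sizes)
```

The number of tables is not fixed in advance; it grows with the data, which is the property that lets such models sidestep choosing the number of components beforehand.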


Advanced Robotics | 2016

Learning word meanings and grammar for verbalization of daily life activities using multilayered multimodal latent Dirichlet allocation and Bayesian hidden Markov models

Muhammad Attamimi; Yuji Ando; Tomoaki Nakamura; Takayuki Nagai; Daichi Mochihashi; Ichiro Kobayashi; Hideki Asoh

Intelligent systems need to understand and respond to human words to enable them to interact with humans in a natural way. Several studies have attempted to realize these abilities by investigating the symbol grounding problem. For example, we proposed multilayered multimodal latent Dirichlet allocation (mMLDA) to enable the formation of various concepts and inference using grounded concepts. We previously reported on the issue of connecting words to various hierarchical concepts and also proposed a simple preliminary algorithm for generating sentences. This paper proposes a novel method that enables a sensing system to verbalize an everyday scene it observes. The method uses mMLDA and Bayesian hidden Markov models (BHMM), and the proposed algorithm improves the word inference of our previous work. The advantage of our approach is that grammar learning based on BHMM not only boosts concept selection results but also enables our method to process functional words. The proposed verbalization algorithm produces results that are far superior to those of previous methods. Finally, we developed a system to obtain multimodal data from human everyday activities. We evaluate language learning and sentence generation as a complete process under this realistic setting. The results demonstrate the effectiveness of our method.


International Joint Conference on Natural Language Processing | 2015

Inducing Word and Part-of-Speech with Pitman-Yor Hidden Semi-Markov Models

Kei Uchiumi; Hiroshi Tsukahara; Daichi Mochihashi

We propose a nonparametric Bayesian model for joint unsupervised word segmentation and part-of-speech tagging from raw strings. Extending a previous model for word segmentation, our model is called a Pitman-Yor Hidden Semi-Markov Model (PYHSMM) and can be regarded as a method to build a class n-gram language model directly from strings while integrating character- and word-level information. Experimental results on standard datasets in Japanese, Chinese and Thai show that it outperforms previous results and yields state-of-the-art accuracies. This model will also serve to analyze the structure of a language whose words are not identified a priori.


International Conference on Acoustics, Speech, and Signal Processing | 2011

Gibbs sampling based Multi-scale Mixture Model for speaker clustering

Shinji Watanabe; Daichi Mochihashi; Takaaki Hori; Atsushi Nakamura

A novel sampling method is proposed for estimating a continuous multi-scale mixture model. The multi-scale mixture models we assume have a hierarchical structure in which each component of the mixture is represented by a Gaussian mixture model (GMM). In speaker modeling from speech, this GMM represents intra-speaker dynamics derived from differences in attributes such as phoneme contexts and the existence of non-stationary noise, and the mixture of GMMs (MoGMMs) represents inter-speaker dynamics derived from differences between speakers. Gibbs sampling is a powerful technique for estimating such hierarchically structured models, but depending on how it is used it can easily become trapped in local optima, especially when the elemental GMMs are complex in structure. To solve this problem, a highly accurate and robust sampling method based on blocked Gibbs sampling and iterative conditional modes (ICM) is proposed and effectively applied to reduce singular solutions in models with complex multi-modal distributions. In speaker clustering experiments under non-stationary noise, the proposed sampling-based model estimation improved the clustering performance by 17% on average compared to conventional sampling-based methods.


International Conference on Acoustics, Speech, and Signal Processing | 2014

Mixture of Gaussian process experts for predicting sung melodic contour with expressive dynamic fluctuations

Yasunori Ohishi; Daichi Mochihashi; Hirokazu Kameoka; Kunio Kashino

We present a generative model for predicting the sung melodic contour, i.e., F0 contour, with expressive dynamic fluctuations, such as vibrato and portamento, for a given musical score. Although several studies have attempted to characterize such fluctuations, no systematic method has been developed for generating the F0 contour with them in connection with musical notes. In our model, the relationship between a musical note sequence and F0 contour is directly learned by a mixture of Gaussian process experts. This approach allows us to automatically characterize the fluctuations by utilizing the kernel function for each Gaussian process expert and predict the F0 contour for an arbitrary musical note sequence. Experimental results show that our model can better predict the F0 contour than a baseline method can. Additionally, we discuss the effective musical contexts and the amount of training data for the prediction.
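To make the role of the kernel function concrete, here is a minimal sketch of the posterior mean of a single Gaussian process regressor with an RBF kernel. It illustrates one expert only, not the mixture of experts described above; the function names `rbf_kernel` and `gp_predict`, the hyperparameter values, and the toy sine data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    diff = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, x_test, noise=1e-2, length_scale=1.0):
    """Posterior mean of GP regression: K_* (K + noise * I)^{-1} y."""
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_train += noise * np.eye(len(x_train))       # observation noise on the diagonal
    k_cross = rbf_kernel(x_test, x_train, length_scale)
    weights = np.linalg.solve(k_train, y_train)   # avoids forming an explicit inverse
    return k_cross @ weights


if __name__ == "__main__":
    # Toy stand-in for a contour: a sine wave sampled at 20 time points.
    t = np.linspace(0.0, 5.0, 20)
    contour = np.sin(t)
    t_new = np.linspace(0.0, 5.0, 7)
    print(gp_predict(t, contour, t_new, noise=1e-4))
```

In this sketch the length scale controls how quickly the predicted curve can fluctuate, which is the kind of property a per-expert kernel could capture.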


European Conference on Principles of Data Mining and Knowledge Discovery | 2006

Exploring multiple communities with kernel-based link analysis

Takahiko Ito; Masashi Shimbo; Daichi Mochihashi; Yuji Matsumoto

We discuss issues raised by applying von Neumann kernels to graphs with multiple communities. Depending on the parameter setting, Kandola et al.'s von Neumann kernels can identify not only nodes related to a given node but also the most important nodes in a graph. However, when von Neumann kernels are biased towards importance, the top-ranked nodes are the important nodes in the dominant community of the graph, irrespective of the communities to which the target node belongs. To solve this “topic-drift” problem, we apply von Neumann kernels to weighted graphs (community graphs), which are derived from a generative model of links.
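The parameter dependence mentioned above can be sketched numerically. A common way to write the von Neumann kernel is K(I - decay * K)^{-1}, where K = A^T A is the co-citation matrix of an adjacency matrix A; this form, the function name `neumann_kernel`, and the toy graph are assumptions made for illustration rather than details taken from the abstract.

```python
import numpy as np


def neumann_kernel(adjacency, decay):
    """Von Neumann kernel K (I - decay * K)^{-1} with K = A^T A (illustrative sketch).

    adjacency[i, j] = 1 means node i links to node j, so K[j, k] counts the
    nodes that link to both j and k. The underlying series converges only
    when decay is below 1 / (spectral radius of K).
    """
    a = np.asarray(adjacency, dtype=float)
    k = a.T @ a
    radius = np.max(np.abs(np.linalg.eigvalsh(k)))
    if decay < 0 or (radius > 0 and decay >= 1.0 / radius):
        raise ValueError("decay must lie in [0, 1 / spectral_radius)")
    identity = np.eye(k.shape[0])
    return k @ np.linalg.inv(identity - decay * k)


if __name__ == "__main__":
    # Toy graph: nodes 0 and 1 both link to nodes 2 and 3; node 4 links to node 3.
    a = np.zeros((5, 5))
    a[0, 2] = a[0, 3] = a[1, 2] = a[1, 3] = a[4, 3] = 1
    print(neumann_kernel(a, 0.0))   # decay = 0 reduces to plain co-citation counts
    print(neumann_kernel(a, 0.15))  # larger decay weights longer co-citation chains
```

With decay at zero the kernel scores pure relatedness; as decay approaches its upper bound, the scores are increasingly dominated by the principal eigenvector of K, which corresponds to the importance-biased regime the abstract describes.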


INLG Workshop on Computational Creativity in Natural Language Generation | 2016

Human-like Natural Language Generation Using Monte Carlo Tree Search

Kaori Kumagai; Ichiro Kobayashi; Daichi Mochihashi; Hideki Asoh; Tomoaki Nakamura; Takayuki Nagai

A model is proposed showing how automatically extracted and manually written association rules can be used to build the structure of a narrative from real-life temporal data. The generated text’s communicative goal is to help the reader construct a causal representation of the events. A connecting associative thread allows the reader to follow associations from the beginning to the end of the text. It is created using a spanning tree over a selected associative sub-network. The results of a text quality evaluation show that the texts were understandable, but that flow between sentences, although not bad, could still be improved.


Empirical Methods in Natural Language Processing | 2015

Learning Word Meanings and Grammar for Describing Everyday Activities in Smart Environments

Muhammad Attamimi; Yuji Ando; Tomoaki Nakamura; Takayuki Nagai; Daichi Mochihashi; Ichiro Kobayashi; Hideki Asoh

If intelligent systems are to interact with humans in a natural manner, the ability to describe daily life activities is important. To achieve this, sensing human activities by capturing multimodal information is necessary. In this study, we consider a smart environment for sensing activities in realistic scenarios. We then propose a system that generates sentences from observed multimodal information in a bottom-up manner using multilayered multimodal latent Dirichlet allocation and Bayesian hidden Markov models. We evaluate grammar learning and sentence generation as a complete process within a realistic setting. The experimental results demonstrate the effectiveness of the proposed method.


Frontiers in Neurorobotics | 2017

Segmenting Continuous Motions with Hidden Semi-Markov Models and Gaussian Processes

Tomoaki Nakamura; Takayuki Nagai; Daichi Mochihashi; Ichiro Kobayashi; Hideki Asoh; Masahide Kaneko

Humans divide perceived continuous information into segments to facilitate recognition. For example, humans can segment speech waves into recognizable morphemes. Analogously, continuous motions are segmented into recognizable unit actions. People can divide continuous information into segments without using explicit segment points. This capacity for unsupervised segmentation is also useful for robots, because it enables them to flexibly learn languages, gestures, and actions. In this paper, we propose a Gaussian process-hidden semi-Markov model (GP-HSMM) that can divide continuous time series data into segments in an unsupervised manner. Our proposed method consists of a generative model based on the hidden semi-Markov model (HSMM), the emission distributions of which are Gaussian processes (GPs). Continuous time series data is generated by connecting segments generated by the GP. Segmentation can be achieved by using forward filtering-backward sampling to estimate the model's parameters, including the lengths and classes of the segments. In an experiment using the CMU motion capture dataset, we tested GP-HSMM with motion capture data containing simple exercise motions; the results of this experiment showed that the proposed GP-HSMM was comparable with other methods. We also conducted an experiment using karate motion capture data, which is more complex than exercise motion capture data; in this experiment, the segmentation accuracy of GP-HSMM was 0.92, which outperformed other methods.


International Conference on Data Engineering | 2006

Investigating the Effect of Multiple Communities on Kernel-Based Citation Analysis

Takahiko Ito; Masashi Shimbo; Daichi Mochihashi; Yuji Matsumoto

In this paper, we discuss issues raised by applying Kandola et al.'s Neumann kernels to large citation graphs that have multiple communities. Neumann kernels can identify not only documents related to a given document but also the most important documents in a citation graph. However, when Neumann kernels are biased towards importance, the top-ranked documents are uniformly documents in the dominant community of the citation graph, irrespective of the communities in which the target document is cited. To solve this problem, we model the generation process of citations by probabilistic Latent Semantic Indexing, and then construct a weighted graph (hidden topic graph) for each community (topic). By applying Neumann kernels to each hidden topic graph, we can rank documents on the basis of the communities in which they appear.

Collaboration

Top co-authors of Daichi Mochihashi.

Kunio Kashino
Nippon Telegraph and Telephone

Yuji Matsumoto
Nara Institute of Science and Technology

Hideki Asoh
National Institute of Advanced Industrial Science and Technology

Takayuki Nagai
University of Electro-Communications

Tomoaki Nakamura
University of Electro-Communications

Masataka Goto
National Institute of Advanced Industrial Science and Technology