Naohiro Tawara | Researchain

Publication

Featured research published by Naohiro Tawara.

International Conference on Acoustics, Speech, and Signal Processing | 2012

Fully Bayesian inference of multi-mixture Gaussian model and its evaluation using speaker clustering

Naohiro Tawara; Tetsuji Ogawa; Shinji Watanabe; Tetsunori Kobayashi

This study aims to verify effective optimization methods for estimating parametric, fully Bayesian models in speech processing. To that end, we investigate how the choice of optimization method for the multi-scale Gaussian mixture model, which is suitable for speaker clustering, affects clustering accuracy. The Markov chain Monte Carlo (MCMC)-based method was compared with the variational Bayesian method in a speaker clustering experiment. With a small amount of data, the MCMC-based method was more effective; with large-scale data (more than one million samples), the difference in clustering accuracy between the two methods decreased, and the MCMC-based method was computationally efficient.

International Conference on Acoustics, Speech, and Signal Processing | 2015

A comparative study of spectral clustering for i-vector-based speaker clustering under noisy conditions

Naohiro Tawara; Tetsuji Ogawa; Tetsunori Kobayashi

The present paper deals with speaker clustering for speech corrupted by noise. In general, the performance of speaker clustering depends strongly on how well the similarities between speech utterances can be measured. The recently proposed i-vector-based cosine similarity has yielded state-of-the-art performance in speaker clustering systems. However, this similarity often fails to capture speaker similarity under noisy conditions. We therefore examined the effectiveness of spectral clustering on i-vector-based similarity for speech corrupted by noise, because spectral clustering can provide robustness against noise through non-linear projection. Experimental comparisons demonstrated that spectral clustering yielded a significant improvement over conventional methods, such as agglomerative clustering and k-means clustering, under non-stationary noise conditions.
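As a rough illustration of the idea in this abstract, the following sketch clusters toy vectors into two groups with a two-way spectral partition built on cosine similarity. It is not the paper's implementation: the function names, the mapping of cosine similarity to a non-negative affinity, the use of power iteration, and the toy 2-D "i-vectors" are all assumptions made for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def spectral_bipartition(vectors, n_iter=500):
    """Split vectors into two clusters using the sign of the second
    eigenvector of the normalized affinity matrix D^-1/2 W D^-1/2."""
    n = len(vectors)
    # Assumption: map cosine similarity from [-1, 1] to an affinity in [0, 1].
    w = [[(1.0 + cosine_similarity(vectors[i], vectors[j])) / 2.0
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    m = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of m is proportional to sqrt(deg).
    total = sum(deg)
    top = [math.sqrt(d / total) for d in deg]

    def deflate(x):
        # Remove the component along the leading eigenvector.
        proj = sum(a * b for a, b in zip(x, top))
        return [a - proj * b for a, b in zip(x, top)]

    v = deflate([float(i + 1) for i in range(n)])
    for _ in range(n_iter):
        # Power iteration on (m + I), whose eigenvalues are non-negative.
        v = [sum(m[i][j] * v[j] for j in range(n)) + v[i] for i in range(n)]
        v = deflate(v)
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    return [0 if a >= 0 else 1 for a in v]

# Hypothetical toy data: two groups of 2-D vectors pointing in different directions.
ivectors = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
            [-0.1, 1.0], [0.0, 0.9], [0.1, 1.1]]
labels = spectral_bipartition(ivectors)
```

Which group receives label 0 is arbitrary, since an eigenvector is only defined up to sign. For more than two clusters, a common approach is to run k-means on several leading eigenvectors instead of thresholding a single one.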

APSIPA Transactions on Signal and Information Processing | 2015

A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large-scale data

Naohiro Tawara; Tetsuji Ogawa; Shinji Watanabe; Atsushi Nakamura; Tetsunori Kobayashi

An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization to make it possible to estimate the number of speakers. For this purpose, a framework of non-parametric Bayesian modeling is implemented with Markov chain Monte Carlo and incorporated into the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet process mixture model (UO-DPMM). The present paper demonstrates that UO-DPMM can be successfully applied to large-scale data and outperforms conventional hierarchical agglomerative clustering, especially for large numbers of utterances.
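One way to see how a Dirichlet process prior leaves the number of speakers open is through its Chinese restaurant process view: each utterance joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. The sketch below computes only these prior probabilities; the function name and the parameter `alpha` are assumptions for illustration, and a full sampler along the lines the abstract describes would also weight each option by the likelihood of the utterance under that cluster.

```python
def crp_assignment_probs(cluster_sizes, alpha):
    """Prior probabilities of assigning the next item to each existing
    cluster, followed by the probability of opening a new cluster."""
    total = sum(cluster_sizes) + alpha
    probs = [size / total for size in cluster_sizes]
    probs.append(alpha / total)  # last entry: a new, previously unseen cluster
    return probs

# Hypothetical state: two speakers so far, with 3 and 1 utterances.
probs = crp_assignment_probs([3, 1], alpha=1.0)
```

Because the last entry is never zero for a positive `alpha`, the sampler can always introduce another speaker, which is what allows the number of clusters to be inferred from the data rather than fixed in advance.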

International Workshop on Machine Learning for Signal Processing | 2013

Blocked Gibbs sampling based multi-scale mixture model for speaker clustering on noisy data

Naohiro Tawara; Tetsuji Ogawa; Shinji Watanabe; Atsushi Nakamura; Tetsunori Kobayashi

A novel sampling method is proposed for estimating a continuous multi-scale mixture model. The multi-scale mixture models we assume have a hierarchical structure in which each component of the mixture is represented by a Gaussian mixture model (GMM). In speaker modeling from speech, each GMM represents intra-speaker dynamics arising from differences in attributes such as phoneme context and the presence of non-stationary noise, while the mixture of GMMs (MoGMMs) represents inter-speaker dynamics arising from differences between speakers. Gibbs sampling is a powerful technique for estimating such hierarchically structured models, but it can easily become trapped in local optima, especially when the elemental GMMs have a complex structure. To solve this problem, a highly accurate and robust sampling method based on blocked Gibbs sampling and iterative conditional modes (ICM) is proposed and applied to reduce the singular solutions that arise in models with complex multi-modal distributions. In speaker clustering experiments under non-stationary noise, the proposed sampling-based model estimation improved clustering performance by 17% on average compared with conventional sampling-based methods.
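The difference between a Gibbs update and an ICM update for a single discrete label can be sketched as follows. This is a generic illustration rather than the paper's blocked sampler: the function names and the example log-weights are hypothetical, and the block structure over utterance-level and frame-level variables is not modeled here.

```python
import math
import random

def conditional_probs(log_weights):
    """Normalize unnormalized log-probabilities (log-sum-exp trick)."""
    peak = max(log_weights)
    weights = [math.exp(lw - peak) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

def gibbs_update(log_weights, rng):
    """Gibbs step: draw the new label from its conditional distribution."""
    probs = conditional_probs(log_weights)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def icm_update(log_weights):
    """ICM step: take the most probable label instead of sampling."""
    return max(range(len(log_weights)), key=lambda k: log_weights[k])

# Hypothetical conditional log-weights for three candidate labels.
log_weights = [-3.0, -0.5, -2.0]
rng = random.Random(0)
sampled_label = gibbs_update(log_weights, rng)  # stochastic choice
mode_label = icm_update(log_weights)            # deterministic choice
```

The Gibbs step keeps exploring alternative labels in proportion to their probability, while the ICM step always moves to the current conditional mode; combining the two is one way to trade exploration against stable convergence.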

Conference of the International Speech Communication Association | 2011

Speaker clustering based on utterance-oriented Dirichlet process mixture model

Naohiro Tawara; Shinji Watanabe; Tetsuji Ogawa; Tetsunori Kobayashi

International Conference on Acoustics, Speech, and Signal Processing | 2018

Speaker Invariant Feature Extraction for Zero-Resource Languages with Adversarial Learning

Taira Tsuchiya; Naohiro Tawara; Tetsuji Ogawa; Tetsunori Kobayashi

International Conference on Acoustics, Speech, and Signal Processing | 2018

Language Model Domain Adaptation Via Recurrent Neural Networks with Domain-Shared and Domain-Specific Representations

Tsuyoshi Morioka; Naohiro Tawara; Tetsuji Ogawa; Atsunori Ogawa; Tomoharu Iwata; Tetsunori Kobayashi

Asia Pacific Signal and Information Processing Association Annual Summit and Conference | 2017

Exploiting end of sentences and speaker alternations in language modeling for multiparty conversations

Hiroto Ashikawa; Naohiro Tawara; Atsunori Ogawa; Tomoharu Iwata; Tetsunori Kobayashi; Tetsuji Ogawa

Journal of The Japan Society for Precision Engineering | 2014

Improving classification accuracy of image categories using local descriptors with supplemental information

Kazuya Ueki; Youhei Shiraishi; Naohiro Tawara; Tetsunori Kobayashi

Conference of the International Speech Communication Association | 2012

Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model

Naohiro Tawara; Tetsuji Ogawa; Shinji Watanabe; Atsushi Nakamura; Tetsunori Kobayashi

Collaboration

Naohiro Tawara's collaborations.

Top Co-Authors

Tetsunori Kobayashi

Waseda University

Tetsuji Ogawa

Waseda University

Shinji Watanabe

Mitsubishi Electric Research Laboratories

Atsushi Nakamura

Nippon Telegraph and Telephone

Tomoharu Iwata

Nippon Telegraph and Telephone

Atsunori Ogawa

Nagoya University

Hiroto Ashikawa

Waseda University

Youhei Shiraishi

Waseda University
