Publication


Featured research published by Yamato Ohtani.


International Conference on Acoustics, Speech, and Signal Processing | 2010

Non-parallel training for many-to-many eigenvoice conversion

Yamato Ohtani; Tomoki Toda; Hiroshi Saruwatari; Kiyohiro Shikano

This paper presents a novel method for training an eigenvoice Gaussian mixture model (EV-GMM) that makes effective use of non-parallel data sets for many-to-many eigenvoice conversion, a technique for converting an arbitrary source speaker’s voice into an arbitrary target speaker’s voice. In the proposed method, an initial EV-GMM is trained with the conventional method using parallel data sets consisting of a single reference speaker and multiple pre-stored speakers. The initial EV-GMM is then further refined using non-parallel data sets that include a larger number of pre-stored speakers, treating the reference speaker’s voices as hidden variables. The experimental results demonstrate that the proposed method yields significant quality improvements in converted speech by making it possible to use data from a larger number of pre-stored speakers.


Journal of the Acoustical Society of America | 2006

Evaluation of eigenvoice conversion based on Gaussian mixture model

Yamato Ohtani; Tomoki Toda; Hiroshi Saruwatari; Kiyohiro Shikano

Eigenvoice conversion (EVC) has been proposed as a new framework of voice conversion (VC) based on the Gaussian mixture model (GMM) [Toda et al., “Eigenvoice Conversion Based on Gaussian Mixture Model,” ICSLP, Pittsburgh, Sept. 2006]. This paper evaluates the performance of EVC in converting one source speaker’s voice to arbitrary target speakers’ voices. This framework trains a canonical GMM (EV-GMM) in advance using multiple parallel data sets consisting of utterance pairs of the source speaker and many pre-stored target speakers. The model is then adapted to a specific target speaker by estimating a small number of free parameters from a few utterances of that speaker. This paper compares the spectral distortion between converted and target voices in EVC with that of conventional GMM-based VC while varying the amount of training data and the number of mixtures. Results show that EVC outperforms conventional VC when only small amounts of training data are available. EVC can effectively train a complex conversion model using the ...
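
The abstract above compares spectral distortion between converted and target voices but does not name the exact measure. As a minimal sketch, assuming mel-cepstral distortion (a measure commonly used in voice conversion work) and assuming the converted and target frames are already time-aligned, the comparison could look like this. The function names are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def mel_cepstral_distortion(converted: Sequence[float],
                            target: Sequence[float]) -> float:
    """Mel-cepstral distortion in dB for one pair of aligned frames.

    Assumption: this measure is used for illustration only; the abstract
    says "spectral distortion" without specifying the formula.
    """
    if len(converted) != len(target):
        raise ValueError("frames must have the same number of coefficients")
    squared_error = sum((c - t) ** 2 for c, t in zip(converted, target))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared_error)


def mean_distortion(converted_frames: Sequence[Sequence[float]],
                    target_frames: Sequence[Sequence[float]]) -> float:
    """Average the per-frame distortion over an aligned utterance."""
    if not converted_frames or len(converted_frames) != len(target_frames):
        raise ValueError("need the same non-zero number of frames on both sides")
    total = sum(mel_cepstral_distortion(c, t)
                for c, t in zip(converted_frames, target_frames))
    return total / len(converted_frames)
```

A lower average indicates that the converted spectra are closer to the target speaker's spectra, which is the sense in which one conversion method would be said to outperform another in such a comparison.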


Conference of the International Speech Communication Association | 2016

Voice Quality Control Using Perceptual Expressions for Statistical Parametric Speech Synthesis Based on Cluster Adaptive Training

Yamato Ohtani; Koichiro Mori; Masahiro Morita

This paper describes a novel method for controlling the voice quality of synthetic speech using cluster adaptive training (CAT). In this method, we model voice quality factors labeled with perceptual expressions such as “Gender,” “Age,” and “Brightness.” In advance, we obtain intensity scores for the perceptual expressions by conducting a listening test that evaluates differences in voice quality between the synthetic speech of the average voice and that of the target. We then build perceptual expression (PE) clusters, which we call PE models (PEMs), under the conditions that the average voice model is used as the bias cluster and the PE intensity scores are employed as the CAT weights. In synthesis, we can generate controlled synthetic speech by a linear combination of PEMs and the existing speaker’s model. Subjective results demonstrate that the proposed method can control voice quality with PEs in many cases and that the target synthetic speech modified by PEMs achieves comparatively good speech quality.
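
The synthesis step described in the abstract above, a linear combination of PEMs and an existing speaker's model, can be illustrated with a small sketch. It assumes each model is reduced to a single mean vector and that the user supplies one intensity weight per perceptual expression. The function name, the vectors, and the weights are hypothetical values chosen for illustration, not data or code from the paper.

```python
from typing import Dict, List


def apply_perceptual_expressions(speaker_mean: List[float],
                                 pe_models: Dict[str, List[float]],
                                 weights: Dict[str, float]) -> List[float]:
    """Return speaker_mean + sum_k weights[k] * pe_models[k].

    Assumption: each PE model is represented by one mean vector with the
    same dimensionality as the speaker's mean vector.
    """
    controlled = list(speaker_mean)  # copy so the input is not modified
    for name, weight in weights.items():
        pe_vector = pe_models[name]  # KeyError if the expression is unknown
        if len(pe_vector) != len(controlled):
            raise ValueError(f"dimension mismatch for PE model {name!r}")
        for d, value in enumerate(pe_vector):
            controlled[d] += weight * value
    return controlled


# Hypothetical 3-dimensional example.
speaker = [1.0, 2.0, 3.0]
pems = {
    "Gender": [0.5, 0.0, -0.5],
    "Age": [0.0, 1.0, 0.0],
    "Brightness": [0.2, 0.2, 0.2],
}
controlled = apply_perceptual_expressions(speaker, pems,
                                          {"Gender": 2.0, "Age": -1.0})
print(controlled)  # [2.0, 1.0, 2.0]
```

Expressions left out of the weight dictionary contribute nothing, so the speaker's model is returned unchanged when no weights are given.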


Conference of the International Speech Communication Association | 2006

Eigenvoice Conversion Based on Gaussian Mixture Model

Tomoki Toda; Yamato Ohtani; Kiyohiro Shikano


Conference of the International Speech Communication Association | 2006

Maximum Likelihood Voice Conversion Based on GMM with STRAIGHT Mixed Excitation

Yamato Ohtani; Tomoki Toda; Hiroshi Saruwatari; Kiyohiro Shikano


International Conference on Acoustics, Speech, and Signal Processing | 2007

One-to-Many and Many-to-One Voice Conversion Based on Eigenvoices

Tomoki Toda; Yamato Ohtani; Kiyohiro Shikano


Conference of the International Speech Communication Association | 2008

Low-Delay Voice Conversion Based on Maximum Likelihood Estimation of Spectral Parameter Trajectory

Takashi Muramatsu; Yamato Ohtani; Tomoki Toda; Hiroshi Saruwatari; Kiyohiro Shikano


Conference of the International Speech Communication Association | 2009

Many-to-many eigenvoice conversion with reference voice

Yamato Ohtani; Tomoki Toda; Hiroshi Saruwatari; Kiyohiro Shikano


Conference of the International Speech Communication Association | 2010

Adaptive Voice-Quality Control Based on One-to-Many Eigenvoice Conversion

Kumi Ohta; Tomoki Toda; Yamato Ohtani; Hiroshi Saruwatari; Kiyohiro Shikano


IEICE Transactions on Information and Systems | 2010

Adaptive Training for Voice Conversion Based on Eigenvoices

Yamato Ohtani; Tomoki Toda; Hiroshi Saruwatari; Kiyohiro Shikano

Collaboration


Top Co-Authors


Kiyohiro Shikano

National Archives and Records Administration


Masatsune Tamura

Tokyo Institute of Technology


Daisuke Tani

Nara Institute of Science and Technology


Kumi Ohta

Nara Institute of Science and Technology
