Publication


Featured research published by Masanori Morise.


International Conference on Acoustics, Speech, and Signal Processing | 2008

Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation

Hideki Kawahara; Masanori Morise; Toru Takahashi; Ryuichi Nisimura; Toshio Irino; Hideki Banno

A simple new method for estimating temporally stable power spectra is introduced to provide a unified basis for computing an interference-free spectrum, the fundamental frequency (F0), as well as aperiodicity estimation. F0 adaptive spectral smoothing and cepstral liftering based on consistent sampling theory are employed for interference-free spectral estimation. A perturbation spectrum, calculated from temporally stable power and interference-free spectra, provides the basis for both F0 and aperiodicity estimation. The proposed approach eliminates ad-hoc parameter tuning and the heavy demand on computational power, from which STRAIGHT has suffered in the past.


International Conference on Acoustics, Speech, and Signal Processing | 2009

Temporally variable multi-aspect auditory morphing enabling extrapolation without objective and perceptual breakdown

Hideki Kawahara; Ryuichi Nisimura; Toshio Irino; Masanori Morise; Toru Takahashi; Hideki Banno

A generalized framework of auditory morphing based on the speech analysis, modification and resynthesis system STRAIGHT is proposed that enables each morphing rate of representational aspects to be a function of time, including the temporal axis itself. Two types of algorithms were derived: an incremental algorithm for real-time manipulation of morphing rates and a batch processing algorithm for off-line post-production applications. By defining morphing in terms of the derivative of mapping functions in the logarithmic domain, breakdown of morphing resynthesis found in the previous formulation in the case of extrapolations was eliminated. A method to alleviate perceptual defects in extrapolation is also introduced.
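
Why a logarithmic-domain formulation helps with extrapolation can be seen in a toy example. The sketch below is not the paper's formulation (which is defined through derivatives of mapping functions); it only compares linear and log-domain blending of a positive parameter such as F0, with a morphing rate that varies per frame. The function names and the sample values are assumptions made for illustration.

```python
import math

def morph_linear(a, b, rate):
    """Blend two values linearly; rate outside [0, 1] extrapolates."""
    return (1.0 - rate) * a + rate * b

def morph_log(a, b, rate):
    """Blend two positive values in the logarithmic domain."""
    return math.exp((1.0 - rate) * math.log(a) + rate * math.log(b))

def morph_track(track_a, track_b, rates):
    """Apply a per-frame (time-varying) morphing rate to two parameter tracks."""
    return [morph_log(a, b, r) for a, b, r in zip(track_a, track_b, rates)]

# Hypothetical F0 tracks in Hz and a morphing rate that changes over time.
f0_a = [100.0, 100.0, 100.0]
f0_b = [200.0, 200.0, 200.0]
rates = [0.0, 0.5, -1.5]  # the last frame extrapolates beyond voice A

morphed = morph_track(f0_a, f0_b, rates)
linear_extrapolated = morph_linear(100.0, 200.0, -1.5)
```

With a rate of -1.5, the linear blend produces a negative F0, which has no physical meaning, while the log-domain blend stays positive for any rate.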


Logopedics Phoniatrics Vocology | 2009

Noh voice quality

Osamu Fujimura; Kiyoshi Honda; Hideki Kawahara; Yasuyuki Konparu; Masanori Morise; Justin C. Williams

In Noh, a traditional performing art of Japan, extremely expressive voice quality is used to convey an emotional message. Aperiodicity of voice appears responsible for these special effects. Acoustic signals were recorded for selected portions of dramatic singing in order to study the acoustic effects of delicate voice control by a master of the Konparu school. Using a signal analysis-synthesis algorithm, TANDEM-STRAIGHT, to represent multiple candidates for pitch perception, signals deviating from the harmonic structure have been successfully displayed, corresponding to auditory impressions of pitch movements, even when narrow-band spectrograms failed to show the perceived events. Strong interaction between vocal tract resonance and vocal fold vibration seems to play a major role in producing these expressive voice qualities.


International Conference on Entertainment Computing | 2009

v.morish'09: A Morphing-Based Singing Design Interface for Vocal Melodies

Masanori Morise; Masato Onishi; Hideki Kawahara; Haruhiro Katayose

This paper describes a singing design method based on morphing, and the design and development of an intuitive interface to assist morphing-based singing design. The proposed interface has a function for real-time morphing, based on simple operation with a mouse, and an editor to control the singing features in detail. The user is able to enhance singing voices efficiently by using these two functions. In this paper, we discuss the requirements for an interface to assist in morphing-based singing design, and develop an interface to fulfill them.


International Conference on Acoustics, Speech, and Signal Processing | 2011

An interference-free representation of instantaneous frequency of periodic signals and its application to F0 extraction

Hideki Kawahara; Toshio Irino; Masanori Morise

An interference-free representation of the instantaneous frequency of constituent harmonic components of periodic signals is introduced. The power weighted average instantaneous frequency of a band-pass filter yields this property when the effective passband of the filter covers up to two harmonic components and the two windows used in averaging are separated by a half pitch period. The proposed representation eliminates the abrupt changes found in usual instantaneous frequency representations and is applicable to any periodic signals consisting of multiple harmonic components. An F0 extractor of voiced sounds based on this representation is introduced as an example of prospective applications.
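
The effect of the half-pitch-period separation can be checked on a toy signal. The sketch below assumes a complex signal made of exactly two harmonics (amplitudes `a1` and `a2`, chosen arbitrarily) and evaluates it at two instants rather than through the band-pass filter and windows of the paper, so it illustrates the cancellation only, not the proposed extractor.

```python
import cmath
import math

def power_and_weighted_if(t, f0, a1, a2):
    """Return (power, power * instantaneous frequency in Hz) at time t."""
    w = 2.0 * math.pi * f0
    z = a1 * cmath.exp(1j * w * t) + a2 * cmath.exp(2j * w * t)
    dz = 1j * w * a1 * cmath.exp(1j * w * t) + 2j * w * a2 * cmath.exp(2j * w * t)
    power = abs(z) ** 2
    # Im(dz * conj(z)) equals |z|^2 times the phase derivative (rad/s).
    weighted = (dz * z.conjugate()).imag / (2.0 * math.pi)
    return power, weighted

def plain_if(t, f0, a1, a2):
    """Instantaneous frequency at a single instant."""
    power, weighted = power_and_weighted_if(t, f0, a1, a2)
    return weighted / power

def paired_if(t, f0, a1, a2):
    """Power-weighted average over two instants half a pitch period apart."""
    half_period = 0.5 / f0
    p0, w0 = power_and_weighted_if(t, f0, a1, a2)
    p1, w1 = power_and_weighted_if(t + half_period, f0, a1, a2)
    return (w0 + w1) / (p0 + p1)

f0, a1, a2 = 100.0, 1.0, 0.8  # hypothetical values
times = [k * 0.001 for k in range(10)]  # one pitch period in 1 ms steps
plain = [plain_if(t, f0, a1, a2) for t in times]
paired = [paired_if(t, f0, a1, a2) for t in times]
expected = f0 * (a1 ** 2 + 2 * a2 ** 2) / (a1 ** 2 + a2 ** 2)
```

For this toy signal the cross term between the two harmonics changes sign after half a pitch period, so it cancels in both the numerator and the denominator. The paired value stays at `expected` for every `t`, whereas the single-instant value swings widely within one period.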


Speech Communication | 2015

CheapTrick, a spectral envelope estimator for high-quality speech synthesis

Masanori Morise

A spectral envelope estimation algorithm is presented to achieve high-quality speech synthesis. The concept of the algorithm is to obtain an accurate and temporally stable spectral envelope. The algorithm uses fundamental frequency (F0) and consists of F0-adaptive windowing, smoothing of the power spectrum, and spectral recovery in the quefrency domain. Objective and subjective evaluations were carried out to demonstrate the effectiveness of the proposed algorithm. Results of both evaluations indicated that the proposed algorithm can obtain a temporally stable spectral envelope and synthesize speech with higher sound quality than speech synthesized with other algorithms.
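
The first of the three steps, F0-adaptive windowing, means that the analysis window length follows the pitch period instead of being fixed. A minimal sketch follows; the abstract does not state the window shape or length, so the Hann shape and the default of three pitch periods are assumptions here, and the smoothing and quefrency-domain steps are not shown.

```python
import math

def f0_adaptive_hann(fs, f0, periods=3.0):
    """Return a Hann window spanning `periods` pitch periods.

    fs: sampling rate in Hz; f0: fundamental frequency in Hz.
    `periods=3.0` is an assumed default, not taken from the abstract.
    """
    length = int(round(periods * fs / f0))
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]

# A lower F0 gives a longer window; a higher F0 gives a shorter one.
window_low = f0_adaptive_hann(16000, 100.0)
window_high = f0_adaptive_hann(16000, 200.0)
```

Tying the window to the pitch period keeps the number of glottal cycles per frame roughly constant across speakers and pitches.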


Speech Communication | 2016

D4C, a band-aperiodicity estimator for high-quality speech synthesis

Masanori Morise

An algorithm is proposed for estimating the band aperiodicity of speech signals, where aperiodicity is defined as the power ratio between the speech signal and the aperiodic component of the signal. Since this power ratio depends on the frequency band, the aperiodicity should be given for several frequency bands. The proposed D4C (Definitive Decomposition Derived Dirt-Cheap) estimator is based on an extension of a temporally static group delay representation of periodic signals. In this paper, the principle and algorithm of D4C are explained, and its effectiveness is discussed with reference to objective and subjective evaluations. Evaluation results indicate that a speech synthesis system using D4C can synthesize natural speech better than ones using other algorithms.
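
The quantity being estimated can be illustrated with a synthetic signal whose aperiodic part is known. The sketch below is not D4C, which has to estimate the ratio from the speech signal alone. It computes a band-wise power ratio directly from a separately available noise component, taken here as aperiodic power over total power (an assumed direction of the ratio). The band edges, signal length, and noise level are arbitrary illustrative choices.

```python
import cmath
import math
import random

def dft_power(x):
    """Power spectrum of x for bins 0 .. len(x)//2 - 1 (naive DFT)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2)]

def band_ratio(total, aperiodic, band_edges):
    """Aperiodic-to-total power ratio for each band [edge_i, edge_i+1)."""
    p_total = dft_power(total)
    p_aper = dft_power(aperiodic)
    return [sum(p_aper[lo:hi]) / sum(p_total[lo:hi])
            for lo, hi in zip(band_edges[:-1], band_edges[1:])]

n = 256
rng = random.Random(0)  # fixed seed so the example is repeatable
periodic = [math.sin(2.0 * math.pi * 8 * t / n) for t in range(n)]  # bin 8
noise = [rng.gauss(0.0, 0.1) for _ in range(n)]
signal = [p + z for p, z in zip(periodic, noise)]

# Band 0 (bins 1-15) contains the sinusoid; band 1 (bins 16-31) does not.
ratios = band_ratio(signal, noise, [1, 16, 32])
```

The band containing the periodic component has a ratio close to zero, while the band containing only noise has a ratio close to one, which is why a single full-band value is not enough.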


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013

Temporally variable multi-aspect N-way morphing based on interference-free speech representations

Hideki Kawahara; Masanori Morise; Hideki Banno; Verena G. Skuk

Voice morphing is a powerful tool for exploratory research and various applications. A temporally variable multi-aspect morphing is extended to enable morphing of arbitrarily many voices in a single-step procedure. The proposed method is implemented based on interference-free representations of periodic signals and found to yield highly natural-sounding manipulated voices, which are useful for investigating human perception of voice. The formulation of the proposed method is general enough to be applicable to other representations and easily modified depending on application needs.


International Conference on Acoustics, Speech, and Signal Processing | 2013

Higher order waveform symmetry measure and its application to periodicity detectors for speech and singing with fine temporal resolution

Hideki Kawahara; Masanori Morise; Ryuichi Nisimura; Toshio Irino

Another simple and high-speed F0 extractor with high temporal resolution based on our previous proposal has been developed by adding a higher-order symmetry measure. This extension made the proposed method significantly more robust than the previous one. The proposed method is a detector of the lowest prominent sinusoidal component. It can use several F0 refinement procedures when the signal is the sum of harmonic sinusoidal components. The refinement procedure presented here is based on a stable representation of instantaneous frequency of periodic signals. The whole procedure, implemented in MATLAB, runs faster than real time on usual PCs for sounds sampled at 44,100 Hz. Application of the proposed algorithm revealed that rapid temporal modulations in both F0 trajectory and spectral envelope typically exist in expressive voices such as those used in lively singing performance.


International Conference on Acoustics, Speech, and Signal Processing | 2012

Analysis and synthesis of strong vocal expressions: Extension and application of audio texture features to singing voice

Hideki Kawahara; Masanori Morise

Realistic reconstruction and manipulation of strong vocal expressions found in singing voices is a challenging and exciting topic. A speech analysis, modification and resynthesis framework based on interference-free power spectral and instantaneous frequency representations for periodic sounds is extended for handling such voices. Strong expressions are typically characterized by rapid variations in excitation timing and strength as well as complex structured excitation. Three types of excitation source extractors are revised and introduced to handle them. Preliminary tests successfully replicated strong vocal expressions. Also, additional attribute representations for modifying excitation and spectral information based on audio texture features are briefly discussed.

Collaboration


Researchers who have collaborated with Masanori Morise.

Top Co-Authors

Kenji Ozawa

University of Yamanashi

Ken-Ichi Sakakibara

Health Sciences University of Hokkaido
