Shigeki Sagayama | Researchain

Publication

Featured research published by Shigeki Sagayama.

IEEE Transactions on Audio, Speech, and Language Processing | 2007

Single and Multiple

J. Le Roux; Hirokazu Kameoka; Nobutaka Ono; A. de Cheveigne; Shigeki Sagayama

This paper proposes a novel F0 contour estimation algorithm based on a precise parametric description of the voiced parts of speech derived from the power spectrum. The algorithm performs in a wide variety of noisy environments and can also estimate the F0s of cochannel concurrent speech. The speech spectrum is modeled as a sequence of spectral clusters governed by a common F0 contour expressed as a spline curve. These clusters are obtained by an unsupervised 2-D time-frequency clustering of the power density using a new formulation of the EM algorithm, and their common F0 contour is estimated at the same time. A smooth F0 contour is extracted for the whole utterance, linking together its voiced parts. A noise model is used to cope with nonharmonic background noise, which would otherwise interfere with the clustering of the harmonic portions of speech. We evaluate our algorithm in comparison with existing methods on several tasks, and show 1) that it is competitive on clean single-speaker speech, 2) that it outperforms existing methods in the presence of noise, and 3) that it outperforms existing methods for the estimation of multiple F0 contours of cochannel concurrent speech.
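As a rough illustration of the harmonic-cluster idea in this abstract, the sketch below models one frame's power spectrum as Gaussian bumps at integer multiples of a candidate F0 and picks the candidate that best matches an observed spectrum by grid search. This is a deliberately simplified stand-in, not the paper's EM-based spline-contour algorithm; the function names, the Gaussian width, the number of harmonics, and the candidate grid are all assumptions made for the example.

```python
import math

def harmonic_model(f0, freqs, n_harmonics=5, width=10.0):
    """Toy power spectrum: sum of Gaussian bumps centered at k * f0 (Hz)."""
    return [
        sum(math.exp(-0.5 * ((f - k * f0) / width) ** 2)
            for k in range(1, n_harmonics + 1))
        for f in freqs
    ]

def estimate_f0(spectrum, freqs, candidates):
    """Grid search: return the candidate F0 whose model correlates best."""
    def score(f0):
        model = harmonic_model(f0, freqs)
        dot = sum(s * m for s, m in zip(spectrum, model))
        norm = math.sqrt(sum(m * m for m in model))
        # Normalize by the model's energy so that candidates with more
        # in-band harmonics are not favored automatically.
        return dot / norm if norm > 0 else 0.0
    return max(candidates, key=score)

# Hypothetical setup: a 0-995 Hz frequency axis and a synthetic frame at 120 Hz.
freqs = [float(f) for f in range(0, 1000, 5)]
observed = harmonic_model(120.0, freqs)
candidates = [float(f) for f in range(80, 301, 5)]
estimated = estimate_f0(observed, freqs, candidates)
print(estimated)
```

The sketch handles a single clean frame only. The abstract's method additionally ties frames together through a smooth contour and adds a noise model, neither of which is attempted here.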

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013

F0

Masato Tsuchiya; Kazuki Ochiai; Hirokazu Kameoka; Shigeki Sagayama

This paper proposes a Bayesian approach for automatic music transcription of polyphonic MIDI signals based on generative modeling of onset occurrences of musical notes. Automatic music transcription involves two interdependent subproblems: rhythm recognition and tempo estimation. When we listen to music, we are able to recognize its rhythm and tempo (or beat location) fairly easily even though there is ambiguity in determining the individual note values and tempo. This may be made possible through our empirical knowledge about the rhythm patterns and tempo variations that can occur in music. To automate the process of recognizing the rhythm and tempo of music, we propose modeling the generative process of a MIDI signal of polyphonic music by combining two sub-processes: one by which a musically natural tempo curve is generated, and one by which a set of note onset positions is generated based on a 2-dimensional rhythm tree structure representation of music. We then develop a parameter inference algorithm for the proposed model. We show some of the transcription results obtained with the present method.
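The ambiguity between note values and tempo mentioned in this abstract can be seen in a toy example: the same onset times fit more than one (tempo, note value) interpretation equally well. The sketch below snaps onset times to a sixteenth-note grid at a fixed candidate tempo and reports the residual. It is a naive grid quantizer used only for illustration, not the Bayesian rhythm-tree model proposed in the paper; the function name, the grid resolution, and the onset data are assumptions.

```python
def quantize_onsets(onsets_sec, bpm, steps_per_beat=4):
    """Snap onset times (seconds) to the nearest grid step at a fixed tempo.

    Returns (positions_in_beats, total_abs_error_in_seconds).
    """
    step_sec = 60.0 / bpm / steps_per_beat
    positions, error = [], 0.0
    for t in onsets_sec:
        n = round(t / step_sec)              # nearest grid index
        positions.append(n / steps_per_beat)  # grid index -> beats
        error += abs(t - n * step_sec)
    return positions, error

# Hypothetical onset times in seconds.
onsets = [0.0, 0.5, 1.0, 1.25, 1.5, 2.0]
at_120, err_120 = quantize_onsets(onsets, bpm=120)
at_60, err_60 = quantize_onsets(onsets, bpm=60)
print(at_120, err_120)
print(at_60, err_60)
```

Both tempi explain these onsets with zero residual: at 120 BPM the 0.5 s gaps read as quarter notes, at 60 BPM as eighth notes. Timing alone cannot choose between them, which is why prior knowledge about plausible rhythms and tempo curves is needed.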

Archive | 2005