Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Nobutaka Ono is active.

Publication


Featured research published by Nobutaka Ono.


international conference on acoustics, speech, and signal processing | 2009

Complex NMF: A new sparse representation for acoustic signals

Hirokazu Kameoka; Nobutaka Ono; Kunio Kashino; Shigeki Sagayama

This paper presents a new sparse representation for acoustic signals based on a mixing model defined in the complex-spectrum domain (where additivity holds), which allows us to extract recurrent patterns of magnitude spectra underlying the observed complex spectra, together with phase estimates of the constituent signals. An efficient iterative algorithm is derived, which reduces to the multiplicative update algorithm for non-negative matrix factorization developed by Lee and Seung under a particular condition.
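To illustrate the special case mentioned above, here is a minimal numpy sketch: when every component is forced to carry the observed phase, fitting the complex mixing model reduces to fitting the magnitude spectrogram with standard Lee-Seung multiplicative NMF updates. The toy spectrogram and all dimensions are arbitrary, and this is not the authors' full Complex NMF optimization, which also updates per-component phases.

```python
# Minimal sketch (not the authors' full algorithm): in the shared-phase
# special case, the complex mixing model X ~ sum_k W[:, k] H[k, :] e^{j phi}
# reduces to magnitude NMF, fitted here with Lee-Seung multiplicative updates.
import numpy as np

rng = np.random.default_rng(0)
F, T, K = 64, 100, 4
X = rng.standard_normal((F, T)) + 1j * rng.standard_normal((F, T))  # toy complex spectrogram

V = np.abs(X)
W = rng.random((F, K)) + 1e-3
H = rng.random((K, T)) + 1e-3
for _ in range(200):
    W *= (V @ H.T) / (W @ H @ H.T + 1e-12)   # Lee-Seung update for W
    H *= (W.T @ V) / (W.T @ W @ H + 1e-12)   # Lee-Seung update for H

X_hat = (W @ H) * np.exp(1j * np.angle(X))   # all components share the observed phase
print("relative residual:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```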


workshop on applications of signal processing to audio and acoustics | 2011

Stable and fast update rules for independent vector analysis based on auxiliary function technique

Nobutaka Ono

This paper presents stable and fast update rules for independent vector analysis (IVA) based on an auxiliary function technique. The algorithm alternates between two updates: 1) weighted covariance matrix updates and 2) demixing matrix updates, and it involves no tuning parameters such as a step size. A monotonic decrease of the objective function at each update is guaranteed. Experimental evaluation shows that the derived update rules yield faster convergence and better results than natural gradient updates.
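A compact numpy sketch of the two alternating steps is given below, following the widely published iterative-projection form of these updates: a weighted covariance per source, then a row-wise demixing update with normalization. The toy multichannel STFT and all sizes are placeholders, and no STFT front end or scaling back-projection is included.

```python
# Sketch of auxiliary-function IVA updates for a determined 2-source,
# 2-microphone mixture; X[f] holds STFT frames per frequency bin.
import numpy as np

rng = np.random.default_rng(1)
n_freq, n_src, n_frames = 8, 2, 500
X = rng.standard_normal((n_freq, n_src, n_frames)) \
    + 1j * rng.standard_normal((n_freq, n_src, n_frames))  # toy multichannel STFT

W = np.tile(np.eye(n_src, dtype=complex), (n_freq, 1, 1))  # demixing matrix per bin

for _ in range(30):
    Y = W @ X                                              # current source estimates
    r = np.sqrt((np.abs(Y) ** 2).sum(axis=0)) + 1e-12      # per-source norms across frequency
    for k in range(n_src):
        # step 1: covariance of the mixture weighted by 1/r_k per frame
        Vk = (X / r[k]) @ X.conj().transpose(0, 2, 1) / n_frames
        # step 2: iterative-projection update of the k-th demixing row
        ek = np.tile(np.eye(n_src)[:, [k]], (n_freq, 1, 1))
        wk = np.linalg.solve(W @ Vk, ek)
        scale = np.sqrt((wk.conj().transpose(0, 2, 1) @ Vk @ wk).real)
        W[:, k, :] = (wk / scale).conj()[:, :, 0]
```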


workshop on applications of signal processing to audio and acoustics | 2007

Sparseness-Based 2CH BSS using the EM Algorithm in Reverberant Environment

Yosuke Izumi; Nobutaka Ono; Shigeki Sagayama

In this paper, we propose a new approach to sparseness-based BSS that iteratively estimates the DOA and the time-frequency mask for each source through the EM algorithm under the sparseness assumption. Our method has the following characteristics: 1) it enables the introduction of physical observation models such as the diffuse sound field, because the likelihood is defined in the original signal domain rather than in a feature domain; 2) the power of the background noise need not be known in advance, since it is also a parameter estimated from the observed signal; 3) it is computationally efficient; 4) a common objective function is iteratively increased in the localization and separation steps, which correspond to the E-step and M-step, respectively. Although our framework is applicable to general N-channel BSS, we concentrate on the formulation for the particular case of two sensor inputs and show numerical simulation results.
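The sketch below illustrates the E-step/M-step alternation on a deliberately simplified model: each time-frequency point carries a scalar interchannel phase difference drawn from one of two Gaussians, and EM recovers soft masks and per-source delays. The Gaussian-on-phase-difference likelihood is a stand-in for illustration, not the paper's physical observation model.

```python
# Toy EM sketch of sparseness-based two-channel separation: each
# time-frequency bin belongs to one source; the E-step yields soft
# time-frequency masks, the M-step re-estimates each source's delay
# (a stand-in for DOA), variance, and prior weight.
import numpy as np

rng = np.random.default_rng(2)
n_bins = 2000
true_delay = np.array([-0.8, 0.5])                        # toy per-source phase slopes
z = rng.integers(0, 2, n_bins)                            # hidden source index per bin
obs = true_delay[z] + 0.1 * rng.standard_normal(n_bins)   # observed phase differences

mu = np.array([-0.1, 0.1])                                # initial delays
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])
for _ in range(50):
    # E-step: posterior mask for each source at each bin
    ll = -0.5 * (obs[:, None] - mu) ** 2 / var - 0.5 * np.log(var)
    mask = pi * np.exp(ll - ll.max(axis=1, keepdims=True))
    mask /= mask.sum(axis=1, keepdims=True)
    # M-step: update delays, variances, and priors from the masks
    nk = mask.sum(axis=0)
    mu = (mask * obs[:, None]).sum(axis=0) / nk
    var = (mask * (obs[:, None] - mu) ** 2).sum(axis=0) / nk
    pi = nk / n_bins

print("estimated delays:", np.sort(mu))
```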


workshop on applications of signal processing to audio and acoustics | 2009

Blind alignment of asynchronously recorded signals for distributed microphone array

Nobutaka Ono; Hitoshi Kohno; Nobutaka Ito; Shigeki Sagayama

In this paper, aiming to utilize independent recording devices as a distributed microphone array, we present a novel method for aligning recorded signals while localizing the microphones and sources. Unlike a conventional microphone array, signals recorded by independent devices have different time origins, and the microphone positions are generally unknown. To estimate both from the recorded signals alone, the time differences between channels for each source, which still include the differences of time origins, are detected, and an objective function defined by their squared errors is minimized. Simple iterative update rules for this purpose are derived through an auxiliary function approach. The validity of our approach is evaluated in a simulation experiment.
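To make the objective concrete, the toy below synthesizes time differences that mix true propagation times with per-device clock offsets, then fits microphone positions, source positions, and offsets by plain gradient descent on the squared error. Gradient descent here is a generic stand-in for the paper's auxiliary-function updates, and the geometry is only identifiable up to rigid motion and a global time shift.

```python
# Toy blind-alignment objective: tau[i, j] mixes the propagation time
# from source j to microphone i with microphone i's clock offset.
import numpy as np

rng = np.random.default_rng(3)
c = 340.0                                   # speed of sound [m/s]
mics = rng.uniform(0, 5, (4, 2))            # true microphone positions
srcs = rng.uniform(0, 5, (6, 2))            # true source positions
offs = rng.uniform(-0.01, 0.01, 4)          # true per-device time origins
tau = np.linalg.norm(mics[:, None] - srcs[None], axis=2) / c + offs[:, None]

m = rng.uniform(0, 5, (4, 2)); s = rng.uniform(0, 5, (6, 2)); o = np.zeros(4)
lr_pos, lr_off = 100.0, 0.1
for _ in range(5000):
    diff = m[:, None] - s[None]                  # (mics, srcs, 2)
    dist = np.linalg.norm(diff, axis=2) + 1e-9
    resid = dist / c + o[:, None] - tau          # model minus observed time difference
    grad = resid / (c * dist)                    # chain rule through the distance
    m -= lr_pos * (grad[:, :, None] * diff).sum(axis=1)
    s += lr_pos * (grad[:, :, None] * diff).sum(axis=0)
    o -= lr_off * resid.sum(axis=1)

print("squared-error objective:", float((resid ** 2).sum()))
```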


international conference on acoustics, speech, and signal processing | 2010

HMM-based approach for automatic chord detection using refined acoustic features

Yushi Ueda; Yuuki Uchiyama; Takuya Nishimoto; Nobutaka Ono; Shigeki Sagayama

We discuss an HMM-based method for detecting the chord sequence in musical acoustic signals using percussion-suppressed, Fourier-transformed chroma and delta-chroma features. To reduce the interference often caused by percussive sounds in popular music, we use the Harmonic/Percussive Sound Separation (HPSS) technique to suppress percussive sounds and emphasize harmonic components. We also take the Fourier transform of the chroma to approximately diagonalize the covariance matrix of the feature parameters, reducing the number of model parameters without degrading performance. It is shown that an HMM with the new features yields higher recognition rates (the best in the MIREX 2008 audio chord detection task) than one with conventional features.
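A minimal chord-HMM decoder is sketched below: Gaussian-like scores between chroma frames and chord templates feed a Viterbi search over a sticky transition matrix. The three chord templates, the transitions, and the toy observations are all illustrative placeholders; the paper's system uses trained HMMs on percussion-suppressed, Fourier-transformed chroma.

```python
# Minimal chord-HMM sketch: Viterbi decoding of a chord sequence
# from plain chroma vectors against triad templates.
import numpy as np

chords = ["C", "F", "G"]
templates = np.zeros((3, 12))
for i, root in enumerate([0, 5, 7]):                 # C, F, G major triads
    templates[i, [root, (root + 4) % 12, (root + 7) % 12]] = 1 / 3

A = np.full((3, 3), 0.05) + np.eye(3) * 0.85         # sticky chord transitions
rng = np.random.default_rng(4)
truth = [0, 0, 1, 1, 2, 2, 0, 0]
obs = np.array([templates[q] + 0.05 * rng.random(12) for q in truth])

logB = -((obs[:, None, :] - templates) ** 2).sum(axis=2)   # Gaussian log-score (up to constants)
delta = np.log(np.ones(3) / 3) + logB[0]
back = np.zeros((len(obs), 3), dtype=int)
for t in range(1, len(obs)):
    scores = delta[:, None] + np.log(A)              # scores[i, j]: from chord i to chord j
    back[t] = scores.argmax(axis=0)
    delta = scores.max(axis=0) + logB[t]

path = [int(delta.argmax())]
for t in range(len(obs) - 1, 0, -1):                 # backtrack the best state sequence
    path.append(back[t][path[-1]])
print([chords[i] for i in reversed(path)])
```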


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Single and Multiple F0 Contour Estimation Through Parametric Spectrogram Modeling of Speech in Noisy Environments

J. Le Roux; Hirokazu Kameoka; Nobutaka Ono; A. de Cheveigne; Shigeki Sagayama

This paper proposes a novel F0 contour estimation algorithm based on a precise parametric description of the voiced parts of speech derived from the power spectrum. The algorithm is able to perform in a wide variety of noisy environments as well as to estimate the F0s of cochannel concurrent speech. The speech spectrum is modeled as a sequence of spectral clusters governed by a common F0 contour expressed as a spline curve. These clusters are obtained by an unsupervised 2-D time-frequency clustering of the power density using a new formulation of the EM algorithm, and their common F0 contour is estimated at the same time. A smooth F0 contour is extracted for the whole utterance, linking together its voiced parts. A noise model is used to cope with nonharmonic background noise, which would otherwise interfere with the clustering of the harmonic portions of speech. We evaluate our algorithm in comparison with existing methods on several tasks, and show 1) that it is competitive on clean single-speaker speech, 2) that it outperforms existing methods in the presence of noise, and 3) that it outperforms existing methods for the estimation of multiple F0 contours of cochannel concurrent speech.
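The single-frame toy below illustrates the clustering idea: spectral mass is softly assigned to Gaussian clusters centered at multiples of a common F0, and F0 is re-estimated by a weighted least-squares fit over all harmonics. The full method additionally ties F0 across time with a spline contour and includes a noise model, neither of which is sketched here.

```python
# One-frame toy of harmonic clustering: E-step assigns spectral mass
# to Gaussian clusters at multiples of F0, M-step refits F0.
import numpy as np

rng = np.random.default_rng(5)
freqs = np.arange(1, 1000.0)
true_f0, n_harm, sigma = 137.0, 6, 5.0
power = sum(np.exp(-0.5 * ((freqs - h * true_f0) / sigma) ** 2)
            for h in range(1, n_harm + 1)) + 0.01 * rng.random(freqs.size)

f0 = 120.0                                           # rough initial estimate
h = np.arange(1, n_harm + 1)
for _ in range(20):
    centers = f0 * h
    resp = np.exp(-0.5 * ((freqs[:, None] - centers) / sigma) ** 2)
    resp /= resp.sum(axis=1, keepdims=True) + 1e-12  # E-step: harmonic responsibilities
    w = power[:, None] * resp                        # spectral mass per (bin, harmonic)
    # M-step: weighted least-squares fit of f0 from all harmonics
    f0 = (w * freqs[:, None] * h).sum() / (w * h ** 2).sum()

print("estimated F0:", round(f0, 1))
```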


international conference on acoustics, speech, and signal processing | 2010

Melody line estimation in homophonic music audio signals based on temporal-variability of melodic source

Hideyuki Tachibana; Takuma Ono; Nobutaka Ono; Shigeki Sagayama

Estimating the melody line in homophonic music audio signals is a challenging problem. Some of the difficulties derive from the presence of accompaniment. To overcome them, we propose a method that enhances melodic components in music audio signals. The enhancement algorithm exploits the fluctuation and shortness of melodic components, which we call temporal variability. We also discuss a melody tracking algorithm, which can be kept simple thanks to this preprocessing. In this paper, we describe the enhancement and tracking methods and show experimental results that support their effectiveness.
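As a rough illustration of separating components by their time-frequency behavior, the sketch below uses median-filter HPSS, a common simplified stand-in (the authors' HPSS uses iterative auxiliary-function updates instead): smoothing along time keeps steady harmonic lines, smoothing along frequency keeps percussive columns, and fluctuating, short-lived melodic components fit neither.

```python
# Median-filter stand-in for HPSS: the residual after removing both the
# time-smooth (harmonic) and frequency-smooth (percussive) parts keeps
# the fluctuating, short-lived components the melody tends to occupy.
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import stft

rng = np.random.default_rng(6)
x = rng.standard_normal(16000)                      # stand-in for a music signal
_, _, Z = stft(x, fs=16000, nperseg=512)
S = np.abs(Z)

H = median_filter(S, size=(1, 17))                  # smooth along time: steady harmonic part
P = median_filter(S, size=(17, 1))                  # smooth along frequency: percussive part
melodic = np.clip(S - H - P, 0.0, None)             # fluctuating residual
print("energy split:", [float((m ** 2).sum()) for m in (H, P, melodic)])
```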


international workshop on machine learning for signal processing | 2010

Convergence-guaranteed multiplicative algorithms for nonnegative matrix factorization with β-divergence

Masahiro Nakano; Hirokazu Kameoka; Jonathan Le Roux; Yu Kitano; Nobutaka Ono; Shigeki Sagayama

This paper presents a new multiplicative algorithm for nonnegative matrix factorization with β-divergence. The derived update rules have a similar form to those of the conventional multiplicative algorithm, differing only in the presence of an exponent term depending on β. Convergence is theoretically proven for any real-valued β based on the auxiliary function method. The convergence speed is experimentally investigated in comparison with previous work.
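The update rules the abstract refers to can be sketched directly. The exponent below (1/(2-β) for β<1, 1 for 1≤β≤2, 1/(β-1) for β>2) is the published convergence-guaranteeing choice from this line of work, while the matrix sizes, initialization, and test data are arbitrary.

```python
# Multiplicative beta-divergence NMF with the convergence-guaranteeing
# exponent: the usual ratio update is raised to a power phi(beta) that
# is 1 on [1, 2] and shrinks the step outside that range.
import numpy as np

def phi(beta):
    if beta < 1: return 1.0 / (2.0 - beta)
    if beta > 2: return 1.0 / (beta - 1.0)
    return 1.0

def beta_nmf(V, K, beta, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W, H = rng.random((F, K)) + 1e-3, rng.random((K, T)) + 1e-3
    e = phi(beta)
    for _ in range(n_iter):
        WH = W @ H + 1e-12
        W *= ((V * WH ** (beta - 2)) @ H.T / (WH ** (beta - 1) @ H.T)) ** e
        WH = W @ H + 1e-12
        H *= (W.T @ (V * WH ** (beta - 2)) / (W.T @ WH ** (beta - 1))) ** e
    return W, H

V = np.abs(np.random.default_rng(7).standard_normal((32, 60))) + 1e-3
for beta in (0.5, 1.0, 2.0, 3.0):                    # IS-like, KL, Euclidean, beyond
    W, H = beta_nmf(V, K=5, beta=beta)
    print(beta, float(np.linalg.norm(V - W @ H)))
```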


workshop on applications of signal processing to audio and acoustics | 2011

Bayesian nonparametric spectrogram modeling based on infinite factorial infinite hidden Markov model

Masahiro Nakano; Jonathan Le Roux; Hirokazu Kameoka; Tomohiko Nakamura; Nobutaka Ono; Shigeki Sagayama

This paper presents a Bayesian nonparametric latent source discovery method for music signal analysis. In audio signal analysis, an important goal is to decompose music signals into individual notes, with applications such as music transcription, source separation, and note-level manipulation. Recently, the use of latent variable decompositions, especially nonnegative matrix factorization (NMF), has been a very active area of research. These methods face two mutually dependent problems: first, instrument sounds often exhibit time-varying spectra, and capturing this time-varying nature is important for characterizing the diversity of each instrument; moreover, in many cases we do not know in advance the number of sources or which instruments are played. Conventional decompositions generally fail to cope with these issues, as they have difficulty automatically determining the number of sources and automatically grouping spectra into single events. We address both problems by developing a Bayesian nonparametric fusion of NMF and the hidden Markov model (HMM). Our model decomposes music spectrograms into an automatically estimated number of components, each of which consists of an HMM whose number of states is also estimated from the data.
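The generative structure being fused is easy to show in miniature: each latent component runs its own Markov chain over a few spectral states, and every frame sums the gain-scaled spectra of the active states. The finite forward-sampling toy below fixes the numbers of components and states, which are precisely the quantities the paper's Bayesian nonparametric model estimates from data.

```python
# Finite toy of the NMF-HMM fusion idea: sample per-component Markov
# state paths, then build the spectrogram as a gain-scaled sum of the
# active state spectra.  Inference is not sketched here.
import numpy as np

rng = np.random.default_rng(8)
F, T, K, S = 48, 100, 3, 2                      # bins, frames, components, states each
state_spectra = rng.random((K, S, F))           # per-component, per-state spectra
A = np.full((S, S), 0.1) + np.eye(S) * 0.8      # sticky state transitions (rows sum to 1)

V = np.zeros((F, T))
states = np.zeros((K, T), dtype=int)
for k in range(K):
    z = rng.integers(S)
    for t in range(T):
        z = rng.choice(S, p=A[z])               # advance this component's Markov chain
        states[k, t] = z
        V[:, t] += rng.gamma(2.0, 0.5) * state_spectra[k, z]   # gain-scaled state spectrum

print("spectrogram:", V.shape, "state paths:", states.shape)
```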


IEEE Transactions on Audio, Speech, and Language Processing | 2014

Singing voice enhancement in monaural music signals based on two-stage harmonic/percussive sound separation on multiple resolution spectrograms

Hideyuki Tachibana; Nobutaka Ono; Shigeki Sagayama

We propose a novel singing voice enhancement technique for monaural music audio signals, a challenging problem. Many singing voice enhancement techniques have been proposed recently; however, our approach is based on a quite different idea. We focus on the fluctuation of the singing voice and detect it by exploiting two differently resolved spectrograms: one with high temporal resolution and poor frequency resolution, the other with high frequency resolution and poor temporal resolution. On these two spectrograms, the shapes of fluctuating components are quite different. Based on this idea, we propose a singing voice enhancement technique that we call two-stage harmonic/percussive sound separation (HPSS). In this paper, we describe the details of two-stage HPSS and evaluate its performance. The experimental results show that the signal-to-distortion ratio (SDR), a commonly used criterion for this task, improved by around 4 dB, considerably more than existing methods achieve. In addition, we evaluated the method as preprocessing for melody estimation in music; the results show that our singing voice enhancement considerably improved the performance of a simple pitch estimation technique. These results demonstrate the effectiveness of the proposed method.
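The two-stage idea can be sketched with median-filter HPSS standing in for the authors' formulation: a short-window pass treats drums as percussive and all pitched content (voice included) as harmonic, and a long-window pass on that output pushes the fluctuating voice to the percussive side. The window lengths and the synthetic waveform below are placeholders.

```python
# Two-stage HPSS sketch on differently resolved spectrograms,
# with median filtering as a simplified HPSS stand-in.
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import istft, stft

def hpss(x, fs, nperseg):
    """One median-filter HPSS pass; returns (harmonic, percussive) waveforms."""
    _, _, Z = stft(x, fs=fs, nperseg=nperseg)
    S = np.abs(Z)
    H = median_filter(S, size=(1, 17))              # time-smooth: harmonic
    P = median_filter(S, size=(17, 1))              # frequency-smooth: percussive
    mask = H / (H + P + 1e-12)                      # soft harmonic mask
    _, xh = istft(Z * mask, fs=fs, nperseg=nperseg)
    _, xp = istft(Z * (1 - mask), fs=fs, nperseg=nperseg)
    return xh, xp

fs = 16000
x = np.random.default_rng(9).standard_normal(4 * fs)   # stand-in for a music mixture
pitched, _ = hpss(x, fs, nperseg=256)                  # short window: drums go to "percussive"
_, voice = hpss(pitched, fs, nperseg=4096)             # long window: voice goes to "percussive"
print("voice estimate samples:", voice.size)
```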

Collaboration


Dive into Nobutaka Ono's collaborations.

Top Co-Authors

Jonathan Le Roux

Mitsubishi Electric Research Laboratories

Daichi Kitamura

Graduate University for Advanced Studies
