Ken O'Hanlon
Queen Mary University of London
Publication
Featured research published by Ken O'Hanlon.
international conference on acoustics, speech, and signal processing | 2012
Ken O'Hanlon; Hidehisa Nagano; Mark D. Plumbley
Sparse representations have previously been applied to the automatic music transcription (AMT) problem. Structured sparsity, such as group and molecular sparsity, allows the introduction of prior knowledge to sparse representations. Molecular sparsity has previously been proposed for AMT; however, the use of greedy group sparsity has not previously been proposed for this problem. We propose a greedy sparse pursuit based on nearest subspace classification for groups with coherent blocks, set in a non-negative framework, and apply it to AMT. We further propose an enhanced molecular variant of this group sparse algorithm and demonstrate the effectiveness of this approach.
international conference on acoustics, speech, and signal processing | 2014
Ken O'Hanlon; Mark D. Plumbley
Non-negative Matrix Factorisation (NMF) is a popular tool in musical signal processing. However, problems using this methodology in the context of Automatic Music Transcription (AMT) have been noted, resulting in the proposal of supervised and constrained variants of NMF for this purpose. Group sparsity has previously been seen to be effective for AMT when used with stepwise methods. In this paper, group sparsity is introduced to supervised NMF decompositions, and a dictionary tuning approach to AMT is proposed based upon group sparse NMF using the β-divergence. Experimental results are given showing improved AMT results over the state-of-the-art NMF-based AMT system.
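For orientation, supervised NMF with the β-divergence can be sketched as below. This is only the standard multiplicative update for the activations with a fixed dictionary, not the group sparse or dictionary tuning method of the paper; the function name `supervised_nmf` and its parameters are illustrative assumptions.

```python
import numpy as np

def supervised_nmf(V, W, beta=1.0, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H so that V ~ W @ H, with W fixed.

    V: (n_freq, n_frames) non-negative spectrogram.
    W: (n_freq, n_atoms) non-negative dictionary of note templates.
    beta: 2 gives Euclidean distance, 1 gives Kullback-Leibler divergence.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive random start, as multiplicative updates keep zeros at zero.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        numerator = W.T @ (WH ** (beta - 2) * V)
        denominator = W.T @ (WH ** (beta - 1)) + eps
        H *= numerator / denominator  # element-wise update preserves non-negativity
    return H
```

In a transcription setting the rows of `H` would be read as pitch activations over time, typically followed by thresholding, which is outside the scope of this sketch.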
international conference on acoustics, speech, and signal processing | 2013
Ken O'Hanlon; Mark D. Plumbley
Automatic Music Transcription (AMT) seeks to understand a musical piece in terms of note activities. Matrix decomposition methods are often used for AMT, seeking to decompose a spectrogram over a dictionary matrix of note-specific template vectors. The performance of these methods can suffer due to the large harmonic overlap found in tonal musical spectra. We propose a row weighting scheme that transforms each spectrogram frame and the dictionary, with the weighting determined by the effective correlations in the decomposition. Experiments show improved AMT performance.
IEEE Transactions on Audio, Speech, and Language Processing | 2016
Ken O'Hanlon; Hidehisa Nagano; Nicolas Keriven; Mark D. Plumbley
Automatic music transcription (AMT) can be performed by deriving a pitch-time representation through decomposition of a spectrogram with a dictionary of pitch-labelled atoms. Typically, non-negative matrix factorisation (NMF) methods are used to decompose magnitude spectrograms. One atom is often used to represent each note. However, the spectrum of a note may change over time. Previous research considered this variability using different atoms to model specific parts of a note, or large dictionaries comprised of datapoints from the spectrograms of full notes. In this paper, the use of subspace modelling of note spectra is explored, with group sparsity employed as a means of coupling activations of related atoms into a pitched subspace. Stepwise and gradient-based methods for non-negative group sparse decompositions are proposed. Finally, a group sparse NMF approach is used to tune a generic harmonic subspace dictionary, leading to improved NMF-based AMT results.
international workshop on machine learning for signal processing | 2013
Nicolas Keriven; Ken O'Hanlon; Mark D. Plumbley
Musical signals can be thought of as being sparse and structured, with few elements active at a given instant and temporal continuity of active elements observed. Greedy algorithms such as Orthogonal Matching Pursuit (OMP), and structured variants, have previously been proposed for Automatic Music Transcription (AMT); however, some problems have been noted. Hence, we propose the use of a backwards elimination strategy to perform sparse decompositions for AMT, in particular with a proposed alternative sparse cost function. The main advantage of this approach, however, is the ease with which structure can be incorporated. The use of group sparsity is shown to give increased AMT performance, while a molecular method incorporating onset information is seen to provide further improvements with little computational effort.
international conference on acoustics, speech, and signal processing | 2016
Ken O'Hanlon; Mark B. Sandler
Performance of Non-negative Matrix Factorisation (NMF) can be diminished when the underlying factors consist of elements that overlap in the matrix to be factorised. The use of ℓ0 sparsity may improve NMF; however, such approaches are generally limited to Euclidean distance. We have previously proposed a stepwise ℓ0 method for Hellinger distance, leading to improved sparse NMF. We extend sparse Hellinger NMF by proposing an alternative Iterative Hard Thresholding sparse approximation method. Experimental validation of the proposed approach is given, with a large improvement over NMF methods when learning is performed on a large dataset.
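As background, the generic Iterative Hard Thresholding idea can be sketched as follows. Note that this is the textbook Euclidean form, not the Hellinger distance variant proposed in the paper, and it does not enforce non-negativity; the names `hard_threshold` and `iht` are assumptions made for illustration.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    out = np.zeros_like(x)
    if k > 0:
        idx = np.argpartition(np.abs(x), -k)[-k:]
        out[idx] = x[idx]
    return out

def iht(D, y, k, n_iter=100):
    """Approximate y with at most k columns of D (Euclidean cost)."""
    # Step size from the spectral norm keeps the gradient step non-expansive.
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        gradient_step = x + step * D.T @ (y - D @ x)
        x = hard_threshold(gradient_step, k)
    return x
```

Each iteration takes a gradient step on the reconstruction error and then projects back onto the set of k-sparse vectors, which is what imposes the ℓ0 constraint.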
international conference on acoustics, speech, and signal processing | 2015
Ken O'Hanlon; Mark B. Sandler; Mark D. Plumbley
Non-negative Matrix Factorisation (NMF) is a commonly used tool in many musical signal processing tasks, including Automatic Music Transcription (AMT). However, unsupervised NMF is seen to be problematic in this context, and harmonically constrained variants of NMF have been proposed. While useful, the harmonic constraints may be constrictive in mixed signals. We have previously observed that recovery of overlapping signal elements using NMF is improved through introduction of a sparse coding step, and propose here the incorporation of a sparse coding step using the Hellinger distance into an NMF algorithm. Improved AMT results for unsupervised NMF are reported.
international conference on acoustics, speech, and signal processing | 2017
Ken O'Hanlon; Sebastian Ewert; Johan Pauwels; Mark B. Sandler
The task of chord recognition in music signals is often based upon pattern matching in chromagrams. Many variants of chroma exist, and the quality of chord recognition is related to the feature employed. Chroma Reduced Pitch (CRP) features are interesting in this context as they were designed to improve timbre invariance for the purpose of query retrieval. Their reapplication to chord recognition, however, has not been successful in previous studies. We consider that the default parametrisation of CRP attenuates some tonal information as well as timbral information, and consider alternatives to this default. We also provide a variant of a recently proposed compositional chroma feature, adapted for music pieces rather than a single instrument. The experiments described show improved results compared to existing features.
international conference on acoustics, speech, and signal processing | 2017
Elio Quinton; Ken O'Hanlon; Simon Dixon; Mark B. Sandler
The estimation of rhythmic properties such as tempo, beat positions or metrical structure is a central aspect of Music Information Retrieval (MIR) research. Meter inference algorithms are typically designed to track metrical structure in the presence of mild deviations of the feature estimates over time, in order to account for performance imprecisions, expressive timing or musical effects such as accelerando. Abrupt changes of metrical structure over time are comparatively rarely addressed. In this paper, we present an unsupervised approach to detect metrical structure changes. Formulating the problem as a metrical structure based segmentation retrieval task, we present a variant of sparse NMF and compare it to existing methods. For evaluation, we introduce a new dataset of music recordings containing metric modulations with the corresponding annotations.
european signal processing conference | 2016
Ken O'Hanlon; Mark B. Sandler
Chroma features are a popular tool in musical signal processing and information retrieval tasks, providing a compact representation of the tonal content of a piece of music. A variety of approaches to chroma estimation have been proposed, most of which rely on the summation of related frequency partials. However, frequency partials may be incorrectly assigned due to the log/linear relationship of frequency and pitch. Variations of chroma employing overtone suppression strategies are found in the literature. We propose a compositional model of chroma, which considers a coarse modelling of the effects of overtones in the expected chroma vectors of single notes. Synthetic chord recognition experiments indicate the usefulness of the proposed approach.
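The overtone issue described above can be illustrated with a small sketch that builds the expected chroma vector of a single note by summing its harmonic partials into pitch classes. This is not the compositional model of the paper; the geometric `decay` of partial amplitudes, the number of harmonics, and the function names are assumptions chosen for illustration.

```python
import math

def pitch_class(freq_hz):
    """Map a frequency to a pitch class 0..11 (0 = C), assuming A4 = 440 Hz."""
    midi = 69 + 12 * math.log2(freq_hz / 440.0)
    return int(round(midi)) % 12

def note_chroma(f0_hz, n_harmonics=6, decay=0.6):
    """Expected chroma of one note: partial h has assumed amplitude decay**(h-1)."""
    chroma = [0.0] * 12
    for h in range(1, n_harmonics + 1):
        chroma[pitch_class(h * f0_hz)] += decay ** (h - 1)
    return chroma

# For C4 (about 261.63 Hz) the 3rd and 6th partials fall on G and the 5th on E,
# so a single C contributes energy to the pitch classes C, E and G.
c4_chroma = note_chroma(261.63)
```

A template of this kind shows why plain partial summation can make a single note resemble a chord, which is the effect that overtone suppression and compositional chroma models aim to account for.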