Derry Fitzgerald
Cork Institute of Technology
Publications
Featured research published by Derry Fitzgerald.
Computational Intelligence and Neuroscience | 2008
Derry Fitzgerald; Matt Cranitch; Eugene Coyle
Recently, shift-invariant tensor factorisation algorithms have been proposed for the purposes of sound source separation of pitched musical instruments. However, in practice, existing algorithms require the use of log-frequency spectrograms to allow shift invariance in frequency, which causes problems when attempting to resynthesise the separated sources. Further, it is difficult to impose harmonicity constraints on the recovered basis functions. This paper proposes a new additive synthesis-based approach which allows the use of linear-frequency spectrograms as well as imposing strict harmonic constraints, resulting in an improved model. Further, these additional constraints allow the addition of a source filter model to the factorisation framework, and an extended model which is capable of separating mixtures of pitched and percussive instruments simultaneously.
IEEE Transactions on Signal Processing | 2014
Antoine Liutkus; Derry Fitzgerald; Zafar Rafii; Bryan Pardo; Laurent Daudet
Source separation consists of separating a signal into additive components. It is a topic of considerable interest with many applications that has gathered much attention recently. Here, we introduce a new framework for source separation called Kernel Additive Modelling, which is based on local regression and permits efficient separation of multidimensional and/or nonnegative and/or non-regularly sampled signals. The main idea of the method is to assume that a source at some location can be estimated using its values at other locations nearby, where nearness is defined through a source-specific proximity kernel. Such a kernel provides an efficient way to account for features like periodicity, continuity, smoothness, stability over time or frequency, and self-similarity. In many cases, such local dynamics are indeed much more natural to assess than any global model such as a tensor factorization. This framework permits one to use different proximity kernels for different sources and to separate them using the iterative kernel backfitting algorithm we describe. As we show, kernel additive modelling generalizes many recent and efficient techniques for source separation and opens the path to creating and combining source models in a principled way. Experimental results on the separation of synthetic and audio signals demonstrate the effectiveness of the approach.
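The idea of estimating each source from its own neighbourhood can be illustrated with a small sketch. This is a simplified illustration in the spirit of kernel backfitting, not the algorithm from the paper: the function names, the use of a median as the local estimator, and the proportional sharing of the mixture are all assumptions made for the example. Each proximity kernel is written as a list of (frequency, time) offsets.

```python
from statistics import median

def kernel_median(spec, offsets):
    """Estimate each bin as the median of its neighbours under one proximity kernel."""
    rows, cols = len(spec), len(spec[0])
    out = [[0.0] * cols for _ in range(rows)]
    for f in range(rows):
        for t in range(cols):
            vals = [spec[f + df][t + dt] for df, dt in offsets
                    if 0 <= f + df < rows and 0 <= t + dt < cols]
            out[f][t] = median(vals)
    return out

def kernel_backfit(mix, kernels, n_iter=5, eps=1e-12):
    """Simplified backfitting loop (an assumption, not the published algorithm)."""
    rows, cols = len(mix), len(mix[0])
    n = len(kernels)
    # Start by sharing the mixture equally among the sources.
    est = [[[v / n for v in row] for row in mix] for _ in kernels]
    for _ in range(n_iter):
        # Re-fit each source model by smoothing its estimate with its own kernel.
        models = [kernel_median(s, k) for s, k in zip(est, kernels)]
        # Share the mixture among the sources in proportion to their models.
        for f in range(rows):
            for t in range(cols):
                total = sum(m[f][t] for m in models) + eps
                for j in range(n):
                    est[j][f][t] = mix[f][t] * models[j][f][t] / total
    return est

# Toy nonnegative "spectrogram": one line stable over time, one stable over frequency.
mix = [[(1.0 if f == 2 else 0.0) + (1.0 if t == 3 else 0.0) for t in range(7)]
       for f in range(5)]
stable_in_time = [(0, -1), (0, 0), (0, 1)]   # neighbours along the time axis
stable_in_freq = [(-1, 0), (0, 0), (1, 0)]   # neighbours along the frequency axis
src_time, src_freq = kernel_backfit(mix, [stable_in_time, stable_in_freq])
```

In this toy case the first source keeps the line that is stable over time and the second keeps the line that is stable over frequency; where they cross, the energy is shared between the two.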
International Conference on Acoustics, Speech, and Signal Processing | 2011
Rajesh Jaiswal; Derry Fitzgerald; Dan Barry; Eugene Coyle; Scott Rickard
Non-negative Matrix Factorization (NMF) has found use in single channel separation of audio signals, as it gives a parts-based decomposition of audio spectrograms where the parts typically correspond to individual notes or chords. However, a notable shortcoming of NMF is the need to cluster the basis functions to their sources after decomposition. Despite recent improvements in algorithms for clustering the basis functions to sources, much work still remains to further improve these algorithms. To this end we present a novel clustering algorithm which overcomes some of the limitations of previous clustering methods. This involves the use of Shifted Nonnegative Matrix Factorization (SNMF) as a means of clustering the frequency basis functions obtained from NMF. Results show that this gives improved clustering of pitched basis functions over previous methods.
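For readers unfamiliar with the decomposition that precedes the clustering step, the following is a minimal sketch of plain NMF using the standard multiplicative updates of Lee and Seung for the Euclidean cost. It is not the SNMF-based clustering method of the paper; the helper names, the toy matrix, and the choice of cost function are assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius distance between V and W H."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra))

def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorise nonnegative V (bins x frames) as W (bins x rank) times H (rank x frames)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][t] * num[k][t] / (den[k][t] + eps) for t in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[f][k] * num[f][k] / (den[f][k] + eps) for k in range(rank)]
             for f in range(n_rows)]
    return w, h

# Toy magnitude "spectrogram" built from two spectral patterns (hypothetical data).
V = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
W, H = nmf(V, rank=2)
```

The columns of W play the role of the frequency basis functions mentioned above, and the rows of H are their activations over time. NMF itself does not say which basis functions belong to which source, which is the gap the clustering step addresses.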
IEEE/SP 13th Workshop on Statistical Signal Processing, 2005 | 2005
Derry Fitzgerald; Matt Cranitch; Eugene Coyle
A shifted non-negative matrix factorisation algorithm is derived, which offers advantages over previous matrix factorisation techniques for the purposes of single channel source separation. It represents a sound source as translations of a single frequency basis function. These translations approximately correspond to notes played by an instrument. Results are presented for a set of synthetic data, and on a single channel recording of piano and clarinet. Though the system is aimed at musical recordings, the technique can be applied to any data which contains shifted versions of an underlying factor, and so the algorithm could possibly be used in other applications such as image processing.
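The representation described here, one frequency basis function translated along the frequency axis, can be sketched as a forward model. This only shows how a spectrogram is assembled from a basis and per-shift activations; it does not include the factorisation updates derived in the paper, and the function names and toy values are assumptions for illustration.

```python
def shift_up(basis, z):
    """Translate a basis by z bins towards higher frequency, zero-padding the bottom."""
    return [basis[f - z] if f - z >= 0 else 0.0 for f in range(len(basis))]

def shifted_model(basis, activations):
    """Build a spectrogram as a sum of shifted copies of one basis.

    activations[z][t] is the gain of the copy shifted by z bins at frame t.
    """
    n_bins, n_frames = len(basis), len(activations[0])
    model = [[0.0] * n_frames for _ in range(n_bins)]
    for z, gains in enumerate(activations):
        shifted = shift_up(basis, z)
        for f in range(n_bins):
            for t in range(n_frames):
                model[f][t] += shifted[f] * gains[t]
    return model

# Hypothetical instrument template and two "notes": shift 0 at frame 0, shift 2 at frame 1.
basis = [1.0, 0.5, 0.0, 0.0, 0.0]
activations = [[1.0, 0.0],
               [0.0, 0.0],
               [0.0, 1.0]]
spec = shifted_model(basis, activations)
```

On a log-frequency axis a fixed shift in bins corresponds to a fixed musical interval, which is why the translations line up approximately with the notes played by an instrument.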
International Conference on Acoustics, Speech, and Signal Processing | 2006
Derry Fitzgerald; Matt Cranitch; Eugene Coyle
Recently, shifted non-negative matrix factorisation was developed as a means of separating harmonic instruments from single channel mixtures. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end, a shifted non-negative tensor factorisation algorithm is derived, which extends shifted non-negative matrix factorisation to the multi-channel case. The use of this algorithm for multi-channel sound source separation of harmonic instruments is demonstrated. Further, it is shown that the algorithm can be used to perform non-negative tensor deconvolution, a multi-channel version of non-negative matrix deconvolution, to separate sound sources which have time evolving spectra from multi-channel signals.
Archive | 2006
Derry Fitzgerald; Jouni Paulus
Up until recently, work on automatic music transcription has concentrated mainly on the transcription of pitched instruments, i.e., melodies. However, during the past few years there has been a growing interest in the problem of transcription of percussive instruments. This chapter aims to give an overview of the methods used in this field ranging from the pioneering works of the 1980s to more recent systems.
Workshop on Applications of Signal Processing to Audio and Acoustics | 2015
Antoine Liutkus; Derry Fitzgerald; Roland Badeau
Nonnegative matrix factorization (NMF) is an effective and popular low-rank model for nonnegative data. It enjoys a rich background, both from an optimization and probabilistic signal processing viewpoint. In this study, we propose a new cost-function for NMF fitting, which is introduced as arising naturally when adopting a Cauchy process model for audio waveforms. As we recall, this Cauchy process model is the only probabilistic framework known to date that is compatible with having additive magnitude spectrograms for additive independent audio sources. Similarly to the Gaussian power-spectral density, this Cauchy model features time-frequency nonnegative scale parameters, on which an NMF structure may be imposed. The Cauchy cost function we propose is optimal under that model in a maximum likelihood sense. It thus appears as an interesting newcomer in the inventory of useful cost-functions for NMF in audio. We provide multiplicative updates for Cauchy-NMF and show that they give good performance in audio source separation as well as in extracting nonnegative low-rank structures from data buried in very adverse noise.
International Conference on Digital Signal Processing | 2011
Derry Fitzgerald
We present a system for upmixing mono recordings to stereo through the use of sound source separation techniques. The use of sound source separation has the advantage of allowing sources to be placed at distinct points in the stereo field, resulting in more natural sounding upmixes. The system separates an input signal into a number of sources, which can then be imported into a digital audio workstation for upmixing to stereo. Considerations to be taken into account when upmixing are discussed, and a brief overview of the various sound source separation techniques used in the system is given. The effectiveness of the proposed system is then demonstrated on real-world mono recordings.
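Placing separated sources at distinct points in the stereo field can be illustrated with a small sketch. The abstract leaves this step to a digital audio workstation, so the constant-power pan law, the function names, and the source positions below are assumptions chosen for the example rather than details of the system.

```python
import math

def pan_constant_power(mono, position):
    """Pan a mono signal; position runs from -1.0 (hard left) to +1.0 (hard right)."""
    angle = (position + 1.0) * math.pi / 4.0   # maps [-1, 1] to [0, pi/2]
    gain_left, gain_right = math.cos(angle), math.sin(angle)
    return [gain_left * x for x in mono], [gain_right * x for x in mono]

def upmix(sources, positions):
    """Sum separately panned sources into one left and one right channel."""
    n = len(sources[0])
    left, right = [0.0] * n, [0.0] * n
    for src, pos in zip(sources, positions):
        l, r = pan_constant_power(src, pos)
        for i in range(n):
            left[i] += l[i]
            right[i] += r[i]
    return left, right

# Two hypothetical separated sources, placed left of centre and right of centre.
vocals = [0.5, -0.5, 0.25]
guitar = [0.1, 0.2, -0.1]
left, right = upmix([vocals, guitar], positions=[-0.3, 0.6])
```

With this pan law the squared left and right gains sum to one at every position, so a source keeps roughly the same perceived level wherever it is placed.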
International Conference on Acoustics, Speech, and Signal Processing | 2015
Antoine Liutkus; Derry Fitzgerald; Zafar Rafii
Recently, Kernel Additive Modelling (KAM) was proposed as a unified framework to achieve multichannel audio source separation. Its main feature is to use kernel models for locally describing the spectrograms of the sources. Such kernels can capture source features such as repetitivity, stability over time and/or frequency, self-similarity, etc. KAM notably subsumes many popular and effective methods from the state of the art, including REPET and harmonic/percussive separation with median filters. However, it also comes with an important drawback in its initial form: its memory usage scales poorly with the number of sources. Indeed, KAM requires the storage of the full-resolution spectrogram for each source, which may become prohibitive for full-length tracks or many sources. In this paper, we show how it can be combined with a fast compression algorithm for its parameters to address the scalability issue, thus enabling its use on small platforms or mobile devices.
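The memory issue can be made concrete with a back-of-the-envelope calculation. The sketch below simply counts the values in one full-resolution spectrogram per source; the function names, the STFT settings, and the float32 storage are assumptions for illustration and are not taken from the paper.

```python
import math

def n_frames(duration_s, sample_rate, hop):
    """Approximate number of STFT frames, ignoring edge padding."""
    return math.ceil(duration_s * sample_rate / hop)

def spectrogram_bytes(n_sources, frames, n_fft, n_channels=2, bytes_per_value=4):
    """Bytes needed to hold one full-resolution spectrogram per source."""
    bins = n_fft // 2 + 1   # nonnegative-frequency bins of a real-input FFT
    return n_sources * n_channels * frames * bins * bytes_per_value

# Hypothetical settings: 4-minute stereo track, 44.1 kHz, 2048-point FFT, hop of 512.
frames = n_frames(240, 44100, 512)
per_source = spectrogram_bytes(1, frames, 2048)
five_sources = spectrogram_bytes(5, frames, 2048)
```

Under these assumed settings each source needs on the order of 170 MB, and the total grows linearly with the number of sources, which shows why storing every source at full resolution becomes a problem on small devices.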
2014 4th Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, HSCMA 2014 | 2014
Antoine Liutkus; Zafar Rafii; Bryan Pardo; Derry Fitzgerald; Laurent Daudet
In this study, we introduce a new framework called Kernel Additive Modelling for audio spectrograms that can be used for multichannel source separation. It assumes that the spectrogram of a source at any time-frequency bin is close to its value in a neighbourhood indicated by a source-specific proximity kernel. The rationale for this model is to easily account for features like periodicity, stability over time or frequency, self-similarity, etc. In many cases, such local dynamics are indeed much more natural to assess than any global model such as a tensor factorization. This framework permits one to use different proximity kernels for different sources and to estimate them blindly using their mixtures only. Estimation is performed using a variant of the kernel backfitting algorithm that allows for multichannel mixtures and permits parallelization. Experimental results on the separation of vocals from musical backgrounds demonstrate the efficiency of the approach.