Publications

Featured research published by Dan Stowell.


IEEE Transactions on Multimedia | 2015

Detection and Classification of Acoustic Scenes and Events

Dan Stowell; Dimitrios Giannoulis; Emmanouil Benetos; Mathieu Lagrange; Mark D. Plumbley

For intelligent systems to make best use of the audio modality, it is important that they can recognize not just speech and music, which have been researched as specific tasks, but also general sounds in everyday environments. To stimulate research in this field we conducted a public research challenge: the IEEE Audio and Acoustic Signal Processing Technical Committee challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). In this paper, we report on the state of the art in automatically classifying audio scenes, and automatically detecting and classifying audio events. We survey prior work as well as the state of the art represented by the submissions to the challenge from various research groups. We also provide detail on the organization of the challenge, so that our experience as challenge hosts may be useful to those organizing challenges in similar domains. We created new audio datasets and baseline systems for the challenge; these, as well as some submitted systems, are publicly available under open licenses, to serve as benchmarks for further research in general-purpose machine listening.


PeerJ | 2014

Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning.

Dan Stowell; Mark D. Plumbley

Automatic species classification of birds from their sound is a computational tool of increasing importance in ecology, conservation monitoring and vocal communication studies. To make classification useful in practice, it is crucial to improve its accuracy while ensuring that it can run at big data scales. Many approaches use acoustic measures based on spectrogram-type data, such as the Mel-frequency cepstral coefficient (MFCC) features which represent a manually-designed summary of spectral information. However, recent work in machine learning has demonstrated that features learnt automatically from data can often outperform manually-designed feature transforms. Feature learning can be performed at large scale and “unsupervised”, meaning it requires no manual data labelling, yet it can improve performance on “supervised” tasks such as classification. In this work we introduce a technique for feature learning from large volumes of bird sound recordings, inspired by techniques that have proven useful in other domains. We experimentally compare twelve different feature representations derived from the Mel spectrum (of which six use this technique), using four large and diverse databases of bird vocalisations, classified using a random forest classifier. We demonstrate that in our classification tasks, MFCCs can often lead to worse performance than the raw Mel spectral data from which they are derived. Conversely, we demonstrate that unsupervised feature learning provides a substantial boost over MFCCs and Mel spectra without adding computational complexity after the model has been trained. The boost is particularly notable for single-label classification tasks at large scale. The spectro-temporal activations learned through our procedure resemble spectro-temporal receptive fields calculated from avian primary auditory forebrain. However, for one of our datasets, which contains substantial audio data but few annotations, increased performance is not discernible. We study the interaction between dataset characteristics and choice of feature representation through further empirical analysis.
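The overall shape of that pipeline is easy to sketch. The fragment below is a minimal illustration: log-Mel frames, an unsupervised feature dictionary, max-pooling over time, then a random forest. The paper itself learns features with spherical k-means over PCA-whitened, stacked frames, so the plain mini-batch k-means here is only a stand-in, and the file and label lists (unlabelled_paths, train_paths, train_labels) are placeholders.

```python
import numpy as np
import librosa
from sklearn.cluster import MiniBatchKMeans
from sklearn.ensemble import RandomForestClassifier

def mel_frames(path, n_mels=40):
    """Return log-Mel spectrogram frames, one row per time frame."""
    y, sr = librosa.load(path, sr=22050)
    m = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return np.log(m + 1e-8).T                      # shape: (frames, n_mels)

# 1. Learn a feature dictionary from unlabelled audio.
#    `unlabelled_paths` is a placeholder list of recording files.
frames = np.vstack([mel_frames(p) for p in unlabelled_paths])
km = MiniBatchKMeans(n_clusters=500, random_state=0).fit(frames)

def encode(path):
    """Project each frame onto the learned bases, then max-pool over
    time so every recording becomes one fixed-length vector."""
    acts = mel_frames(path) @ km.cluster_centers_.T    # (frames, 500)
    return acts.max(axis=0)

# 2. Train the supervised stage on the pooled encodings.
#    `train_paths` / `train_labels` are placeholder training data.
X = np.array([encode(p) for p in train_paths])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, train_labels)
```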


IEEE Signal Processing Magazine | 2015

Acoustic Scene Classification: Classifying environments from the sounds they produce

Daniele Barchiesi; Dimitrios Giannoulis; Dan Stowell; Mark D. Plumbley

In this article, we present an account of the state of the art in acoustic scene classification (ASC), the task of classifying environments from the sounds they produce. Starting from a historical review of previous research in this area, we define a general framework for ASC and present different implementations of its components. We then describe a range of different algorithms submitted for a data challenge that was held to provide a general and fair benchmark for ASC techniques. The data set recorded for this purpose is presented along with the performance metrics that are used to evaluate the algorithms and statistical significance tests to compare the submitted methods.
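As a concrete, if much simplified, instance of such a framework, the sketch below summarises each clip with bag-of-frames MFCC statistics and feeds the result to a support vector classifier. The file paths and scene labels are placeholders, and the systems surveyed in the article differ considerably in both components.

```python
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def clip_features(path):
    """Summarise one recording as the mean and standard deviation of
    its MFCC frames, a 'bag of frames' style clip representation."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# `train_paths` / `train_scenes` are placeholders: clip files and scene labels.
X = np.array([clip_features(p) for p in train_paths])
clf = make_pipeline(StandardScaler(), SVC()).fit(X, train_scenes)

# Classify a new clip (placeholder filename).
print(clf.predict([clip_features("new_clip.wav")]))
```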


Workshop on Applications of Signal Processing to Audio and Acoustics | 2013

Detection and classification of acoustic scenes and events: An IEEE AASP challenge

Dimitrios Giannoulis; Emmanouil Benetos; Dan Stowell; Mathias Rossignol; Mathieu Lagrange; Mark D. Plumbley

This paper describes a newly-launched public evaluation challenge on acoustic scene classification and detection of sound events within a scene. Systems dealing with such tasks are far from exhibiting human-like performance and robustness. Undermining factors are numerous: the extreme variability of sources of interest possibly interfering, the presence of complex background noise as well as room effects like reverberation. The proposed challenge is an attempt to help the research community move forward in defining and studying the aforementioned tasks. Apart from the challenge description, this paper provides an overview of systems submitted to the challenge as well as a detailed evaluation of the results achieved by those systems.


International Journal of Human-Computer Studies | 2009

Evaluation of live human-computer music-making: Quantitative and qualitative approaches

Dan Stowell; Andrew Robertson; Nick Bryan-Kinns; Mark D. Plumbley

Live music-making using interactive systems is not completely amenable to traditional HCI evaluation metrics such as task-completion rates. In this paper we discuss quantitative and qualitative approaches which provide opportunities to evaluate the music-making interaction, accounting for aspects which cannot be directly measured or expressed numerically, yet which may be important for participants. We present case studies in the application of a qualitative method based on Discourse Analysis, and a quantitative method based on the Turing Test. We compare and contrast these methods with each other, and with other evaluation approaches used in the literature, and discuss factors affecting which evaluation methods are appropriate in a given context.


IEEE Signal Processing Letters | 2009

Fast Multidimensional Entropy Estimation by k-d Partitioning

Dan Stowell; Mark D. Plumbley

We describe a nonparametric estimator for the differential entropy of a multidimensional distribution, given a limited set of data points, by a recursive rectilinear partitioning. The estimator uses an adaptive partitioning method and runs in Θ(N log N) time, with low memory requirements. In experiments using known distributions, the estimator is several orders of magnitude faster than other estimators, with only modest increase in bias and variance.
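This family of estimator is straightforward to sketch: split the data recursively at the median along cycling dimensions, and let each leaf cell contribute its empirical probability times the log of its volume-to-probability ratio. The version below replaces the paper's adaptive stopping test with a fixed minimum cell occupancy, so it is illustrative rather than a reimplementation.

```python
import numpy as np

def kd_entropy(x, min_cell=8):
    """Differential entropy estimate (in nats) for samples x of shape
    (N, D), by recursive rectilinear partitioning. Each leaf cell A_j
    with n_j points and volume V_j contributes (n_j/N) * log(V_j*N/n_j).
    Simplified sketch: a fixed minimum cell size replaces the published
    estimator's adaptive stopping test."""
    x = np.asarray(x, dtype=float)
    n_total, d = x.shape

    def recurse(pts, lo, hi, axis):
        n = len(pts)
        if n > min_cell:
            med = np.median(pts[:, axis])
            left, right = pts[pts[:, axis] <= med], pts[pts[:, axis] > med]
            if len(left) and len(right):            # guard against ties
                hi_l, lo_r = hi.copy(), lo.copy()
                hi_l[axis] = med
                lo_r[axis] = med
                nxt = (axis + 1) % d
                return recurse(left, lo, hi_l, nxt) + recurse(right, lo_r, hi, nxt)
        vol = np.prod(hi - lo)
        if n == 0 or vol == 0.0:
            return 0.0
        return (n / n_total) * np.log(vol * n_total / n)

    return recurse(x, x.min(axis=0), x.max(axis=0), 0)

# Sanity check: a standard 1-D Gaussian has H = 0.5*ln(2*pi*e), about 1.42 nats.
rng = np.random.default_rng(0)
print(kd_entropy(rng.normal(size=(20000, 1))))   # should land near 1.42
```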


International Workshop on Machine Learning for Signal Processing | 2016

Bird detection in audio: A survey and a challenge

Dan Stowell; Michael D. Wood; Yannis Stylianou; Hervé Glotin

Many biological monitoring projects rely on acoustic detection of birds. Despite increasingly large datasets, this detection is often manual or semi-automatic, requiring manual tuning/postprocessing. We review the state of the art in automatic bird sound detection, and identify a widespread need for tuning-free and species-agnostic approaches. We introduce new datasets and an IEEE research challenge to address this need, to make possible the development of fully automatic algorithms for bird sound detection.
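The challenge that grew out of this work framed detection as a per-clip, yes-or-no judgement scored by how well clips are ranked (AUC). A minimal baseline in that format might look like the sketch below, where the path lists and binary labels are placeholders.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

def clip_summary(path):
    """Fixed-length summary of one clip: per-band statistics of its
    log-Mel spectrogram."""
    y, sr = librosa.load(path, sr=22050)
    m = np.log(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40) + 1e-8)
    return np.concatenate([m.mean(axis=1), m.std(axis=1), m.max(axis=1)])

# Placeholder data: lists of clip files plus 0/1 "bird present?" labels.
X_train = np.array([clip_summary(p) for p in train_paths])
X_test = np.array([clip_summary(p) for p in test_paths])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Rank test clips by predicted probability and score with AUC.
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```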


Methods in Ecology and Evolution | 2014

Large-scale analysis of frequency modulation in birdsong data bases

Dan Stowell; Mark D. Plumbley

* Birdsong often contains large amounts of rapid frequency modulation (FM). It is believed that the use or otherwise of FM is adaptive to the acoustic environment and also that there are specific social uses of FM such as trills in aggressive territorial encounters. Yet temporal fine detail of FM is often absent or obscured in standard audio signal analysis methods such as Fourier analysis or linear prediction. Hence, it is important to consider high-resolution signal processing techniques for analysis of FM in bird vocalizations. If such methods can be applied at big data scales, this offers a further advantage as large data sets become available.
* We introduce methods from the signal processing literature which can go beyond spectrogram representations to analyse the fine modulations present in a signal at very short time-scales. Focusing primarily on the genus Phylloscopus, we investigate which of a set of four analysis methods most strongly captures the species signal encoded in birdsong. We evaluate this through a feature selection technique and an automatic classification experiment. In order to find tools useful in practical analysis of large data bases, we also study the computational time taken by the methods, and their robustness to additive noise and MP3 compression.
* We find three methods which can robustly represent species-correlated FM attributes and can be applied to large data sets, and that the simplest method tested also appears to perform the best. We find that features representing the extremes of FM encode species identity supplementary to that captured in frequency features, whereas bandwidth features do not encode additional information.
* FM analysis can extract information useful for bioacoustic studies, in addition to measures more commonly used to characterize vocalizations. Further, it can be applied efficiently across very large data sets and archives.
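As a rough illustration of the kind of FM summary at issue, the sketch below tracks the spectrogram's peak frequency, differentiates it over time, and keeps percentiles of the absolute modulation rate, echoing the finding above that the extremes of FM are informative. The paper's four methods estimate FM at much finer within-frame resolution, so treat this as indicative only.

```python
import numpy as np
import librosa

def fm_extremes(path):
    """Crude per-frame FM measure: track the spectrogram's peak
    frequency and differentiate it over time, then summarise the
    distribution of |FM| by its median and upper percentiles."""
    y, sr = librosa.load(path, sr=22050)
    S = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))
    freqs = librosa.fft_frequencies(sr=sr, n_fft=1024)
    peak_hz = freqs[S.argmax(axis=0)]           # dominant frequency per frame
    fm = np.diff(peak_hz) / (256 / sr)          # Hz per second
    return np.percentile(np.abs(fm), [50, 90, 99])
```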


Workshop on Applications of Signal Processing to Audio and Acoustics | 2015

Acoustic event detection for multiple overlapping similar sources

Dan Stowell; David F. Clayton

Many current paradigms for acoustic event detection (AED) are not adapted to the organic variability of natural sounds, and/or they assume a limit on the number of simultaneous sources: often only one source, or one source of each type, may be active. These aspects are highly undesirable for applications such as bird population monitoring. We introduce a simple method modelling the onsets, durations and offsets of acoustic events to avoid intrinsic limits on polyphony or on inter-event temporal patterns. We evaluate the method in a case study with over 3000 zebra finch calls. In comparison against an HMM-based method we find it more accurate at recovering acoustic events, and more robust for estimating calling rates.
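Evaluating such a detector means matching estimated events against reference annotations. The sketch below shows a generic event-based F-measure with an onset tolerance; the greedy matching rule is a simple stand-in, not the paper's exact protocol.

```python
def event_f1(est, ref, tol=0.05):
    """Greedy event matching: an estimated (onset, offset) pair, in
    seconds, counts as a hit if some unmatched reference event has an
    onset within `tol` of it. Returns (precision, recall, F1)."""
    unmatched = list(ref)
    hits = 0
    for onset, _offset in sorted(est):
        match = next((r for r in unmatched if abs(r[0] - onset) <= tol), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    p = hits / len(est) if est else 0.0
    r = hits / len(ref) if ref else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Two of three reference calls recovered: P=1.0, R=0.67, F1=0.8.
print(event_f1(est=[(0.10, 0.30), (0.52, 0.70)],
               ref=[(0.12, 0.31), (0.50, 0.69), (0.90, 1.00)]))
```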


Archive | 2013

Live music-making: a rich open task requires a rich open interface

Dan Stowell; Alex McLean

In live human-computer music-making, how can interfaces successfully support the openness, reinterpretation and rich signification often important in live (especially improvised) musical performance? We argue that the use of design metaphors can lead to interfaces which constrain interactions and militate against reinterpretation, while consistent, grammatical interfaces empower the user to create and apply their own metaphors in developing their performance. These metaphors can be transitory and disposable, yet do not represent wasted learning since the underlying grammar is retained. We illustrate this move with reflections from live coding practice, from recent visual and two-dimensional programming language interfaces, and from musical voice mapping research. We consider the integration of the symbolic and the continuous in the human-computer interaction. We also describe how our perspective is reflected in approaches to system evaluation.

Collaboration


Explore Dan Stowell's collaborations.

Top Co-Authors

Emmanouil Benetos, Queen Mary University of London
Dimitrios Giannoulis, Queen Mary University of London
Veronica Morfi, Queen Mary University of London
Mark D. Plumbley, Queen Mary University of London
Joshua D. Reiss, Queen Mary University of London
Elaine Chew, Queen Mary University of London