Frank Kurth
University of Bonn
Publications
Featured research published by Frank Kurth.
Pattern Recognition Letters | 2010
Rolf Bardeli; Daniel M. Wolff; Frank Kurth; M. Koch; Klaus-Henry Tauchert; Karl-Heinz Frommolt
Trends in bird population sizes are an important indicator in nature conservation, but measuring such sizes is a very difficult, labour-intensive process. Enormous progress in audio signal processing and pattern recognition in recent years makes it possible to incorporate automated methods into the detection of bird vocalisations. These methods can be employed to support the census of population sizes. We report on a study testing the feasibility of bird monitoring supported by automatic bird song detection. In particular, we describe novel algorithms for the detection of the vocalisations of two endangered bird species and show how these can be used in automatic habitat mapping. These methods are based on detecting temporal patterns in a frequency band typical for the species. Special effort is put into the suppression of the noise present in real-world audio scenes. Our results show that even in real-world recording conditions, high recognition rates with a tolerable rate of false positive detections are possible.
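The core idea described above can be sketched in a few lines: measure per-frame energy restricted to a species-typical frequency band, then flag frames exceeding a noise-adaptive threshold. This is a minimal illustration under assumed parameters, not the authors' detection algorithm; the function names and the median/MAD threshold are choices made here for brevity.

```python
# Minimal sketch (not the paper's implementation): per-frame energy in a
# species-typical frequency band, plus a noise-adaptive detection threshold.
import numpy as np

def band_energy(signal, sr, f_lo, f_hi, frame_len=1024, hop=512):
    """Per-frame spectral energy restricted to the band [f_lo, f_hi] Hz."""
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        spec = np.abs(np.fft.rfft(frame * window))
        energies[i] = np.sum(spec[band] ** 2)
    return energies

def detect_pulses(energies, k=3.0):
    """Flag frames above median + k * MAD -- robust to background noise."""
    med = np.median(energies)
    mad = np.median(np.abs(energies - med)) + 1e-12
    return np.flatnonzero(energies > med + k * mad)
```

A temporal-pattern stage, as the abstract describes, would then check that the flagged frames follow the species-typical rhythm rather than accepting isolated bursts.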
IEEE Transactions on Audio, Speech, and Language Processing | 2008
Frank Kurth; Meinard Müller
Given a large audio database of music recordings, the goal of classical audio identification is to identify a particular audio recording by means of a short audio fragment. Even though recent identification algorithms show a significant degree of robustness towards noise, MP3 compression artifacts, and uniform temporal distortions, the notion of similarity is rather close to the identity. In this paper, we address a higher level retrieval problem, which we refer to as audio matching: given a short query audio clip, the goal is to automatically retrieve all excerpts from all recordings within the database that musically correspond to the query. In our matching scenario, opposed to classical audio identification, we allow semantically motivated variations as they typically occur in different interpretations of a piece of music. To this end, this paper presents an efficient and robust audio matching procedure that works even in the presence of significant variations, such as nonlinear temporal, dynamical, and spectral deviations, where existing algorithms for audio identification would fail. Furthermore, the combination of various deformation- and fault-tolerance mechanisms allows us to employ standard indexing techniques to obtain an efficient, index-based matching procedure, thus providing an important step towards semantically searching large-scale real-world music collections.
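Stripped of the feature design and indexing machinery, the retrieval step in audio matching can be illustrated as a sliding comparison of a query feature sequence against a database feature sequence, reporting all positions whose distance falls below a threshold. This is a simplified sketch under assumed feature arrays and an illustrative threshold, not the paper's index-based procedure.

```python
# Hedged sketch of the matching idea: slide the query feature sequence
# over the database sequence and report low-distance positions.
# Feature extraction, tempo variants, and indexing are omitted.
import numpy as np

def sliding_match(db, query, threshold=0.2):
    """db, query: (frames, dims) feature arrays; returns (start, dist) hits."""
    n, m = len(db), len(query)
    hits = []
    for start in range(n - m + 1):
        window = db[start : start + m]
        # mean per-frame Euclidean distance between window and query
        dist = np.mean(np.linalg.norm(window - query, axis=1))
        if dist < threshold:
            hits.append((start, dist))
    return hits
```

The deformation-tolerance mechanisms mentioned in the abstract (e.g., handling tempo deviations) would, in this simplified picture, amount to matching several resampled variants of the query.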
EURASIP Journal on Advances in Signal Processing | 2007
Meinard Müller; Frank Kurth
One major goal of structural analysis of an audio recording is to automatically extract the repetitive structure or, more generally, the musical form of the underlying piece of music. Recent approaches to this problem work well for music where the repetitions largely agree with respect to instrumentation and tempo, as is typically the case for popular music. For other classes of music such as Western classical music, however, musically similar audio segments may exhibit significant variations in parameters such as dynamics, timbre, execution of note groups, modulation, articulation, and tempo progression. In this paper, we propose a robust and efficient algorithm for audio structure analysis, which allows us to identify musically similar segments even in the presence of large variations in these parameters. To account for such variations, our main idea is to incorporate invariance at various levels simultaneously: we design a new type of statistical features to absorb microvariations, introduce an enhanced local distance measure to account for local variations, and describe a new strategy for structure extraction that can cope with the global variations. Our experimental results with classical and popular music show that our algorithm performs successfully even in the presence of significant musical variations.
IEEE Transactions on Multimedia | 2004
Michael Clausen; Frank Kurth
In this paper, we propose a unified approach to fast index-based music recognition. As an important area within the field of music information retrieval (MIR), the goal of music recognition is, given a database of musical pieces and a query document, to locate all occurrences of that document within the database, up to certain possible errors. In particular, the identification of the query with regard to the database becomes possible. The approach presented in this paper is based on a general algorithmic framework for searching complex patterns of objects in large databases. We describe how this approach may be applied to two important music recognition tasks: The polyphonic (musical score-based) search in polyphonic score data and the identification of pulse-code modulation audio material from a given acoustic waveform. We give an overview on the various aspects of our technology including fault-tolerant search methods. Several areas of application are suggested. We describe several prototypic systems we have developed for those applications including the notify! and the audentify! systems for score- and waveform-based music recognition, respectively.
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2006
Meinard Müller; Frank Kurth
Similarity matrices have become an important tool in music audio analysis. However, the quadratic time and space complexity as well as the intricacy of extracting the desired structural information from these matrices are often prohibitive with regard to real-world applications. In this paper, we describe an approach for enhancing the structural properties of similarity matrices based on two concepts: first, we introduce a new class of robust and scalable audio features which absorb local temporal variations. As a second contribution, we then incorporate contextual information into the local similarity measure. The resulting enhancement leads to a significant reduction in matrix size and also eases the structure extraction step. As an example, we sketch the application of our techniques to the problems of audio summarization and audio synchronization, obtaining effective and computationally feasible algorithms.
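The second concept above, incorporating contextual information, can be illustrated in a simplified form: compute a cosine self-similarity matrix and average each entry with its successors along the diagonal, so that coherent repeating segments stand out as clean paths. The functions and the plain diagonal average used here are illustrative assumptions, not the paper's exact enhancement filter.

```python
# Sketch of contextual enhancement of a self-similarity matrix:
# smoothing along diagonals strengthens path structures that
# correspond to repeating segments.
import numpy as np

def self_similarity(features):
    """Cosine self-similarity matrix of an (n, d) feature sequence."""
    norms = np.linalg.norm(features, axis=1, keepdims=True) + 1e-12
    x = features / norms
    return x @ x.T

def diagonal_smooth(S, context=4):
    """Average each entry with up to `context` diagonal successors."""
    n = S.shape[0]
    out = np.zeros_like(S)
    for i in range(n):
        for j in range(n):
            k = min(context, n - i, n - j)
            out[i, j] = np.mean([S[i + t, j + t] for t in range(k)])
    return out
```

In the enhanced matrix, an exact repetition appears as a diagonal of values near 1, while incidental frame-level similarities are averaged down.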
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2010
Peter Grosche; Meinard Müller; Frank Kurth
The extraction of local tempo and beat information from audio recordings constitutes a challenging task, particularly for music that reveals significant tempo variations. Furthermore, the existence of various pulse levels such as measure, tactus, and tatum often makes the determination of absolute tempo problematic. In this paper, we present a robust mid-level representation that encodes local tempo information. Similar to the well-known concept of cyclic chroma features, where pitches differing by octaves are identified, we introduce the concept of cyclic tempograms, where tempi differing by a power of two are identified. Furthermore, we describe how to derive cyclic tempograms from music signals using two different methods for periodicity analysis and finally sketch some applications to tempo-based audio segmentation.
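The octave identification underlying cyclic tempograms can be made concrete with a small helper: every tempo value is mapped to a representative in a single reference octave, so that tempi differing by a power of two coincide. A minimal sketch; the function name and reference tempo are assumptions, not notation from the paper.

```python
# Illustrative sketch of tempo octave equivalence: tempi differing by a
# power of two are identified by folding them into [ref, 2 * ref).
def cyclic_tempo(tempo, ref=60.0):
    """Map a tempo in BPM to its octave-equivalence class in [ref, 2*ref)."""
    while tempo >= 2 * ref:
        tempo /= 2.0
    while tempo < ref:
        tempo *= 2.0
    return tempo
```

Under this folding, 60, 120, and 240 BPM all land on the same cyclic tempo class, mirroring how cyclic chroma features identify pitches an octave apart.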
European Conference on Research and Advanced Technology for Digital Libraries (ECDL) | 2007
Meinard Müller; Frank Kurth; David Damm; Christian Fremerey; Michael Clausen
Modern digital music libraries contain textual, visual, and audio data describing music on various semantic levels. Exploiting the availability of different semantically interrelated representations of a piece of music, this paper presents a query-by-lyrics retrieval system that facilitates multimodal navigation in CD audio collections. In particular, we introduce an automated method to time-align given lyrics to an audio recording of the underlying song using a combination of synchronization algorithms. Furthermore, we describe a lyrics search engine and show how the lyrics-audio alignments can be used to directly navigate from the list of query results to the corresponding matching positions within the audio recordings. Finally, we present a user interface for lyrics-based queries and playback of the query results that extends the functionality of our SyncPlayer framework for content-based music and audio navigation.
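At the core of such synchronization algorithms lies sequence alignment; a generic form is dynamic time warping (DTW), which computes an optimal correspondence between two feature sequences. Mapping lyric timestamps through the resulting warping path links text positions to audio positions. This is a textbook DTW sketch, not the paper's specific alignment pipeline.

```python
# Generic dynamic time warping: accumulate frame distances, then
# backtrack the optimal warping path between two sequences.
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Return the optimal warping path between sequences a and b."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = dist(a[i - 1], b[j - 1]) + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]
            )
    path, i, j = [], n, m
    while i > 0 and j > 0:  # backtrack from the end to (0, 0)
        path.append((i - 1, j - 1))
        i, j = min(
            (D[i - 1][j - 1], (i - 1, j - 1)),
            (D[i - 1][j], (i - 1, j)),
            (D[i][j - 1], (i, j - 1)),
        )[1]
    return path[::-1]
```

Applied to feature sequences derived from the lyrics (e.g., via a score or phoneme model) and from the audio, the path provides the lyrics-audio alignment the system navigates with.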
International Conference on Multimodal Interfaces | 2008
David Damm; Christian Fremerey; Frank Kurth; Meinard Müller; Michael Clausen
Recent digitization efforts have led to large music collections, which contain music documents of various modes comprising textual, visual, and acoustic data. In this paper, we present a multimodal music player for presenting and browsing digitized music collections consisting of heterogeneous document types. In particular, we concentrate on music documents of two widely used types for representing a musical work, namely visual music representations (scanned images of sheet music) and associated interpretations (audio recordings). We introduce novel user interfaces for multimodal (audio-visual) music presentation as well as intuitive navigation and browsing. Our system offers high-quality audio playback with time-synchronous display of the digitized sheet music associated with a musical work. Furthermore, our system enables a user to seamlessly crossfade between various interpretations belonging to the currently selected musical work.
Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) | 2005
Meinard Müller; Frank Kurth; Michael Clausen
Large music collections often contain several recordings of the same piece of music, interpreted by various musicians and possibly arranged in different instrumentations. Given a short query audio clip, an important task in audio retrieval is to automatically and efficiently identify all corresponding audio clips irrespective of the specific interpretation or instrumentation. In view of this problem, which is also referred to as audio matching, the main contribution of this paper is to introduce a new type of audio feature that strongly correlates to the harmonic progression of the audio signal. In addition, our feature shows a high degree of robustness to variations in parameters such as dynamics, timbre, articulation, and local tempo deviations. The feature design is carried out in two stages, basically taking short-time statistics over chroma-based energy distributions. Here, the chroma correspond to the 12 traditional pitch classes of the equal-tempered scale. Applied to audio matching on a large audio database consisting of a wide range of classical music (112 hours of audio material), our features proved to be a powerful tool, providing accurate matches efficiently with respect to both running time and memory requirements.
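The two-stage feature design described above can be sketched in simplified form: fold per-pitch energies into the 12 chroma classes, normalize each frame to an energy distribution, then take short-time statistics (here a plain windowed average with downsampling) to absorb local variations. The function names, window sizes, and the use of a simple mean are assumptions made for illustration; they are not the paper's exact feature definition.

```python
# Simplified two-stage chroma feature sketch: (1) fold pitch energies
# into 12 chroma bins and normalize, (2) smooth and downsample to
# absorb local variations in articulation and tempo.
import numpy as np

def chroma_from_pitch_energy(pitch_energy):
    """Fold per-MIDI-pitch energies (n_frames, 128) into 12 chroma bins."""
    n_frames = pitch_energy.shape[0]
    chroma = np.zeros((n_frames, 12))
    for p in range(128):
        chroma[:, p % 12] += pitch_energy[:, p]
    sums = chroma.sum(axis=1, keepdims=True) + 1e-12
    return chroma / sums  # each frame becomes an energy distribution

def smooth_downsample(chroma, win=10, hop=5):
    """Short-time statistics: average over `win` frames, keep every `hop`-th."""
    out = []
    for start in range(0, len(chroma) - win + 1, hop):
        out.append(chroma[start : start + win].mean(axis=0))
    return np.array(out)
```

The smoothing and downsampling step is what makes the resulting sequences coarse enough to match across different interpretations of the same piece.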
IEEE Workshop on Multimedia Signal Processing | 2002
Andreas Ribbrock; Frank Kurth
We give an overview of a novel framework for content-based multimedia retrieval. In this paper, we present an implementation for audio identification. The framework consists of an index-based search combining algebraic methods with classical full-text retrieval. In the main part of the paper, we propose several feature extractors which may be used for indexing the PCM audio data. We summarize our test results covering performance data (e.g., query response times), memory requirements (e.g., index size), and robustness issues. The size of our index turns out to be only about 1/1000th to 1/15000th of the original PCM material, depending on the required granularity for identifying a piece of audio.
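The index-based identification idea common to this line of work can be illustrated with a toy inverted index: quantized feature codes map to (track, position) lists, and a query is identified by voting for the (track, offset) pair supported by the most consistent matches. A deliberately simplified, hypothetical sketch; the real systems use far more elaborate features and fault-tolerant search.

```python
# Toy sketch of index-based audio identification: an inverted index over
# quantized feature codes, queried by offset voting.
from collections import defaultdict, Counter

def build_index(tracks):
    """tracks: {name: [feature codes]} -> inverted index code -> [(name, pos)]."""
    index = defaultdict(list)
    for name, codes in tracks.items():
        for pos, code in enumerate(codes):
            index[code].append((name, pos))
    return index

def identify(index, query):
    """Vote for (track, offset) pairs; return the best-supported track."""
    votes = Counter()
    for qpos, code in enumerate(query):
        for name, pos in index.get(code, []):
            votes[(name, pos - qpos)] += 1
    (name, _offset), _count = votes.most_common(1)[0]
    return name
```

Because only compact codes and positions are stored, such an index can be orders of magnitude smaller than the raw PCM data, consistent with the ratios reported in the abstract.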