Hendrik Purwins
Pompeu Fabra University
Publications
Featured research published by Hendrik Purwins.
International Symposium on Neural Networks | 2000
Hendrik Purwins; Benjamin Blankertz; Klaus Obermayer
CQ-profiles are 12-dimensional vectors, each component referring to a pitch class, and they can be employed to represent keys. CQ-profiles are calculated with the constant-Q filter bank. They have the following advantages: 1) they correspond to probe tone ratings; 2) calculation is possible in real time; 3) they are stable with respect to sound quality; and 4) they are transposable. By using the CQ-profile technique as a simple auditory model in combination with a self-organizing map (SOM), an arrangement of keys emerges that resembles results from psychological experiments and from music theory. CQ-profiles are reliably applied to modulation tracking by introducing a special distance measure.
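A minimal sketch of the core idea, folding constant-Q magnitudes into a 12-dimensional pitch-class profile. The use of librosa, the lowest analysis pitch, and the number of octaves are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch: fold constant-Q transform magnitudes into a 12-dim pitch-class profile.
import numpy as np
import librosa

def cq_profile(y, sr, fmin=None, n_octaves=5, bins_per_octave=12):
    if fmin is None:
        fmin = librosa.note_to_hz("C2")              # assumed lowest analysis pitch
    C = np.abs(librosa.cqt(y, sr=sr, fmin=fmin,
                           n_bins=n_octaves * bins_per_octave,
                           bins_per_octave=bins_per_octave))
    energy = C.sum(axis=1)                           # accumulate energy over time
    profile = energy.reshape(n_octaves, 12).sum(axis=0)  # fold octaves onto 12 classes
    return profile / (profile.sum() + 1e-12)         # normalise to unit mass

y, sr = librosa.load(librosa.example("trumpet"))     # any audio signal works here
print(cq_profile(y, sr).round(3))
```

A key comparison or modulation tracker could then measure distances between such profiles over time, as the abstract describes.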
IEEE/ASME Transactions on Mechatronics | 2014
Hendrik Purwins; Bernd Barak; Ahmed Nagi; Reiner Engel; Uwe Höckele; Andreas Kyek; Srikanth Cherla; Benjamin Lenz; Günter Pfeifer; Kurt Weinzierl
The quality of wafer production in semiconductor manufacturing cannot always be monitored by a costly physical measurement. Instead of measuring a quantity directly, it can be predicted by a regression method (virtual metrology). In this paper, a survey of regression methods is given for predicting the average silicon nitride cap layer thickness in the plasma-enhanced chemical vapor deposition dual-layer metal passivation stack process. Process and production equipment fault detection and classification data are used as predictor variables. Various variable sets are compared: the single most predictive variable, the three most predictive variables, an expert selection, and the full set. The following regression methods are compared: simple linear regression, multiple linear regression, partial least squares regression, ridge linear regression using the partial least squares estimate algorithm, and support vector regression (SVR). On a test set, SVR outperforms the other methods by a large margin, being more robust toward changes in the production conditions. The method performs better on the high-dimensional multivariate input data than on the most predictive variables alone. Process expert knowledge used for a priori variable selection enhances the performance slightly further. The results confirm earlier findings that virtual metrology can benefit from the robustness of SVR, an adaptive generic method that performs well even when no process knowledge is applied; integrating process expertise into the method improves the performance further still.
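The comparison described above maps naturally onto standard tooling. The sketch below contrasts the named regression families with cross-validated RMSE; the FDC dataset is not public, so synthetic data and the chosen hyperparameters are placeholders.

```python
# Sketch: compare linear, ridge, PLS and SVR regressors by cross-validated RMSE.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=50, noise=10.0, random_state=0)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "pls": PLSRegression(n_components=5),
    "svr": SVR(kernel="rbf", C=10.0, epsilon=0.1),
}
for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)    # scale features before fitting
    rmse = -cross_val_score(pipe, X, y, cv=5,
                            scoring="neg_root_mean_squared_error")
    print(f"{name:6s} RMSE = {rmse.mean():.2f} +/- {rmse.std():.2f}")
```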
IEEE Journal of Selected Topics in Signal Processing | 2011
Simon Scholler; Hendrik Purwins
To date there has been little work on using features from temporal approximations of signals for audio recognition. Time-frequency tradeoffs are an important issue in signal processing; sparse representations using overcomplete dictionaries may (or may not, depending on the dictionary) offer more time-frequency flexibility than the standard short-time Fourier transform. Moreover, the precise temporal structure of signals cannot be captured by spectral feature methods. Here, we present a biologically inspired three-step process for audio classification: 1) Efficient atomic functions are learned in an unsupervised manner on mixtures of percussion sounds (drum phrases), optimizing the length as well as the shape of the atoms. 2) An analog spike model is used to sparsely approximate percussion sound signals (bass drum, snare drum, hi-hat). The spike model consists of temporally shifted versions of the learned atomic functions, each with a precise temporal position and amplitude; given a set of atomic functions, the decomposition is obtained with matching pursuit. 3) Features are extracted from the resulting spike representation of the signal. The classification accuracy of our method using a support vector machine (SVM) in a 3-class database transfer task is 87.8%. Using gammatone functions instead of the learned sparse functions yields an even better classification rate of 97.6%. Testing the features on sounds containing additive white Gaussian noise reveals that the sparse approximation features are far more robust to such distortions than our benchmark feature set of timbre descriptor (TD) features.
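A hedged sketch of step 2, matching pursuit with time-shifted atoms that yields a "spike" list of (atom index, position, amplitude) triples. The gammatone-like kernels and all parameter values are assumptions standing in for the learned atoms of the paper.

```python
# Sketch: matching pursuit over time-shifted unit-norm atoms.
import numpy as np

def gammatone(fc, sr=22050, dur=0.03, order=4, b=120.0):
    t = np.arange(int(dur * sr)) / sr
    g = t**(order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)                       # unit norm for projections

def matching_pursuit(x, atoms, n_spikes=50):
    residual = x.copy()
    spikes = []
    for _ in range(n_spikes):
        best = None
        for k, atom in enumerate(atoms):
            corr = np.correlate(residual, atom, mode="valid")
            pos = int(np.argmax(np.abs(corr)))
            amp = corr[pos]
            if best is None or abs(amp) > abs(best[2]):
                best = (k, pos, amp)
        k, pos, amp = best
        residual[pos:pos + len(atoms[k])] -= amp * atoms[k]   # remove projection
        spikes.append(best)                            # (atom index, position, amplitude)
    return spikes, residual

sr = 22050
atoms = [gammatone(fc, sr) for fc in (100, 400, 1600, 6400)]
x = np.random.randn(sr)                                # placeholder for a percussion signal
spikes, res = matching_pursuit(x, atoms, n_spikes=20)
print(spikes[:3])
```

Classification features would then be computed from the spike list rather than from a spectrogram.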
Journal of Neural Engineering | 2014
Matthias Sebastian Treder; Hendrik Purwins; Daniel Miklody; Irene Sturm; Benjamin Blankertz
OBJECTIVE Polyphonic music (music consisting of several instruments playing in parallel) is an intuitive way of embedding multiple information streams. The different instruments in a musical piece form concurrent information streams that seamlessly integrate into a coherent and hedonistically appealing entity. Here, we explore polyphonic music as a novel stimulation approach for use in a brain-computer interface. APPROACH In a multi-streamed oddball experiment, we had participants shift selective attention to one out of three different instruments in music audio clips. Each instrument formed an oddball stream with its own specific standard stimuli (a repetitive musical pattern) and oddballs (deviating musical pattern). MAIN RESULTS Contrasting attended versus unattended instruments, ERP analysis shows subject- and instrument-specific responses including P300 and early auditory components. The attended instrument can be classified offline with a mean accuracy of 91% across 11 participants. SIGNIFICANCE This is a proof of concept that attention paid to a particular instrument in polyphonic music can be inferred from ongoing EEG, a finding that is potentially relevant for both brain-computer interface and music research.
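A hedged sketch of the offline classification step only: epoched EEG around each musical pattern is flattened into spatio-temporal features and classified with shrinkage-regularised LDA. Synthetic data, epoch dimensions, and the choice of classifier are assumptions; the authors' exact pipeline is not reproduced here.

```python
# Sketch: attended vs. unattended epochs classified with shrinkage LDA.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_epochs, n_channels, n_samples = 200, 32, 100     # assumed epoch dimensions
X = rng.normal(size=(n_epochs, n_channels, n_samples))
y = rng.integers(0, 2, size=n_epochs)              # 1 = attended, 0 = unattended
X[y == 1, :, 60:80] += 0.3                         # toy late positivity (P300-like)

features = X.reshape(n_epochs, -1)                 # spatio-temporal feature vector
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
acc = cross_val_score(clf, features, y, cv=5)
print(f"mean CV accuracy: {acc.mean():.2f}")
```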
Organised Sound | 2000
Hendrik Purwins; Benjamin Blankertz; Klaus Obermayer
In this paper the ingredients of computing auditory perception are reviewed. At the basic level there is neurophysiology, which is abstracted into artificial neural networks (ANNs) and, enriched by statistics, into machine learning. At the high level there are cognitive models derived from psychoacoustics (especially Gestalt principles). The gap between neuroscience and psychoacoustics has to be filled by numerics, statistics and heuristics. Computerised auditory models have a broad and diverse range of applications: hearing aids and implants, compression in audio codecs, automated music analysis, music composition, interactive music installations, and information retrieval from large databases of music samples.
Conference on Automation Science and Engineering | 2011
Hendrik Purwins; Ahmed Nagi; Bernd Barak; Uwe Höckele; Andreas Kyek; Benjamin Lenz; Günter Pfeifer; Kurt Weinzierl
Different approaches for predicting the average Silicon Nitride cap layer thickness in the Plasma Enhanced Chemical Vapor Deposition (PECVD) dual-layer metal passivation stack process are compared, based on metrology and production equipment Fault Detection and Classification (FDC) data. Various sets of FDC parameters are processed by different prediction algorithms. In particular, the use of high-dimensional multivariate input data is assessed in comparison to small parameter sets. As prediction methods, Simple Linear Regression, Multiple Linear Regression, Partial Least Squares Regression, and Ridge Linear Regression utilizing the Partial Least Squares Estimate algorithm are compared. Regression parameter optimization and model selection are performed and evaluated via cross validation and grid search, using the root mean squared error (RMSE). Process expert knowledge used for a priori selection of FDC parameters further enhances the performance. Our results indicate that Virtual Metrology can benefit from the use of regression methods that exploit collinearity, combined with comprehensive process expert knowledge.
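The model-selection step described above (grid search plus cross validation, scored by RMSE) can be sketched as follows; the ridge parameter grid and the synthetic data are illustrative assumptions.

```python
# Sketch: grid search with cross-validated RMSE for a ridge regression pipeline.
from sklearn.datasets import make_regression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=40, noise=5.0, random_state=1)

pipe = Pipeline([("scale", StandardScaler()), ("reg", Ridge())])
grid = {"reg__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}   # regularization strengths to try
search = GridSearchCV(pipe, grid, cv=5,
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)
print("best alpha:", search.best_params_["reg__alpha"])
print("CV RMSE   :", -search.best_score_)
```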
Connection Science | 2009
Amaury Hazan; Ricard Marxer; Paul Brossier; Hendrik Purwins; Perfecto Herrera; Xavier Serra
A causal system that represents a stream of music as musical events, and generates further expected events, is presented. Starting from an auditory front-end that extracts low-level features (e.g. MFCCs) and mid-level features such as onsets and beats, an unsupervised clustering process builds and maintains a set of symbols aimed at representing musical stream events using both timbre and time descriptions. The time events are represented using inter-onset intervals relative to the beats. These symbols are then processed by an expectation module using Prediction by Partial Matching (PPM), a multiscale technique based on N-grams. To characterise the ability of the system to generate an expectation that matches both the ground truth and the system transcription, we introduce several measures that take into account the uncertainty associated with the unsupervised encoding of the musical sequence. The system is evaluated using a subset of the ENST-drums database of annotated drum recordings. We compare three approaches to combining timing (when) and timbre (what) expectation. In our experiments, we show that the induced representation is useful for generating expectation patterns in a causal way.
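In the spirit of the expectation module, here is a minimal back-off N-gram predictor over event symbols (it is not an exact PPM implementation, and the symbol names are invented): it counts contexts up to a maximum order and backs off to shorter contexts when a longer one was never observed.

```python
# Sketch: variable-order next-symbol prediction with back-off to shorter contexts.
from collections import defaultdict, Counter

class BackoffNgram:
    def __init__(self, max_order=3):
        self.max_order = max_order
        self.counts = defaultdict(Counter)        # context tuple -> next-symbol counts

    def fit(self, seq):
        for i in range(len(seq)):
            for order in range(self.max_order + 1):
                if i - order < 0:
                    break
                self.counts[tuple(seq[i - order:i])][seq[i]] += 1

    def predict(self, history):
        for order in range(min(self.max_order, len(history)), -1, -1):
            ctx = tuple(history[len(history) - order:])
            if ctx in self.counts:                # longest known context wins
                return self.counts[ctx].most_common(1)[0][0]
        return None

# Symbols could jointly encode a timbre cluster and a relative inter-onset interval.
events = ["kick_1", "hat_q", "snare_1", "hat_q"] * 8
model = BackoffNgram(max_order=3)
model.fit(events)
print(model.predict(["snare_1", "hat_q", "kick_1"]))   # expected: "hat_q"
```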
IEEE Transactions on Audio, Speech, and Language Processing | 2012
Kamil Adiloglu; Robert Anniés; Elio Wahlen; Hendrik Purwins; Klaus Obermayer
Studies by Gaver (W. W. Gaver, "How do we hear in the world? Explorations in ecological acoustics," Ecological Psychology, 1993) revealed that humans categorize everyday sounds according to the processes that generated them: he defined these categories in a taxonomy according to the aggregate states of the involved materials (solid, liquid, gas) and the physical nature of the sound-generating interaction, such as deformation or friction for solids. We exemplified this taxonomy in an everyday sound database that contains recordings of basic isolated sound events of these categories. We used a sparse method to represent and to visualize these sound events. This representation relies on a sparse decomposition of sounds into atomic filter functions in the time-frequency domain. The filter functions maximally correlated with a given sound are selected automatically to perform the decomposition. The obtained sparse point pattern depicts the skeleton of the given sound. The visualization of these point patterns revealed that acoustically similar sounds have similar point patterns. To detect these similarities, we defined a novel dissimilarity function that treats these point patterns as 3-D point graphs and applied a graph matching algorithm, which assigns the points of one sound to the points of the other. This novel dissimilarity measure is used in combination with a kernel machine for the classification experiments, yielding an average accuracy of 95% in one-versus-one discrimination tasks.
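A sketch of a point-pattern dissimilarity in this spirit: the Hungarian assignment (scipy's linear_sum_assignment) stands in for the paper's graph-matching algorithm, and the point clouds of (time, frequency, amplitude) triples are synthetic.

```python
# Sketch: assignment-based dissimilarity between two sparse point patterns.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.optimize import linear_sum_assignment

def point_pattern_dissimilarity(a, b):
    """a, b: arrays of shape (n_points, 3) holding (time, frequency, amplitude)."""
    cost = cdist(a, b)                        # pairwise Euclidean costs
    rows, cols = linear_sum_assignment(cost)  # optimal one-to-one assignment
    matched = cost[rows, cols].sum()
    unmatched = abs(len(a) - len(b))          # penalise points left unassigned
    return matched + unmatched

rng = np.random.default_rng(0)
s1 = rng.random((12, 3))                      # sparse "skeleton" of sound 1
s2 = rng.random((15, 3))                      # sparse "skeleton" of sound 2
d = point_pattern_dissimilarity(s1, s2)
print("dissimilarity:", round(d, 3), "-> kernel value:", round(float(np.exp(-d)), 3))
```

The exponentiated negative dissimilarity illustrates how such a measure could feed a kernel machine, as the abstract describes.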
Connection Science | 2009
Martin Coath; Susan L. Denham; Leigh M. Smith; Henkjan Honing; Amaury Hazan; Piotr Holonowicz; Hendrik Purwins
We describe a biophysically motivated model of auditory salience based on a model of cortical responses, and we present results showing that the derived salience measure can successfully identify the positions of perceptual onsets in a musical stimulus. The salience measure is also shown to be useful for tracking beats and predicting the rhythmic structure of the stimulus on the basis of its periodicity patterns. We evaluate the method using a corpus of unaccompanied freely sung stimuli and show that it performs well, in some cases better than state-of-the-art algorithms. These results deserve attention because they are derived from a general model of auditory processing rather than from a model tuned specifically for onset detection or beat-tracking tasks.
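A small sketch of how a salience curve's periodicity can be used downstream: peak-pick onset candidates and estimate the dominant beat period via autocorrelation. The salience curve here is a synthetic pulse train; the paper derives it from a cortical response model.

```python
# Sketch: onset candidates and beat period from a (synthetic) salience curve.
import numpy as np

fps = 100                                             # salience frames per second (assumed)
t = np.arange(10 * fps)
salience = np.maximum(0, np.sin(2 * np.pi * t / 50))**8    # pulses every 0.5 s
salience += 0.05 * np.random.default_rng(0).random(len(t))

# Onset candidates: local maxima above a threshold.
peaks = [i for i in range(1, len(salience) - 1)
         if salience[i] > salience[i - 1]
         and salience[i] >= salience[i + 1]
         and salience[i] > 0.5]

# Dominant periodicity via autocorrelation within a plausible beat range.
x = salience - salience.mean()
ac = np.correlate(x, x, mode="full")[len(x) - 1:]
lags = np.arange(int(0.3 * fps), int(2.0 * fps))      # periods for roughly 30-200 BPM
period = lags[np.argmax(ac[lags])]
print(f"{len(peaks)} onset candidates, beat period ~ {period / fps:.2f} s")
```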
Computer Music Journal | 2013
Srikanth Cherla; Hendrik Purwins; Marco Marchini
A framework is proposed for generating interesting, musically similar variations of a given monophonic melody. The focus is on pop/rock guitar and bass guitar melodies with the aim of eventual extensions to other instruments and musical styles. It is demonstrated here how learning musical style from segmented audio data can be formulated as an unsupervised learning problem to generate a symbolic representation. A melody is first segmented into a sequence of notes using onset detection and pitch estimation. A set of hierarchical, coarse-to-fine symbolic representations of the melody is generated by clustering pitch values at multiple similarity thresholds. The variance ratio criterion is then used to select the appropriate clustering levels in the hierarchy. Note onsets are aligned with beats, considering the estimated meter of the melody, to create a sequence of symbols that represent the rhythm in terms of onsets/rests and the metrical locations of their occurrence. A joint representation based on the cross-product of the pitch cluster indices and metrical locations is used to train the prediction model, a variable-length Markov chain. The melodies generated by the model were evaluated through a questionnaire by a group of experts, and received an overall positive response.
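A hedged sketch of the final step, a variable-length Markov chain over joint (pitch-cluster, metrical-position) symbols that samples continuations from the longest context seen in training. The symbol names and the tiny corpus are invented for illustration.

```python
# Sketch: variable-length Markov chain trained on symbols, sampling variations.
import random
from collections import defaultdict, Counter

def train_vlmc(seq, max_order=4):
    model = defaultdict(Counter)                  # context tuple -> next-symbol counts
    for i in range(1, len(seq)):
        for order in range(1, max_order + 1):
            if i - order < 0:
                break
            model[tuple(seq[i - order:i])][seq[i]] += 1
    return model

def generate(model, seed, length=16, max_order=4):
    out = list(seed)
    for _ in range(length):
        for order in range(min(max_order, len(out)), 0, -1):
            ctx = tuple(out[-order:])             # longest matching context first
            if ctx in model:
                symbols, weights = zip(*model[ctx].items())
                out.append(random.choices(symbols, weights=weights)[0])
                break
        else:
            break                                 # no known context: stop generating
    return out

# Each symbol pairs a pitch cluster with a metrical position, e.g. "C/1".
corpus = ["C/1", "E/2", "G/3", "E/4", "C/1", "G/2", "E/3", "rest/4"] * 4
model = train_vlmc(corpus)
print(generate(model, seed=corpus[:2], length=12))
```

Because the corpus contains two different continuations of the same context, sampling yields varied but stylistically consistent sequences, mirroring the idea of generating musically similar variations.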