Publication


Featured research published by Jordi Janer.


EURASIP Journal on Audio, Speech, and Music Processing | 2010

Ecological acoustics perspective for content-based retrieval of environmental sounds

Gerard Roma; Jordi Janer; Stefan Kersten; Mattia Schirosa; Perfecto Herrera; Xavier Serra

In this paper we present a method to search for environmental sounds in large unstructured databases of user-submitted audio, using a general sound events taxonomy from ecological acoustics. We discuss the use of Support Vector Machines to classify sound recordings according to the taxonomy and describe two use cases for the obtained classification models: a content-based web search interface for a large audio database and a method for segmenting field recordings to assist sound design.


International Conference on Latent Variable Analysis and Signal Separation | 2017

Monoaural audio source separation using deep convolutional neural networks

Pritish Chandna; Marius Miron; Jordi Janer; Emilia Gómez

In this paper we introduce a low-latency monaural source separation framework using a Convolutional Neural Network (CNN). We use a CNN to estimate time-frequency soft masks which are applied for source separation. We evaluate the performance of the neural network on a database comprising musical mixtures of three instruments (voice, drums, and bass) as well as other instruments which vary from song to song. The proposed architecture is compared to a Multilayer Perceptron (MLP), achieving on-par results and a significant improvement in processing time. The algorithm was submitted to source separation evaluation campaigns to test efficiency, and achieved competitive results.
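
The soft-masking step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a network has already produced non-negative magnitude estimates per source for each time-frequency bin. The function names `soft_masks` and `apply_mask` and the toy values are hypothetical, and the ratio-mask form shown here is a common choice, not necessarily the exact one used in the paper.

```python
def soft_masks(source_estimates, eps=1e-12):
    """Turn per-source magnitude estimates into ratio masks that sum to ~1 per bin.

    source_estimates: dict mapping source name -> list of non-negative
    magnitudes, one per time-frequency bin (all lists the same length).
    """
    names = list(source_estimates)
    n_bins = len(source_estimates[names[0]])
    # Total estimated magnitude per bin; eps avoids division by zero.
    totals = [sum(source_estimates[n][k] for n in names) + eps for k in range(n_bins)]
    return {n: [source_estimates[n][k] / totals[k] for k in range(n_bins)] for n in names}


def apply_mask(mixture_bins, mask):
    """Scale each complex mixture bin by its mask value, keeping the mixture phase."""
    return [m * x for m, x in zip(mask, mixture_bins)]


# Toy example: three time-frequency bins of a complex mixture spectrogram.
mixture = [1 + 1j, 2 + 0j, 0 + 4j]
estimates = {"voice": [3.0, 1.0, 0.0], "drums": [1.0, 1.0, 2.0]}

masks = soft_masks(estimates)
voice = apply_mask(mixture, masks["voice"])
drums = apply_mask(mixture, masks["drums"])
```

Because the masks sum to approximately one in every bin, the separated sources add back up to the mixture, which is one reason ratio masks are a popular post-processing step.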


International Conference on Latent Variable Analysis and Signal Separation | 2012

Low-Latency instrument separation in polyphonic audio using timbre models

Ricard Marxer; Jordi Janer; Jordi Bonada

This research focuses on the removal of the singing voice in polyphonic audio recordings under real-time constraints. It is based on time-frequency binary masks resulting from the combination of azimuth, phase difference and absolute frequency spectral bin classification and harmonic-derived masks. For the harmonic-derived masks, a pitch likelihood estimation technique based on Tikhonov regularization is proposed. A method for target instrument pitch tracking makes use of supervised timbre models. This approach runs in real-time on off-the-shelf computers with latency below 250 ms. The method was compared to a state-of-the-art Non-negative Matrix Factorization (NMF) offline technique and to the ideal binary mask separation. For the evaluation we used a dataset of multi-track versions of professional audio recordings.
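
The ideal binary mask used as a reference point in this abstract can be sketched as follows. This is an oracle baseline: it assumes access to the isolated target and accompaniment magnitudes, which a real system does not have. The function name `ideal_binary_mask` and the toy magnitudes are hypothetical, and the tie-breaking rule (target wins on equality) is an assumption.

```python
def ideal_binary_mask(target_mag, other_mag):
    """Return 1 for bins where the target is at least as strong as the rest, else 0."""
    return [1 if t >= o else 0 for t, o in zip(target_mag, other_mag)]


# Toy magnitudes for four time-frequency bins.
target = [0.9, 0.1, 0.5, 0.0]         # isolated singing voice
accompaniment = [0.2, 0.8, 0.5, 0.3]  # everything else
mixture = [1.1, 0.9, 1.0, 0.3]

mask = ideal_binary_mask(target, accompaniment)

# Removing the voice means keeping only the bins the mask does not claim.
voice_removed = [(1 - m) * x for m, x in zip(mask, mixture)]
```

Since every bin is assigned wholly to one side, a binary mask is cheap to apply but discards any accompaniment energy that overlaps with the voice.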


International Conference on Acoustics, Speech, and Signal Processing | 2012

A Tikhonov regularization method for spectrum decomposition in low latency audio source separation

Ricard Marxer; Jordi Janer

We present the use of a Tikhonov regularization based method, as an alternative to the Non-negative Matrix Factorization (NMF) approach, for source separation in professional audio recordings. This method is a direct and computationally less expensive solution to the problem, which makes it interesting in low latency scenarios. The technique sacrifices the non-negativity constraint that characterizes NMF in exchange for a closed-form solution to the problem of spectrum factorization. We quantitatively evaluated it in terms of reconstruction and separation quality on a dataset of excerpts of professionally recorded songs with singing voice. Results show that the proposed approach achieves similar quality to that of NMF.
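
The trade-off described in this abstract, giving up non-negativity for a closed-form solution, can be illustrated with a minimal sketch. It assumes the standard ridge form with an identity regularizer, minimizing ||x - Bg||^2 + lam * ||g||^2, which may differ from the paper's exact formulation. The names `tikhonov_gains` and `solve` and the toy basis are hypothetical, and the code is kept dependency-free for illustration only.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def tikhonov_gains(basis, spectrum, lam):
    """Closed-form minimizer of ||spectrum - basis @ g||^2 + lam * ||g||^2.

    basis: list of F rows with K columns; spectrum: list of F values.
    Unlike NMF, nothing forces the returned gains to be non-negative.
    """
    n_bins, n_comp = len(basis), len(basis[0])
    # Normal equations: (B^T B + lam * I) g = B^T x
    gram = [[sum(basis[f][i] * basis[f][j] for f in range(n_bins))
             + (lam if i == j else 0.0)
             for j in range(n_comp)] for i in range(n_comp)]
    rhs = [sum(basis[f][i] * spectrum[f] for f in range(n_bins)) for i in range(n_comp)]
    return solve(gram, rhs)


# Toy example: two basis spectra (columns) over three frequency bins.
basis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
spectrum = [2.0, 3.0, 5.0]  # exactly 2 * column 0 + 3 * column 1

gains = tikhonov_gains(basis, spectrum, lam=0.0)
shrunk = tikhonov_gains(basis, spectrum, lam=1.0)
```

With `lam=0.0` this reduces to ordinary least squares and recovers the mixing gains exactly; a positive `lam` shrinks the gains toward zero. The whole decomposition is a single linear solve, in contrast to the iterative updates NMF requires.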


Journal of Electrical and Computer Engineering | 2016

Score-Informed Source Separation for Multichannel Orchestral Recordings

Marius Miron; Julio J. Carabias-Orti; Juan J. Bosch; Emilia Gómez; Jordi Janer

This paper proposes a system for score-informed audio source separation for multichannel orchestral recordings. The orchestral music repertoire relies on the existence of scores. Thus, a reliable separation requires a good alignment of the score with the audio of the performance. To that extent, automatic score alignment methods are reliable when allowing a tolerance window around the actual onset and offset. Moreover, several factors increase the difficulty of our task: a highly reverberant image, large ensembles having rich polyphony, and a large variety of instruments recorded within a distant-microphone setup. To solve these problems, we design context-specific methods such as the refinement of score-following output in order to obtain a more precise alignment. Moreover, we extend a close-microphone separation framework to deal with the distant-microphone orchestral recordings. Then, we propose the first open evaluation dataset in this musical context, including annotations of the notes played by multiple instruments from an orchestral ensemble. The evaluation aims at analyzing the interactions of important parts of the separation framework on the quality of separation. Results show that we are able to align the original score with the audio of the performance and separate the sources corresponding to the instrument sections.


International Conference on Acoustics, Speech, and Signal Processing | 2012

Combining a harmonic-based NMF decomposition with transient analysis for instantaneous percussion separation

Jordi Janer; Ricard Marxer; Keita Arimoto

Many recent approaches to musical source separation rely on model-based inference methods that take into account the signal's harmonic structure. To address the particular case of instantaneous percussion separation, we propose a method that combines a harmonic-based decomposition using a Non-negative Matrix Factorization (NMF) algorithm, with the transient analysis of spectral peaks from a single audio frame. The signal model allows the estimation of harmonic and non-harmonic sources. Later, as shown in the evaluation, adding transient peak information improves the Signal-to-Distortion Ratio (SDR). Compared to other existing methods, this approach achieves a comparable performance, being suitable at the same time for low-latency conditions.
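
For readers unfamiliar with the Signal-to-Distortion Ratio cited in this abstract, the following sketch computes a simplified version of it. It treats all deviation from the reference as distortion, which is an assumption: standard evaluation toolkits first decompose the error into interference, artifact, and target components, so their numbers will generally differ. The function name `sdr_db` and the toy signals are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy over the energy of the estimation error.

    Assumes a non-silent reference and equal-length sequences of samples.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy example: the estimate is the reference attenuated by 10 percent.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
score = sdr_db(reference, estimate)
```

Here the error amplitude is one tenth of the signal amplitude, so the energy ratio is 100 and the score is about 20 dB; higher values indicate a cleaner separation.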


International Conference on Games and Virtual Worlds for Serious Applications | 2016

Immersive Orchestras: Audio Processing for Orchestral Music VR Content

Jordi Janer; Emilia Gómez; Agustín Martorell; Marius Miron; Benjamin de Wit

This paper combines Audio Signal Processing and Virtual Reality (VR) content to create novel immersive experiences for orchestral music audiences. In VR, the auralization of sound sources of recorded live content remains a rather unexplored topic. We aim to build a multimodal experience, where visual and audio cues bring a sonic augmentation of the real scene. In the particular scenario of orchestral music content, our goal is to acoustically zoom in on a particular instrument when the VR user stares at it. This work aims to improve the learning aspects of music listening, either for education or for personal enrichment. We use audio signal processing to separate different sound sources (instruments) in an acoustic scene (orchestral music recording). Given the signals captured by multiple microphones and the musical score of the piece, our system is able to isolate the different instruments. From the processed separated tracks, we use a binaural rendering technique to emphasize a given instrument. For these experiments we used original content from top European orchestras.


Interaction Design and Children | 2010

KaleiVoiceKids: interactive real-time voice transformation for children

Oscar Mayor; Jordi Bonada; Jordi Janer

In this paper we describe the adaptation of an existing real-time voice transformation exhibit to the special case of children as the interacting subjects. Many factors have been taken into consideration to adapt the body interaction design, the visual feedback given to the user and the core technology itself to fulfill the requirements of children. The paper includes a description of this installation that is being used daily by hundreds of children in a permanent museum exhibition.


International Conference on Latent Variable Analysis and Signal Separation | 2015

Evaluation of the Convolutional NMF for Supervised Polyphonic Music Transcription and Note Isolation

Stanislaw Gorlow; Jordi Janer

We evaluate the convolutive nonnegative matrix factorization in the context of automatic music transcription of polyphonic piano recordings and the associated problem of note isolation. Our intention is to find out whether the temporal continuity of piano notes is truthfully captured by the convolutional kernels and how the performance scales with complexity. Systematic studies of this kind are lacking in existing literature. We make use of established measures of accuracy and similarity. NMF dictionaries covering the piano's pitch range are learned from a given sample bank of isolated notes. The kernel alias patch size is varied. By using a measure of performance advantage, we show that the improvements due to convolved bases do not justify the extra computational effort as compared to the standard NMF. In particular, this is true for the more realistic case, in which the dictionary does not fully correspond to the mixture signal. Further pertinent conclusions are drawn as well.


Audio Mostly Conference | 2011

Towards equalization of environmental sounds using auditory-based features

Jorge Garcia; Stefan Kersten; Jordi Janer

In this paper we describe methods to assist soundscape design, sound production and processing for interactive environments, like games and simulations. Using auditory filter banks and sound texture synthesis, we develop algorithms that can be integrated with existing audio engines and can additionally support the development of dedicated high-level audio tools aimed at content authoring or transformations based on samples. The relationship between the auditory excitation patterns and the computation algorithm is explained within the context of footstep sounds. Moreover, methods for sound texture synthesis of water streams with artificial expansion of timbre space using auditory filtering techniques are presented.

Collaboration


Dive into Jordi Janer's collaborations.

Top Co-Authors

Jordi Bonada

Pompeu Fabra University

Gerard Roma

Pompeu Fabra University

Marius Miron

Pompeu Fabra University
