Jordi Pons
Pompeu Fabra University
Publication
Featured research published by Jordi Pons.
content based multimedia indexing | 2016
Jordi Pons; Thomas Lidy; Xavier Serra
A common criticism of deep learning relates to the difficulty of understanding the underlying relationships that neural networks learn, which makes them behave like a black box. In this article we explore various architectural choices relevant to music signal classification tasks in order to start understanding what the chosen networks are learning. We first discuss how convolutional filters with different shapes can fit specific musical concepts, and based on that we propose several musically motivated architectures. These architectures are then assessed by measuring the accuracy of the deep learning model in predicting various music classes using a known dataset of audio recordings of ballroom music. The classes in this dataset have a strong correlation with tempo, which allows assessing whether the proposed architectures are learning frequency and/or time dependencies. Additionally, a black-box model is proposed as a baseline for comparison. With these experiments we have been able to understand what some deep learning based algorithms can learn from a particular set of data.
international conference on acoustics, speech, and signal processing | 2017
Jordi Pons; Xavier Serra
Many researchers use convolutional neural networks with small rectangular filters for music (spectrogram) classification. First, we discuss why there is no reason to use this filter setup by default; second, we point out that more efficient architectures could be implemented if the characteristics of the music features are considered during the design process. Specifically, we propose a novel design strategy that might promote more expressive and intuitive deep learning architectures by efficiently exploiting the representational capacity of the first layer - using different filter shapes adapted to fit musical concepts within the first layer. The proposed architectures are assessed by measuring their accuracy in predicting the classes of the Ballroom dataset. We also make the code used available (together with the audio data) so that this research is fully reproducible.
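To make the idea of differently shaped first-layer filters concrete, the following is a minimal pure-Python sketch, not the authors' released code. It assumes a toy spectrogram laid out as frequency bins by time frames; the filter sizes (1x3, 3x1, 3x3), the unit weights, and the helper names `conv2d_valid` and `shape` are illustrative assumptions rather than values taken from the paper.

```python
def conv2d_valid(spec, kernel):
    """Valid-mode 2D cross-correlation of spec (freq x time) with kernel."""
    n_freq, n_time = len(spec), len(spec[0])
    k_freq, k_time = len(kernel), len(kernel[0])
    out = []
    for f in range(n_freq - k_freq + 1):
        row = []
        for t in range(n_time - k_time + 1):
            acc = 0.0
            for i in range(k_freq):
                for j in range(k_time):
                    acc += spec[f + i][t + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def shape(matrix):
    """Return (rows, columns) of a nested-list matrix."""
    return (len(matrix), len(matrix[0]))


# Toy spectrogram (assumed values): 4 frequency bins x 6 time frames.
spec = [[float(f + t) for t in range(6)] for f in range(4)]

# Hypothetical filter shapes, all with unit weights.
temporal_filter = [[1.0, 1.0, 1.0]]        # 1x3: spans time only
frequency_filter = [[1.0], [1.0], [1.0]]   # 3x1: spans frequency only
square_filter = [[1.0] * 3 for _ in range(3)]  # 3x3: small default-style filter

temporal_map = conv2d_valid(spec, temporal_filter)
frequency_map = conv2d_valid(spec, frequency_filter)
square_map = conv2d_valid(spec, square_filter)

print(shape(temporal_map), shape(frequency_map), shape(square_map))
```

In this sketch the 1x3 filter only aggregates across neighbouring time frames within a single frequency bin, while the 3x1 filter only aggregates across neighbouring frequency bins within a single frame; the 3x3 filter mixes both axes at once.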
international conference on acoustics, speech, and signal processing | 2018
Dario Rethage; Jordi Pons; Xavier Serra
european signal processing conference | 2017
Jordi Pons; Olga Slizovskaia; Rong Gong; Emilia Gómez; Xavier Serra
international symposium/conference on music information retrieval | 2017
Jordi Pons; Rong Gong; Xavier Serra
international symposium/conference on music information retrieval | 2017
Rong Gong; Jordi Pons; Xavier Serra
arXiv: Sound | 2018
Eduardo Fonseca; Manoj Plakal; Frederic Font; Daniel P. W. Ellis; Xavier Favory; Jordi Pons; Xavier Serra
arXiv: Sound | 2018
Jordi Pons; Joan Serrà; Xavier Serra
arXiv: Sound | 2018
Jordi Pons; Xavier Serra
international symposium/conference on music information retrieval | 2017
Eduardo Fonseca; Jordi Pons; Xavier Favory; Frederic Font; Dmitry Bogdanov; Andres Ferraro; Sergio Oramas; Alastair Porter; Xavier Serra