Publication


Featured research published by Martín Haro.


Information Processing and Management | 2013

Semantic audio content-based music recommendation and visualization based on user preference examples

Dmitry Bogdanov; Martín Haro; Ferdinand Fuhrmann; Anna Xambó; Emilia Gómez; Perfecto Herrera

Preference elicitation is a challenging fundamental problem when designing recommender systems. In the present work we propose a content-based technique to automatically generate a semantic representation of the user's musical preferences directly from audio. Starting from an explicit set of music tracks provided by the user as evidence of his/her preferences, we infer high-level semantic descriptors for each track, obtaining a user model. To prove the benefits of our proposal, we present two applications of our technique. In the first one, we consider three approaches to music recommendation, two of them based on a semantic music similarity measure, and one based on a semantic probabilistic model. In the second application, we address the visualization of the user's musical preferences by creating a humanoid cartoon-like character - the Musical Avatar - automatically inferred from the semantic representation. We conducted a preliminary evaluation of the proposed technique in the context of these applications with 12 subjects. The results are promising: the recommendations were positively evaluated and close to those coming from state-of-the-art metadata-based systems, and the subjects judged the generated visualizations to capture their core preferences. Finally, we highlight the advantages of the proposed semantic user model for enhancing the user interfaces of information filtering systems.


PLOS ONE | 2012

Zipf's Law in Short-Time Timbral Codings of Speech, Music, and Environmental Sound Signals

Martín Haro; Joan Serrà; Perfecto Herrera; Alvaro Corral

Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such sensations, the shape of the power spectrum has to be encoded in a way that preserves certain physical and perceptual properties. Therefore, it is common practice to encode short-time power spectra using psychoacoustical frequency scales. In this paper, we study and characterize the statistical properties of such encodings, here called timbral code-words. In particular, we report on rank-frequency distributions of timbral code-words extracted from 740 hours of audio coming from disparate sources such as speech, music, and environmental sounds. Analogously to text corpora, we find a heavy-tailed Zipfian distribution with exponent close to one. Importantly, this distribution is found independently of different encoding decisions and regardless of the audio source. Further analysis of the intrinsic characteristics of the most and least frequent code-words reveals that the most frequent code-words tend to have a more homogeneous structure. We also find that speech and music databases have specific, distinctive code-words while, in the case of the environmental sounds, such database-specific code-words are not present. Finally, we find that a Yule-Simon process with memory provides a reasonable quantitative approximation for our data, suggesting the existence of a common simple generative mechanism for all considered sound sources.
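The rank-frequency analysis described in this abstract can be illustrated with a minimal sketch. It assumes the code-words are already available as hashable labels and estimates the exponent with a plain least-squares fit in log-log space; the function names `rank_frequency` and `zipf_exponent` and the synthetic data are hypothetical, and the fit is a simplification rather than the estimation procedure used in the paper.

```python
import math
from collections import Counter

def rank_frequency(codewords):
    """Return code-word counts sorted from most to least frequent."""
    return sorted(Counter(codewords).values(), reverse=True)

def zipf_exponent(frequencies):
    """Estimate the exponent as minus the slope of log(frequency) vs. log(rank).

    Illustrative assumption: an ordinary least-squares fit, not the paper's method.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic corpus (not real audio data): the code-word of rank r occurs 1000 // r times.
toy_codewords = []
for rank in range(1, 51):
    toy_codewords.extend([f"cw{rank}"] * (1000 // rank))

freqs = rank_frequency(toy_codewords)
alpha = zipf_exponent(freqs)
print(f"estimated exponent: {alpha:.2f}")
```

On this synthetic corpus the estimated exponent comes out close to one by construction; on real data a log-log regression is only a rough first look.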


Content-Based Multimedia Indexing | 2011

A content-based system for music recommendation and visualization of user preferences working on semantic notions

Dmitry Bogdanov; Martín Haro; Ferdinand Fuhrmann

The amount of digital music has grown unprecedentedly in recent years and requires the development of effective methods for search and retrieval. In particular, content-based preference elicitation for music recommendation is a challenging problem that is addressed in this paper. We present a system which automatically generates recommendations and visualizes a user's musical preferences, given her/his accounts on popular online music services. Using these services, the system retrieves a set of tracks preferred by a user, and further computes a semantic description of musical preferences based on raw audio information. For the audio analysis we used the capabilities of the Canoris API. Thereafter, the system generates music recommendations, using a semantic music similarity measure, and a visualization of the user's preferences, mapping semantic descriptors to visual elements.


International World Wide Web Conferences | 2012

Power-law distribution in encoded MFCC frames of speech, music, and environmental sound signals

Martín Haro; Joan Serrà; Alvaro Corral; Perfecto Herrera

Many sound-related applications use Mel-Frequency Cepstral Coefficients (MFCC) to describe audio timbral content. Most of the research efforts dealing with MFCCs have been focused on the study of different classification and clustering algorithms, the use of complementary audio descriptors, or the effect of different distance measures. The goal of this paper is to focus on the statistical properties of the MFCC descriptor itself. For that purpose, we use a simple encoding process that maps a short-time MFCC vector to a dictionary of binary code-words. We study and characterize the rank-frequency distribution of such MFCC code-words, considering speech, music, and environmental sound sources. We show that, regardless of the sound source, MFCC code-words follow a shifted power-law distribution. This implies that there are a few code-words that occur very frequently and many that happen rarely. We also observe that the inner structure of the most frequent code-words has characteristic patterns. For instance, close MFCC coefficients tend to have similar quantization values in the case of music signals. Finally, we study the rank-frequency distributions of individual music recordings and show that they present the same type of heavy-tailed distribution as found in the large-scale databases. This fact is exploited in two supervised semantic inference tasks: genre and instrument classification. In particular, we obtain similar classification results as the ones obtained by considering all frames in the recordings by just using 50 (properly selected) frames. Beyond this particular example, we believe that the fact that MFCC frames follow a power-law distribution could potentially have important implications for future audio-based applications.
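The encoding step mentioned in this abstract, mapping a short-time MFCC vector to a binary code-word, might look roughly like the sketch below. The abstract does not specify the quantization rule, so thresholding each coefficient against its median over all frames is an assumption made here for illustration; the names `fit_thresholds` and `encode` and the toy frames are hypothetical.

```python
from statistics import median

def fit_thresholds(frames):
    """One threshold per coefficient: its median across all frames (assumed rule)."""
    return [median(column) for column in zip(*frames)]

def encode(frame, thresholds):
    """Map a coefficient vector to a binary code-word string.

    A position is '1' when the coefficient exceeds its threshold, else '0'.
    """
    return "".join("1" if value > t else "0" for value, t in zip(frame, thresholds))

# Toy 3-coefficient "MFCC" frames (made-up numbers, not extracted from audio).
frames = [
    [1.0, -2.0, 0.5],
    [3.0, 0.0, 0.1],
    [2.0, 4.0, -0.3],
]

thresholds = fit_thresholds(frames)
codes = [encode(frame, thresholds) for frame in frames]
print(codes)
```

Counting how often each resulting code-word occurs then gives the rank-frequency distribution that the abstract characterizes.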


Applied Sciences on Biomedical and Communication Technologies | 2011

Music perception with current signal processing strategies for cochlear implants

Waldo Nogueira; Martín Haro; Perfecto Herrera; Xavier Serra

This work presents a brief review of hearing with cochlear implants, with emphasis on music perception. Although speech perception in noise with cochlear implants is still the major challenge, music perception is becoming more and more important. Music can modulate emotions and stimulate the brain in different ways than speech; for this reason, music can affect the quality of life of cochlear implant users. In this paper we present traditional and new trends to improve the perception of pitch with cochlear implants, as well as some signal processing methods that have been designed with the aim of improving music perception. Finally, a review of music evaluation methods is presented.


Computer Music Modeling and Retrieval | 2012

Sample Identification in Hip Hop Music

Jan Van Balen; Joan Serrà; Martín Haro

Sampling is a creative tool that has been widespread in popular music production and composition since the 1980s. However, the concept of sampling has long been unaddressed in Music Information Retrieval. We argue that information on the origin of samples has great musicological value and can be used to organise and disclose large music collections. In this paper we introduce the problem of automatic sample identification and present a first approach for the case of hip hop music. In particular, we modify and optimize an existing fingerprinting approach to meet the necessary requirements of a real-world sample identification task. The obtained results show the viability of such an approach, and open new avenues for research, especially with regard to inferring artist influences and detecting musical reuse.


Scientific Reports | 2012

Measuring the Evolution of Contemporary Western Popular Music

Joan Serrà; Alvaro Corral; Marián Boguñá; Martín Haro; Josep Lluis Arcos


International Symposium/Conference on Music Information Retrieval | 2009

Scalability, generality and temporal aspects in automatic recognition of predominant musical instruments in polyphonic music

Ferdinand Fuhrmann; Martín Haro; Perfecto Herrera


Conference on Recommender Systems | 2010

Content-based music recommendation based on user preference examples

Dmitry Bogdanov; Martín Haro; Ferdinand Fuhrmann; Emilia Gómez; Perfecto Herrera


International Symposium/Conference on Music Information Retrieval | 2009

Music and geography: content description of musical audio from different parts of the world

Emilia Gómez; Martín Haro; Perfecto Herrera

Collaboration


Dive into Martín Haro's collaborations.

Top Co-Authors

Josep Lluis Arcos

Spanish National Research Council


Alvaro Corral

Autonomous University of Barcelona
