Mohamed Sordo
Pompeu Fabra University
Publication
Featured research published by Mohamed Sordo.
International World Wide Web Conferences | 2012
Yi-Hsuan Yang; Dmitry Bogdanov; Perfecto Herrera; Mohamed Sordo
The emergence of social tagging websites such as Last.fm has provided new opportunities for learning computational models that automatically tag music. Researchers typically obtain music tags from the Internet and use them to construct machine learning models. Nevertheless, such tags are usually noisy and sparse. In this paper, we present a preliminary study that aims at refining (retagging) social tags by exploiting the content similarity between tracks and the semantic redundancy of the track-tag matrix. The evaluated algorithms include a graph-based label propagation method that is often used in semi-supervised learning and a robust principal component analysis (PCA) algorithm that has led to state-of-the-art results in matrix completion. The results indicate that robust PCA with a content similarity constraint is particularly effective; it improves the robustness of tagging against three types of synthetic errors and boosts the recall rate of music auto-tagging by 7% in a real-world setting.
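The graph-based label propagation idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a row-normalized similarity matrix and the common update F <- alpha * S * F + (1 - alpha) * Y, and the function names, the value of `alpha`, and the toy data are all hypothetical.

```python
def normalize_rows(sim):
    """Scale each row of a similarity matrix to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def propagate_tags(sim, tags, alpha=0.5, iterations=50):
    """Spread tag scores between similar tracks.

    sim  : n x n content-similarity matrix between tracks (assumed input)
    tags : n x m binary track-tag matrix Y (1 = tag applied by users)
    Repeats F <- alpha * S @ F + (1 - alpha) * Y, with S the row-normalized sim.
    """
    s = normalize_rows(sim)
    n, m = len(tags), len(tags[0])
    f = [[float(v) for v in row] for row in tags]
    for _ in range(iterations):
        f = [
            [
                alpha * sum(s[i][k] * f[k][j] for k in range(n))
                + (1 - alpha) * tags[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return f


# Toy data (hypothetical): track 1 has no tags but sounds like track 0.
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
track_tags = [
    [1, 0],  # track 0: tagged "rock"
    [0, 0],  # track 1: untagged
    [0, 1],  # track 2: tagged "jazz"
]
scores = propagate_tags(similarity, track_tags)
```

In this toy setting the untagged track ends up with a higher score for the tag of its closest neighbour, which is the retagging effect the abstract describes in general terms.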
Journal of New Music Research | 2011
Gonçalo Marques; Thibault Langlois; Fabien Gouyon; Miguel Lopes; Mohamed Sordo
In music genre classification, most approaches rely on statistical characteristics of low-level features computed on short audio frames. These methods implicitly assume that frames carry equally relevant information loads and that either individual frames, or distributions thereof, somehow capture the specificities of each genre. In this paper we study the representation space defined by short-term audio features with respect to class boundaries, and compare different processing techniques to partition this space. These partitions are evaluated in terms of accuracy on two genre classification tasks, with several types of classifiers. Experiments show that a randomized and unsupervised partition of the space, used in conjunction with a Markov Model classifier, leads to accuracies comparable to the state of the art. We also show that unsupervised partitions of the space tend to create fewer hubs.
Data and Knowledge Engineering | 2016
Sergio Oramas; Luis Espinosa-Anke; Mohamed Sordo; Horacio Saggion; Xavier Serra
The rate at which information about music is being created and shared on the web is growing exponentially. However, the challenge of making sense of all this data remains an open problem. In this paper, we present and evaluate an Information Extraction pipeline aimed at the construction of a Music Knowledge Base. Our approach starts off by collecting thousands of stories about songs from the songfacts.com website. Then, we combine a state-of-the-art Entity Linking tool and a linguistically motivated rule-based algorithm to extract semantic relations between entity pairs. Next, relations with similar semantics are grouped into clusters by exploiting syntactic dependencies. These relations are ranked using a novel confidence measure based on statistical and linguistic evidence. Evaluation is carried out intrinsically, by assessing each component of the pipeline, as well as in an extrinsic task, in which we evaluate the contribution of natural language explanations in music recommendation. We demonstrate that our method is able to discover novel facts with high precision, which are missing in current generic as well as music-specific knowledge repositories. A system that constructs a Music Knowledge Base entirely from scratch. A method for clustering and scoring relations in a Relation Extraction pipeline. Reveals music facts absent from knowledge repositories (e.g. Wikipedia). Explains music recommendations in natural language.
International World Wide Web Conferences | 2012
Marcos Aurélio Domingues; Fabien Gouyon; Alípio Mário Jorge; José Paulo Leal; João Vinagre; Luís Lemos; Mohamed Sordo
In this paper we propose a hybrid music recommender system that combines usage and content data. We describe an online evaluation experiment performed in real time on a commercial music web site specialised in content from the very long tail of music. We compare it against two stand-alone recommenders, the first based on usage data and the second on content data. The results show that the proposed hybrid recommender has advantages over the usage- and content-based systems, namely a higher user absolute acceptance rate, a higher user activity rate and higher user loyalty.
Applications of Natural Language to Data Bases | 2015
Mohamed Sordo; Sergio Oramas; Luis Espinosa-Anke
This paper presents a method for the generation of structured data sources for music recommendation using information extracted from unstructured text sources. The proposed method identifies entities in text that are relevant to the music domain, and then extracts semantically meaningful relations between them. The extracted entities and relations are represented as a graph, from which the recommendations are computed. A major advantage of this approach is that the recommendations can be conveyed to the user using natural language, thus providing an enhanced user experience. We test our method on texts from songfacts.com, a website that provides facts and stories about songs. The extracted relations are evaluated intrinsically by assessing their linguistic quality, as well as extrinsically by assessing the extent to which they map to an existing music knowledge base. Finally, an experiment with real users is performed to assess the suitability of the extracted knowledge for music recommendation. Our method is able to extract relations between pairs of musical entities with high precision, and the explanation of those relations to the user improves user satisfaction considerably.
Journal of New Music Research | 2013
Mohamed Sordo; Fabien Gouyon; Luís Sarmento; Òscar Celma; Xavier Serra
Music folksonomies include both general and detailed descriptions of music, and are usually continuously updated. These are significant advantages over music taxonomies, which tend to be incomplete and inconsistent. However, music folksonomies have inherently loose and open semantics, which hampers their use in many applications, such as structured music browsing and recommendation. In this paper, we present a system that can (1) automatically obtain a set of semantic facets underlying the folksonomy of the social music website Last.fm, and (2) categorize Last.fm tags with respect to the obtained facets. The semantic facets are anchored upon the structure of Wikipedia, a dynamic repository of universal knowledge.
Conference on Recommender Systems | 2017
Sergio Oramas; Oriol Nieto; Mohamed Sordo; Xavier Serra
An increasing amount of digital music is being published daily. Music streaming services often ingest all available music, but this poses a challenge: how to recommend new artists for which prior knowledge is scarce? In this work we aim to address this so-called cold-start problem by combining text and audio information with user feedback data using deep network architectures. Our method is divided into three steps. First, artist embeddings are learned from biographies by combining semantics, text features, and aggregated usage data. Second, track embeddings are learned from the audio signal and available feedback data. Finally, artist and track embeddings are combined in a multimodal network. Results suggest that both splitting the recommendation problem between feature levels (i.e., artist metadata and audio track), and merging feature embeddings in a multimodal approach improve the accuracy of the recommendations.
International World Wide Web Conferences | 2015
Sergio Oramas; Mohamed Sordo; Luis Espinosa-Anke
This paper presents a rule-based approach to extracting relations from unstructured music text sources. The proposed approach identifies and disambiguates musical entities in text, such as songs, bands, persons, albums and music genres. Candidate relations are then obtained by traversing the dependency parse tree of each sentence in the text that contains at least two identified entities. A set of syntactic rules based on part-of-speech tags is defined to filter out spurious and irrelevant relations. The extracted entities and relations are finally represented as a knowledge graph. We test our method on texts from songfacts.com, a website that provides tidbits with facts and stories about songs. The extracted relations are evaluated intrinsically by assessing their linguistic quality, as well as extrinsically by assessing the extent to which they map to an existing music knowledge base. Our system produces a high percentage of linguistically correct relations between entities, and is able to replicate a significant part of the knowledge base.
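The dependency-tree traversal step can be pictured with a short sketch that finds the words on the shortest path between two entity tokens. This is only an illustration under stated assumptions: the parse is hand-written rather than produced by a real parser, the sentence is a toy example that is not taken from the songfacts.com corpus, and the function names are hypothetical. The part-of-speech filtering rules are not specified in the abstract and are not reproduced here.

```python
def path_to_root(index, heads):
    """Return token indices from `index` up to the root (whose head is -1)."""
    path = [index]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def relation_between(tokens, heads, entity_a, entity_b):
    """Words on the shortest dependency path between two entity tokens."""
    up_a = path_to_root(entity_a, heads)
    up_b = path_to_root(entity_b, heads)
    ancestors_a = set(up_a)
    # Lowest common ancestor: first node on b's upward path that a also reaches.
    lca = next(i for i in up_b if i in ancestors_a)
    path = up_a[: up_a.index(lca) + 1] + up_b[: up_b.index(lca)][::-1]
    # Drop the two entities and put the connecting words in surface order.
    inner = sorted(i for i in path if i not in (entity_a, entity_b))
    return [tokens[i] for i in inner]


# Hand-written toy parse: heads[i] is the index of token i's head, -1 for the root.
tokens = ["Dylan", "wrote", "Hurricane", "with", "Levy"]
heads = [1, -1, 1, 1, 3]

print(relation_between(tokens, heads, 0, 2))  # candidate relation: Dylan / Hurricane
print(relation_between(tokens, heads, 0, 4))  # candidate relation: Dylan / Levy
```

A candidate relation produced this way would still need to pass whatever filtering rules a real system applies before it is added to a knowledge graph.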
Proceedings of the 1st International Workshop on Digital Libraries for Musicology | 2014
Mohamed Sordo; Amin Chaachoo; Xavier Serra
Research corpora are fundamental for the computational study of music. Defining the design criteria for creating them is a research task in itself. These corpora need to be well suited for the specific research problems to be addressed. Since these research problems are also shaped by musical, cultural and other specific aspects of the music traditions to be studied, the research corpora should take these specificities into account. In this paper we address the problems of creating corpora for computational research on Arab-Andalusian music, considering several relevant criteria for creating such corpora. We focus on the problems raised during the annotation process of the corpora, specifically the language issues surrounding this art music tradition. Following the criteria, we created a research corpus consisting of audio recordings with their corresponding metadata, lyrics and music scores. So far we have gathered 338 recordings from 3 different Arab-Andalusian music schools of Morocco, covering most of the musical modes, rhythms and forms of this art music tradition. The Arab-Andalusian corpus is accessible to the research community from a central online repository. Moreover, the audio recordings of this corpus are freely available through the Internet Archive repository. The Arab-Andalusian corpus can be used to generate test datasets, which can serve as ground truth for evaluating several computational research tasks.
4th International Workshop On Folk Music Analysis | 2018
Nadine Kroher; Emilia Gómez; Amin Chaachoo; Mohamed Sordo; José Miguel Díaz-Báñez; Francisco Gómez; Joaquín Mora
In this chapter we approach flamenco and Arab-Andalusian vocal music through the analysis of two representative pieces. We apply a hybrid methodology consisting of audio-signal processing to describe and contrast their melodic characteristics followed by musicological analysis. The use of such computational analysis tools complements a musicological-historical study with the aim of supporting the discovery and understanding of the specific characteristics of these musical traditions, their similarities and differences, while offering solutions to more general music information retrieval (MIR) research challenges.