Proceedings of the 29th ACM International Conference on Multimedia | 2021

MusicBERT: A Self-supervised Learning of Music Representation

Abstract


Music recommendation has been one of the most widely used information retrieval services on the Internet. Finding music that suits a user's needs from among tens of millions of tracks relies on understanding music content. Traditional studies usually build music representations from massive user behavioral data and music metadata, ignoring the audio characteristics of the music itself. However, the melodic characteristics of music can be further exploited to understand it, and how to utilize large-scale audio data to learn music representations is not well explored. To this end, we propose a self-supervised learning model for music representation. We first utilize a beat-level music pre-training model to learn the structure of music. Then, we use a multi-task learning framework to concurrently model music self-representation and co-relations between pieces of music. Besides, we propose several downstream tasks to evaluate music representation, including music genre classification, music highlight detection, and music similarity retrieval. Extensive experiments on multiple music datasets demonstrate our model's superiority over baselines in learning music representation.

DOI 10.1145/3474085.3475576
