2019 6th International Conference on Control, Decision and Information Technologies (CoDIT) | 2019

Audio Tempo Estimation Method Improved by Rhythm Pattern and Data Augmentation

 
 

Abstract


Tempo is an intuitive attribute of music audio: listeners readily sense whether a piece is fast or slow and detect salient pulses that form a perceived tempo. For some audio, however, the tempo can be ambiguous because of complex metrical levels and differences in compositional habit and style. Although most pieces have a single predominant tempo on which listeners agree, others have two dominant tempi. The challenge and goal of tempo estimation is to identify the salient tempi, usually one or two, associated with the metrical levels by analyzing the audio signal directly. In this study, we propose rhythm patterns of the long-term periodicity curve derived from the tempogram to improve saliency detection. In addition, a data augmentation method is devised to compensate for the limited size and representativeness of the three training datasets. Performance is evaluated on three public datasets; on the “GiantSteps” dataset, the accuracy even exceeds that of a state-of-the-art tempo estimator based on a convolutional neural network.
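To make the idea of a periodicity curve with one or two salient tempi concrete, the following is a minimal sketch and not the authors' method. It assumes an onset-strength envelope is already available (here a synthetic click train), uses plain autocorrelation as a stand-in for the tempogram-derived long-term periodicity curve, and picks the strongest local peaks as tempo candidates. The function names `periodicity_curve` and `salient_tempi`, the BPM range, and the frame rate are all illustrative assumptions.

```python
import numpy as np


def periodicity_curve(onset_env, frame_rate, bpm_min=40.0, bpm_max=240.0):
    """Autocorrelation of an onset envelope, indexed by candidate tempo (BPM).

    Assumption: plain autocorrelation stands in for the tempogram-derived
    long-term periodicity curve mentioned in the abstract.
    """
    env = np.asarray(onset_env, dtype=float)
    env = env - env.mean()
    n = len(env)
    # Keep only the non-negative lags and normalize by the zero-lag energy.
    acf = np.correlate(env, env, mode="full")[n - 1:]
    acf = acf / acf[0]
    # Convert the BPM search range into a lag range (in frames).
    lag_min = int(np.ceil(60.0 * frame_rate / bpm_max))
    lag_max = int(np.floor(60.0 * frame_rate / bpm_min))
    lags = np.arange(lag_min, lag_max + 1)
    bpms = 60.0 * frame_rate / lags
    return bpms, acf[lags]


def salient_tempi(bpms, strength, k=2):
    """Return the k strongest local peaks of the curve, strongest first."""
    inner = np.arange(1, len(strength) - 1)
    is_peak = (strength[inner] > strength[inner - 1]) & (
        strength[inner] >= strength[inner + 1]
    )
    peaks = inner[is_peak]
    best = peaks[np.argsort(strength[peaks])[::-1][:k]]
    return [float(bpms[i]) for i in best]


# Hypothetical input: 30 s click train at 120 BPM, 100 envelope frames/s.
frame_rate = 100
onset_env = np.zeros(30 * frame_rate)
onset_env[::50] = 1.0  # one pulse every 0.5 s

bpms, strength = periodicity_curve(onset_env, frame_rate)
print(salient_tempi(bpms, strength))
```

On such a regular pulse train the two strongest candidates are metrically related (the pulse rate and half of it), which mirrors the one-or-two dominant tempi ambiguity described above; real audio would first require an onset-strength front end, which is outside this sketch.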

Pages 779-784
DOI 10.1109/CoDIT.2019.8820457
Language English
Journal 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT)
