
Publication


Featured research published by Yandre M. G. Costa.


Signal Processing | 2012

Music genre classification using LBP textural features

Yandre M. G. Costa; Luiz S. Oliveira; Alessandro L. Koerich; Fabien Gouyon; J. G. Martins

In this paper we present an approach to music genre classification which converts an audio signal into spectrograms and extracts texture features from these time-frequency images, which are then used to model music genres in a classification system. The texture features are based on the Local Binary Pattern (LBP), a structural texture operator that has been successful in recent image classification research. Experiments are performed on two well-known datasets: the Latin Music Database (LMD) and the ISMIR 2004 dataset. The proposed approach considers several different zoning mechanisms for local feature extraction, and results obtained with and without local feature extraction are compared. We compare the performance of texture features with that of commonly used audio-content-based features (i.e., from the MARSYAS framework) and show that the texture features consistently outperform the audio-content-based features. We also compare our results with results from the literature. On the LMD, our approach reaches about 82.33%, above the best result obtained in the MIREX 2010 competition on that dataset. On the ISMIR 2004 database, the best result obtained is about 80.65%, below the best result reported in the literature for that dataset.
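
As a minimal sketch of this pipeline, assuming librosa for the spectrogram, scikit-image for LBP, and scikit-learn for the classifier (file paths and labels are hypothetical placeholders, not the authors' code):

    # Minimal sketch: spectrogram -> zoned LBP histograms -> SVM.
    import numpy as np
    import librosa
    from skimage.feature import local_binary_pattern
    from sklearn.svm import SVC

    def lbp_features(audio_path, n_zones=3, P=8, R=1):
        y, sr = librosa.load(audio_path, sr=22050)
        # Spectrogram in dB, treated as a grayscale image
        S = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
        codes = local_binary_pattern(S, P, R, method="uniform")
        n_bins = P + 2  # number of uniform LBP patterns
        # Zoning: one histogram per horizontal (frequency) band
        zones = np.array_split(codes, n_zones, axis=0)
        hists = [np.histogram(z, bins=n_bins, range=(0, n_bins),
                              density=True)[0] for z in zones]
        return np.concatenate(hists)

    # train_paths / train_labels are placeholders:
    # X = np.array([lbp_features(p) for p in train_paths])
    # clf = SVC(kernel="linear").fit(X, train_labels)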


International Symposium on Neural Networks | 2012

Comparing textural features for music genre classification

Yandre M. G. Costa; Luiz S. Oliveira; Alessandro L. Koerich; Fabien Gouyon

In this paper we compare two textural feature sets for automatic music genre classification. The idea is to convert the audio signal into spectrograms and then extract features from this visual representation. Two textural descriptors are explored: the Gray Level Co-Occurrence Matrix (GLCM) and Local Binary Patterns (LBP). In addition, two strategies for extracting features are considered: a global approach, where features are extracted from the entire spectrogram image and classified by a single classifier; and a local approach, where the spectrogram image is split into several zones that are classified independently, with the final decision obtained by combining all the partial results. The database used in our experiments is the Latin Music Database, which contains music pieces categorized into 10 musical genres and has been used in MIREX (Music Information Retrieval Evaluation eXchange) competitions. A comprehensive series of experiments shows that an SVM classifier trained with LBP achieves a recognition rate of 80%. This rate not only outperforms the GLCM by a fair margin but is also slightly better than the results reported in the literature.
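
For the GLCM side and the local combination strategy, a rough sketch (assuming scikit-image 0.19+, where the functions are spelled graycomatrix/graycoprops; zone classifiers and feature matrices are placeholders) could look like this:

    # Illustrative GLCM descriptor plus score-sum combination of zones.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def glcm_features(spec_img):
        # Quantize the spectrogram image to 8-bit gray levels
        img = np.uint8(255 * (spec_img - spec_img.min()) /
                       (np.ptp(spec_img) + 1e-9))
        glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        props = ["contrast", "homogeneity", "energy", "correlation"]
        return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

    def local_predict(zone_clfs, zone_feats):
        # zone_clfs: one fitted sklearn SVC per zone (placeholder).
        # Sum the per-zone decision scores, pick the top class.
        scores = sum(clf.decision_function(f.reshape(1, -1))
                     for clf, f in zip(zone_clfs, zone_feats))
        return int(np.argmax(scores))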


Iberoamerican Congress on Pattern Recognition | 2013

Music Genre Recognition Using Gabor Filters and LPQ Texture Descriptors

Yandre M. G. Costa; Luiz S. Oliveira; Alessandro L. Koerich; Fabien Gouyon

This paper presents a novel approach to automatic music genre recognition in the visual domain that uses two texture descriptors. The audio signal is converted into spectrograms, and textural features are then extracted from this visual representation. Gabor filters and the LPQ texture descriptor were used to capture the spectrogram content. To evaluate the performance of local feature extraction, several zoning mechanisms were taken into account. The experiments were performed on the Latin Music Database. We show that an SVM classifier trained with LPQ achieves a recognition rate above 80%, which is among the best results reported in the literature.
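
LPQ has no standard implementation in the common Python imaging libraries, so the sketch below hand-rolls a basic variant (the original LPQ formulation also decorrelates the coefficients, a step omitted here); Gabor responses could come separately from skimage.filters.gabor. This is illustrative, not the authors' implementation:

    # Basic LPQ: quantize the signs of four local Fourier coefficients
    # into an 8-bit code per pixel, then histogram the codes.
    import numpy as np
    from scipy.signal import convolve2d

    def lpq_histogram(img, win=7):
        a = 1.0 / win
        x = np.arange(win) - (win - 1) / 2
        w0 = np.ones_like(x)              # DC filter
        w1 = np.exp(-2j * np.pi * a * x)  # filter at frequency a
        conv = lambda fr, fc: convolve2d(
            convolve2d(img, fr[np.newaxis, :], mode="valid"),
            fc[:, np.newaxis], mode="valid")
        # Local Fourier coefficients at four low frequencies
        F = [conv(w1, w0), conv(w0, w1), conv(w1, w1), conv(w1, np.conj(w1))]
        # Quantize signs of real/imaginary parts into 8-bit codes
        bits = (np.stack([f.real for f in F] + [f.imag for f in F]) > 0)
        codes = np.tensordot(2 ** np.arange(8), bits.astype(int), axes=1)
        return np.histogram(codes, bins=256, range=(0, 256), density=True)[0]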


Applied Soft Computing | 2017

An evaluation of Convolutional Neural Networks for music classification using spectrograms

Yandre M. G. Costa; Luiz S. Oliveira; Carlos Nascimento Silla

Highlights: music classification using spectrograms and Convolutional Neural Networks; comparison with the state of the art on the Latin Music Database, ISMIR 2004, and an African music collection; assessment of the complementarity between Convolutional Neural Networks and classifiers built with handcrafted features.

Music genre recognition based on visual representations has been successfully explored over the last years. Classifiers trained with textural descriptors (e.g., Local Binary Patterns, Local Phase Quantization, and Gabor filters) extracted from spectrograms have achieved state-of-the-art results on several music datasets. In this work, though, we argue that we can go further with the time-frequency analysis through the use of representation learning. To show this, we compare the results obtained with a Convolutional Neural Network (CNN) with those obtained using handcrafted features and SVM classifiers. In addition, we performed experiments fusing the results obtained with learned features and handcrafted features to assess the complementarity between these representations for the music classification task. Experiments were conducted on three music databases with distinct characteristics: a western music collection widely used in research benchmarks (the ISMIR 2004 database), a collection of Latin American music (the LMD database), and a collection of field recordings of ethnic African music. Our experiments show that the CNN compares favorably to other classifiers in several scenarios, making it a very interesting alternative for music genre recognition. On the African database, the CNN surpassed both the handcrafted representations and the previous state of the art. On the LMD database, the combination of CNN and Robust Local Binary Pattern achieved a recognition rate of 92%, which, to the best of our knowledge, is the best result (using an artist filter) on this dataset so far. On the ISMIR 2004 dataset, although the CNN did not improve the state of the art, it performed better than classifiers based individually on the other kinds of features.
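
As a rough illustration of the representation-learning side, a compact CNN over spectrogram patches might look as follows in PyTorch; the architecture and sizes are placeholders, not the network used in the paper:

    # Illustrative CNN for spectrogram patches (batch, 1, freq, time).
    import torch
    import torch.nn as nn

    class SpecCNN(nn.Module):
        def __init__(self, n_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.AdaptiveAvgPool2d((4, 4)),
            )
            self.classifier = nn.Linear(64 * 4 * 4, n_classes)

        def forward(self, x):
            h = self.features(x)
            return self.classifier(h.flatten(1))  # class logits

    # model = SpecCNN(n_classes=10)
    # logits = model(torch.randn(8, 1, 128, 128))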


International Conference on Systems, Signals and Image Processing | 2013

Music genre recognition based on visual features with dynamic ensemble of classifiers selection

Yandre M. G. Costa; Luiz S. Oliveira; Alessandro L. Koerich; Fabien Gouyon

This paper introduces a dynamic ensemble of classifiers selection scheme, with a pool of classifiers created to perform automatic music genre classification. The classifiers are support vector machines trained with textural features extracted from spectrogram images using Local Binary Patterns. Results obtained on the Latin Music Database show that local feature extraction combined with the K-Nearest Oracle (KNORA) for dynamic ensemble of classifiers selection reaches a recognition rate of 83%, slightly better than the best result previously reported on this dataset under the restrictions imposed by the “artist filter”. In addition, the results are compared with those obtained by traditional approaches using acoustic features.
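
A hypothetical version of this scheme can be assembled with the DESlib library, assuming it provides the KNORA-Eliminate implementation used below; the pool is a bag of SVMs over synthetic stand-in features rather than the paper's LBP vectors:

    # Hypothetical KNORA usage via DESlib (not the authors' code).
    from deslib.des.knora_e import KNORAE
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Stand-in for precomputed LBP features and genre labels
    X, y = make_classification(n_samples=600, n_classes=10, n_informative=8)
    X_train, X_dsel, y_train, y_dsel = train_test_split(X, y, test_size=0.5)

    pool = BaggingClassifier(SVC(probability=True), n_estimators=10)
    pool.fit(X_train, y_train)                  # pool of SVMs
    knora = KNORAE(pool_classifiers=pool, k=7)  # k-nearest oracle
    knora.fit(X_dsel, y_dsel)                   # dynamic selection set
    # y_pred = knora.predict(X_test)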


Pattern Recognition Letters | 2017

Combining visual and acoustic features for audio classification tasks

Loris Nanni; Yandre M. G. Costa; Diego Rafael Lucio; Carlos Nascimento Silla; Sheryl Brahnam

Highlights: coupling texture descriptors with acoustic features; comparison of different methods for representing an audio signal as an image; a heterogeneous ensemble of different classifiers improves performance.

In this paper a novel and effective approach for automated audio classification is presented, based on the fusion of different sets of features, both visual and acoustic. A number of different acoustic and visual features of sounds are evaluated and compared. These features are then fused in an ensemble that produces better classification accuracy than other state-of-the-art approaches. The visual features of sounds are built starting from the audio file and are taken from images constructed from different spectrograms, a gammatonegram, and a rhythm image. These images are divided into subwindows from which a set of texture descriptors is extracted. A different Support Vector Machine (SVM) is trained for each feature descriptor, and the SVM outputs are summed for the final decision. The proposed ensemble is evaluated on three well-known music genre classification databases (the Latin Music Database, the ISMIR 2004 database, and the GTZAN genre collection), a dataset of bird vocalizations for species recognition, and a dataset of right whale calls for whale detection. The MATLAB code for the ensemble of classifiers and for the extraction of the features will be publicly available (https://www.dei.unipd.it/node/2357, "Pattern Recognition and Ensemble Classifiers").
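
The sum-rule fusion at the heart of this ensemble reduces to a few lines. The sketch below assumes one precomputed feature matrix per descriptor (visual or acoustic) and uses scikit-learn SVMs with probability outputs:

    # Minimal sum-rule fusion: one SVM per feature set, probabilities
    # summed for the final decision. Feature matrices are placeholders.
    import numpy as np
    from sklearn.svm import SVC

    def fuse_predict(train_sets, y_train, test_sets):
        # train_sets/test_sets: one (n_samples, n_dims) matrix per descriptor
        summed = 0.0
        for X_tr, X_te in zip(train_sets, test_sets):
            clf = SVC(probability=True).fit(X_tr, y_train)
            summed = summed + clf.predict_proba(X_te)  # sum rule
        return np.argmax(summed, axis=1)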


Iberoamerican Congress on Pattern Recognition | 2015

Language Identification Using Spectrogram Texture

Ana Montalvo; Yandre M. G. Costa; José R. Calvo

This paper proposes a novel front-end for automatic spoken language recognition, based on the spectrogram representation of the speech signal and on the properties of the Fourier spectrum for detecting global periodicity in an image. The Local Phase Quantization (LPQ) texture descriptor was used to capture the spectrogram content. Results obtained with 30-second test signals show that this method is very promising for low-cost language identification. The best performance is achieved when the proposed method is fused with the i-vector representation.
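
The best results above come from fusing the LPQ-based system with an i-vector system. The exact fusion rule is not given here, so the snippet below is only a generic score-level fusion under an assumed z-normalization; both score matrices (utterances by languages) are placeholders, and i-vector extraction itself is outside its scope:

    # Hedged sketch of a score-level fusion step (not the paper's rule).
    import numpy as np

    def fuse_scores(lpq_scores, ivec_scores, alpha=0.5):
        # Z-normalize each system's per-language scores, then mix them
        z = lambda s: (s - s.mean(axis=1, keepdims=True)) / \
                      (s.std(axis=1, keepdims=True) + 1e-9)
        fused = alpha * z(lpq_scores) + (1 - alpha) * z(ivec_scores)
        return np.argmax(fused, axis=1)  # predicted language index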


IET Computer Vision | 2017

Bird and whale species identification using sound images

Loris Nanni; Rafael de Lima Aguiar; Yandre M. G. Costa; Sheryl Brahnam; Carlos Nascimento Silla; Ricky L. Brattin; Zhao Zhao

Image-based identification of animals is mostly centred on their appearance, but there are other ways images can be used to identify animals, including by representing the sounds they make as images. In this study, the authors present a novel and effective approach for the automated identification of birds and whales using some of the best texture descriptors in the computer vision literature. The visual features of sounds are built starting from the audio file and are taken from images constructed from different spectrograms and from harmonic and percussive images. These images are divided into sub-windows from which sets of texture descriptors are extracted. The experiments reported in this study, using a dataset of bird vocalisations targeted at species recognition and a dataset of right whale calls targeted at whale detection (as well as three well-known benchmarks for music genre classification), demonstrate that the fusion of different texture features enhances performance. The experiments also demonstrate that the fusion of texture features with audio features is not only comparable with existing audio-signal approaches but also statistically improves on some of the stand-alone audio features. The code for the experiments will be publicly available at https://www.dropbox.com/s/bguw035yrqz0pwp/ElencoCode.docx?dl=0.
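
The harmonic and percussive "sound images" can be built, for example, with librosa's HPSS decomposition; this sketches that step only, with a hypothetical file name, and texture extraction would then follow as in the other works above:

    # Harmonic/percussive source separation on the STFT.
    import numpy as np
    import librosa

    y, sr = librosa.load("call.wav", sr=None)   # hypothetical file name
    D = librosa.stft(y)
    H, P = librosa.decompose.hpss(D)            # harmonic / percussive parts
    harmonic_img = librosa.amplitude_to_db(np.abs(H), ref=np.max)
    percussive_img = librosa.amplitude_to_db(np.abs(P), ref=np.max)
    # Each image is then split into sub-windows and texture descriptors
    # (e.g. the LBP/LPQ histograms sketched earlier) are extracted.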


Journal of New Music Research | 2018

Ensemble of deep learning, visual and acoustic features for music genre classification

Loris Nanni; Yandre M. G. Costa; Rafael L. Aguiar; Carlos N. Silla; Sheryl Brahnam

In this work, we present an ensemble for automated music genre classification that fuses acoustic and visual (both handcrafted and non-handcrafted) features extracted from audio files. These features are evaluated, compared, and fused in a final ensemble shown to produce better classification accuracy than other state-of-the-art approaches on the Latin Music Database, ISMIR 2004, and the GTZAN genre collection. To the best of our knowledge, this paper reports the largest test comparing combinations of different descriptors (including a wavelet convolutional scattering network, tested here for the first time as an input for texture descriptors) and different matrix representations. Superior performance is obtained without ad hoc parameter optimisation; that is to say, the same ensemble of classifiers and parameter settings is used on all tested datasets. To demonstrate generalisability, our approach is also assessed on the tasks of bird species recognition from vocalisations and on whale detection datasets. All MATLAB source code is available.
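
The wavelet scattering input mentioned above can be produced, for instance, with the Kymatio library; the snippet below is a hypothetical sketch with illustrative parameters, not the paper's configuration:

    # 1-D wavelet scattering transform as a matrix representation that
    # texture descriptors can then be applied to.
    import numpy as np
    from kymatio.numpy import Scattering1D

    T = 2 ** 16                         # number of audio samples
    x = np.random.randn(T)              # stand-in for a loaded audio clip
    scattering = Scattering1D(J=6, shape=T, Q=8)
    Sx = scattering(x)                  # (n_coeffs, time) matrix
    # Sx can now be treated as an image and fed to LBP/LPQ-style descriptors.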


Ecological Informatics | 2018

Bird species identification using spectrogram and dissimilarity approach

Rafael H. D. Zottesso; Yandre M. G. Costa; Diego Bertolini; Luiz S. Oliveira

In this work, we investigate bird species identification starting from audio recordings, on eight quite challenging subsets taken from the LifeCLEF 2015 bird task contest database, in which the number of classes ranges from 23 to 915. Classification is addressed using textural features taken from spectrogram images together with the dissimilarity framework. The rationale is that, by using dissimilarity, the classification system is less sensitive to increases in the number of classes; a comprehensive set of experiments confirms this hypothesis. Although our results cannot be directly compared to previously published ones, since works in this application domain are generally not developed on exactly the same datasets, they surpass the state of the art when the number of classes involved in similar works is considered. In the hardest scenario, we obtained an identification rate of 71% over 915 species. We hope the subsets proposed in this work will also make future benchmarking possible.
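
A minimal sketch of the dissimilarity idea, under the usual dichotomy-transformation reading of the framework: the multi-class problem is recast as binary "same vs. different" classification over absolute feature differences. Helper names and the aggregation rule are illustrative, not taken from the paper:

    # Dissimilarity (dichotomy) sketch over texture feature vectors.
    import numpy as np
    from itertools import combinations
    from sklearn.svm import SVC

    def make_pairs(X, y):
        Z, t = [], []
        for i, j in combinations(range(len(X)), 2):
            Z.append(np.abs(X[i] - X[j]))   # dissimilarity vector
            t.append(int(y[i] == y[j]))     # within- vs between-class
        return np.array(Z), np.array(t)

    def predict_species(clf, x_query, X_ref, y_ref):
        # Score the query against every reference; assign the class
        # whose references look most "same" on average.
        p_same = clf.predict_proba(np.abs(X_ref - x_query))[:, 1]
        classes = np.unique(y_ref)
        return classes[np.argmax([p_same[y_ref == c].mean()
                                  for c in classes])]

    # clf = SVC(probability=True).fit(*make_pairs(X_train, y_train))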

Collaboration


Yandre M. G. Costa's top co-authors and their affiliations.

Top Co-Authors

Luiz S. Oliveira
Federal University of Paraná

Alessandro L. Koerich
Pontifícia Universidade Católica do Paraná

Diego Bertolini
Federal University of Technology - Paraná

Sheryl Brahnam
Missouri State University

Adam H. Sugi
Federal University of Paraná

Carlos N. Silla
Pontifícia Universidade Católica do Paraná