José Francisco Ruiz-Muñoz

Publication

Featured research published by José Francisco Ruiz-Muñoz.

Iberoamerican Congress on Pattern Recognition | 2013

Managing Imbalanced Data Sets in Multi-label Problems: A Case Study with the SMOTE Algorithm

Andrés Felipe Giraldo-Forero; Jorge Alberto Jaramillo-Garzón; José Francisco Ruiz-Muñoz; César Germán Castellanos-Domínguez

Multi-label learning has become an increasingly active area in the machine learning community, since a wide variety of real-world problems are naturally multi-labeled. However, it is not uncommon to find disparities among the number of samples of each class, which constitutes an additional challenge for the learning algorithm. SMOTE is an oversampling technique that has been successfully applied for balancing single-labeled data sets, but has not been used in multi-label frameworks so far. In this work, several strategies are proposed and compared in order to generate synthetic samples for balancing data sets in the training of multi-label algorithms. Results show that a correct selection of seed samples for oversampling improves the classification performance of multi-label algorithms. The uniform generation oversampling provides an efficient methodology for a wide scope of real-world problems.
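As an illustration of the oversampling idea mentioned in this abstract, the following is a minimal sketch of the basic single-label SMOTE interpolation step. It is not the multi-label seed-selection strategy studied in the paper; the function name `smote`, the parameters `k` and `seed`, and the toy data are assumptions made for the example.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen seed sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: math.dist(x, p))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

# Hypothetical minority-class samples (two features each).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=5)
```

Each synthetic point lies on the segment between two existing minority samples, so it stays inside the region the minority class already occupies.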

Signal Processing Systems | 2018

Dictionary Learning for Bioacoustics Monitoring with Applications to Species Classification

José Francisco Ruiz-Muñoz; Zeyu You; Raviv Raich; Xiaoli Z. Fern

This paper deals with the application of the convolutive version of dictionary learning to analyze in-situ audio recordings for bioacoustics monitoring. We propose an efficient approach for learning and using a sparse convolutive model to represent a collection of spectrograms. In this approach, we identify repeated bioacoustics patterns, e.g., bird syllables, as words and represent new spectrograms using these words. Moreover, we propose a supervised dictionary learning approach in the multiple-label setting to support multi-label classification of unlabeled spectrograms. Our approach relies on a random projection for reduced computational complexity. As a consequence, the non-negativity requirement on the dictionary words is relaxed. Furthermore, the proposed approach is well-suited for a collection of discontinuous spectrograms. We evaluate our approach on synthetic examples and on two real datasets consisting of audio recordings of multiple birds. Bird syllable dictionary learning from a real-world dataset is demonstrated. Additionally, we successfully apply the approach to spectrogram denoising and species classification.

Ecological Informatics | 2016

Enhancing the dissimilarity-based classification of birdsong recordings

José Francisco Ruiz-Muñoz; Germán Castellanos-Domínguez; Mauricio Orozco-Alzate

Classification of birdsong recordings can be naturally formulated as a multiple instance problem, where bags of instances are represented by either features or dissimilarities. In bioacoustics, bags typically correspond to regions of interest in spectrograms, which are detected after a segmentation stage of the audio recordings. In this paper, we use different dissimilarity measures between bags and explore whether the subsequent application of metric learning/adaptation methods and the construction of dissimilarity spaces can increase the classification performance of birdsong recordings. A publicly available bioacoustic data set is used for the experiments. Our results suggest, first, that appropriate dissimilarity measures are those which capture most of the overall differences between bags, such as the modified Hausdorff distance and the mean minimum distance; second, they confirm the benefit of adapting the applied dissimilarity measure, as well as the potential further enhancement of the classification performance by building dissimilarity spaces and increasing training set sizes.
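The two bag dissimilarities named in this abstract can be sketched as follows, using commonly cited definitions: the modified Hausdorff distance takes the larger of the two directed mean-of-minimum distances. Whether the paper symmetrizes the mean minimum distance is not stated here, so `mean_min_symmetric` (averaging both directions) is an assumption, as are all function names and the toy bags.

```python
import math

def directed_mean_min(bag_a, bag_b):
    """Average, over instances in bag_a, of the distance to the closest
    instance in bag_b (not symmetric in its arguments)."""
    return sum(min(math.dist(a, b) for b in bag_b) for a in bag_a) / len(bag_a)

def modified_hausdorff(bag_a, bag_b):
    """Larger of the two directed mean-min distances."""
    return max(directed_mean_min(bag_a, bag_b), directed_mean_min(bag_b, bag_a))

def mean_min_symmetric(bag_a, bag_b):
    """Assumed symmetrization: average of the two directed distances."""
    return 0.5 * (directed_mean_min(bag_a, bag_b) + directed_mean_min(bag_b, bag_a))

# Two hypothetical bags of 2-D instances.
bag_a = [[0.0, 0.0], [1.0, 0.0]]
bag_b = [[0.0, 0.0], [3.0, 0.0]]
mhd = modified_hausdorff(bag_a, bag_b)   # directed values are 0.5 and 1.0
mmd = mean_min_symmetric(bag_a, bag_b)
```

Both measures aggregate over every instance in a bag, which is what lets them reflect overall differences between bags rather than a single closest pair.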

International Conference of the IEEE Engineering in Medicine and Biology Society | 2012

An adaptation of Pfam profiles to predict protein sub-cellular localization in Gram positive bacteria

G. A. Arango-Argoty; José Francisco Ruiz-Muñoz; Jorge Alberto Jaramillo-Garzón; César Germán Castellanos-Domínguez

Predicting the sub-cellular localization of a protein can provide useful information to uncover its molecular functions. To this end, numerous prediction techniques have been developed, which have usually focused on global information of the protein or on sequence alignments. However, several studies have shown that the functional nature of proteins is ruled by conserved sub-sequence patterns known as domains. In this paper, an alternative methodology (PfamFeat) for Gram-positive bacterial sub-cellular localization was developed. PfamFeat is based on information provided by the Pfam database, which stores a series of HMM profiles describing common protein domains. The likelihood that a sequence was generated by a given HMM profile can be used to characterize sequences so that pattern recognition techniques can be applied. Success rates obtained with a simple one-nearest-neighbor classifier demonstrate that this method is competitive with popular sub-cellular prediction algorithms and that it constitutes a promising research trend.
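The classification step described in this abstract can be illustrated with a minimal one-nearest-neighbor sketch over likelihood-based feature vectors. The feature values below are invented placeholders standing in for per-profile HMM log-likelihood scores; the labels, the function name `one_nn_predict`, and the use of Euclidean distance are assumptions, not details taken from PfamFeat.

```python
import math

def one_nn_predict(train_features, train_labels, query):
    """Return the label of the training vector closest to the query."""
    nearest = min(range(len(train_features)),
                  key=lambda i: math.dist(train_features[i], query))
    return train_labels[nearest]

# Hypothetical features: one score per HMM profile for each protein.
train_features = [
    [-12.0, -40.0, -35.0],
    [-38.0, -10.0, -33.0],
    [-36.0, -41.0, -9.0],
]
train_labels = ["cytoplasm", "membrane", "extracellular"]

prediction = one_nn_predict(train_features, train_labels, [-37.0, -12.5, -30.0])
```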

International Workshop on Machine Learning for Signal Processing | 2015

Dictionary extraction from a collection of spectrograms for bioacoustics monitoring

José Francisco Ruiz-Muñoz; Zeyu You; Raviv Raich; Xiaoli Z. Fern

Dictionary learning of spectrograms consists of detecting their fundamental spectro-temporal patterns and their associated activation signals. In this paper, we propose an efficient convolutive dictionary learning approach for analyzing repetitive bioacoustics patterns from a collection of audio recordings. Our method is inspired by the convolutive non-negative matrix factorization (CNMF) model. The proposed approach relies on random projection for reduced computational complexity. As a consequence, the non-negativity requirement on the dictionary words is relaxed. Moreover, the proposed approach is well-suited for a collection of discontinuous spectrograms. We evaluate our approach on synthetic examples and on two real datasets consisting of audio recordings of multiple birds. Bird syllable dictionary learning from a real-world dataset is demonstrated. Additionally, we apply the approach to spectrogram denoising in the presence of rain noise artifacts.

Iberoamerican Congress on Pattern Recognition | 2013

Threshold Estimation in Energy-Based Methods for Segmenting Birdsong Recordings

José Francisco Ruiz-Muñoz; Mauricio Orozco-Alzate; César Germán Castellanos-Domínguez

Monitoring wildlife populations is important to assess ecosystem health, support environmental protection activities, and undertake ecological research. However, traditional techniques are temporally and spatially limited; in order to extract information about the current state of the environment quickly and accurately, processing and recognition of acoustic signals are used. The literature contains several studies on the automatic classification of species through their vocalizations; however, many of them mention the segmentation carried out in the preprocessing stage only briefly, which makes it difficult for other researchers to reproduce. This paper focuses specifically on the detection of regions of interest in audio recordings. A methodology is described for threshold estimation in segmentation techniques based on the energy of a frequency band of a birdsong recording. Experiments were carried out using chunks taken from the RMBL-Robin database; results showed that good segmentation performance can be obtained by computing the threshold as a linear function whose independent variable is the estimated noise.
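The thresholding rule described in this abstract can be sketched as follows: frames whose band energy exceeds a threshold that is linear in the estimated noise are grouped into regions of interest. The abstract does not give the noise estimator or the coefficients, so the median-based noise estimate, the `slope` and `intercept` defaults, and the toy energy values are all assumptions.

```python
import statistics

def segment_by_energy(frame_energy, slope=2.0, intercept=0.0):
    """Return (start, end) frame-index pairs, end exclusive, for runs of
    frames whose energy exceeds threshold = slope * noise + intercept."""
    noise = statistics.median(frame_energy)  # assumed noise estimator
    threshold = slope * noise + intercept
    regions, start = [], None
    for i, energy in enumerate(frame_energy):
        if energy > threshold and start is None:
            start = i                      # a region opens here
        elif energy <= threshold and start is not None:
            regions.append((start, i))     # the region closes before frame i
            start = None
    if start is not None:                  # region still open at the end
        regions.append((start, len(frame_energy)))
    return regions

# Hypothetical per-frame energy of one frequency band.
energy = [0.1, 0.1, 0.9, 1.2, 0.1, 0.1, 0.8, 0.1]
regions = segment_by_energy(energy)
```

Because the threshold scales with the noise estimate, the same coefficients can be reused on recordings with different background levels.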

Similarity Search and Applications | 2018

Relative Minimum Distance Between Projected Bags for Improved Multiple Instance Classification

José Francisco Ruiz-Muñoz; Germán Castellanos-Domínguez; Mauricio Orozco-Alzate

A novel relative minimum distance is introduced that improves dissimilarity-based multiple instance classification. To this end, we apply a previously proposed mapping that brings closer at least a single instance from each positive training bag, while the instances of negative bags are driven apart. Our results show an increased classification performance on a broad range of real-world datasets.

International Conference on Acoustics, Speech, and Signal Processing | 2017

Online learning of time-frequency patterns

José Francisco Ruiz-Muñoz; Raviv Raich; Mauricio Orozco-Alzate; Xiaoli Z. Fern

We present an online method to learn recurring time-frequency patterns from spectrograms. Our method relies on a convolutive decomposition that factors sequences of spectra into time-frequency patterns and their corresponding activation signals. This method processes one spectrogram at a time, so that, in comparison with a batch method, the computational cost is reduced proportionally to the number of considered spectrograms. We use first-order stochastic gradient descent and show that a monotonically decreasing learning rate works appropriately. Furthermore, we suggest a framework to classify spectrograms based on the estimated set of time-frequency patterns. Results on a set of synthetically generated spectrograms and a real-world dataset show that our method finds meaningful time-frequency patterns and that it is suitable for handling a large amount of data.

Latin American Robotics Symposium and IEEE Colombian Conference on Automatic Control | 2011

Dissimilarity-based classification for bioacoustic monitoring of bird species

José Francisco Ruiz-Muñoz; Mauricio Orozco-Alzate

The wealth of biodiversity is difficult to estimate because field inspections are exhausting and expensive. However, automatic monitoring systems can be a feasible option to partially overcome such a drawback. In this study, we present a process of bioacoustic recognition based on digital signal processing and pattern recognition techniques. We build classifiers on representations extracted from waveforms and spectra, as well as on dissimilarities computed between pairs of them, to identify 11 species in a data set of bird sounds recorded in the Colombian mountains. Results show that time-varying representations are a particularly good option for characterizing signals in this problem.

Iberoamerican Congress on Pattern Recognition | 2011

Feature and dissimilarity representations for the sound-based recognition of bird species

José Francisco Ruiz-Muñoz; Mauricio Orozco-Alzate; César Germán Castellanos-Domínguez

Pattern recognition and digital signal processing techniques allow the design of automated systems for avian monitoring. They are a non-intrusive and cost-effective way to perform surveys of bird populations and assessments of biological diversity. In this study, a number of representation approaches for bird sounds are compared; namely, feature and dissimilarity representations. In order to take into account the non-stationary nature of the audio signals and to build robust dissimilarity representations, the application of the Earth Mover's Distance (EMD) to time-varying measurements is proposed. Measures of the leave-one-out 1-NN performance are used as comparison criteria. Results show that, overall, the Mel-cepstrum coefficients are the best alternative, especially when computed by frames and used in combination with EMD to generate dissimilarity representations.
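To give a feel for the Earth Mover's Distance mentioned in this abstract, the sketch below computes it for the simplest case: two histograms over the same equally spaced one-dimensional bins, where EMD reduces to the sum of absolute differences between the cumulative distributions. How the paper applies EMD to frame-wise measurements is not described here, so this 1-D special case, the unit bin spacing, and the name `emd_1d` are assumptions.

```python
def emd_1d(hist_p, hist_q):
    """EMD between two histograms on the same bins (unit spacing),
    after normalising each histogram to total mass 1."""
    total_p, total_q = sum(hist_p), sum(hist_q)
    carried, cost = 0.0, 0.0
    for p, q in zip(hist_p, hist_q):
        # Mass that still has to be moved past the current bin boundary.
        carried += p / total_p - q / total_q
        cost += abs(carried)
    return cost

# All mass moves two bins to the right, so the cost is 2.
d_far = emd_1d([1, 0, 0], [0, 0, 1])
d_same = emd_1d([1, 2, 3], [1, 2, 3])
```

Unlike a bin-by-bin comparison, this distance grows with how far the mass has to travel, which makes it tolerant of small shifts between otherwise similar distributions.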