Antonio Camarena-Ibarrola
Universidad Michoacana de San Nicolás de Hidalgo
Publication
Featured research published by Antonio Camarena-Ibarrola.
Iberoamerican Congress on Pattern Recognition | 2009
Eric Sadit Tellez; Edgar Chávez; Antonio Camarena-Ibarrola
Many pattern recognition tasks can be modeled as proximity searching. Here the common task is to quickly find all the elements close to a given query without sequentially scanning a very large database. A recent shift in the searching paradigm has been established by using permutations instead of distances to predict proximity. Every object in the database records how it sees the set of reference objects (the permutants), i.e. only their relative positions are used. When a query arrives, the relative displacement of the permutants between the query and a particular object is measured. This approach turned out to be the most efficient and scalable, at the expense of losing recall in the answers. The permutation of every object is represented with *** short integers in practice, producing bulky indexes of 16 ***n bits. In this paper we show how to represent the permutation as a binary vector, using just one bit for each permutant (instead of log*** in the plain representation). The Hamming distance between binary signatures is then used to predict proximity between objects in the database. We tested this approach with many real-life metric databases, obtaining faster queries with a recall close to that of the Spearman ρ while using 16 times less space.
Iberoamerican Congress on Pattern Recognition | 2009
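The one-bit-per-permutant signature compared with the Hamming distance can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoding rule (a bit is set when a permutant sits more than `threshold` positions away from its identity position), the function names, and the toy one-dimensional data are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def permutation(obj: T, permutants: Sequence[T],
                dist: Callable[[T, T], float]) -> List[int]:
    """Return the permutant indices ordered by increasing distance to obj."""
    # Ties are broken by index so the ordering is deterministic.
    return sorted(range(len(permutants)),
                  key=lambda i: (dist(obj, permutants[i]), i))


def binary_signature(perm: Sequence[int], threshold: int) -> int:
    """Pack one bit per permutant into an integer.

    Assumed rule: bit p is 1 when permutant p appears more than
    `threshold` positions away from position p in the permutation.
    """
    position = [0] * len(perm)
    for pos, p in enumerate(perm):
        position[p] = pos
    bits = 0
    for p, pos in enumerate(position):
        if abs(pos - p) > threshold:
            bits |= 1 << p
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Hypothetical toy data: points on a line, absolute difference as metric.
    permutants = [0.0, 2.0, 4.0, 6.0]
    d = lambda x, y: abs(x - y)
    sig_near = binary_signature(permutation(0.5, permutants, d), 1)
    sig_far = binary_signature(permutation(5.5, permutants, d), 1)
    sig_query = binary_signature(permutation(0.4, permutants, d), 1)
    print(hamming(sig_query, sig_near), hamming(sig_query, sig_far))
```

In an index built this way, objects whose signatures have the smallest Hamming distance to the query signature would be the first candidates to verify with the real distance.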
Antonio Camarena-Ibarrola; Edgar Chávez; Eric Sadit Tellez
Monitoring media broadcast content has received a lot of attention lately from both academia and industry due to the technical challenge involved and its economic importance (e.g. in advertising). The problem poses a unique challenge from the pattern recognition point of view because a very high recognition rate is needed under non-ideal conditions. The problem consists of comparing a small audio sequence (the commercial ad) with a large audio stream (the broadcast), searching for matches. In this paper we present a solution based on the Multi-Band Spectral Entropy Signature (MBSES), which is very robust to degradations commonly found in amplitude modulated (AM) radio. Using the MBSES we obtained perfect recall (all occurrences of the audio ads were accurately found, with no false positives) in 95 hours of audio from five different AM radio broadcasts. Our system is able to scan one hour of audio in 40 seconds if the audio is already fingerprinted (e.g. by a separate slave computer), and it takes a total of five minutes per hour including the fingerprint extraction, using a single-core off-the-shelf desktop computer with no parallelization.
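Searching a long fingerprinted stream for occurrences of a short fingerprinted ad can be sketched as a sliding comparison. The sketch below is only an illustration under stated assumptions: each fingerprint frame is taken to be a small bit pattern stored as an integer, frames are compared with the Hamming distance, and the names `find_occurrences`, `bits_per_frame`, and `max_error_rate` are hypothetical. It does not compute the MBSES itself.

```python
from typing import List, Sequence


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprint frames."""
    return bin(a ^ b).count("1")


def find_occurrences(stream: Sequence[int], ad: Sequence[int],
                     bits_per_frame: int, max_error_rate: float) -> List[int]:
    """Return every start index where `ad` matches `stream`.

    A window matches when the total number of differing bits stays within
    max_error_rate * (bits in the ad fingerprint).
    """
    budget = int(max_error_rate * bits_per_frame * len(ad))
    hits = []
    for start in range(len(stream) - len(ad) + 1):
        errors = 0
        for offset, frame in enumerate(ad):
            errors += hamming(stream[start + offset], frame)
            if errors > budget:
                break  # abandon this window early
        else:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Hypothetical 4-bit frames; the ad occurs at index 1 and, with one
    # flipped bit, at index 4.
    stream = [0b1010, 0b0110, 0b1111, 0b0001, 0b0110, 0b1110, 0b0001]
    ad = [0b0110, 0b1111, 0b0001]
    print(find_occurrences(stream, ad, bits_per_frame=4, max_error_rate=0.1))
```

The error budget trades recall against false positives: a budget of zero accepts only exact matches, while a larger one tolerates the bit flips that signal degradation would introduce.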
International Journal of Machine Learning and Cybernetics | 2011
Antonio Camarena-Ibarrola; Edgar Chávez
State-of-the-art audio monitoring relies on local alignment algorithms between the target audio and the musical performance. The inductive hypothesis of a local alignment tool is that the alignment is correct up to the current position, so an error is dragged along and accumulates into subsequent errors, from which the tool does not recover unless elaborate heuristics are used. Our approach uses a non-local alignment scheme: short audio segments taken from the musical performance are searched for across the entire target audio to obtain the k nearest audio segments (proximity is determined using entropy-based audio fingerprints). The current audio segment of the performance is then paired with the nearest (in time) of the k previously selected audio segments of the target audio. To our knowledge, this is the first algorithm able to start from an arbitrary point in the target audio, for example when the musical performance had already begun by the time the monitoring system was turned on. We complemented the overall strategy with a simple heuristic: the candidates are ignored when they are all too far in time from the last position reported by the system. We have tested our method with 62 musical pieces, some pop but mostly classical. For every piece we have two performances; we use one as the target audio and the other as the performance to be monitored. We obtained excellent results.
Mexican International Conference on Artificial Intelligence | 2010
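The candidate selection step described above can be sketched in a few lines. This is a hedged illustration, not the published algorithm: it assumes the k-nearest-neighbor search has already returned candidate positions (in seconds) in the target audio, ordered best match first, and the names `track_step` and `max_jump` are hypothetical.

```python
from typing import Optional, Sequence


def track_step(candidates: Sequence[float],
               last_position: Optional[float],
               max_jump: float) -> Optional[float]:
    """Pick the tracked position for the current audio segment.

    candidates    -- positions of the k nearest target segments, best first
    last_position -- last position reported, or None at start-up
    max_jump      -- largest accepted time gap to the last position
    """
    if not candidates:
        return last_position
    if last_position is None:
        # Start-up from an arbitrary point: trust the best-ranked candidate
        # (an assumption of this sketch).
        return candidates[0]
    nearest = min(candidates, key=lambda t: abs(t - last_position))
    if abs(nearest - last_position) > max_jump:
        # All candidates are too far in time: ignore them.
        return last_position
    return nearest


if __name__ == "__main__":
    position = None
    for knn in ([40.0, 12.0, 95.0], [41.5, 3.0, 70.0], [90.0, 5.0, 120.0]):
        position = track_step(knn, position, max_jump=5.0)
        print(position)
```

Because each step consults the whole target audio rather than a window around the previous position, a single bad match does not propagate; the `max_jump` check only filters implausible leaps.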
Antonio Camarena-Ibarrola; Edgar Chávez
Real-time tracking of musical performances allows for the implementation of virtual teachers of musical instruments, automatic accompaniment of musicians or singers, and automatic addition of special effects in live presentations. State-of-the-art approaches make a local alignment of the score (the target audio) and a musical performance; such a procedure induces cumulative error since it assumes the rendition to be well tracked up to the current time. We propose searching for the k-nearest neighbors of the current audio segment among all audio segments of the score, then using some heuristics to decide the current tracked position of the performance inside the score. We tested the method with 62 songs, some pop music but mostly classical. For each song we have two performances; we use one of them as the score and the other as the music to be tracked, with excellent results.
International Symposium on Signal Processing and Information Technology | 2011
Alain Manzo-Martinez; Antonio Camarena-Ibarrola
In this paper we propose a new technique to characterize audio signals. We use Shannon's entropy to estimate the level of information content per chroma, and we show that involving entropy contributes to a more robust audio characterization. We propose a new audio fingerprint (AFP) based on this feature, which we call the Entropy-Chroma Fingerprint (ECFP). Two approaches were considered to estimate entropy: the first assumes the spectral coefficients are normally distributed, while the second estimates their probability density function (PDF) with the Parzen window estimation method. We compared the robustness of the ECFP against the Chromagram-Based Audio-Fingerprint (CBFP), which is determined using the Constant Q Transform (CQT). Three thousand five hundred AFPs were determined from songs of several genres. A subset of 350 songs was severely degraded and searched for using 5-second excerpts. The ECFP determined assuming Gaussianity of the PDF turned out to be much more robust than the CBFP. The ECFP determined assuming Gaussianity is also much faster to compute than both the CBFP and the ECFP determined with Parzen windows, and it is still more robust.
Iberoamerican Congress on Pattern Recognition | 2014
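Under the normality assumption, entropy has a closed form: the differential entropy of a Gaussian with variance σ² is ½·ln(2πe·σ²). The sketch below applies that formula to the coefficients grouped under each chroma. It is a simplified, one-dimensional illustration; how the coefficients are grouped, the variance floor, and the function names are assumptions, not details taken from the paper.

```python
import math
from statistics import pvariance
from typing import List, Sequence

# Assumed floor that keeps the logarithm defined for constant input.
VARIANCE_FLOOR = 1e-12


def gaussian_entropy(samples: Sequence[float]) -> float:
    """Differential entropy (in nats) of a Gaussian fitted to the samples."""
    variance = max(pvariance(samples), VARIANCE_FLOOR)
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def chroma_entropies(coeffs_by_chroma: Sequence[Sequence[float]]) -> List[float]:
    """One entropy value per chroma, given the coefficients mapped to it."""
    return [gaussian_entropy(coeffs) for coeffs in coeffs_by_chroma]


if __name__ == "__main__":
    # Hypothetical coefficients for two chroma classes.
    narrow = [-0.1, 0.1, -0.1, 0.1]
    wide = [-3.0, 3.0, -3.0, 3.0]
    print(chroma_entropies([narrow, wide]))
```

The closed form needs only a variance per chroma, which is consistent with the abstract's observation that the Gaussian variant is much cheaper than a Parzen window density estimate.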
Karina Figueroa; Antonio Camarena-Ibarrola; Jonathan García; Héctor Tejeda Villela
Photo-identification of naturally marked animals is a non-intrusive technique for obtaining valuable information about the population size and behavior of endangered species in the wild. In this paper we present a method for detecting and cutting out wild felines in pictures taken with trap-cameras installed in the forest and triggered by infrared sensors. The detection of these felines serves the purpose of collecting information useful in studies of population size or migration phenomena. We propose computing the difference between images taken by the same trap-camera within a short period of time. According to our experiments, our method is fast, reliable, and robust, and it can be used for other species with different pelage patterns.
IEEE International Autumn Meeting on Power Electronics and Computing | 2013
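Differencing two frames from the same camera to locate what changed can be sketched as below. This is a minimal sketch under stated assumptions: images are grayscale and given as equally sized lists of rows, a fixed intensity `threshold` separates real change from noise, and the function name `changed_region` is hypothetical. The paper's actual pipeline may involve further steps.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def changed_region(before: Image, after: Image, threshold: int) -> Optional[Box]:
    """Bounding box of pixels whose intensity changed by more than threshold.

    Returns None when no pixel changed enough, i.e. nothing entered the scene.
    """
    rows, cols = [], []
    for r, (row_before, row_after) in enumerate(zip(before, after)):
        for c, (b, a) in enumerate(zip(row_before, row_after)):
            if abs(a - b) > threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


def crop(image: Image, box: Box) -> Image:
    """Cut the boxed region out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


if __name__ == "__main__":
    empty = [[10] * 5 for _ in range(4)]
    animal = [row[:] for row in empty]
    animal[0][0] = 12            # small change, treated as noise
    animal[1][2] = 200           # pixels covered by the animal
    animal[2][3] = 180
    box = changed_region(empty, animal, threshold=30)
    print(box, crop(animal, box))
```

Because only the change between frames is used, the approach does not depend on the coat pattern of the animal, which matches the claim that it carries over to other species.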
José María Valencia-Ramírez; Antonio Camarena-Ibarrola
Discrete Hidden Markov Models (DHMMs) are used in Automatic Speech Recognition (ASR) systems to model the dynamics of utterances as stochastic processes. Some researchers, however, prefer Dynamic Time Warping (DTW) to deal with variations in the temporal evolution of utterances of the same word. Furthermore, some researchers in the field of ASR recommend Mel Frequency Cepstral Coefficients (MFCC) as the relevant features to be extracted from the speech signal, while others use Linear Prediction Coefficients (LPC) for that purpose. When evaluating the similarity of feature vectors we may use the Euclidean distance, the cosine distance, or the Itakura distance (when using LPC). We would like to know which combination of techniques ASR developers should use for the specific problem of Isolated Word Recognition. We implemented a number of ASR systems by changing the feature extraction module, the alignment technique, the distance measure, or the parameter values, and compared them for the benefit of those interested in developing Isolated Word Recognition systems. In this paper we report the results of our experiments using Receiver Operating Characteristic (ROC) curves to show which ASR system achieved the highest recognition rate.
Mexican International Conference on Computer Science | 2006
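For readers unfamiliar with DTW, the textbook dynamic-programming formulation is sketched below with a pluggable local distance, which is where the choice between Euclidean, cosine, or other measures would enter. It is a generic illustration with hypothetical names and toy one-dimensional feature vectors, not the configuration evaluated in the paper.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(a: Sequence[Vector], b: Sequence[Vector],
                 dist: Callable[[Vector, Vector], float] = euclidean) -> float:
    """Cost of the best monotonic alignment between two feature sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    word = [(0.0,), (1.0,), (2.0,)]
    slower = [(0.0,), (1.0,), (1.0,), (2.0,)]  # same shape, stretched in time
    print(dtw_distance(word, slower))
```

An isolated-word recognizer built on this would compare an utterance against one or more templates per word and report the word with the smallest DTW cost.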
Antonio Camarena-Ibarrola; Edgar Chávez
In automatic speech recognition, voice synthesis, speaker identification, and the identification of laryngeal diseases, it is critical to classify speech segments as voiced or unvoiced. Several techniques have been proposed for this task during the last twenty years. Unfortunately, they either have special cases where the result is unreliable or need to use not only the present segment of speech but the next one as well, which limits their applications (e.g. continuous speech recognition). In this paper we present an alternative to voiced/unvoiced classification using a discretization of the continuous Fourier transform
Mexican Conference on Pattern Recognition | 2018
Karina Figueroa; Nora Reyes; Antonio Camarena-Ibarrola; L. Valero-Elizondo
Similarity search is a central problem in many applications, such as multimedia databases and repositories containing complex non-structured objects. The metric space model is very useful in these scenarios because metric indexes support efficient similarity search, but most of them are designed for main memory. In this article we introduce an improved version of the List of Clustered Permutations (iLCP), a competitive index for approximate similarity search. Our proposal is specially adapted for secondary memory and performs well in several scenarios, especially on spaces of medium and high dimensionality. We assessed this new structure with several real-life metric spaces from SISAP. The results show that this new version keeps the rewarding characteristics of LCP while obtaining very good performance in terms of the number of pages read per search.
Pattern Recognition Letters | 2017
Karina Figueroa; Rodrigo Paredes; Antonio Camarena-Ibarrola; Héctor Tejeda-Villela
Similarity searching is a very useful task in several disciplines such as pattern recognition, machine learning, and decision theory. To solve this task we can use an index to speed up the search. Among current indices, the permutant-based searching approach has already proved its efficiency for high-dimensional data; however, up to now this approach had not been adapted to work with low-dimensional data, where it seemed useless. We propose several ways to adapt the permutant-based searching approach to low-dimensional data: using zones, varying the distribution of the radii, trying different distance measures, and using partial distance computation as well. After many experiments on a synthetic database of vectors, we arrived at conclusions about the optimal values of the parameters, and then used these learned values on real databases, obtaining excellent results for k-nearest neighbor queries in both high- and low-dimensional data.
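One common reading of "partial distance computation" is the early-abandon trick: accumulate the squared differences coordinate by coordinate and stop as soon as the running sum exceeds the current search radius. The sketch below illustrates that generic idea only; whether the paper uses exactly this variant is an assumption, and the function name `within_radius` is hypothetical.

```python
from typing import Sequence


def within_radius(u: Sequence[float], v: Sequence[float], radius: float) -> bool:
    """True when the Euclidean distance between u and v is at most radius.

    The sum of squared differences is abandoned as soon as it exceeds
    radius squared, so far-away vectors cost only a few coordinates.
    """
    limit = radius * radius
    total = 0.0
    for a, b in zip(u, v):
        total += (a - b) ** 2
        if total > limit:
            return False  # already too far, skip the remaining coordinates
    return True


if __name__ == "__main__":
    query = [0.0, 0.0, 0.0, 0.0]
    print(within_radius(query, [1.0, 1.0, 0.0, 0.0], radius=2.0))
    print(within_radius(query, [5.0, 9.0, 9.0, 9.0], radius=2.0))
```

In a k-nearest neighbor search the radius would shrink to the distance of the current k-th best candidate, so the savings grow as better candidates are found.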