Leandro E. Di Persia
National Scientific and Technical Research Council
Publication
Featured research published by Leandro E. Di Persia.
Signal Processing | 2008
Leandro E. Di Persia; Diego H. Milone; Hugo Leonardo Rufiner; Masuzo Yanagida
In a previous article, an evaluation of several objective quality measures as predictors of recognition rate after the application of a blind source separation algorithm was reported. In this work, the experiments were repeated using some new measures based on the perceptual evaluation of speech quality (PESQ), which is part of the ITU-T P.862 standard for the evaluation of communication systems. The raw PESQ and a nonlinearly transformed PESQ were evaluated, together with several composite measures. The results show that the PESQ-based measures outperformed all the measures reported in the previous work. Based on these results, we recommend the use of PESQ-based measures to evaluate blind source separation algorithms for automatic speech recognition.
Signal Processing | 2007
Leandro E. Di Persia; Masuzo Yanagida; Hugo Leonardo Rufiner; Diego H. Milone
Determining the quality of the signals obtained by blind source separation is very important for the development and evaluation of such algorithms. When this approach is used as a pre-processing stage for automatic speech recognition, the quality measure used to assess the separation should be related to the recognition rates of the system. Many measures have been used for quality evaluation, but in general they have been applied without prior study of their capabilities as quality measures in the context of blind source separation, and they often require experimentation in unrealistic conditions. Moreover, these measures only try to evaluate the amount of separation, and this value may not be directly related to recognition rates. Presented in this work is a study of several objective quality measures evaluated as predictors of the recognition rate of a continuous speech recognizer. The correlation between quality measures and recognition rates is analyzed for a separation algorithm applied to signals recorded in a real room with different reverberation times and different kinds and levels of noise. The results of this analysis show a very good correlation between the weighted spectral slope measure and the recognition rate. Furthermore, good performance of the total relative distortion and cepstral measures was observed for rooms with relatively long reverberation times.
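The kind of analysis described in this abstract, relating an objective quality measure to recognition rates, can be illustrated with a plain Pearson correlation. The sketch below is only an illustration under assumptions: the abstract does not state which correlation statistic was used, the function name `pearson` is hypothetical, and the numbers are made-up placeholders, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up placeholder values (NOT data from the paper): one quality score and
# one recognition rate per test condition (room, noise type, noise level).
quality_scores = [1.0, 2.0, 3.0, 4.0]
recognition_rates = [10.0, 20.0, 30.0, 40.0]

r = pearson(quality_scores, recognition_rates)
```

A measure whose correlation with the recognition rate is close to 1 in absolute value across conditions would be a good predictor in the sense discussed in the abstract.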
Medical Engineering & Physics | 2014
Gastón Schlotthauer; Leandro E. Di Persia; Luis Darío Larrateguy; Diego H. Milone
Detection of desaturations in the pulse oximetry signal is of great importance for the diagnosis of sleep apneas. By counting desaturations, an index can be built to help in the diagnosis of severe cases of obstructive sleep apnea-hypopnea syndrome. It is important to have automatic detection methods that allow screening for this syndrome, reducing the need for expensive polysomnography-based studies. In this paper, a novel recognition method based on the empirical mode decomposition of the pulse oximetry signal is proposed. The desaturations produce a very specific wave pattern that is extracted in the modes of the decomposition. Using this information, a detector based on properly selected thresholds and a set of simple rules is built. The oxygen desaturation index constructed from these detections produces a detector for obstructive sleep apnea-hypopnea syndrome with high sensitivity (0.838) and specificity (0.855) and yields better results than standard desaturation detection approaches.
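To make the quantities in this abstract concrete, the sketch below computes an oxygen desaturation index as the number of detected desaturations per hour of recording, and the sensitivity and specificity of a binary detector. It assumes the desaturation events have already been detected (the empirical mode decomposition step is not shown), and all function names are hypothetical.

```python
def oxygen_desaturation_index(num_events, recording_seconds):
    """Detected desaturation events per hour of recording."""
    if recording_seconds <= 0:
        raise ValueError("recording duration must be positive")
    return num_events / (recording_seconds / 3600.0)


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity of boolean predictions against boolean labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 120 detected events in an 8-hour recording.
odi = oxygen_desaturation_index(120, 8 * 3600)
```

A subject would then be flagged when the index exceeds a chosen threshold; the threshold value is a design parameter that the abstract does not specify.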
International Conference of the IEEE Engineering in Medicine and Biology Society | 2011
Pablo Gautério Cavalcanti; Jacob Scharcanski; Leandro E. Di Persia; Diego H. Milone
Segmentation is an important step in computer-aided diagnostic systems for pigmented skin lesions, since a good definition of the lesion area and its boundary in the image is very important to distinguish benign from malignant cases. In this paper, a new skin lesion segmentation method is proposed. This method uses Independent Component Analysis to locate skin lesions in the image, and this location information is further refined by a level-set segmentation method. Our method was evaluated on 141 images and achieved an average segmentation error of 16.55%, lower than the results for comparable state-of-the-art methods proposed in the literature.
Instrumentation and Measurement Technology Conference | 2014
Pablo Gautério Cavalcanti; Jacob Scharcanski; Cesar E. Martinez; Leandro E. Di Persia
Melanoma is a type of malignant pigmented skin lesion and is currently among the most dangerous cancers. Segmentation is an important step in computer-aided pre-screening systems for pigmented skin lesions, because a good definition of the lesion area and its rim is very important for discriminating between benign and malignant cases. In this paper, we propose to segment pigmented skin lesions using the Non-negative Matrix Factorization of the multi-channel skin lesion image representation. Our preliminary experimental results on a publicly available dataset suggest that our method obtains lower segmentation errors (on average) than comparable state-of-the-art methods proposed in the literature.
Signal Processing | 2016
Leandro E. Di Persia; Diego H. Milone
In frequency-domain independent component analysis (FD-ICA) approaches to audio source separation, the convolutive mixing problem is replaced by several instantaneous mixing problems, one for each frequency bin of the short-time Fourier transform. This methodology yields good results but requires solving the permutation ambiguity. Moreover, the performance of the separation algorithm is not guaranteed to be equivalent across bins, so some bins can have worse results than others. In this paper, a technique based on data from multiple bins is proposed to address these issues. The use of multiple-bin information couples the separation across bins, resulting in more stable separation matrices and reducing the occurrence of permutations, but increasing the computational cost. This cost can be mitigated by sub-sampling the multiple-bin information. The results show that both approaches are beneficial for the frequency-domain ICA approach, producing better separation in terms of objective quality measures.
Highlights:
A method for stabilization of FD-ICA using lateral bins information is proposed.
The method produces a robust estimation of separation matrices for all bins.
The permutation rate is also reduced, which improves the separation quality.
Using subsampling, the introduced computational cost overhead can be reduced.
The trade-off between quality improvement and overhead can be controlled.
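The permutation ambiguity mentioned in this abstract arises because each frequency bin is separated independently, so the order of the outputs can differ from bin to bin. The sketch below shows a classic envelope-correlation alignment for two sources. It is not the multiple-bin method proposed in the paper; it is a simpler, well-known heuristic used here only to illustrate the problem, and the function names and data layout are assumptions.

```python
def _corr(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den_u = sum((a - mean_u) ** 2 for a in u) ** 0.5
    den_v = sum((b - mean_v) ** 2 for b in v) ** 0.5
    return num / (den_u * den_v) if den_u > 0 and den_v > 0 else 0.0


def align_permutations(envelopes):
    """Decide, bin by bin, whether the two separated outputs should be swapped.

    envelopes[k] is a pair (env_a, env_b) holding the amplitude envelopes over
    time of the two outputs in frequency bin k. Bin 0 is taken as the reference
    ordering. Returns one boolean per bin: True means "swap the outputs".
    """
    swaps = [False]
    ref_a, ref_b = envelopes[0]
    for env_a, env_b in envelopes[1:]:
        keep_score = _corr(ref_a, env_a) + _corr(ref_b, env_b)
        swap_score = _corr(ref_a, env_b) + _corr(ref_b, env_a)
        do_swap = swap_score > keep_score
        swaps.append(do_swap)
        # The aligned envelopes of this bin become the reference for the next bin.
        ref_a, ref_b = (env_b, env_a) if do_swap else (env_a, env_b)
    return swaps


# Toy envelopes: the outputs of the middle bin are in the opposite order.
s1 = [1.0, 0.0, 1.0, 0.0]
s2 = [0.0, 1.0, 1.0, 0.0]
swaps = align_permutations([(s1, s2), (s2, s1), (s1, s2)])
```

Because each bin is compared only with its predecessor, one wrong decision can propagate to the following bins, which is one reason why approaches that reduce the occurrence of permutations in the first place are attractive.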
Mexican International Conference on Artificial Intelligence | 2007
Diego H. Milone; Leandro E. Di Persia
The wavelet transform has been used for feature extraction in many pattern recognition applications. However, learning algorithms are in general not designed to take into account the properties of the features obtained with the discrete wavelet transform. In this work, we propose a Markovian model to classify sequences of frames in the wavelet domain. The architecture is a composite of an external hidden Markov model in which the observation probabilities are provided by a set of hidden Markov trees. Training algorithms are developed for the composite model using the expectation-maximization framework. We also evaluate a novel delay-invariant representation to improve wavelet feature extraction for classification tasks. The proposed methods can be easily extended to model sequences of images. Here we present phoneme recognition experiments with the TIMIT speech corpus. The robustness of the proposed architecture and learning method was tested by reducing the amount of training data to a few patterns. Recognition rates were better than those of hidden Markov models with observation densities based on Gaussian mixtures.
Signal Processing Systems | 2011
Leandro E. Di Persia; Diego H. Milone; Masuzo Yanagida
In a recent publication, the pseudoanechoic mixing model for closely spaced microphones was proposed and a blind audio source separation algorithm based on this model was developed. This method uses frequency-domain independent component analysis to identify the mixing parameters. These parameters are used to synthesize the separation matrices, and then a time-frequency Wiener postfilter is applied to improve the separation. In this contribution, key aspects of the separation algorithm are optimized with two novel methods. A deeper analysis of the working principles of the Wiener postfilter is presented, which provides insight into its reverberation reduction capabilities. A variation of this postfilter that uses the information of previous frames to improve performance is also introduced. The basic method uses a fixed central frequency bin for the estimation of the mixture parameters. In this contribution, an automatic selection of the central bin, based on information about the separability of the sources, is introduced. The improvements obtained through these methods are evaluated in an automatic speech recognition task and with the PESQ objective quality measure. The results show increased robustness and stability of the proposed method, enhancing the separation quality and improving the recognition rate of an automatic speech recognition system.
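A time-frequency Wiener postfilter of the kind mentioned in this abstract can be sketched as a per-bin gain that compares the power of the target output with the power of the competing output. The form below is a common textbook-style Wiener gain for two separated outputs and is an assumption; the exact filter used in the paper, and its variation using previous frames, may differ. The function name and the small constant `eps` are hypothetical.

```python
def wiener_postfilter(y_target, y_interf, eps=1e-12):
    """Apply a Wiener-style gain to each time-frequency coefficient.

    y_target and y_interf are lists of frames; each frame is a list of complex
    STFT coefficients of the target output and of the competing output.
    The gain |Yt|^2 / (|Yt|^2 + |Yi|^2) attenuates time-frequency points that
    are dominated by the competing output.
    """
    filtered = []
    for frame_t, frame_i in zip(y_target, y_interf):
        row = []
        for coef_t, coef_i in zip(frame_t, frame_i):
            power_t = abs(coef_t) ** 2
            power_i = abs(coef_i) ** 2
            gain = power_t / (power_t + power_i + eps)  # eps avoids division by zero
            row.append(gain * coef_t)
        filtered.append(row)
    return filtered


# Toy example with one frame and two bins: the first bin is dominated by the
# target, while in the second both outputs have equal power.
out = wiener_postfilter([[2 + 0j, 1 + 0j]], [[0j, 1j]])
```

In the toy example the first coefficient passes almost unchanged, while the second is attenuated by roughly one half because both outputs have the same power in that bin.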
Journal of the Acoustical Society of America | 2006
Tadashige Noguchi; Kenko Ota; Masuzo Yanagida; Leandro E. Di Persia
Conventional frequency‐domain ICA yields the optimal separation for each frequency bin, but it suffers from the permutation problem. The authors have developed permutation‐free ICA as a separation scheme by obtaining the separation matrix for a long vector consisting of temporal changes of all frequency components of the received signal. The permutation‐free ICA, however, only yields a common separation matrix for all frequency bins. So, although the method avoids the permutation problem, the separation matrix obtained in the permutation‐free ICA has a common directivity pattern for all frequency bins. Proposed in this paper is a scheme of multibin ICA that deconvolves mixed signals into original source signals by shifting piecewise integration of a set of frequency bins consisting of the frequency bin of interest and neighboring frequency bins. The proposed method can yield nearly optimal directivity in the form of the separation matrix for each frequency bin, avoiding the permutation problem. Performanc...
Pattern Recognition | 2010
Diego H. Milone; Leandro E. Di Persia; María Eugenia Torres