Paulo A. A. Esquef
Helsinki University of Technology
Publications
Featured research published by Paulo A. A. Esquef.
Brain Research | 2008
Riia Milovanov; Minna Huotilainen; Vesa Välimäki; Paulo A. A. Esquef; Mari Tervaniemi
The main focus of this study was to examine the relationship between musical aptitude and second language pronunciation skills. We investigated whether children with superior performance in foreign language production represent musical sound features more readily at the preattentive level of neural processing than children with less-advanced production skills. Sound processing accuracy was examined in elementary school children by means of event-related potential (ERP) recordings and behavioral measures. Children with good linguistic skills had better musical skills, as measured by the Seashore musicality test, than children with less accurate linguistic skills. The ERP data were consistent with the behavioral results: children with good linguistic skills showed more pronounced sound-change-evoked activation to the music stimuli than children with less accurate linguistic skills. Taken together, the results imply that musical and linguistic skills could partly be based on shared neural mechanisms.
Neuroscience Letters | 2009
Riia Milovanov; Minna Huotilainen; Paulo A. A. Esquef; Paavo Alku; Vesa Välimäki; Mari Tervaniemi
We examined 10- to 12-year-old elementary school children's ability to preattentively process sound durations in music and speech stimuli. In total, 40 children had either advanced foreign-language production skills together with higher musical aptitude, or less advanced results in both the musicality and linguistic tests. Event-related potential (ERP) recordings of the mismatch negativity (MMN) show that duration changes in musical sounds are processed more prominently and accurately than changes in speech sounds. Moreover, children with advanced pronunciation and musicality skills displayed enhanced MMNs to duration changes in both speech and musical sounds. Thus, our study provides further evidence for the claim that musical aptitude and linguistic skills are interconnected, and that the musical features of the stimuli may play a preponderant role in preattentive duration processing.
IEEE Transactions on Audio, Speech, and Language Processing | 2006
Paulo A. A. Esquef; Luiz W. P. Biscainho
This paper presents an efficient model-based interpolator for reconstructing long stretches of missing samples in audio signals. A simple modification to an existing interpolation method is proposed, optimizing the balance between computational requirements and qualitative performance. Moreover, a post-processing multirate interpolation scheme is introduced to further enhance the quality of the restored signals. Subjective quality assessments of the reconstructed signals via listening tests substantiate the improved performance of the proposed algorithm.
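The core idea can be illustrated with a minimal least-squares AR interpolation sketch in the spirit of classical model-based interpolators; the function names, the AR order, and the plain Yule-Walker fit are illustrative assumptions, and the paper's computational simplification and multirate post-processing are not reproduced here.

```python
import numpy as np

def yule_walker_error_filter(x, order):
    """Autocorrelation-method AR fit; returns error-filter coeffs [1, -a1, ..., -ap]."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))

def ar_interpolate(x, gap, order=24):
    """Fill a contiguous gap of missing samples by minimizing the AR
    prediction-error energy over the unknown samples (least-squares sketch)."""
    x = np.asarray(x, dtype=float)
    known = np.ones(len(x), dtype=bool)
    known[gap] = False
    # Crude: the AR model is fitted to the known samples concatenated across the gap.
    b = yule_walker_error_filter(x[known], order)
    # Each row of A applies the error filter to one window of the signal.
    n_rows = len(x) - order
    A = np.zeros((n_rows, len(x)))
    for i in range(n_rows):
        A[i, i:i + order + 1] = b[::-1]
    A_unk, A_kn = A[:, ~known], A[:, known]
    x_gap, *_ = np.linalg.lstsq(A_unk, -A_kn @ x[known], rcond=None)
    y = x.copy()
    y[gap] = x_gap
    return y
```

A call such as `ar_interpolate(x, np.arange(1000, 1100))` would reconstruct a 100-sample gap; the paper additionally refines the reconstruction with a multirate post-processing stage, which this sketch omits.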
Workshop on Applications of Signal Processing to Audio and Acoustics | 2003
Ismo Kauppinen; Paulo A. A. Esquef; Vesa Välimäki
Auto-regressive modeling of measured data is commonly used in numerous signal processing applications. When aiming for high accuracy, Burg's method has been found to give a suitable model. It has been shown that when the signal energy is non-uniformly distributed over a frequency range, the use of a modified frequency scale is advantageous. This is often the case with audio signals. We introduce a frequency-warped version of Burg's method for calculating the auto-regressive filter parameters. A bilinear frequency mapping can be embedded in Burg's method by replacing the unit delays of its lattice structure with first-order allpass filters. The benefits of the frequency-warped Burg's method are demonstrated by comparing its signal modeling performance against those of the conventional Burg's method and the warped Yule-Walker method.
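For reference, a compact sketch of the conventional Burg recursion is shown below (illustrative only; it is not the paper's warped variant, which replaces the unit delay feeding the backward error path with a first-order allpass section).

```python
import numpy as np

def burg_ar(x, order):
    """Conventional Burg's method: choose each reflection coefficient to
    minimize the summed forward and backward prediction-error energies.
    Returns error-filter coefficients [1, a1, ..., ap]."""
    x = np.asarray(x, dtype=float)
    f = x.copy()                 # forward prediction error
    b = x.copy()                 # backward prediction error
    a = np.array([1.0])
    for m in range(order):
        num = -2.0 * np.dot(b[m:-1], f[m + 1:])
        den = np.dot(f[m + 1:], f[m + 1:]) + np.dot(b[m:-1], b[m:-1])
        k = num / den            # reflection coefficient of stage m + 1
        f_old = f.copy()
        f[m + 1:] = f_old[m + 1:] + k * b[m:-1]
        b[m + 1:] = b[m:-1] + k * f_old[m + 1:]
        # Levinson-style update of the AR polynomial
        a = np.concatenate((a, [0.0]))
        a = a + k * a[::-1]
    return a
```

In the warped case described in the abstract, each delayed backward-error term would instead be produced by a first-order allpass filter H(z) = (z^-1 - lambda) / (1 - lambda z^-1), with lambda the warping factor.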
EURASIP Journal on Advances in Signal Processing | 2003
Paulo A. A. Esquef; Matti Karjalainen; Vesa Välimäki
This paper addresses model-based analysis of string instrument sounds. In particular, it reviews the application of autoregressive (AR) modeling to sound analysis/synthesis purposes. Moreover, a frequency-zooming autoregressive moving average (FZ-ARMA) modeling scheme is described. The performance of the FZ-ARMA method on modeling the modal behavior of isolated groups of resonance frequencies is evaluated for both synthetic and real string instrument tones immersed in background noise. We demonstrate that the FZ-ARMA modeling is a robust tool to estimate the decay time and frequency of partials of noisy tones. Finally, we discuss the use of the method in synthesis of string instrument sounds.
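As a rough illustration of the frequency-zooming idea only (not the paper's full FZ-ARMA procedure), the sketch below heterodynes a narrow band around one resonance down to DC, decimates, fits a single pole by least squares, and maps it back to a partial frequency and decay time; the parameter values and the crude lowpass are assumptions.

```python
import numpy as np

def zoomed_mode_estimate(x, fs, f_center, zoom=32):
    """Estimate the frequency and decay time of one partial near f_center by
    frequency zooming: modulate the band to DC, lowpass and decimate by
    `zoom`, then fit a single complex pole by least squares."""
    n = np.arange(len(x))
    z = x * np.exp(-2j * np.pi * f_center * n / fs)       # shift band of interest to DC
    lp = np.ones(zoom) / zoom                             # crude lowpass (illustrative)
    z = np.convolve(z, lp, mode="same")[::zoom]           # decimated, zoomed signal
    p = np.vdot(z[:-1], z[1:]) / np.vdot(z[:-1], z[:-1])  # LS one-pole fit: z[n+1] ~ p z[n]
    fs_zoom = fs / zoom
    f_est = f_center + np.angle(p) * fs_zoom / (2.0 * np.pi)  # partial frequency (Hz)
    tau = -1.0 / (fs_zoom * np.log(np.abs(p)))                # decay time constant (s)
    return f_est, tau
```

The paper fits higher-order ARMA models in the zoomed band, which captures groups of closely spaced modes and two-stage decays that a one-pole fit cannot represent.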
IEEE Transactions on Information Forensics and Security | 2014
Paulo A. A. Esquef; José Antonio Apolinário; Luiz W. P. Biscainho
In this paper, an edit detection method for forensic audio analysis is proposed. It develops and improves a previous method through changes in the signal processing chain and a novel detection criterion. As in the original method, electrical network frequency (ENF) analysis is central to the new edit detector, since it allows monitoring anomalous variations of the ENF related to audio edit events. Working in an unsupervised manner, the edit detector compares the extent of ENF variations, centered at the nominal frequency, against a variable threshold that defines the upper limit for normal variations observed in unedited signals. ENF variations caused by edits are likely to exceed this threshold, providing a mechanism for their detection. The proposed method is evaluated in both qualitative and quantitative terms on two distinct annotated databases. Results are reported for the originally noisy database signals as well as for versions of them further degraded under controlled conditions. A comparative performance evaluation, in terms of equal error rate (EER), reveals that, for one of the tested databases, the EER improves from 7% to 4% when moving from the original to the new edit detection method. When the signals are amplitude-clipped or corrupted by broadband background noise, the performance figures of the new method follow the same profile as those of the original method.
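A toy version of the underlying idea (not the paper's processing chain, and using a fixed rather than a variable threshold) can be sketched as follows; the STFT peak tracker, window length, and threshold value are assumptions for illustration.

```python
import numpy as np
from scipy.signal import stft

def enf_edit_flags(x, fs, nominal=60.0, band=1.0, thresh_hz=0.2):
    """Track the spectral peak near the nominal mains frequency frame by frame
    and flag frames whose deviation from the nominal value exceeds a fixed
    threshold. A practical ENF extractor needs much finer frequency
    resolution (downsampling, narrow bandpass, interpolation) than this."""
    nper = int(4 * fs)                                    # 4 s frames -> 0.25 Hz bins
    f, t, Z = stft(x, fs=fs, nperseg=nper, noverlap=nper // 2)
    sel = (f >= nominal - band) & (f <= nominal + band)
    enf = f[sel][np.argmax(np.abs(Z[sel, :]), axis=0)]    # per-frame peak frequency
    return t, enf, np.abs(enf - nominal) > thresh_hz
```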
International Conference on Digital Signal Processing | 2002
Paulo A. A. Esquef; Matti Karjalainen; Vesa Välimäki
Warped linear prediction (WLP) is applied to a model-based method for detecting impulsive disturbances in audio signals. According to simulations performed on artificially corrupted audio signals, the adoption of negative values for the warping factor favors the click detection scheme. As a consequence, for equal levels of missed (false) detections, the WLP-based scheme yields a consistently lower percentage of false (missed) detections than the conventional method.
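As a minimal sketch of the model-based detection step (not the paper's exact scheme, and using an ordinary rather than a warped predictor), one can threshold the prediction residual against a robust estimate of its spread.

```python
import numpy as np

def detect_clicks(x, a, k=4.0):
    """Flag samples whose linear-prediction residual exceeds k times a robust
    (MAD-based) spread estimate. `a` is an error filter [1, a1, ..., ap],
    e.g. from Burg's method; the paper instead uses warped LP with a
    negative warping factor, which is not implemented here."""
    e = np.convolve(np.asarray(x, float), np.asarray(a, float), mode="same")
    sigma = 1.4826 * np.median(np.abs(e - np.median(e)))   # robust std via MAD
    return np.abs(e) > k * sigma
```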
International Symposium on Circuits and Systems | 2001
Luiz W. P. Biscainho; Paulo S. R. Diniz; Paulo A. A. Esquef
It was recently shown (May 200) that the sub-band signals resulting from the analysis of an ARMA process by a decimating filter bank can be modeled by individual lower-rate ARMA processes. This paper discusses the validity of this modeling and the zero-pole mapping of the generator filters along the sub-bands. Based on these results, a sub-band version of a model-based technique for de-clicking of audio recordings is proposed. Simulation results are presented and validated by subjective as well as objective means.
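A minimal two-band decimating split illustrates the sub-band setting (an assumed stand-in for the paper's filter bank, not its actual design); per-band model-based declicking would then run on each lower-rate branch.

```python
import numpy as np
from scipy.signal import firwin, lfilter

def two_band_split(x, ntaps=63):
    """Split a signal into low and high bands at fs/4 and downsample each
    branch by 2. Not a perfect-reconstruction bank; purely illustrative."""
    h_lo = firwin(ntaps, 0.5)                    # lowpass, cutoff at half Nyquist
    h_hi = h_lo * (-1.0) ** np.arange(ntaps)     # modulate to obtain the highpass
    lo = lfilter(h_lo, [1.0], x)[::2]
    hi = lfilter(h_hi, [1.0], x)[::2]
    return lo, hi
```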
International Workshop on Information Forensics and Security | 2015
Paulo A. A. Esquef; José Antonio Apolinário; Luiz W. P. Biscainho
In a recent paper published in IEEE TIFS, we proposed an edit detection method based on instantaneous variations of the electrical network frequency (ENF). In this work, we modify the detection criteria of that method by taking advantage of the typical pattern of ENF variations elicited by audio edits. We describe the implemented modifications and directly compare the performance of both methods using two distinct signal databases containing real-life speech recordings. The experimental results demonstrate that the new proposal achieves improved performance, in terms of lower equal error rates, compared with its former version.
Multimedia Signal Processing | 2009
Luiz W. P. Biscainho; Paulo A. A. Esquef; Fabio P. Freeland; Leonardo O. Nunes; Alan Freihof Tygel; Bowon Lee; Amir Said; Ton Kalker; Ronald W. Schafer
Modern telepresence systems can deliver multimedia signals with an unprecedentedly high quality of experience to the user. Setting up and maintaining such services calls for reliable and automatic tools for multimedia quality probing, especially those targeted at speech data along the transmission path. Most objective methods for sound quality assessment (QA) in the literature are intended either for speech signals of 4- to 8-kHz bandwidth or for general audio up to 24 kHz, but are not specifically designed for speech at high sampling rates. This work approaches quality evaluation of full-band (24-kHz) high-quality speech corrupted by echo. A simple metric singled out from a standardized double-ended tool for audio QA is proposed as a solution to the problem at hand. Quality ratings for a set of speech stimuli corrupted by echo under controlled conditions were obtained via listening tests, allowing calibration and evaluation of the proposed method. Experimental results reveal an overall correlation of 0.94 between objective and subjective scores, even in the presence of moderate additive noise.