Jon Sanchez
University of the Basque Country
Publications
Featured research published by Jon Sanchez.
Iberoamerican Congress on Pattern Recognition | 2003
Juan J. Igarza; Iñaki Goirizelaia; Koldo Espinosa; Inmaculada Hernáez; Raúl Méndez; Jon Sanchez
Most people are used to signing documents, and because of this it is a trusted and natural method for user identity verification, reducing the cost of password maintenance and decreasing the risk of eBusiness fraud. In the proposed system, identity is securely verified and an authentic electronic signature is created using biometric dynamic signature verification. Shape, speed, stroke order, off-tablet motion, pen pressure and timing information are captured and analyzed during the real-time act of signing the handwritten signature. The captured values are unique to an individual and virtually impossible to duplicate. This paper presents a study of various HMM-based techniques for signature verification. Different topologies are compared in order to obtain an optimized, high-performance signature verification system, and signal normalization preprocessing makes the system robust with respect to writer variability.
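The normalization preprocessing that makes such a system robust to writer variability can be illustrated with a minimal sketch; the function name and the per-channel z-score scheme are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def normalize_signature(features):
    """Z-score normalize each captured channel (x, y, pressure, ...)
    so that downstream HMM scoring is robust to writer-dependent
    offset and scale. `features` has shape (samples, channels)."""
    mu = features.mean(axis=0)
    sigma = features.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant channels centered but unscaled
    return (features - mu) / sigma
```

After this step, each channel of every signature has zero mean and unit variance, so HMM emission models compare trajectory shapes rather than absolute tablet coordinates.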
International Conference on Acoustics, Speech, and Signal Processing | 2007
Iker Luengo; Ibon Saratxaga; Eva Navas; Inmaculada Hernáez; Jon Sanchez; Iñaki Sainz
A novel algorithm based on classical cepstrum calculation followed by dynamic programming is presented in this paper. The algorithm has been evaluated with a 60-minute database containing 60 speakers and different recording conditions and environments. A second reference database has also been used. In addition, the performance of four popular PDA algorithms has been evaluated with the same databases. The results demonstrate the good performance of the described algorithm in noisy conditions. Furthermore, the paper is a first initiative to evaluate widely used PDA algorithms over an extensive and realistic database.
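The per-frame cepstral pitch estimate that such an algorithm starts from can be sketched as follows; the window choice, search range, and function name are assumptions, and the dynamic-programming smoothing across frames is not shown:

```python
import numpy as np

def cepstral_f0(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate F0 for one frame as the quefrency of the cepstral
    peak inside the plausible pitch-period range [fs/fmax, fs/fmin]."""
    windowed = frame * np.hanning(len(frame))
    log_spec = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.fft.irfft(log_spec)
    qmin, qmax = int(fs / fmax), int(fs / fmin)
    peak = qmin + np.argmax(cepstrum[qmin:qmax])
    return fs / peak
```

A periodic signal produces a strong cepstral peak at its pitch period in samples; dynamic programming would then select, frame by frame, the candidate sequence with the smoothest F0 trajectory.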
IEEE Transactions on Information Forensics and Security | 2015
Jon Sanchez; Ibon Saratxaga; Inma Hernaez; Eva Navas; Daniel Erro; Tuomo Raitio
In the field of speaker verification (SV), it is nowadays feasible and relatively easy to create a synthetic voice to deceive a speech-driven biometric access system. This paper presents a synthetic speech detector that can be connected at the front-end or at the back-end of a standard SV system and that will protect it from spoofing attacks coming from state-of-the-art statistical Text-to-Speech (TTS) systems. The system described is a Gaussian Mixture Model (GMM) based binary classifier that uses natural and copy-synthesized signals obtained from the Wall Street Journal database to train the system models. Three different state-of-the-art vocoders are chosen and modeled using two sets of acoustic parameters: 1) relative phase shift and 2) canonical Mel Frequency Cepstral Coefficients (MFCC), as baseline. The vocoder dependency of the system and multi-vocoder modeling features are thoroughly studied. Additional phase-aware vocoders are also tested. Several experiments are carried out, showing that the phase-based parameters perform better and are able to cope with new, unknown attacks. The final evaluations, testing synthetic TTS signals obtained from the Blizzard Challenge, validate our proposal.
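The decision stage of such a GMM-based binary classifier can be sketched in pure NumPy with diagonal covariances; the function names and the log-likelihood-ratio decision rule are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def gmm_loglik(X, weights, means, variances):
    """Total log-likelihood of frames X (T, D) under a diagonal-covariance
    GMM with K components: weights (K,), means (K, D), variances (K, D)."""
    diff = X[:, None, :] - means[None, :, :]                     # (T, K, D)
    exp_term = -0.5 * np.sum(diff ** 2 / variances, axis=2)       # (T, K)
    log_norm = -0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
    log_comp = np.log(weights) + log_norm + exp_term              # (T, K)
    m = log_comp.max(axis=1, keepdims=True)                       # log-sum-exp
    return np.sum(m.squeeze(1) + np.log(np.exp(log_comp - m).sum(axis=1)))

def llr_score(X, gmm_natural, gmm_synthetic):
    """Positive score -> classified as natural, negative -> synthetic."""
    return gmm_loglik(X, *gmm_natural) - gmm_loglik(X, *gmm_synthetic)
```

In practice the two models would be trained on natural and copy-synthesized feature vectors respectively, and the score thresholded to accept or reject the utterance.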
Speech Communication | 2016
Ibon Saratxaga; Jon Sanchez; Zhizheng Wu; Inma Hernaez; Eva Navas
Highlights: phase-information-based synthetic speech detectors (RPS, MGD) are analyzed; training on real attack samples and on copy-synthesized material is evaluated; the detectors are evaluated against unknown attacks, including channel effects; the detectors work well against voice conversion and adapted synthetic speech impostors. Taking advantage of the fact that most speech processing techniques neglect the phase information, we seek to detect phase perturbations in order to prevent synthetic impostors from attacking Speaker Verification systems. Two Synthetic Speech Detection (SSD) systems that use spectral phase related information are reviewed and evaluated in this work: one based on the Modified Group Delay (MGD), and the other based on the Relative Phase Shift (RPS). A classical module-based MFCC system is also used as baseline. Different training strategies are proposed and evaluated using both real spoofing samples and copy-synthesized signals derived from the natural ones, aiming to alleviate the issue of obtaining real data to train the systems. The recently published ASVspoof2015 database is used for training and evaluation. Performance with completely unrelated data is also checked using synthetic speech from the Blizzard Challenge as evaluation material. The results prove that phase information can be successfully used for the SSD task, even with unknown attacks.
Text, Speech and Dialogue | 2005
Eva Navas; Inmaculada Hernáez; Iker Luengo; Jon Sanchez; Ibon Saratxaga
This paper presents the analysis made to assess the suitability of neutral semantic corpora for studying emotional speech. Two corpora have been used: one with neutral texts common to all emotions, and the other with texts related to each emotion. Subjective and objective analyses have been performed. In the subjective test the common corpus achieved good recognition rates, although worse than those obtained with the specific texts. In the objective analysis, differences among emotions are larger for the common texts than for the specific texts, indicating that in the common corpus the expression of emotions was more exaggerated. This is convenient for emotional speech synthesis, but not for emotion recognition. Thus, the common corpus is suitable for the prosodic modeling of emotions to be used in speech synthesis, whereas for emotion recognition the specific texts are more convenient.
Literary and Linguistic Computing | 2013
Gotzon Aurrekoetxea; Karmele Fernandez-Aguirre; Jesús Rubio; Borja Ruiz; Jon Sanchez
The aim of this paper is to present a new tool for dialectometry. The program, called “DiaTech”, incorporates features of previous programs, especially the VDM program created under the direction of H. Goebl at the University of Salzburg. The main goal of the new tool is to encourage dialectology studies and dialectologists by putting a comfortable and efficient tool in their hands.
Proceedings of the 2007 COST Action 2102 International Conference on Verbal and Nonverbal Communication Behaviours | 2007
Eva Navas; Inmaculada Hernáez; Iker Luengo; Iñaki Sainz; Ibon Saratxaga; Jon Sanchez
In expressive speech synthesis, some method of mimicking the way one specific speaker expresses emotions is needed. In this work we have studied the suitability of long-term prosodic parameters and short-term spectral parameters for reflecting emotions in speech, by analyzing the results of two automatic emotion classification systems. Those systems have been trained with different emotional mono-speaker databases recorded in standard Basque that include six emotions. Both of them are able to differentiate among emotions for a specific speaker with very high identification rates (above 75%), but the models are not applicable to other speakers (identification rates drop to 20%). Therefore, in the synthesis process the control of both spectral and prosodic features is essential to obtain expressive speech, and when a change of speaker is desired the values of the parameters should be re-estimated.
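Long-term prosodic parameters of this kind are typically utterance-level statistics of the F0 and energy contours; a hedged sketch, where the feature set and names are illustrative rather than the paper's exact parameterization:

```python
import numpy as np

def prosodic_features(f0, energy):
    """Utterance-level prosodic summary: statistics of the F0 and
    energy contours. Unvoiced frames are assumed to carry f0 == 0."""
    voiced = f0 > 0
    return np.array([
        f0[voiced].mean(), f0[voiced].std(),   # pitch level and range
        energy.mean(), energy.std(),           # loudness level and range
        voiced.mean(),                         # voicing rate
    ])
```

One such vector per utterance would then feed the emotion classifier, alongside frame-level spectral parameters for the short-term stream.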
Mediterranean Electrotechnical Conference | 2004
Amaia Castelruiz; Jon Sanchez; X. Zalbide; Eva Navas; I. Graminde
This paper presents a multimedia archive containing audio and video recordings that can be easily accessed via the Web. The database structure and the types of information considered relevant are described. The system implementation, i.e. the data addition process and the Web-based search application developed, is explained. Statistical data on the success of the system are included.
Conference of the International Speech Communication Association | 2016
Daniel Erro; Agustín Alonso; Luis Serrano; David Tavarez; Igor Odriozola; Xabier Sarasola; Eder del Blanco; Jon Sanchez; Ibon Saratxaga; Eva Navas; Inma Hernaez
This paper describes our entry to the Voice Conversion Challenge 2016. Based on the maximum likelihood parameter generation algorithm, the method is a reformulation of the minimum generation error training criterion. It uses a GMM for soft classification, a Mel-cepstral vocoder for acoustic analysis, and an improved dynamic time warping procedure for source-target alignment. To compensate for the oversmoothing effect, the generated parameters are filtered through a speaker-independent postfilter implemented as a linear transform in the cepstral domain. The process is completed with mean and variance adaptation of the log-fundamental frequency and duration modification by a constant factor. The results of the evaluation show that the proposed system achieves high conversion accuracy in comparison with other systems, while its naturalness scores are intermediate.
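The mean and variance adaptation of log-F0 mentioned at the end can be sketched as follows, assuming (as a convention, not necessarily the system's) that unvoiced frames are marked with 0:

```python
import numpy as np

def adapt_logf0(src_logf0, tgt_mean, tgt_std):
    """Map the voiced source log-F0 values so that they take on the
    target speaker's mean and standard deviation; unvoiced frames
    (marked 0) are passed through unchanged."""
    voiced = src_logf0 > 0
    mu, sd = src_logf0[voiced].mean(), src_logf0[voiced].std()
    out = src_logf0.copy()
    out[voiced] = tgt_mean + (src_logf0[voiced] - mu) * (tgt_std / sd)
    return out
```

This is the standard linear Gaussian normalization of the pitch contour: the shape of the source contour is preserved while its level and range match the target speaker's statistics.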
IberSPEECH 2014 Proceedings of the Second International Conference on Advances in Speech and Language Technologies for Iberian Languages - Volume 8854 | 2014
Inma Hernaez; Ibon Saratxaga; Jianpei Ye; Jon Sanchez; Daniel Erro; Eva Navas
This paper presents a new speech watermarking technique using harmonic modelling of the speech signal and coding of the harmonic phase. We use a representation of the instantaneous harmonic phase which allows straightforward manipulation of its values to embed the digital watermark. The technique converts each harmonic into a communication channel, whose performance is analysed in terms of distortion and bit error rate (BER). The tests carried out show that with a simple coding scheme a bit rate of 300 bps can be achieved with minimal perceptual distortion and almost zero BER.
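The core idea, one communication channel per harmonic via controlled phase offsets, can be sketched with a toy one-bit-per-harmonic scheme; the offset constant and the coding rule are illustrative assumptions, far simpler than the paper's actual coding:

```python
import numpy as np

def embed_bits(harmonic_phases, bits, delta=np.pi / 2):
    """Embed one bit per harmonic: bit 0 leaves the phase untouched,
    bit 1 adds a fixed offset `delta`. Result wrapped to (-pi, pi]."""
    marked = harmonic_phases + delta * np.asarray(bits, dtype=float)
    return np.angle(np.exp(1j * marked))

def extract_bits(marked, reference, delta=np.pi / 2):
    """Recover the bits given the unmarked reference phases: decide 1
    when the wrapped phase difference is closer to delta than to 0."""
    diff = np.angle(np.exp(1j * (marked - reference)))
    return (np.abs(diff - delta) < np.abs(diff)).astype(int)
```

Because the decision compares wrapped phase distances, small phase noise from analysis or transmission shifts the difference without flipping the bit, which is what keeps the BER near zero for modest `delta`.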