Publication


Featured research published by Doroteo Torre Toledano.


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Emulating DNA: Rigorous Quantification of Evidential Weight in Transparent and Testable Forensic Speaker Recognition

Joaquin Gonzalez-Rodriguez; Philip Rose; Daniel Ramos; Doroteo Torre Toledano; Javier Ortega-Garcia

Forensic DNA profiling is acknowledged as the model for a scientifically defensible approach in forensic identification science, as it meets the most stringent court admissibility requirements demanding transparency in scientific evaluation of evidence and testability of systems and protocols. In this paper, we propose a unified approach to forensic speaker recognition (FSR) oriented to fulfil these admissibility requirements within a framework which is transparent, testable, and understandable, both for scientists and fact-finders. We show how the evaluation of DNA evidence, which is based on a probabilistic similarity-typicality metric in the form of likelihood ratios (LR), can also be generalized to continuous LR estimation, thus providing a common framework for phonetic-linguistic methods and automatic systems. We highlight the importance of calibration, and we exemplify with LRs from diphthongal F-pattern, and LRs in NIST-SRE06 tasks. The application of the proposed approach in daily casework remains a sensitive issue, and special caution is enjoined. Our objective is to show how traditional and automatic FSR methodologies can be transparent and testable, but simultaneously remain conscious of the present limitations. We conclude with a discussion on the combined use of traditional and automatic approaches and current challenges for the admissibility of speech evidence.
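
The calibration step highlighted above can be illustrated with a small, hedged sketch: the paper does not prescribe this exact recipe, but linear logistic-regression calibration is a standard way to map raw speaker-comparison scores to likelihood ratios. The score distributions, trial counts, and the calibrated_log_lr helper below are hypothetical.

```python
# Minimal sketch (not the paper's system): turning raw speaker-comparison
# scores into calibrated log-likelihood ratios with logistic regression,
# in the spirit of the LR framework the paper advocates.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical development scores: higher means "more similar".
same_speaker_scores = rng.normal(2.0, 1.0, 500)      # same-speaker trials
diff_speaker_scores = rng.normal(-1.0, 1.0, 2000)    # different-speaker trials

scores = np.concatenate([same_speaker_scores, diff_speaker_scores]).reshape(-1, 1)
labels = np.concatenate([np.ones(500), np.zeros(2000)])

# Logistic-regression calibration: the fitted log-odds are an affine function
# of the score, so subtracting the training prior log-odds yields a log LR.
cal = LogisticRegression().fit(scores, labels)
prior_log_odds = np.log(500 / 2000)

def calibrated_log_lr(score: float) -> float:
    """Map a raw score to a calibrated natural-log likelihood ratio."""
    log_posterior_odds = cal.decision_function([[score]])[0]
    return log_posterior_odds - prior_log_odds

print(calibrated_log_lr(1.5))   # evidence supporting the same-speaker hypothesis
print(calibrated_log_lr(-2.0))  # evidence supporting the different-speaker hypothesis
```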


Pattern Recognition | 2007

Rapid and brief communication: Biosec baseline corpus: A multimodal biometric database

Julian Fierrez; Javier Ortega-Garcia; Doroteo Torre Toledano; Joaquin Gonzalez-Rodriguez

The baseline corpus of a new multimodal database, acquired in the framework of the FP6 EU BioSec Integrated Project, is presented. The corpus consists of fingerprint images acquired with three different sensors, frontal face images from a webcam, iris images from an iris sensor, and voice utterances acquired both with a close-talk headset and a distant webcam microphone. The BioSec baseline corpus includes real multimodal data from 200 individuals in two acquisition sessions. In this contribution, the acquisition setup and protocol are outlined, and the contents of the corpus, including data and population statistics, are described. The database will be publicly available for research purposes by mid-2006.


Pattern Analysis and Applications | 2010

BiosecurID: a multimodal biometric database

Julian Fierrez; Javier Galbally; Javier Ortega-Garcia; Manuel Freire; Fernando Alonso-Fernandez; Daniel Ramos; Doroteo Torre Toledano; Joaquin Gonzalez-Rodriguez; Juan A. Sigüenza; J. Garrido-Salas; E. Anguiano; Guillermo González-de-Rivera; R. Ribalda; Marcos Faundez-Zanuy; Juan Antonio Ortega; Valentín Cardeñoso-Payo; A. Viloria; Carlos Vivaracho; Q.-I. Moro; J. J. Igarza; J. Sanchez; I. Hernaez; C. Orrite-Uruñuela; F. Martinez-Contreras; J. J. Gracia-Roche

A new multimodal biometric database, acquired in the framework of the BiosecurID project, is presented together with the description of the acquisition setup and protocol. The database includes eight unimodal biometric traits, namely: speech, iris, face (still images, videos of talking faces), handwritten signature and handwritten text (on-line dynamic signals, off-line scanned images), fingerprints (acquired with two different sensors), hand (palmprint, contour-geometry) and keystroking. The database comprises 400 subjects and presents features such as: realistic acquisition scenario, balanced gender and population distributions, availability of information about particular demographic groups (age, gender, handedness), acquisition of replay attacks for speech and keystroking, skilled forgeries for signatures, and compatibility with other existing databases. All these characteristics make it very useful in research and development of unimodal and multimodal biometric systems.


Interacting with Computers | 2006

Usability evaluation of multi-modal biometric verification systems

Doroteo Torre Toledano; Rubén Fernández Pozo; Álvaro Hernández Trapote; Luis A. Hernández Gómez

As a result of the evolution in the field of biometrics, a new breed of techniques and methods for user identity recognition and verification has appeared based on the recognition and verification of several biometric features considered unique to each individual. Signature and voice characteristics, facial features, and iris and fingerprint patterns have all been used to identify a person or simply to verify that the person is who he/she claims to be. Although still relatively new, these technologies have already reached a level of development that allows their commercialization. However, there is a lack of studies devoted to the evaluation of these technologies from a user-centered perspective. This paper is intended to promote user-centered design and evaluation of biometric technologies. Towards this end, we have developed a platform to perform empirical evaluations of commercial biometric identity verification systems, including fingerprint, voice and signature verification. In this article, we present an initial empirical study in which we evaluate and compare these systems and try to gain insight into the factors that are crucial for their usability.


PLOS ONE | 2016

Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks

Ruben Zazo; Alicia Lozano-Diez; Javier Gonzalez-Dominguez; Doroteo Torre Toledano; Joaquin Gonzalez-Rodriguez

Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vectors and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (∼3s). In this contribution we present an open-source, end-to-end LSTM RNN system running on limited computational resources (a single GPU) that outperforms a reference i-vector system on a subset of the NIST Language Recognition Evaluation (8 target languages, 3s task) by up to 26%. This result is in line with previously published research using proprietary LSTM implementations and huge computational resources, which made those earlier results hard to reproduce. Further, we extend those previous experiments to the modeling of unseen languages (out-of-set, OOS, modeling), which is crucial in real applications. Results show that an LSTM RNN with OOS modeling is able to detect these languages and generalizes robustly to unseen OOS languages. Finally, we also analyze the effect of even more limited test data (from 2.25s down to 0.1s), showing that with as little as 0.5s an accuracy of over 50% can be achieved.
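
As a rough illustration of the end-to-end recipe described above (not the authors' released system), the sketch below assumes 20-dimensional frame-level features, 8 target languages, and one extra out-of-set class; all dimensions and the class layout are hypothetical.

```python
# Minimal sketch of an end-to-end LSTM language identifier: the LSTM reads the
# frame sequence and the final hidden state is classified into a language,
# with one additional class reserved for out-of-set (OOS) languages.
import torch
import torch.nn as nn

class LSTMLanguageID(nn.Module):
    def __init__(self, feat_dim=20, hidden=512, n_languages=8, model_oos=True):
        super().__init__()
        n_classes = n_languages + (1 if model_oos else 0)  # extra OOS class
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, frames):                # frames: (batch, time, feat_dim)
        _, (h_last, _) = self.lstm(frames)
        return self.classifier(h_last[-1])    # logits: (batch, n_classes)

# Hypothetical 3-second utterances: 300 frames at 10 ms per frame.
model = LSTMLanguageID()
logits = model(torch.randn(4, 300, 20))
predicted = logits.argmax(dim=1)              # language index (8 == out-of-set)
```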


EURASIP Journal on Advances in Signal Processing | 2009

Assessment of severe apnoea through voice analysis, automatic speech, and speaker recognition techniques

Rubén Fernández Pozo; José Luis Blanco Murillo; Luis A. Hernández Gómez; Eduardo López Gonzalo; José Alcázar Ramírez; Doroteo Torre Toledano

This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
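
A minimal sketch of the GMM classification idea, under stated assumptions: hypothetical MFCC-like frame matrices stand in for the real speech database, and scikit-learn's GaussianMixture replaces whatever training recipe the study actually used.

```python
# Sketch of GMM-based detection: fit one GMM on speech features from control
# speakers and one on features from severe-apnoea speakers, then classify a
# test utterance by comparing average per-frame log-likelihoods.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_gmm(frames, n_components=16):
    """frames: (n_frames, n_features) matrix of hypothetical MFCC-like features."""
    return GaussianMixture(n_components=n_components,
                           covariance_type='diag',
                           random_state=0).fit(frames)

def classify(test_frames, gmm_control, gmm_apnoea):
    """Return 'apnoea' if the apnoea GMM explains the frames better."""
    ll_control = gmm_control.score(test_frames)  # mean log-likelihood per frame
    ll_apnoea = gmm_apnoea.score(test_frames)
    return 'apnoea' if ll_apnoea > ll_control else 'control'

rng = np.random.default_rng(0)
gmm_control = train_gmm(rng.normal(0.0, 1.0, (5000, 13)))
gmm_apnoea = train_gmm(rng.normal(0.5, 1.2, (5000, 13)))
print(classify(rng.normal(0.5, 1.2, (300, 13)), gmm_control, gmm_apnoea))
```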


IEEE Transactions on Audio, Speech, and Language Processing | 2006

Initialization, Training, and Context-Dependency in HMM-Based Formant Tracking

Doroteo Torre Toledano; Jesus Gomez Villardebo; Luis A. Hernández Gómez

This paper presents an algorithm for formant tracking using HMMs and analyzes the influence of HMM initialization, training and context-dependency on the accuracy of the formant tracks obtained with the HMMs. Formant trackers usually include two different phases: one in which the speech is analyzed and formant candidates are obtained, and another in which, by imposing different constraints, the most likely formants are chosen. While the first stage usually relies on standard spectrum estimation techniques, the second stage has evolved notably in recent years. Traditionally, the second phase imposes continuity constraints on the formant selection process. Lately there has been ongoing research to include phonemic knowledge in the second stage to make formant tracking more reliable. In order to incorporate phonemic knowledge, newer approaches make use of the orthographic transcription of the speech utterance. From the orthographic transcription, the phonemic transcription is obtained, and from this and the speech itself a phonemic segmentation can be derived. This phonemic segmentation, along with the phonemic transcription and some knowledge of the nominal formant positions for the different phonemes, provides extra information that can be used to obtain more accurate formant tracks. This paper presents a complete HMM-based, data-driven algorithm for formant tracking suitable for combining different levels of acoustic and phonemic information. A detailed analysis of the performance of this algorithm is discussed for: different initialization strategies using different levels of knowledge, different degrees of training, and context-independent and context-dependent HMMs. Experimental speaker-dependent results show that the efficient use of phonemic information in HMM training and context-dependent modeling significantly reduces the formant tracking error rate, especially for formants
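
To make the "continuity constraints" of the classical second stage concrete, here is a hedged sketch of Viterbi-style candidate selection for a single formant; the nominal target frequency and jump weight are hypothetical, and the paper's HMM-based tracker goes further by bringing in phonemic knowledge rather than a fixed nominal target.

```python
# Sketch of the classical second stage: given per-frame formant candidates
# (e.g. LPC roots), pick one candidate per frame by dynamic programming that
# trades closeness to a nominal frequency against frame-to-frame continuity.
import numpy as np

def track_formant(candidates, nominal_hz=500.0, jump_weight=2.0):
    """candidates: list of 1-D arrays, one array of candidate frequencies per frame."""
    n_frames = len(candidates)
    cost = [np.abs(c - nominal_hz) for c in candidates]       # local "typicality" cost
    back = [np.zeros(len(c), dtype=int) for c in candidates]
    for t in range(1, n_frames):
        # transition cost penalises large frequency jumps between frames
        jumps = np.abs(candidates[t][:, None] - candidates[t - 1][None, :])
        total = cost[t][:, None] + jump_weight * jumps + cost[t - 1][None, :]
        best_prev = total.argmin(axis=1)
        back[t] = best_prev
        cost[t] = total[np.arange(len(candidates[t])), best_prev]
    # backtrack the lowest-cost path
    path = [int(np.argmin(cost[-1]))]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    path.reverse()
    return np.array([candidates[t][i] for t, i in enumerate(path)])

# Hypothetical candidates for four frames of a vowel segment.
frames = [np.array([480., 900.]), np.array([510., 1200.]),
          np.array([495., 870.]), np.array([520., 940.])]
print(track_formant(frames))   # follows the ~500 Hz candidate path
```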


Computer Speech & Language | 2014

Analysis of voice features related to obstructive sleep apnoea and their application in diagnosis support

Ana Montero Benavides; Rubén Fernández Pozo; Doroteo Torre Toledano; José Luis Blanco Murillo; Eduardo López Gonzalo; Luis A. Hernández Gómez



international conference on acoustics, speech, and signal processing | 2005

MFCC compensation for improved recognition of filtered and bandlimited speech

Nicolás Morales; John H. L. Hansen; Doroteo Torre Toledano



Eurasip Journal on Audio, Speech, and Music Processing | 2013

Query-by-Example Spoken Term Detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion

Javier Tejedor; Doroteo Torre Toledano; Xavier Anguera; Amparo Varona; Lluís F. Hurtado; Antonio Miguel; José Colás


Collaboration


Dive into Doroteo Torre Toledano's collaborations.

Top Co-Authors

Rubén Fernández Pozo (Technical University of Madrid)
Daniel Ramos (Autonomous University of Madrid)
José Colás (Autonomous University of Madrid)
Antonio Moreno Sandoval (Autonomous University of Madrid)
Javier Ortega-Garcia (Autonomous University of Madrid)
John H. L. Hansen (University of Texas at Dallas)