Anastasios Tsopanoglou
University of Patras
Publication
Featured research published by Anastasios Tsopanoglou.
Speech Communication | 1993
Nikos Fakotakis; Anastasios Tsopanoglou; George K. Kokkinakis
An automatic text-independent speaker recognition system suitable for identification and verification purposes is presented. The system is based on spotting the vowels of the test utterance, extracting parameter vectors and classifying them against a speaker-dependent reference database. This database consists of L prototypes for every speaker, representing the vowels of the language, which are estimated from L vowel clusters. These are formed by applying a modified k-means algorithm to the patterns extracted from the vowels of training utterances. The patterns of the training utterances are stored in a training database to be used for updating the reference data of the system. The system was tested over a period of four months with a population of 15 male and female speakers with non-correlated training and test data. Its accuracy proved to be satisfactory (91.39% for verification, 90.19% for closed-set identification, 95.28% for open-set identification), considering that the training utterances per speaker do not exceed 50 sec and the test utterances have an average duration of 1.3 sec. The accuracy increases substantially with the length of the test utterance (e.g. 93.75% verification accuracy for test utterances with an average duration of 4 sec). Additional advantages of the system are the small memory requirements and the fast response.
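The prototype-based scheme described in this abstract can be illustrated with a minimal sketch. It assumes plain Lloyd's k-means (not the paper's modified variant), toy 2-D vectors in place of real vowel parameter vectors, and hypothetical names (`kmeans`, `identify`, speakers "A" and "B"); none of these come from the paper.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; a stand-in for the paper's modified variant."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each pattern to its nearest centroid.
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # Recompute centroids; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids

def identify(test_vectors, prototypes):
    """Closed-set identification: pick the speaker whose prototypes
    give the lowest mean nearest-prototype distance."""
    def score(protos):
        return sum(min(math.dist(v, p) for p in protos) for v in test_vectors) / len(test_vectors)
    return min(prototypes, key=lambda speaker: score(prototypes[speaker]))

# Toy training patterns (hypothetical): two "vowel" clusters per speaker.
train = {
    "A": [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.8)],
    "B": [(10.0, 0.0), (10.2, 0.3), (15.0, 5.0), (14.9, 5.2)],
}
prototypes = {speaker: kmeans(vectors, k=2) for speaker, vectors in train.items()}
result_a = identify([(0.1, 0.2), (5.1, 4.9)], prototypes)
result_b = identify([(10.1, 0.1), (15.0, 5.1)], prototypes)
```

Verification could reuse the same score by comparing it with a per-speaker threshold; that threshold is likewise an assumption of this sketch.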
international conference on digital signal processing | 2002
Todor Ganchev; Anastasios Tsopanoglou; Nikos Fakotakis; George K. Kokkinakis
We study the applicability of probabilistic neural networks (PNNs) as core classifiers to medium-scale speaker recognition over fixed telephone networks. In particular, banking applications with up to 400 enrolled speakers and short training times are targeted. Two PNN-based open-set text-independent systems, for speaker identification and speaker verification, respectively, are presented. The performance of these systems is studied with and without the use of a supporting Gaussian mixture models classifier. Results from experiments carried out on the Polycost and SpeechDat(II)-Greek corpora, with training times as short as 43 seconds, are reported.
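As a rough illustration of how a PNN can act as an open-set classifier, the sketch below scores each enrolled speaker with a Gaussian Parzen-window estimate and rejects inputs whose best score falls below a threshold. The smoothing parameter `sigma`, the `threshold` value, the toy 2-D features and the speaker names are all assumptions for illustration, not values from the paper.

```python
import math

def pnn_scores(x, training, sigma=1.0):
    """Per-class score: mean Gaussian kernel between x and the class's training patterns."""
    scores = {}
    for label, patterns in training.items():
        total = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * sigma ** 2))
            for p in patterns
        )
        scores[label] = total / len(patterns)
    return scores

def classify_open_set(x, training, sigma=1.0, threshold=0.1):
    """Return the best-scoring speaker, or None if no speaker scores above the threshold."""
    scores = pnn_scores(x, training, sigma)
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

# Hypothetical enrolled speakers with toy feature vectors.
training = {
    "alice": [(0.0, 0.0), (1.0, 0.0)],
    "bob": [(5.0, 5.0), (6.0, 5.0)],
}
known = classify_open_set((0.5, 0.0), training)
impostor = classify_open_set((20.0, 20.0), training)
```

Here `known` resolves to an enrolled speaker, while the distant `impostor` vector is rejected, which is the open-set behaviour the abstract refers to.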
International Journal of Speech Technology | 2003
Kallirroi Georgila; Kyriakos N. Sgarbas; Anastasios Tsopanoglou; Nikos Fakotakis; George K. Kokkinakis
The automation of Directory Assistance Services (DAS) through speech is one of the most difficult and demanding applications of human-computer interaction because it deals with very large vocabulary recognition issues. In this paper, we present a spoken dialogue system for automating DAS. Taking into account the major difficulties of this endeavor, a stepwise approach was adopted. In particular, two prototypes, D1.1 (basic approach) and D1.2 (improved version), were developed successively. The results of the D1.1 evaluation were used to refine D1.1 and gradually led to D1.2, which was also improved using a feedback approach. Furthermore, the system was extended and optimized so that it can be utilized in real-world conditions. We describe the general architecture and the three stages of the system's development in detail. Evaluation results concerning both the speech recognizer's accuracy and the overall system's performance are provided for all prototypes. Finally, we focus on techniques that handle large vocabulary recognition issues. The use of Directed Acyclic Word Graphs (DAWGs) and context-dependent phonological rules resulted in search space reduction and therefore in faster response, and also in improved accuracy.
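To show why a DAWG reduces the search space relative to a plain trie, the following sketch builds a trie and then merges equivalent subtrees bottom-up. It is a simple minimization under stated assumptions (a tiny hypothetical word list, hypothetical helper names), not the construction used in the paper.

```python
class Node:
    def __init__(self):
        self.edges = {}     # character -> child Node
        self.final = False  # True if a word ends at this node

def build_trie(words):
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    return root

def minimize(node, registry):
    """Merge equivalent subtrees bottom-up and return the canonical node."""
    for ch, child in list(node.edges.items()):
        node.edges[ch] = minimize(child, registry)
    # Two nodes are equivalent if they agree on finality and on canonical children.
    key = (node.final, tuple(sorted((ch, id(c)) for ch, c in node.edges.items())))
    return registry.setdefault(key, node)

def contains(root, word):
    node = root
    for ch in word:
        node = node.edges.get(ch)
        if node is None:
            return False
    return node.final

def count_nodes(root):
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)

words = ["tap", "taps", "top", "tops"]  # hypothetical vocabulary
trie = build_trie(words)
trie_size = count_nodes(trie)   # counted before minimization mutates the trie
dawg = minimize(trie, {})
dawg_size = count_nodes(dawg)
```

For this word list the shared suffixes collapse, so the graph has fewer nodes than the trie while accepting exactly the same words.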
international conference on acoustics, speech, and signal processing | 1991
John Mourjopoulos; Anastasios Tsopanoglou; Nikos Fakotakis
A vector quantization (VQ) technique for room acoustics applications is described. Such an approach allows classification of all possible room transfer functions (RTFs) corresponding to different source/receiver positions into discrete and representative frequency response patterns. Results are presented for simulated tests in three rectangular enclosures, for which the optimum number of such patterns (centroids) is estimated. The performance of the proposed method is also assessed by deconvolution, which indicates the degree of similarity between all classified RTFs and their corresponding centroids. The VQ technique achieved the required classification in all tested cases. Typically, 256 centroids were required for a medium-sized room, resulting in an approximate reduction of 20:1 over all possible RTF measurements inside such an enclosure. The VQ performance was better for central source positions than for the corner positions. In all cases RTF smoothing could be observed after deconvolution by the inverse of the centroid's spectrum, indicating that the VQ technique is suitable for dereverberation applications.
Speech Communication | 1994
Anastasios Tsopanoglou; John Mourjopoulos; George K. Kokkinakis
An improved, phoneme-based IWSR system is described, which employs a robust reference data extraction procedure and achieves increased recognition accuracy. Furthermore, a novel method for the adaptation of the IWSR system to continuous speech is presented. The IWSR system employs a multisection codebook design technique and the LVQ algorithm, which provide well-defined and accurate codebooks, minimize the influence of the within-word coarticulation and allow the use of time-sequence information at the recognition stage. The adaptation method is based on modifications of the system's reference data codebook using a small amount of representative continuous speech data and on linear transformations of the main prosodic parameters (i.e. energy and duration). Extensive testing under different conditions (speaker dependent versus speaker independent reference data, single versus multisection codebooks, adapted versus unadapted codebooks, phoneme versus word recognition accuracy, etc.) has shown the efficiency of the proposed methods.
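The LVQ algorithm mentioned in this abstract refines a labelled codebook by pulling the nearest code vector toward a training sample of the same class and pushing it away otherwise. The sketch below shows the basic LVQ1 update only; the learning rate, epoch count, toy 2-D data and phoneme labels are assumptions, and the paper's multisection codebook design is not reproduced.

```python
import math

def lvq1_train(samples, codebook, alpha=0.1, epochs=10):
    """LVQ1: samples is a list of (vector, label); codebook is a list of
    [vector, label] entries that are updated in place."""
    for _ in range(epochs):
        for x, label in samples:
            entry = min(codebook, key=lambda e: math.dist(x, e[0]))
            # Attract the nearest code vector on a label match, repel it otherwise.
            sign = 1.0 if entry[1] == label else -1.0
            entry[0] = [w + sign * alpha * (xi - w) for w, xi in zip(entry[0], x)]
    return codebook

def lvq_classify(x, codebook):
    """Label of the nearest code vector."""
    return min(codebook, key=lambda e: math.dist(x, e[0]))[1]

# Hypothetical feature vectors for two phoneme classes.
samples = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((4.0, 4.0), "e"), ((5.0, 4.0), "e")]
codebook = [[[1.0, 1.0], "a"], [[3.0, 3.0], "e"]]
lvq1_train(samples, codebook)
label_a = lvq_classify((0.5, 0.2), codebook)
label_e = lvq_classify((4.5, 4.0), codebook)
```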
international conference on speech and computer | 2017
Otilia Kocsis; Basilis Kladis; Anastasios Tsopanoglou; Nikos Fakotakis
Assessment of usability and user experience of spoken dialog services is a rather complex task, which remains difficult to achieve with real end-users. In this work a three-fold evaluation approach is introduced, which supports reliable assessment of usability and user experience. The approach combines interaction log data based assessment (at dialog, task and node level) with an optimized questionnaire-based end-user evaluation and a controlled stress test performed by an IVR system. The three-fold evaluation approach was used for the assessment of usability and user experience of the pilot deployment of a voice banking system. The proposed assessment approach provides sufficient evidence for informed business decision-making with respect to perceived user quality of the interaction and offered services and allows for investigation of potential improvement areas.
Proceedings 1998 IEEE 4th Workshop Interactive Voice Technology for Telecommunications Applications. IVTTA '98 (Cat. No.98TH8376) | 1998
Kallirroi Georgila; Anastasios Tsopanoglou; Nikos Fakotakis; G. Kokkinakis
Digital Processing of Signals in Communications, 1991., Sixth International Conference on | 1991
Nikos Fakotakis; Anastasios Tsopanoglou; G. Kokkinakis
conference of the international speech communication association | 1997
Nikos Fakotakis; Kallirroi Georgila; Anastasios Tsopanoglou
language resources and evaluation | 2004
Dorota J. Iskra; Rainer Siemund; Jamal Borno; Asunción Moreno; Ossama Emam; Khalid Choukri; Oren Gedge; H.S. Tropf; Albino Nogueiras; Imed Zitouni; Anastasios Tsopanoglou; Nikos Fakotakis