Publication


Featured research published by Panagiotis Zervas.


international conference on tools with artificial intelligence | 2007

Segmental Duration Modeling for Greek Speech Synthesis

Alexandros Lazaridis; Panagiotis Zervas; G. Kokkinakis

In this paper we address the task of modeling phoneme duration for Greek speech synthesis. In particular, we apply well-established machine learning approaches to the WCL-1 prosodic database to predict segmental durations from shallow morphosyntactic and prosodic features. We employ decision trees, instance-based learning and linear regression. Trained on a 5500-word database, the CART and linear regression models proved to be the most effective for the task, with root mean square errors of 0.0252 and 0.0251, respectively.


Journal of Quantitative Linguistics | 2008

Development and evaluation of a prosodic database for Greek speech synthesis and research

Panagiotis Zervas; Nikos Fakotakis; G. Kokkinakis

In this article the definition, construction and statistical evaluation of a prosodic database of Greek speech are presented. The main motivation for the development of such a database was its use as a research tool for Text-to-Speech synthesis and the study of prosody in general. Beginning with the task of text selection, we arrived at a final set containing sentences with almost 95% of all Greek syllables, extracted from a widely used Greek dictionary. Then, a professional radio actress was instructed to utter these sentences in reading style and at reading rate; this was recorded at 44 kHz/16 bits, in the anechoic chamber of a professional studio. The intonational phenomena were transcribed on the corresponding speech signals by a trained phonetician using the ToBI annotation model adapted to Greek prosodic patterns. The speech data were segmented to the phoneme level employing a phoneme recognizer based on the HTK platform. All files were aligned so that possible relations among text, intonational and durational labelling could be identified. For database management, the EMU speech database system was utilized. Extensive measurements of numerous annotated events, presented in histograms and tables, provide detailed information on the database. Finally, we evaluate prediction models of prosodic phrase breaks and pitch accents derived from our database. The performance of these models was also compared to that of models derived under the same experimental conditions from a limited-domain corpus of Greek speech.


hellenic conference on artificial intelligence | 2006

Employing Fujisaki's intonation model parameters for emotion recognition

Panagiotis Zervas; Iosif Mporas; Nikos Fakotakis; George K. Kokkinakis

In this paper we introduce the use of features extracted from Fujisaki's parameterization of the pitch contour for the task of emotion recognition from speech. To evaluate the proposed features, we trained a decision tree inducer as well as an instance-based learning algorithm. The datasets used for training the classification models were extracted from two emotional speech databases. Fujisaki's parameters benefited all prediction models, with an average increase of 9.52% in total accuracy.


text speech and dialogue | 2003

A Data-Driven Framework for Intonational Phrase Break Prediction

Manolis Maragoudakis; Panagiotis Zervas; Nikos Fakotakis; George K. Kokkinakis

In the present work, we study the automatic acquisition of intonational phrase breaks. A mathematically well-formed framework is suggested, which is based on Bayesian theory. Based on two different assumptions regarding the conditional independence of the input attributes, we have come up with two Bayesian implementations, namely the Naive Bayes and the Bayesian Networks classifiers. As a performance benchmark, we evaluated the experimental results against CART, an acclaimed algorithm in the field of intonational phrase break detection that has demonstrated state-of-the-art figures. Our approach utilizes minimal morphological and syntactic resources in a finite-length window, i.e. the POS label and the type of syntactic phrase boundary, a novel attribute that has not been applied to this task before. On a 5500-word training set, the Bayesian networks approach proved to be the most effective, with precision and recall figures in the region of 82% and 77%, respectively.


hellenic conference on artificial intelligence | 2006

Recognition of Greek phonemes using support vector machines

Iosif Mporas; Todor Ganchev; Panagiotis Zervas; Nikos Fakotakis

In the present work we study the applicability of Support Vector Machines (SVMs) to the phoneme recognition task. Specifically, the Least Squares version of the algorithm (LS-SVM) is employed to recognize Greek phonemes in the framework of a telephone-driven, voice-enabled information service. The N-best candidate phonemes are identified and subsequently fed to the speech and language recognition components. In a comparative evaluation of various classification methods, the SVM-based phoneme recognizer demonstrated superior performance. A recognition rate of 74.2% was achieved from the N-best list, for N=5, prior to applying the language model.


text speech and dialogue | 2005

Experimental evaluation of tree-based algorithms for intonational breaks representation

Panagiotis Zervas; Gerasimos Xydas; Nikolaos Fakotakis; George K. Kokkinakis; Georgios Kouroupetroglou

The prosodic specification of an utterance to be spoken by a Text-to-Speech synthesis system can be expressed in terms of break indices, pitch accents and boundary tones. In particular, the identification of break indices defines the intonational phrase breaks that affect all subsequent prosody-related procedures. In the present paper we use tree-structured predictors, specifically CART, which is commonly used in similar tasks, and the newly introduced C4.5, to address the task of break placement in the presence of shallow textual features. We utilized two 500-utterance prosodic corpora offered by two Greek universities in order to compare the machine learning approaches and to discuss the robustness they offer for Greek break modeling. The evaluation of the resulting models revealed that both approaches compared favorably with similar works published for other languages, while the accuracy of the C4.5 method was 1% to 2.7% better than that of CART.


International Journal on Artificial Intelligence Tools | 2007

EVALUATING INTONATIONAL FEATURES FOR EMOTION RECOGNITION FROM SPEECH

Panagiotis Zervas; Iosif Mporas; Nikos Fakotakis; George K. Kokkinakis

This paper presents and discusses the problem of emotion recognition from speech signals using features that carry intonational information. In particular, parameters extracted from Fujisaki's model of intonation are presented and evaluated. Machine learning models were built using the C4.5 decision tree inducer, an instance-based learner and Bayesian learning. The datasets used for training the machine learning models were extracted from two emotional databases of acted speech. Experimental results showed the effectiveness of the attributes of Fujisaki's model, since they enhanced the recognition process for most of the emotion categories and learning approaches, helping to separate the emotion categories.


text speech and dialogue | 2002

On the First Greek-TTS Based on Festival Speech Synthesis

Panagiotis Zervas; Ilyas Potamitis; Nikos Fakotakis; George K. Kokkinakis

In this article we describe the first Text-to-Speech (TTS) system for the Greek language based on the Festival architecture. We discuss practical implementation details and focus on the preparation of the diphone database and on the phoneme duration prediction module, implemented with the CART technique. Two male databases were used for two different speech synthesis engines, namely residual LPC synthesis and the MBROLA technique.


conference of the international speech communication association | 2003

Bayesian induction of intonational phrase breaks.

Panagiotis Zervas; Manolis Maragoudakis; Nikos Fakotakis; George K. Kokkinakis


conference of the international speech communication association | 2004

Evaluation of Corpus Based Tone Prediction in Mismatched Environments for Greek TtS Synthesis

Panagiotis Zervas; Nikos Fakotakis; George Kokkinakis; Georgios Kouroupetroglou; Gerasimos Xydas

Collaboration


Dive into Panagiotis Zervas's collaborations.

Top Co-Authors

Georgios Kouroupetroglou

National and Kapodistrian University of Athens

Gerasimos Xydas

National and Kapodistrian University of Athens

Ilyas Potamitis

Technological Educational Institute of Crete
