Publication


Featured research published by Dimitris Spiliotopoulos.


Text, Speech and Dialogue | 2004

Modeling Prosodic Structures in Linguistically Enriched Environments

Gerasimos Xydas; Dimitris Spiliotopoulos; Georgios Kouroupetroglou

A significant challenge in Text-to-Speech (TtS) synthesis is the formulation of the prosodic structures (phrase breaks, pitch accents, phrase accents and boundary tones) of utterances. The prediction of these elements robustly relies on the accuracy and the quality of error-prone linguistic procedures, such as the identification of the part-of-speech and the syntactic tree. Additional linguistic factors, such as rhetorical relations, improve the naturalness of the prosody, but are hard to extract from plain texts. In this work, we propose a method to generate enhanced prosodic events for TtS by utilizing accurate, error-free and high-level linguistic information. We also present an appropriate XML annotation scheme to encode syntax, grammar, new or given information, phrase subject/object information, as well as rhetorical elements. These linguistically enriched documents have been utilized to build realistic machine learning models for the prediction of the prosodic structures in terms of segmental information and ToBI marks. The methodology has been applied by exploiting a Natural Language Generator (NLG) system. The trained models have been built using classification via regression trees and the results strongly indicate the realistic effect on the generated prosody. The evaluation of this approach has been made by comparing the models produced by the enriched documents to those produced by plain text of the same domain. The results show an improved accuracy of up to 23%.


Intelligent Environments | 2014

Metalogue: A Multiperspective Multimodal Dialogue System with Metacognitive Abilities for Highly Adaptive and Flexible Dialogue Management

Jan Alexandersson; Maria Aretoulaki; Nick Campbell; Michael Gardner; Andrey Girenko; Dietrich Klakow; Dimitris Koryzis; Volha Petukhova; Marcus Specht; Dimitris Spiliotopoulos; Alexander Stricker; Niels Taatgen

This poster paper presents a high-level description of the Metalogue project, which is developing a multimodal dialogue system that is able to implement interactive behaviors that seem natural to users and is flexible enough to exploit the full potential of multimodal interaction. We provide an outline of the initial work undertaken to define an open architecture for the integrated Metalogue system. This system includes components that are necessary for the implementation of the processing stages for a variety of application domains: initialization, training, information gathering, orchestration, multimodality, dialogue management, speech recognition, speech synthesis and user modelling.


Universal Access in the Information Society | 2010

Auditory universal accessibility of data tables using naturally derived prosody specification

Dimitris Spiliotopoulos; Gerasimos Xydas; Georgios Kouroupetroglou; Vasilios Argyropoulos; Kalliopi Ikospentaki

Text documents usually embody visually oriented meta-information in the form of complex visual structures, such as tables. The semantics involved in such objects result in poor and ambiguous text-to-speech synthesis. Although most speech synthesis frameworks allow the consistent control of an abundance of parameters, such as prosodic cues, through appropriate markup, there is no actual prosodic specification to speech-enable visual elements. This paper presents a method for the acoustic specification modelling of simple and complex data tables, derived from the human paradigm. A series of psychoacoustic experiments were set up for providing speech properties obtained from prosodic analysis of natural spoken descriptions of data tables. Thirty blind and 30 sighted listeners selected the most prominent natural rendition. The derived prosodic phrase accent and pause break placement vectors were modelled using the ToBI semiotic system to successfully convey semantically important visual information through prosody control. The quality of the information provision of speech-synthesized tables when utilizing the proposed prosody specification was evaluated by first-time listeners. The results show a significant increase (from 14 to 20% depending on the table type) of the user subjective understanding (overall impression, listening effort and acceptance) of the table data semantic structure compared to the traditional linearized speech synthesis of tables. Furthermore, it is proven that successful prosody manipulation can be applied to data tables using generic specification sets for certain table types and browsing techniques, resulting in improved data comprehension.


Text, Speech and Dialogue | 2005

Diction based prosody modeling in table-to-speech synthesis

Dimitris Spiliotopoulos; Gerasimos Xydas; Georgios Kouroupetroglou

Transferring a structure from the visual modality to the aural one presents a difficult challenge. In this work we experiment with prosody modeling for the synthesized speech representation of tabulated structures. This is achieved by analyzing naturally spoken descriptions of data tables, followed by feedback from blind and sighted users. The derived prosodic phrase accent and pause break placement and values are examined in terms of successfully conveying semantically important visual information through prosody control in Table-to-Speech synthesis. Finally, the quality of the information provision of synthesized tables when utilizing the proposed prosody specification is studied against plain synthesis.


International Conference on Universal Access in Human-Computer Interaction | 2009

Acoustic Rendering of Data Tables Using Earcons and Prosody for Document Accessibility

Dimitris Spiliotopoulos; Panagiota Stavropoulou; Georgios Kouroupetroglou

Earlier works show that using a prosody specification derived from natural human spoken rendition increases the naturalness and overall acceptance of speech-synthesised complex visual structures by conveying to audio certain semantic information hidden in the visual structure. However, prosody alone, although it exhibits significant improvement, cannot perform adequately in the cases of very large complex data tables browsed in a linear manner. This work reports on the use of earcons and spearcons combined with prosodically enriched aural rendition of simple and complex tables. Three spoken combinations (earcons+prosody, spearcons+prosody, and prosody) were evaluated in order to examine how the resulting acoustic output would improve the document-to-audio semantic correlation throughput from the visual modality. The results show that the use of non-speech sounds can further improve certain qualities, such as listening effort, a crucial parameter when vocalising any complex visual structure contained in a document.


USAB '09 Proceedings of the 5th Symposium of the Workgroup Human-Computer Interaction and Usability Engineering of the Austrian Computer Society on HCI and Usability for e-Inclusion | 2009

Spoken Dialogue Interfaces: Integrating Usability

Dimitris Spiliotopoulos; Pepi Stavropoulou; Georgios Kouroupetroglou

Usability is a fundamental requirement for natural language interfaces. Usability evaluation reflects the impact of the interface and the acceptance from the users. This work examines the potential of usability evaluation in terms of issues and methodologies for spoken dialogue interfaces along with the appropriate designer-needs analysis. It unfolds the perspective to the usability integration in the spoken language interface design lifecycle and provides a framework description for creating and testing usable content and applications for conversational interfaces. Main concerns include the problem identification of design issues for usability design and evaluation, the use of customer experience for the design of voice interfaces and dialogue, and the problems that arise from real-life deployment. Moreover it presents a real-life paradigm of a hands-on approach for applying usability methodologies in a spoken dialogue application environment to compare against a DTMF approach. Finally, the scope and interpretation of results from both the designer and the user standpoint of usability evaluation are discussed.


Hellenic Conference on Artificial Intelligence | 2002

Symbolic Authoring for Multilingual Natural Language Generation

Ion Androutsopoulos; Dimitris Spiliotopoulos; Konstantinos Stamatakis; Aggeliki Dimitromanolaki; Vangelis Karkaletsis; Constantine D. Spyropoulos

We describe the symbolic authoring facilities of the M-PIRO project. M-PIRO is developing technology that allows personalized multilingual object descriptions, in both textual and spoken form, to be produced from symbolic information in a database and small fragments of text. The technology is being tested in the context of electronic museums, where a prototype that dynamically produces multilingual exhibit descriptions for presentations over the web has already been developed. This paper focuses on M-PIRO's authoring subsystem, which allows domain experts with no language technology expertise to configure the system for new applications. The authoring facilities allow the experts to define or modify the structure of the underlying database, its contents, and the system's domain-dependent linguistic resources. Previews of the generated texts can also be produced during the authoring process to monitor the content and quality of the resulting descriptions.


Future Internet | 2014

Analysing and Enriching Focused Semantic Web Archives for Parliament Applications

Elena Demidova; Nicola Barbieri; Stefan Dietze; Adam Funk; Helge Holzmann; Diana Maynard; Nikolaos Papailiou; Wim Peters; Thomas Risse; Dimitris Spiliotopoulos

The web and the social web play an increasingly important role as an information source for Members of Parliament and their assistants, journalists, political analysts and researchers. It provides important and crucial background information, like reactions to political events and comments made by the general public. The case study presented in this paper is driven by two European parliaments (the Greek and the Austrian parliament) and targets an effective exploration of political web archives. In this paper, we describe semantic technologies deployed to ease the exploration of the archived web and social web content and present evaluation results.


International Conference on Computers Helping People with Special Needs | 2012

Designing user interfaces for social media driven digital preservation and information retrieval

Dimitris Spiliotopoulos; Efstratios Tzoannos; Pepi Stavropoulou; Georgios Kouroupetroglou; Alexandros Pino

Social Media provide a vast amount of information identifying stories, events, entities that play the crucial role of shaping the community in an everyday heavy user involvement. This work involves the study of social media information in terms of type (multimodal: text, video, sound, picture) and role players (agents, users, opinion leaders) and the potential of designing accessible, usable interfaces that integrate that information. This case examines the design of a user interface that uses an underlying engine for modality components (plain text, sound, image, video) analysis, social media crawling, contextual search fusion and semantic analysis. The interface is the only point of user interaction to the world of knowledge. This work reports on the usability and accessibility methods and concerns for the user requirements phase and the design control and testing. The findings of the pilot user testing and evaluation provide indications on how the semantic analysis of the social media information can be integrated to the design methodologies for user interfaces resulting in maximization of user experience in terms of social information involvement.


International Journal of Information Technology and Web Engineering | 2009

Usability Methodologies for Real-Life Voice User Interfaces

Georgios Kouroupetroglou; Dimitris Spiliotopoulos

This paper studies the usability methodologies for spoken dialogue web interfaces along with the appropriate designer-needs analysis. The work unfolds a theoretical perspective on the methods that are extensively used and provides a framework description for creating and testing usable content and applications for conversational interfaces. The main concerns include the design issues for usability testing and evaluation during the development lifecycle, the basic customer experience metrics and the problems that arise after the deployment of real-life systems. Through the discussion of the evaluation and testing methods, this paper argues for the importance and the potential of wizard-based functional assessment and usability testing for deployed systems, presenting an appropriate environment as part of an integrated development framework.

Collaboration


Dive into Dimitris Spiliotopoulos's collaborations.

Top Co-Authors

Georgios Kouroupetroglou

National and Kapodistrian University of Athens

Gerasimos Xydas

National and Kapodistrian University of Athens

Pepi Stavropoulou

National and Kapodistrian University of Athens

Dimitrios Tsonos

National and Kapodistrian University of Athens

Alexandros Pino

National and Kapodistrian University of Athens

Ion Androutsopoulos

Athens University of Economics and Business

Nikolaos Papailiou

National Technical University of Athens
