Kostas Karpouzis
National Technical University of Athens
Publication
Featured research published by Kostas Karpouzis.
Affective Computing and Intelligent Interaction | 2007
Ellen Douglas-Cowie; Roddy Cowie; Ian Sneddon; Cate Cox; Orla Lowry; Margaret McRorie; Jean-Claude Martin; Laurence Devillers; Sarkis Abrilian; Anton Batliner; Noam Amir; Kostas Karpouzis
The HUMAINE project is concerned with developing interfaces that will register and respond to emotion, particularly pervasive emotion (forms of feeling, expression and action that colour most of human life). The HUMAINE Database provides naturalistic clips which record that kind of material, in multiple modalities, and labelling techniques that are suited to describing it.
EURASIP Journal on Advances in Signal Processing | 2002
Amaryllis Raouzaiou; Nicolas Tsapatsoulis; Kostas Karpouzis; Stefanos D. Kollias
In the framework of MPEG-4, one can include applications where virtual agents, utilizing both textual and multisensory data, including facial expressions and nonverbal speech, help systems become attuned to the actual feelings of the user. Applications of this technology are expected in educational environments, virtual collaborative workplaces, communities, and interactive entertainment. Facial animation has gained much interest within the MPEG-4 framework, with implementation details remaining an open research area (Tekalp, 1999). In this paper, we describe a method for enriching human computer interaction, focusing on the analysis and synthesis of primary and intermediate facial expressions (Ekman and Friesen, 1978). To achieve this goal, we utilize facial animation parameters (FAPs) to model primary expressions and describe a rule-based technique for handling intermediate ones. A relation between FAPs and the activation parameter proposed in classical psychological studies is established, leading to parameterized notions of facial expression analysis and synthesis that are compatible with the MPEG-4 standard.
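The idea of deriving intermediate expressions from archetypal ones via an activation parameter can be sketched as follows. This is a minimal illustration, not the paper's method: the FAP names and displacement values below are invented placeholders, and the scaling rule is a deliberately simple stand-in for the rule-based technique the abstract describes.

```python
# Illustrative sketch: MPEG-4 FAPs of an archetypal expression are scaled
# by an activation level in [0, 1] to synthesise an intermediate expression.
# FAP names and displacement values here are made up for illustration.

ARCHETYPAL_JOY = {
    "raise_l_cornerlip": 40.0,   # full-activation displacements (invented)
    "raise_r_cornerlip": 40.0,
    "open_jaw": 15.0,
}

def intermediate_expression(archetype, activation):
    """Scale each FAP of an archetypal profile by the activation parameter."""
    if not 0.0 <= activation <= 1.0:
        raise ValueError("activation must lie in [0, 1]")
    return {fap: value * activation for fap, value in archetype.items()}

# a "half-intensity" joy, i.e. an intermediate expression
half_joy = intermediate_expression(ARCHETYPAL_JOY, 0.5)
```

Under this toy rule, every FAP displacement shrinks proportionally with activation, so moderate emotion intensities produce proportionally subtler facial deformations.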
Multimedia Tools and Applications | 2009
Stylianos Asteriadis; Paraskevi K. Tzouveli; Kostas Karpouzis; Stefanos D. Kollias
Most e-learning environments which utilize user feedback or profiles collect such information through questionnaires, very often resulting in incomplete answers and sometimes in deliberately misleading input. In this work, we present a mechanism which compiles feedback related to the behavioral state of the user (e.g. level of interest) in the context of reading an electronic document. This is achieved using a non-intrusive scheme, which employs a simple web camera to detect and track head, eye and hand movements, and estimates the level of interest and engagement using a neuro-fuzzy network initialized from evidence from the Theory of Mind and trained on expert-annotated data. The user does not need to interact with the proposed system and can act as if she were not monitored at all. The proposed scheme is tested in an e-learning environment, in order to adapt the presentation of the content to the user profile and current behavioral state. Experiments show that the proposed system detects reading- and attention-related user states very effectively, in a testbed where children’s reading performance is tracked.
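A fuzzy combination of visual cues of the kind described above can be sketched like this. This is a hedged toy example, not the paper's trained neuro-fuzzy network: the membership function shape, the yaw thresholds, and the rule combining the two cues are all assumptions made for illustration.

```python
# Toy sketch of fuzzy cue fusion for interest estimation (all thresholds
# and the combination rule are illustrative assumptions, not the paper's
# trained network).

def triangular(x, a, b, c):
    """Triangular fuzzy membership function: 0 at a and c, peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def interest_level(head_yaw_deg, gaze_on_text):
    """Combine two cues: a frontal head pose and on-text gaze suggest attention."""
    frontal = triangular(head_yaw_deg, -30.0, 0.0, 30.0)
    # min acts as a fuzzy AND: the user counts as attentive only if both agree
    return min(frontal, 1.0 if gaze_on_text else 0.2)

# slightly turned head, gaze on the document
score = interest_level(head_yaw_deg=5.0, gaze_on_text=True)
```

In a real system the memberships would be tuned (here, learned from expert-annotated data) rather than fixed by hand.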
Artificial Intelligence Applications and Innovations | 2007
George Caridakis; Ginevra Castellano; Loic Kessous; Amaryllis Raouzaiou; Lori Malatesta; Stylianos Asteriadis; Kostas Karpouzis
In this paper we present a multimodal approach for the recognition of eight emotions that integrates information from facial expressions, body movement and gestures, and speech. We trained and tested a model with a Bayesian classifier, using a multimodal corpus with eight emotions and ten subjects. First, individual classifiers were trained for each modality. Then data were fused at the feature level and at the decision level. Fusing multimodal data substantially increased the recognition rates in comparison with the unimodal systems: the multimodal approach gave an improvement of more than 10% with respect to the most successful unimodal system. Furthermore, fusion performed at the feature level showed better results than fusion performed at the decision level.
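The structural difference between the two fusion levels can be sketched as follows. This is an illustrative toy, not the paper's classifiers: the feature values and posterior probabilities are invented, and the product rule stands in for whichever combination rule the actual system used.

```python
# Toy sketch of the two fusion schemes (all numbers invented):
# feature-level fusion concatenates modality feature vectors before a single
# classifier; decision-level fusion combines each modality's class posteriors.

from math import prod

face_feats = [0.1, 0.4]
body_feats = [0.7]
speech_feats = [0.2, 0.9, 0.3]

# Feature-level fusion: one joint vector fed to a single classifier
joint_vector = face_feats + body_feats + speech_feats

# Decision-level fusion: each unimodal classifier emits class posteriors;
# here they are combined with a naive product rule and renormalised.
posteriors = {
    "face":   {"joy": 0.6, "anger": 0.4},
    "body":   {"joy": 0.7, "anger": 0.3},
    "speech": {"joy": 0.5, "anger": 0.5},
}

def fuse_decisions(posteriors):
    classes = next(iter(posteriors.values())).keys()
    scores = {c: prod(p[c] for p in posteriors.values()) for c in classes}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

fused = fuse_decisions(posteriors)
```

Feature-level fusion lets one classifier learn cross-modal correlations from the joint vector, which is consistent with the abstract's finding that it outperformed decision-level fusion here.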
Pattern Recognition Letters | 2010
George Caridakis; Kostas Karpouzis; Athanasios I. Drosopoulos; Stefanos D. Kollias
The present work introduces a probabilistic recognition scheme for hand gestures. Self-organizing feature maps are used to model spatiotemporal information extracted through image processing. Two models are built for each gesture category and, along with appropriate distance metrics, produce a validated classification mechanism that performs consistently in experiments on video sequences of acted gestures. The main focus of this work is to tackle intra- and inter-user variability in gesture performance by adding flexibility to the decoding procedure and allowing the algorithm to perform an optimal trajectory search, while the processing speed of both the feature extraction and the recognition process indicates that the proposed architecture is appropriate for real-time and large-scale lexicon applications.
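The classification step — comparing an observed trajectory against per-category models with a distance metric — can be sketched in a heavily simplified form. This is not the paper's SOM-based scheme: the prototypes below are hand-written 2-D trajectories and the metric is a plain point-wise Euclidean sum, standing in for the learned maps and the optimal trajectory search.

```python
# Minimal sketch (not the paper's SOM models): each gesture class is
# represented by one prototype trajectory of 2-D hand positions, and a new
# trajectory is assigned to the class at the smallest summed distance.
# Prototype coordinates are invented for illustration.

from math import dist

PROTOTYPES = {
    "wave":  [(0, 0), (1, 1), (0, 2), (1, 3)],
    "point": [(0, 0), (1, 0), (2, 0), (3, 0)],
}

def trajectory_distance(traj, proto):
    """Sum of point-wise Euclidean distances (trajectories of equal length)."""
    return sum(dist(p, q) for p, q in zip(traj, proto))

def classify(traj):
    return min(PROTOTYPES, key=lambda g: trajectory_distance(traj, PROTOTYPES[g]))

label = classify([(0, 0), (1, 0), (2, 0), (3, 1)])
```

The flexible decoding the abstract mentions would replace the rigid point-by-point pairing with an alignment search, which is precisely what absorbs intra- and inter-user timing variability.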
EURASIP Journal on Image and Video Processing | 2007
Spiros Ioannou; George Caridakis; Kostas Karpouzis; Stefanos D. Kollias
This paper presents a robust and adaptable facial feature extraction system used for facial expression recognition in human-computer interaction (HCI) environments. Such environments are usually uncontrolled in terms of lighting and color quality, as well as human expressivity and movement; as a result, a single feature extraction technique may fail in some parts of a video sequence while performing well in others. The proposed system is based on a multicue feature extraction and fusion technique, which provides MPEG-4-compatible features accompanied by a confidence measure. This confidence measure is used to pinpoint cases where the detection of individual features may be wrong and to reduce their contribution to the training phase or their weight in deducing the observed facial expression, while the fusion process ensures that the final feature estimate is based on the extraction technique that performed best under the particular lighting or color conditions. Real data and results are presented, involving both extreme and intermediate expression/emotional states, obtained within the sensitive artificial listener HCI environment that was generated in the framework of related European projects.
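The confidence-guided fusion idea can be sketched with a simple weighted average. This is an assumption-laden illustration, not the paper's fusion process: the cue names and numbers are invented, and a confidence-weighted mean stands in for whatever combination scheme the system actually applies.

```python
# Hedged sketch of confidence-weighted cue fusion: several extraction
# techniques estimate the same facial feature, each with a confidence in
# [0, 1]; the fused value down-weights low-confidence cues.
# All values below are invented for illustration.

def fuse_feature(estimates):
    """estimates: list of (value, confidence) pairs from different cues."""
    total_conf = sum(conf for _, conf in estimates)
    if total_conf == 0:
        raise ValueError("no cue produced a confident estimate")
    return sum(value * conf for value, conf in estimates) / total_conf

# e.g. edge-based, colour-based, and template-based estimates of one
# feature point coordinate, with their confidences
fused = fuse_feature([(10.0, 0.9), (14.0, 0.3), (11.0, 0.6)])
```

A cue that fails under the current lighting would report low confidence and contribute little, which mirrors the adaptivity the abstract describes.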
Computers in Education | 2007
Kostas Karpouzis; George Caridakis; Stavroula-Evita Fotinea; Eleni Efthimiou
In this paper, we present how the creation and dynamic synthesis of linguistic resources of Greek Sign Language (GSL) may support development and provide content for an educational multitask platform for the teaching of GSL in early elementary school classes. The presented system utilizes standard virtual character (VC) animation technologies for the synthesis of sign sequences/streams, exploiting digital linguistic resources of both the lexicon and the grammar of GSL. Input to the system is written Greek text, which is transformed into GSL and animated on screen. To achieve this, a syntactic parser decodes the structural patterns of written Greek and matches them to equivalent patterns of GSL, which are then signed by a VC. The notation system adopted for the representation of GSL phonology, incorporated in the system's lexical knowledge database, is the Hamburg Notation System (HamNoSys). For the implementation of the virtual signer tool, the definition of the VC follows the H-Anim standard and is rendered in a web browser using a standard VRML plug-in.
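The text-to-sign pipeline — parse the written input, map its structure to sign order, then look each lemma up in a phonologically coded lexicon — can be sketched structurally. Everything in this sketch is a placeholder: the lexicon entries are not real HamNoSys codes, and the verb-final reordering rule is an invented stand-in for the system's GSL structure rules.

```python
# Structural sketch of a text-to-sign pipeline (lexicon entries and the
# reordering rule are invented placeholders, not real GSL or HamNoSys data):
# input text is tokenised, reordered into target sign order, and each lemma
# is looked up in a lexicon of sign codes to be animated by the avatar.

LEXICON = {"I": "SIGN_I", "school": "SIGN_SCHOOL", "go": "SIGN_GO"}

def reorder(tokens):
    """Toy structure-mapping rule: move the verb to clause-final position."""
    verbs = [t for t in tokens if t == "go"]
    rest = [t for t in tokens if t != "go"]
    return rest + verbs

def text_to_signs(sentence):
    tokens = sentence.rstrip(".").split()
    return [LEXICON[t] for t in reorder(tokens) if t in LEXICON]

stream = text_to_signs("I go school.")
```

In the actual system the reordering would come from a computational grammar of GSL and the lexicon values would be HamNoSys phonological codes driving the virtual signer.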
Language Resources and Evaluation | 2007
George Caridakis; Amaryllis Raouzaiou; Elisabetta Bevacqua; Maurizio Mancini; Kostas Karpouzis; Lori Malatesta; Catherine Pelachaud
This work concerns multimodal and expressive synthesis on virtual agents, based on the analysis of actions performed by human users. As input we consider the image sequence of the recorded human behavior. Computer vision and image processing techniques are incorporated in order to detect the cues needed for extracting expressivity features. The multimodality of the approach lies in the fact that both facial and gestural aspects of the user’s behavior are analyzed and processed. The mimicry consists of perception, interpretation, planning and animation of the expressions shown by the human, resulting not in an exact duplicate but in an expressive model of the user’s original behavior.
Proceedings of the International Workshop on Affective-Aware Virtual Agents and Social Robots | 2009
Stylianos Asteriadis; Dimitris Soufleros; Kostas Karpouzis; Stefanos D. Kollias
We present a new dataset, ideal for testing Head Pose and Eye Gaze Estimation algorithms. Our dataset was recorded using a monocular system, and no information regarding camera or environment parameters is provided, making the dataset well suited for algorithms that do not utilize such information and do not require any specific hardware.
Universal Access in The Information Society | 2008
Stavroula-Evita Fotinea; Eleni Efthimiou; George Caridakis; Kostas Karpouzis
This paper presents the modules that comprise a knowledge-based sign synthesis architecture for Greek sign language (GSL). Such systems combine natural language (NL) knowledge, machine translation (MT) techniques and avatar technology in order to allow for dynamic generation of sign utterances. The NL knowledge of the system consists of a sign lexicon and a set of GSL structure rules, and is exploited in the context of typical natural language processing (NLP) procedures, which involve syntactic parsing of linguistic input as well as structure and lexicon mapping according to standard MT practices. The coding of linguistic strings relevant to GSL provides instructions for the motion of a virtual signer that performs the corresponding signing sequences. Dynamic synthesis of GSL linguistic units is achieved by mapping written Greek structures to GSL, based on a computational grammar of GSL and a lexicon that contains lemmas coded as features of GSL phonology. This approach allows for robust conversion of written Greek to GSL, which is an essential prerequisite for access to e-content by the community of native GSL signers. The developed system is sublanguage oriented and performs satisfactorily as regards its linguistic coverage, allowing for easy extensibility to other language domains. However, its overall performance is subject to current, well-known MT limitations.