Aristomenis Thanopoulos
University of Patras
Publication
Featured research published by Aristomenis Thanopoulos.
international conference on computational linguistics | 2000
Aristomenis Thanopoulos; Nikos Fakotakis; George K. Kokkinakis
In this paper we address the problem of discovering word semantic similarities via statistical processing of text corpora. We propose a knowledge-poor method that exploits the sentential context of words to extract similarity relations between them, as well as word clusters that are semantic in nature. The approach aims at full portability across domains and languages and is therefore based on minimal resources.
text speech and dialogue | 2003
Aristomenis Thanopoulos; Nikos Fakotakis; George K. Kokkinakis
Previous approaches to the automatic extraction of lexical similarities have treated the word as the semantic unit of text. However, the theoretical perspective of contextual lexical semantics suggests that larger segments of text, specifically non-compositional multiwords, are more appropriate for this role. We experimentally tested the applicability of this notion by applying automatic collocation extraction to identify and merge such multiwords prior to the similarity estimation process. Employing an automatic comparative evaluation scheme, we ascertain an improvement in the extracted lexico-semantic knowledge.
text speech and dialogue | 2003
Manolis Maragoudakis; Aristomenis Thanopoulos; Nikos Fakotakis
In this paper, we address the identification of a user’s plan in terms of user modeling under uncertainty. Unlike the majority of existing natural language understanding engines, the presented framework automatically encodes the semantic representation of a user’s query using Bayesian networks. The structure of the networks is determined from annotated dialogue corpora, thus eliminating the monotonous and costly process of manually coding domain knowledge. The conditional probability tables are computed accurately from the available data, obtained from the same set of dialogue acts. In order to cope with words absent from our restricted dialogue corpus, we have incorporated a separate, offline module, which estimates their semantic role from both medical and general raw text corpora, correlating them with known lexical-semantically similar words or predefined topics. Lexical similarity is identified on the basis of both contextual similarity and co-occurrence in conjunctive expressions, while extraction of word-topic correlations is possible due to the labeled nature of the available medical corpus, obtained from the Hellenic National Organization for Medicines. The platform was evaluated against an existing language understanding module of the DIKTIS medical system, the architecture of which is based on manually embedded domain knowledge. The results show a noteworthy improvement in identifying the core goals of a user. The presented approach demonstrates a 24% recognition improvement using the automatic domain knowledge extraction engine, augmented with the component for resolving unknown terms.
international conference on knowledge-based and intelligent information and engineering systems | 2003
Manolis Maragoudakis; Aristomenis Thanopoulos; Kyriakos N. Sgarbas; Nikos Fakotakis
In this paper, a probabilistic framework for acquiring domain knowledge from heterogeneous corpora is introduced. The acquired information is used for intelligent human-computer interaction through the web. The application selected for experimenting with the framework was education on the chemotherapy of nosocomial and community-acquired pneumonia. In contrast to existing educational dialogue engines, which use handcrafted knowledge of the application domain, our approach automatically encodes the semantic model of the application by learning Bayesian networks from past user questions. The structure of the networks as well as the conditional probability distributions are computed automatically from dialogue corpora, thus eliminating the tedious process of manually inserting domain knowledge. Furthermore, we attempt to overcome the significant issue of limited vocabulary by incorporating a methodology that estimates semantic similarities for words not found within the system’s vocabulary and probabilistically associates them with those that appear in it.
International Journal on Artificial Intelligence Tools | 2004
Manolis Maragoudakis; Aristomenis Thanopoulos; Kyriakos N. Sgarbas; Nikos Fakotakis
This paper introduces a statistical framework for extracting medical domain knowledge from heterogeneous corpora. The acquired information is incorporated into a natural language understanding agent and applied to DIKTIS, an existing web-based educational dialogue system for the chemotherapy of nosocomial and community-acquired pneumonia, with the aim of providing more intelligent natural language interaction. Unlike the majority of existing dialogue understanding engines, the presented system automatically encodes the semantic representation of a user's query using Bayesian networks. The structure of the networks is determined from annotated dialogue corpora using the Bayesian scoring method, thus eliminating the tedious and costly process of manually coding domain knowledge. The conditional probability distributions are estimated during a training phase using data obtained from the same set of dialogue acts. In order to cope with words absent from our restricted dialogue corpus, a separate offline module was incorporated, which estimates their semantic role from both medical and general raw text corpora, correlating them with known lexical-semantically similar words or predefined topics. Lexical similarity is identified on the basis of both contextual similarity and co-occurrence in conjunctive expressions. The platform was evaluated against the existing natural language understanding module of DIKTIS, the architecture of which is based on manually embedded domain knowledge.
text speech and dialogue | 2002
Manolis Maragoudakis; Aristomenis Thanopoulos; Nikos Fakotakis
In this paper, we introduce an architecture designed to achieve effective plan recognition using Bayesian networks that encode the semantic representation of the user's utterances. The structure of the networks is determined from dialogue corpora, thus eliminating the costly process of hand-coding domain knowledge. The conditional probability distributions are learned during a training phase in which data are obtained from the same set of dialogue acts. Furthermore, we have incorporated a module that learns semantic similarities of words from raw text corpora and uses the extracted knowledge to resolve unknown terms, thus enhancing plan recognition accuracy and improving the quality of the discourse. We present experimental results from an implementation of our platform for a weather information system and compare its performance against a similar commercial one. The results show a significant improvement in identifying the goals of the user. Moreover, we claim that our framework could straightforwardly be updated with new elements from the same domain or adapted to other domains as well.
language resources and evaluation | 2002
Aristomenis Thanopoulos; Nikos Fakotakis; George K. Kokkinakis
IEEE Intelligent Systems | 2007
Manolis Maragoudakis; Aristomenis Thanopoulos; Nikos Fakotakis
conference of the international speech communication association | 1997
Aristomenis Thanopoulos; Nikos Fakotakis; George K. Kokkinakis
language resources and evaluation | 2008
Katia Lida Kermanidis; Aristomenis Thanopoulos; Manolis Maragoudakis; Nikos Fakotakis