Kostas Fragos
National Technical University of Athens
Publication
Featured research published by Kostas Fragos.
INTERNATIONAL CONFERENCE ON INTEGRATED INFORMATION (IC-ININFO 2014): Proceedings of the 4th International Conference on Integrated Information | 2015
Dimitris Vassis; B. A. Kampouraki; Petros Belsis; Vassilis Zafeiris; Nikolaos Vassilas; Eleni Galiotou; Nikitas N. Karanikolas; Kostas Fragos; Vassilis G. Kaburlasos; S. E. Papadakis; Vassilis Tsoukalas; Christos Skourlas
In this paper we present a comprehensive review of the use of neural networks in automated medical diagnosis, with special emphasis on Support Vector Machines (SVMs), which are specialized types of neural functions. The study shows that, in many cases, symptoms and diseases can be efficiently predicted by neural systems, while SVMs are increasingly used in medical diagnosis because of their accurate classification characteristics.
International Journal on Artificial Intelligence Tools | 2013
Kostas Fragos
In this work, we propose a new measure of semantic relatedness between concepts, applied to word sense disambiguation. Using the overlaps between WordNet definitions of concepts (glosses) and the so-called goodness-of-fit statistical test, we establish a formal mechanism for quantifying and estimating the semantic relatedness between concepts. More concretely, we model WordNet gloss overlaps by making a theoretical assumption about their distribution, and then we quantify the discrepancy between the theoretical and actual distributions. This discrepancy is then used to measure the relatedness between the input concepts. The experimental results showed very good performance on SensEval-2 lexical sample data for word sense disambiguation.
Panhellenic Conference on Informatics | 2014
Kostas Fragos; Christos Skourlas
In this work, we propose a method to improve performance in biomedical article classification. We use Naïve Bayes and Maximum Entropy classifiers to classify real-world biomedical articles. We describe a technique based on the chi-square measure to discard irrelevant information from the data and to identify the keywords most relevant to the classification task. To improve classification performance, we use two merging operators, Max and Harmonic Mean, proposed by Jongwoo et al. (2010), to combine the results of the two classifiers. The results show that the Maximum Entropy classifier gives the better performance at the top 500 relevant keywords. Combining the results of the two classifiers is also shown to improve classification performance on real-world biomedical data.
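The two ingredients named in this abstract, chi-square keyword scoring and the Max and Harmonic Mean merging operators, can be sketched as follows. This is only an illustration: the abstract gives no formulas, so the standard 2x2 chi-square statistic and the per-class application of the operators to each classifier's probability estimates are assumptions, and all function names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Standard chi-square statistic for a 2x2 term/class contingency table.

    a: in-class documents containing the term
    b: out-of-class documents containing the term
    c: in-class documents lacking the term
    d: out-of-class documents lacking the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (a * d - b * c) ** 2 / denom


def merge_max(p1, p2):
    """Assumed Max operator: keep the larger of the two scores per class."""
    return {label: max(p1[label], p2[label]) for label in p1}


def merge_harmonic_mean(p1, p2):
    """Assumed Harmonic Mean operator: harmonic mean of the two scores per class."""
    merged = {}
    for label in p1:
        total = p1[label] + p2[label]
        merged[label] = 2 * p1[label] * p2[label] / total if total > 0 else 0.0
    return merged


def predict(scores):
    """Return the class label with the highest merged score."""
    return max(scores, key=scores.get)


# Hypothetical per-class outputs of the two classifiers for one article.
naive_bayes = {"relevant": 0.6, "irrelevant": 0.4}
max_entropy = {"relevant": 0.3, "irrelevant": 0.7}
print(predict(merge_max(naive_bayes, max_entropy)))
print(predict(merge_harmonic_mean(naive_bayes, max_entropy)))
```

Keywords would be ranked by `chi_square_2x2` and only the top-scoring ones kept as features before training either classifier.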
Information Retrieval | 2006
Kostas Fragos; Yannis Maistros
In many probabilistic modeling approaches to Information Retrieval we are interested in estimating how well a document model “fits” the user’s information need (query model). In statistics, on the other hand, goodness-of-fit tests are well-established techniques for assessing assumptions about the underlying distribution of a data set. Supposing that the query terms are randomly distributed across the documents of the collection, we want to know whether the query terms occur in a particular document more frequently than chance would predict. This can be quantified by the so-called goodness-of-fit tests. In this paper, we present a new document ranking technique based on Chi-square goodness-of-fit tests. Given the null hypothesis that there is no association between the query terms q and the document d other than chance occurrences, we perform a Chi-square goodness-of-fit test to assess this hypothesis and calculate the corresponding Chi-square values. Our retrieval formula ranks the documents in the collection according to these calculated Chi-square values. The method was evaluated over the entire test collection of TREC data, on disks 4 and 5, using the topics of the TREC-7 and TREC-8 conferences (50 topics each). It performs well, consistently outperforming the classical OKAPI term frequency weighting formula, though it falls below KL-divergence from the language modeling approach. Despite this, we believe that the technique is an important non-parametric way of thinking about retrieval, offering the possibility of trying simple alternative retrieval formulas within the framework of goodness-of-fit statistical tests, and of modeling the data in various ways by estimating or assigning an arbitrary theoretical distribution to terms.
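As a rough illustration of the ranking idea, the sketch below scores each document by a Chi-square statistic comparing observed query-term counts with the counts expected if terms were spread across the collection in proportion to document length. The expected-count model, the signed contribution for under-represented terms, and the function names are assumptions made for this example; the abstract does not state the paper's exact formula.

```python
from collections import Counter


def chi_square_score(query_terms, doc_tokens, collection_tf, collection_len):
    """Sum of signed (O - E)^2 / E contributions over the query terms.

    O is the term's count in the document; E is the count expected under the
    null hypothesis, assumed here to be collection_tf * doc_len / collection_len.
    """
    tf = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for term in query_terms:
        expected = collection_tf.get(term, 0) * doc_len / collection_len
        if expected == 0:
            continue  # term absent from the collection: no evidence either way
        observed = tf[term]
        contribution = (observed - expected) ** 2 / expected
        # Assumption: over-representation raises the score, under-representation lowers it.
        score += contribution if observed > expected else -contribution
    return score


def rank(query_terms, docs):
    """Return document ids sorted by decreasing Chi-square score."""
    collection_tf = Counter(tok for tokens in docs.values() for tok in tokens)
    collection_len = sum(len(tokens) for tokens in docs.values())
    scores = {
        doc_id: chi_square_score(query_terms, tokens, collection_tf, collection_len)
        for doc_id, tokens in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy collection, invented for illustration only.
docs = {
    "d1": ["cat", "cat", "dog", "fish"],
    "d2": ["dog", "dog", "bird", "fish"],
    "d3": ["bird", "bird", "bird", "tree"],
}
print(rank(["cat"], docs))
```

Swapping the expected-count line for a different theoretical distribution is the kind of variation the abstract's closing sentence alludes to.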
Panhellenic Conference on Informatics | 2015
Kostas Fragos; Christos Skourlas
In this paper, a new method for medical article classification is proposed, based on exploiting information from local and global class label frequencies in the training corpus. The proposed method partially overcomes the low accuracy rate of the KNN classifier. First, it uses a lexical approach to identify tokens in the medical article; then, it uses local and global class label frequencies, in a way similar to the traditional tf-idf weighting scheme, to devise the weighting function used in the classification process. Evaluation experiments on the Ohsumed collection of medical documents show that the proposed method significantly outperforms traditional KNN classification.
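One way to picture a tf-idf-like weighting over class labels is sketched below: a label's local frequency among the k nearest neighbors is multiplied by a log-scaled inverse of its global frequency in the training set. The abstract does not give the actual weighting function, so the `lf * log(1 + N / gf)` form, the cosine similarity, and every name here are assumptions for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def label_scores(neighbor_labels, training_labels):
    """Assumed weighting: local label frequency times log-scaled inverse global frequency."""
    local = Counter(neighbor_labels)
    global_freq = Counter(training_labels)
    n = len(training_labels)
    return {
        label: lf * math.log(1 + n / global_freq[label])
        for label, lf in local.items()
    }


def weighted_knn_predict(query, training, k=3):
    """training: list of (sparse_vector, label) pairs."""
    neighbors = sorted(training, key=lambda item: cosine(query, item[0]), reverse=True)[:k]
    scores = label_scores(
        [label for _, label in neighbors],
        [label for _, label in training],
    )
    return max(scores, key=scores.get)


# With plain majority voting "common" would win 2 to 1; the assumed
# weighting favors the label that is rare in the corpus overall.
scores = label_scores(["common", "common", "rare"], ["common"] * 8 + ["rare"] * 2)
print(max(scores, key=scores.get))
```

The point of the sketch is only that a globally frequent label needs more local support to win, which is the tf-idf analogy the abstract draws.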
ACM Symposium on Applied Computing | 2006
Petros Belsis; Kostas Fragos; Stefanos Gritzalis; Christos Skourlas
Many linear statistical models have lately been proposed in the text classification literature and evaluated against the Unsolicited Bulk Email filtering problem. Despite their popularity, due both to their simplicity and relative ease of interpretation, the linearity assumption about data samples is inappropriate in practice because it cannot capture the apparent non-linear relationships that characterize these samples. In this paper, we propose SF-HME, a Hierarchical Mixture-of-Experts system that attempts to overcome limitations common to other machine-learning-based approaches when applied to spam mail classification. After reducing the dimensionality of the data with the Simba algorithm for feature selection, we evaluated our SF-HME system on a publicly available corpus of emails with very high similarity between legitimate and bulk email, and thus low discriminative potential, where traditional rule-based filtering approaches achieve considerably lower precision. As a result, we confirm the superiority of our SF-HME method over other machine learning approaches, which showed a lower degree of recall.
Panhellenic Conference on Informatics | 2016
Kostas Fragos; Christos Skourlas
K-Nearest Neighbor (KNN) is one of the most popular algorithms for text classification. In many experiments, researchers have found that the KNN algorithm achieves very good performance on different data sets. In a previous work [16], we proposed an algorithm, called lf-igf KNN, to classify medical articles, using local (from the neighborhood) and global (from the corpus) class label frequencies to devise a weighting scheme for ranking all data points in the training set. In this work, we extend that algorithm beyond simple counting by smoothing both class label frequencies and neighbor distances. In this way, we provide an alternative and more robust weighting scheme for KNN classification. Evaluation experiments on the Ohsumed collection of medical documents show promising results and encourage the use of smoothing techniques to treat class occurrences in traditional KNN classification.
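To make "smoothing class label frequencies and neighbor distances" concrete, here is a minimal sketch of a smoothed KNN vote. The abstract does not say which smoothing the authors use; additive (Laplace-style) pseudo-counts for every class and the `1 / (1 + distance)` vote weight are assumptions chosen for illustration, and the function name is hypothetical.

```python
def smoothed_knn_probabilities(neighbors, labels, alpha=1.0):
    """Turn k nearest neighbors into smoothed class probabilities.

    neighbors: list of (distance, label) pairs for the k nearest points
    labels:    every class label that can be predicted
    alpha:     pseudo-count given to each class before any neighbor votes
    """
    # Additive smoothing: no class ends up with zero probability.
    votes = {label: alpha for label in labels}
    for distance, label in neighbors:
        # Smoothed distance weight: bounded by 1 even when distance is 0.
        votes[label] += 1.0 / (1.0 + distance)
    total = sum(votes.values())
    return {label: vote / total for label, vote in votes.items()}


# Hypothetical neighbors of one query article.
neighbors = [(0.1, "a"), (0.9, "b"), (0.8, "b")]
probs = smoothed_knn_probabilities(neighbors, ["a", "b", "c"])
print(max(probs, key=probs.get))
```

Class "c" has no neighbors here yet still receives a small non-zero probability, which is the practical effect of smoothing the class counts rather than using them raw.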
The 2nd International Workshop on Natural Language Understanding and Cognitive Science | 2016
Kostas Fragos; Yannis Maistros; Christos Skourlas
Procedia - Social and Behavioral Sciences | 2014
Kostas Fragos; Petros Belsis; Christos Skourlas
International Journal on Artificial Intelligence Tools | 2005
Kostas Fragos; Yannis Maistros