Nikos Fakotakis
University of Patras
Publication
Featured research published by Nikos Fakotakis.
Computational Linguistics | 2000
Efstathios Stamatatos; George K. Kokkinakis; Nikos Fakotakis
The two main factors that characterize a text are its content and its style, and both can be used as a means of categorization. In this paper we present an approach to text categorization in terms of genre and author for Modern Greek. In contrast to previous stylometric approaches, we attempt to take full advantage of existing natural language processing (NLP) tools. To this end, we propose a set of style markers including analysis-level measures that represent the way in which the input text has been analyzed and capture useful stylistic information without additional cost. We present a set of small-scale but reasonable experiments in text genre detection, author identification, and author verification tasks and show that the proposed method performs better than the most popular distributional lexical measures, i.e., functions of vocabulary richness and frequencies of occurrence of the most frequent words. All the presented experiments are based on unrestricted text downloaded from the World Wide Web without any manual text preprocessing or text sampling. Various performance issues regarding the training set size and the significance of the proposed style markers are discussed. Our system can be used in any application that requires fast and easily adaptable text categorization in terms of stylistically homogeneous categories. Moreover, the procedure of defining analysis-level markers can be followed in order to extract useful stylistic information using existing text processing tools.
Computers and the Humanities | 2001
Efstathios Stamatatos; Nikos Fakotakis; G. Kokkinakis
The most important approaches to computer-assisted authorship attribution are exclusively based on lexical measures that either represent the vocabulary richness of the author or simply comprise frequencies of occurrence of common words. In this paper we present a fully-automated approach to the identification of the authorship of unrestricted text that excludes any lexical measure. Instead we adapt a set of style markers to the analysis of the text performed by an already existing natural language processing tool using three stylometric levels, i.e., token-level, phrase-level, and analysis-level measures. The latter represent the way in which the text has been analyzed. The presented experiments on a Modern Greek newspaper corpus show that the proposed set of style markers is able to distinguish reliably the authors of a randomly-chosen group and performs better than a lexically-based approach. However, the combination of these two approaches provides the most accurate solution (i.e., 87% accuracy). Moreover, we describe experiments on various sizes of the training data as well as tests dealing with the significance of the proposed set of style markers.
International Conference on Computational Linguistics | 2000
Efstathios Stamatatos; Nikos Fakotakis; George K. Kokkinakis
In this paper we present a method for detecting the text genre quickly and easily following an approach originally proposed in authorship attribution studies which uses as style markers the frequencies of occurrence of the most frequent words in a training corpus (Burrows, 1992). In contrast to this approach we use the frequencies of occurrence of the most frequent words of the entire written language. Using as testing ground a part of the Wall Street Journal corpus, we show that the most frequent words of the British National Corpus, representing the most frequent words of the written English language, are more reliable discriminators of text genre in comparison to the most frequent words of the training corpus. Moreover, the frequencies of occurrence of the most common punctuation marks play an important role in terms of accurate text categorization as well as when dealing with training data of limited size.
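The style markers described in this abstract can be illustrated with a minimal sketch. It assumes a tiny hypothetical reference word list (the paper draws its list from the British National Corpus, which is not reproduced here) and a per-character normalization for punctuation that is an assumption of this example, not a detail taken from the paper.

```python
import re
from collections import Counter

# Hypothetical stand-in for a list of the most frequent words of the language.
REFERENCE_WORDS = ["the", "of", "and", "a", "in", "to"]
# Hypothetical selection of common punctuation marks.
PUNCTUATION = [".", ",", ";", "?"]


def style_vector(text, words=REFERENCE_WORDS, marks=PUNCTUATION):
    """Return relative frequencies of reference words and punctuation marks."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    n_tokens = max(len(tokens), 1)
    # Word frequencies are normalized by the number of word tokens.
    word_freqs = [counts[w] / n_tokens for w in words]
    # Punctuation frequencies are normalized by text length (an assumption).
    n_chars = max(len(text), 1)
    mark_freqs = [text.count(m) / n_chars for m in marks]
    return word_freqs + mark_freqs


vector = style_vector("The price of oil rose, and the market fell.")
```

A vector of this kind could then be passed to any standard classifier; the choice of classifier is left open here.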
Natural Language Engineering | 1996
Stephanos E. Michos; Nikos Fakotakis; G. Kokkinakis
Operating system command languages assist the user in executing commands for a significant number of common everyday tasks. On the other hand, the introduction of textual command languages for robots has provided the opportunity to perform some important functions that leadthrough programming cannot readily accomplish. However, such command languages assume the user to be expert enough to carry out a specific task in these application domains. On the contrary, a natural language interface to such command languages, apart from being able to be integrated into a future speech interface, can facilitate and broaden the use of these command languages to a larger audience. In this paper, advanced techniques are presented for an adaptive natural language interface that can (a) be portable to a large range of command languages, (b) handle even complex commands thanks to an embedded linguistic parser, and (c) be expandable and customizable by providing the casual user with the opportunity to specify some types of new words as well as the system developer with the ability to introduce new tasks in these application domains. Finally, to demonstrate the above techniques in practice, an example of their application to a Greek natural language interface to the MS-DOS operating system is given.
Conference of the European Chapter of the Association for Computational Linguistics | 1999
Efstathios Stamatatos; Nikos Fakotakis; George K. Kokkinakis
In this paper we present an approach to automatic authorship attribution dealing with real-world (or unrestricted) text. Our method is based on the computational analysis of the input text using a text-processing tool. Besides the style markers relevant to the output of this tool, we also use analysis-dependent style markers, that is, measures that represent the way in which the text has been processed. Neither word frequency counts nor other lexically-based measures are taken into account. We show that the proposed set of style markers is able to distinguish texts of various authors of a weekly newspaper using multiple regression. All the experiments we present were performed using real-world text downloaded from the World Wide Web. Our approach is easily trainable and fully automated, requiring neither manual text preprocessing nor sampling.
International Conference on Acoustics, Speech, and Signal Processing | 2009
Stavros Ntalampiras; Ilyas Potamitis; Nikos Fakotakis
This study presents a practical methodology for automatic space monitoring based solely on the perceived acoustic information. We consider the case where atypical situations such as screams, explosions and gunshots take place in a metro station environment. Our approach is based on a two-stage recognition schema, with each stage exploiting HMMs to approximate the density function of the corresponding sound class. The main objective is to detect abnormal events that take place in a noisy environment. A thorough evaluation procedure is carried out under different SNR conditions, and we report high detection rates with respect to false alarm and miss probability rates.
International Journal on Document Analysis and Recognition | 2002
Ergina Kavallieratou; Nikos Fakotakis; George K. Kokkinakis
In this paper, an integrated offline recognition system for unconstrained handwriting is presented. The proposed system consists of seven main modules: skew angle estimation and correction, printed-handwritten text discrimination, line segmentation, slant removal, word segmentation, and character segmentation and recognition, stemming from the implementation of already existing algorithms as well as novel algorithms. This system has been tested on the NIST, IAM-DB, and GRUHD databases and has achieved accuracy that varies from 65.6% to 100% depending on the database and the experiment.
IEEE Transactions on Multimedia | 2011
Stavros Ntalampiras; Ilyas Potamitis; Nikos Fakotakis
Novelty detection in the machine learning context refers to identifying unknown/novel data, i.e., data which vary greatly from the data that the system was trained with. This paper explores this technique as applied to acoustic surveillance of abnormal situations. The ultimate goal of the system is to help an authorized person take the appropriate actions for preventing life/property loss. A wide variety of acoustic parameters is employed to form a multidomain feature vector, which captures diverse characteristics of the audio signals. Subsequently, the feature coefficients are fed to three probabilistic novelty detection methodologies. Their performance is computed using two measures which take into account misdetections and false alarms. Our dataset was recorded under real-world conditions including three different locations where various types of normal and abnormal sound events were captured. A smart-home environment, an open public space, and an office corridor were used. The results indicate that probabilistic novelty detection can provide an accurate analysis of the audio scene to identify abnormal events.
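The general idea of probabilistic novelty detection can be sketched as follows. This is a deliberately simplified illustration using a single diagonal Gaussian and a likelihood threshold; it is not one of the three methodologies evaluated in the paper, and the feature values and function names are hypothetical.

```python
import math


def fit_diagonal_gaussian(vectors):
    """Estimate per-dimension mean and variance from normal (typical) data."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    variances = [
        # A small floor keeps the variance strictly positive.
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(vector, means, variances):
    """Log-density of a vector under the fitted diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vector, means, variances)
    )


def is_novel(vector, means, variances, threshold):
    """Flag a vector as novel when its log-likelihood falls below threshold."""
    return log_likelihood(vector, means, variances) < threshold


# Hypothetical feature vectors for "normal" sound events.
train = [[0.0, 1.0], [0.2, 0.9], [-0.1, 1.1], [0.1, 1.0]]
means, variances = fit_diagonal_gaussian(train)
# Assumed threshold choice: the lowest log-likelihood seen in training.
threshold = min(log_likelihood(v, means, variances) for v in train)
flag_far = is_novel([5.0, -3.0], means, variances, threshold)
flag_seen = is_novel(train[0], means, variances, threshold)
```

In practice the threshold trades off misdetections against false alarms, the two error types the abstract mentions.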
Image and Vision Computing | 2002
Ergina Kavallieratou; Nikos Fakotakis; George K. Kokkinakis
A skew estimation algorithm for printed and handwritten documents, based on the document's horizontal projection profile and its Wigner-Ville distribution, is presented. The proposed algorithm is able to correct skew angles that range between −89° and +89°, detecting the correctly oriented position of the page from the alternations of the horizontal projection profile. It is capable of successfully processing handwritten documents, even if they consist of non-parallel text lines. It deals with the presence of graphics, while a few text lines suffice for the application of the algorithm. Furthermore, the algorithm permits the use of only a part of the page for the skew estimation, minimizing the computational complexity. The proposed algorithm was evaluated on a wide variety of pages (i.e. printed, handwritten, multi-column, application forms etc.), achieving a success rate of 100% within a confidence range of ±0.3°. © 2002 Published by Elsevier Science B.V.
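A rough sketch of projection-profile skew estimation is given below. It does not implement the Wigner-Ville analysis used in the paper; as a stand-in it scores each candidate angle by the sum of squared row counts, a common sharpness criterion. The point coordinates are idealized (sub-pixel) and synthetic, and all names are hypothetical.

```python
import math


def projection_profile(points, angle_deg):
    """Count foreground points per row after undoing a skew of angle_deg."""
    theta = math.radians(angle_deg)
    profile = {}
    for x, y in points:
        # Row coordinate of the point once the page is rotated back.
        row = round(y * math.cos(theta) - x * math.sin(theta))
        profile[row] = profile.get(row, 0) + 1
    return profile


def estimate_skew(points, candidates):
    """Return the candidate angle whose profile is most sharply peaked."""
    def sharpness(angle):
        return sum(c * c for c in projection_profile(points, angle).values())
    return max(candidates, key=sharpness)


# Synthetic page: three text lines skewed by 5 degrees.
slope = math.tan(math.radians(5))
points = [(x, y0 + x * slope) for y0 in (0, 20, 40) for x in range(200)]
estimated = estimate_skew(points, range(-10, 11))
```

Here the skew is defined so that text lines follow y = y0 + x·tan(angle); a real page would supply rasterized foreground pixels instead of ideal points, and a finer candidate grid would be needed to approach the sub-degree accuracy reported in the abstract.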
EURASIP Journal on Audio, Speech, and Music Processing | 2009
Stavros Ntalampiras; Ilyas Potamitis; Nikos Fakotakis
Robust recognition of general audio events constitutes a topic of intensive research in the signal processing community. This work presents an efficient methodology for acoustic surveillance of atypical situations which can find use under different acoustic backgrounds. The primary goal is the continuous acoustic monitoring of a scene for potentially hazardous events in order to help an authorized officer to take the appropriate actions towards preventing human loss and/or property damage. A probabilistic hierarchical scheme is designed based on Gaussian mixture models and state-of-the-art sound parameters selected through extensive experimentation. A feature of the proposed system is its model adaptation loop that provides adaptability to different sound environments. We report extensive experimental results including installation in a real environment and operational detection rates for three days of function on a 24 hour basis. Moreover, we adopt a reliable testing procedure that demonstrates high detection rates as regards average recognition, miss probability, and false alarm rates.