Fotis Jannidis | Researchain

Publication

Featured research published by Fotis Jannidis.

International Conference on e-Science | 2006

TextGrid and eHumanities

Peter Gietz; Andreas Aschenbrenner; Stefan Büdenbender; Fotis Jannidis; Marc Wilhelm Küster; Christoph Ludwig; Wolfgang Pempe; Thorsten Vitt; Werner Wegstein; Andrea Zielinski

TextGrid is a new Grid project in the framework of the German D-Grid initiative that aims to deploy Grid technologies for humanities scholars working on historical (German) texts. Its two roots, humanities computing and eScience (Grid computing used by researchers together with modern communication technologies), are the basis for TextGrid to do pioneering work in eHumanities. After summarizing humanities computing and modern network technologies, community expectations in the fields of philological edition and other application areas are set forth, from which functional requirements such as modularity and distribution are distilled. The first version of the TextGrid architecture was designed in accordance with these requirements and focuses on openness through standards conformance and encapsulation. It provides storage Grid services via a pure Web Services interface to dedicated Web Services tools for different aspects of text processing, analysis and retrieval. This platform aims to provide easily usable tools for scholars, and also specifies interfaces through which external program developers can add functionality.

North American Chapter of the Association for Computational Linguistics | 2015

Rule-based Coreference Resolution in German Historic Novels

Markus Krug; Frank Puppe; Fotis Jannidis; Luisa Macharowsky; Isabella Reger; Lukas Weimar

Coreference resolution (CR) is a key task in the automated analysis of characters in stories. Standard CR systems, usually trained on newspaper texts, have difficulties with literary texts, even with novels; a comparison with newspaper texts showed that average sentence length is greater in novels and that the number of pronouns, as well as the percentage of direct speech, is higher. We report promising evaluation results for a rule-based system similar to [Lee et al. 2011], but tailored to the domain, which recognizes coreference chains in novels much better than CR systems like CorZu. Rule-based systems performed best on the CoNLL 2011 challenge [Pradhan et al. 2011]. Recent work in machine learning showed results similar to those of rule-based systems [Durett et al. 2013]. Rule-based systems have the advantage that their explanation component facilitates a fine-grained error analysis for incremental refinement of the rules.

Digital Scholarship in the Humanities | 2017

Understanding and explaining Delta measures for authorship attribution

Stefan Evert; Thomas Proisl; Fotis Jannidis; Isabella Reger; Steffen Pielström; Christof Schöch; Thorsten Vitt

This article builds on a mathematical explanation of one of the most prominent stylometric measures, Burrows’s Delta (and its variants), to understand and explain how it works. Starting with the conceptual separation between feature selection, feature scaling, and distance measures, we have designed a series of controlled experiments in which we used the kind of feature scaling (various types of standardization and normalization) and the type of distance measure (notably Manhattan, Euclidean, and Cosine) as independent variables and the correct authorship attributions as the dependent variable indicative of the performance of each of the methods proposed. In this way, we are able to describe in some detail how these two variables interact with each other and how they influence the results. Thus we can show that feature vector normalization, that is, the transformation of the feature vectors to a uniform length of 1 (implicit in the cosine measure), is the decisive factor for the improvement of Delta proposed recently. We are also able to show that the information particularly relevant to the identification of the author of a text lies in the profile of deviation across the most frequent words rather than in the extent of the deviation or in the deviation of specific words only.
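The separation between feature scaling and distance measure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the toy vectors are hypothetical, the z-score step uses the sample standard deviation as one possible choice, and the cosine variant is written as one minus the cosine similarity, which is a common convention assumed here.

```python
import math
from statistics import mean, stdev


def zscores(freqs):
    """Standardize each word (column) across texts: z = (f - mean) / std.

    `freqs` is a list of rows, one row of relative word frequencies per text.
    A column with zero spread is divided by 1.0 to avoid division by zero.
    """
    columns = list(zip(*freqs))
    mus = [mean(col) for col in columns]
    sigmas = [stdev(col) or 1.0 for col in columns]
    return [[(f - m) / s for f, m, s in zip(row, mus, sigmas)] for row in freqs]


def burrows_delta(u, v):
    """Manhattan distance between two z-score profiles, averaged over words."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)


def cosine_delta(u, v):
    """Cosine distance (1 - cosine similarity) between two z-score profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical z-score profiles: v has the same shape as u but twice the extent.
u = [1.0, -2.0, 0.5]
v = [2.0, -4.0, 1.0]
print(burrows_delta(u, v))  # positive: Manhattan distance reacts to the extent
print(cosine_delta(u, v))   # approximately zero: only the profile shape counts
```

Because the cosine variant implicitly normalizes both vectors to length 1, two texts whose deviations share the same profile but differ in extent come out as near-identical, which is the behaviour the abstract identifies as decisive.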

Database and Expert Systems Applications | 2015

Genre Classification on German Novels

Lena Hettinger; Martin Becker; Isabella Reger; Fotis Jannidis; Andreas Hotho

The study of German literature is mostly based on literary canons, i.e., small sets of specifically chosen documents. In particular, the history of novels has been characterized using a set of only 100 to 250 works. In this paper we address the issue of genre classification in the context of a large set of novels using machine learning methods in order to achieve a better understanding of the genre of novels. To this end, we explore how different types of features affect the performance of different classification algorithms. We employ commonly used stylometric features and evaluate two types of features not yet applied to genre classification, namely topic-based features and features based on social network graphs and character interaction. We build features on a data set of close to 1700 novels either written in or translated into German. Even though topics are often considered orthogonal to genres, we find that topic-based features in combination with support vector machines achieve the best results. Overall, we successfully apply new feature types for genre classification in the context of novels and give directions for further research in this area.

Joint German/Austrian Conference on Artificial Intelligence (Künstliche Intelligenz) | 2017

Towards Sentiment Analysis on German Literature

Albin Zehe; Martin Becker; Fotis Jannidis; Andreas Hotho

Sentiment analysis is a natural language processing task that is relevant in a number of contexts, including the analysis of literature. We report on ongoing research towards enabling, for the first time, sentence-level sentiment analysis in the domain of German novels. We create a labelled dataset from sentences extracted from German novels and, by adapting existing sentiment classifiers, reach promising F1-scores of 0.67 for binary polarity classification.
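For readers unfamiliar with the metric, the sketch below shows how an F1-score for one class of a binary polarity task is computed from confusion counts. The function name and the counts are hypothetical and purely illustrative; they are not taken from the paper, and the abstract does not say which averaging scheme its reported score uses.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall for a single class."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for the "positive" class on a small labelled sample.
score = f1_score(true_pos=8, false_pos=2, false_neg=4)
print(round(score, 2))
```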

DMNLP@PKDD/ECML | 2016