Lilja Øvrelid
University of Oslo
Publication
Featured research published by Lilja Øvrelid.
Computational Linguistics | 2012
Erik Velldal; Lilja Øvrelid; Jonathon Read; Stephan Oepen
This article explores a combination of deep and shallow approaches to the problem of resolving the scope of speculation and negation within a sentence, specifically in the domain of biomedical research literature. The first part of the article focuses on speculation. After showing how speculation cues can be accurately identified using a very simple classifier informed only by local lexical context, we go on to explore two different syntactic approaches to resolving the in-sentence scopes of these cues. Whereas one uses manually crafted rules operating over dependency structures, the other automatically learns a discriminative ranking function over nodes in constituent trees. We provide an in-depth error analysis and discussion of various linguistic properties characterizing the problem, and show that although both approaches perform well in isolation, even better results can be obtained by combining them, yielding the best published results to date on the CoNLL-2010 Shared Task data. The last part of the article describes how our speculation system is ported to also resolve the scope of negation. With only modest modifications to the initial design, the system also obtains state-of-the-art results on this task.
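The "very simple classifier informed only by local lexical context" could be sketched along these lines: a lookup keyed on a token together with its immediate neighbours, trained by counting how often that context is labelled as a speculation cue. This is a minimal illustrative stand-in, not the paper's actual model; the function names and the majority-vote threshold are assumptions.

```python
from collections import Counter

def train_cue_model(sentences):
    # sentences: list of (tokens, labels), where labels[i] is True if
    # tokens[i] is annotated as a speculation cue.
    counts = {}
    for tokens, labels in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for i, (tok, lab) in enumerate(zip(tokens, labels)):
            # Local lexical context: left neighbour, token, right neighbour.
            key = (padded[i], tok, padded[i + 2])
            pos, tot = counts.get(key, (0, 0))
            counts[key] = (pos + (1 if lab else 0), tot + 1)
    return counts

def predict_cues(model, tokens):
    # Mark a token as a cue if its context was labelled as one in a
    # majority of training occurrences; unseen contexts default to False.
    padded = ["<s>"] + tokens + ["</s>"]
    out = []
    for i, tok in enumerate(tokens):
        pos, tot = model.get((padded[i], tok, padded[i + 2]), (0, 0))
        out.append(tot > 0 and pos / tot >= 0.5)
    return out
```

In practice the published system uses a richer feature set and a discriminative learner, but the sketch captures the key point: cue identification can be driven almost entirely by local lexical evidence.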
International Conference on Data Mining | 2012
Emanuele Lapponi; Jonathon Read; Lilja Øvrelid
Proper treatment of negation is an important characteristic of methods for sentiment analysis. However, while there is a growing body of research on the automatic resolution of negation, it is not yet clear how negation is best represented for different applications. To begin to address this issue, we review representation alternatives and present a state-of-the-art system for negation resolution that is interoperable across these schemes. By employing different configurations of this system as a component in a test bed for lexically-based sentiment classification, we demonstrate that the choice of representation can have a significant impact on downstream processing.
Meeting of the Association for Computational Linguistics | 2009
Lilja Øvrelid; Jonas Kuhn; Kathrin Spreyer
This paper presents experiments which combine a grammar-driven and a data-driven parser. We show how the conversion of LFG output to dependency representation allows for a technique of parser stacking, whereby the output of the grammar-driven parser supplies features for a data-driven dependency parser. We evaluate on English and German and show significant improvements stemming from the proposed dependency structure as well as various other, deep linguistic features derived from the respective grammars.
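The parser-stacking technique described above can be illustrated schematically: the dependency proposed by the grammar-driven (LFG) parser for a token is simply appended to the feature vector the data-driven parser already uses. The function and feature names below are hypothetical; this is a sketch of the stacking idea, not the paper's implementation.

```python
def stack_features(base_feats, lfg_head, lfg_deprel):
    # Augment the data-driven parser's features for one token with the
    # head index and dependency label predicted by the LFG parser.
    feats = dict(base_feats)
    feats["lfg_head"] = lfg_head
    feats["lfg_deprel"] = lfg_deprel
    return feats
```

The data-driven parser then treats these extra features like any other, letting it learn when to trust the grammar-driven analysis.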
Meeting of the Association for Computational Linguistics | 2014
Hugo Lewi Hammer; Per Erik Solberg; Lilja Øvrelid
Online political discussions have received a lot of attention in recent years. In this paper we compare two sentiment lexicon approaches to classify the sentiment of sentences from political discussions. The first approach weights the sentence sentiment score by the number of words between the target and the sentiment words. The second approach uses the shortest paths between target and sentiment words in a dependency graph, together with linguistically motivated syntactic patterns expressed as dependency paths. The methods are tested on a corpus of sentences from online Norwegian political discussions. The results show that the method based on dependency graphs performs significantly better than the word-based approach.
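The first, surface-distance approach can be sketched in a few lines: each sentiment word's lexicon score is down-weighted by its token distance to the target. This is an illustrative reconstruction under assumed names; the paper's exact weighting scheme may differ.

```python
def sentence_sentiment(tokens, target, lexicon):
    # lexicon maps sentiment words to polarity scores (e.g. +1.0 / -1.0).
    # Sentiment words closer to the target contribute more to the score.
    if target not in tokens:
        return 0.0
    t = tokens.index(target)
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok in lexicon:
            score += lexicon[tok] / (1 + abs(i - t))
    return score
```

The dependency-graph variant replaces `abs(i - t)` with the length of the shortest path between the two words in the sentence's dependency graph, which is what the paper finds to work significantly better.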
Meeting of the Association for Computational Linguistics | 2009
Lilja Øvrelid
This article presents empirical evaluations of aspects of annotation for the linguistic property of animacy in Swedish, ranging from manual human annotation and automatic classification to an external evaluation in the task of syntactic parsing. We show that a treatment of animacy as a lexical semantic property of noun types enables generalization over distributional properties of these nouns, which proves beneficial in automatic classification and furthermore gives significant improvements in terms of parsing accuracy for Swedish, compared to a state-of-the-art baseline parser with gold standard animacy information.
Conference on Computational Natural Language Learning | 2008
Lilja Øvrelid
This article investigates the effect of a set of linguistically motivated features on argument disambiguation in data-driven dependency parsing of Swedish. We present results from experiments with gold standard features, such as animacy, definiteness and finiteness, as well as corresponding experiments where these features have been acquired automatically and show significant improvements both in overall parse results and in the analysis of specific argument relations, such as subjects, objects and predicatives.
Conference of the European Chapter of the Association for Computational Linguistics | 2006
Lilja Øvrelid
This paper presents results from experiments in automatic classification of animacy for Norwegian nouns using decision-tree classifiers. The method makes use of relative frequency measures for linguistically motivated morphosyntactic features extracted from an automatically annotated corpus of Norwegian. The classifiers are evaluated using leave-one-out training and testing, and the initial results are promising (approaching 90% accuracy) for high-frequency nouns, but deteriorate gradually as lower-frequency nouns are classified. Experiments attempting to empirically locate a frequency threshold for the classification method indicate that a subset of the chosen morphosyntactic features exhibits a notable resilience to data sparseness. Results will be presented which show that the classification accuracy obtained for high-frequency nouns (with absolute frequencies >1000) can be maintained for nouns with considerably lower frequencies (~50) by backing off to a smaller set of features at classification.
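The two ingredients named above, relative-frequency features and a back-off to a sparseness-resilient feature subset for low-frequency nouns, can be sketched as follows. The feature names and the threshold default are assumptions for illustration; the paper's actual feature inventory and cut-offs differ.

```python
def relative_features(counts):
    # counts: absolute counts of morphosyntactic contexts for one noun,
    # e.g. {"subj": 40, "obj": 10, "gen": 50}. Return relative frequencies.
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}

def select_features(feats, freq, core=("subj", "obj"), threshold=1000):
    # For nouns below the frequency threshold, back off to a smaller
    # feature subset that is more resilient to data sparseness.
    if freq > threshold:
        return feats
    return {f: v for f, v in feats.items() if f in core}
```

The resulting feature vector would then be handed to the decision-tree classifier; the key point is that low-frequency nouns are classified on fewer, more reliable features rather than being dropped.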
Journal of Language Modelling | 2016
Angelina Ivanova; Stephan Oepen; Rebecca Dridan; Dan Flickinger; Lilja Øvrelid; Emanuele Lapponi
We compare three different approaches to parsing into syntactic, bilexical dependencies for English: a ‘direct’ data-driven dependency parser, a statistical phrase structure parser, and a hybrid, ‘deep’ grammar-driven parser. The analyses from the latter two are post-converted to bilexical dependencies. Through this ‘reduction’ of all three approaches to syntactic dependency parsers, we determine empirically what performance can be obtained for a common set of dependency types for English; in- and out-of-domain experimentation ranges over diverse text types. In doing so, we observe what trade-offs apply along three dimensions: accuracy, efficiency, and resilience to domain variation. Our results suggest that the hand-built grammar in one of our parsers helps in both accuracy and cross-domain parsing performance. When evaluated extrinsically in two downstream tasks – negation resolution and semantic dependency parsing – these accuracy gains do sometimes but not always translate into improved end-to-end performance.
North American Chapter of the Association for Computational Linguistics | 2016
Aksel Wester; Lilja Øvrelid; Erik Velldal; Hugo Lewi Hammer
This paper investigates the effect of various types of linguistic features (lexical, syntactic and semantic) for training classifiers to detect threats of violence in a corpus of YouTube comments. Our results show that combinations of lexical features outperform the use of more complex syntactic and semantic features for this task.
Journal of Computing Science and Engineering | 2012
Jonathon Read; Erik Velldal; Lilja Øvrelid
Computational techniques for topic classification can support qualitative research by automatically applying labels in preparation for qualitative analyses. This paper presents an evaluation of supervised learning techniques applied to one such use case, namely, that of labeling emotions, instructions and information in suicide notes. We train a collection of one-versus-all binary support vector machine classifiers, using cost-sensitive learning to deal with class imbalance. The features investigated range from simple bag-of-words features and n-grams over stems to information drawn from syntactic dependency analysis and WordNet synonym sets. The experimental results are complemented by an analysis of systematic errors in both the output of our system and the gold-standard annotations.
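The one-versus-all setup with cost-sensitive weighting can be illustrated with a toy stand-in classifier: each label gets its own score, and rarer labels receive a larger weight so that class imbalance does not drown them out. This sketch substitutes simple word-count scoring for the paper's support vector machines, and all names are hypothetical.

```python
from collections import Counter, defaultdict

def train_ova(docs):
    # docs: list of (tokens, label). Build per-label word counts plus
    # inverse-frequency class weights to counter class imbalance.
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        label_counts[label] += 1
    n = len(docs)
    weights = {lab: n / cnt for lab, cnt in label_counts.items()}
    return word_counts, weights

def classify_ova(model, tokens):
    # One-versus-all: score each label independently, return the best.
    word_counts, weights = model
    scores = {lab: weights[lab] * sum(word_counts[lab][t] for t in tokens)
              for lab in word_counts}
    return max(scores, key=scores.get)
```

In the paper, each binary classifier is an SVM and the class weights enter through cost-sensitive learning (penalising errors on the minority class more heavily), but the decision structure is the same: one independent binary decision per label.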