Katja Markert
University of Leeds
Publications
Featured research published by Katja Markert.
Empirical Methods in Natural Language Processing | 2005
Johan Bos; Katja Markert
We use logical inference techniques for recognising textual entailment. As the performance of theorem proving turns out to be highly dependent on background knowledge that is not readily available, we incorporate model building, a technique borrowed from automated reasoning, and show that it is a useful, robust method for approximating entailment. Finally, we use machine learning to combine these deep semantic analysis techniques with simple shallow word overlap; the resulting hybrid model achieves high accuracy on the RTE test set, given the state of the art. Our results also show that the different techniques we employ perform very differently on some subsets of the RTE corpus, and as a result it is useful to use the nature of the dataset as a feature.
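The shallow word-overlap component combined with the deep semantic features can be sketched as follows. This is a minimal illustration, not the authors' implementation; the whitespace tokenisation and function name are assumptions.

```python
def word_overlap(text, hypothesis):
    """Shallow entailment feature: the fraction of hypothesis tokens
    that also occur in the text (simple whitespace tokenisation)."""
    t = set(text.lower().split())
    h = set(hypothesis.lower().split())
    return len(h & t) / len(h) if h else 0.0
```

In a hybrid model of this kind, such a score would be one feature alongside the outputs of the theorem prover and model builder, with a learned classifier combining them.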
British Machine Vision Conference | 2009
Josiah Wang; Katja Markert; Mark Everingham
We investigate the task of learning models for visual object recognition from natural language descriptions alone. The approach contributes to the recognition of fine-grained object categories, such as animal and plant species, where it may be difficult to collect many images for training, but where textual descriptions of visual attributes are readily available. As an example we tackle recognition of butterfly species, learning models from descriptions in an online nature guide. We propose natural language processing methods for extracting salient visual attributes from these descriptions to use as ‘templates’ for the object categories, and apply vision methods to extract corresponding attributes from test images. A generative model is used to connect textual terms in the learnt templates to visual attributes. We report experiments comparing the performance of humans and the proposed method on a dataset of ten butterfly categories.
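Extracting attribute 'templates' from a free-text species description might look like the following toy sketch. The colour list and single regex pattern are illustrative stand-ins for the paper's actual NLP pipeline:

```python
import re

def extract_attributes(description,
                       colours=("orange", "black", "white", "brown", "blue")):
    """Pull simple (colour, body-part) attribute pairs, e.g.
    ('orange', 'wings'), from a free-text species description."""
    pattern = re.compile(r"\b(%s)\s+(\w+)" % "|".join(colours))
    return [(colour, part) for colour, part in pattern.findall(description.lower())]
```

The extracted pairs would then serve as a textual template against which visual attribute detectors are matched.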
Empirical Methods in Natural Language Processing | 2003
Natalia N. Modjeska; Katja Markert; Malvina Nissim
We present a machine learning framework for resolving other-anaphora. Besides morpho-syntactic, recency, and semantic features based on existing lexical knowledge resources, our algorithm obtains additional semantic knowledge from the Web. We search the Web via lexico-syntactic patterns that are specific to other-anaphors. Incorporating this innovative feature leads to an 11.4 percentage point improvement in the classifier's F-measure (a 25% improvement relative to results without this feature).
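Lexico-syntactic Web patterns of this kind can be instantiated as quoted query strings pairing an antecedent candidate with the anaphor's head noun. The pattern inventory below is illustrative, not the paper's exact set:

```python
def other_anaphora_queries(candidate, head_noun):
    """Instantiate lexico-syntactic Web-query patterns linking an
    antecedent candidate (e.g. 'BMW') to the head noun of an
    other-anaphor (e.g. 'cars' in 'other cars')."""
    patterns = ['"{cand} and other {head}"',
                '"{cand} or other {head}"',
                '"{head} such as {cand}"']
    return [p.format(cand=candidate, head=head_noun) for p in patterns]
```

Hit counts for such queries can then be turned into a feature scoring how plausible each candidate antecedent is.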
International Conference on Computational Linguistics | 2008
Fangzhong Su; Katja Markert
We determine the subjectivity of word senses. To avoid costly annotation, we evaluate how useful existing resources established in opinion mining are for this task. We show that results achieved with existing resources that are not tailored towards word sense subjectivity classification can rival results achieved with supervision on a manually annotated training set. However, results with different resources vary substantially and are dependent on the different definitions of subjectivity used in the establishment of the resources.
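A resource-based sense classifier of the kind evaluated above can be sketched by matching a sense's gloss against an opinion lexicon. The tiny clue set and threshold below are placeholders for real opinion-mining resources:

```python
# Toy stand-in for a full subjectivity lexicon from opinion mining.
SUBJECTIVE_CLUES = {"good", "bad", "pleasant", "awful",
                    "love", "hate", "beautiful"}

def sense_subjectivity(gloss, threshold=1):
    """Label a word sense subjective if its dictionary gloss contains
    at least `threshold` clue words from the opinion lexicon."""
    hits = sum(1 for tok in gloss.lower().split() if tok in SUBJECTIVE_CLUES)
    return "subjective" if hits >= threshold else "objective"
```

As the abstract notes, results with such resources vary with each lexicon's underlying definition of subjectivity.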
Artificial Intelligence | 2002
Katja Markert; Udo Hahn
We propose a new computational model for the resolution of metonymies, a particular type of figurative language. Typically, metonymies are considered a violation of semantic constraints (e.g., those expressed by selectional restrictions) that requires some repair mechanism (e.g., type coercion) for proper interpretation. We reject this view, arguing that it misses out on the interpretation of a considerable number of utterances. Instead, we treat literal and figurative language on a par, computing both kinds of interpretation independently of each other as long as their semantic representation structures are consistent with the underlying knowledge representation structures of the domain of discourse. The following general heuristic principles apply for making reasonable selections from the emerging readings. We argue that the embedding of utterances in a coherent discourse context is as important for recognizing and interpreting metonymic utterances as intrasentential semantic constraints. Therefore, in our approach, (metonymic or literal) interpretations that establish referential cohesion are preferred over ones that do not. In addition, metonymic interpretations that conform to a metonymy schema are preferred over metonymic ones that do not, and metonymic interpretations that are in conformance with knowledge-based aptness conditions are preferred over metonymic ones that are not. We lend further credence to our model by discussing empirical data from an evaluation study which highlights the importance of the discourse embedding of metonymy interpretation for both anaphora and metonymy resolution.
Meeting of the Association for Computational Linguistics | 2003
Malvina Nissim; Katja Markert
We present a supervised machine learning algorithm for metonymy resolution, which exploits the similarity between examples of conventional metonymy. We show that syntactic head-modifier relations are a high precision feature for metonymy recognition but suffer from data sparseness. We partially overcome this problem by integrating a thesaurus and introducing simpler grammatical features, thereby preserving precision and increasing recall. Our algorithm generalises over two levels of contextual similarity. Resulting inferences exceed the complexity of inferences undertaken in word sense disambiguation. We also compare automatic and manual methods for syntactic feature extraction.
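Backing off a sparse head-modifier feature to a thesaurus cluster, so that unseen but similar heads share a feature value, could look like the sketch below. The toy thesaurus entry and feature naming are assumptions, standing in for an automatically acquired resource:

```python
# Toy thesaurus: each anchor word maps to a cluster of similar heads.
THESAURUS = {"announce": {"declare", "state", "report"}}

def grammatical_feature(head, relation, thesaurus=THESAURUS):
    """Map a head-modifier relation (e.g. subject-of 'declare') to a
    feature, backing off the head to its thesaurus anchor so similar
    heads yield the same feature and data sparseness is reduced."""
    for anchor, cluster in thesaurus.items():
        if head == anchor or head in cluster:
            return f"{relation}-of-{anchor}"
    return f"{relation}-of-{head}"
```

This is the kind of generalisation that preserves the precision of head-modifier features while increasing recall.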
North American Chapter of the Association for Computational Linguistics | 2009
Fangzhong Su; Katja Markert
We supplement WordNet entries with information on the subjectivity of its word senses. Supervised classifiers that operate on word sense definitions in the same way that text classifiers operate on web or newspaper texts need large amounts of training data. The resulting data sparseness problem is aggravated by the fact that dictionary definitions are very short. We propose a semi-supervised minimum cut framework that makes use of both WordNet definitions and its relation structure. The experimental results show that it outperforms supervised minimum cut as well as standard supervised, non-graph classification, reducing the error rate by 40%. In addition, the semi-supervised approach achieves the same results as the supervised framework with less than 20% of the training data.
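The core mechanic of a minimum-cut classifier is partitioning a graph by a minimum s-t cut, where the source and sink stand for the two labels (here, subjective and objective seeds), other nodes are word senses, and edge capacities encode similarity links such as WordNet relations. A minimal self-contained sketch using Edmonds-Karp max-flow follows; the toy graph is illustrative, not the paper's construction:

```python
from collections import deque

def min_cut_partition(capacity, source, sink):
    """Return the source side of a minimum s-t cut of the graph given
    by `capacity`, a dict mapping (u, v) edges to capacities.
    Uses Edmonds-Karp (BFS augmenting paths) to find max flow first."""
    flow = {}
    nodes = {u for edge in capacity for u in edge}

    def residual(u, v):
        return capacity.get((u, v), 0) - flow.get((u, v), 0)

    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual(u, v) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        # Push the bottleneck capacity along the path found.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual(u, v) for u, v in path)
        for u, v in path:
            flow[u, v] = flow.get((u, v), 0) + bottleneck
            flow[v, u] = flow.get((v, u), 0) - bottleneck
    # Source side of the cut: nodes still reachable in the residual graph.
    reachable, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        for v in nodes:
            if v not in reachable and residual(u, v) > 0:
                reachable.add(v)
                queue.append(v)
    return reachable
```

In the semi-supervised setting, edges from unlabelled sense nodes (via definitions and WordNet relations) let the cut propagate labels beyond the annotated seeds.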
Empirical Methods in Natural Language Processing | 2002
Katja Markert; Malvina Nissim
We reformulate metonymy resolution as a classification task. This is motivated by the regularity of metonymic readings and makes general classification and word sense disambiguation methods available for metonymy resolution. We then present a case study for location names, presenting both a corpus of location names annotated for metonymy as well as experiments with a supervised classification algorithm on this corpus. We especially explore the contribution of features used in word sense disambiguation to metonymy resolution.
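Treating metonymy resolution as classification means extracting WSD-style contextual features for each occurrence of a location name and feeding them to a standard classifier. The feature names and context window below are illustrative assumptions:

```python
def metonymy_features(tokens, target_index):
    """WSD-style features for a possibly metonymic location name at
    `target_index`: the target word plus a two-token context window."""
    feats = {"word=" + tokens[target_index].lower()}
    for offset in (-2, -1, 1, 2):
        j = target_index + offset
        if 0 <= j < len(tokens):
            feats.add(f"ctx{offset}={tokens[j].lower()}")
    return feats
```

For a sentence like "England won the match", such features would let a supervised classifier learn that a sports-verb context signals a place-for-team reading.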
Meeting of the Association for Computational Linguistics | 2007
Katja Markert; Malvina Nissim
We provide an overview of the metonymy resolution shared task organised within SemEval-2007. We describe the problem, the data provided to participants, and the evaluation measures we used to assess performance. We also give an overview of the systems that have taken part in the task, and discuss possible directions for future work.
Language Resources and Evaluation | 2009
Katja Markert; Malvina Nissim
We describe the first shared task for figurative language resolution, which was organised within SemEval-2007 and focused on metonymy. The paper motivates the linguistic principles of data sampling and annotation and shows the task’s feasibility via human agreement. The five participating systems mainly used supervised approaches exploiting a variety of features, of which grammatical relations proved to be the most useful. We compare the systems’ performance to automatic baselines as well as to a manually simulated approach based on selectional restriction violations, showing some limitations of this more traditional approach to metonymy recognition. The main problem supervised systems encountered is data sparseness, since metonymies in general tend to occur more rarely than literal uses. Also, within metonymies, the reading distribution is skewed towards a few frequent metonymy types. Future task developments should focus on addressing this issue.