Hessel Haagsma
University of Groningen
Publications
Featured research published by Hessel Haagsma.
North American Chapter of the Association for Computational Linguistics | 2016
Johannes Bjerva; Johannes Bos; Hessel Haagsma
We participated in the shared task on meaning representation parsing (Task 8 at SemEval-2016) with the aim of investigating whether we could use Boxer, an existing open-domain semantic parser, for this task. However, the meaning representations produced by Boxer, Discourse Representation Structures (DRSs), differ considerably from Abstract Meaning Representations (AMRs), the target meaning representations of the shared task. Our hybrid conversion method (involving lexical adaptation as well as post-processing of the output) failed to produce state-of-the-art results. Nonetheless, it obtained F-scores of 53% on the development data and 47% on the test data (50% unofficially).
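As an illustration of how such output is typically scored, the sketch below computes a simplified triple-overlap F-score between a predicted and a gold AMR. It assumes both graphs have already been reduced to sets of (relation, source, target) triples with aligned variables; the official Smatch metric used for AMR evaluation additionally searches over variable mappings, so this is only a rough stand-in, not the shared task's scorer.

```python
# Simplified Smatch-style F-score over AMR triples (illustrative only):
# real Smatch also searches over variable alignments.

def triple_f_score(predicted, gold):
    """Precision/recall-based F1 over two sets of AMR triples."""
    predicted, gold = set(predicted), set(gold)
    matched = len(predicted & gold)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical triples of the form (relation, source, target).
pred = {("instance", "w", "want-01"), ("ARG0", "w", "b"), ("instance", "b", "boy")}
gold = {("instance", "w", "want-01"), ("ARG0", "w", "b"), ("instance", "b", "girl")}
print(f"F1 = {triple_f_score(pred, gold):.2f}")  # 0.67
```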
Cross-Language Evaluation Forum | 2017
Masha Medvedeva; Hessel Haagsma; Malvina Nissim
User profiling on social media data is normally done within a supervised setting. A typical limitation of supervised models trained on data from a specific genre is their restricted portability to other genres. Cross-genre models were developed in the context of PAN 2016, where systems were trained on tweets and tested on other, non-tweet social media data. Did the model that achieved the best results at this task get lucky, or was it truly designed in a cross-genre manner, with features general enough to capture demographics beyond Twitter? We explore this question via a series of in-genre and cross-genre experiments on English and Spanish using the best performing system at PAN 2016, and discover that portability is successful to a certain extent, provided that the sub-genres involved are close enough. In such cases, it is also more beneficial to do cross-genre than in-genre modelling if the cross-genre setting can benefit from larger amounts of training data than those available in-genre.
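A minimal sketch of such an in-genre versus cross-genre comparison, assuming placeholder datasets (the tweet and blog variables below are hypothetical) and a generic n-gram classifier rather than the actual PAN 2016 system:

```python
# In-genre vs. cross-genre evaluation sketch; data loaders are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def evaluate(train_texts, train_labels, test_texts, test_labels):
    """Train a simple word n-gram classifier and return test accuracy."""
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    model.fit(train_texts, train_labels)
    return accuracy_score(test_labels, model.predict(test_texts))

# Hypothetical genre-specific corpora (texts paired with demographic labels):
# in_genre = evaluate(tweet_train_X, tweet_train_y, tweet_test_X, tweet_test_y)
# cross_genre = evaluate(tweet_train_X, tweet_train_y, blog_X, blog_y)
# print(f"in-genre: {in_genre:.2f}, cross-genre: {cross_genre:.2f}")
```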
Meeting of the Association for Computational Linguistics | 2016
Hessel Haagsma
Singleton (or non-coreferential) mentions are a problem for coreference resolution systems, and identifying singletons before mentions are linked improves resolution performance. Here, a singleton detection system based on word embeddings and neural networks is presented, which achieves state-of-the-art performance (79.6% accuracy) on the CoNLL-2012 shared task development set. Extrinsic evaluation with the Stanford and Berkeley coreference resolution systems shows significant improvement for the former, but not for the latter. The results show the potential of using neural networks and word embeddings for improving both singleton detection and coreference resolution.
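The sketch below illustrates the general idea of singleton detection as binary classification over averaged word embeddings with a small feed-forward network. The embedding table, feature set, and architecture are toy stand-ins, not the system evaluated in the paper.

```python
# Toy singleton detector: feed-forward network over averaged mention embeddings.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Stand-in embedding table; a real system would use pre-trained embeddings.
embeddings = {w: rng.normal(size=50) for w in ["the", "president", "nobody", "it"]}

def mention_vector(tokens, dim=50):
    """Average the embeddings of the mention's tokens."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

# X: mention vectors; y: 1 = singleton (non-coreferential), 0 = coreferential.
X = np.stack([mention_vector(["the", "president"]), mention_vector(["nobody"])])
y = np.array([0, 1])
clf = MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000, random_state=0).fit(X, y)
print(clf.predict([mention_vector(["nobody"])]))  # expected: [1]
```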
Proceedings of the Fourth Workshop on Metaphor in NLP | 2016
Hessel Haagsma; Johannes Bjerva
Recent work on metaphor processing often employs selectional preference information. We present a comparison of different approaches to the modelling of selectional preferences, based on various ways of generalizing over corpus frequencies. We evaluate on the VU Amsterdam Metaphor Corpus, a broad corpus of metaphor. We find that using only selectional preference information is enough to outperform an all-metaphor baseline classification, but that generalization through prediction or clustering is not beneficial. A possible explanation lies in the nature of the evaluation data and in the limited power of selectional preference information on its own for detecting non-novel metaphor. To better investigate the role of metaphor type in metaphor detection, we suggest that a resource with annotation of novel metaphor be created.
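As a rough illustration of the non-generalizing, frequency-based variant, the sketch below scores verb-object pairs by their smoothed relative frequency in a toy parsed corpus; the example pairs, smoothing constant, and vocabulary size are illustrative assumptions, not the paper's setup.

```python
# Frequency-based selectional preference score for verb-object pairs.
from collections import Counter, defaultdict

# (verb, direct_object) pairs as they might come from a dependency-parsed corpus.
pairs = [("drink", "water"), ("drink", "coffee"), ("drink", "coffee"),
         ("devour", "book"), ("devour", "sandwich")]

counts = defaultdict(Counter)
for verb, noun in pairs:
    counts[verb][noun] += 1

def preference(verb, noun, alpha=0.1, vocab_size=10_000):
    """Smoothed P(noun | verb) as a crude selectional preference score."""
    total = sum(counts[verb].values())
    return (counts[verb][noun] + alpha) / (total + alpha * vocab_size)

# A low score for a verb's argument can signal non-literal (metaphorical) usage.
print(preference("drink", "coffee"), preference("drink", "criticism"))
```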
Cross-Language Evaluation Forum | 2018
Angelo Basile; Gareth Dwyer; Maria Medvedeva; Josine Rawee; Hessel Haagsma; Malvina Nissim
A simple linear SVM with word and character n-gram features and minimal parameter tuning can identify the gender and the language variety (for English, Spanish, Arabic and Portuguese) of Twitter users with very high accuracy. All our attempts at improving performance by including more data, using smarter features, and employing more complex architectures plainly fail. In addition, we experiment with joint and multitask modelling, but find that these are clearly outperformed by single-task models. In the end, we submitted our simplest model to the PAN 2017 shared task on author profiling, obtaining an average accuracy of 0.86 on the test set, with performance on sub-tasks ranging from 0.68 to 0.98. These were the best results achieved at the competition overall. To allow lay people to easily use and see the value of machine learning for author profiling, we also built a web application on top of our models.
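A sketch in the spirit of the described system: a linear SVM over combined word and character n-gram TF-IDF features, built with scikit-learn. The exact n-gram ranges, preprocessing, and tuning of the submitted model may differ, and the example texts and labels below are placeholders.

```python
# Linear SVM over word + character n-gram TF-IDF features (illustrative setup).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

model = Pipeline([
    ("features", FeatureUnion([
        ("word", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
        ("char", TfidfVectorizer(analyzer="char", ngram_range=(3, 5))),
    ])),
    ("clf", LinearSVC()),
])

# Placeholder tweets and gender labels for demonstration only.
texts = ["just watched the game, unbelievable!!", "lovely afternoon for a long walk"]
labels = ["male", "female"]
model.fit(texts, labels)
print(model.predict(["what a game last night"]))
```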
Computational Linguistics | 2017
Malvina Nissim; Lasha Abzianidze; Kilian Evang; Rob van der Goot; Hessel Haagsma; Barbara Plank; Martijn Wieling
Shared tasks are indisputably drivers of progress and interest for problems in NLP. This is reflected by their increasing popularity, as well as by the fact that new shared tasks regularly emerge for under-researched and under-resourced topics, especially at workshops and smaller conferences. The general procedures and conventions for organizing a shared task have arisen organically over time (Paroubek, Chaudiron, and Hirschman 2007, Section 7). There is no consistent framework that describes how shared tasks should be organized. This is not a harmful thing per se, but we believe that shared tasks, and by extension the field in general, would benefit from some reflection on the existing conventions. This, in turn, could lead to the future harmonization of shared task procedures. Shared tasks revolve around two aspects: research advancement and competition. We see research advancement as the driving force and main goal behind organizing them. Competition is an instrument to encourage and promote participation.
Conference of the European Chapter of the Association for Computational Linguistics | 2017
Lasha Abzianidze; Johannes Bjerva; Kilian Evang; Hessel Haagsma; Rik van Noord; Pierre Ludmann; Duc-Duy Nguyen; Johannes Bos
Cross-Language Evaluation Forum | 2017
Angelo Basile; Gareth Dwyer; Maria Medvedeva; Josine Rawee; Hessel Haagsma; Malvina Nissim
CLEF (Working Notes) | 2016
Mart Busger op Vollenbroek; Talvany Carlotto; Tim Kreutz; Maria Medvedeva; Chris Pool; Johannes Bjerva; Hessel Haagsma; Malvina Nissim
Computational Linguistics in the Netherlands | 2015
Hessel Haagsma