Joachim Bingel
University of Copenhagen
Publications
Featured research published by Joachim Bingel.
Meeting of the Association for Computational Linguistics | 2016
Maria Barrett; Joachim Bingel; Frank Keller; Anders Søgaard
For many of the world’s languages, there are no or very few linguistically annotated resources. On the other hand, raw text, and often also dictionaries, can be harvested from the web for many of these languages, and part-of-speech taggers can be trained with these resources. At the same time, previous research shows that eye-tracking data, which can be obtained without explicit annotation, contains clues to part-of-speech information. In this work, we bring these two ideas together and show that given raw text, a dictionary, and eye-tracking data obtained from naive participants reading text, we can train a weakly supervised PoS tagger using a second-order HMM with maximum entropy emissions. The best model uses type-level aggregates of eye-tracking data and significantly outperforms a baseline that does not have access to eye-tracking data.
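A minimal sketch of the type-level aggregation idea highlighted above: token-level gaze measures are averaged per word type before being used as features. The feature names, data layout and values below are illustrative assumptions, not the paper's actual corpus or tagger.

```python
# Hypothetical sketch: averaging token-level gaze measures into
# type-level features, which could then feed the emission features of
# a weakly supervised (second-order, maxent-emission) HMM tagger.
from collections import defaultdict
from statistics import mean

# (word, first_fixation_ms, total_fixation_ms) triples as they might
# come from an eye-tracking corpus; values here are made up.
token_level = [
    ("the", 160, 180), ("the", 150, 170),
    ("house", 210, 340), ("runs", 230, 390), ("runs", 240, 410),
]

def type_level_aggregates(records):
    """Average each gaze measure over all occurrences of a word type."""
    by_type = defaultdict(list)
    for word, first_fix, total_fix in records:
        by_type[word.lower()].append((first_fix, total_fix))
    return {
        word: {
            "first_fixation": mean(v[0] for v in vals),
            "total_fixation": mean(v[1] for v in vals),
        }
        for word, vals in by_type.items()
    }

if __name__ == "__main__":
    for word, measures in type_level_aggregates(token_level).items():
        print(word, measures)
```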
Meeting of the Association for Computational Linguistics | 2017
Marcel Bollmann; Joachim Bingel; Anders Søgaard
Automated processing of historical texts often relies on pre-normalization to modern word forms. Training encoder-decoder architectures to solve such problems typically requires a lot of training data, which is not available for this task. We address this problem by using several novel encoder-decoder architectures, including a multi-task learning (MTL) architecture using a grapheme-to-phoneme dictionary as auxiliary data, pushing the state-of-the-art by an absolute 2% increase in performance. We analyze the induced models across 44 different texts from Early New High German. Interestingly, we observe that, as previously conjectured, multi-task learning can learn to focus attention during decoding, in ways remarkably similar to recently proposed attention mechanisms. This, we believe, is an important step toward understanding how MTL works.
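A minimal sketch of hard parameter sharing for sequence-to-sequence multi-task learning: one character-level encoder shared between a normalization decoder and a grapheme-to-phoneme (auxiliary) decoder. Dimensions, vocabulary sizes and training details are assumptions in PyTorch, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, x):                 # x: (batch, src_len)
        out, h = self.rnn(self.emb(x))    # h: (1, batch, hid_dim)
        return out, h

class TaskDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, y, h0):             # y: (batch, tgt_len)
        out, _ = self.rnn(self.emb(y), h0)
        return self.out(out)              # (batch, tgt_len, vocab)

encoder = SharedEncoder(vocab_size=60)
norm_decoder = TaskDecoder(vocab_size=60)   # main task: normalization
g2p_decoder = TaskDecoder(vocab_size=45)    # auxiliary: grapheme-to-phoneme

# Toy batches of character ids; a real setup would shift targets by one
# position for teacher forcing.
src = torch.randint(0, 60, (8, 12))
tgt_norm = torch.randint(0, 60, (8, 12))
tgt_g2p = torch.randint(0, 45, (8, 10))

_, h = encoder(src)
loss = nn.CrossEntropyLoss()(
    norm_decoder(tgt_norm, h).reshape(-1, 60), tgt_norm.reshape(-1)
) + nn.CrossEntropyLoss()(
    g2p_decoder(tgt_g2p, h).reshape(-1, 45), tgt_g2p.reshape(-1)
)
loss.backward()   # gradients from both tasks update the shared encoder
```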
Meeting of the Association for Computational Linguistics | 2016
Joachim Bingel; Anders Søgaard
We present a new, structured approach to text simplification using conditional random fields over top-down traversals of dependency graphs that jointly predicts possible compressions and paraphrases. Our model reaches readability scores comparable to word-based compression approaches across a range of metrics and human judgements while maintaining more of the important information.
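A minimal sketch of the traversal idea: a dependency tree is linearized top-down so that per-node decisions (keep / drop / paraphrase) can be predicted as a sequence labelling problem, e.g. with a linear-chain CRF. The example tree, traversal order and label set are illustrative assumptions.

```python
# Head index of each token (-1 = root), for
# "the old man kept all his money" (toy parse).
tokens = ["the", "old", "man", "kept", "all", "his", "money"]
heads  = [2, 2, 3, -1, 6, 6, 3]

def topdown_order(heads):
    """Return node indices in breadth-first, top-down order from the root."""
    children = {i: [] for i in range(len(heads))}
    root = None
    for i, h in enumerate(heads):
        if h == -1:
            root = i
        else:
            children[h].append(i)
    order, queue = [], [root]
    while queue:
        node = queue.pop(0)
        order.append(node)
        queue.extend(children[node])
    return order

if __name__ == "__main__":
    order = topdown_order(heads)
    # A CRF would assign one label per node in this order, jointly
    # deciding which subtrees to drop and which words to paraphrase.
    print([tokens[i] for i in order])
```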
Meeting of the Association for Computational Linguistics | 2016
Joachim Bingel; Maria Barrett; Anders Søgaard
Neuro-imaging studies on reading different parts of speech (PoS) report somewhat mixed results, yet some of them indicate different activations with different PoS. This paper addresses the difficulty of using fMRI to discriminate between linguistic tokens when reading running text, which stems from its low temporal resolution. We show that once we solve this problem, fMRI data contains a signal of PoS distinctions to the extent that it improves PoS induction with error reductions of more than 4%.
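A minimal sketch of the temporal-resolution problem mentioned above: with a volume acquired only every TR seconds, several words fall into the same scan, so word-level features must first be grouped per volume before any token-level (e.g. PoS) signal can be extracted. The timings and TR below are made-up values, not the paper's data.

```python
TR = 2.0  # repetition time in seconds between fMRI volumes

# (word, onset_in_seconds) as it might come from a reading log
words = [("the", 0.3), ("old", 0.7), ("man", 1.1),
         ("kept", 2.2), ("all", 2.8), ("his", 3.1), ("money", 3.6)]

def words_per_scan(words, tr):
    """Group words by the fMRI volume during which they were read."""
    scans = {}
    for word, onset in words:
        scans.setdefault(int(onset // tr), []).append(word)
    return scans

if __name__ == "__main__":
    for scan, ws in sorted(words_per_scan(words, TR).items()):
        print(f"volume {scan}: {ws}")
```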
Archive | 2018
Zeerak Waseem; James Thorne; Joachim Bingel
Accurately detecting hate speech using supervised classification is dependent on data that is annotated by humans. Attaining high agreement amongst annotators, however, is difficult due to the subjective nature of the task and the different cultural, geographic and social backgrounds of the annotators. Furthermore, existing datasets capture only single types of hate speech, such as sexism or racism, or single demographics, such as people living in the United States, which hurts recall when classifying data not represented in the training examples. End users of websites where hate speech may occur risk being exposed to explicit content because automatic hate speech detection systems fail to capture unseen forms of hate speech or hate speech directed at unseen groups.

In this paper, we investigate methods for bridging differences in annotation and data collection of abusive language tweets, such as different annotation schemes, labels, or geographic and cultural influences from data sampling. We consider three distinct sets of annotations, namely those provided by Waseem (2016), Waseem and Hovy (2016), and Davidson et al. (2017). Specifically, we train a machine learning model using a multi-task learning (MTL) framework, where typically some auxiliary task is learned alongside a main task in order to gain better performance on the latter. Our approach distinguishes itself from most previous work in that we aim to train a model that is robust across data originating from different distributions and labeled under differing annotation guidelines, and in that we treat these different datasets as different learning objectives in the way that classical work in multi-task learning does with different tasks. Here, we experiment with using fine-grained tags for annotation. Aided by the predictions of our models as well as the baseline models, we seek to show that it is possible to utilize distinct domains for classification, and to show how cultural contexts influence classifier performance, as the datasets we use are collected either exclusively from the U.S. (Davidson et al. 2017) or globally with no geographic restriction (Waseem 2016; Waseem and Hovy 2016).

Our choice of a multi-task learning set-up is motivated by a number of factors. Most importantly, MTL allows us to share knowledge between two or more objectives, such that we can leverage information encoded in one dataset to better fit another. As shown by Bingel and Søgaard (2017) and Martinez Alonso and Plank (2017), this is particularly promising when the auxiliary task has a more coarse-grained set of labels than the main task. Another benefit of MTL is that it lets us learn lower-level representations from greater amounts of data compared to a single-task setup. This, in connection with MTL being known to act as a regularizer, is not only promising when it comes to fitting the training data, but also helps to prevent overfitting, especially when dealing with small datasets.
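A minimal sketch of the multi-task set-up described above: a shared text encoder with one classification head per dataset / annotation scheme, so that tweets labeled under different guidelines can be trained jointly even though their label sets differ. The dimensions, label counts and head names below are assumptions in PyTorch, not the authors' actual model.

```python
import torch
import torch.nn as nn

class TweetEncoder(nn.Module):
    """Shared encoder whose parameters are updated by every dataset."""
    def __init__(self, vocab_size=10000, emb_dim=100, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, x):                # x: (batch, seq_len)
        _, h = self.rnn(self.emb(x))
        return h.squeeze(0)              # (batch, hid_dim)

encoder = TweetEncoder()
heads = nn.ModuleDict({
    "waseem_hovy_2016": nn.Linear(128, 3),   # e.g. racism / sexism / none
    "davidson_2017":    nn.Linear(128, 3),   # hate / offensive / neither
})

def task_loss(task, batch_x, batch_y):
    """Encode with the shared encoder, classify with the task's own head."""
    logits = heads[task](encoder(batch_x))
    return nn.CrossEntropyLoss()(logits, batch_y)

# Toy joint update over two of the objectives.
x = torch.randint(0, 10000, (16, 30))
loss = task_loss("waseem_hovy_2016", x, torch.randint(0, 3, (16,))) \
     + task_loss("davidson_2017", x, torch.randint(0, 3, (16,)))
loss.backward()   # both heads and the shared encoder receive gradients
```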
arXiv: Machine Learning | 2017
Sebastian Ruder; Joachim Bingel; Isabelle Augenstein; Anders Søgaard
Conference of the European Chapter of the Association for Computational Linguistics | 2017
Joachim Bingel; Anders Søgaard
North American Chapter of the Association for Computational Linguistics | 2018
Joachim Bingel; Maria Barrett; Sigrid Klerke
International Conference on Computational Linguistics | 2018
Joachim Bingel; Gustavo Paetzold; Anders Søgaard