Koen Deschacht
Katholieke Universiteit Leuven
Publications
Featured research published by Koen Deschacht.
Empirical Methods in Natural Language Processing | 2009
Koen Deschacht; Marie-Francine Moens
Semantic Role Labeling (SRL) has proved to be a valuable tool for performing automatic analysis of natural language texts. Currently, however, most systems rely on a large training set that is manually annotated, an effort that needs to be repeated whenever different languages or a different set of semantic roles is used in a certain application. A possible solution to this problem is semi-supervised learning, where a small set of training examples is automatically expanded using unlabeled texts. We present the Latent Words Language Model, a language model that learns word similarities from unlabeled texts. We use these similarities in different semi-supervised SRL methods, either as additional features or to automatically expand a small training set. We evaluate the methods on the PropBank dataset and find that for small training sizes our best performing system achieves an error reduction of 33.27% in F1-measure compared to a state-of-the-art supervised baseline.
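As a rough sketch of the training-set expansion idea (not the paper's exact procedure), one could substitute words in a small annotated sentence with words the similarity model considers related and reuse the existing role labels. The similarity table and labels below are made up for illustration.

```python
# Hypothetical word-similarity table, standing in for the similarities the
# latent words model would learn from unlabeled text.
similar_words = {
    "bought": [("purchased", 0.4), ("acquired", 0.3)],
    "car":    [("vehicle", 0.5), ("truck", 0.2)],
}

def expand_example(tokens, labels, min_sim=0.3):
    """Generate extra SRL training examples by swapping in related words while
    keeping the original role labels (a naive form of training-set expansion)."""
    new_examples = []
    for i, tok in enumerate(tokens):
        for alt, sim in similar_words.get(tok, []):
            if sim >= min_sim:
                new_tokens = tokens[:i] + [alt] + tokens[i + 1:]
                new_examples.append((new_tokens, labels))
    return new_examples

tokens = ["John", "bought", "a", "car"]
labels = ["ARG0", "V", "B-ARG1", "I-ARG1"]
for ex in expand_example(tokens, labels):
    print(ex)
```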
Computer Speech & Language | 2012
Koen Deschacht; Jan De Belder; Marie-Francine Moens
We present a new generative model of natural language, the latent words language model. This model uses a latent variable for every word in a text that represents synonyms or related words in the given context. We develop novel methods to train this model and to find the expected value of these latent variables for a given unseen text. The learned word similarities help to reduce the sparseness problems of traditional n-gram language models. We show that the model significantly outperforms interpolated Kneser-Ney smoothing and class-based language models on three different corpora. Furthermore, the latent variables are useful features for information extraction. We show that for both semantic role labeling and word sense disambiguation, the performance of a supervised classifier increases when incorporating these variables as extra features. This improvement is especially large when using only a small annotated corpus for training.
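Conceptually, each observed word is paired with a distribution over related "latent" words, and those distributions are marginalised over when scoring text. The toy sketch below illustrates that marginalisation for a single bigram with made-up numbers; it is not the paper's training or inference procedure.

```python
# Made-up distributions over related words; in the paper these are learned
# from unlabeled text.
related = {
    "car":  {"car": 0.7, "vehicle": 0.2, "automobile": 0.1},
    "fast": {"fast": 0.8, "quick": 0.2},
}

# A bigram model defined over the related ("latent") words.
latent_bigram = {
    ("vehicle", "fast"): 0.30,
    ("car", "fast"): 0.20,
    ("automobile", "quick"): 0.10,
}

def bigram_prob(prev_word, word, floor=1e-4):
    """P(word | prev_word), marginalising over the latent words of both
    positions. Unseen latent bigrams get a small floor instead of zero."""
    total = 0.0
    for h_prev, p_prev in related.get(prev_word, {prev_word: 1.0}).items():
        for h, p in related.get(word, {word: 1.0}).items():
            total += p_prev * p * latent_bigram.get((h_prev, h), floor)
    return total

print(bigram_prob("car", "fast"))   # gains probability mass via "vehicle"
print(bigram_prob("dog", "fast"))   # a word without related entries falls back to itself
```

The point of the sketch is only that rare surface bigrams can borrow probability mass from related-word bigrams, which is how the learned similarities ease sparseness.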
Multimedia Tools and Applications | 2010
Gert-Jan Poulisse; Marie-Francine Moens; Tomas Dekens; Koen Deschacht
In this paper, we describe an approach to segmenting news video based on the perceived shift in content, using features spanning multiple modalities. We investigate a number of multimedia features, which serve as potential indicators of a change in story, in order to determine which are the most effective. The efficacy of our approach is demonstrated by the performance of our prototype, where a number of feature combinations demonstrate up to an 18% improvement in WindowDiff score compared to other state-of-the-art story segmenters. In our investigation, there is no single clearly superior feature; rather, the best segmentation occurs when there is synergy between multiple features. A further investigation into the effect on segmentation performance of varying the number of training examples versus the number of features used reveals that having better feature combinations is more important than having more training examples. Our work suggests that it is possible to train robust story segmenters for news video using only a handful of broadcasts, provided a good initial feature selection is made.
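For reference, the quality measure used here is WindowDiff (Pevzner and Hearst, 2002), which slides a fixed-size window over the sequence and counts the positions where the hypothesised and reference segmentations disagree on the number of boundaries inside the window. A minimal implementation (not the authors' evaluation code) looks like this:

```python
def window_diff(reference, hypothesis, k=None):
    """WindowDiff: fraction of windows of size k in which the reference and
    hypothesis contain a different number of boundaries. Both inputs are
    equal-length 0/1 sequences (1 = boundary after this position); lower is better."""
    n = len(reference)
    assert len(hypothesis) == n
    if k is None:
        # Conventional choice: half the average reference segment length.
        k = max(1, round(n / (2 * (sum(reference) + 1))))
    disagreements = 0
    for i in range(n - k):
        if sum(reference[i:i + k]) != sum(hypothesis[i:i + k]):
            disagreements += 1
    return disagreements / (n - k)

# Example: a hypothesis that misses one boundary and adds a spurious one.
ref = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
hyp = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
print(window_diff(ref, hyp, k=3))
```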
Database and Expert Systems Applications | 2008
Erik Boiy; Koen Deschacht; Marie-Francine Moens
We automatically construct a dictionary of visual (possible to perceive on a picture) or non-visual (impossible to perceive directly on a picture) entities and attributes, based on statistical association techniques used in data mining. We compute whether certain words that could function as entities or attributes of an entity are correlated with texts that describe images and use these words for the detection of visual nouns and visual adjectives. We compare our corpus-based approach with a knowledge-rich approach based on WordNet, and with a combination of both approaches.
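One of the simplest association statistics that fits this setup is pointwise mutual information between a word and the subset of texts that describe images. The sketch below uses a tiny invented corpus and add-one smoothing; it only illustrates the kind of association measure involved, not the dictionary construction in the paper.

```python
import math
from collections import Counter

# Toy corpus: texts that describe images ("visual" contexts) versus texts that do not.
visual_texts = ["a red car on the street", "a dog runs on the beach"]
other_texts = ["the idea seems interesting", "freedom is an abstract concept"]

def visualness_pmi(word):
    """Smoothed PMI between a word and the image-describing texts, usable as a
    crude score of how 'visual' the word is."""
    visual_tokens = Counter(w for t in visual_texts for w in t.split())
    other_tokens = Counter(w for t in other_texts for w in t.split())
    n_visual = sum(visual_tokens.values())
    n_total = n_visual + sum(other_tokens.values())
    p_word = (visual_tokens[word] + other_tokens[word] + 1) / (n_total + 1)
    p_word_and_visual = (visual_tokens[word] + 1) / (n_total + 1)
    p_visual = n_visual / n_total
    return math.log(p_word_and_visual / (p_word * p_visual))

for w in ["car", "dog", "idea", "freedom"]:
    print(w, round(visualness_pmi(w), 3))
```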
Journal of Visual Communication and Image Representation | 2013
Koen Deschacht; Tinne Tuytelaars; Marie-Francine Moens
In this paper, we focus on the problem of automated video annotation. We report on the application of naming faces in soap series by using the weak supervision of narrative texts that describe the events in the video and that are drafted by fans. Several unsupervised methods that operate without any manual labeling of exemplar faces, as well as methods that use a limited number of labeled exemplars, are presented and evaluated. All methods exploit the multiple co-occurrences between faces shown in the video and names mentioned in the texts to compute the strength of the linking, and reinforce this coupling by means of an Expectation Maximization algorithm. We show that the unsupervised methods attain competitive results without any prior human effort. The results show F1 values between 80% and 86% for the recognition of face-name pairs without any human supervision. These figures rise only slightly when a number of faces are manually labeled beforehand. The study gives insight into the benefits and bottlenecks of the proposed approaches, and an error analysis yields guidelines for choosing among the techniques.
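The following toy sketch captures the co-occurrence-plus-EM idea with invented data: face tracks and the names mentioned in the aligned text are repeatedly soft-assigned to each other, so the pairings that best explain the co-occurrences get reinforced. It is a deliberate simplification, not one of the models evaluated in the paper.

```python
import numpy as np

# Hypothetical data: each item pairs a face-track cluster id with the character
# names mentioned in the aligned narrative text. Real inputs would come from a
# face tracker and fan-written episode summaries.
observations = [
    (0, ["Marlena", "John"]),
    (0, ["Marlena"]),
    (1, ["John", "Sami"]),
    (1, ["John"]),
    (2, ["Sami", "Marlena"]),
]
names = sorted({n for _, ns in observations for n in ns})
clusters = sorted({c for c, _ in observations})

# P(face cluster | name), initialised uniformly.
gen = np.full((len(names), len(clusters)), 1.0 / len(clusters))

for _ in range(30):
    counts = np.zeros_like(gen)
    for c, mentioned in observations:
        idx = [names.index(n) for n in mentioned]
        # E-step: responsibility of each mentioned name for this face,
        # proportional to how well that name currently explains the cluster.
        resp = gen[idx, c]
        resp = resp / resp.sum()
        counts[idx, c] += resp
    # M-step: re-estimate P(cluster | name) from the soft counts.
    gen = (counts + 1e-6) / (counts + 1e-6).sum(axis=1, keepdims=True)

for c in clusters:
    print(f"face cluster {c} -> {names[int(np.argmax(gen[:, c]))]}")
```

Note how the ambiguous third cluster is resolved indirectly: once "Marlena" is strongly tied to cluster 0, "Sami" becomes the better explanation for cluster 2.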
British Machine Vision Conference | 2010
Chris Engels; Koen Deschacht; Jan Hendrik Becker; Tinne Tuytelaars; Sien Moens; Luc Van Gool
Given a video and associated text, we propose an automatic annotation scheme in which we employ a latent topic model to generate topic distributions from weighted text and then modify these distributions based on visual similarity. We apply this scheme to location annotation of a television series for which transcripts are available. The topic distributions allow us to avoid explicit classification, which is useful in cases where the exact number of locations is unknown. Moreover, many locations are unique to a single episode, making it impossible to obtain representative training data for a supervised approach. Our method first segments the episode into scenes by fusing cues from both images and text. We then assign location-oriented weights to the text and generate topic distributions for each scene using Latent Dirichlet Allocation. Finally, we update the topic distributions using the distributions of visually similar scenes. We formulate our visual similarity between scenes as an Earth Mover’s Distance problem. We quantitatively validate our multi-modal approach to segmentation and qualitatively evaluate the resulting location annotations. Our results demonstrate that we are able to generate accurate annotations, even for locations only seen in a single episode.
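A minimal sketch of the final refinement step, with invented numbers: each scene's text-based topic distribution is mixed with a similarity-weighted average of the other scenes' distributions. In the paper the similarities are derived from an Earth Mover's Distance between scenes; here the similarity matrix is simply given.

```python
import numpy as np

# Hypothetical per-scene topic distributions as LDA would output (rows sum to 1).
topic_dist = np.array([
    [0.7, 0.2, 0.1],   # scene 0
    [0.1, 0.8, 0.1],   # scene 1
    [0.4, 0.3, 0.3],   # scene 2: the text alone is ambiguous
])
# Hypothetical visual similarity between scenes (in the paper, from an EMD formulation).
visual_sim = np.array([
    [1.0, 0.1, 0.9],
    [0.1, 1.0, 0.2],
    [0.9, 0.2, 1.0],
])

def refine(topic_dist, visual_sim, alpha=0.5):
    """Mix each scene's topic distribution with a similarity-weighted average of
    the other scenes' distributions (a simple take on updating topics using
    visually similar scenes)."""
    sim = visual_sim.copy()
    np.fill_diagonal(sim, 0.0)                       # ignore self-similarity
    weights = sim / sim.sum(axis=1, keepdims=True)   # normalise neighbour weights
    neighbour_avg = weights @ topic_dist
    refined = alpha * topic_dist + (1 - alpha) * neighbour_avg
    return refined / refined.sum(axis=1, keepdims=True)

print(refine(topic_dist, visual_sim).round(3))       # scene 2 drifts toward scene 0
```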
International Conference on Electronic Publishing | 2007
Erik Boiy; Pieter Hens; Koen Deschacht; Marie-Francine Moens
Meeting of the Association for Computational Linguistics | 2007
Koen Deschacht; Marie-Francine Moens
Proceedings of the 9th Dutch-Belgian Information Retrieval Workshop | 2009
Javier Arias; Koen Deschacht; Marie-Francine Moens
Lecture Notes in Computer Science | 2008
Koen Deschacht; Marie-Francine Moens