Johannes Deleu | Researchain

Publication

Featured research published by Johannes Deleu.

international world wide web conferences | 2015

Topical Word Importance for Fast Keyphrase Extraction

Lucas Sterckx; Thomas Demeester; Johannes Deleu; Chris Develder

We propose an improvement on a state-of-the-art keyphrase extraction algorithm, Topical PageRank (TPR), incorporating topical information from topic models. While the original algorithm requires a random walk for each topic in the topic model being used, ours is independent of the topic model, computing only a single PageRank for each text regardless of the number of topics in the model. This drastically increases speed and enables its use on large collections of text with vast topic models, without altering the performance of the original algorithm.
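The single-PageRank idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each word receives one topical importance weight, computed here as the cosine similarity between a word's topic vector and the document's topic vector, and that this weight biases the restart distribution of a single random walk over a word co-occurrence graph. The function names, the window size, and the input format (`word_topic`, `doc_topic`) are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cooccurrence_graph(tokens, window=2):
    """Undirected weighted graph linking words that co-occur within `window` positions."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word][other] += 1.0
                graph[other][word] += 1.0
    return graph


def single_topical_pagerank(tokens, word_topic, doc_topic,
                            damping=0.85, iters=50, window=2):
    """Run ONE biased PageRank per document instead of one per topic.

    word_topic: dict word -> list of per-topic weights (assumed input format).
    doc_topic:  list of per-topic weights for the document (assumed input format).
    """
    graph = cooccurrence_graph(tokens, window)
    words = sorted(graph)
    if not words:
        return {}
    zero = [0.0] * len(doc_topic)
    # Assumption: one topical importance value per word, via cosine similarity.
    importance = {w: cosine(word_topic.get(w, zero), doc_topic) for w in words}
    total = sum(importance.values())
    if total > 0:
        restart = {w: importance[w] / total for w in words}
    else:
        restart = {w: 1.0 / len(words) for w in words}
    out_weight = {w: sum(graph[w].values()) for w in words}
    rank = {w: 1.0 / len(words) for w in words}
    for _ in range(iters):
        new_rank = {}
        for w in words:
            # The graph is undirected, so the neighbours of w are its in-links.
            incoming = sum(rank[v] * graph[v][w] / out_weight[v] for v in graph[w])
            new_rank[w] = (1.0 - damping) * restart[w] + damping * incoming
        rank = new_rank
    return rank
```

Because the topic model only enters through the restart weights, the cost of the walk in this sketch does not grow with the number of topics.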

Knowledge Based Systems | 2016

Knowledge base population using semantic label propagation

Lucas Sterckx; Thomas Demeester; Johannes Deleu; Chris Develder

Training relation extractors for the purpose of automated knowledge base population requires the availability of sufficient training data. The amount of manual labeling can be significantly reduced by applying distant supervision, which generates training data by aligning large text corpora with existing knowledge bases. This typically results in a highly noisy training set, where many training sentences do not express the intended relation. In this paper, we propose to combine distant supervision with minimal human supervision by annotating features (in particular shortest dependency paths) rather than complete relation instances. Such feature labeling eliminates noise from the initial training set, resulting in a significant increase of precision at the expense of recall. We further improve on this approach by introducing the Semantic Label Propagation (SLP) method, which uses the similarity between low-dimensional representations of candidate training instances to again extend the (filtered) training set in order to increase recall while maintaining high precision. Our strategy is evaluated on an established test collection designed for knowledge base population (KBP) from the TAC KBP English slot filling task. The experimental results show that SLP leads to substantial performance gains when compared to existing approaches while requiring an almost negligible human annotation effort.

international conference on communication software and networks | 2009

CROEQS: Contemporaneous Role Ontology-based Expanded Query Search - Implementation and Evaluation

Stijn Vandamme; Johannes Deleu; Tim Wauters; Brecht Vermeulen; Filip De Turck

Searching annotated items in multimedia databases is becoming increasingly important. The traditional approach is to build a search engine based on textual metadata. However, in manually annotated multimedia databases, the conceptual level of a query might differ from the level of abstraction of the item annotations. To address this problem, we present CROEQS, a semantically enhanced search engine. It allows the user to query the annotated persons not only on their name, but also on their roles at the time the multimedia item was broadcast. We also present the ontology used to expand such queries: it allows us to semantically represent the domain knowledge on people fulfilling a role during a temporal interval in general, and politicians holding a political office specifically. The evaluation results show that query expansion using data retrieved from an ontology considerably filters the result set, although there is a performance penalty.

Expert Systems With Applications | 2018

An attentive neural architecture for joint segmentation and parsing and its application to real estate ads

Giannis Bekoulis; Johannes Deleu; Thomas Demeester; Chris Develder

We convert textual real estate ads to structured representations as property trees. We propose a new neural model for joint segmentation and parsing in 1 step. Our joint model outperforms 2-step pipeline methods by 3.42%. Including attention models yields an additional 2.1% F1 improvement.

In processing human-produced text using natural language processing (NLP) techniques, two fundamental subtasks that arise are (i) segmentation of the plain text into meaningful subunits (e.g., entities), and (ii) dependency parsing, to establish relations between subunits. Such structural interpretation of text provides essential building blocks for upstream expert system tasks: e.g., from interpreting textual real estate ads, one may want to provide an accurate price estimate and/or provide selection filters for end users looking for a particular property, all of which could rely on knowing the types and number of rooms, etc. In this paper, we develop a relatively simple and effective neural joint model that performs both segmentation and dependency parsing together, instead of one after the other as in most state-of-the-art works. We will focus in particular on the real estate ad setting, aiming to convert an ad to a structured description, which we name property tree, comprising the tasks of (1) identifying important entities of a property (e.g., rooms) from classifieds and (2) structuring them into a tree format. In this work, we propose a new joint model that is able to tackle the two tasks simultaneously and construct the property tree by (i) avoiding the error propagation that would arise from performing the subtasks one after the other in a pipelined fashion, and (ii) exploiting the interactions between the subtasks. For this purpose, we perform an extensive comparative study of the pipeline methods and the new proposed joint model, reporting an improvement of over three percentage points in the overall edge F1 score of the property tree. Also, we propose attention methods to encourage our model to focus on salient tokens during the construction of the property tree. Thus we experimentally demonstrate the usefulness of attentive neural architectures for the proposed joint model, showcasing a further improvement of two percentage points in edge F1 score for our application. While the results demonstrated are for the particular real estate setting, the model is generic in nature, and thus could be equally applied to other expert system scenarios requiring the general tasks of both (i) detecting entities (segmentation) and (ii) establishing relations among them (dependency parsing).

international world wide web conferences | 2015

When Topic Models Disagree: Keyphrase Extraction with Multiple Topic Models

Lucas Sterckx; Thomas Demeester; Johannes Deleu; Chris Develder

We explore how the unsupervised extraction of topic-related keywords benefits from combining multiple topic models. We show that averaging multiple topic models, inferred from different corpora, leads to more accurate keyphrases than when using a single topic model and other state-of-the-art techniques. The experiments confirm the intuitive idea that a prerequisite for the significant benefit of combining multiple models is that the models should be sufficiently different, i.e., they should provide distinct contexts in terms of topical word importance.

workshop on image analysis for multimedia interactive services | 2009

CROEQS: Contemporaneous role ontology-based expanded query search - Analysis of the result set size

Stijn Vandamme; Johannes Deleu; Tim Wauters; Brecht Vermeulen; Filip De Turck

We designed CROEQS, which allows users to search a multimedia database. The query formulation is not only based on text, but also on higher-level attributes: users can query videos about persons based on additional information about that person. The aim of CROEQS is to build a more efficient search engine, able to respond to queries on a higher level. To this end it makes use of an ontology that represents the domain knowledge on people fulfilling a role during a temporal interval in general, and politicians holding a political office specifically. In this paper, we focus on the impact of the semantic clause on the result set size. The evaluation results demonstrate a significant impact, which can be used for targeted search, and show that temporal information is useful if the period during which the temporal statement holds is limited.

language resources and evaluation | 2018

Creation and evaluation of large keyphrase extraction collections with multiple opinions

Lucas Sterckx; Thomas Demeester; Johannes Deleu; Chris Develder

While several automatic keyphrase extraction (AKE) techniques have been developed and analyzed, there is little consensus on the definition of the task and a lack of overview of the effectiveness of different techniques. Proper evaluation of keyphrase extraction requires large test collections with multiple opinions, currently not available for research. In this paper, we (i) present a set of test collections derived from various sources with multiple annotations (which we also refer to as opinions in the remainder of the paper) for each document, (ii) systematically evaluate keyphrase extraction using several supervised and unsupervised AKE techniques, and (iii) experimentally analyze the effects of disagreement on AKE evaluation. Our newly created set of test collections spans different types of topical content from general news and magazines, and is annotated with multiple annotations per article by a large annotator panel. Our annotator study shows that for a given document there seems to be a large disagreement on the preferred keyphrases, suggesting the need for multiple opinions per document. A first systematic evaluation of ranking and classification of keyphrases using both unsupervised and supervised AKE techniques on the test collections shows a superior effectiveness of supervised models, even for a low annotation effort and with basic positional and frequency features, and highlights the importance of a suitable keyphrase candidate generation approach. We also study the influence of multiple opinions, training data and document length on evaluation of keyphrase extraction. Our new test collection for keyphrase extraction is one of the largest of its kind and will be made available to stimulate future work to improve reliable evaluation of new keyphrase extractors.

international conference on future energy systems | 2018

Achieving Scalable Model-Free Demand Response in Charging an Electric Vehicle Fleet with Reinforcement Learning

Nasrin Sadeghianpourhamami; Johannes Deleu; Chris Develder

To achieve coordinated electric vehicle (EV) charging with demand response (DR), a model-free approach using reinforcement learning (RL) is an attractive proposition. Using RL, the DR algorithm is defined as a Markov decision process (MDP). Initial work in this area comprises algorithms to control just one EV at a time, because of scalability challenges when taking coupling between EVs into account. In this paper, we propose a novel MDP definition for charging an EV fleet. More specifically, we propose (1) a relatively compact aggregate state and action space representation, and (2) a batch RL algorithm (i.e., an instance of fitted Q-iteration, FQI) to learn the optimal EV charging policy.
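As a rough illustration of batch RL with fitted Q-iteration, the sketch below learns a cost-minimizing policy from a fixed batch of transitions. It is a toy under stated assumptions, not the paper's method: the "regressor" is a plain per-(state, action) average instead of a learned function approximator, the objective is assumed to be cost minimization, and the aggregate state and action representation proposed in the paper is not reproduced. All names and the transition format are hypothetical.

```python
from collections import defaultdict


def fitted_q_iteration(transitions, actions, gamma=0.9, n_iters=50):
    """Batch fitted Q-iteration with a tabular averaging 'regressor'.

    transitions: list of (state, action, cost, next_state, done) tuples,
                 collected beforehand (assumed format).
    actions:     list of all available actions.
    Unseen (state, action) pairs default to a Q-value of 0.0 in this sketch.
    """
    q = defaultdict(float)
    for _ in range(n_iters):
        targets = defaultdict(list)
        for state, action, cost, next_state, done in transitions:
            # Bellman target: immediate cost plus discounted best future cost.
            future = 0.0 if done else min(q[(next_state, b)] for b in actions)
            targets[(state, action)].append(cost + gamma * future)
        # "Fit" step: average the targets observed for each (state, action).
        q = defaultdict(float, {k: sum(v) / len(v) for k, v in targets.items()})
    return q


def greedy_policy(q, state, actions):
    """Pick the action with the lowest estimated cost-to-go."""
    return min(actions, key=lambda a: q[(state, a)])


# Toy batch: one EV needs one unit of charge within two timeslots.
# State = (timeslot, remaining_need); action 1 = charge, 0 = idle.
# Charging costs 3.0 in slot 0 and 1.0 in slot 1; leaving unmet costs 10.0.
toy_batch = [
    ((0, 1), 1, 3.0, (1, 0), False),
    ((0, 1), 0, 0.0, (1, 1), False),
    ((1, 1), 1, 1.0, (2, 0), True),
    ((1, 1), 0, 10.0, (2, 1), True),
    ((1, 0), 0, 0.0, (2, 0), True),
]
toy_q = fitted_q_iteration(toy_batch, actions=[0, 1], gamma=1.0, n_iters=5)
```

In the toy batch, the learned policy defers charging to the cheaper second slot. A realistic setup would replace the averaging step with a regression model so that Q-values generalize across states.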

Expert Systems With Applications | 2018

Joint entity recognition and relation extraction as a multi-head selection problem

Giannis Bekoulis; Johannes Deleu; Thomas Demeester; Chris Develder

State-of-the-art models for joint entity recognition and relation extraction strongly rely on external natural language processing (NLP) tools such as POS (part-of-speech) taggers and dependency parsers. Thus, the performance of such joint models depends on the quality of the features obtained from these NLP tools. However, these features are not always accurate for various languages and contexts. In this paper, we propose a joint neural model which performs entity recognition and relation extraction simultaneously, without the need for any manually extracted features or the use of any external tool. Specifically, we model the entity recognition task using a CRF (Conditional Random Fields) layer and the relation extraction task as a multi-head selection problem (i.e., potentially identify multiple relations for each entity). We present an extensive experimental setup to demonstrate the effectiveness of our method using datasets from various contexts (i.e., news, biomedical, real estate) and languages (i.e., English, Dutch). Our model outperforms the previous neural models that use automatically extracted features, while it performs within a reasonable margin of feature-based neural models, or even beats them.
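The decoding side of multi-head selection can be sketched as follows. This is only an illustration under assumptions: the raw scores are taken as given (in a real model they would come from a neural scoring layer, omitted here), each (token, head, relation) triple is scored independently with a sigmoid, and every triple above a threshold is kept, so a token may end up with several heads. The function name, the score layout, and the threshold value are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def multi_head_decode(scores, relations, threshold=0.5):
    """Select every (token, head, relation) triple whose probability exceeds the threshold.

    scores[i][j][k] is the raw score that token j is a head of token i
    under relation relations[k] (assumed layout). Because each triple is
    judged independently, one token can have multiple heads and relations.
    """
    selected = []
    for i, per_head in enumerate(scores):
        for j, per_relation in enumerate(per_head):
            for k, raw in enumerate(per_relation):
                if sigmoid(raw) > threshold:
                    selected.append((i, j, relations[k]))
    return selected
```

Independent sigmoids, instead of a softmax over candidate heads, are what allow more than one relation per entity in this sketch.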

forum for information retrieval evaluation | 2014

On the robustness of event detection evaluation: a case study

Matthias Feys; Thomas Demeester; Blaz Fortuna; Johannes Deleu; Chris Develder

Research on evaluation of IR systems has led to the insight that a robust evaluation strategy requires tests on a large number of events/queries. However, especially for event detection, the number of manually labeled events may be limited. In this paper we investigate how to optimize the evaluation strategy in those cases to maximize robustness. We also introduce two new vector space models for event detection that aim to incorporate bursty information of terms and compare these with existing models. Experiments show that exploiting graded relevance levels reduces the impact of subjectivity and ambiguity of event detection evaluation. We also show that although user disagreement is significant, it has no real impact on result ranking.
