Hendrik ter Horst
Bielefeld University
Publications
Featured research published by Hendrik ter Horst.
Knowledge Acquisition, Modeling and Management | 2016
Sherzod Hakimov; Hendrik ter Horst; Soufian Jebbara; Matthias Hartung; Philipp Cimiano
Named Entity Disambiguation (NED) is the task of disambiguating named entities that have already been recognized in a natural language text by linking them to their corresponding entities in a knowledge base such as DBpedia. It is an important step in transforming unstructured text into structured knowledge. Previous work on this task has shown a strong impact of graph-based methods such as PageRank on entity disambiguation. Other approaches rely on distributional similarity between an article and the textual description of a candidate entity. However, the combined impact of these different feature groups has not been explored to a sufficient extent. In this paper, we present a novel approach that exploits an undirected probabilistic model to combine different types of features for named entity disambiguation. Capitalizing on Markov Chain Monte Carlo sampling, our model is capable of exploiting the complementary strengths of graph-based and textual features. We analyze the impact of these features and their combination on named entity disambiguation. In an evaluation on the GERBIL benchmark, our model compares favourably to the current state of the art in 8 out of 14 data sets.
Applications of Natural Language to Data Bases | 2018
Hendrik ter Horst; Matthias Hartung; Roman Klinger; Nicole Brazda; Hans Werner Müller; Philipp Cimiano
Template-based information extraction generalizes over standard token-level binary relation extraction in the sense that it attempts to fill a complex template comprising multiple slots on the basis of information given in a text. In the approach presented in this paper, templates and possible fillers are defined by a given ontology. The information extraction task consists in filling these slots within a template with previously recognized entities or literal values. We cast the task as a structure prediction problem and propose a joint probabilistic model based on factor graphs to account for the interdependence in slot assignments. Inference is implemented as a heuristic building on Markov chain Monte Carlo sampling. As our main contribution, we investigate the impact of soft constraints modeled as single slot factors which measure preferences of individual slots for ranges of fillers, as well as pairwise slot factors modeling the compatibility between fillers of two slots. Instead of relying on expert knowledge to acquire such soft constraints, in our approach they are directly captured in the model and learned from training data. We show that both types of factors are effective in improving information extraction on a real-world data set of full-text papers from the biomedical domain. Pairwise factors are shown to particularly improve the performance of our extraction model by up to \(+0.43\) points in precision, leading to an \(F_1\) score of 0.90 for individual templates.
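The interplay of single slot factors and pairwise slot factors described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the slot names, candidate fillers, and weights below are hypothetical and hand-set (in the paper they are learned from training data), and the sampler is a plain Metropolis-style search that stands in for the MCMC-based inference heuristic.

```python
import math
import random

# Hypothetical slots and candidate fillers (illustrative only).
CANDIDATES = {
    "organism": ["rat", "mouse"],
    "dosage_unit": ["mg/kg", "ml"],
}

# Single slot factors: preference of a slot for a filler (hand-set weights).
UNARY = {
    ("organism", "rat"): 1.0,
    ("organism", "mouse"): 0.8,
    ("dosage_unit", "mg/kg"): 0.5,
    ("dosage_unit", "ml"): 0.6,
}

# Pairwise slot factors: compatibility between the fillers of two slots.
PAIRWISE = {
    ("rat", "mg/kg"): 2.0,
    ("mouse", "ml"): 0.2,
}


def score(assignment, use_pairwise=True):
    """Unnormalized log-score of a complete slot assignment."""
    total = sum(UNARY[(slot, filler)] for slot, filler in assignment.items())
    if use_pairwise:
        pair = (assignment["organism"], assignment["dosage_unit"])
        total += PAIRWISE.get(pair, 0.0)
    return total


def sample_best(steps=500, seed=0):
    """Metropolis-style search: resample one slot at a time and
    return the highest-scoring assignment visited."""
    rng = random.Random(seed)
    # Start from the last candidate of every slot (an arbitrary choice).
    state = {slot: fillers[-1] for slot, fillers in CANDIDATES.items()}
    best = dict(state)
    for _ in range(steps):
        proposal = dict(state)
        slot = rng.choice(list(CANDIDATES))
        proposal[slot] = rng.choice(CANDIDATES[slot])
        delta = score(proposal) - score(state)
        # Always accept improvements; accept worse states with prob. exp(delta).
        if delta >= 0 or rng.random() < math.exp(delta):
            state = proposal
        if score(state) > score(best):
            best = dict(state)
    return best


if __name__ == "__main__":
    print(sample_best())
```

With these toy weights, the single slot factors alone would prefer "ml" for the dosage unit, while the pairwise factor between "rat" and "mg/kg" makes the joint assignment with "mg/kg" the highest-scoring one. This is the kind of interdependence between slots that a joint model can capture.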
Language, Data, and Knowledge | 2017
Hendrik ter Horst; Matthias Hartung; Philipp Cimiano
The problems of recognizing mentions of entities in texts and linking them to unique knowledge base identifiers have received considerable attention in recent years. In this paper we present a probabilistic system based on undirected graphical models that jointly addresses both the entity recognition and the linking task. Our framework considers the span of mentions of entities as well as the corresponding knowledge base identifier as random variables and models the joint assignment using a factorized distribution. We show that our approach can be easily applied to different technical domains by merely exchanging the underlying ontology. On the task of recognizing and linking disease names, we show that our approach outperforms the state-of-the-art systems DNorm and TaggerOne, as well as two strong lexicon-based baselines. On the task of recognizing and linking chemical names, our system achieves comparable performance to the state-of-the-art.
Reasoning Web International Summer School | 2018
Hendrik ter Horst; Matthias Hartung; Philipp Cimiano
In this tutorial we discuss how Conditional Random Fields can be applied to knowledge base population tasks. We are in particular interested in the cold-start setting which assumes as given an ontology that models classes and properties relevant for the domain of interest, and an empty knowledge base that needs to be populated from unstructured text. More specifically, cold-start knowledge base population consists in predicting semantic structures from an input document that instantiate classes and properties as defined in the ontology. Considering knowledge base population as structure prediction, we frame the task as a statistical inference problem which aims at predicting the most likely assignment to a set of ontologically grounded output variables given an input document. In order to model the conditional distribution of these output variables given the input variables derived from the text, we follow the approach adopted in Conditional Random Fields. We decompose the cold-start knowledge base population task into the specific problems of entity recognition, entity linking and slot-filling, and show how they can be modeled using Conditional Random Fields.
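For reference, the conditional distribution used in Conditional Random Fields is commonly written in the following factorized form. The notation is an assumption introduced here, not taken from the tutorial: \(x\) denotes the input variables derived from the text, \(y\) the output variables, \(\mathcal{F}\) the set of factors, \(y_{\Psi}\) the output variables in the scope of factor \(\Psi\), \(f_{\Psi}\) its feature function, and \(\theta\) the parameter vector.

```latex
\[
p(y \mid x) = \frac{1}{Z(x)} \prod_{\Psi \in \mathcal{F}} \exp\bigl(\langle \theta, f_{\Psi}(y_{\Psi}, x) \rangle\bigr),
\qquad
Z(x) = \sum_{y'} \prod_{\Psi \in \mathcal{F}} \exp\bigl(\langle \theta, f_{\Psi}(y'_{\Psi}, x) \rangle\bigr)
\]
\[
\hat{y} = \operatorname*{arg\,max}_{y} \; p(y \mid x)
\]
```

Under this formulation, predicting the most likely assignment amounts to finding \(\hat{y}\). Because \(Z(x)\) does not depend on \(y\), the search can compare unnormalized scores.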
Proceedings of the 3rd Joint Ontology Workshops (JOWO): Ontologies and Data in the Life Sciences | 2017
Nicole Brazda; Hendrik ter Horst; Matthias Hartung; Cord Wiljes; Veronica Estrada; Roman Klinger; Wolfgang Kuchinke; Hans Werner Müller; Philipp Cimiano
Meeting of the Association for Computational Linguistics | 2018
Matthias Hartung; Hendrik ter Horst; Frank Grimm; Tim Diekmann; Roman Klinger; Philipp Cimiano
Proceedings of DGfS/CL Poster Session | 2018
Annika Schwitteck; Hendrik ter Horst; Matthias Hartung
Proceedings of the SEMANTICS 2017 Poster and Demo Track | 2017
Alexander Borowi; Hendrik ter Horst; Matthias Hartung; Veronica Estrada; Nicole Brazda; Hans Werner Müller; Philipp Cimiano
Archive | 2017
Hendrik ter Horst; Matthias Hartung; Roman Klinger; Matthias Zwick; Philipp Cimiano
Proceedings of the 18th Spinal Research Network Meeting (ISRT 2016) | 2016
Nicole Brazda; Veronica Estrada; Tarek Kirchhoffer; Hendrik ter Horst; Matthias Hartung; Cord Wiljes; Roman Klinger; Philipp Cimiano; Hans Werner Müller