Max De Wilde
Université libre de Bruxelles
Publications
Featured research published by Max De Wilde.
Literary and Linguistic Computing | 2015
Seth van Hooland; Max De Wilde; Ruben Verborgh; Thomas Steiner; Rik Van de Walle
Unstructured metadata fields such as ‘description’ offer tremendous value for users to understand cultural heritage objects. However, this type of narrative information is of little direct use within a machine-readable context due to its unstructured nature. This article explores the possibilities and limitations of named-entity recognition (NER) and term extraction (TE) to mine such unstructured metadata for meaningful concepts. These concepts can be used to leverage otherwise limited searching and browsing operations, but they can also play an important role in fostering Digital Humanities research. To catalyze experimentation with NER and TE, the article proposes an evaluation of the performance of three third-party entity extraction services through a comprehensive case study, based on the descriptive fields of the Smithsonian Cooper–Hewitt National Design Museum in New York. To cover both NER and TE, we first offer a quantitative analysis of named entities retrieved by the services in terms of precision and recall compared with a manually annotated gold-standard corpus, and then complement this approach with a more qualitative assessment of relevant terms extracted. Based on the outcomes of this double analysis, the conclusions present the added value of entity extraction services, but also indicate the dangers of uncritically using NER and/or TE, and by extension Linked Data principles, within the Digital Humanities. All metadata and tools used within the article are freely available, making it possible for researchers and practitioners to repeat the methodology. By doing so, the article offers a significant contribution towards understanding the value of entity recognition and disambiguation for the Digital Humanities.
Journal of the Association for Information Science and Technology | 2013
Seth van Hooland; Ruben Verborgh; Max De Wilde; Johannes Hercher; Erik Mannens; Rik Van de Walle
The concept of Linked Data has made its entrance in the cultural heritage sector due to its potential use for the integration of heterogeneous collections and for deriving additional value from existing metadata. However, practitioners and researchers alike need a better understanding of what outcome they can reasonably expect of the reconciliation process between their local metadata and established controlled vocabularies that are already part of the Linked Data cloud. This paper offers an in-depth analysis of how a locally developed vocabulary can be successfully reconciled with the Library of Congress Subject Headings (LCSH) and the Art & Architecture Thesaurus (AAT) with the help of a general-purpose tool for interactive data transformation (OpenRefine). Issues negatively affecting the reconciliation process are identified and solutions are proposed in order to derive maximum value from existing metadata and controlled vocabularies in an automated manner.
Advances in Databases and Information Systems | 2015
Max De Wilde
The relevance of Named-Entity Recognition and Entity Linking for cultural heritage institutions is evaluated through a case study involving the semantic enrichment of historical periodicals. A language-independent approach is proposed in order to improve the search experience of end-users by mapping entities to the Linked Open Data (LOD) cloud. Preliminary results show that a precision rate of almost 90% can be achieved with very little fine-tuning, while an increase in recall remains necessary.
Cataloging & Classification Quarterly | 2016
Raphaël Hubain; Max De Wilde; Seth van Hooland
Ensuring quick and consistent access to large collections of unstructured documents is one of the biggest challenges facing knowledge-intensive organizations. Designing specific vocabularies to index and retrieve documents is often deemed too expensive, full-text search being preferred despite its known limitations. However, the process of creating controlled vocabularies can be partly automated thanks to natural language processing and machine learning techniques. With a case study from the biopharmaceutical industry, we demonstrate how small organizations can use an automated workflow in order to create a controlled vocabulary to index unstructured documents in a semantically meaningful way.
Archive | 2013
Ruben Verborgh; Max De Wilde
Brussels Studies | 2018
Margot Waty; Seth van Hooland; Simon Hengchen; Mathias Coeckelbergs; Max De Wilde
Brussels Studies | 2018
Margot Waty; Seth van Hooland; Simon Hengchen; Mathias Coeckelbergs; Max De Wilde; Jean-Michel Decroly
Digital Humanities Quarterly | 2017
Max De Wilde; Simon Hengchen