Pierre-Yves Vandenbussche
Fujitsu
Publication
Featured research published by Pierre-Yves Vandenbussche.
Sprachwissenschaft | 2016
Pierre-Yves Vandenbussche; Ghislain Auguste Atemezing; María Poveda-Villalón; Bernard Vatant
One of the major barriers to the deployment of Linked Data is the difficulty that data publishers have in determining which vocabularies to use to describe the semantics of data. This system report describes Linked Open Vocabularies (LOV), a high-quality catalogue of reusable vocabularies for the description of data on the Web. The LOV initiative gathers and makes visible indicators that have not been previously harvested, such as the interconnections between vocabularies and the version history, along with the past and current referent (individual or organization). LOV goes beyond existing Semantic Web vocabulary search engines and takes into consideration the value's property type, matched with a query, to improve the scoring of vocabulary terms. By providing an extensive range of data access methods (SPARQL endpoint, API, data dump or UI), we try to facilitate the reuse of well-documented vocabularies in the Linked Data ecosystem. We conclude that the adoption of LOV in many applications and methods shows the benefits of such a set of vocabularies and related features to aid the design and publication of data on the Web.
Sprachwissenschaft | 2017
Pierre-Yves Vandenbussche; Jürgen Umbrich; Luca Matteis; Aidan Hogan; Carlos Buil-Aranda
Fujitsu Laboratories Limited; CONICYT/FONDECYT Project 3130617; FONDECYT Project 11140900; DGIP Project 116.24.1; Millennium Nucleus Center for Semantic Web Research NC120004
international conference on informatics and semiotics in organisations | 2014
Vivian Lee; Masatomo Goto; Bo Hu; Aisha Naseer; Pierre-Yves Vandenbussche; Gofran Shakair; Eduarda Mendes Rodrigues
In this paper, we report on a recent initiative that exploits Linked Data for financial data integration. Financial data are highly heterogeneous. Linked Data helps to reveal the true data semantics and “hidden” connections, upon which meaningful mappings can be constructed. The work reported in this paper has been well received at several public events and conferences, including the 26th XBRL conference, and involves the realisation of an XBRL (eXtensible Business Reporting Language) prototype called HIKAKU, which means “comparison” in Japanese. It demonstrates our approach to exploiting the power of Linked Data to enhance the flexibility of data integration in the financial domain.
language data and knowledge | 2017
Sameh K. Mohamed; Emir Muñoz; Vít Nováček; Pierre-Yves Vandenbussche
Relation paths are sequences of relations with inverse that allow for complete exploration of knowledge graphs in a two-way unconstrained manner. They are powerful enough to encode complex relationships between entities and are crucial in several contexts, such as knowledge base verification, rule mining, and link prediction. However, fundamental forms of reasoning such as containment and equivalence of relation paths have hitherto been ignored. Intuitively, two relation paths are equivalent if they share the same extension, i.e., set of source and target entity pairs. In this paper, we study the problem of containment as a means to find equivalent relation paths and show that it is very expensive in practice to enumerate paths between entities. We characterize the complexity of containment and equivalence of relation paths and propose a domain-independent and unsupervised method to obtain approximate equivalences ranked by a tri-criteria ranking function. We evaluate our algorithm using test cases over real-world data and show that we are able to find semantically meaningful equivalences efficiently.
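The definitions in this abstract (a path's extension as its set of source and target entity pairs, containment, and equivalence) can be illustrated with a brute-force sketch. It is not the approximate, ranked method the paper proposes; it only evaluates the definitions over one small graph. The `^` prefix for inverse relations, the function names, and the toy triples are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, relation, object)
Pair = Tuple[str, str]          # (source entity, target entity)


def step_pairs(triples: List[Triple], relation: str) -> Set[Pair]:
    """Pairs linked by one path step; a leading '^' (assumed notation) means inverse."""
    if relation.startswith("^"):
        name = relation[1:]
        return {(o, s) for s, p, o in triples if p == name}
    return {(s, o) for s, p, o in triples if p == relation}


def extension(triples: List[Triple], path: List[str]) -> Set[Pair]:
    """Set of (source, target) pairs connected by following the whole path."""
    if not path:
        return set()
    pairs = step_pairs(triples, path[0])
    for relation in path[1:]:
        # Index the next step by its source so the join is a dictionary lookup.
        by_source: Dict[str, Set[str]] = {}
        for a, b in step_pairs(triples, relation):
            by_source.setdefault(a, set()).add(b)
        pairs = {(s, t) for s, mid in pairs for t in by_source.get(mid, ())}
    return pairs


def is_contained(triples: List[Triple], p: List[str], q: List[str]) -> bool:
    """p is contained in q on this graph if every pair of p is also a pair of q."""
    return extension(triples, p) <= extension(triples, q)


def is_equivalent(triples: List[Triple], p: List[str], q: List[str]) -> bool:
    """p and q are equivalent on this graph if they share the same extension."""
    return extension(triples, p) == extension(triples, q)


# Hypothetical toy knowledge graph.
TOY: List[Triple] = [
    ("alice", "motherOf", "bob"),
    ("bob", "fatherOf", "carol"),
    ("alice", "grandparentOf", "carol"),
    ("dave", "grandparentOf", "erin"),
]

if __name__ == "__main__":
    print(is_contained(TOY, ["motherOf", "fatherOf"], ["grandparentOf"]))
    print(is_equivalent(TOY, ["motherOf", "fatherOf"], ["grandparentOf"]))
```

On the toy graph, the two-step path is contained in `grandparentOf` but not equivalent to it, because `grandparentOf` also links a pair that the two-step path does not reach. Computing full extensions like this requires enumerating paths between entities, which is the cost the abstract describes as very expensive in practice.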
acm symposium on applied computing | 2018
Sameh K. Mohamed; Vít Nováček; Pierre-Yves Vandenbussche
Graph feature models facilitate efficient and interpretable predictions of missing links in knowledge bases with network structure (i.e. knowledge graphs). However, existing graph feature models, e.g. Subgraph Feature Extractor (SFE) or its predecessor, Path Ranking Algorithm (PRA) and its variants, depend on a limited set of graph features: connecting paths. Features of this type may be missing for many interesting potential links, though, in which case the existing techniques cannot provide any predictions at all. In this paper, we address the limitations of existing works by introducing a new graph-based feature model, Distinct Subgraph Paths (DSP). Our model uses a richer set of graph features and can therefore predict new relevant facts that neither SFE nor PRA or its variants can discover in principle. We use a standard benchmark data set to show that the DSP model performs better than the state of the art, SFE (ANYREL) and PRA, in terms of mean average precision (MAP), mean reciprocal rank (MRR) and Hits@5, 10, 20, with no extra computational cost incurred.
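The evaluation measures named in this abstract (MAP, MRR and Hits@k) have standard definitions, sketched below for readers unfamiliar with them. The function names and the input conventions (1-based ranks of the correct answer, and per-query lists of relevance flags in ranked order) are assumptions for illustration and are not taken from the paper's evaluation code.

```python
from typing import List, Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where rank is the 1-based position of the correct answer."""
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct answer is ranked within the top k."""
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if r <= k) / len(ranks)


def average_precision(relevance: Sequence[bool]) -> float:
    """Mean of precision@i taken at each position i that holds a relevant item."""
    hits = 0
    precisions: List[float] = []
    for position, relevant in enumerate(relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / position)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(ranked_lists: Sequence[Sequence[bool]]) -> float:
    """Average precision, averaged over all queries."""
    if not ranked_lists:
        return 0.0
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


if __name__ == "__main__":
    ranks = [1, 2, 4]                      # hypothetical ranks of the true answers
    print(mean_reciprocal_rank(ranks))
    print(hits_at_k(ranks, 2))
    print(mean_average_precision([[True, False, True], [False, True]]))
```

All three measures reward placing correct links near the top of the ranking. Hits@k ignores everything below position k, while MRR and MAP decay smoothly with rank.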
european semantic web conference | 2017
Luca Costabello; Pierre-Yves Vandenbussche; Gofran Shukair; Corine Deliot; Neil Wilson
We present a traffic analytics platform for servers that publish Linked Data. To the best of our knowledge, this is the first system that mines access logs of registered Linked Data servers to extract traffic insights on a daily basis and without human intervention. The framework extracts Linked Data-specific traffic metrics from log records of HTTP lookups and SPARQL queries, and provides insights not available in traditional web analytics tools. Among other things, we detect visitor sessions with a variant of hierarchical agglomerative clustering. We also identify workload peaks of SPARQL endpoints by detecting heavy and light SPARQL queries with supervised learning. The platform has been tested on 13 months of access logs of the British National Bibliography RDF dataset.
european conference on machine learning | 2017
Pasquale Minervini; Luca Costabello; Emir Muñoz; Vít Nováček; Pierre-Yves Vandenbussche
Learning embeddings of entities and relations using neural architectures is an effective method of performing statistical learning on large-scale relational data, such as knowledge graphs. In this paper, we consider the problem of regularizing the training of neural knowledge graph embeddings by leveraging external background knowledge. We propose a principled and scalable method for leveraging equivalence and inversion axioms during the learning process, by imposing a set of model-dependent soft constraints on the predicate embeddings. The method has several advantages: (i) the number of introduced constraints does not depend on the number of entities in the knowledge base; (ii) regularities in the embedding space effectively reflect available background knowledge; (iii) it yields more accurate results in link prediction tasks over non-regularized methods; and (iv) it can be adapted to a variety of models, without affecting their scalability properties. We demonstrate the effectiveness of the proposed method on several large knowledge graphs. Our evaluation shows that it consistently improves the predictive accuracy of several neural knowledge graph embedding models (for instance, the MRR of TransE on WordNet increases by 11%) without compromising their scalability properties.
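The abstract does not give the form of its model-dependent soft constraints, so the following is only a sketch under stated assumptions. For a translation-based model such as TransE, where a triple is scored by how close subject + relation is to object, an equivalence between predicates p and q suggests that their embeddings should be close, and an inversion axiom suggests that one embedding should be close to the negation of the other. The function names, the squared-distance penalty, and the `weight` parameter are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def soft_constraint_penalty(
    predicate_embeddings: Dict[str, Vector],
    equivalences: Sequence[Tuple[str, str]],
    inversions: Sequence[Tuple[str, str]],
    weight: float = 1.0,
) -> float:
    """Penalty to add to the training loss (assumed TransE-style form).

    Equivalence p = q:        penalise ||r_p - r_q||^2
    Inversion   p = inverse q: penalise ||r_p + r_q||^2
    """
    penalty = 0.0
    for p, q in equivalences:
        penalty += squared_distance(predicate_embeddings[p], predicate_embeddings[q])
    for p, q in inversions:
        negated_q = [-x for x in predicate_embeddings[q]]
        penalty += squared_distance(predicate_embeddings[p], negated_q)
    return weight * penalty


if __name__ == "__main__":
    # Hypothetical two-dimensional predicate embeddings.
    embeddings = {
        "partOf": [1.0, 2.0],
        "isPartOf": [1.0, 2.0],
        "hasPart": [-1.0, -2.0],
    }
    print(soft_constraint_penalty(
        embeddings,
        equivalences=[("partOf", "isPartOf")],
        inversions=[("partOf", "hasPart")],
    ))
```

The penalty has one term per axiom and touches only predicate embeddings, which is consistent with advantage (i) in the abstract: the number of constraints does not grow with the number of entities.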
Briefings in Bioinformatics | 2017
Vít Nováček; Pierre-Yves Vandenbussche; Emir Muñoz
Timely identification of adverse drug reactions (ADRs) is highly important in the domains of public health and pharmacology. Early discovery of potential ADRs can limit their effect on patient lives and also make drug development pipelines more robust and efficient. Reliable in silico prediction of ADRs can be helpful in this context, and thus, it has been intensely studied. Recent works achieved promising results using machine learning. The presented work focuses on machine learning methods that use drug profiles for making predictions and use features from multiple data sources. We argue that despite promising results, existing works have limitations, especially regarding flexibility in experimenting with different data sets and/or predictive models. We suggest addressing these limitations by generalizing the key principles used by the state of the art. Namely, we explore the effects of: (1) using knowledge graphs (machine-readable interlinked representations of biomedical knowledge) as a convenient uniform representation of heterogeneous data; and (2) casting ADR prediction as a multi-label ranking problem. We present a specific way of using knowledge graphs to generate different feature sets and demonstrate favourable performance of selected off-the-shelf multi-label learning models in comparison with existing works. Our experiments suggest better suitability of certain multi-label learning methods for applications where ranking is preferred. The presented approach can be easily extended to other feature sources or machine learning methods, making it flexible for experiments tuned toward specific requirements of end users. Our work also provides a clearly defined and reproducible baseline for any future related experiments.
international semantic web conference | 2013
Carlos Buil-Aranda; Aidan Hogan; Jürgen Umbrich; Pierre-Yves Vandenbussche
Ercim News | 2014
Pierre-Yves Vandenbussche; Bernard Vatant