Luis Galárraga
Télécom ParisTech
Publication
Featured research published by Luis Galárraga.
international world wide web conferences | 2013
Luis Galárraga; Christina Teflioudi; Katja Hose; Fabian M. Suchanek
Recent advances in information extraction have led to huge knowledge bases (KBs), which capture knowledge in a machine-readable format. Inductive Logic Programming (ILP) can be used to mine logical rules from the KB. These rules can help deduce and add missing knowledge to the KB. While ILP is a mature field, mining logical rules from KBs is different in two aspects: First, current rule mining systems are easily overwhelmed by the amount of data (state-of-the-art systems cannot even run on today's KBs). Second, ILP usually requires counterexamples. KBs, however, implement the open world assumption (OWA), meaning that absent data cannot be used as counterexamples. In this paper, we develop a rule mining model that is explicitly tailored to support the OWA scenario. It is inspired by association rule mining and introduces a novel measure for confidence. Our extensive experiments show that our approach outperforms state-of-the-art approaches in terms of precision and coverage. Furthermore, our system, AMIE, mines rules orders of magnitude faster than state-of-the-art approaches.
international world wide web conferences | 2014
Luis Galárraga; Katja Hose; Ralf Schenkel
The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications relying on efficient query processing. Confronted with such a trend, existing centralized state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for fast RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log and allocating the fragments to hosts in a cluster of machines. Furthermore, Partout's query optimizer produces efficient query execution plans for ad-hoc SPARQL queries.
web search and data mining | 2017
Luis Galárraga; Simon Razniewski; Antoine Amarilli; Fabian M. Suchanek
Knowledge bases such as Wikidata, DBpedia, or YAGO contain millions of entities and facts. In some knowledge bases, the correctness of these facts has been evaluated. However, much less is known about their completeness, i.e., the proportion of real facts that the knowledge bases cover. In this work, we investigate different signals to identify the areas where a knowledge base is complete. We show that we can combine these signals in a rule mining approach, which allows us to predict where facts may be missing. We also show that completeness predictions can help other applications such as fact prediction.
conference on information and knowledge management | 2013
Luis Galárraga; Nicoleta Preda; Fabian M. Suchanek
The Semantic Web has made huge progress in the last decade, and now comprises hundreds of knowledge bases (KBs). The Linked Open Data cloud connects the KBs in this Web of data. However, the links between the KBs are mostly concerned with the instances, not with the schema. Aligning the schemas is not easy, because the KBs may differ not just in their names for relations and classes, but also in their inherent structure. Therefore, we argue in this paper that advanced schema alignment is needed to tie the Semantic Web together. We put forward a particularly simple approach to illustrate how that might look.
asia-pacific web conference | 2014
Antoine Amarilli; Luis Galárraga; Nicoleta Preda; Fabian M. Suchanek
A knowledge base (KB) is a formal collection of knowledge about the world. In this paper, we explain how the YAGO KB is constructed. We also summarize our contributions to different aspects of KB management in general. One of these aspects is rule mining, i.e., the identification of patterns such as spouse(x,y) ∧ livesIn(x,z) ⇒ livesIn(y,z). Another aspect is the incompleteness of KBs. We propose to integrate data from Web Services into the KB in order to fill the gaps. Further, we show how the overlap between existing KBs can be used to align them, both in terms of instances and in terms of the schema. Finally, we show how KBs can be protected by watermarking.
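As a rough illustration of what evaluating such a rule involves, the sketch below counts the support and the standard (association-rule style) confidence of spouse(x,y) ∧ livesIn(x,z) ⇒ livesIn(y,z) over a toy KB. The facts, the triple layout, and the function name rule_stats are all assumptions made for this example; it is not the procedure used by YAGO or by any of the systems listed here.

```python
from collections import defaultdict

# Toy KB as (subject, relation, object) triples.
# All facts are invented purely for illustration.
facts = {
    ("anna", "spouse", "ben"),
    ("carl", "spouse", "dora"),
    ("emil", "spouse", "fay"),
    ("anna", "livesIn", "paris"),
    ("ben", "livesIn", "paris"),
    ("carl", "livesIn", "oslo"),
    ("dora", "livesIn", "rome"),
    ("emil", "livesIn", "lima"),
}


def rule_stats(kb):
    """Support and standard confidence of
    spouse(x,y) ∧ livesIn(x,z) ⇒ livesIn(y,z) (hypothetical helper)."""
    # Index livesIn facts by subject for quick lookup.
    lives_in = defaultdict(set)
    for s, r, o in kb:
        if r == "livesIn":
            lives_in[s].add(o)

    # Every (y, z) pair that the rule body predicts as livesIn(y, z).
    predictions = {
        (y, z)
        for x, r, y in kb
        if r == "spouse"
        for z in lives_in.get(x, ())
    }

    # Support: predictions that are already facts in the KB.
    support = sum(1 for y, z in predictions if z in lives_in.get(y, ()))

    # Standard confidence treats every unconfirmed prediction as wrong.
    confidence = support / len(predictions) if predictions else 0.0
    return support, confidence, predictions


support, confidence, predictions = rule_stats(facts)
print(support, round(confidence, 2), sorted(predictions))
```

On this toy data, the prediction livesIn(fay, lima) lowers the standard confidence even though the KB holds no residence for fay at all. Under the open world assumption, such an absent fact is not necessarily false, which is one reason a plain confidence count can be misleading on incomplete KBs.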
conference on information and knowledge management | 2014
Luis Galárraga
The continuous progress of Information Extraction (IE) techniques has led to the construction of large Knowledge Bases (KBs) containing facts about millions of entities such as people, organizations and places. KBs are important nowadays because they allow computers to understand the real world and are used in multiple domains and applications. Furthermore, the discovery of useful and non-trivial patterns in KBs, known as rule mining, opens the door for multiple applications in the areas of data analysis, prediction and automatic data engineering. In this article, we present an overview of our ongoing work on rule mining on KBs and some of its applications. The scale of current KBs as well as their inherent incompleteness and noise make this endeavour challenging.
very large data bases | 2015
Luis Galárraga; Christina Teflioudi; Katja Hose; Fabian M. Suchanek
conference on information and knowledge management | 2014
Luis Galárraga; Geremy Heitz; Kevin P. Murphy; Fabian M. Suchanek
international semantic web conference | 2017
Luis Galárraga; Kim Ahlstrøm Meyn Mathiassen; Katja Hose
Archive | 2015
Luis Galárraga; Christina Teflioudi; Katja Hose; Fabian M. Suchanek