Fabian M. Suchanek | Researchain

Publication

Featured research published by Fabian M. Suchanek.

web search and data mining | 2014

WebChild: harvesting and organizing commonsense knowledge from the web

Niket Tandon; Gerard de Melo; Fabian M. Suchanek; Gerhard Weikum

This paper presents a method for automatically constructing a large commonsense knowledge base, called WebChild, from Web content. WebChild contains triples that connect nouns with adjectives via fine-grained relations like hasShape, hasTaste, evokesEmotion, etc. The arguments of these assertions, nouns and adjectives, are disambiguated by mapping them onto their proper WordNet senses. Our method is based on semi-supervised Label Propagation over graphs of noisy candidate assertions. We automatically derive seeds from WordNet and by pattern matching from Web text collections. The Label Propagation algorithm provides us with domain sets and range sets for 19 different relations, and with confidence-ranked assertions between WordNet senses. Large-scale experiments demonstrate the high accuracy (more than 80 percent) and coverage (more than four million fine-grained disambiguated assertions) of WebChild.
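To make the idea of label propagation concrete, here is a minimal sketch of a generic variant: seed nodes keep their labels, and every other node repeatedly takes the weighted average of its neighbours' scores. It is not WebChild's actual algorithm; the graph, the edge weights, and the function name `propagate` are made up for illustration.

```python
def propagate(edges, seeds, iterations=20):
    """Generic label propagation with clamped seeds (illustrative only).

    edges: dict mapping each node to a list of (neighbour, weight) pairs.
    seeds: dict mapping seed nodes to a fixed score in [0, 1].
    """
    scores = {node: seeds.get(node, 0.0) for node in edges}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in edges.items():
            if node in seeds:
                # Seed labels are never overwritten.
                updated[node] = seeds[node]
                continue
            total_weight = sum(weight for _, weight in neighbours)
            if total_weight == 0:
                updated[node] = 0.0
            else:
                updated[node] = sum(
                    scores[other] * weight for other, weight in neighbours
                ) / total_weight
        scores = updated
    return scores


# Hypothetical candidate assertions; the edges stand for "these candidates are similar".
edges = {
    "hasTaste(apple, sweet)": [("hasTaste(pie, sweet)", 1.0)],
    "hasTaste(pie, sweet)": [
        ("hasTaste(apple, sweet)", 1.0),
        ("hasTaste(rock, sweet)", 1.0),
    ],
    "hasTaste(rock, sweet)": [("hasTaste(pie, sweet)", 1.0)],
}
seeds = {"hasTaste(apple, sweet)": 1.0, "hasTaste(rock, sweet)": 0.0}
scores = propagate(edges, seeds)
```

The unlabeled candidate ends up with a score between those of its two seed neighbours, which can then be read as a confidence for ranking.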

north american chapter of the association for computational linguistics | 2016

But What Do We Actually Know?

Simon Razniewski; Fabian M. Suchanek

Knowledge bases such as Wikidata, DBpedia, YAGO, or the Google Knowledge Vault collect a vast number of facts about the world. But while quite some facts are known about the world, little is known about how much is unknown. For example, while the knowledge base may tell us that Barack Obama is the father of Malia Obama and Sasha Obama, it does not tell us whether these are all of his children. This is not just an epistemic challenge, but also a practical problem for data producers and consumers. We envision that KBs become annotated with information about their recall on specific topics. We show what such annotations could look like, how they could be obtained, and survey related work.

very large data bases | 2014

Knowledge bases in the age of big data analytics

Fabian M. Suchanek; Gerhard Weikum

This tutorial gives an overview of state-of-the-art methods for the automatic construction of large knowledge bases and for harnessing them for data and text analytics. It covers both big-data methods for building knowledge bases and knowledge bases being assets for big-data applications. The tutorial also points out challenges and research opportunities.

web search and data mining | 2017

Predicting Completeness in Knowledge Bases

Luis Galárraga; Simon Razniewski; Antoine Amarilli; Fabian M. Suchanek

Knowledge bases such as Wikidata, DBpedia, or YAGO contain millions of entities and facts. In some knowledge bases, the correctness of these facts has been evaluated. However, much less is known about their completeness, i.e., the proportion of real facts that the knowledge bases cover. In this work, we investigate different signals to identify the areas where a knowledge base is complete. We show that we can combine these signals in a rule mining approach, which allows us to predict where facts may be missing. We also show that completeness predictions can help other applications such as fact prediction.

international semantic web conference | 2016

YAGO: A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames

Thomas Rebele; Fabian M. Suchanek; Johannes Hoffart; Joanna Biega; Erdal Kuzey; Gerhard Weikum

YAGO is a large knowledge base that is built automatically from Wikipedia, WordNet and GeoNames. The project combines information from Wikipedias in 10 different languages into a coherent whole, thus giving the knowledge a multilingual dimension. It also attaches spatial and temporal information to many facts, and thus allows the user to query the data over space and time. YAGO focuses on extraction quality and achieves a manually evaluated precision of 95%. In this paper, we explain how YAGO is built from its sources, how its quality is evaluated, how a user can access it, and how other projects utilize it.

international workshop on the web and databases | 2015

IBEX: Harvesting Entities from the Web Using Unique Identifiers

Aliaksandr Talaika; Joanna Biega; Antoine Amarilli; Fabian M. Suchanek

In this paper we study the prevalence of unique entity identifiers on the Web. These are, e.g., ISBNs (for books), GTINs (for commercial products), DOIs (for documents), email addresses, and others. We show how these identifiers can be harvested systematically from Web pages, and how they can be associated with human-readable names for the entities at large scale. Starting with a simple extraction of identifiers and names from Web pages, we show how we can use the properties of unique identifiers to filter out noise and clean up the extraction result on the entire corpus. The end result is a database of millions of uniquely identified entities of different types, with an accuracy of 73-96% and a very high coverage compared to existing knowledge bases. We use this database to compute novel statistics on the presence of products, people, and other entities on the Web.
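One property of some identifiers that could serve as a noise filter is a check digit. The sketch below validates ISBN-13 candidates with the standard ISBN-13 checksum (weights alternating 1 and 3, total divisible by 10). Whether IBEX uses exactly this filter is not stated in the abstract; the function name `is_valid_isbn13` is hypothetical.

```python
def is_valid_isbn13(candidate):
    """Return True if the string is a well-formed ISBN-13 with a correct check digit."""
    # Hyphens and spaces are common separators in printed ISBNs.
    compact = candidate.replace("-", "").replace(" ", "")
    if len(compact) != 13 or not compact.isdigit():
        return False
    # Weights alternate 1, 3, 1, 3, ... over all 13 digits.
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)
        for index, digit in enumerate(compact)
    )
    return total % 10 == 0


# Candidate strings as they might be extracted from a Web page.
candidates = ["978-0-306-40615-7", "978-0-306-40615-8", "12345"]
kept = [c for c in candidates if is_valid_isbn13(c)]
```

Only the first candidate survives the filter: the second has a wrong check digit and the third has the wrong length.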

international workshop on the web and databases | 2015

The elephant in the room: getting value from Big Data

Serge Abiteboul; Luna Dong; Oren Etzioni; Divesh Srivastava; Gerhard Weikum; Julia Stoyanovich; Fabian M. Suchanek

Big Data, and its 4 Vs – volume, velocity, variety, and veracity – have been at the forefront of societal, scientific and engineering discourse. Arguably the most important 5th V, value, is not talked about as much. How can we make sure that our data is not just big, but also valuable? WebDB 2015 has as its theme “Freshness, Correctness, Quality of Information and Knowledge on the Web”. The workshop attracted 31 submissions, of which the best 9 were selected for presentation at the workshop, and for publication in the proceedings. To set the stage, we have interviewed several prominent members of the data management community, soliciting their opinions on how we can ensure that data is not just available in quantity, but also in quality. In this interview Serge Abiteboul, Oren Etzioni, Divesh Srivastava with Luna Dong, and Gerhard Weikum shared with us their motivation for doing research in the area of data quality, and discussed their current work and their view on the future of the field. This interview appeared as a SIGMOD Blog article.

international semantic web conference | 2017

VICKEY: Mining Conditional Keys on Knowledge Bases

Danai Symeonidou; Luis Galárraga; Nathalie Pernelle; Fatiha Saïs; Fabian M. Suchanek

A conditional key is a key constraint that is valid in only a part of the data. In this paper, we show how such keys can be mined automatically on large knowledge bases (KBs). For this, we combine techniques from key mining with techniques from rule mining. We show that our method can scale to KBs of millions of facts. We also show that the conditional keys we mine can improve the quality of entity linking by up to 47 percentage points.
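The definition of a conditional key can be illustrated with a small check: a set of properties is a conditional key if it identifies records uniquely within the part of the data that satisfies a condition. This is only a sketch of the definition on complete, made-up records, not the VICKEY mining algorithm; the function names and the data are hypothetical.

```python
def is_key(records, key_props):
    """True if no two records share the same values for key_props."""
    seen = set()
    for record in records:
        value = tuple(record.get(prop) for prop in key_props)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_conditional_key(records, key_props, condition):
    """True if key_props is a key among the records that match the condition."""
    subset = [
        record
        for record in records
        if all(record.get(prop) == value for prop, value in condition.items())
    ]
    return is_key(subset, key_props)


# Made-up records for illustration.
records = [
    {"id": "p1", "lastName": "Smith", "city": "Paris"},
    {"id": "p2", "lastName": "Smith", "city": "Rome"},
    {"id": "p3", "lastName": "Jones", "city": "Paris"},
    {"id": "p4", "lastName": "Smith", "city": "Rome"},
]
```

Here `lastName` is not a key for the whole dataset, but it is a key under the condition `city = Paris`, and it is not one under `city = Rome`.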

conference on information and knowledge management | 2016

Thymeflow, A Personal Knowledge Base with Spatio-temporal Data

David Montoya; Thomas Pellissier Tanon; Serge Abiteboul; Fabian M. Suchanek

The typical Internet user has data spread over several devices and across several online systems. We demonstrate an open-source system for integrating a user's data from different sources into a single Knowledge Base. Our system integrates data of different kinds into a coherent whole, starting with email messages, calendar, contacts, and location history. It is able to detect event periods in the user's location data and align them with calendar events. We will demonstrate how to query the system within and across different dimensions, and perform analytics over emails, events, and locations.

asia-pacific web conference | 2014

Recent Topics of Research around the YAGO Knowledge Base

Antoine Amarilli; Luis Galárraga; Nicoleta Preda; Fabian M. Suchanek

A knowledge base (KB) is a formal collection of knowledge about the world. In this paper, we explain how the YAGO KB is constructed. We also summarize our contributions to different aspects of KB management in general. One of these aspects is rule mining, i.e., the identification of patterns such as spouse(x,y) ∧ livesIn(x,z) ⇒ livesIn(y,z). Another aspect is the incompleteness of KBs. We propose to integrate data from Web Services into the KB in order to fill the gaps. Further, we show how the overlap between existing KBs can be used to align them, both in terms of instances and in terms of the schema. Finally, we show how KBs can be protected by watermarking.
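As a rough illustration of how a rule such as spouse(x,y) ∧ livesIn(x,z) ⇒ livesIn(y,z) can be scored, the sketch below computes its support and a plain confidence (correct predictions divided by all predictions) on a toy KB. All facts are made up, the function name is hypothetical, and this simple confidence is not necessarily the measure used by the authors.

```python
# Toy knowledge base of (predicate, subject, object) facts, made up for illustration.
facts = {
    ("spouse", "anna", "bob"),
    ("spouse", "carl", "dana"),
    ("spouse", "eve", "frank"),
    ("livesIn", "anna", "paris"),
    ("livesIn", "bob", "paris"),
    ("livesIn", "carl", "rome"),
    ("livesIn", "dana", "oslo"),
    ("livesIn", "eve", "lima"),
}


def spouse_rule_stats(facts):
    """Support and confidence of spouse(x,y) AND livesIn(x,z) => livesIn(y,z)."""
    spouse = {(s, o) for (p, s, o) in facts if p == "spouse"}
    lives_in = {(s, o) for (p, s, o) in facts if p == "livesIn"}
    # Every instantiation of the rule body predicts one head fact livesIn(y, z).
    predictions = {
        (y, z) for (x, y) in spouse for (x2, z) in lives_in if x == x2
    }
    # Support counts the predictions that are actually present in the KB.
    support = len(predictions & lives_in)
    confidence = support / len(predictions) if predictions else 0.0
    return support, confidence


support, confidence = spouse_rule_stats(facts)
```

On this toy data the rule makes three predictions, of which one is confirmed, one is contradicted by a different city, and one concerns a person whose residence is simply unknown. The last case shows why incompleteness matters when scoring rules: a missing fact is counted against the rule here even though it may be true.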
