Publication


Featured research published by Danai Symeonidou.


Journal of Web Semantics | 2013

An automatic key discovery approach for data linking

Nathalie Pernelle; Fatiha Saïs; Danai Symeonidou

In the context of Linked Data, different kinds of semantic links can be established between data. However, when data sources are huge, detecting such links manually is not feasible. One of the most important types of links, the identity link, expresses that different identifiers refer to the same real-world entity. Some automatic data linking approaches use keys to infer identity links; nevertheless, this kind of knowledge is rarely available. In this work we propose KD2R, an approach for the automatic discovery of composite keys in RDF data sources that may conform to different schemas. We only consider data sources for which the Unique Name Assumption is fulfilled. The obtained keys are correct with respect to the RDF data sources in which they are discovered. The proposed algorithm is scalable since it discovers keys without having to scan all the data. KD2R has been tested on real datasets from the international contest OAEI 2010 and on datasets available on the web of data, and has obtained promising results.
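As a rough illustration of how a key can be used to infer identity links, the sketch below links resources from two datasets whenever they agree on every property of a given composite key. It is not KD2R itself, which discovers the keys; the function name `identity_links`, the dictionary-based data layout, and the sample data are assumptions made for this example, and each property is treated as single-valued for simplicity.

```python
def identity_links(source, target, key):
    """Pair up resources from two datasets that share all values of `key`.

    `source` and `target` map a resource identifier to a dict of
    property -> value (a simplified, single-valued view of RDF data).
    """
    # Index the target dataset by the tuple of key values.
    index = {}
    for tid, desc in target.items():
        if all(p in desc for p in key):
            index.setdefault(tuple(desc[p] for p in key), []).append(tid)

    # Look up each source resource in the index.
    links = []
    for sid, desc in source.items():
        if all(p in desc for p in key):
            for tid in index.get(tuple(desc[p] for p in key), []):
                links.append((sid, tid))
    return links


# Hypothetical sample data: two catalogues describing books.
source = {
    "a:1": {"title": "Dune", "year": 1965},
    "a:2": {"title": "Emma", "year": 1815},
}
target = {
    "b:7": {"title": "Dune", "year": 1965},
    "b:8": {"title": "Dune", "year": 1984},
}

links = identity_links(source, target, ("title", "year"))
print(links)
```

With the composite key (title, year), only the two resources that agree on both properties are linked; using the title alone would also link "a:1" to "b:8".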


International Semantic Web Conference | 2014

SAKey: Scalable Almost Key Discovery in RDF Data

Danai Symeonidou; Vincent Armant; Nathalie Pernelle; Fatiha Saïs

Exploiting identity links among RDF resources allows applications to efficiently integrate data. Keys can be very useful for discovering these identity links. A set of properties is considered a key when its values uniquely identify resources. However, these keys are usually not available. The approaches that attempt to discover keys automatically can easily be overwhelmed by the size of the data, and they require clean data. We present SAKey, an approach that discovers keys in RDF data efficiently. To prune the search space, SAKey exploits characteristics of the data that are dynamically detected during the process. Furthermore, our approach can discover keys in datasets where erroneous data or duplicates exist (i.e., almost keys). The approach has been evaluated on different synthetic and real datasets. The results show both the relevance of almost keys and the efficiency of discovering them.
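The notions of key and almost key can be made concrete with a small brute-force check. This is only a sketch under simplifying assumptions, not SAKey's algorithm, which prunes the search space: here an "exception" is taken to be a resource that shares all its values for the candidate properties with some other resource, properties are single-valued, and the names `exceptions` and `is_almost_key` and the sample data are invented for the illustration.

```python
from collections import defaultdict


def exceptions(resources, properties):
    """Return the ids of resources that share their values for
    `properties` with at least one other resource."""
    groups = defaultdict(list)
    for rid, desc in resources.items():
        # Only resources that have every candidate property are compared.
        if all(p in desc for p in properties):
            groups[tuple(desc[p] for p in properties)].append(rid)
    return {rid for ids in groups.values() if len(ids) > 1 for rid in ids}


def is_almost_key(resources, properties, n=0):
    """True if `properties` has at most `n` exceptions (n=0: an exact key)."""
    return len(exceptions(resources, properties)) <= n


# Hypothetical sample data; "p4" duplicates "p1".
people = {
    "p1": {"name": "Ann", "city": "Paris"},
    "p2": {"name": "Bob", "city": "Paris"},
    "p3": {"name": "Ann", "city": "Lyon"},
    "p4": {"name": "Ann", "city": "Paris"},
}

print(sorted(exceptions(people, ["name", "city"])))  # ['p1', 'p4']
print(is_almost_key(people, ["name", "city"], n=0))  # False
print(is_almost_key(people, ["name", "city"], n=2))  # True
```

Tolerating a small number of exceptions lets (name, city) be accepted as a key despite the duplicate, which is the situation the abstract describes for erroneous data or duplicates.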


International Conference on Conceptual Structures | 2014

Defining Key Semantics for the RDF Datasets: Experiments and Evaluations

Manuel Atencia; Michel Chein; Madalina Croitoru; Jérôme David; Michel Leclère; Nathalie Pernelle; Fatiha Saïs; François Scharffe; Danai Symeonidou

Many techniques have recently been proposed to automate the linkage of RDF datasets. Predicate selection is the step of the linkage process that consists of selecting the smallest set of relevant predicates needed to enable instance comparison. We call this set of predicates a key, by analogy with the notion of keys in relational databases. We formally explain the different assumptions behind two existing key semantics. We then evaluate the keys experimentally by studying how discovered keys could help dataset interlinking or cleaning. We discuss the experimental results and show that the two semantics lead to comparable results on the studied datasets.


Knowledge Acquisition, Modeling and Management | 2016

Automatic Key Selection for Data Linking

Manel Achichi; Mohamed Ben Ellefi; Danai Symeonidou; Konstantin Todorov

The paper proposes an RDF key ranking approach that attempts to close the gap between automatic key discovery and data linking approaches, and thus to reduce the user effort in linking configuration. Indeed, data linking tool configuration is a laborious process, in which the user is often required to manually select the properties to compare, which presupposes in-depth expert knowledge of the data. Key discovery techniques attempt to facilitate this task, but in a number of cases do not fully succeed, because they produce a large number of keys without a confidence indicator. Since keys are extracted from each dataset independently, their effectiveness for the matching task, which involves two datasets, is undermined. The approach proposed in this work aims to unlock the potential of both key discovery techniques and data linking tools by providing the user with a limited number of merged and ranked keys, well suited to a particular matching task. In addition, the complementarity of a small number of top-ranked keys is explored, showing that their combined use significantly improves recall. We report our experiments on data from the Ontology Alignment Evaluation Initiative, as well as on real-world benchmark data about music.


International Semantic Web Conference | 2017

VICKEY: Mining Conditional Keys on Knowledge Bases

Danai Symeonidou; Luis Galárraga; Nathalie Pernelle; Fatiha Saïs; Fabian M. Suchanek

A conditional key is a key constraint that is valid in only a part of the data. In this paper, we show how such keys can be mined automatically on large knowledge bases (KBs). For this, we combine techniques from key mining with techniques from rule mining. We show that our method can scale to KBs of millions of facts. We also show that the conditional keys we mine can improve the quality of entity linking by up to 47 percentage points.
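To make the definition concrete, the following sketch checks whether a set of properties is a key within the part of the data selected by a condition. It only verifies a given candidate and does not mine conditional keys as VICKEY does; the function names, the representation of a condition as property-value pairs, and the sample data are assumptions for this example, and properties are treated as single-valued.

```python
def is_key(resources, properties):
    """True if no two resources share all values for `properties`."""
    seen = set()
    for desc in resources:
        signature = tuple(desc.get(p) for p in properties)
        if signature in seen:
            return False
        seen.add(signature)
    return True


def is_conditional_key(resources, condition, properties):
    """True if `properties` is a key among resources matching `condition`.

    `condition` maps property -> required value, e.g. {"country": "FR"}.
    """
    subset = [
        desc for desc in resources
        if all(desc.get(p) == v for p, v in condition.items())
    ]
    return is_key(subset, properties)


# Hypothetical sample data.
researchers = [
    {"name": "Smith", "country": "FR"},
    {"name": "Dupont", "country": "FR"},
    {"name": "Smith", "country": "DE"},
    {"name": "Smith", "country": "DE"},
]

print(is_key(researchers, ["name"]))                                 # False
print(is_conditional_key(researchers, {"country": "FR"}, ["name"]))  # True
print(is_conditional_key(researchers, {"country": "DE"}, ["name"]))  # False
```

In this toy data the name is not a key overall, but it is one under the condition country = "FR", which is the kind of constraint a conditional key expresses.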


International Conference on Conceptual Structures | 2016

Key Discovery for Numerical Data: Application to Oenological Practices

Danai Symeonidou; Isabelle Sanchez; Madalina Croitoru; Pascal Neveu; Nathalie Pernelle; Fatiha Saïs; Aurelie Roland-Vialaret; Patrice Buche; Aunur Rofiq Muljarto; Remi Schneider

The key discovery problem has recently been investigated for symbolic RDF data and tested on large datasets such as DBpedia and YAGO. The advantage of such methods is that they allow the automatic extraction of combinations of properties that uniquely identify every resource in a dataset (i.e., ontological rules). However, none of the existing approaches is able to handle real-world numerical data. In this paper we propose a novel approach to key discovery that handles numerical RDF datasets. We test the significance of our approach in the context of an oenological application and consider a wine dataset that represents the different chemical-based flavourings. Discovering keys in this context contributes to the investigation of complementary flavours that make it possible to distinguish between different kinds of wine.


International Conference on Knowledge Capture | 2017

KeyRanker: Automatic RDF Key Ranking for Data Linking

Houssameddine Farah; Danai Symeonidou; Konstantin Todorov

Automatic approaches to key discovery on RDF datasets generate sets of discriminative properties that can be used to configure data linking systems relying on link specifications. These keys often come in large numbers, are generated independently for the two datasets to be linked, and lack an assessment of their usefulness for the linking task. We propose a novel generic algorithm for selecting keys that are valid in both datasets and ranking them with respect to their individual likelihood of generating identity links. In addition, we explore the combined use of several complementary keys, which improves on their individual performance. We evaluate our approach on diverse synthetic and real-world benchmark data, showing its robustness with respect to different linking tools and domains.


Information Processing and Management of Uncertainty | 2016

A Proposal for Modelling Agrifood Chains as Multi Agent Systems

Madalina Croitoru; Patrice Buche; Brigitte Charnomordic; Jérôme Fortin; Hazaël Jones; Pascal Neveu; Danai Symeonidou; Rallou Thomopoulos

With the aim of evaluating and improving link quality in bibliographical knowledge bases, we develop a decision support system based on partitioning semantics. The novelty of our approach consists in using symbolic-value criteria for partitioning together with suitable partitioning semantics. In this paper we evaluate and compare these semantics on a real qualitative sample. This sample is drawn from the catalogue of French university libraries (SUDOC), a bibliographical knowledge base maintained by the University Bibliographic Agency (ABES).


IC: Ingénierie des Connaissances | 2014

Définition de la sémantique des clés dans le Web sémantique : un point de vue théorique [Defining the Semantics of Keys in the Semantic Web: A Theoretical Point of View]

Michel Chein; Madalina Croitoru; Michel Leclère; Nathalie Pernelle; Fatiha Saïs; Danai Symeonidou


LDOW@WWW | 2015

Rule Mining for Semantifying Wikilinks

Luis Galárraga; Danai Symeonidou; Jean-Claude Moissinac

Collaboration


Top Co-Authors

Madalina Croitoru
Centre national de la recherche scientifique

Michel Chein
University of Montpellier

Michel Leclère
University of Montpellier

Pascal Neveu
Institut national de la recherche agronomique

Patrice Buche
Institut national de la recherche agronomique

Aunur Rofiq Muljarto
Institut national de la recherche agronomique