Publication


Featured research published by Fatiha Saïs.


Journal on Data Semantics | 2009

Combining a Logical and a Numerical Method for Data Reconciliation

Fatiha Saïs; Nathalie Pernelle; Marie-Christine Rousset

The reference reconciliation problem consists in deciding whether different identifiers refer to the same data, i.e., correspond to the same real-world entity. In this article, we present a reference reconciliation approach that combines a logical method called L2R with a numerical one called N2R. The approach exploits the semantics of the schema and data, which is translated into a set of Horn FOL reconciliation rules. L2R uses these rules to infer exact decisions of both reconciliation and non-reconciliation. In the second method, N2R, the semantics of the schema is translated into an informed similarity measure, which is used to compute numerically the similarity of reference pairs. This similarity measure is expressed as a nonlinear equation system, which is solved by an iterative method. Experiments on two different domains show good results for both recall and precision. The two methods can be used separately or in combination, and combining them improves runtime performance.
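As a rough illustration of the iterative resolution mentioned in this abstract, the sketch below solves a small nonlinear similarity system by fixed-point iteration. The equation form, the 0.5 weights, and the names `base`, `deps`, and `solve_similarities` are assumptions made for this example; they are not the N2R equations.

```python
def solve_similarities(base, deps, eps=1e-9, max_iter=1000):
    """Fixed-point iteration over a toy nonlinear similarity system.

    base: dict pair -> attribute-level similarity in [0, 1]
    deps: dict pair -> list of pairs whose similarity influences it
    Hypothetical equation (an assumption, not taken from the paper):
        x_p = max(base_p, 0.5 * base_p + 0.5 * mean(x_q for q in deps_p))
    """
    if not base:
        return {}
    x = dict(base)  # start from the attribute-level scores
    for _ in range(max_iter):
        new_x = {}
        for p, b in base.items():
            related = deps.get(p, [])
            if related:
                avg = sum(x[q] for q in related) / len(related)
                new_x[p] = max(b, 0.5 * b + 0.5 * avg)
            else:
                new_x[p] = b
        # Largest change in this sweep; stop once the scores are stable.
        delta = max(abs(new_x[p] - x[p]) for p in base)
        x = new_x
        if delta < eps:
            break
    return x
```

With `base = {"A": 0.9, "B": 0.2}` and `deps = {"B": ["A"]}`, the score of pair `B` is raised by the high similarity of the pair it depends on, while `A` keeps its base score.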


Journal of Web Semantics | 2013

An automatic key discovery approach for data linking

Nathalie Pernelle; Fatiha Saïs; Danai Symeonidou

In the context of Linked Data, different kinds of semantic links can be established between data. However, when data sources are huge, detecting such links manually is not feasible. One of the most important types of links, the identity link, expresses that different identifiers refer to the same real-world entity. Some automatic data linking approaches use keys to infer identity links; however, this kind of knowledge is rarely available. In this work, we propose KD2R, an approach for automatically discovering composite keys in RDF data sources that may conform to different schemas. We only consider data sources for which the Unique Name Assumption is fulfilled. The obtained keys are correct with respect to the RDF data sources in which they are discovered. The proposed algorithm is scalable, since it discovers keys without having to scan all the data. KD2R has been tested on real datasets from the OAEI 2010 international contest and on datasets available on the web of data, and has obtained promising results.


International Semantic Web Conference | 2014

SAKey: Scalable Almost Key Discovery in RDF Data

Danai Symeonidou; Vincent Armant; Nathalie Pernelle; Fatiha Saïs

Exploiting identity links among RDF resources allows applications to integrate data efficiently. Keys can be very useful for discovering these identity links. A set of properties is considered a key when its values uniquely identify resources. However, such keys are usually not available. Approaches that attempt to discover keys automatically can easily be overwhelmed by the size of the data, and they require clean data. We present SAKey, an approach that discovers keys in RDF data efficiently. To prune the search space, SAKey exploits characteristics of the data that are dynamically detected during the process. Furthermore, our approach can discover keys in datasets where erroneous data or duplicates exist (i.e., almost keys). The approach has been evaluated on different synthetic and real datasets. The results show both the relevance of almost keys and the efficiency of discovering them.
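The definition of a key given in this abstract, and its relaxation to almost keys, can be sketched with a brute-force check. This is only an illustration of the definitions, not SAKey's pruning algorithm. The function names, the threshold parameter `n`, and the rule that two resources conflict when they share at least one value for every property in the set are assumptions of this sketch.

```python
from itertools import combinations

def exceptions(resources, props):
    """Return the resources that violate `props` as a key.

    resources: dict resource_id -> dict property -> set of values
    Assumption of this sketch: two resources conflict when, for every
    property in `props`, they share at least one value. A missing
    property is treated as an empty value set, so it never conflicts.
    """
    bad = set()
    for (r1, d1), (r2, d2) in combinations(resources.items(), 2):
        if all(d1.get(p, set()) & d2.get(p, set()) for p in props):
            bad.update((r1, r2))
    return bad

def is_almost_key(resources, props, n=0):
    """True if `props` has at most n exceptions (n=0 means an exact key)."""
    return len(exceptions(resources, props)) <= n

people = {
    "r1": {"name": {"Ann"}, "city": {"Paris"}},
    "r2": {"name": {"Ann"}, "city": {"Lyon"}},
    "r3": {"name": {"Bob"}, "city": {"Paris"}},
    "r4": {"name": {"Bob"}, "city": {"Paris"}},  # duplicate of r3
}
```

Here `["name", "city"]` is not an exact key because of the duplicate pair `r3`/`r4`, but it is accepted once two exceptions are tolerated. The pairwise comparison is quadratic in the number of resources, which is the cost that scalable approaches aim to avoid.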


International Conference on Move to Meaningful Internet Systems | 2011

KD2R: a key discovery method for semantic reference reconciliation

Danai Symeonidou; Nathalie Pernelle; Fatiha Saïs

The reference reconciliation problem consists of deciding whether different identifiers refer to the same real-world entity. Some existing reference reconciliation approaches use key constraints to infer reconciliation decisions. In the context of Linked Open Data, this knowledge is not available. We propose KD2R, a method for the automatic discovery of key constraints associated with OWL2 classes. These keys are discovered from RDF data, which can be incomplete. The proposed algorithm performs this discovery without having to scan all the data. KD2R has been tested on datasets from the OAEI international contest and obtains promising results.


International Conference on Conceptual Structures | 2014

Defining Key Semantics for the RDF Datasets: Experiments and Evaluations

Manuel Atencia; Michel Chein; Madalina Croitoru; Jérôme David; Michel Leclère; Nathalie Pernelle; Fatiha Saïs; François Scharffe; Danai Symeonidou

Many techniques have recently been proposed to automate the linkage of RDF datasets. Predicate selection is the step of the linkage process that consists in selecting the smallest set of relevant predicates needed to enable instance comparison. We call such a set of predicates a key, by analogy with the notion of keys in relational databases. We explain formally the different assumptions behind two existing key semantics. We then evaluate the keys experimentally by studying how discovered keys could help dataset interlinking or cleaning. We discuss the experimental results and show that the two semantics lead to comparable results on the studied datasets.


Proceedings of the 2nd International Workshop on Open Data | 2013

N2R-part: identity link discovery using partially aligned ontologies

Nathalie Pernelle; Fatiha Saïs; Brigitte Safar; Maria Koutraki; Tushar Ghosh

Thanks to the Linked Open Data initiative, more and more RDF datasets are published on the Web. One active research field concerns the problem of finding links between entities. In this paper, we focus on ontology-based data linking approaches, which use linking rules based on the available schemas (or ontologies). Such systems assume that a set of mappings between ontology elements is available beforehand. However, this set of mappings may be incomplete. We propose a data linking approach called N2R-Part. It computes similarity scores by exploiting both the properties for which a mapping exists and those for which there is no mapping. We illustrate through an example how exploiting the unmapped properties improves the data linking results.


International Semantic Web Conference | 2017

VICKEY: Mining Conditional Keys on Knowledge Bases

Danai Symeonidou; Luis Galárraga; Nathalie Pernelle; Fatiha Saïs; Fabian M. Suchanek

A conditional key is a key constraint that is valid in only a part of the data. In this paper, we show how such keys can be mined automatically on large knowledge bases (KBs). For this, we combine techniques from key mining with techniques from rule mining. We show that our method can scale to KBs of millions of facts. We also show that the conditional keys we mine can improve the quality of entity linking by up to 47 percentage points.
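To make the notion of a conditional key concrete, the sketch below checks whether a set of properties identifies rows uniquely once the data is restricted by a condition. It illustrates the definition only and is not the mining method of the paper. The function name, the single-valued row representation, and the example data are assumptions.

```python
def is_conditional_key(rows, condition, props):
    """Check whether `props` is a key among the rows satisfying `condition`.

    rows: list of dicts with single-valued properties (an assumption here)
    condition: dict property -> required value; {} means "all rows"
    """
    seen = set()
    for row in rows:
        if any(row.get(p) != v for p, v in condition.items()):
            continue  # row is outside the part of the data the key covers
        signature = tuple(row.get(p) for p in props)
        if signature in seen:
            return False  # two rows in scope share the same values
        seen.add(signature)
    return True

rows = [
    {"name": "Dupont", "country": "FR"},
    {"name": "Martin", "country": "FR"},
    {"name": "Smith", "country": "UK"},
    {"name": "Smith", "country": "UK"},
]
```

In this toy data, `name` is not a key over all rows, but it is a key under the condition `country = "FR"`.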


International Conference on Knowledge Capture | 2017

Detection of Contextual Identity Links in a Knowledge Base

Joe Raad; Nathalie Pernelle; Fatiha Saïs

Most Linked Data applications currently rely on owl:sameAs for linking ontology instances. However, several studies have reported multiple misuses of this identity link. These misuses, which are mainly caused by the lack of other well-defined linking alternatives, can lead to erroneous statements or inconsistencies. In this paper, we propose a new contextual identity link, identiConTo, that could serve as a replacement for owl:sameAs when linking instances that are identical in a specified context. To detect these contextual links, we have defined an algorithm named DECIDE, which has been tested on scientific knowledge bases describing transformation processes.


International Conference on Conceptual Structures | 2016

Key Discovery for Numerical Data: Application to Oenological Practices

Danai Symeonidou; Isabelle Sanchez; Madalina Croitoru; Pascal Neveu; Nathalie Pernelle; Fatiha Saïs; Aurelie Roland-Vialaret; Patrice Buche; Aunur Rofiq Muljarto; Remi Schneider

The key discovery problem has recently been investigated for symbolic RDF data and tested on large datasets such as DBpedia and YAGO. The advantage of such methods is that they allow the automatic extraction of combinations of properties that uniquely identify every resource in a dataset (i.e., ontological rules). However, none of the existing approaches is able to handle real-world numerical data. In this paper, we propose a novel approach to key discovery that handles numerical RDF datasets. We test the significance of our approach in the context of an oenological application and consider a wine dataset that represents the different chemical-based flavourings. Discovering keys in this context contributes to the investigation of complementary flavours that distinguish the various sorts of wine from one another.


International Semantic Web Conference | 2018

Detecting Erroneous Identity Links on the Web Using Network Metrics

Joe Raad; Wouter Beek; Frank van Harmelen; Nathalie Pernelle; Fatiha Saïs

In the absence of a central naming authority on the Semantic Web, it is common for different datasets to refer to the same thing by different IRIs. Whenever multiple names are used to denote the same thing, owl:sameAs statements are needed in order to link the data and foster reuse. Studies that date back as far as 2009 have observed that the owl:sameAs property is sometimes used incorrectly. In this paper, we show how network metrics such as the community structure of the owl:sameAs graph can be used to detect such possibly erroneous statements. One benefit of the approach presented here is that it can be applied to the network of owl:sameAs links itself and does not rely on any additional knowledge. To illustrate its ability to scale, the approach is evaluated on the largest collection of identity links to date, containing over 558M owl:sameAs links scraped from the LOD Cloud.
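The idea of scoring identity links from the structure of the link graph alone can be sketched as follows. This example swaps in a much simpler metric than community detection: the Jaccard overlap of the two endpoints' closed neighbourhoods. It is a stand-in chosen for brevity, not the metric used in the paper, and the function name and toy link list are assumptions.

```python
from collections import defaultdict

def edge_overlap(links):
    """Score each owl:sameAs link by the Jaccard overlap of its endpoints'
    closed neighbourhoods in the undirected link graph.

    links: list of (iri_a, iri_b) pairs
    A low score means the two endpoints share few neighbours, so the link
    connects otherwise separate clusters and may deserve a closer look.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        # Closed neighbourhood: each node counts as its own neighbour.
        neighbours[a].update((a, b))
        neighbours[b].update((a, b))
    scores = {}
    for a, b in links:
        inter = neighbours[a] & neighbours[b]
        union = neighbours[a] | neighbours[b]
        scores[(a, b)] = len(inter) / len(union)
    return scores

# Two tightly linked triangles joined by a single link ("c", "d").
links = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
scores = edge_overlap(links)
```

In this toy graph the link between `c` and `d` receives the lowest score, since it is the only connection between the two clusters. Like the approach described in the abstract, the score uses nothing but the links themselves.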

Collaboration


Dive into Fatiha Saïs's collaborations.

Top Co-Authors

Danai Symeonidou

Institut national de la recherche agronomique

Rallou Thomopoulos

Centre national de la recherche scientifique

Michel Chein

University of Montpellier

Michel Leclère

University of Montpellier

Ollivier Haemmerlé

Institut national de la recherche agronomique
