Cássia Trojahn
French Institute for Research in Computer Science and Automation
Publication
Featured research published by Cássia Trojahn.
Journal on Data Semantics | 2011
Jérôme Euzenat; Christian Meilicke; Heiner Stuckenschmidt; Pavel Shvaiko; Cássia Trojahn
In the area of semantic technologies, benchmarking and systematic evaluation are not yet as established as in other areas of computer science, e.g., information retrieval. In spite of successful attempts, more effort and experience are required in order to achieve such a level of maturity. In this paper, we report results and lessons learned from the Ontology Alignment Evaluation Initiative (OAEI), a benchmarking initiative for ontology matching. The goal of this work is twofold: on the one hand, we document the state of the art in evaluating ontology matching methods and provide potential participants of the initiative with a better understanding of the design and the underlying principles of the OAEI campaigns. On the other hand, we report experiences gained in this particular area of semantic technologies to potential developers of benchmarking for other kinds of systems. For this purpose, we describe the evaluation design used in the OAEI campaigns in terms of datasets, evaluation criteria and workflows, provide a global view on the results of the campaigns carried out from 2005 to 2010 and discuss upcoming trends, both specific to ontology matching and generally relevant for the evaluation of semantic technologies. Finally, we argue that there is a need for further automation of benchmarking to shorten the feedback cycle for tool developers.
Journal of Web Semantics | 2012
Christian Meilicke; Raúl García-Castro; Fred Freitas; Willem Robert van Hage; Elena Montiel-Ponsoda; Ryan Ribeiro de Azevedo; Heiner Stuckenschmidt; Ondřej Šváb-Zamazal; Vojtěch Svátek; Andrei Tamilin; Cássia Trojahn; Shenghui Wang
In this paper we present the MultiFarm dataset, which has been designed as a benchmark for multilingual ontology matching. The MultiFarm dataset is composed of a set of ontologies translated into different languages and the corresponding alignments between these ontologies. It is based on the OntoFarm dataset, which has been used successfully for several years in the Ontology Alignment Evaluation Initiative (OAEI). By translating the ontologies of the OntoFarm dataset into eight different languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish), we created a comprehensive set of realistic test cases. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.
Journal of Web Semantics | 2013
Jérôme Euzenat; Maria-Elena Roşoiu; Cássia Trojahn
The OAEI Benchmark test set has been used for many years as a main reference to evaluate and compare ontology matching systems. However, this test set has barely varied since 2004 and has become a relatively easy task for matchers. In this paper, we present the design of a flexible test generator based on an extensible set of alterators which may be used programmatically for generating different test sets from different seed ontologies and different alteration modalities. It has been used for reproducing Benchmark both with the original seed ontology and with other ontologies. This highlights the remarkable stability of results over different generations and the preservation of difficulty across seed ontologies, as well as a systematic bias towards the initial Benchmark test set and the inability of such tests to identify an overall winning matcher. These were exactly the properties for which Benchmark had been designed. Furthermore, the generator has been used for providing new test sets aiming at increasing the difficulty and discriminability of Benchmark. Although difficulty may be easily increased with the generator, attempts to increase discriminability proved unfruitful. However, efforts towards this goal raise questions about the very nature of discriminability.
Archive | 2011
Cássia Trojahn; Jérôme Euzenat; Valentina A. M. Tamma; Terry R. Payne
Within open, distributed and dynamic environments, agents frequently encounter and communicate with new agents and services that were previously unknown. However, to overcome the ontological heterogeneity which may exist within such environments, agents first need to reach agreement over the vocabulary and underlying conceptualisation of the shared domain, that will be used to support their subsequent communication. Whilst there are many existing mechanisms for matching the agents’ individual ontologies, some are better suited to certain ontologies or tasks than others, and many are unsuited for use in a real-time, autonomous environment. Agents have to agree on which correspondences between their ontologies are mutually acceptable by both agents. As the rationale behind the preferences of each agent may well be private, one cannot always expect agents to disclose their strategy or rationale for communicating. This prevents the use of a centralised mediator or facilitator which could reconcile the ontological differences. The use of argumentation allows two agents to iteratively explore candidate correspondences within a matching process, through a series of proposals and counter proposals, i.e., arguments. Thus, two agents can reason over the acceptability of these correspondences without explicitly disclosing the rationale for preferring one type of correspondences over another. In this chapter we present an overview of the approaches for alignment agreement based on argumentation.
Processing of the Portuguese Language | 2014
Roger Granada; Cássia Trojahn; Renata Vieira
The growth of available data in digital format has been facilitating the development of new models to automatically infer the semantic similarity between word pairs. However, there are still many natural languages without sufficient resources to evaluate measures of semantic relatedness. In this paper we translated word pairs from a well-known baseline for evaluating semantic relatedness measures into Portuguese and performed a manual evaluation of each pair. We compared the correlation with similar datasets in other languages and generated LSA models from Wikipedia articles in order to verify the pertinence of each dataset and how semantic similarity carries over across languages.
Towards the Multilingual Semantic Web | 2014
Cássia Trojahn; Bo Fu; Ondřej Zamazal; Dominique Ritze
Ontology matching is one of the key solutions for solving the heterogeneity problem in the Semantic Web. Nowadays, the increasing amount of multilingual data on the Web and the consequent development of ontologies in different natural languages have pushed the need for multilingual and cross-lingual ontology matching. This chapter provides an overview of multilingual and cross-lingual ontology matching. We formally define the problem of matching multilingual and cross-lingual ontologies and provide a classification of different techniques and approaches. Systematic evaluations of these techniques are discussed with an emphasis on standard and freely available data sets and systems.
International Semantic Web Conference | 2016
Imen Megdiche; Olivier Teste; Cássia Trojahn
Resolving the semantic heterogeneity in the semantic web requires finding correspondences between ontologies describing resources. In particular, with the explosive growth of data sets in the Linked Open Data, linking multiple vocabularies and ontologies simultaneously, known as holistic matching problem, becomes necessary. Currently, most state-of-the-art matching approaches are limited to pairwise matching. In this paper, we propose a holistic ontology matching approach that is modeled through a linear program extending the maximum-weighted graph matching problem with linear constraints (cardinality, structural, and coherence constraints). Our approach guarantees the optimal solution with mostly coherent alignments. To evaluate our proposal, we discuss the results of experiments performed on the Conference track of the OAEI 2015, under both holistic and pairwise matching settings.
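As an illustration of the maximum-weighted graph matching problem that the abstract above builds on, the following is a minimal sketch for the pairwise case only. It is not the linear program from the paper: it uses brute-force enumeration, which is only feasible for toy inputs, and it enforces just the 1:1 cardinality constraint (no structural or coherence constraints). The function name `max_weight_alignment`, the entity names and the similarity scores are all hypothetical.

```python
from itertools import permutations


def max_weight_alignment(scores, threshold=0.0):
    """Return the 1:1 alignment with the highest total similarity.

    scores: dict mapping (source_entity, target_entity) -> similarity.
    Brute-force sketch for toy inputs; NOT the paper's linear program.
    """
    sources = sorted({s for s, _ in scores})
    targets = sorted({t for _, t in scores})
    # Pad with None so that any source entity may stay unmatched.
    padded = targets + [None] * len(sources)
    best_pairs, best_total = [], 0.0
    for assignment in permutations(padded, len(sources)):
        # Each target appears at most once per assignment (1:1 cardinality).
        pairs = [
            (s, t)
            for s, t in zip(sources, assignment)
            if (s, t) in scores and scores[(s, t)] > threshold
        ]
        total = sum(scores[p] for p in pairs)
        if total > best_total:
            best_pairs, best_total = pairs, total
    return best_pairs, best_total


# Hypothetical similarity scores between entities of two ontologies.
example_scores = {
    ("Paper", "Article"): 0.9,
    ("Paper", "Contribution"): 0.8,
    ("Submission", "Article"): 0.85,
    ("Submission", "Contribution"): 0.1,
}
alignment, total = max_weight_alignment(example_scores)
print(alignment, total)
```

In this toy input, always taking the single best-scoring pair first would select ("Paper", "Article") and leave only the weak ("Submission", "Contribution") pair, whereas optimising the total weight selects the other two correspondences. A practical implementation would replace the enumeration with a solver.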
Extended Semantic Web Conference | 2012
Christian Meilicke; Cássia Trojahn; Ondřej Šváb-Zamazal; Dominique Ritze
This paper reports on the first usage of the MultiFarm dataset for evaluating ontology matching systems. This dataset has been designed as a comprehensive benchmark for multilingual ontology matching. In a first set of experiments, we analyze how state-of-the-art matching systems – not particularly designed for the task of multilingual ontology matching – perform on this dataset. These experiments show the hardness of MultiFarm and result in baselines for any algorithm specifically designed for multilingual ontology matching. We continue with a second set of experiments, where we analyze three systems that have been extended with specific strategies to solve the multilingual matching problem. This paper allows us to draw relevant conclusions for both multilingual ontology matching and ontology matching evaluation in general.
Proceedings of the 2011 workshop on Data infrastructurEs for supporting information retrieval evaluation | 2011
Stuart N. Wrigley; Raúl García-Castro; Cássia Trojahn
This paper describes an infrastructure for the automated evaluation of semantic technologies and, in particular, semantic search technologies. For this purpose, we present an evaluation framework which follows a service-oriented approach for evaluating semantic technologies and uses the Business Process Execution Language (BPEL) to define evaluation workflows that can be executed by process engines. This framework supports a variety of evaluations, from different semantic areas, including search, and is extendible to new evaluations. We show how BPEL addresses this diversity as well as how it is used to solve specific challenges such as heterogeneity, error handling and reuse.
European Semantic Web Conference | 2018
Élodie Thiéblin; Ollivier Haemmerlé; Nathalie Hernandez; Cássia Trojahn
Simple ontology alignments, largely studied, link one entity of a source ontology to one entity of a target ontology. One of the limitations of these alignments is, however, their lack of expressiveness which can be overcome by complex alignments. Although different complex matching approaches have emerged in the literature, there is a lack of complex reference alignments on which these approaches can be systematically evaluated. This paper proposes two sets of complex alignments between 10 pairs of ontologies from the well-known OAEI conference simple alignment dataset.