Andreas Harth
National University of Ireland, Galway
Publication
Featured research published by Andreas Harth.
european semantic web conference | 2005
John G. Breslin; Andreas Harth; Uldis Bojars; Stefan Decker
Online community sites have replaced the traditional means of keeping a community informed via libraries and publishing. At present, online communities are islands that are not interlinked. We describe different types of online communities and tools that are currently used to build and support such communities. Ontologies and Semantic Web technologies offer an upgrade path to providing more complex services. Fusing information and inferring links between the various applications and types of information provides relevant insights that make the available information on the Internet more valuable. We present the SIOC ontology which combines terms from vocabularies that already exist with new terms needed to describe the relationships between concepts in the realm of online community sites.
international semantic web conference | 2007
Andreas Harth; Jürgen Umbrich; Aidan Hogan; Stefan Decker
We present the architecture of an end-to-end semantic search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web. In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers. We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements.
latin american web congress | 2005
Andreas Harth; Stefan Decker
Storing and querying resource description framework (RDF) data is one of the basic tasks within any semantic Web application. A number of storage systems provide assistance for this task. However, current RDF database systems do not use optimized indexes, which results in a poor performance behavior for querying RDF. In this paper we describe optimized index structures for RDF, show how to process and evaluate queries based on the index structure, describe a lightweight adaptable implementation in Java, and provide a performance comparison with existing RDF databases.
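The idea of answering RDF queries from an index can be sketched as follows. The abstract does not name the index structures used, so this is only an illustrative assumption: an in-memory store (written in Python rather than the Java of the paper) that keeps three orderings of each triple, so that a pattern with any bound position is answered by a dictionary lookup instead of a full scan. The class name `TripleIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class TripleIndex:
    """Toy triple store with three orderings (SPO, POS, OSP).

    Illustrative assumption only; not the index layout from the paper.
    """

    def __init__(self):
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        # Every triple is stored once per ordering.
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def match(self, s=None, p=None, o=None):
        """Return the set of triples matching a pattern; None is a wildcard."""
        if s is not None:
            if p is not None:
                objs = self.spo.get(s, {}).get(p, set())
                return {(s, p, x) for x in objs if o is None or x == o}
            if o is not None:
                return {(s, x, o) for x in self.osp.get(o, {}).get(s, set())}
            return {(s, pp, oo)
                    for pp, objs in self.spo.get(s, {}).items() for oo in objs}
        if p is not None:
            if o is not None:
                return {(x, p, o) for x in self.pos.get(p, {}).get(o, set())}
            return {(ss, p, oo)
                    for oo, subs in self.pos.get(p, {}).items() for ss in subs}
        if o is not None:
            return {(ss, pp, o)
                    for ss, preds in self.osp.get(o, {}).items() for pp in preds}
        # No position bound: enumerate everything.
        return {(ss, pp, oo)
                for ss, po in self.spo.items()
                for pp, objs in po.items() for oo in objs}


index = TripleIndex()
index.add("ex:alice", "foaf:knows", "ex:bob")
index.add("ex:alice", "foaf:name", "Alice")
index.add("ex:carol", "foaf:knows", "ex:bob")
who_knows_bob = index.match(p="foaf:knows", o="ex:bob")
```

The trade-off in this sketch is space for speed: each triple is stored three times so that no pattern with a bound position needs a scan.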
web based communities | 2006
John G. Breslin; Stefan Decker; Andreas Harth; Uldis Bojars
Online communities are islands of people and topics that are not interlinked. Complementary discussions exist on disparate systems but it is currently difficult to exploit the available distributed information. A Semantically Interlinked Online Community (SIOC) can enable efficient information dissemination across communities by creating an ontology that will model concepts identified in discussion methods. Data instances can be accessed from community sites using this ontology, enabling connections between local and remote concept instances, and allowing queries on, or transfer of, the data. By searching on one forum, the ontology and interface will allow users to find information on other forums that use a SIOC-based system architecture. Other uses include cross-site querying, topic-related searches, and the importing of SIOC data into other systems. Fusing information and inferring links among various applications and types of information with SIOC provide relevant insights that make the community information available on the internet more valuable.
International Journal on Semantic Web and Information Systems | 2009
Aidan Hogan; Andreas Harth; Axel Polleres
In this article the authors discuss the challenges of performing reasoning on large scale RDF datasets from the Web. Using ter-Horst’s pD* fragment of OWL as a base, the authors compose a rule-based framework for application to web data: they argue their decisions using observations of undesirable examples taken directly from the Web. The authors further temper their OWL fragment through consideration of “authoritative sources”, which counteracts an observed behaviour they term “ontology hijacking”: new ontologies published on the Web re-defining the semantics of existing entities resident in other ontologies. They then present their system for performing rule-based forward-chaining reasoning which they call SAOR: Scalable Authoritative OWL Reasoner. Based upon observed characteristics of web data and reasoning in general, they design their system to scale: the system is based upon a separation of terminological data from assertional data and comprises a lightweight in-memory index, on-disk sorts and file-scans. The authors evaluate their methods on a dataset in the order of a hundred million statements collected from real-world Web sources and present scale-up experiments on a dataset in the order of a billion statements collected from the Web.
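A rough sketch of how an authority check can guard against ontology hijacking is given below. The abstract does not define when a source counts as authoritative, so the rule used here is a simplifying assumption: a document is treated as authoritative for a term only if the document URL equals the term URI with its fragment removed. This covers hash namespaces only, and the function names `is_authoritative` and `filter_terminological` are hypothetical.

```python
from urllib.parse import urldefrag


def is_authoritative(source_url, term_uri):
    """Assumed rule: a source speaks for a term if it is the term's document."""
    return urldefrag(term_uri).url == urldefrag(source_url).url


def filter_terminological(statements):
    """Keep only terminological statements whose source is authoritative
    for the subject term.

    statements: iterable of (source_url, subject, predicate, object).
    """
    return [st for st in statements if is_authoritative(st[0], st[1])]


statements = [
    # The vocabulary document describing its own term: accepted.
    ("http://example.org/vocab", "http://example.org/vocab#Person",
     "rdfs:subClassOf", "http://example.org/vocab#Agent"),
    # A third-party document redefining that term: dropped.
    ("http://other.example/onto", "http://example.org/vocab#Person",
     "rdfs:subClassOf", "http://other.example/onto#Robot"),
]
accepted = filter_terminological(statements)
```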
european semantic web conference | 2006
Axel Polleres; Cristina Feier; Andreas Harth
Knowledge representation formalisms used on the Semantic Web adhere to a strict open world assumption. Therefore, nonmonotonic reasoning techniques are often viewed with scepticism. Especially negation as failure, which intuitively adopts a closed world view, is often claimed to be unsuitable for the Web where knowledge is notoriously incomplete. Nonetheless, it was suggested in the ongoing discussions around rules extensions for languages like RDF(S) or OWL to allow at least restricted forms of negation as failure, as long as negation has an explicitly defined, finite scope. Yet clear definitions of such “scoped negation” as well as formal semantics thereof are missing. We propose logic programs with contexts and scoped negation and discuss two possible semantics with desirable properties. We also argue that this class of logic programs can be viewed as a rule extension to a subset of RDF(S).
international semantic web conference | 2009
Andreas Harth; Sheila Kinsella; Stefan Decker
The focus of web search is moving away from returning relevant documents towards returning structured data as results to user queries. Link-based ranking algorithms are a vital part of the architecture of search engines, but they are targeted towards hypertext documents. Existing ranking algorithms for structured data, on the other hand, require manual input from a domain expert and are thus not applicable in cases where data integrated from a large number of sources exhibits enormous variance in the vocabularies used. In such environments, the authority of data sources is an important signal that the ranking algorithm has to take into account. This paper presents algorithms for prioritising data returned by queries over web datasets expressed in RDF. We introduce the notion of naming authority which provides a correspondence between identifiers and the sources which can speak authoritatively for these identifiers. Our algorithm uses the original PageRank method to assign authority values to data sources based on a naming authority graph, and then propagates the authority values to identifiers referenced in the sources. We conduct performance and quality evaluations of the method on a large web dataset. Our method is schema-independent, requires no manual input, and has applications in search, query processing, reasoning, and user interfaces over integrated datasets.
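The two steps named in the abstract, PageRank over a graph of sources followed by propagation to identifiers, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the edge list stands in for a naming authority graph, the damping factor of 0.85 is the conventional default, and the propagation rule (an identifier receives the sum of the ranks of the sources that reference it) is an assumption, since the abstract does not spell it out. All source and identifier names are invented.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a directed graph of sources."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by sources without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def identifier_scores(source_rank, mentions):
    """Assumed propagation: sum the ranks of all sources using an identifier."""
    scores = defaultdict(float)
    for source, identifiers in mentions.items():
        for ident in identifiers:
            scores[ident] += source_rank[source]
    return dict(scores)


# Hypothetical graph: an edge (x, y) means source x defers to source y.
edges = [("a.example", "c.example"),
         ("b.example", "c.example"),
         ("c.example", "a.example")]
ranks = pagerank(edges)
scores = identifier_scores(ranks, {
    "a.example": ["ex:alice"],
    "c.example": ["ex:alice", "ex:bob"],
})
```

In this toy graph `c.example` ends up with the highest rank because two sources point to it, and `ex:alice` outranks `ex:bob` because it is referenced by two ranked sources rather than one.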
asian semantic web conference | 2008
Aidan Hogan; Andreas Harth; Axel Polleres
In this paper we discuss the challenges of performing reasoning on large scale RDF datasets from the Web. We discuss issues and practical solutions relating to reasoning over web data using a rule-based approach to forward-chaining; in particular, we identify the problem of ontology hijacking: new ontologies published on the Web re-defining the semantics of existing concepts resident in other ontologies. Our solution introduces consideration of authoritative sources. Our system is designed to scale, comprising file-scans and selected lightweight on-disk indices. We evaluate our methods on a dataset in the order of a hundred million statements collected from real-world Web sources.
international world wide web conferences | 2004
Andreas Harth; Stefan Decker; Yu He; Hongsuda Tangmunarunkit; Carl Kesselman
A fundamental task on the Grid is to decide what jobs to run on what computing resources based on job or application requirements. Our previous work on ontology-based matchmaking discusses a resource matchmaking mechanism using Semantic Web technologies. We extend our previous work to provide dynamic access to such matchmaking capability by building a persistent online matchmaking service. Our implementation uses the Globus Toolkit for the Grid service development, and exploits the monitoring and discovery service in the Grid infrastructure to dynamically discover and update resource information. We describe the architecture of our semantic matchmaker service in the poster.
international world wide web conferences | 2007
Aidan Hogan; Andreas Harth; Jürgen Umbrich; Stefan Decker
Current search engines do not fully leverage semantically rich datasets, or specialise in indexing just one domain-specific dataset. We present a search engine that uses the RDF data model to enable interactive query answering over richly structured and interlinked data collected from many disparate sources on the Web.