Maurizio Atzori
University of Cagliari
Publication
Featured research published by Maurizio Atzori.
international semantic web conference | 2017
Mattia Atzeni; Maurizio Atzori
In this paper, we leverage advances in the Semantic Web area, including data modeling (RDF) and data management and querying (Jena and SPARQL), to develop CodeOntology, a community-shared software framework supporting expressive queries over source code. The project consists of two main contributions: an ontology that provides a formal representation of object-oriented programming languages, and a parser that analyzes Java source code and serializes it into RDF triples. The parser has been successfully applied to the source code of OpenJDK 8, yielding a structured dataset of more than 2 million RDF triples. CodeOntology makes it possible to generate Linked Data from any Java project, thereby enabling highly expressive queries over source code by means of a powerful language like SPARQL.
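As a rough illustration of serializing code structure into triples and querying it by pattern, the sketch below parses a small Python snippet with the standard `ast` module and stores the result as plain tuples. It is not the CodeOntology parser (which targets Java) and does not use Jena or SPARQL; the predicate names (`type`, `hasMethod`) and the helpers `code_to_triples` and `match` are hypothetical names chosen for this example.

```python
import ast

def code_to_triples(source: str):
    """Turn class/method declarations into (subject, predicate, object) triples."""
    triples = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            triples.append((node.name, "type", "Class"))
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    method = f"{node.name}.{item.name}"
                    triples.append((method, "type", "Method"))
                    triples.append((node.name, "hasMethod", method))
    return triples

def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Hypothetical input: a tiny class with two methods.
SAMPLE = '''
class Stack:
    def push(self, x): pass
    def pop(self): pass
'''

triples = code_to_triples(SAMPLE)
# "Which methods does Stack declare?" expressed as a triple pattern.
methods_of_stack = [o for _, _, o in match(triples, s="Stack", p="hasMethod")]
```

The wildcard pattern in `match` plays the role that a basic graph pattern plays in a SPARQL query; a real deployment would load the triples into an RDF store instead.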
international conference on management of data | 2014
Hamid Mousavi; Maurizio Atzori; Shi Gao; Carlo Zaniolo
Wikipedia's InfoBoxes play a crucial role in advanced applications and provide the main knowledge source for DBpedia and the powerful structured queries it supports. However, InfoBoxes, which were created by crowdsourcing for human rather than computer consumption, suffer from incompleteness, inconsistencies, and inaccuracies. To overcome these problems, we have developed (i) the IBminer system, which extracts InfoBox information by text-mining Wikipedia pages, (ii) the IKBStore system, which integrates the information derived by IBminer with that of DBpedia, YAGO2, WikiData, WordNet, and other sources, and (iii) SWiPE and InfoBox Editor (IBE), which provide user-friendly interfaces for querying and revising the knowledge base. IBminer uses a deep NLP-based approach to extract from text a semantic representation structure called TextGraph, from which the system detects patterns and derives subject-attribute-value relations, as well as domain-specific synonyms for the knowledge base. IKBStore and IBE complement the powerful, user-friendly, by-example structured queries of SWiPE by supporting validation and provenance history for the information contained in the knowledge base, along with the ability to upgrade its knowledge when it is found to be incomplete, incorrect, or outdated.
ieee international conference semantic computing | 2014
Maurizio Atzori
We present a simple approach to handle recursive SPARQL queries, that is, nested queries that may contain references to the query itself. This powerful feature is obtained by implementing a custom SPARQL function that takes a SPARQL query as a parameter and executes it over a specified endpoint. The behaviour is similar to the SPARQL 1.1 SERVICE clause, with a few fundamental differences: (1) the query passed as an argument can be arbitrarily complex, (2) being a string, the query can be created at runtime in the calling (outer) query, and (3) it can reference itself, enabling recursion. These features transform the SPARQL language into a Turing-equivalent one without introducing special constructs or requiring another interpreter on the endpoint server engine. The feature is implemented using the standard Extensible Value Testing mechanism described in the recommendations since 1.0; therefore, our proposal is standard-compliant and also compatible with older endpoints that do not support the 1.1 specifications, where it can also replace the missing SERVICE clause.
Learning Structure and Schemas from Documents | 2011
Maurizio Atzori; Nicoletta Dessì
In this chapter we investigate the crucial problem underlying the concept of dataspaces: the need for human interaction or intervention in the process of organizing (that is, deriving the structure of) unstructured data. We survey the existing techniques behind dataspaces that aim to overcome that need, exploring the structure of a dataspace along three dimensions: dataspace profiling, querying and searching, and application domain. We further review existing projects focusing on dataspaces, the induction of data structure from documents, and data models in which data schema and document structure overlap, such as Apache Hadoop, Cassandra on Amazon Dynamo, Google BigTable model and other DHT-based flexible data structures, Google Fusion Tables, iMeMex, U-DID, WebTables and Yahoo! SearchMonkey.
workshops on enabling technologies: infrastructure for collaborative enterprises | 2015
Maurizio Atzori; Carlo Zaniolo
This paper discusses the expressivity and accuracy of the By-Example Structured (BESt) Query paradigm implemented in the SWiPE system through the Wikipedia interface. We define an experimental setting based on the natural language questions made available by the QALD-4 challenge, in which we compare SWiPE against Xser, a state-of-the-art Question Answering system, and the plain keyword search provided by the Wikipedia Search Engine. The experiments show that SWiPE outperforms the results provided by Wikipedia, and that it also performs considerably better than Xser, obtaining 85% fully correct answers overall vs. 68% for Xser. Among all answered questions, we obtain a precision of 100% and a recall of 96%. SWiPE is also able to answer more questions than the other systems. A formal characterization of the set of SPARQL queries supported by the BESt Query paradigm is also provided.
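For readers unfamiliar with the two measures quoted above, the following minimal sketch computes set-based precision and recall for a single question. The abstract does not say how per-question scores are aggregated, so this only shows the standard definitions; the function name and the sample answer sets are hypothetical.

```python
def precision_recall(gold: set, returned: set):
    """Precision and recall of a returned answer set against a gold-standard set."""
    correct = len(gold & returned)
    # Precision: share of returned answers that are correct.
    precision = correct / len(returned) if returned else 0.0
    # Recall: share of gold-standard answers that were returned.
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical question: the gold standard lists 4 answers, the system returns 3 of them.
gold = {"a", "b", "c", "d"}
returned = {"a", "b", "c"}
precision, recall = precision_recall(gold, returned)
```

In this made-up case every returned answer is correct but one gold answer is missing, which is the kind of situation in which precision stays at its maximum while recall drops below it.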
workshops on enabling technologies: infrastructure for collaborative enterprises | 2014
Maurizio Atzori; Andrea Dessi
In many Semantic Web applications, having RDF predicates sorted by significance is of primary importance for improving usability and performance. In this paper we focus on the predicates available in DBpedia, the most important Semantic Web data source, with 470 million English triples. Although there is plenty of work in the literature dealing with ranking entities or RDF query results, none of it seems to specifically address the problem of computing predicate rank. We address the problem by associating with each DBpedia property (also known as a predicate or attribute of RDF triples) eight original features specifically designed to provide sort-by-importance quantitative measures, automatically computable from an online SPARQL endpoint or an RDF dataset. By computing those features on a number of entity properties, we created a learning set and tested the performance of a number of well-known learning-to-rank algorithms. Our first experimental results show that the approach is effective and fast. Further, we provide an extensive survey of state-of-the-art algorithms for RDF ranking, to which we compare our approach.
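The abstract does not list the eight features, so the sketch below only illustrates the general idea of deriving per-predicate quantitative measures from a set of triples and sorting on them. The two features used here (triple count and number of distinct subjects) are assumptions made for this example, and the fixed sort order stands in for the learning-to-rank step described in the paper. All identifiers and the sample data are hypothetical.

```python
from collections import defaultdict

def predicate_features(triples):
    """Per predicate: (number of triples, number of distinct subjects using it)."""
    counts = defaultdict(int)
    subjects = defaultdict(set)
    for s, p, _ in triples:
        counts[p] += 1
        subjects[p].add(s)
    return {p: (counts[p], len(subjects[p])) for p in counts}

def rank_predicates(triples):
    """Order predicates by distinct subjects, then triple count, then name."""
    feats = predicate_features(triples)
    return sorted(feats, key=lambda p: (-feats[p][1], -feats[p][0], p))

# Hypothetical toy dataset with placeholder entities e1..e3.
TRIPLES = [
    ("e1", "name", "A"), ("e2", "name", "B"), ("e3", "name", "C"),
    ("e1", "birthPlace", "X"), ("e2", "birthPlace", "Y"),
    ("e1", "alias", "A1"), ("e1", "alias", "A2"), ("e1", "alias", "A3"),
]

features = predicate_features(TRIPLES)
ranking = rank_predicates(TRIPLES)
```

A learning-to-rank setup would instead feed such feature vectors, together with human relevance labels, to a trained model rather than hard-coding the sort key.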
workshops on enabling technologies: infrastructure for collaborative enterprises | 2017
Nicoletta Dessì; Maurizio Atzori
Despite the significant contribution from specialized ontologies and text mining methods, the evaluation of the semantic similarity of genes remains difficult because of the complex functions in which genes are involved. A less exploited resource is Wikipedia, which stores more than 10,400 articles about human genes: each gene name identifies the corresponding Wikipedia page, which summarizes the gene's properties in short sentences where hyperlinks define relationships with other genes in Wikipedia. This paper evaluates the extent to which Wikipedia can be trusted for assessing the similarity of a gene pair as the distance between their Wikipedia pages. We present a set of experiments that make use of TagMe (a powerful tool for evaluating the distance between two Wikipedia pages based on their annotations) to calculate the semantic similarity of several sets of genes on Wikipedia. Results compare well with gold standards and with semantic similarity values evaluated on gene ontologies. The paper demonstrates the effectiveness of Wikipedia in recognizing functional groups of genes, the quality and wealth of its knowledge about genes, as well as the accuracy of TagMe.
Information & Computation | 2017
Carlo Zaniolo; Shi Gao; Maurizio Atzori; Muhao Chen; Jiaqi Gu
DBpedia and other RDF-encoded Knowledge Bases (KBs) give users access to encyclopedic knowledge via SPARQL queries. As the world evolves, the KBs are updated, and the history of entities and their properties becomes of great interest. Thus, we need powerful tools and friendly interfaces to query histories and flash back to the past. Here, we propose (i) a point-based temporal extension of SPARQL, called SPARQLT, which enables simple and concise expression of temporal queries, and (ii) an extension of Wikipedia Infoboxes to support user-friendly by-example temporal queries implemented by mapping them into SPARQLT. Our main-memory RDF-TX system supports such queries efficiently using Multi-Version B+ trees, compressed indexes, and query optimization techniques, which achieve performance and scalability, as demonstrated by experiments on historical datasets including Cliopedia, derived from Wikipedia dumps. We finally discuss how provenance information can be used to add valid-time features to these transaction-time KBs.
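To make the notion of a point-based temporal query concrete, here is a minimal sketch of a store that keeps the history of each (subject, predicate) pair and answers "what was the value at time t?". It is not the RDF-TX implementation: a sorted Python list with `bisect` is swapped in for the Multi-Version B+ trees mentioned above, no temporal SPARQL syntax is shown, and the class name, method names, and sample data are all hypothetical.

```python
import bisect
from collections import defaultdict

class VersionedStore:
    """Keeps, per (subject, predicate), a time-ordered history of values."""

    def __init__(self):
        self._times = defaultdict(list)   # (s, p) -> sorted start times
        self._values = defaultdict(list)  # (s, p) -> values aligned with _times

    def assert_value(self, s, p, value, start):
        """Record that (s, p) takes `value` from time point `start` onward."""
        key = (s, p)
        i = bisect.bisect_right(self._times[key], start)
        self._times[key].insert(i, start)
        self._values[key].insert(i, value)

    def value_at(self, s, p, t):
        """Return the value in effect at time point t, or None if none exists yet."""
        key = (s, p)
        # Index of the first start time strictly greater than t.
        i = bisect.bisect_right(self._times.get(key, []), t)
        return self._values[key][i - 1] if i else None

# Hypothetical history: the "ceo" of "org1" changes in 2010.
store = VersionedStore()
store.assert_value("org1", "ceo", "alice", 2001)
store.assert_value("org1", "ceo", "bob", 2010)

value_2005 = store.value_at("org1", "ceo", 2005)
value_2010 = store.value_at("org1", "ceo", 2010)
value_1999 = store.value_at("org1", "ceo", 1999)
```

Each lookup is a binary search over one history, which is the property a multi-version index is meant to preserve at much larger scale.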
ieee international conference semantic computing | 2014
Andrea Dessi; Maurizio Atzori
In many Semantic Web applications, having RDF predicates sorted by significance is of primary importance for improving usability and performance. In this paper we focus on the predicates available in DBpedia, the most important Semantic Web data source, with 470 million English triples. Although there is plenty of work in the literature dealing with ranking entities or RDF query results, none of it seems to specifically address the problem of computing predicate rank. We address the problem by associating with each DBpedia property (also known as a predicate or attribute of RDF triples) a number of original features specifically designed to provide sort-by-importance quantitative measures, automatically computable from an online SPARQL endpoint or an RDF dataset. By computing those features on a number of entity properties, we created a learning set and tested the performance of a number of well-known learning-to-rank algorithms. Our first experimental results show that the approach is effective and fast.
ieee international conference semantic computing | 2016
Daniele Stefano Ferru; Maurizio Atzori
We present a framework for writing custom SPARQL functions that can run on different SPARQL engines. Our current implementation supports some of the major open-source engines, namely Apache Jena/Fuseki, OpenLink Virtuoso and Sesame. In our experiments we evaluate performance in terms of running time, as well as the ease of building custom functions with our write-once, run-anywhere framework.