Paolo Guagliardo | Researchain

Publication

Featured research published by Paolo Guagliardo.

Symposium on Principles of Database Systems | 2016

Making SQL Queries Correct on Incomplete Databases: A Feasibility Study

Paolo Guagliardo; Leonid Libkin

Multiple issues with SQL's handling of nulls have been well documented. Having efficiency as its key goal, evaluation of SQL queries disregards the standard notion of correctness on incomplete databases (certain answers) due to its high complexity. As a result, it may produce answers that are just plain wrong. It was recently shown that SQL evaluation can be modified, at least for first-order queries, to return only correct answers. But while these modifications came with good theoretical complexity bounds, they have not been tested in practice. The goals of this proof-of-concept paper are to understand whether wrong answers can be produced by SQL queries in real-world scenarios, and whether proposed techniques for avoiding them can be made practically feasible. We use the TPC-H benchmark, and show that for some typical queries involving negation, wrong answers are very common. On the other hand, existing solutions for fixing the problem do not work in practice at all. By analyzing the reasons for this, we come up with a new modified way of rewriting SQL queries that restores correctness. We conduct experiments which show the feasibility of our solution: the small price tag it imposes can often be tolerated to ensure correct results, and we do not miss correct answers that the usual SQL evaluation produces. The overall conclusion is that correct evaluation can be realistically achieved in the presence of nulls, at least for the SQL fragment that corresponds to first-order queries.
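
A minimal sketch of the kind of wrong answer the abstract describes, using Python's built-in sqlite3 module. The tables r and s and the helper rows_of_r_missing_from_s are illustrative assumptions; they are not taken from the paper or from TPC-H. With a NULL in s, the NOT EXISTS query returns 1, yet 1 is not a certain answer: in the possible world where the NULL stands for 1, the same query returns nothing.

```python
import sqlite3


def rows_of_r_missing_from_s(s_value):
    """Evaluate a query with negation on r = {1} and s = {s_value}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (a INTEGER)")
    conn.execute("CREATE TABLE s (a INTEGER)")
    conn.execute("INSERT INTO r VALUES (1)")
    # Passing None stores a SQL NULL.
    conn.execute("INSERT INTO s VALUES (?)", (s_value,))
    rows = conn.execute(
        "SELECT r.a FROM r "
        "WHERE NOT EXISTS (SELECT * FROM s WHERE s.a = r.a)"
    ).fetchall()
    conn.close()
    return rows


# s contains a NULL: the comparison s.a = r.a is unknown, the subquery
# is empty, and the row 1 is returned.
sql_answer = rows_of_r_missing_from_s(None)

# One possible world: the NULL actually stands for 1, so 1 is not in the answer.
possible_world_answer = rows_of_r_missing_from_s(1)

print(sql_answer, possible_world_answer)
```

Because at least one possible world excludes the row, returning it on the database with the NULL is a false positive with respect to certain answers.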

Data Exchange, Information, and Streams | 2013

Query Processing in Data Integration

Paolo Guagliardo; Piotr Wieczorek

In this chapter we illustrate the main techniques for processing queries in data integration. The first part of the chapter focuses on the problem of query answering in the relational setting, and describes approaches based on variants of the chase, along with how to deal with integrity constraints and access patterns. The second part of the chapter investigates query processing in the context of semistructured data, which is best described by graph-based data models, where the expressiveness of query languages not common in traditional database systems makes it possible to point out the subtle differences between query answering and query rewriting. The chapter closes with a very brief discussion of query processing in data integration with XML and ontologies.

Very Large Data Bases | 2017

A formal semantics of SQL queries, its validation, and applications

Paolo Guagliardo; Leonid Libkin

While formal semantics of theoretical languages underlying SQL have been provided in the past, they all made simplifying assumptions ranging from changes in the syntax to omitting bag semantics and nulls. This situation is reminiscent of what happens in the field of programming languages, where semantics of formal calculi underlying the main features of languages are abundant, but formal semantics of real languages that people use are few and far between. We consider the basic class of SQL queries (essentially SELECT-FROM-WHERE queries with subqueries, set/bag operations, and nulls) and define a formal semantics for it, without any departures from the real language. This fragment already requires decisions related to the data model and handling variable names that are normally disregarded by simplified semantics. To justify our choice of the semantics, we validate it experimentally on a large number of randomly generated queries and databases. We give two applications of the semantics. One is the first formal proof of the equivalence of basic SQL and relational algebra that extends to bag semantics and nulls. The other application looks at the three-valued logic employed by SQL, which is universally assumed to be necessary to handle nulls. We prove, however, that this is not so, as three-valued logic does not add expressive power: every SQL query in our fragment can be evaluated under the usual two-valued Boolean semantics of conditions.
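
For reference, the three-valued logic in question can be sketched in a few lines of Python, with None standing in for SQL's unknown truth value. The function names (not3, and3, or3, eq3, where) are hypothetical. This only illustrates how conditions evaluate and how WHERE keeps the rows whose condition is true; it is neither the formal semantics nor the two-valued translation from the paper.

```python
def not3(p):
    """Negation: unknown stays unknown."""
    return None if p is None else (not p)


def and3(p, q):
    """Conjunction: false dominates, then unknown."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True


def or3(p, q):
    """Disjunction: true dominates, then unknown."""
    if p is True or q is True:
        return True
    if p is None or q is None:
        return None
    return False


def eq3(x, y):
    """Comparison with a null (None) is unknown."""
    return None if x is None or y is None else x == y


def where(values, condition):
    """Keep only values whose condition is true (not false, not unknown)."""
    return [v for v in values if condition(v) is True]


values = [1, 2, None]
kept_equal = where(values, lambda a: eq3(a, 1))
kept_not_equal = where(values, lambda a: not3(eq3(a, 1)))
print(kept_equal, kept_not_equal)
```

The null value is kept by neither the condition nor its negation, which is the behavior that distinguishes this logic from ordinary Boolean evaluation.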

International Conference on Management of Data | 2017

Correctness of SQL Queries on Databases with Nulls

Paolo Guagliardo; Leonid Libkin

Multiple issues with SQL's handling of nulls have been well documented. Having efficiency as its main goal, SQL disregards the standard notion of correctness on incomplete databases (certain answers) due to its high complexity. As a result, the evaluation of SQL queries on databases with nulls may produce answers that are just plain wrong. However, SQL evaluation can be modified, at least for relational algebra queries, to approximate certain answers, i.e., return only correct answers. We examine recently proposed approximation schemes for certain answers and analyze their complexity, both theoretical bounds and real-life behavior.

British National Conference on Databases | 2013

Lossless horizontal decomposition with domain constraints on interpreted attributes

Ingo Feinerer; Enrico Franconi; Paolo Guagliardo

Horizontal decomposition is the process of splitting a relation into sub-relations, called fragments, each containing a subset of the rows of the original relation. In this paper, we consider horizontal decomposition in a setting where some of the attributes in the database schema are interpreted over a specific domain, on which a set of special predicates and functions is defined. We study the losslessness of horizontal decomposition, that is, whether the original relation can be reconstructed from the fragments by union, in the presence of integrity constraints on the database schema. We introduce the new class of conditional domain constraints (CDCs), restricting the values the interpreted attributes may take whenever a certain condition holds on the non-interpreted ones, and investigate lossless horizontal decomposition under CDCs in isolation, as well as in combination with functional and unary inclusion dependencies.
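
The notion of losslessness can be illustrated on a single instance with a short Python sketch. The relation of (kind, amount) rows, the two selection predicates, and the constraint "kind = deposit implies amount > 0" are all made-up assumptions for illustration. The check below only tests one given instance, whereas the paper studies whether the decomposition is lossless for every instance satisfying the constraints.

```python
def select(relation, predicate):
    """Horizontal fragment: the rows of the relation satisfying the predicate."""
    return {row for row in relation if predicate(row)}


def satisfies_cdc(relation):
    """Hypothetical conditional domain constraint:
    whenever kind is 'deposit', amount must be positive."""
    return all(amount > 0 for kind, amount in relation if kind == "deposit")


def is_lossless_on(relation, predicates):
    """True if the union of the fragments reconstructs this instance."""
    fragments = [select(relation, p) for p in predicates]
    return set().union(*fragments) == relation


# Two selection views: non-deposit rows, and rows with a positive amount.
predicates = [
    lambda row: row[0] != "deposit",
    lambda row: row[1] > 0,
]

good = {("deposit", 50), ("deposit", 20), ("fee", -3), ("fee", 0)}
bad = good | {("deposit", -5)}  # violates the constraint

print(satisfies_cdc(good), is_lossless_on(good, predicates))
print(satisfies_cdc(bad), is_lossless_on(bad, predicates))
```

On the instance that violates the constraint, the row ("deposit", -5) falls into neither fragment, so the union no longer reconstructs the relation.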

International Conference on Management of Data | 2018

Cypher: An Evolving Query Language for Property Graphs

Nadime Francis; Alastair Green; Paolo Guagliardo; Leonid Libkin; Tobias Lindaaker; Victor Marsault; Stefan Plantikow; Mats Rydberg; Petra Selmer; Andrés Taylor

The Cypher property graph query language is an evolving language, originally designed and implemented as part of the Neo4j graph database, and it is currently used by several commercial database products and researchers. We describe Cypher 9, which is the first version of the language governed by the openCypher Implementers Group. We first introduce the language by example, and describe its uses in industry. We then provide a formal semantic definition of the core read-query features of Cypher, including its variant of the property graph data model, and its ASCII Art graph pattern matching mechanism for expressing subgraphs of interest to an application. We compare the features of Cypher to other property graph query languages, and describe extensions, at an advanced stage of development, which will form part of Cypher 10, turning the language into a compositional language which supports graph projections and multiple named graphs.

International Joint Conference on Artificial Intelligence | 2017

On Querying Incomplete Information in Databases under Bag Semantics

Marco Console; Paolo Guagliardo; Leonid Libkin

Querying incomplete data is an important task both in data management, and in many AI applications that use query rewriting to take advantage of relational database technology. Usually one looks for answers that are certain, i.e., true in every possible world represented by an incomplete database. For positive queries – expressed either in positive relational algebra or as unions of conjunctive queries – finding such answers can be done efficiently when databases and query answers are sets. Real-life databases however use bag, rather than set, semantics. For bags, instead of saying that a tuple is certainly in the answer, we have more detailed information: namely, the range of the numbers of occurrences of the tuple in query answers. We show that the behavior of positive queries is different under bag semantics: finding the minimum number of occurrences can still be done efficiently, but for maximum it becomes intractable. We use these results to investigate approximation schemes for computing certain answers to arbitrary first-order queries that have been proposed for set semantics. One of them cannot be adapted to bags, as it relies on the intractable maxima of occurrences, but another scheme only deals with minima, and we show how to adapt it to bag semantics without losing efficiency.
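
The "range of the numbers of occurrences" can be made concrete with a brute-force Python sketch. It assumes nulls are written as strings starting with an underscore and that they range over a small finite domain supplied by the caller; both are simplifications for illustration, and the helper names are hypothetical. Enumerating all valuations is exponential in the number of nulls, so this is not one of the efficient procedures the abstract refers to.

```python
from collections import Counter
from itertools import product


def is_null(value):
    """Assumed encoding: marked nulls are strings such as '_x'."""
    return isinstance(value, str) and value.startswith("_")


def occurrence_range(bag, target, domain):
    """Min and max multiplicity of target over all valuations of the nulls."""
    nulls = sorted({v for row in bag for v in row if is_null(v)})
    counts = []
    for values in product(domain, repeat=len(nulls)):
        valuation = dict(zip(nulls, values))
        # Replace each null by its assigned value and count duplicates.
        world = Counter(
            tuple(valuation.get(v, v) for v in row) for row in bag
        )
        counts.append(world[target])
    return min(counts), max(counts)


# A bag with two copies of (1,) and one unknown row.
bag = [(1,), ("_x",), (1,)]
low, high = occurrence_range(bag, (1,), domain=[1, 2])
print(low, high)
```

Under set semantics one would only say that (1,) is certainly in the relation; under bag semantics the answer is the interval from 2 to 3 occurrences.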

IEEE Transactions on Knowledge and Data Engineering | 2015

Lossless Selection Views under Conditional Domain Constraints

Ingo Feinerer; Enrico Franconi; Paolo Guagliardo

A set of views defined by selection queries splits a database relation into sub-relations, each containing a subset of the original rows. This decomposition into horizontal fragments is lossless when the initial relation can be reconstructed from the fragments by union. In this paper, we consider horizontal decomposition in a setting where some of the attributes in the database schema are interpreted over a specific domain, on which a set of special predicates and functions is defined. We study losslessness in the presence of integrity constraints on the database schema. We consider the class of conditional domain constraints (CDCs), which restrict the values that the interpreted attributes may take whenever a certain condition holds on the non-interpreted ones, and investigate lossless horizontal decomposition under CDCs in isolation, as well as in combination with functional and unary inclusion dependencies.

Information Systems | 2018

On the Codd semantics of SQL nulls

Paolo Guagliardo; Leonid Libkin

Theoretical models used in database research often have subtle differences with those occurring in practice. One particular mismatch that is usually neglected concerns the use of marked nulls to represent missing values in theoretical models of incompleteness, while in an SQL database these are all denoted by the same syntactic NULL object. It is commonly argued that results obtained in the model with marked nulls carry over to SQL, because SQL nulls can be interpreted as Codd nulls, which are simply marked nulls that do not repeat. This argument, however, does not take into account that even simple queries may produce answers where distinct occurrences of NULL do in fact denote the same unknown value. For such queries, interpreting SQL nulls as Codd nulls would incorrectly change the semantics of query answers. To use results about Codd nulls for real-life SQL queries, we need to understand which queries preserve the Codd interpretation of SQL nulls. We show, however, that the class of relational algebra queries preserving Codd interpretation is not recursively enumerable, which necessitates looking for sufficient conditions for such preservation. Those can be obtained by exploiting the information provided by NOT NULL constraints on the database schema. We devise mild syntactic restrictions on queries that guarantee preservation, do not limit the full expressiveness of queries on databases without nulls, and can be checked efficiently.
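
A small sketch of the mismatch, using Python's built-in sqlite3 module. The table r, the query, and the helper codd_interpretation are illustrative assumptions rather than examples from the paper. The query duplicates a column, so the two NULLs in the answer row come from the same stored value and denote the same unknown; reading each NULL as a fresh, non-repeating null loses that equality.

```python
import sqlite3
from itertools import count


def codd_interpretation(rows):
    """Replace every NULL (None) by a fresh label, so no null repeats."""
    fresh = count(1)
    return [
        tuple(f"_n{next(fresh)}" if v is None else v for v in row)
        for row in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (a INTEGER)")
conn.execute("INSERT INTO r VALUES (NULL)")

# Both output columns are copies of the same stored NULL.
answer = conn.execute("SELECT a, a FROM r").fetchall()
conn.close()

codd_answer = codd_interpretation(answer)
print(answer, codd_answer)
```

In the SQL answer the two components are necessarily equal in every possible world, while in the Codd reading they are two unrelated unknowns.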

Database and Expert Systems Applications | 2014

Translatable Updates of Selection Views under Constant Complement

Enrico Franconi; Paolo Guagliardo

Given a lossless view associating a source relation with a set of target relations defined by selection queries over the source, we study how updates of the target relations can be consistently and univocally propagated to the underlying source relation. We consider a setting where some of the attributes in the schema are interpreted over some specific domain (e.g., the reals or the integers) whose data values can be compared beyond equality, by means of special predicates (e.g., smaller/greater than) and functions (e.g., addition and subtraction). The source schema is constrained by conditional domain constraints, which restrict the values that are admissible for the interpreted attributes whenever a certain condition is satisfied by the values taken by the non-interpreted ones.
