Dario Colazzo
Paris Dauphine University
Publication
Featured research published by Dario Colazzo.
international world wide web conferences | 2014
Dario Colazzo; François Goasdoué; Ioana Manolescu; Alexandra Roatis
The development of the Semantic Web (RDF) brings new requirements for data analytics tools and methods, going beyond querying to semantics-rich analytics through warehouse-style tools. In this work, we fully redesign, from the bottom up, core data analytics concepts and tools in the context of RDF data, leading to the first complete formal framework for warehouse-style RDF analytics. Notably, we define i) analytical schemas tailored to heterogeneous, semantics-rich RDF graphs, ii) analytical queries which (beyond relational cubes) allow flexible querying of the data and the schema as well as powerful aggregation, and iii) OLAP-style operations. Experiments on a fully-implemented platform demonstrate the practical interest of our approach.
Journal of Functional Programming | 2006
Dario Colazzo; Giorgio Ghelli; Paolo Manghi; Carlo Sartiani
A part of a query that will never contribute data to the query answer should be regarded as an error. This principle has been recently accepted into mainstream XML query languages, but was still waiting for a complete treatment. We provide here a precise definition for this class of errors, and define a type system that is sound and complete, in its search for such errors, for a core language, under mild restrictions on the use of recursion in type definitions. In the process, we describe a dichotomy between existential and universal type systems, which is essential to understand some specific features of our type system.
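As a rough illustration of the kind of error described above, the following sketch flags a path step that can never select a node. It is not the paper's type system: the schema format (a map from element name to permitted child names), the `SCHEMA` contents, and the function `first_dead_step` are all hypothetical, and only child steps are handled.

```python
# Hypothetical toy schema: element name -> names its children may have.
SCHEMA = {
    "bib": {"book"},
    "book": {"title", "author"},
    "title": set(),
    "author": set(),
}


def first_dead_step(root, steps, schema=SCHEMA):
    """Return the index of the first child step that can never select a node
    under the toy schema, or None if every step may select something."""
    current = root
    for i, step in enumerate(steps):
        # A child step is dead if the schema never allows that child here.
        if step not in schema.get(current, set()):
            return i
        current = step
    return None


print(first_dead_step("bib", ["book", "title"]))  # None
print(first_dead_step("bib", ["book", "price"]))  # 1
```

The second query is reported as erroneous at step 1 because no `book` element may contain a `price` child, so that part of the query can never contribute data.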
Information Systems | 2009
Dario Colazzo; Giorgio Ghelli; Carlo Sartiani
Inclusion between XML types is important but expensive, and is much more expensive when unordered types are considered. We prove here that inclusion for XML types with interleaving and counting can be decided in polynomial time in the presence of two important restrictions: no element appears twice in the same content model, and Kleene star is only applied to disjunctions of single elements. Our approach is based on the transformation of each such content model into a set of constraints that completely characterizes the generated language. We then reduce inclusion checking to constraint implication. We exhibit a quadratic algorithm to perform inclusion checking on a RAM machine.
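The idea of reducing inclusion to constraint implication can be sketched on a much simpler class than the one treated in the paper. The sketch below assumes a content model that is just an interleaving of distinct symbols, each with a counting interval; the dictionary representation and the function `included` are illustrative assumptions, not the quadratic algorithm the abstract refers to.

```python
import math

# Hypothetical representation: symbol -> (min occurrences, max occurrences).
# A type denotes every word whose per-symbol counts fall in these intervals
# and that uses no other symbol (an interleaving of counted single symbols).


def included(t1, t2):
    """True if every word of t1 is also a word of t2 (toy class only)."""
    for sym in set(t1) | set(t2):
        lo1, hi1 = t1.get(sym, (0, 0))  # absent symbol: exactly zero times
        lo2, hi2 = t2.get(sym, (0, 0))
        # The count constraint of t1 must imply the count constraint of t2.
        if lo1 < lo2 or hi1 > hi2:
            return False
    return True


small = {"a": (1, 2), "b": (0, 1)}
large = {"a": (1, math.inf), "b": (0, 1), "c": (0, 1)}

print(included(small, large))  # True
print(included(large, small))  # False
```

Here each type is read as a set of cardinality constraints, and inclusion holds exactly when every constraint of the larger type is implied by the smaller one.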
conference on information and knowledge management | 2012
Andrés Aranda-Andújar; Francesca Bugiotti; Jesús Camacho-Rodríguez; Dario Colazzo; François Goasdoué; Zoi Kaoudi; Ioana Manolescu
We present AMADA, a platform for storing Web data (in particular, XML documents and RDF graphs) based on the Amazon Web Services (AWS) cloud infrastructure. AMADA operates in a Software as a Service (SaaS) approach, allowing users to upload, index, store, and query large volumes of Web data. The demonstration shows (i) the step-by-step procedure for building and exploiting the warehouse (storing, indexing, querying) and (ii) the monitoring tools enabling one to control the expenses (monetary costs) charged by AWS for the operations involved while running AMADA.
conference on information and knowledge management | 2008
Giorgio Ghelli; Dario Colazzo; Carlo Sartiani
The extension of Regular Expressions (REs) with an interleaving (shuffle) operator has been proposed on many occasions, since it would be crucial to deal with unordered data. However, interleaving badly affects the complexity of basic operations, and, especially, makes membership NP-hard [13], which is unacceptable for most uses of REs. REs form the basis of most XML type languages, such as DTDs and XML Schema types, and XDuce types [16, 11]. In this context, the interleaving operator would be a natural addition to the language of REs, as witnessed by the presence of limited forms of interleaving in XSD (the all group), Relax-NG, and SGML, provided that the NP-hardness of membership could be avoided. We present here a restricted class of REs with interleaving and counting which admits a linear membership algorithm, and which is expressive enough to cover the vast majority of real-world XML types. We first present an algorithm for membership of a list of words into a RE with interleaving and counting, based on the translation of the RE into a set of constraints. We generalize the approach in order to check membership of XML trees into a class of EDTDs with interleaving and counting, which models the crucial aspects of DTDs and XSD schemas.
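To give a feel for constraint-based membership, the sketch below handles only a tiny fragment: an interleaving of distinct symbols, each with a counting interval. It is an illustration under that assumption, not the algorithm of the paper; the dictionary representation and the function `member` are hypothetical.

```python
from collections import Counter
import math

# Hypothetical representation of a restricted RE such as
#   a[1..2] & b[0..1] & c[1..*]
# as a map from symbol to its (min, max) occurrence bounds.
TYPE = {"a": (1, 2), "b": (0, 1), "c": (1, math.inf)}


def member(word, t=TYPE):
    """Check a list of symbols against the count constraints of t.

    One pass counts the symbols, a second pass checks the bounds, so the
    cost is linear in the word length plus the number of symbols in t.
    """
    counts = Counter(word)
    # Any symbol the type does not mention makes the word invalid.
    if any(sym not in t for sym in counts):
        return False
    # Order is irrelevant under interleaving: only the counts matter.
    return all(lo <= counts[sym] <= hi for sym, (lo, hi) in t.items())


print(member(["c", "a", "b"]))       # True
print(member(["a", "a", "a", "c"]))  # False
print(member(["a", "b"]))            # False
```

The second word fails because `a` occurs three times, and the third because the mandatory `c` is missing.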
international conference on data engineering | 2012
Jesús Camacho-Rodríguez; Dario Colazzo; Ioana Manolescu
It is by now widely accepted that an increasing part of the world's interesting data is either shared through the Web or directly produced through and for Web platforms using formats like XML (structured documents). We present a scalable store for managing large corpora of XML documents built on top of off-the-shelf cloud infrastructure. We implement different indexing strategies to evaluate a query workload over the stored documents in the cloud. Moreover, each strategy presents different trade-offs between efficiency in query answering and cost for storing the index.
logic in computer science | 1999
Dario Colazzo; Giorgio Ghelli
The problem of defining and checking a subtype relation between recursive types was studied in Amadio and Cardelli (1993) for a first order type system, but for second order systems, which combine subtyping and parametric polymorphism, only negative results are known. This paper studies the problem of subtype checking for recursive types in system kernel Fun, a typed λ-calculus with subtyping and bounded second order polymorphism. Along the lines of Amadio and Cardelli (1993), we study the definition of a subtype relation over kernel Fun recursive types, and then we present a subtyping algorithm which is sound and complete with respect to this relation. We show that the natural extension of the techniques introduced in Amadio and Cardelli (1993) to compare first order recursive types gives an incomplete algorithm. We prove the completeness and correctness of a different algorithm, which also admits an efficient implementation.
database programming languages | 2005
Dario Colazzo; Carlo Sartiani
Unstructured p2p database systems are usually characterized by the presence of schema mappings among peers. In these systems, the detection of corrupted mappings is a key problem. A corrupted mapping fails in matching the target or the source schema, hence it is not able to transform data conforming to a schema $\mathcal{S}_i$ into data conforming to a schema
extending database technology | 2013
Nicole Bidoit; Dario Colazzo; Noor Malla; Federico Ulliana; Maurizio Nolé; Carlo Sartiani
international database engineering and applications symposium | 2012
Nicole Bidoit; Dario Colazzo; Noor Malla; Carlo Sartiani