Diego Milano
Sapienza University of Rome
Publication
Featured research published by Diego Milano.
international conference on data engineering | 2006
Floris Geerts; Anastasios Kementsietsidis; Diego Milano
Annotations play a central role in the curation of scientific databases. Despite their importance, data formats and schemas are not designed to manage the increasing variety of annotations. Moreover, DBMSs often lack support for storing and querying annotations, and annotations and data are only loosely coupled. This paper introduces an annotation-oriented data model for the manipulation and querying of both data and annotations. In particular, the model allows annotations to be specified on sets of values and supports effective querying of the associations between annotations and values. We use the concept of a block to represent an annotated set of values. Different colors applied to the blocks represent different annotations. We introduce a color query language for our model and prove it to be both complete (it can express all possible queries over the class of annotated databases) and minimal (all the algebra operators are primitive). We present MONDRIAN, a prototype implementation of our annotation mechanism, and we conduct experiments that investigate the set of parameters which influence the evaluation cost for color queries.
international conference on move to meaningful internet systems | 2005
Diego Milano; Monica Scannapieco; Tiziana Catarci
Real data is often affected by errors and inconsistencies. Many of them arise because schemas cannot represent a sufficiently wide range of constraints. Data cleaning is the process of identifying and possibly correcting data quality problems that affect the data. Cleaning data requires gathering knowledge about the domain to which the data refer. However, existing data cleaning techniques still access this knowledge as a fragmented collection of heterogeneous rules and ad hoc data transformations. Furthermore, data cleaning methodologies for an important class of data based on the semistructured XML data model have not yet been proposed. In this paper we introduce the OXC framework, which offers a methodology for XML data cleaning based on a uniform representation of domain knowledge through an ontology. We describe how to define XML-related data quality metrics based on our domain knowledge representation, and give a definition of various metrics related to the completeness data quality dimension.
extending database technology | 2006
Floris Geerts; Anastasios Kementsietsidis; Diego Milano
We demonstrate iMONDRIAN, a component of the MONDRIAN annotation management system. Distinguishing features of MONDRIAN are (i) the ability to annotate sets of values and (ii) the annotation-aware query algebra. On top of that, iMONDRIAN offers an intuitive visual interface to annotate and query scientific databases. In this demonstration, we consider Gene Ontology (GO), a publicly available biological database. Using this database we show (i) the creation of annotations through the visual interface, (ii) the ability to visually build complex, annotation-aware queries, and (iii) the basic functionality for tracking annotation provenance. Our demonstration also provides a cheat window which shows the system internals and how visual queries are translated to annotation-aware algebra queries.
International Journal of Enterprise Information Systems | 2007
Diego Milano; Monica Scannapieco; Tiziana Catarci
Data quality is becoming an increasingly important issue in environments characterized by extensive data replication. Among such environments, this article focuses on cooperative information systems (CISs), for which it is very important to declare and access quality of data. The article describes a general methodology for evaluating quality of data, and the design of an architectural component, named quality factory, that implements quality evaluation of XML data. The detailed design and implementation of a further service, named data quality broker, are presented. The data quality broker accesses data and related quality distributed in the CIS and improves quality of data by comparing different copies present in the system. The data quality broker has been implemented as a peer-to-peer service, and a set of experiments on real data shows its effectiveness and performance behavior.
Archive | 2006
Diego Milano; Monica Scannapieco; Tiziana Catarci
Data quality is becoming an increasingly important issue in environments characterized by extensive data replication. Among such environments, this paper focuses on Cooperative Information Systems (CISs), for which it is very important to declare and access quality of data. Specifically, we describe the detailed design and implementation of a peer-to-peer service for exchanging and improving data quality in CISs. The service provides access to data and related quality distributed in the CIS and improves quality of data by comparing different copies of the same data. Experiments on real data show the effectiveness of the service and its performance behavior.
Lecture Notes in Computer Science | 2004
Diego Milano; Monica Scannapieco; Tiziana Catarci
Recent research has highlighted the importance of data quality issues in environments characterized by extensive data replication, such as Cooperative Information Systems (CISs). While high data quality is a strict requirement for CISs, the high degree of data replication that characterizes such systems can be exploited to improve the quality of data, as different copies of the same data may be compared in order to detect quality problems and possibly solve them.
very large data bases | 2007
Carola Aiello; Roberto Baldoni; Devis Bianchini; Silvia Bonomi; Silvana Castano; Tiziana Catarci; Carlo Curino; Valeria De Antonellis; Alfio Ferrara; Michele Melchiori; Diego Milano; Stefano Montanelli; Giorgio Orsi; Antonella Poggi; Leonardo Querzoni; Elisa Quintarelli; Rosalba Rossato; Denise Salvi; Monica Scannapieco; Fabio A. Schreiber; Letizia Tanca; Sara Tucci Pergiovanni
conference on advanced information systems engineering | 2004
Diego Milano; Monica Scannapieco; Tiziana Catarci
Journal of Digital Information Management | 2005
Diego Milano; Monica Scannapieco; Tiziana Catarci
Archive | 2009
Tiziana Catarci; Diego Milano; Monica Scannapieco