Gianni Costa

Indian Council of Agricultural Research

Publication

Featured research published by Gianni Costa.

very large data bases | 2003

XQueC: pushing queries to compressed XML data

Andrei Arion; Angela Bonifati; Gianni Costa; Sandra D'Aguanno; Ioana Manolescu; Andrea Pugliese

Publisher Summary: This chapter describes how the loader and compressor convert XML documents into a compressed, yet queryable format. The compressed repository stores the compressed documents and provides access methods to this compressed data, along with a set of compression-specific utilities that enable, e.g., the comparison of two compressed values. The query processor optimizes and evaluates XQuery queries over the compressed documents. Its complete set of physical operators allows for efficient evaluation over the compressed repository. The chapter motivates the choice of the storage structures for compressed XML, and of the compression algorithms employed. It describes the XQueC query processor and its set of physical operators, and outlines its optimization algorithm.

extending database technology | 2004

Efficient Query Evaluation over Compressed XML Data

Andrei Arion; Angela Bonifati; Gianni Costa; Sandra D’Aguanno; Ioana Manolescu; Andrea Pugliese

XML suffers from the major limitation of high redundancy. Although compression can be beneficial for XML data, once compressed, the data can seldom be browsed and queried efficiently. To address this problem, we propose XQueC, an [XQue]ry processor and [C]ompressor, which covers a large set of XQuery queries in the compressed domain. We shred compressed XML into suitable data structures, aiming both at reducing memory usage at query time and at querying data while compressed. XQueC is the first system to take advantage of a query workload to choose the compression algorithms, and to group the compressed data granules according to their common properties. By means of experiments, we show that good trade-offs between compression ratio and query capability can be achieved in several real cases, such as those covered by an XML benchmark. On average, XQueC improves over previous XML query-aware compression systems, while remaining reasonably close to general-purpose query-unaware XML compressors. Finally, query execution times (QETs) for a wide variety of queries show that XQueC can reach speed comparable to XQuery engines on uncompressed data.

Data Mining and Knowledge Discovery | 2010

An incremental clustering scheme for data de-duplication

Gianni Costa; Giuseppe Manco; Riccardo Ortale

We propose an incremental technique for discovering duplicates in large databases of textual sequences, i.e., syntactically different tuples that refer to the same real-world entity. The problem is approached from a clustering perspective: given a set of tuples, the objective is to partition them into groups of duplicate tuples. Each newly arrived tuple is assigned to an appropriate cluster via nearest-neighbor classification. This is achieved by means of a suitable hash-based index that maps any tuple to a set of indexing keys and assigns tuples with high syntactic similarity to the same buckets. Hence, the neighbors of a query tuple can be efficiently identified by simply retrieving those tuples that appear in the buckets associated with the query tuple itself, without completely scanning the original database. Two alternative schemes for computing indexing keys are discussed and compared. An extensive experimental evaluation on both synthetic and real data shows the effectiveness of our approach.
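The bucket-based neighbor lookup described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the character q-gram keying scheme, the Jaccard similarity, the threshold value, and the names `qgram_keys` and `BucketIndex` are all assumptions introduced here for illustration only.

```python
from collections import defaultdict


def qgram_keys(text, q=3):
    """Hypothetical keying scheme (an assumption): the set of character q-grams."""
    s = text.lower()
    if len(s) < q:
        return {s}
    return {s[i:i + q] for i in range(len(s) - q + 1)}


class BucketIndex:
    """Illustrative incremental index: tuples sharing a key land in the same bucket."""

    def __init__(self, q=3):
        self.q = q
        self.buckets = defaultdict(set)  # indexing key -> ids of tuples with that key
        self.tuples = []                 # tuple id -> original text
        self.cluster_of = []             # tuple id -> cluster id

    def add(self, text, threshold=0.5):
        """Assign a newly arrived tuple to its nearest neighbor's cluster, or open a new one."""
        keys = qgram_keys(text, self.q)

        # Count shared keys per candidate by reading only the query's own buckets,
        # so the full database is never scanned.
        shared = defaultdict(int)
        for key in keys:
            for tid in self.buckets.get(key, ()):
                shared[tid] += 1

        # Nearest neighbor among the candidates, by Jaccard similarity of key sets.
        best_id, best_sim = None, 0.0
        for tid, n_shared in shared.items():
            other = qgram_keys(self.tuples[tid], self.q)
            sim = n_shared / len(keys | other)
            if sim > best_sim:
                best_id, best_sim = tid, sim

        new_id = len(self.tuples)
        self.tuples.append(text)
        if best_id is not None and best_sim >= threshold:
            cluster = self.cluster_of[best_id]
        else:
            cluster = new_id  # no sufficiently similar neighbor: start a new cluster
        self.cluster_of.append(cluster)

        for key in keys:
            self.buckets[key].add(new_id)
        return cluster


index = BucketIndex()
first = index.add("John Smith, 12 Main Street")
second = index.add("Jon Smith, 12 Main Street")
third = index.add("Alice Brown, 7 Elm Road")
```

In this sketch the second tuple shares most of its keys with the first and joins its cluster, while the third shares none and opens a cluster of its own.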

advances in social networks analysis and mining | 2012

A Bayesian Hierarchical Approach for Exploratory Analysis of Communities and Roles in Social Networks

Gianni Costa; Riccardo Ortale

We present a new probabilistic approach to modeling social interactions that seamlessly integrates community discovery and role assignment for a deeper understanding of connectivity patterns in social networks. The devised approach is an unsupervised learning technique based on a Bayesian hierarchical model of social interactions. This model specifies an intuitive generative process, in which pairs of nodes in a social network are associated with communities as well as roles in the context of the respective communities, before a directed interaction is possibly established between them. According to the generative semantics of the proposed model, nodes are represented as probability distributions over communities, while communities are represented as probability distributions over roles. Such distributions are unknown parameters of the proposed model, which are estimated from social-network data through approximate posterior inference and parameter estimation. A comparative evaluation over real-world social networks reveals that our approach outperforms state-of-the-art competitors in terms of link prediction.

advances in geographic information systems | 2008

The DAEDALUS framework: progressive querying and mining of movement data

Riccardo Ortale; Ettore Ritacco; Nikos Pelekis; Roberto Trasarti; Gianni Costa; Fosca Giannotti; Giuseppe Manco; Chiara Renso; Yannis Theodoridis

In this work we propose DAEDALUS, a formal framework and system specifically focused on the progressive combination of mining and querying operators. The core component of DAEDALUS is the MO-DMQL query language, which extends SQL in two respects: a pattern definition operator and the capability to uniformly manipulate both raw data and unveiled patterns. The DAEDALUS system is specifically focused on movement data and has been implemented as a query execution layer on top of the Hermes Moving Object Database. The expressiveness and usefulness of the MO-DMQL language, as well as the computational capabilities of DAEDALUS, are qualitatively evaluated by means of a case study.

ACM Transactions on Information Systems | 2013

X-Class: Associative Classification of XML Documents by Structure

Gianni Costa; Riccardo Ortale; Ettore Ritacco

The supervised classification of XML documents by structure involves learning predictive models in which certain structural regularities discriminate the individual document classes. Hitherto, research has focused on the adoption of prespecified substructures. This is detrimental for classification effectiveness, since the a priori chosen substructures may not accord with the structural properties of the XML documents. In this context, an unexplored question is how to choose the type of structural regularity that best adapts to the structures of the available XML documents. We tackle this problem through X-Class, an approach that handles all types of tree-like substructures and allows for choosing the most discriminatory one. Algorithms are designed to learn compact rule-based classifiers in which the chosen substructures discriminate the classes of XML documents. X-Class is studied across various domains and types of substructures. Its classification performance is compared against several rule-based and SVM-based competitors. Empirical evidence reveals that the classifiers induced by X-Class are compact, scalable, and at least as effective as the established competitors. In particular, certain substructures allow the induction of very compact classifiers that generally outperform the rule-based competitors in terms of effectiveness over all chosen corpora of XML data. Furthermore, such classifiers are substantially as effective as the SVM-based competitor, with the additional advantage of a high degree of interpretability.

international conference on tools with artificial intelligence | 2011

Effective XML Classification Using Content and Structural Information via Rule Learning

Gianni Costa; Riccardo Ortale; Ettore Ritacco

We propose a new approach to XML classification that uses a particular rule-learning technique for the induction of interpretable classification models. These models separate the individual classes of XML documents by looking at the presence, within the XML documents themselves, of certain features that provide information on their content and structure. The devised approach induces classifiers that outperform several established competitors in effectiveness.

conference on recommender systems | 2011

Modeling item selection and relevance for accurate recommendations: a bayesian approach

Nicola Barbieri; Gianni Costa; Giuseppe Manco; Riccardo Ortale

We propose a Bayesian probabilistic model for explicit preference data. The model introduces a generative process, which takes into account both item selection and rating emission to gather into communities those users who experience the same items and tend to adopt the same rating pattern. Each user is modeled as a random mixture of topics, where each topic is characterized by a distribution modeling the popularity of items within the respective user community and by a distribution over preference values for those items. The proposed model can be associated with a novel item-relevance ranking criterion, which is based on both item popularity and user preferences. We show that the proposed model, equipped with the new ranking criterion, outperforms state-of-the-art approaches in terms of accuracy of the recommendation list provided to users on standard benchmark datasets.

international conference on tools with artificial intelligence | 2012

On Effective XML Clustering by Path Commonality: An Efficient and Scalable Algorithm

Gianni Costa; Riccardo Ortale

XML clustering by structure is, in its most general form, the process of partitioning a corpus of XML documents into disjoint clusters, such that intra-cluster structural homogeneity is high and inter-cluster structural homogeneity is low. In this paper, we propose an algorithm that implements a partitioning strategy, in which root-to-leaf paths are used to separate the XML documents. Paths are discriminatory substructures and, thus, the effectiveness of our algorithm is accordingly high. Moreover, a suitable encoding is adopted for representing and testing the occurrence of the individual paths within each XML document independently of the length of such paths. Not only does this expedite clustering, but it also makes our algorithm scalable to large-scale corpora of XML documents. A comparative evaluation over several standard (real-world and synthetic) XML corpora reveals that our algorithm outperforms all of its competitors in efficiency and scalability, while being as effective as the top-notch competitors. One especially appealing property of the proposed algorithm is that it achieves these levels of performance by automatically establishing a natural number of clusters to be discovered in the underlying XML corpus.
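To make the notion of root-to-leaf paths concrete, the following is a minimal sketch of how such paths might be extracted from a document and encoded as integers, so that testing whether a path occurs in a document costs the same regardless of path length. The integer-id encoding and the names `root_to_leaf_paths` and `PathEncoder` are assumptions made for illustration; the abstract does not specify the encoding the paper actually uses, and the clustering step itself is not shown.

```python
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text):
    """Return the set of distinct root-to-leaf tag paths of an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        children = list(node)
        if not children:
            # A node without child elements is a leaf: record the full path.
            paths.add("/".join(path))
        for child in children:
            stack.append((child, path + (child.tag,)))
    return paths


class PathEncoder:
    """Hypothetical encoding (an assumption): each distinct path gets one integer id."""

    def __init__(self):
        self.ids = {}

    def encode(self, paths):
        # setdefault assigns the next free id the first time a path is seen.
        return frozenset(self.ids.setdefault(p, len(self.ids)) for p in paths)


doc_a = "<a><b><c/></b><d/></a>"
doc_b = "<a><d/><b><c/><c/></b></a>"

encoder = PathEncoder()
paths_a = root_to_leaf_paths(doc_a)
encoded_a = encoder.encode(paths_a)
encoded_b = encoder.encode(root_to_leaf_paths(doc_b))
```

Here both documents yield the paths `a/b/c` and `a/d`, so their encoded path sets coincide even though element order and repetition differ.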

ACM Transactions on Internet Technology | 2016

Model-Based Collaborative Personalized Recommendation on Signed Social Rating Networks

Gianni Costa; Riccardo Ortale

Recommendation on signed social rating networks is studied through an innovative approach. Bayesian probabilistic modeling is used to postulate a realistic generative process, wherein user and item interactions are explained by latent factors, whose relevance varies within the underlying network organization into user communities and item groups. Approximate posterior inference captures distrust propagation and drives Gibbs sampling to allow rating and (dis)trust prediction for recommendation along with the unsupervised exploratory analysis of network organization. Comparative experiments reveal the superiority of our approach in rating and link prediction on Epinions and Ciao, besides community quality and recommendation sensitivity to network organization.