Francesco Gullo | Researchain

Publication

Featured research published by Francesco Gullo.

knowledge discovery and data mining | 2013

Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees

Charalampos E. Tsourakakis; Francesco Bonchi; Aristides Gionis; Francesco Gullo; Maria A. Tsiarli

Finding dense subgraphs is an important graph-mining task with many applications. Because directly optimizing edge density is not meaningful (even a single edge achieves maximum density), research has focused on optimizing alternative density functions. A very popular such function is the average degree, whose maximization leads to the well-known densest-subgraph notion. Surprisingly, however, densest subgraphs are typically large graphs with small edge density and large diameter. In this paper, we define a novel density function that yields subgraphs of much higher quality than densest subgraphs: the graphs found by our method are compact and dense, and have smaller diameter. We show that the proposed function can be derived from a general framework, which includes other important density functions as special cases and for which we show interesting general theoretical properties. To optimize the proposed function, we provide an additive approximation algorithm and a local-search heuristic. Both algorithms are very efficient and scale well to large graphs. We evaluate our algorithms on real and synthetic datasets, and we also devise several application studies as variants of our original problem. When compared with the method that finds the subgraph of the largest average degree, our algorithms return denser subgraphs with smaller diameter. Finally, we discuss new interesting research directions that our problem leaves open.
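The two measures contrasted in this abstract can be compared on a toy graph. The sketch below is only an illustration, not the paper's density function or algorithms: the helper names and the example graph (two 4-cliques joined by four cross edges) are assumptions made here. It shows that a single edge already reaches edge density 1.0, while average degree favors the larger and sparser node set.

```python
from itertools import combinations


def induced_edge_count(edges, nodes):
    """Count the edges whose two endpoints both lie in `nodes`."""
    node_set = set(nodes)
    return sum(1 for u, v in edges if u in node_set and v in node_set)


def edge_density(edges, nodes):
    """Fraction of all possible node pairs in `nodes` that are connected."""
    n = len(set(nodes))
    if n < 2:
        return 0.0
    return induced_edge_count(edges, nodes) / (n * (n - 1) / 2)


def average_degree(edges, nodes):
    """Average degree of the subgraph induced by `nodes`."""
    n = len(set(nodes))
    if n == 0:
        return 0.0
    return 2 * induced_edge_count(edges, nodes) / n


# Hypothetical toy graph: two 4-cliques plus four edges between them.
EDGES = (
    list(combinations(range(0, 4), 2))
    + list(combinations(range(4, 8), 2))
    + [(0, 4), (1, 5), (2, 6), (3, 7)]
)

for name, subset in [("single edge", [0, 1]),
                     ("one 4-clique", range(4)),
                     ("whole graph", range(8))]:
    print(name, edge_density(EDGES, subset), average_degree(EDGES, subset))
```

On this graph the whole node set has the highest average degree even though its edge density is well below that of a single clique, which matches the behavior the abstract describes for densest subgraphs.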

Pattern Recognition | 2009

A time series representation model for accurate and fast similarity detection

Francesco Gullo; Giovanni Ponti; Andrea Tagarelli; Sergio Greco

Similarity search and detection is a central problem in time series data processing and management. Most approaches to this problem have been developed around the notion of dynamic time warping, whereas several dimensionality reduction techniques have been proposed to improve the efficiency of similarity searches. Given the continuous growth in sources of time series data and the critical nature of the real-world applications that use such data, we believe there is a challenging demand for supporting similarity detection in time series in a way that is both accurate and fast. Our proposal is to define a concise yet feature-rich representation of time series, on which dynamic time warping can be applied for effective and efficient similarity detection. We present the Derivative time series Segment Approximation (DSA) representation model, which features derivative estimation, segmentation, and segment approximation to provide both data compression and high sensitivity in capturing the main trends of time series. We extensively compare DSA with state-of-the-art similarity methods and dimensionality reduction techniques in clustering and classification frameworks. Experimental evidence from effectiveness and efficiency tests on various datasets shows that DSA is well-suited to support both accurate and fast similarity detection.
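The three stages named in the abstract (derivative estimation, segmentation, segment approximation) can be sketched in a few lines. This is not the DSA model itself: the forward-difference derivative, the sign-change segmentation rule, and the (mean slope, length) summary are all assumptions chosen here for illustration, and the function names are hypothetical.

```python
def derivative(series):
    """Estimate the first derivative with forward differences (assumed rule)."""
    return [b - a for a, b in zip(series, series[1:])]


def segment_by_sign(deriv):
    """Split the derivative into maximal runs of equal sign (assumed rule).

    Returns half-open index ranges (start, end) into `deriv`.
    """
    def sign(x):
        return (x > 0) - (x < 0)

    segments = []
    start = 0
    for i in range(1, len(deriv)):
        if sign(deriv[i]) != sign(deriv[start]):
            segments.append((start, i))
            start = i
    if deriv:
        segments.append((start, len(deriv)))
    return segments


def approximate(deriv, segments):
    """Summarize each segment as (mean slope, length) (assumed summary)."""
    return [(sum(deriv[s:e]) / (e - s), e - s) for s, e in segments]


def compress(series):
    """Run the three stages in order and return the compact representation."""
    d = derivative(series)
    return approximate(d, segment_by_sign(d))


# Eight points (rise, fall, flat) reduce to three (slope, length) pairs.
print(compress([0, 1, 2, 3, 2, 1, 1, 1]))
```

A similarity measure such as dynamic time warping would then be run on these shorter sequences rather than on the raw series, which is where the efficiency gain described in the abstract would come from.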

scalable uncertainty management | 2008

Clustering Uncertain Data Via K-Medoids

Francesco Gullo; Giovanni Ponti; Andrea Tagarelli

Uncertain data are usually represented in terms of an uncertainty region over which a probability density function (pdf) is defined. In the context of uncertain data management, there has been a growing interest in clustering uncertain data. In particular, the classic K-means clustering algorithm has been recently adapted to handle uncertain data. However, the centroid-based partitional clustering approach used in the adapted K-means presents two major weaknesses: (i) an accuracy issue, since cluster centroids are computed as deterministic objects using the expected values of the pdfs of the clustered objects; and (ii) an efficiency issue, since the expected distance between uncertain objects and cluster centroids is computationally expensive. In this paper, we address the problem of clustering uncertain data by proposing a K-medoids-based algorithm, called UK-medoids, which is designed to overcome the above issues. In particular, our UK-medoids algorithm employs distance functions properly defined for uncertain objects, and exploits a K-medoids scheme. Experiments have shown that UK-medoids outperforms existing algorithms from an accuracy viewpoint while achieving reasonably good efficiency.
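The general idea of running a K-medoids scheme over distances between uncertain objects can be sketched as follows. This is not the UK-medoids algorithm: representing each uncertain object by a list of sample points, estimating the distance as the mean Euclidean distance over all sample pairs, fixing the self-distance to zero, and initializing with the first k objects are all assumptions made for this illustration. Because medoids are input objects, all pairwise distances can be computed once before the iterations start.

```python
from itertools import product
from math import dist


def expected_distance(samples_a, samples_b):
    """Estimate the expected Euclidean distance between two uncertain
    objects, each given as a list of sample points (assumed representation)."""
    total = sum(dist(p, q) for p, q in product(samples_a, samples_b))
    return total / (len(samples_a) * len(samples_b))


def k_medoids(objects, k, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    n = len(objects)
    # Pairwise distances are computed once; self-distance is set to 0 (assumption).
    d = [[0.0 if i == j else expected_distance(objects[i], objects[j])
          for j in range(n)] for i in range(n)]

    def assign(medoids):
        return [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]

    medoids = list(range(k))  # naive initialization (assumption)
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimizes the total distance to its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(d[m][i] for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Hypothetical data: two uncertain objects near the origin, two near (10, 10).
OBJECTS = [
    [(0, 0), (1, 0)],
    [(0, 1), (1, 1)],
    [(10, 10), (11, 10)],
    [(10, 11), (11, 11)],
]
medoids, labels = k_medoids(OBJECTS, k=2)
print(medoids, labels)
```

Even with both initial medoids placed in the same group, the update step moves one of them to the far group on this toy input, so the two groups end up in separate clusters.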

knowledge discovery and data mining | 2014

Core decomposition of uncertain graphs

Francesco Bonchi; Francesco Gullo; Andreas Kaltenbrunner; Yana Volkovich

Core decomposition has proven to be a useful primitive for a wide range of graph analyses. One of its most appealing features is that, unlike other notions of dense subgraphs, it can be computed in time linear in the size of the input graph. In this paper we provide an analogous tool for uncertain graphs, i.e., graphs whose edges are assigned a probability of existence. The fact that core decomposition can be computed efficiently in deterministic graphs does not guarantee efficiency in uncertain graphs, where even the simplest graph operations may become computationally intensive. Here we show that core decomposition of uncertain graphs can be carried out efficiently as well. We extensively evaluate our definitions and methods on a number of real-world datasets and applications, such as influence maximization and task-driven team formation.
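For reference, core decomposition of an ordinary (deterministic) graph can be computed by repeatedly peeling a minimum-degree vertex. The sketch below covers only that deterministic case and does not attempt the uncertain-graph method of the paper. It uses a heap with lazy deletion for brevity, so it runs in O(m log n) rather than the linear time a bucket-based implementation achieves. The function name is hypothetical, and vertex labels are assumed to be mutually comparable (for example, all integers) because they serve as tie-breakers in the heap.

```python
import heapq
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    core = {}
    current = 0  # largest minimum degree seen so far during peeling
    while heap:
        d, v = heapq.heappop(heap)
        if v in core or d != degree[v]:
            continue  # stale heap entry left behind by a degree update
        current = max(current, d)
        core[v] = current
        # Removing v lowers the degree of each neighbor not yet peeled.
        for w in adj[v]:
            if w not in core:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
    return core


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
print(core_numbers([(0, 1), (1, 2), (0, 2), (2, 3)]))
```

In the example, the triangle vertices get core number 2 and the pendant vertex gets core number 1.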

web search and data mining | 2015

Finding Subgraphs with Maximum Total Density and Limited Overlap

Oana Denisa Balalau; Francesco Bonchi; T-H. Hubert Chan; Francesco Gullo; Mauro Sozio

Finding dense subgraphs in large graphs is a key primitive in a variety of real-world application domains, encompassing social network analytics, event detection, biology, and finance. In most such applications, one typically aims at finding several (possibly overlapping) dense subgraphs which might correspond to communities in social networks or interesting events. While a large amount of work is devoted to finding a single densest subgraph, perhaps surprisingly, the problem of finding several dense subgraphs with limited overlap has not been studied in a principled way, to the best of our knowledge. In this work we define and study a natural generalization of the densest subgraph problem, where the main goal is to find at most k subgraphs with maximum total aggregate density, while satisfying an upper bound on the pairwise Jaccard coefficient between the sets of nodes of the subgraphs. After showing that such a problem is NP-hard, we devise an efficient algorithm that comes with provable guarantees in some cases of interest, as well as an efficient practical heuristic. Our extensive evaluation on large real-world graphs confirms the efficiency and effectiveness of our algorithms.

international conference on management of data | 2014

The pursuit of a good possible world: extracting representative instances of uncertain graphs

Panos Parchas; Francesco Gullo; Dimitris Papadias; Francesco Bonchi

Data in several applications can be represented as an uncertain graph, whose edges are labeled with a probability of existence. Exact query processing on uncertain graphs is prohibitive for most applications, as it involves evaluation over an exponential number of instantiations. Even approximate processing based on sampling is usually extremely expensive since it requires a vast number of samples to achieve reasonable quality guarantees. To overcome these problems, we propose algorithms for creating deterministic representative instances of uncertain graphs that maintain the underlying graph properties. Specifically, our algorithms aim at preserving the expected vertex degrees because they capture well the graph topology. Conventional processing techniques can then be applied on these instances to closely approximate the result on the uncertain graph. We experimentally demonstrate, with real and synthetic uncertain graphs, that indeed the representative instances can be used to answer, efficiently and accurately, queries based on several properties such as shortest path distance, clustering coefficient and betweenness centrality.

international conference on data mining | 2009

Projective Clustering Ensembles

Francesco Gullo; Carlotta Domeniconi; Andrea Tagarelli

Recent advances in data clustering concern clustering ensembles and projective clustering methods, each addressing different issues in clustering problems. In this paper, we consider for the first time the projective clustering ensemble (PCE) problem, whose main goal is to derive a proper projective consensus partition from an ensemble of projective clustering solutions. We formalize PCE as an optimization problem which does not rely on any particular clustering ensemble algorithm, and which has the ability to handle hard as well as soft data clustering, and different feature weightings. We provide two formulations for PCE, namely a two-objective and a single-objective problem, in which the object-based and feature-based representations of the ensemble solutions are taken into account differently. Experiments have demonstrated that the proposed methods for PCE show clear improvements in terms of accuracy of the output consensus partition.

very large data bases | 2012

Uncertain centroid based partitional clustering of uncertain data

Francesco Gullo; Andrea Tagarelli

Clustering uncertain data has emerged as a challenging task in uncertain data management and mining. Thanks to a computational complexity advantage over other clustering paradigms, partitional clustering has been particularly studied and a number of algorithms have been developed. While existing proposals differ mainly in the notions of cluster centroid and clustering objective function, little attention has been given to an analysis of their characteristics and limits. In this work, we theoretically investigate major existing methods of partitional clustering, and alternatively propose a well-founded approach to clustering uncertain data based on a novel notion of cluster centroid. A cluster centroid is seen as an uncertain object defined in terms of a random variable whose realizations are derived based on all deterministic representations of the objects to be clustered. As demonstrated theoretically and experimentally, this allows for better representing a cluster of uncertain objects, thus supporting a consistently improved clustering performance while maintaining comparable efficiency with existing partitional clustering algorithms.

Data Mining and Knowledge Discovery | 2015

Efficient and effective community search

Nicola Barbieri; Francesco Bonchi; Edoardo Galimberti; Francesco Gullo

Journal of Computational Science | 2012

A time series approach for clustering mass spectrometry data

Francesco Gullo; Giovanni Ponti; Andrea Tagarelli; Giuseppe Tradigo; Pierangelo Veltri
