Mirco Nanni
Istituto di Scienza e Tecnologie dell'Informazione
Publication
Featured research published by Mirco Nanni.
knowledge discovery and data mining | 2007
Fosca Giannotti; Mirco Nanni; Fabio Pinelli; Dino Pedreschi
The increasing pervasiveness of location-acquisition technologies (GPS, GSM networks, etc.) is leading to the collection of large spatio-temporal datasets and to the opportunity of discovering usable knowledge about movement behaviour, which fosters novel applications and services. In this paper, we move towards this direction and develop an extension of the sequential pattern mining paradigm that analyzes the trajectories of moving objects. We introduce trajectory patterns as concise descriptions of frequent behaviours, in terms of both space (i.e., the regions of space visited during movements) and time (i.e., the duration of movements). In this setting, we provide a general formal statement of the novel mining problem and then study several different instantiations of different complexity. The various approaches are then empirically evaluated over real data and synthetic benchmarks, comparing their strengths and weaknesses.
international conference on data engineering | 2008
Osman Abul; Francesco Bonchi; Mirco Nanni
Preserving individual privacy when publishing data is a problem that is receiving increasing attention. According to the k-anonymity principle, each release of data must be such that each individual is indistinguishable from at least k-1 other individuals. In this paper we study the problem of anonymity-preserving data publishing in moving objects databases. We propose a novel concept of k-anonymity based on co-localization that exploits the inherent uncertainty of the moving objects' whereabouts. Due to sampling and the imprecision of positioning systems (e.g., GPS), the trajectory of a moving object is no longer a polyline in a three-dimensional space; instead, it is a cylindrical volume, where its radius delta represents the possible location imprecision: we know that the trajectory of the moving object is within this cylinder, but we do not know exactly where. If another object moves within the same cylinder, the two are indistinguishable from each other. This leads to the definition of (k, delta)-anonymity for moving objects databases. We first characterize the (k, delta)-anonymity problem and discuss techniques to solve it. Then we focus on the most promising technique from the point of view of information preservation, namely space translation. We develop a suitable measure of the information distortion introduced by space translation, and we prove that the problem of achieving (k, delta)-anonymity by space translation with minimum distortion is NP-hard. Faced with the hardness of our problem, we propose a greedy algorithm based on clustering and enhanced with ad hoc pre-processing and outlier removal techniques. The resulting method, named NWA (Never Walk Alone), is empirically evaluated in terms of data quality and efficiency. Data quality is assessed both by means of objective measures of information distortion, and by comparing the results of the same spatio-temporal range queries executed on the original database and on the (k, delta)-anonymized one.
Experimental results show that for a wide range of values of delta and k, the relative error introduced is kept low, confirming that NWA produces high quality (k, delta)-anonymized data.
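The co-localization idea in this abstract can be illustrated with a small check. This is only a sketch under stated assumptions, not the paper's definition or the NWA algorithm: it assumes trajectories are sampled at identical timestamps, and it reads "moving within the same cylinder" as "never more than delta apart at any shared timestamp". The function names `colocalized` and `is_anonymity_set` are hypothetical.

```python
from itertools import combinations
from math import hypot


def colocalized(traj_a, traj_b, delta):
    """Return True if two trajectories stay within `delta` of each other.

    Assumption: each trajectory is a dict {timestamp: (x, y)} and both
    are sampled at exactly the same timestamps.
    """
    if traj_a.keys() != traj_b.keys():
        return False
    return all(
        hypot(traj_a[t][0] - traj_b[t][0], traj_a[t][1] - traj_b[t][1]) <= delta
        for t in traj_a
    )


def is_anonymity_set(trajectories, k, delta):
    """Return True if there are at least k trajectories, all pairwise co-localized.

    Assumption: pairwise co-localization is used as a stand-in for
    "all objects move within the same uncertainty cylinder".
    """
    if len(trajectories) < k:
        return False
    return all(colocalized(a, b, delta) for a, b in combinations(trajectories, 2))


# Toy data: a and b travel side by side, c travels far away.
a = {0: (0.0, 0.0), 1: (1.0, 0.0)}
b = {0: (0.0, 1.0), 1: (1.0, 1.0)}
c = {0: (0.0, 5.0), 1: (1.0, 5.0)}
```

With this toy data, `a` and `b` form a valid set for k = 2 and delta = 1.0, while adding `c` breaks the property. An anonymization method in the spirit of the abstract would move points in space until every trajectory belongs to such a set.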
intelligent information systems | 2006
Mirco Nanni; Dino Pedreschi
Spatio-temporal, geo-referenced datasets are growing rapidly and will grow further in the near future, for both technological and social/commercial reasons. From the data mining viewpoint, spatio-temporal trajectory data introduce new dimensions and, correspondingly, novel issues in performing the analysis tasks. In this paper, we consider the clustering problem applied to the trajectory data domain. In particular, we propose an adaptation of a density-based clustering algorithm to trajectory data based on a simple notion of distance between trajectories. Then, a set of experiments on synthesized data is performed in order to test the algorithm and to compare it with other standard clustering approaches. Finally, a new approach to the trajectory clustering problem, called temporal focussing, is sketched, with the aim of exploiting the intrinsic semantics of the temporal dimension to improve the quality of trajectory clustering.
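To make the idea of "a simple notion of distance between trajectories" feeding a density-based method more concrete, here is a minimal sketch. The abstract does not spell out the distance or the algorithm, so everything below is an assumption: the distance is taken to be the average Euclidean distance between positions at the same sample index, and the density step is reduced to the usual core-object test (at least `min_pts` neighbours within `eps`). The names `trajectory_distance` and `core_trajectories` are hypothetical.

```python
from math import hypot


def trajectory_distance(a, b):
    """Average Euclidean distance between time-aligned samples.

    Assumption: both trajectories are lists of (x, y) points sampled
    at the same instants, so index i in `a` matches index i in `b`.
    """
    if len(a) != len(b) or not a:
        raise ValueError("trajectories must be non-empty and equally sampled")
    return sum(hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(a, b)) / len(a)


def core_trajectories(trajs, eps, min_pts):
    """Indices of trajectories with at least `min_pts` trajectories
    (the trajectory itself included) within distance `eps`."""
    cores = []
    for i, a in enumerate(trajs):
        neighbours = sum(1 for b in trajs if trajectory_distance(a, b) <= eps)
        if neighbours >= min_pts:
            cores.append(i)
    return cores


# Toy data: two nearby trajectories and one isolated trajectory.
trajs = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 1.0), (1.0, 1.0)],
    [(0.0, 10.0), (1.0, 10.0)],
]
```

In this toy data the first two trajectories are dense with respect to each other and the third is isolated, which is the kind of structure a density-based method separates into a cluster and noise.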
very large data bases | 2011
Fosca Giannotti; Mirco Nanni; Dino Pedreschi; Fabio Pinelli; Chiara Renso; Salvatore Rinzivillo; Roberto Trasarti
The technologies of mobile communications pervade our society and wireless networks sense the movement of people, generating large volumes of mobility data, such as mobile phone call records and Global Positioning System (GPS) tracks. In this work, we illustrate the striking analytical power of massive collections of trajectory data in unveiling the complexity of human mobility. We present the results of a large-scale experiment, based on the detailed trajectories of tens of thousands of private cars with on-board GPS receivers, tracked during weeks of ordinary mobile activity. We illustrate the knowledge discovery process that, based on these data, addresses some fundamental questions of mobility analysts: what are the frequent patterns of people’s travels? How do big attractors and extraordinary events influence mobility? How can areas of dense traffic in the near future be predicted? How can traffic jams and congestion be characterized? We also describe M-Atlas, the querying and mining language and system that makes this analytical process possible, providing the mechanisms to master the complexity of transforming raw GPS tracks into mobility knowledge. M-Atlas is centered on the concept of a trajectory, and the mobility knowledge discovery process can be specified by M-Atlas queries that realize data transformations, data-driven estimation of the parameters of the mining methods, the quality assessment of the obtained results, the quantitative and visual exploration of the discovered behavioral patterns and models, the composition of mined patterns, models and data with further analyses and mining, and the incremental mining strategies to address scalability.
visual analytics science and technology | 2009
Gennady L. Andrienko; Natalia V. Andrienko; Salvatore Rinzivillo; Mirco Nanni; Dino Pedreschi; Fosca Giannotti
One of the most common operations in exploration and analysis of various kinds of data is clustering, i.e. discovery and interpretation of groups of objects having similar properties and/or behaviors. In clustering, objects are often treated as points in multi-dimensional space of properties. However, structurally complex objects, such as trajectories of moving entities and other kinds of spatio-temporal data, cannot be adequately represented in this manner. Such data require sophisticated and computationally intensive clustering algorithms, which are very hard to scale effectively to large datasets not fitting in the computer main memory. We propose an approach to extracting meaningful clusters from large databases by combining clustering and classification, which are driven by a human analyst through an interactive visual interface.
knowledge discovery and data mining | 2011
Roberto Trasarti; Fabio Pinelli; Mirco Nanni; Fosca Giannotti
In this paper we introduce a methodology for extracting mobility profiles of individuals from raw digital traces (in particular, GPS traces), and study criteria to match individuals based on profiles. We instantiate the profile matching problem to a specific application context, namely proactive car pooling services, and therefore develop a matching criterion that satisfies various basic constraints obtained from the background knowledge of the application domain. In order to evaluate the impact and robustness of the methods introduced, two experiments are reported, which were performed on a massive dataset containing GPS traces of private cars: (i) the impact of the car pooling application based on profile matching is measured, in terms of the percentage of shareable traffic; (ii) the approach is adapted to coarser-grained mobility data sources that are nowadays commonly available from telecom operators. In addition, the ensuing loss in precision and coverage of profile matches is measured.
Information Systems | 2010
Osman Abul; Francesco Bonchi; Mirco Nanni
Preserving individual privacy when publishing data is a problem that is receiving increasing attention. Thanks to its simplicity, the concept of k-anonymity, introduced by Samarati and Sweeney [1], established itself as one fundamental principle for privacy-preserving data publishing. According to the k-anonymity principle, each release of data must be such that each individual is indistinguishable from at least k-1 other individuals. In this article we tackle the problem of anonymization of moving objects databases. We propose a novel concept of k-anonymity based on co-localization that exploits the inherent uncertainty of the moving objects' whereabouts. Due to sampling and imprecision of the positioning systems (e.g., GPS), the trajectory of a moving object is no longer a polyline in a three-dimensional space; instead, it is a cylindrical volume, where its radius δ represents the possible location imprecision: we know that the trajectory of the moving object is within this cylinder, but we do not know exactly where. If another object moves within the same cylinder, the two are indistinguishable from each other. This leads to the definition of (k,δ)-anonymity for moving objects databases. We first characterize the (k,δ)-anonymity problem, then we recall NWA (NeverWalkAlone), a method that we introduced in [2] based on clustering and spatial perturbation. Starting from a discussion on the limits of NWA, we develop a novel clustering method that, being based on EDR distance [3], has the important feature of being time-tolerant. As a consequence, it perturbs trajectories both in space and time. The novel method, named W4M (WaitforMe), is empirically shown to produce higher quality anonymization than NWA, at the price of higher computational requirements. Therefore, in order to make W4M scalable to large datasets, we introduce two variants based on a novel (and computationally cheaper) time-tolerant distance function, and on chunking.
All the variants of W4M are empirically evaluated in terms of data quality and efficiency, and thoroughly compared to their predecessor NWA. Data quality is assessed both by means of objective measures of information distortion, and by more usability-oriented measures, i.e., by comparing the results of (i) spatio-temporal range queries and (ii) frequent pattern mining, executed on the original database and on the (k,δ)-anonymized one. Experimental results over both real-world and synthetic mobility data confirm that, for a wide range of values of δ and k, the relative distortion introduced by our anonymization methods is kept low. Moreover, the techniques introduced to make W4M scalable to large datasets achieve their goal without giving up data quality in the anonymization process.
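The abstract relies on EDR distance for time tolerance without describing it. As a hedged illustration, the sketch below implements EDR (Edit Distance on Real sequence) as it is commonly defined in the trajectory-similarity literature: two points match when both coordinates differ by at most a threshold, and the distance is the minimum number of insert, delete, or replace operations needed to align the sequences. The function name `edr`, the parameter name `eps`, and the 2D point format are assumptions; how W4M actually uses the distance is not shown here.

```python
def edr(r, s, eps):
    """Edit Distance on Real sequence between two lists of (x, y) points.

    Two points match (substitution cost 0) when both coordinate
    differences are at most `eps`; otherwise a substitution costs 1.
    Insertions and deletions always cost 1.
    """
    n, m = len(r), len(s)
    # prev[j] holds the distance between the first i-1 points of r
    # and the first j points of s.
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        cur = [i] + [0] * m
        for j in range(1, m + 1):
            match = (
                abs(r[i - 1][0] - s[j - 1][0]) <= eps
                and abs(r[i - 1][1] - s[j - 1][1]) <= eps
            )
            cur[j] = min(
                prev[j - 1] + (0 if match else 1),  # match or replace
                prev[j] + 1,                        # skip a point of r
                cur[j - 1] + 1,                     # skip a point of s
            )
        prev = cur
    return prev[m]


# Toy data: the second trajectory is the first with one sample missing.
full = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
sparse = [(0.0, 0.0), (2.0, 2.0)]
```

Because a point may be skipped at unit cost, two trajectories that follow the same route with different sampling or small time shifts still come out close, which is one way to read the "time-tolerant" property mentioned above.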
Data Mining and Knowledge Discovery Handbook | 2009
Slava Kisilevich; Florian Mansmann; Mirco Nanni; Salvatore Rinzivillo
Spatio-temporal clustering is a process of grouping objects based on their spatial and temporal similarity. It is a relatively new subfield of data mining that has gained high popularity, especially in geographic information sciences, due to the pervasiveness of all kinds of location-based or environmental devices that record position, time and/or environmental properties of an object or set of objects in real time. As a consequence, different types and large amounts of spatio-temporal data became available that introduce new challenges to data analysis and require novel approaches to knowledge discovery. In this chapter we concentrate on spatio-temporal clustering in geographic space. First, we provide a classification of different types of spatio-temporal data. Then, we focus on one type of spatio-temporal clustering, namely trajectory clustering, provide an overview of the state-of-the-art approaches and methods of spatio-temporal clustering, and finally present several scenarios in different application domains such as movement, cellular networks and environmental studies.
Mobility, Data Mining and Privacy | 2008
Mirco Nanni; Bart Kuijpers; Christine Körner; Michael May; Dino Pedreschi
After the introduction and development of the relational database model between 1970 and the 1980s, this model proved to be insufficiently expressive for specific applications dealing with, for instance, temporal data, spatial data and multi-media data. From the mid-1980s, this has led to the development of domain-specific database systems, the first being temporal databases, later followed by spatial database systems. In the area of data mining, we have seen a similar development. Many data mining techniques – such as frequent set and association rule mining, classification, prediction and clustering – were first developed for typical alpha-numerical business data. From the second half of the 1990s, these techniques were studied for temporal and spatial data and sometimes specific, previously well studied, techniques such as time-series analysis were introduced in the data mining field. For an overview of mining techniques in spatial and geographic data, we refer to Chap. 9. For spatiotemporal data, this development has only just started. This field is no longer in an embryonic state; now, in 2007, we can say that with the organization of a few workshops, this field has just been born. In this chapter, we give an overview of what has been done in spatiotemporal data mining, with a focus on mining trajectories of moving objects, and we mainly emphasize the challenges that this field faces. This chapter is organized as follows. In Sect. 10.2, we outline, by means of examples, challenging tasks for spatiotemporal mining. In Sects. 10.3 and 10.4, we discuss, respectively, spatiotemporal clustering and patterns. Spatiotemporal prediction and classification, including time series, are discussed in Sect. 10.5. In Sect. 10.6, the role played by uncertainty in spatiotemporal data mining is briefly described. Finally, in Sect. 10.7, we summarize the main problems and issues
acm symposium on applied computing | 2006
Fosca Giannotti; Mirco Nanni; Dino Pedreschi; Fabio Pinelli
In this paper we propose an extension of the sequence mining paradigm to (temporally-)annotated sequential patterns, where each transition in a sequential pattern is annotated with a typical transition time derived from the source data. Then, we present a basic solution for the novel mining problem based on the combination of sequential pattern mining and clustering, and assess this solution on two realistic datasets, illustrating how potentially useful patterns of the new form are extracted.
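A toy version of the annotation step described above may help. This is a sketch under loud assumptions, not the paper's method: it handles only two-item patterns, takes one transition time per sequence (first occurrence of `a` to the next later `b`), and summarizes the observed times with a median instead of the clustering step the abstract mentions. The names `transition_times` and `annotate` and the `(item, timestamp)` input format are hypothetical.

```python
from statistics import median


def transition_times(sequences, a, b):
    """Collect one a -> b transition time per sequence.

    Assumption: each sequence is a list of (item, timestamp) pairs in
    time order. The gap is measured from the first `a` to the first
    `b` that occurs strictly later.
    """
    gaps = []
    for seq in sequences:
        start = next((t for item, t in seq if item == a), None)
        if start is None:
            continue
        end = next((t for item, t in seq if item == b and t > start), None)
        if end is not None:
            gaps.append(end - start)
    return gaps


def annotate(sequences, a, b, min_support):
    """Return (a, typical_time, b) if enough sequences support a -> b.

    Assumption: the median stands in for the clustering of transition
    times used in the paper.
    """
    gaps = transition_times(sequences, a, b)
    if len(gaps) < min_support:
        return None
    return (a, median(gaps), b)


# Toy data: two sequences contain A followed by B, one has them reversed.
seqs = [
    [("A", 0), ("B", 5)],
    [("A", 1), ("C", 2), ("B", 8)],
    [("B", 0), ("A", 3)],
]
```

On this toy data, two of the three sequences support the pattern A then B, with transition times 5 and 7. A clustering step, as in the abstract, would additionally allow one pattern to carry several typical times when the gaps fall into distinct groups.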