Maria Camila Nardini Barioni
Universidade Federal do ABC
Publications
Featured research published by Maria Camila Nardini Barioni.
Journal of Systems and Software | 2008
Maria Camila Nardini Barioni; Humberto Luiz Razente; Agma J. M. Traina; Caetano Traina
Scalable data mining algorithms have become crucial to efficiently support KDD processes on large databases. In this paper, we address the task of scaling up k-medoid-based algorithms through the use of metric access methods, allowing clustering algorithms to be executed by database management systems in a fraction of the time usually required by traditional approaches. We also present an optimization strategy that can be applied as an additional step of the proposed algorithm in order to achieve better clustering solutions. Experimental results on several datasets, both synthetic and real, show that the proposed algorithm can reduce the number of distance calculations by a factor of more than three thousand compared to existing algorithms, while producing clusters of equivalent quality.
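For readers unfamiliar with k-medoid clustering, the following is a minimal sketch of a plain alternating (Voronoi-iteration) variant. It is not the algorithm proposed in the paper and uses no metric access method; the function name `k_medoids`, the Euclidean distance, and the toy data are assumptions made for illustration only. It mainly shows where the distance calculations that the abstract refers to are spent.

```python
import math


def k_medoids(points, initial_medoids, dist=math.dist, max_iter=100):
    """Plain alternating k-medoids (illustrative; not the paper's method).

    Assumes the initial medoids are distinct elements of `points`.
    Every step is a linear scan, so the cost is dominated by distance
    calculations, which is what an index-based approach tries to avoid.
    """
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for p in points:
            nearest = min(medoids, key=lambda m: dist(p, m))
            clusters[nearest].append(p)
        # Update step: in each cluster, pick the member that minimizes
        # the sum of distances to all other members.
        new_medoids = [
            min(members, key=lambda c: sum(dist(c, q) for q in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break  # medoids are stable, so the clustering has converged
        medoids = new_medoids
    return sorted(medoids)


# Hypothetical toy data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = k_medoids(points, initial_medoids=[(0, 1), (10, 11)])
print(result)
```

Only pairwise distances are used, so any metric can be passed as `dist`; that property is what makes metric access methods applicable to this family of algorithms.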
Conference on Information and Knowledge Management | 2008
Humberto Luiz Razente; Maria Camila Nardini Barioni; Agma J. M. Traina; Christos Faloutsos; Caetano Traina
A similarity query takes an element as the query center and searches a dataset to find either the elements within a bounding radius of the query center or its k nearest elements. Several algorithms have been developed to efficiently execute similarity queries. However, there are queries that require more than one center, which we call Aggregate Similarity Queries. Such queries appear when the user gives multiple desirable examples and requests data elements that are similar to all of the examples, as in the case of applying relevance feedback. Here we give the first algorithms that can handle aggregate similarity queries on Metric Access Methods (MAM) such as the M-tree and Slim-tree. Our method, which we call Metric Aggregate Similarity Search (MASS), has the following properties: (a) it requires only the triangle inequality property; (b) it guarantees no false dismissals, as we prove that it lower-bounds the aggregate distance scores; (c) it can work with any MAM; (d) it can handle any number of query centers, whether they are scattered all over the space or concentrated in a restricted region. Experiments on both real and synthetic data show that our method scales with the number of elements and, if the dataset is in a spatial domain, also with its dimensionality. Moreover, it achieves better results than previous related methods.
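As a rough illustration of what an aggregate similarity query computes, the sketch below answers one by linear scan and shows a triangle-inequality lower bound for a ball-shaped region. It is not the MASS algorithm: the names `aggregate_knn` and `lower_bound_sum`, the use of the sum as the aggregate function, and the Euclidean distance are all assumptions chosen for the example.

```python
import math


def aggregate_knn(data, centers, k, dist=math.dist, agg=sum):
    """Return the k elements with the smallest aggregate distance to all
    query centers, by brute-force linear scan (illustrative only)."""
    scored = [(agg(dist(x, c) for c in centers), x) for x in data]
    scored.sort(key=lambda pair: pair[0])
    return [x for _, x in scored[:k]]


def lower_bound_sum(representative, radius, centers, dist=math.dist):
    """Lower bound on the summed distance from the centers to ANY element
    lying within `radius` of `representative`.

    By the triangle inequality, d(c, x) >= d(c, rep) - d(rep, x)
    >= d(c, rep) - radius, and a distance is never negative.
    """
    return sum(max(0.0, dist(c, representative) - radius) for c in centers)


# Hypothetical toy query with two centers.
centers = [(0, 0), (2, 0)]
data = [(1, 0), (4, 0), (10, 10)]
nearest = aggregate_knn(data, centers, k=2)
bound = lower_bound_sum(representative=(5, 0), radius=1.0, centers=centers)
print(nearest, bound)
```

A bound of this kind lets a tree-based search skip an entire region whenever the bound already exceeds the aggregate distance of the current k-th best candidate, without ever discarding a true answer.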
Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery | 2014
Maria Camila Nardini Barioni; Humberto Luiz Razente; Alessandra M. R. Marcelino; Agma J. M. Traina; Caetano Traina
Over the last decades, a great variety of data mining techniques have been developed to reach goals concerning Knowledge Discovery in Databases. Among them, cluster detection techniques are of major importance. Although these techniques have already been extensively explored in the scientific literature, there are at least two important open issues: the existing algorithms are not scalable for large high‐dimensional datasets, and the unsupervised nature of traditional data clustering makes it very difficult to generate meaningful clusters. This article presents an overview of the strategies being explored to address these issues. Moreover, it describes a new semi‐supervised clustering strategy that exemplifies the integration of several approaches and that can be employed with partitioning algorithms, such as PAM and CLARANS. The technique improves these types of algorithms by using must‐link feedback information provided by the users in an interactive and visual environment. WIREs Data Mining Knowl Discov 2014, 4:161–177. doi: 10.1002/widm.1127
2011 15th International Conference on Information Visualisation | 2011
Renato Bueno; Daniel S. Kaster; Humberto Luiz Razente; Maria Camila Nardini Barioni; Agma J. M. Traina; Caetano Traina
Complex data is usually represented through signatures, which are sets of features describing the data content. Several kinds of complex data allow different signatures to be extracted from an object, representing complementary data characteristics. However, there is no ground truth for how to balance these signatures to reach an ideal similarity distribution. The balance depends on the analyst's intent: according to the task being performed, some signatures should have more impact on the data distribution than others. This work presents a new technique, called Visual Signature Weighting (ViSW), which allows interactive analysis of the impact of each signature on the similarity of complex data represented through multiple signatures. Our method provides means to explore the tradeoff of prioritizing some signatures over others by dynamically changing their relative weights. We also present case studies showing that the technique is useful for global dataset analysis as well as for inspecting subspaces of interest.
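To make the idea of weighting signatures concrete, the sketch below combines per-signature distances with a weighted sum. The abstract does not state how ViSW combines signatures, so the weighted-sum form, the function name `combined_distance`, and the toy signatures are assumptions for illustration.

```python
import math


def combined_distance(obj_a, obj_b, weights, dist=math.dist):
    """Weighted sum of per-signature distances (an assumed combination
    rule, not necessarily the one used by ViSW).

    Each object is a tuple of signatures; each signature is a feature
    vector compared with `dist`.
    """
    return sum(
        w * dist(sig_a, sig_b)
        for w, sig_a, sig_b in zip(weights, obj_a, obj_b)
    )


# Hypothetical objects with two signatures each (e.g. color and texture).
obj_a = ((0.0, 0.0), (0.0,))
obj_b = ((3.0, 4.0), (10.0,))

only_first = combined_distance(obj_a, obj_b, weights=(1.0, 0.0))
only_second = combined_distance(obj_a, obj_b, weights=(0.0, 1.0))
balanced = combined_distance(obj_a, obj_b, weights=(0.5, 0.5))
print(only_first, only_second, balanced)
```

Shifting weight from one signature to another changes which objects appear close to each other, which is the tradeoff an analyst would explore interactively.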
2010 14th International Conference Information Visualisation | 2010
Renato Bueno; Humberto Luiz Razente; Daniel S. Kaster; Maria Camila Nardini Barioni; Agma J. M. Traina; Caetano Traina
Human vision can naturally interpret data in spaces of two or three dimensions. When data lie in higher-dimensional spaces, visualization is in most cases not intuitive. For metric spaces, interpretation is even harder, since they often have no direct spatial representation. However, the need to analyze how metric-represented data evolve over time is common when trying to understand phenomena and in decision-making processes, as occurs in medical and agrometeorological applications. This paper presents three interactive techniques to visualize metric data that vary over time. Each one focuses on a different way to interpret the temporal information. The first technique shows data evolving along a timeline axis. The second overlaps evolving snapshots of the space, showing how the space varies over time. The last one does not treat time as a dimension; instead, time is used to define the similarity among complex data, employing the new concept of metric-temporal spaces, which seamlessly integrate time and metric data into a single similarity space. Visualization examples with real datasets are presented to show the usefulness of the proposed techniques.
International Conference on Data Engineering | 2011
Marcos R. Vieira; Humberto Luiz Razente; Maria Camila Nardini Barioni; Marios Hadjieleftheriou; Divesh Srivastava; Caetano Traina; Vassilis J. Tsotras
Very Large Data Bases | 2006
Maria Camila Nardini Barioni; Humberto Luiz Razente; Agma J. M. Traina; Caetano Traina
Software - Practice and Experience | 2009
Maria Camila Nardini Barioni; Humberto Luiz Razente; Agma J. M. Traina; Caetano Traina
Brazilian Symposium on Databases | 2006
Maria Camila Nardini Barioni; Humberto Luiz Razente; Agma J. M. Traina; Caetano Traina
Brazilian Symposium on Databases | 2005
Maria Camila Nardini Barioni; Humberto Luiz Razente; Caetano Traina; Agma J. M. Traina