Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Claudio Sartori is active.

Publication


Featured research published by Claudio Sartori.


IEEE Transactions on Knowledge and Data Engineering | 2013

Distributed Strategies for Mining Outliers in Large Data Sets

Fabrizio Angiulli; Stefano Basta; Stefano Lodi; Claudio Sartori

We introduce a distributed method for detecting distance-based outliers in very large data sets. Our approach is based on the concept of an outlier detection solving set [2], a small subset of the data set that can also be employed for predicting novel outliers. The method exploits parallel computation to obtain vast time savings. Indeed, beyond preserving the correctness of the result, the proposed schema exhibits excellent performance. From the theoretical point of view, for common settings, the running time of our algorithm is expected to be at least three orders of magnitude smaller than that of the classical nested-loop-like approach to detecting outliers. Experimental results show that the algorithm is efficient and that its running time scales quite well with an increasing number of nodes. We also discuss a variant of the basic strategy which reduces the amount of data to be transferred, improving both the communication cost and the overall runtime. Importantly, the solving set computed by our approach in a distributed environment has the same quality as that produced by the corresponding centralized method.
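
The baseline the abstract compares against can be made concrete with a short sketch. Below is a minimal, single-machine version of nested-loop distance-based outlier detection, scoring points by the sum of their k-nearest-neighbor distances (one common definition of outlier weight); the function name and scoring choice are illustrative, not the paper's code.

```python
import numpy as np

def top_n_outliers(X, k=5, n=10):
    """Naive nested-loop distance-based outlier detection.

    Scores each point by the sum of distances to its k nearest
    neighbors and returns the indices of the n highest-scoring points.
    This is the quadratic baseline the distributed solving-set method
    is designed to outperform.
    """
    m = X.shape[0]
    scores = np.empty(m)
    for i in range(m):                         # outer loop over candidates
        d = np.linalg.norm(X - X[i], axis=1)   # distances to all points
        d[i] = np.inf                          # exclude the point itself
        scores[i] = np.sort(d)[:k].sum()       # weight = sum of k-NN distances
    return np.argsort(scores)[-n:][::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(size=(500, 2)), rng.normal(6, 0.3, size=(5, 2))])
    print(top_n_outliers(X, k=5, n=5))  # should flag the 5 shifted points
```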


Information Sciences | 2014

A novel Frank-Wolfe algorithm. Analysis and applications to large-scale SVM training

Ricardo Ñanculef; Emanuele Frandi; Claudio Sartori; Héctor Allende

Recently, there has been a renewed interest in the machine learning community in variants of a sparse greedy approximation procedure for concave optimization known as the Frank-Wolfe (FW) method. In particular, this procedure has been successfully applied to train large-scale instances of non-linear Support Vector Machines (SVMs). Specializing FW to SVM training has yielded efficient algorithms as well as important theoretical results, including convergence analyses of training algorithms and new characterizations of model sparsity. In this paper, we present and analyze a novel variant of the FW method based on a new way to perform away steps, a classic strategy used to accelerate the convergence of the basic FW procedure. Our formulation and analysis focus on a general concave maximization problem on the simplex. However, the specialization of our algorithm to quadratic forms is strongly related to some classic methods in computational geometry, namely the Gilbert and MDM algorithms. On the theoretical side, we demonstrate that the method matches the guarantees in terms of convergence rate and number of iterations obtained by using classic away steps. In particular, the method enjoys a linear rate of convergence, a result that has recently been proved for MDM on quadratic forms. On the practical side, we provide experiments on several classification datasets, and evaluate the results using statistical tests. Experiments show that our method is faster than the FW method with classic away steps, and works well even in cases in which classic away steps slow down the algorithm. Furthermore, these improvements are obtained without sacrificing the predictive accuracy of the resulting SVM model.
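
To make the away-step mechanism concrete, here is a minimal sketch of FW with classic away steps on a toy least-squares objective over the simplex (stated as a convex minimization, equivalent to the paper's concave maximization); the paper's new away-step variant is not reproduced here.

```python
import numpy as np

def fw_away(A, b, iters=500, tol=1e-8):
    """Frank-Wolfe with classic away steps for min ||Ax - b||^2 over
    the simplex. Each iteration either moves toward the best vertex
    (FW step) or away from the worst active vertex (away step),
    whichever direction is more aligned with the negative gradient.
    """
    n = A.shape[1]
    x = np.full(n, 1.0 / n)                  # start at the simplex center
    for _ in range(iters):
        g = 2.0 * A.T @ (A @ x - b)          # gradient of the quadratic
        s = np.argmin(g)                     # FW vertex: best corner
        active = np.flatnonzero(x > 1e-12)
        v = active[np.argmax(g[active])]     # away vertex: worst active corner
        d_fw = -x.copy(); d_fw[s] += 1.0     # direction e_s - x
        d_aw = x.copy();  d_aw[v] -= 1.0     # direction x - e_v
        if g @ d_fw <= g @ d_aw:             # pick the steeper descent direction
            d, gmax = d_fw, 1.0
        else:
            d, gmax = d_aw, x[v] / (1.0 - x[v])  # max step keeping x feasible
        if -(g @ d) < tol:                   # duality-gap-style stopping rule
            break
        Ad = A @ d
        gamma = min(gmax, -(g @ d) / (2.0 * Ad @ Ad))  # exact line search
        x = x + gamma * d
    return x
```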


International Conference on Data Technologies and Applications | 2015

A Study on Term Weighting for Text Categorization: A Novel Supervised Variant of tf.idf

Giacomo Domeniconi; Gianluca Moro; Roberto Pasolini; Claudio Sartori

Within text categorization and other data mining tasks, the use of suitable methods for term weighting can bring a substantial boost in effectiveness. Several term weighting methods have been presented throughout the literature, based on assumptions commonly derived from observing the distribution of words in documents. For example, the idf assumption states that words appearing in many documents are usually not as important as less frequent ones. In contrast to tf.idf and other weighting methods derived from information retrieval, schemes proposed more recently are supervised, i.e. based on knowledge of the membership of training documents to categories. We propose here a supervised variant of the tf.idf scheme, based on computing the usual idf factor without considering documents of the category to be recognized, so that the importance of terms frequently appearing only within it is not underestimated. A further proposed variant is additionally based on relevance frequency, considering occurrences of words within the category itself. In extensive experiments on two recurring text collections with several unsupervised and supervised weighting schemes, we show that the ones we propose generally perform better than or comparably to the others in terms of accuracy, using two different learning methods.
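
The core idea, computing idf while ignoring documents of the target category, fits in a few lines. The sketch below illustrates that idea only; the smoothing term and log base are assumptions, not the paper's exact formula.

```python
import numpy as np

def supervised_idf(doc_terms, labels, category):
    """idf computed only on documents *outside* the target category,
    so terms frequent only inside the category keep a high weight.
    doc_terms: list of term sets, one per document; labels: categories.
    """
    out_docs = [t for t, y in zip(doc_terms, labels) if y != category]
    n_out = max(len(out_docs), 1)
    vocab = set().union(*doc_terms)
    idf = {}
    for term in vocab:
        df_out = sum(term in d for d in out_docs)  # df over other categories only
        idf[term] = np.log(n_out / (1 + df_out))   # +1 smoothing for unseen terms
    return idf

# toy usage: "ball" appears in every sports doc but in no politics doc,
# so its idf w.r.t. category "sports" stays high instead of being damped
docs = [{"ball", "team"}, {"ball", "goal"}, {"vote", "law"}, {"law", "tax"}]
labels = ["sports", "sports", "politics", "politics"]
print(supervised_idf(docs, labels, "sports")["ball"])  # log(2/1) ≈ 0.69
```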


International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management | 2014

Iterative Refining of Category Profiles for Nearest Centroid Cross-Domain Text Classification

Giacomo Domeniconi; Gianluca Moro; Roberto Pasolini; Claudio Sartori

In cross-domain text classification, topic labels for documents of a target domain are predicted by leveraging knowledge of labeled documents of a source domain, having equal or similar topics with possibly different words. Existing methods either adapt documents of the source domain to the target one or represent both domains in a common space. These methods are mostly based on advanced statistical techniques and often require tuning of parameters in order to obtain optimal performance. We propose a more straightforward approach based on nearest centroid classification: profiles of topic categories are extracted from the source domain and are then adapted by iterative refining steps using the most similar documents in the target domain. Experiments on common benchmark datasets show that this approach, despite its simplicity, obtains accuracy better than or comparable to that of other methods, using fixed empirical values for its few parameters.
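
A minimal sketch of the iterative refinement loop follows, assuming tf.idf-style document vectors as input; the normalization details, stopping criterion, and how many target documents feed each refinement step are simplifications of the paper's procedure.

```python
import numpy as np

def refine_centroids(X_src, y_src, X_tgt, n_iter=10):
    """Nearest-centroid cross-domain classification with iterative
    refinement: category profiles start as source-domain centroids and
    are repeatedly recomputed from the target documents currently
    nearest to them, drifting toward the target term distribution.
    """
    cats = np.unique(y_src)
    C = np.vstack([X_src[y_src == c].mean(axis=0) for c in cats])
    for _ in range(n_iter):
        # cosine similarity of every target doc to every profile
        sims = (X_tgt @ C.T) / (
            np.linalg.norm(X_tgt, axis=1, keepdims=True)
            * np.linalg.norm(C, axis=1) + 1e-12)
        assign = sims.argmax(axis=1)           # nearest-centroid labels
        C = np.vstack([                        # rebuild profiles from target docs
            X_tgt[assign == k].mean(axis=0) if np.any(assign == k) else C[k]
            for k in range(len(cats))])
    return cats[assign]
```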


International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management | 2014

Cross-domain Text Classification through Iterative Refining of Target Categories Representations

Giacomo Domeniconi; Gianluca Moro; Roberto Pasolini; Claudio Sartori

Cross-domain text classification deals with predicting topic labels for documents in a target domain by leveraging knowledge from pre-labeled documents in a source domain, with different terms or different distributions thereof. Methods exist to address this problem by re-weighting documents of the source domain to transfer them to the target one, or by finding a common feature space for documents of both domains; they often require the combination of complex techniques, leading to a number of parameters which must be tuned for each dataset to yield optimal performance. We present a simpler method based on creating explicit representations of topic categories, which can be compared for similarity to those of documents. Category representations are initially built from relevant source documents, then iteratively refined by considering the most similar target documents, with relatedness measured by a simple regression model based on cosine similarity, built once at the beginning. This is expected to yield accurate representations of the categories in the target domain, which are then used to classify documents therein. Experiments on common benchmark text collections show that this approach obtains results better than or comparable to those of other methods, using fixed empirical values for its few parameters.
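
The relatedness model mentioned in the abstract can be illustrated with a one-feature regression fit once up front; here scikit-learn's LogisticRegression stands in for the paper's "simple regression model", and the synthetic similarities are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_relatedness(sims_src, is_member):
    """Map cosine similarity (doc vs. category profile) to a
    relatedness score; fit once on the source domain."""
    model = LogisticRegression()
    model.fit(sims_src.reshape(-1, 1), is_member)
    return model

rng = np.random.default_rng(1)
sims = np.concatenate([rng.uniform(0.4, 1.0, 50),   # similarities of members
                       rng.uniform(0.0, 0.6, 50)])  # similarities of non-members
labels = np.concatenate([np.ones(50), np.zeros(50)])
m = fit_relatedness(sims, labels)
print(m.predict_proba(np.array([[0.9], [0.1]]))[:, 1])  # high vs. low relatedness
```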


IEEE Transactions on Parallel and Distributed Systems | 2016

GPU Strategies for Distance-Based Outlier Detection

Fabrizio Angiulli; Stefano Basta; Stefano Lodi; Claudio Sartori

The process of discovering interesting patterns in large, possibly huge, data sets is referred to as data mining, and can be performed in several flavours, known as “data mining functions.” Among these functions, outlier detection discovers observations which deviate substantially from the rest of the data, and has many important practical applications. Outlier detection in very large data sets is however computationally very demanding and currently requires high-performance computing facilities. We propose a family of parallel and distributed algorithms for graphics processing units (GPUs) derived from two distance-based outlier detection algorithms: BruteForce and SolvingSet. The algorithms differ in the way they exploit the architecture and memory hierarchy of the GPU, and guarantee significant improvements with respect to the CPU versions, both in terms of scalability and exploitation of parallelism. We provide a detailed discussion of their computational properties and measure performance through extensive experimentation, comparing the several implementations and showing significant speedups.
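
As a rough illustration of the BruteForce flavor (not the paper's CUDA kernels), the pairwise-distance computation maps naturally onto a GPU array library such as CuPy, assuming the data set fits in GPU memory:

```python
import cupy as cp  # assumes a CUDA-capable GPU and the CuPy package

def gpu_knn_weights(X, k=5):
    """Score every point by the sum of distances to its k nearest
    neighbors, computing the full pairwise-distance matrix on the GPU
    in one shot, mirroring a brute-force data-parallel structure.
    """
    X = cp.asarray(X, dtype=cp.float32)
    sq = cp.sum(X * X, axis=1)
    # squared pairwise distances via (a-b)^2 = a^2 + b^2 - 2ab
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    cp.fill_diagonal(d2, cp.inf)              # ignore self-distances
    knn = cp.partition(d2, k, axis=1)[:, :k]  # k smallest per row
    return cp.asnumpy(cp.sqrt(cp.maximum(knn, 0)).sum(axis=1))
```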


International Journal of Pattern Recognition and Artificial Intelligence | 2013

Training Support Vector Machines using Frank–Wolfe Optimization Methods

Emanuele Frandi; Ricardo Ñanculef; Maria Grazia Gasparo; Stefano Lodi; Claudio Sartori

Training a support vector machine (SVM) requires the solution of a quadratic programming (QP) problem whose computational complexity becomes prohibitive for large-scale datasets. Traditional optimization methods cannot be directly applied in these cases, mainly due to memory restrictions. By adopting a slightly different objective function and under mild conditions on the kernel used within the model, efficient algorithms to train SVMs have been devised under the name of core vector machines (CVMs). This framework exploits the equivalence of the resulting learning problem with the task of building a minimal enclosing ball (MEB) in a feature space, where data is implicitly embedded by a kernel function. In this paper, we improve on the CVM approach by proposing two novel methods to build SVMs based on the Frank–Wolfe algorithm, recently revisited as a fast method to approximate the solution of a MEB problem. In contrast to CVMs, our algorithms do not require computing the solutions of a sequence of increasingly complex QPs, and are defined using only analytic optimization steps. Experiments on a large collection of datasets show that our methods scale better than CVMs in most cases, sometimes at the price of a slightly lower accuracy. Like CVMs, the proposed methods can easily be extended to machine learning problems other than binary classification. However, effective classifiers are also obtained using kernels which do not satisfy the condition required by CVMs, and thus our methods can be used for a wider set of problems.
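
The MEB connection can be illustrated with the classic Badoiu-Clarkson iteration, a Frank-Wolfe scheme with analytic steps on the minimal-enclosing-ball problem; this toy Euclidean version only hints at the paper's kernel-space algorithms.

```python
import numpy as np

def meb_frank_wolfe(X, iters=1000):
    """Approximate the minimal enclosing ball of the rows of X.

    Each step moves the center toward the current farthest point with
    a 1/(k+2) step size, using no QP solver; in the SVM setting the
    same iteration runs in kernel-induced feature space instead.
    """
    c = X.mean(axis=0)                        # initial center
    for k in range(iters):
        far = np.argmax(np.linalg.norm(X - c, axis=1))
        c = c + (X[far] - c) / (k + 2)        # analytic FW step
    radius = np.linalg.norm(X - c, axis=1).max()
    return c, radius
```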


International Conference on Data Technologies and Applications | 2015

A Comparison of Term Weighting Schemes for Text Classification and Sentiment Analysis with a Supervised Variant of tf.idf

Giacomo Domeniconi; Gianluca Moro; Roberto Pasolini; Claudio Sartori

In text analysis tasks like text classification and sentiment analysis, the careful choice of term weighting scheme can have an important impact on effectiveness. Classic unsupervised schemes are based solely on the distribution of terms across documents, while newer supervised ones leverage knowledge of the membership of training documents to categories; the latter are often specifically tailored to either topic or sentiment classification. We propose here a supervised variant of the well-known tf.idf scheme, where the idf factor is computed without considering documents within the category under analysis, so that terms frequently appearing only within it are not penalized. The importance of these terms is further boosted in a second variant inspired by relevance frequency. We performed extensive experiments comparing these novel schemes to known ones, observing top performance in text categorization by topic and satisfactory results in sentiment classification.
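
The second variant's boost can be illustrated with a relevance-frequency-style factor (after Lan et al.'s tf.rf); how exactly it combines with the idf variant is an assumption here, not the paper's formula.

```python
import numpy as np

def rf_weight(df_pos, df_neg):
    """Relevance-frequency-style factor: terms concentrated in the
    positive category get a boost above terms spread evenly."""
    return np.log2(2.0 + df_pos / max(df_neg, 1.0))

# a term in 40 positive vs. 2 negative docs, against an evenly spread one
print(round(rf_weight(40, 2), 2), round(rf_weight(20, 20), 2))  # 4.46 1.58
```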


High Performance Computing Systems and Applications | 2014

Accelerating outlier detection with intra- and inter-node parallelism

Fabrizio Angiulli; Stefano Basta; Stefano Lodi; Claudio Sartori

Outlier detection is a data mining task consisting in the discovery of observations which deviate substantially from the rest of the data, and has many important practical applications. Outlier detection in very large data sets is however computationally very demanding, and the size limit of the data that can be processed is pushed considerably forward by combining three ingredients: efficient algorithms, intra-CPU parallelism of high-performance architectures, and network-level parallelism. In this paper we propose an outlier detection algorithm able to exploit the internal parallelism of a GPU and the external parallelism of a cluster of GPUs. The algorithm is the evolution of our previous solutions, which considered either GPU or network-level parallelism. We discuss a set of large-scale experiments executed on a supercomputing facility and show the speedup obtained with a varying number of nodes.
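
The two-level decomposition can be sketched with plain CPU processes standing in for GPUs and cluster nodes: split the data into chunks, score each chunk against the full data set in parallel, and merge the partial results. This is only a structural illustration, not the paper's GPU/MPI code.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def score_chunk(args):
    """Score one chunk of points against the full data set."""
    chunk, X, k = args
    d = np.linalg.norm(chunk[:, None, :] - X[None, :, :], axis=2)
    d.partition(k, axis=1)            # k+1 smallest distances come first
    return d[:, :k + 1].sum(axis=1)   # k-NN sum; zero self-distance adds nothing

def parallel_outlier_scores(X, k=5, workers=4):
    chunks = np.array_split(X, workers)            # one chunk per "node"
    with ProcessPoolExecutor(max_workers=workers) as ex:
        parts = ex.map(score_chunk, [(c, X, k) for c in chunks])
    return np.concatenate(list(parts))             # merge partial results

if __name__ == "__main__":
    X = np.random.default_rng(0).normal(size=(2000, 8))
    print(parallel_outlier_scores(X).argsort()[-10:])  # top-10 candidates
```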


Machine Learning | 2016

Fast and scalable Lasso via stochastic Frank–Wolfe methods with a convergence guarantee

Emanuele Frandi; Ricardo Ñanculef; Stefano Lodi; Claudio Sartori; Johan A. K. Suykens

Frank–Wolfe (FW) algorithms have often been proposed over the last few years as efficient solvers for a variety of optimization problems arising in the field of machine learning. The ability to work with cheap projection-free iterations and the incremental nature of the method make FW a very effective choice for many large-scale problems where computing a sparse model is desirable. In this paper, we present a high-performance implementation of the FW method tailored to solve large-scale Lasso regression problems, based on a randomized iteration, and prove that the convergence guarantees of the standard FW method are preserved in the stochastic setting. We show experimentally that our algorithm outperforms several existing state-of-the-art methods, including the Coordinate Descent algorithm by Friedman et al. (one of the fastest known Lasso solvers), on several benchmark datasets with a very large number of features, without sacrificing the accuracy of the model. Our results illustrate that the algorithm is able to generate the complete regularization path on problems of size up to four million variables in under one minute.
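
The randomized iteration can be sketched as follows: instead of scanning all coordinates for the steepest vertex of the l1-ball, sample a subset per iteration. This is inspired by, not identical to, the paper's method; the step-size rule and sampling scheme are assumptions.

```python
import numpy as np

def stochastic_fw_lasso(A, b, t=1.0, iters=2000, sample=100, seed=0):
    """Randomized FW sketch for min ||Ax - b||^2 s.t. ||x||_1 <= t.

    Standard FW picks the coordinate with the largest gradient
    magnitude over all n columns; here only a random subset is
    scanned per iteration, so each step reads O(sample) gradients.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    r = A @ x - b                                # residual, kept up to date
    for k in range(iters):
        idx = rng.choice(n, size=min(sample, n), replace=False)
        g = 2.0 * (A[:, idx].T @ r)              # gradient on sampled coords only
        j = idx[np.argmax(np.abs(g))]            # best sampled vertex +-t*e_j
        s_j = -t * np.sign(A[:, j] @ r)          # FW vertex of the l1-ball
        gamma = 2.0 / (k + 2.0)                  # classic FW step size
        x *= (1.0 - gamma); x[j] += gamma * s_j
        r = (1.0 - gamma) * r + gamma * (s_j * A[:, j] - b)
    return x
```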

Collaboration


Dive into Claudio Sartori's collaborations.

Top Co-Authors

Stefano Basta

ICAR-CNR, National Research Council of Italy


Emanuele Frandi

Katholieke Universiteit Leuven
