Claudio Gentile
University of Milan
Publications
Featured research published by Claudio Gentile.
IEEE Transactions on Information Theory | 2004
Nicolò Cesa-Bianchi; Alex Conconi; Claudio Gentile
In this paper, it is shown how to extract a hypothesis with small risk from the ensemble of hypotheses generated by an arbitrary on-line learning algorithm run on an independent and identically distributed (i.i.d.) sample of data. Using a simple large deviation argument, we prove tight data-dependent bounds for the risk of this hypothesis in terms of an easily computable statistic $M_n$ associated with the on-line performance of the ensemble. Via sharp pointwise bounds on $M_n$, we then obtain risk tail bounds for kernel perceptron algorithms in terms of the spectrum of the empirical kernel matrix. These bounds reveal that the linear hypotheses found via our approach achieve optimal tradeoffs between hinge loss and margin size over the class of all linear functions, an issue that was left open by previous results. A distinctive feature of our approach is that the key tools for our analysis come from the model of prediction of individual sequences, i.e., a model making no probabilistic assumptions on the source generating the data. In fact, these tools turn out to be so powerful that we only need very elementary statistical facts to obtain our final risk bounds.
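The key technique behind this result, online-to-batch conversion, is simple to sketch. The fragment below is a hedged illustration, not the paper's exact procedure: it runs a plain Perceptron over an i.i.d. sample, keeps the ensemble of intermediate hypotheses, and picks the one with the smallest empirical error on the examples that follow it. The confidence penalty that produces the data-dependent bound in terms of $M_n$ is omitted, and all function names are illustrative.

```python
import numpy as np

def perceptron_ensemble(X, y):
    """Run a plain Perceptron, recording the hypothesis (weight vector)
    held at the start of each round; this is the on-line ensemble."""
    w = np.zeros(X.shape[1])
    ensemble = []
    for x_t, y_t in zip(X, y):
        ensemble.append(w.copy())
        if y_t * (w @ x_t) <= 0:       # mistake: standard Perceptron update
            w = w + y_t * x_t
    return ensemble

def select_hypothesis(ensemble, X, y):
    """Simplified online-to-batch selection: return the hypothesis with
    the lowest empirical error on the examples that follow it (the
    paper's confidence penalty is omitted for brevity)."""
    best_w, best_err = ensemble[0], float("inf")
    for t, w in enumerate(ensemble):
        err = np.mean(y[t:] * (X[t:] @ w) <= 0)
        if err < best_err:
            best_w, best_err = w, err
    return best_w

# Toy usage on synthetic linearly separable data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.3, 1.5]))
w_hat = select_hypothesis(perceptron_ensemble(X, y), X, y)
```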
SIAM Journal on Computing | 2005
Nicolò Cesa-Bianchi; Alex Conconi; Claudio Gentile
Kernel-based linear-threshold algorithms, such as support vector machines and Perceptron-like algorithms, are among the best available techniques for solving pattern classification problems. In this paper, we describe an extension of the classical Perceptron algorithm, called second-order Perceptron, and analyze its performance within the mistake bound model of on-line learning. The bound achieved by our algorithm depends on the sensitivity to second-order data information and is the best known mistake bound for (efficient) kernel-based linear-threshold classifiers to date. This mistake bound, which strictly generalizes the well-known Perceptron bound, is expressed in terms of the eigenvalues of the empirical data correlation matrix and depends on a parameter controlling the sensitivity of the algorithm to the distribution of these eigenvalues. Since the optimal setting of this parameter is not known a priori, we also analyze two variants of the second-order Perceptron algorithm: one that adaptively sets the value of the parameter in terms of the number of mistakes made so far, and one that is parameterless, based on pseudoinverses.
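The update rule is compact enough to sketch directly. Below is a minimal numpy illustration of the second-order Perceptron in primal form, with the sensitivity parameter exposed as `a`; this is a sketch of the rule described in the abstract, not the authors' reference implementation.

```python
import numpy as np

def second_order_perceptron(X, y, a=1.0):
    """Second-order Perceptron (sketch). M accumulates the correlation
    matrix of past mistake instances; large `a` recovers first-order
    (Perceptron-like) behavior, small `a` full second-order behavior."""
    d = X.shape[1]
    M = a * np.eye(d)          # a*I + sum of x_k x_k^T over mistake rounds
    v = np.zeros(d)            # sum of y_k x_k over mistake rounds
    mistakes = 0
    for x_t, y_t in zip(X, y):
        A = M + np.outer(x_t, x_t)               # include current instance
        margin = v @ np.linalg.solve(A, x_t)     # whitened prediction
        y_hat = 1.0 if margin >= 0 else -1.0
        if y_hat != y_t:                         # mistake round: store it
            M += np.outer(x_t, x_t)
            v += y_t * x_t
            mistakes += 1
    return v, M, mistakes
```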
conference on learning theory | 2006
Nicolò Cesa-Bianchi; Claudio Gentile
Shifting bounds for on-line classification algorithms ensure good performance on any sequence of examples that is well predicted by a sequence of smoothly changing classifiers. When proving shifting bounds for kernel-based classifiers, one also faces the problem of storing a number of support vectors that can grow unboundedly, unless an eviction policy is used to keep this number under control. In this paper, we show that shifting and on-line learning on a budget can be combined surprisingly well. First, we introduce and analyze a shifting Perceptron algorithm achieving the best known shifting bounds while using an unlimited budget. Second, we show that by applying to the Perceptron algorithm the simplest possible eviction policy, which discards a random support vector each time a new one comes in, we achieve a shifting bound close to the one we obtained with no budget restrictions. More importantly, we show that our randomized algorithm strikes the optimal trade-off $U = \Theta(\sqrt{B})$ between budget $B$ and norm $U$ of the largest classifier in the comparison sequence.
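The random-eviction policy analyzed here is simple enough to sketch. The fragment below is an illustration assuming a linear kernel by default: on each mistake the new example becomes a support vector, and when the budget B is already full a uniformly random support vector is discarded first.

```python
import numpy as np

def randomized_budget_perceptron(X, y, B, kernel=np.dot, rng=None):
    """Kernel Perceptron on a budget (sketch). On a mistake the new
    example is stored as a support vector; if the budget B is full,
    a support vector chosen uniformly at random is evicted first."""
    rng = rng or np.random.default_rng()
    support, signs = [], []          # stored examples and their labels
    for x_t, y_t in zip(X, y):
        margin = sum(s * kernel(v, x_t) for v, s in zip(support, signs))
        if y_t * margin <= 0:        # mistake round
            if len(support) >= B:    # budget full: random eviction
                i = rng.integers(len(support))
                support.pop(i)
                signs.pop(i)
            support.append(x_t)
            signs.append(y_t)
    return support, signs
```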
international conference on machine learning | 2006
Nicolò Cesa-Bianchi; Claudio Gentile; Luca Zaniboni
Machine Learning | 2007
Giovanni Cavallanti; Nicolò Cesa-Bianchi; Claudio Gentile
international conference on machine learning | 2009
Nicolò Cesa-Bianchi; Claudio Gentile; Francesco Orabona
We study hierarchical classification in the general case when an instance could belong to more than one class node in the underlying taxonomy. Experiments done in previous work showed that a simple hierarchy of Support Vector Machines (SVM) with a top-down evaluation scheme has surprisingly good performance on this kind of task. In this paper, we introduce a refined evaluation scheme which turns the hierarchical SVM classifier into an approximator of the Bayes optimal classifier with respect to a simple stochastic model for the labels. Experiments on synthetic datasets, generated according to this stochastic model, show that our refined algorithm outperforms the simple hierarchical SVM. On real-world data, however, the advantage brought by our approach is a bit less clear. We conjecture this is due to a higher noise rate for the training labels in the low levels of the taxonomy.
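The baseline top-down evaluation scheme referenced above can be sketched as follows. This is an illustration under assumed data structures (a `taxonomy` dict mapping nodes to children and per-node binary decision functions), not the refined scheme introduced in the paper.

```python
def top_down_evaluate(x, taxonomy, classifiers, root):
    """Top-down evaluation over a taxonomy (sketch): a node may be
    predicted positive only if its parent was predicted positive.
    `classifiers[node]` is a trained binary decision function, e.g.
    the margin of a per-node SVM."""
    positive = set()
    frontier = [root]
    while frontier:
        node = frontier.pop()
        if classifiers[node](x) > 0:   # node fires: descend to children
            positive.add(node)
            frontier.extend(taxonomy.get(node, []))
    return positive

# Toy usage with constant decision functions
taxonomy = {"root": ["animals", "plants"], "animals": ["dogs", "cats"]}
clfs = {n: (lambda x: 1.0) for n in ["root", "animals", "plants", "dogs", "cats"]}
print(top_down_evaluate(None, taxonomy, clfs, "root"))
```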
international conference on machine learning | 2011
Koby Crammer; Claudio Gentile
conference on learning theory | 2003
Nicolò Cesa-Bianchi; Alex Conconi; Claudio Gentile
IEEE Transactions on Information Theory | 2008
Nicolò Cesa-Bianchi; Claudio Gentile
Machine Learning | 2011
Giovanni Cavallanti; Nicolò Cesa-Bianchi; Claudio Gentile