Ignazio Pillai | Researchain

Publication

Featured research published by Ignazio Pillai.

Proceedings of the 2013 ACM Workshop on Artificial Intelligence and Security | 2013

Is data clustering in adversarial settings secure?

Battista Biggio; Ignazio Pillai; Samuel Rota Bulò; Davide Ariu; Marcello Pelillo; Fabio Roli

Clustering algorithms have been increasingly adopted in security applications to spot dangerous or illicit activities. However, they were not originally devised to deal with deliberate attack attempts that may aim to subvert the clustering process itself. Whether clustering can be safely adopted in such settings thus remains questionable. In this work we propose a general framework that allows one to identify potential attacks against clustering algorithms, and to evaluate their impact, by making specific assumptions on the adversary's goal, knowledge of the attacked system, and capabilities of manipulating the input data. We show that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated to be hidden within some existing clusters. We present a case study on single-linkage hierarchical clustering, and report experiments on clustering of malware samples and handwritten digits.
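
The poisoning idea can be illustrated with a toy sketch. It is not the attack from the paper: the data, the cut height and the function name `single_linkage_clusters` are assumptions made here for illustration. It relies only on the standard fact that cutting a single-linkage dendrogram at a given height yields the connected components of the graph linking all points within that distance. In this tiny example the number of injected points is not small relative to the clean data, so it shows the bridging mechanism only, not the attack efficiency reported in the abstract.

```python
from itertools import combinations


def single_linkage_clusters(points, cut):
    """Count clusters obtained by cutting a single-linkage dendrogram at `cut`.

    Equivalent to counting connected components of the graph that links
    every pair of 1-D points at distance <= cut (union-find below).
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if abs(points[i] - points[j]) <= cut:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Two well-separated groups of clean samples (hypothetical data).
clean = [0.0, 0.5, 1.0, 9.0, 9.5, 10.0]
# Attack samples forming a chain ("bridge") between the two groups.
bridge = [2.0 + k for k in range(7)]  # 2.0, 3.0, ..., 8.0

n_clean = single_linkage_clusters(clean, cut=1.0)
n_poisoned = single_linkage_clusters(clean + bridge, cut=1.0)
print(n_clean, n_poisoned)
```

Because single linkage merges clusters on their closest pair of points, a chain of closely spaced samples is enough to make the two groups collapse into one.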

Pattern Recognition | 2013

Multi-label classification with a reject option

Ignazio Pillai; Giorgio Fumera; Fabio Roli

We consider multi-label classification problems in application scenarios where classifier accuracy is not satisfactory, but manual annotation is too costly. In single-label problems, a well-known solution consists of using a reject option, i.e., allowing a classifier to withhold unreliable decisions, leaving them (and only them) to human operators. We argue that this solution can also be exploited in multi-label problems. However, the current theoretical framework for classification with a reject option applies only to single-label problems. We thus develop a specific framework for multi-label ones. In particular, we extend multi-label accuracy measures to take into account rejections, and define manual annotation cost as a cost function. We then formalise the goal of attaining a desired trade-off between classifier accuracy on non-rejected decisions, and the cost of manually handling rejected decisions, as a constrained optimisation problem. We finally develop two possible implementations of our framework, tailored to the widely used F accuracy measure, and to the only cost models proposed so far for multi-label annotation tasks, and experimentally evaluate them on five application domains.
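
A reject option on per-label decisions can be sketched as below. This is a simple margin heuristic chosen here for illustration, not the constrained-optimisation framework the abstract describes; the function name `decide_with_reject`, the score values, and the `threshold` and `margin` parameters are all hypothetical.

```python
def decide_with_reject(scores, threshold=0.5, margin=0.2):
    """Return True/False per label, or None when the decision is withheld.

    A decision is rejected (None) when the score lies closer than `margin`
    to the decision threshold, i.e. when it is considered unreliable.
    """
    decisions = {}
    for label, score in scores.items():
        if abs(score - threshold) < margin:
            decisions[label] = None  # left to a human annotator
        else:
            decisions[label] = score >= threshold
    return decisions


# Hypothetical classifier scores for one document.
scores = {"sports": 0.93, "politics": 0.55, "finance": 0.08}
decisions = decide_with_reject(scores)
rejection_rate = sum(d is None for d in decisions.values()) / len(decisions)
print(decisions, rejection_rate)
```

Widening `margin` raises the rejection rate, and with it the manual annotation cost, in exchange for keeping only the more confident automatic decisions.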

Pattern Recognition | 2013

Threshold optimisation for multi-label classifiers

Ignazio Pillai; Giorgio Fumera; Fabio Roli

Many multi-label classifiers provide a real-valued score for each class. A well-known design approach consists of tuning the corresponding decision thresholds by optimising the performance measure of interest. We address two open issues related to the optimisation of the widely used F measure and precision-recall (P-R) curve, with respect to the class-related decision thresholds, on a given data set. (i) We derive properties of the micro-averaged F, which allow its global maximum to be found by an optimisation strategy with a low computational cost. So far, only a suboptimal threshold selection rule and a greedy algorithm with no optimality guarantee were known. (ii) We rigorously define the macro- and micro-P-R curves, analyse a previously suggested strategy for computing them, based on maximising F, and develop two possible implementations, which can also be exploited for optimising related performance measures. We evaluate our algorithms on five data sets related to three different application domains.
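
Threshold tuning can be sketched for the simplest case: a single class, with the threshold chosen by exhaustive search over the observed scores to maximise F1. This is not the low-cost strategy for the micro-averaged F that the abstract refers to; the data and the names `f1` and `best_threshold` are assumptions made for illustration.

```python
def f1(tp, fp, fn):
    """F1 from confusion counts; defined as 0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores, labels):
    """Try every observed score as a threshold and keep the one with highest F1.

    A sample is predicted positive when its score is >= the threshold.
    """
    best_t, best_f = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        f = f1(tp, fp, fn)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


# Hypothetical scores and ground-truth labels for one class.
scores = [0.1, 0.4, 0.35, 0.8, 0.7]
labels = [False, False, True, True, True]
t, f = best_threshold(scores, labels)
print(t, f)
```

For a single class this search is cheap. The micro-averaged F couples the thresholds of all classes through pooled counts, which is why tuning each class independently is not guaranteed to reach the global maximum.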

International Conference on Image Analysis and Processing | 2003

Classification with reject option in text categorisation systems

Giorgio Fumera; Ignazio Pillai; Fabio Roli

The aim of this paper is to evaluate the potential usefulness of the reject option for text categorisation (TC) tasks. The reject option is a technique used in statistical pattern recognition for improving classification reliability. Our work is motivated by the fact that, although the reject option proved to be useful in several pattern recognition problems, it has not yet been considered for TC tasks. Since TC tasks differ from usual pattern recognition problems in the performance measures used and in the fact that documents can belong to more than one category, we developed a specific rejection technique for TC problems. The performance improvement achievable by using the reject option was experimentally evaluated on the Reuters dataset, which is a standard benchmark for TC systems.

Lecture Notes in Computer Science | 2014

Poisoning Complete-Linkage Hierarchical Clustering

Battista Biggio; Samuel Rota Bulò; Ignazio Pillai; Michele Mura; Eyasu Zemene Mequanint; Marcello Pelillo; Fabio Roli

Clustering algorithms are widely adopted in security applications as a vehicle to detect malicious activities, although little attention has been paid to preventing deliberate attacks from subverting the clustering process itself. Recent work has introduced a methodology for the security analysis of data clustering in adversarial settings, aimed at identifying potential attacks against clustering algorithms and evaluating their impact. The authors have shown that single-linkage hierarchical clustering can be severely affected by the presence of a very small fraction of carefully crafted poisoning attack samples in the input data, highlighting that the clustering algorithm may itself be the weakest link in a security system. In this paper, we extend this analysis to the case of complete-linkage hierarchical clustering by devising an ad hoc poisoning attack. We verify its effectiveness on artificial data and on application examples related to the clustering of malware and handwritten digits.

International Conference on Image Analysis and Processing | 2011

A classification approach with a reject option for multi-label problems

Ignazio Pillai; Giorgio Fumera; Fabio Roli

We investigate the implementation of multi-label classification algorithms with a reject option, as a means of reducing the time required of human annotators and of attaining a higher classification accuracy on automatically classified samples than can be obtained without a reject option. Based on a recently proposed model of manual annotation time, we identify two approaches to implementing a reject option, related to the two main manual annotation methods: browsing and tagging. In this paper we focus on the approach suited to tagging, which consists of withholding either all or none of the category assignments of a given sample. We develop classification reliability measures to decide whether or not to reject a sample, aimed at maximising classification accuracy on non-rejected ones. We finally evaluate the trade-off between classification accuracy and rejection rate that can be attained by our method, on three benchmark data sets related to text categorisation and image annotation tasks.

Lecture Notes in Computer Science | 2004

A Two-Stage Classifier with Reject Option for Text Categorisation

Giorgio Fumera; Ignazio Pillai; Fabio Roli

In this paper, we investigate the usefulness of the reject option in text categorisation systems. The reject option is introduced by allowing a text classifier to withhold the decision on whether to assign a document to any subset of categories for which the decision is considered not sufficiently reliable. To automatically handle rejections, a two-stage classifier architecture is used, in which documents rejected at the first stage are automatically classified at the second stage, so that no rejections eventually remain. The performance improvement achievable by using the reject option is assessed on a real text categorisation task, using the well-known Reuters data set.

IEEE Transactions on Neural Networks | 2017

Randomized Prediction Games for Adversarial Machine Learning

Samuel Rota Bulò; Battista Biggio; Ignazio Pillai; Marcello Pelillo; Fabio Roli

In spam and malware detection, attackers exploit randomization to obfuscate malicious data and increase their chances of evading detection at test time, e.g., malware code is typically obfuscated using random strings or byte sequences to hide known exploits. Interestingly, randomization has also been proposed to improve security of learning algorithms against evasion attacks, as it results in hiding information about the classifier from the attacker. Recent work has proposed game-theoretical formulations to learn secure classifiers, by simulating different evasion attacks and modifying the classification function accordingly. However, both the classification function and the simulated data manipulations have been modeled in a deterministic manner, without accounting for any form of randomization. In this paper, we overcome this limitation by proposing a randomized prediction game, namely, a noncooperative game-theoretic formulation in which the classifier and the attacker make randomized strategy selections according to some probability distribution defined over the respective strategy set. We show that our approach allows one to improve the tradeoff between attack detection and false alarms with respect to the state-of-the-art secure classifiers, even against attacks that are different from those hypothesized during design, on application examples including handwritten digit recognition, spam, and malware detection.

International Conference on Image Analysis and Processing | 2011

Exploiting depth information for indoor-outdoor scene classification

Ignazio Pillai; Riccardo Satta; Giorgio Fumera; Fabio Roli

A rapid diffusion of stereoscopic image acquisition devices is expected in the coming years. Among the different potential applications that depth information can enable, in this paper we focus on its exploitation as a novel information source in the task of scene classification, and in particular to discriminate between indoor and outdoor images. This issue has not been addressed so far in the literature, probably because the extraction of depth information from two-dimensional images is a computationally demanding task. However, new-generation stereo cameras will allow a very fast computation of depth maps. We experimentally show that depth information alone provides a discriminant capability between indoor and outdoor images close to that of state-of-the-art methods based on colour, edge and texture information, and that it improves their performance when used as an additional information source.

International Conference on Pattern Recognition | 2014

Learning of Multilabel Classifiers

Ignazio Pillai; Giorgio Fumera; Fabio Roli

Developing learning algorithms for multilabel classification problems, when the goal is to maximize the micro-averaged F measure, is a difficult problem for which no solution was known so far. In this paper we provide an exact solution for the case when the popular binary relevance approach is used for designing a multilabel classifier. We prove that the empirical maximum of the micro-averaged F measure can be attained by iteratively retraining class-related binary classifiers whose learning algorithm is capable of maximizing a modified version of the F measure of a two-class problem. We apply our optimization strategy to an existing formulation of support vector machine classifiers tailored to performance measures like F, and evaluate it on benchmark multilabel data sets.
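
For readers unfamiliar with the measure, the sketch below computes the micro-averaged F1 from per-class confusion counts and contrasts it with the macro average. It uses the standard definitions (micro-averaging pools the counts over classes before computing F1; macro-averaging computes F1 per class and then takes the mean) and does not reproduce the learning algorithm of the paper. The counts and function names are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 from confusion counts; 0 when the class has no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(counts):
    """Pool true positives, false positives and false negatives over classes."""
    tp = sum(c["tp"] for c in counts)
    fp = sum(c["fp"] for c in counts)
    fn = sum(c["fn"] for c in counts)
    return f1(tp, fp, fn)


def macro_f1(counts):
    """Average the per-class F1 values, weighting every class equally."""
    return sum(f1(c["tp"], c["fp"], c["fn"]) for c in counts) / len(counts)


# Hypothetical counts: one frequent, well-classified class and one rare, poor one.
counts = [
    {"tp": 90, "fp": 10, "fn": 10},
    {"tp": 1, "fp": 4, "fn": 5},
]
micro = micro_f1(counts)
macro = macro_f1(counts)
print(round(micro, 4), round(macro, 4))
```

Because the pooled counts are dominated by the frequent class, the micro average stays high here while the macro average is pulled down by the rare class.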
