Marco Vannucci

Sant'Anna School of Advanced Studies

Publication

Featured researches published by Marco Vannucci.

Neurocomputing | 2014

A method for resampling imbalanced datasets in binary classification tasks for real-world problems

Silvia Cateni; Valentina Colla; Marco Vannucci

The paper presents a novel resampling method for binary classification problems on imbalanced datasets. Imbalanced datasets are frequently found in industrial applications: for instance, the occurrence of particular product defects, the diagnosis of severe diseases in a series of patients, and machine faults are rare events whose detection is of utmost importance. The proposed resampling method combines an oversampling and an undersampling technique. Several tests have been carried out to assess the efficiency of the proposed method. Four classifiers based, respectively, on Support Vector Machine, Decision Tree, labelled Self-Organizing Map and Bayesian Classifiers have been developed and applied for binary classification on four datasets: a synthetic dataset, a widely used public dataset and two datasets coming from industrial applications. The results obtained in the tests are presented and discussed in the paper. In particular, the performance achieved by the four classifiers with the proposed resampling approach is compared with that obtained without any resampling, with a widely applied and well-known resampling technique (the classical SMOTE approach), and with another approach coupling informed SMOTE-based oversampling and informed clustering-based undersampling.
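The abstract does not give the details of the proposed method, so the following is only a minimal sketch of the general idea of coupling oversampling with undersampling. It uses plain random duplication and random removal, not the informed strategies of the paper; the function name `resample` and the parameters `minority`, `ratio` and `seed` are illustrative assumptions.

```python
import random
from collections import Counter


def resample(samples, labels, minority, ratio=0.5, seed=0):
    """Rebalance a binary dataset while keeping its total size unchanged.

    Minority rows are duplicated at random (oversampling) and majority rows
    are dropped at random (undersampling) until the minority class makes up
    roughly `ratio` of the data. This is plain random resampling, used here
    only as a stand-in for the informed techniques named in the abstract.
    """
    if not 0 < ratio < 1:
        raise ValueError("ratio must be strictly between 0 and 1")
    rng = random.Random(seed)
    minority_rows = [s for s, l in zip(samples, labels) if l == minority]
    majority = [(s, l) for s, l in zip(samples, labels) if l != minority]
    if not minority_rows or not majority:
        raise ValueError("both classes must be present")

    total = len(samples)
    # Never shrink the minority class: only grow it towards the target share.
    n_min = max(len(minority_rows), round(total * ratio))
    n_maj = total - n_min

    extra = [rng.choice(minority_rows) for _ in range(n_min - len(minority_rows))]
    oversampled = [(s, minority) for s in minority_rows + extra]
    undersampled = rng.sample(majority, n_maj)

    combined = oversampled + undersampled
    rng.shuffle(combined)
    return [s for s, _ in combined], [l for _, l in combined]


# Toy data: 10 rare events (label 1) among 100 observations.
samples = list(range(100))
labels = [1 if i < 10 else 0 for i in samples]
new_samples, new_labels = resample(samples, labels, minority=1, ratio=0.4)
counts = Counter(new_labels)
```

In this toy run the minority share rises from 10% to the requested 40% while the dataset keeps its original 100 rows. Informed variants would replace the two random draws with neighbourhood-based or cluster-based choices.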

Applied Soft Computing | 2011

Novel classification method for sensitive problems and uneven datasets based on neural networks and fuzzy logic

Marco Vannucci; Valentina Colla

This paper describes a novel binary classification method named LASCUS that can be applied to uneven datasets and sensitive problems such as malfunction detection. The method aims at filling the gap left by traditional algorithms, which have difficulty coping with unbalanced datasets and are not able to satisfactorily recognize infrequent patterns. The proposed method is based on the use of a self-organizing map (SOM) and of a fuzzy inference system (FIS). The SOM creates a set of clusters to be associated with either frequent or infrequent situations, while the FIS determines this association on the basis of the data distribution. The method has been tested on the widely used Wisconsin breast cancer benchmark database and on two industrial applications. The obtained results, which are discussed in the paper, are encouraging and in line with expectations.

international conference on robotics and automation | 2008

Outlier Detection Methods for Industrial Applications

Silvia Cateni; Valentina Colla; Marco Vannucci

An outlier is an observation (or measurement) that differs from the other values contained in a given dataset. Outliers can be due to several causes: the measurement can be incorrectly observed, recorded or entered into the process computer, or the observed datum can come from a different population with respect to the normal situation, in which case it is correctly measured but represents a rare event. Different definitions of outlier exist in the literature; the most commonly cited are the following:

- “An outlier is an observation that deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism” (Hawkins, 1980).
- “An outlier is an observation (or subset of observations) which appear to be inconsistent with the remainder of the dataset” (Barnett & Lewis, 1994).
- “An outlier is an observation that lies outside the overall pattern of a distribution” (Moore and McCabe, 1999).
- “Outliers are those data records that do not follow any pattern in an application” (Chen et al., 2002).
- “An outlier in a set of data is an observation or a point that is considerably dissimilar or inconsistent with the remainder of the data” (Ramaswamy et al., 2000).

Many data mining algorithms try to minimize the influence of outliers, for instance on the final model to be developed, or to eliminate them in the data pre-processing phase. However, a data miner should be careful when automatically detecting and eliminating outliers because, if the data are correct, their elimination can cause the loss of important hidden information (Kantardzic, 2003). Some data mining applications are focused on outlier detection itself, and the outliers are the essential result of the data analysis (Sane & Ghatol, 2006). Outlier detection techniques find applications in credit card fraud, network robustness analysis, network intrusion detection, financial applications and marketing (Han & Kamber, 2001).

A more exhaustive list of applications that exploit outlier detection is provided below (Hodge, 2004):

- Fraud detection: fraudulent applications for credit cards or state benefits, or fraudulent usage of credit cards or mobile phones.
- Loan application processing: fraudulent applications or potentially problematic customers.
- Intrusion detection, such as unauthorized access in computer networks.

intelligent systems design and applications | 2009

General Purpose Input Variables Extraction: A Genetic Algorithm Based Procedure GIVE A GAP

Silvia Cateni; Valentina Colla; Marco Vannucci

The paper presents an application of genetic algorithms to the problem of input variables selection for the design of neural systems. The basic idea of the proposed method lies in the use of genetic algorithms in order to select the set of variables to be fed to the neural networks. However, the main concept behind this approach is far more general and does not depend on the particular adopted model: it can be used for a wide category of systems, also non-neural, and with a variety of performance indicators. The proposed method has been tested on a simple case study, in order to demonstrate its effectiveness. The results obtained in the processing of experimental data are presented and discussed.
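The abstract does not specify the genetic operators or the performance indicator used, so the following is only a minimal sketch of how a genetic algorithm can search over variable subsets. Each individual is a 0/1 mask over the candidate variables, and the fitness function is supplied by the caller, which reflects the model-independence the abstract mentions. The name `ga_select`, the operator choices (tournament selection, one-point crossover, bit-flip mutation, elitism) and the toy fitness are all illustrative assumptions.

```python
import random


def ga_select(n_vars, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a 0/1 variable mask that maximises `fitness(mask)`.

    Returns the best mask found, its score, and the best score recorded
    for each generation (non-decreasing thanks to elitism, provided the
    fitness function is deterministic).
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_vars)]
                  for _ in range(pop_size)]
    history = []

    def tournament(scored):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        best_score, best = max(scored, key=lambda pair: pair[0])
        history.append(best_score)
        next_population = [best[:]]  # elitism: the best mask always survives
        while len(next_population) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if n_vars > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_vars - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: each variable toggles with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population

    scored = [(fitness(ind), ind) for ind in population]
    best_score, best = max(scored, key=lambda pair: pair[0])
    history.append(best_score)
    return best, best_score, history


# Hypothetical indicator: variables 0 and 3 are useful, the others cost 0.1 each.
RELEVANT = {0, 3}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in RELEVANT)
    return hits - 0.1 * extras


mask, score, history = ga_select(6, toy_fitness)
```

In a real setting `toy_fitness` would be replaced by a validation score of the neural (or non-neural) model trained on the variables selected by the mask.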

ieee international conference on fuzzy systems | 2010

A fuzzy inference system applied to defect detection in flat steel production

Alice Borselli; Valentina Colla; Marco Vannucci; Marco Veroli

Recently, the use of vision systems for quality control has increased considerably in many industrial fields, mainly due to the technological progress of such systems, which has made their performance more appealing and more reliable than in the past while decreasing the associated costs. The advantages of these kinds of systems in terms of savings in human resources and improved quality monitoring have become far more evident, encouraging their adoption in a wide variety of production cycles. The present paper deals with the processing of, and information extraction from, images that represent portions of the surface of flat steel products, and describes an algorithm for defect detection and classification. The overall classification procedure is composed of two parts. The preliminary part is mostly related to image processing and analysis, and aims at pointing out the defect (independently of the class it belongs to) and at extracting relevant features of the detected defect. The second part exploits a fuzzy inference system in order to analyze the type of defect, and solves a classification problem that presently can be addressed only with the support of a human operator. Fuzzy inference systems are suitable for this application because they are able to mimic and reproduce human reasoning.

ambient intelligence | 2009

Thresholded Neural Networks for Sensitive Industrial Classification Tasks

Marco Vannucci; Valentina Colla; Mirko Sgarbi; Orlando Toscanelli

In this paper a novel classification method for real-world classification tasks is proposed. The method was designed to overcome the difficulties encountered by traditional methods when coping with real-world problems where the key issue is the detection of particular situations, such as machine faults or anomalies, which in some frameworks are hard to recognize due to interacting factors that are analyzed within the paper. The method is described and tested on two industrial problems, which show the validity of the proposed approach and encourage its use in industrial environments.

european modelling symposium | 2014

A Hybrid Feature Selection Method for Classification Purposes

Silvia Cateni; Valentina Colla; Marco Vannucci

This paper presents a novel combination of filter feature selection algorithms for classification problems. Feature selection is one of the most important issues in pattern recognition, machine learning and computer vision. The main objectives of feature selection are to reduce dimensionality, improve machine learning performance and make the process easier to understand. Exhaustive search is the only method guaranteed to find the optimal subset, but its computational time complexity is exponential. In this paper the set of available variables is first reduced using a combination of filter selection methods, and exhaustive search is then performed in order to obtain a sub-optimal set of variables in a reasonable time. The proposed approach is tested on several commonly used datasets from the UCI repository and on two datasets coming from an industrial context.
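The two-stage idea (a cheap filter to shrink the candidate set, then exhaustive search over what remains) can be sketched as follows. The abstract does not name the filters or the classifier used, so this sketch assumes a single correlation-based filter and a leave-one-out 1-nearest-neighbour score; the function names and the toy data are illustrative only.

```python
from itertools import combinations
from statistics import mean


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def filter_stage(X, y, keep):
    """Rank feature indices by |correlation with y| and keep the top `keep`."""
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, row in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((row[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def exhaustive_stage(X, y, candidates):
    """Score every non-empty subset of `candidates`; smaller subsets win ties."""
    best_subset, best_score = None, -1.0
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            score = loo_1nn_accuracy(X, y, subset)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy data: feature 0 separates the classes, feature 1 is misleading,
# feature 2 is constant and feature 3 is uninformative.
X = [
    [0.0, 5.0, 1.0, 0.3], [0.1, 1.0, 1.0, 0.9],
    [0.2, 4.0, 1.0, 0.1], [0.3, 2.0, 1.0, 0.7],
    [1.0, 5.1, 1.0, 0.2], [1.1, 1.1, 1.0, 0.8],
    [1.2, 4.1, 1.0, 0.4], [1.3, 2.1, 1.0, 0.6],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

candidates = filter_stage(X, y, keep=2)
best_subset, best_score = exhaustive_stage(X, y, candidates)
```

The filter stage bounds the cost of the second stage: with `keep` candidates the exhaustive search evaluates 2^keep - 1 subsets instead of one for every subset of the full variable set.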

european symposium on computer modeling and simulation | 2008

Model Parameters Optimisation for an Industrial Application: A Comparison between Traditional Approaches and Genetic Algorithms

Valentina Colla; Gianluca Bioli; Marco Vannucci

Model parameter optimisation is a very common problem when dealing with mathematical models. These models are often designed from theoretical considerations on the underlying physical phenomena and afterwards adapted in order to fit the experimental data collected in the real operating scenario. The paper compares different approaches to the problem of finding the optimal values of four parameters characterising a fairly complex mathematical model, which estimates some important mechanical properties of aluminium-killed and interstitial-free steels. Several optimisation procedures have been attempted, from traditional methods to genetic algorithms. The methods are compared by illustrating and discussing numerical results.

Archive | 2013

Variable Selection and Feature Extraction Through Artificial Intelligence Techniques

Silvia Cateni; Marco Vannucci; Marco Vannocci; Valentina Colla

international work conference on artificial and natural neural networks | 2009

Estimation of train speed via neuro-fuzzy techniques

Valentina Colla; Marco Vannucci; B. Allotta; M. Malvezzi

The paper describes and compares some applications of neuro-fuzzy (NF) systems to estimate the speed of a train from the measurement of the velocity of two axles in any wheel/rail adhesion conditions. All the presented NF approaches outperform the initially designed crisp algorithm in terms of computational burden, and some of them also achieve a significant performance improvement, demonstrating their capability of learning from rough data.
