Barbara Pes | Researchain

Publication

Featured research published by Barbara Pes.

Future Generation Computer Systems | 2011

Extending the SOA paradigm to e-Science environments

Andrea Bosin; Nicoletta Dessì; Barbara Pes

In the business world, Service Oriented Architecture (SOA) has recently gained popularity as a new approach to integrate business applications. In a similar way, scientific workflows can accomplish the automation of large-scale e-Science applications. However, the use of workflows in scientific environments differs from that in business environments. Scientific workflows need to handle large amounts of data, deal with heterogeneous and distributed resources such as the Grid, and require specialized software applications that are written in diverse programming languages, most of which are not popular in business environments. In this paper, we analyze the preparedness and the shortcomings of the SOA paradigm in addressing the needs of e-Science and the extent to which this can be done. The paper identifies the characteristics of a Virtual Organization providing scientific services, and presents a model placing particular emphasis on BPEL processes as a means of supporting the interaction with Web Services. We discuss the challenges encountered in the seamless integration of BPEL processes within an e-Science infrastructure and we propose a set of complementary infrastructural services. By providing business utilities and automation technology founded on Web Services, infrastructural services cooperate with BPEL in ensuring on-demand resource provisioning for the execution of scientific workflows, while addressing some critical issues such as security, access control and monitoring. Furthermore, the paper presents our experience in adopting the proposed approach within a collaborative environment for bioinformatics. To illustrate how a scientific experiment can be formalized and executed, we focus on micro-array data processing, a field that will be increasingly common in applications of machine learning to molecular biology.

Expert Systems With Applications | 2015

Similarity of feature selection methods

Nicoletta Dessì; Barbara Pes

We empirically investigated the similarity among feature selection methods. Extensive experiments were carried out across high dimensional classification tasks. We obtained useful insight into the pattern of agreement of eight popular methods. In the past two decades, the dimensionality of datasets involved in machine learning and data mining applications has increased explosively. Therefore, feature selection has become a necessary step to make the analysis more manageable and to extract useful knowledge about a given domain. A large variety of feature selection techniques are available in the literature, and their comparative analysis is a very difficult task. So far, few studies have investigated, from a theoretical and/or experimental point of view, the degree of similarity/dissimilarity among the available techniques, namely the extent to which they tend to produce similar results within specific application contexts. This kind of similarity analysis is of crucial importance when two or more methods are combined in an ensemble fashion: indeed the ensemble paradigm is beneficial only if the involved methods are capable of giving different and complementary representations of the considered domain. This paper contributes in this direction by proposing an empirical approach to evaluate the degree of consistency among the outputs of different selection algorithms in the context of high dimensional classification tasks. Leveraging a suitable similarity index, we systematically compared the feature subsets selected by eight popular selection methods, representative of different selection approaches, and derived a similarity trend for feature subsets of increasing size. Through extensive experimentation involving sixteen datasets from three challenging domains (Internet advertisements, text categorization and micro-array data classification), we obtained useful insight into the pattern of agreement of the considered methods. In particular, our results revealed how multivariate selection approaches systematically produce feature subsets that overlap to a small extent with those selected by the other methods.
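The idea of a similarity trend over subsets of increasing size can be illustrated with a minimal sketch. The abstract does not name the similarity index used, so the Jaccard index below is an assumption, and the method names and rankings are made-up illustration data rather than results from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two feature subsets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_trend(rankings, sizes):
    """For each subset size k, average the pairwise similarity between the
    top-k features chosen by every pair of selection methods."""
    pairs = list(combinations(rankings, 2))
    trend = {}
    for k in sizes:
        total = sum(jaccard(rankings[m1][:k], rankings[m2][:k]) for m1, m2 in pairs)
        trend[k] = total / len(pairs)
    return trend


# Hypothetical rankings (best feature first) from three made-up methods.
rankings = {
    "filter_a": ["f1", "f2", "f3", "f4", "f5", "f6"],
    "filter_b": ["f2", "f1", "f4", "f3", "f6", "f5"],
    "multivariate_c": ["f6", "f5", "f1", "f2", "f3", "f4"],
}

trend = similarity_trend(rankings, sizes=[2, 4, 6])
for k, value in trend.items():
    print(f"top-{k}: mean pairwise similarity = {value:.3f}")
```

In this toy data the two filters agree fully at every size, while the third ranking overlaps with them only as the subsets grow, which is the kind of pattern a similarity trend is meant to expose.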

BioMed Research International | 2013

A Comparative Analysis of Biomarker Selection Techniques

Nicoletta Dessì; Emanuele Pascariello; Barbara Pes

Feature selection has become an essential step in biomarker discovery from high-dimensional genomics data. It is recognized that different feature selection techniques may result in different sets of biomarkers, that is, different groups of genes highly correlated to a given pathological condition, but few direct comparisons exist which quantify these differences in a systematic way. In this paper, we propose a general methodology for comparing the outcomes of different selection techniques in the context of biomarker discovery. The comparison is carried out along two dimensions: (i) measuring the similarity/dissimilarity of selected gene sets; (ii) evaluating the implications of these differences in terms of both predictive performance and stability of selected gene sets. As a case study, we considered three benchmarks deriving from DNA microarray experiments and conducted a comparative analysis among eight selection methods, representative of different classes of feature selection techniques. Our results show that the proposed approach can provide useful insight into the pattern of agreement of biomarker discovery techniques.

Journal of Artificial Evolution and Applications | 2009

An evolutionary method for combining different feature selection criteria in microarray data classification

Nicoletta Dessì; Barbara Pes

The classification of cancers from gene expression profiles is a challenging research area in bioinformatics since the high dimensionality of microarray data results in irrelevant and redundant information that affects the performance of classification. This paper proposes using an evolutionary algorithm to select relevant gene subsets in order to further use them for the classification task. This is achieved by combining valuable results from different feature ranking methods into feature pools whose dimensionality is reduced by a wrapper approach involving a genetic algorithm (GA) and an SVM classifier. Specifically, the GA explores the space defined by each feature pool looking for solutions that balance the size of the feature subsets and their classification accuracy. Experiments demonstrate that the proposed method provides good results in comparison to different state-of-the-art methods for the classification of microarray data.
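A GA that trades subset size against accuracy can be sketched roughly as follows. This is not the authors' implementation: the fitness formula, the `size_weight` parameter, the GA operators and the `toy_accuracy` function (a stand-in for the cross-validated accuracy of an SVM) are all assumptions made for illustration.

```python
import random


def make_fitness(evaluate_accuracy, pool_size, size_weight=0.2):
    """Build a fitness function that rewards accuracy and penalises subset size."""
    def fitness(mask):
        n_selected = sum(mask)
        if n_selected == 0:
            return 0.0  # an empty subset cannot be used for classification
        return evaluate_accuracy(mask) - size_weight * n_selected / pool_size
    return fitness


def tournament(population, fitness, rng):
    """Pick the fitter of two distinct, randomly drawn individuals."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def run_ga(fitness, pool_size, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Evolve binary masks over the feature pool (1 = feature selected)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(pool_size)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            cut = rng.randrange(1, pool_size)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Hypothetical stand-in for classifier accuracy: only features 0, 3 and 7 help.
INFORMATIVE = {0, 3, 7}


def toy_accuracy(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return 0.5 + 0.15 * hits


fitness = make_fitness(toy_accuracy, pool_size=10)
best = run_ga(fitness, pool_size=10)
print("selected features:", [i for i, g in enumerate(best) if g])
print("fitness:", round(fitness(best), 3))
```

In a real wrapper, `toy_accuracy` would be replaced by training and validating the classifier on the columns selected by the mask, which is where most of the running time goes.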

International Workshop on Fuzzy Logic and Applications | 2005

Learning Bayesian classifiers from gene-expression microarray data

Andrea Bosin; Nicoletta Dessì; Diego Liberati; Barbara Pes

Computing methods that allow the efficient and accurate processing of experimentally gathered data play a crucial role in biological research. The aim of this paper is to present a supervised learning strategy which combines concepts stemming from coding theory and Bayesian networks for classifying and predicting pathological conditions based on gene expression data collected from micro-arrays. Specifically, we propose the adoption of the Minimum Description Length (MDL) principle as a useful heuristic for ranking and selecting relevant features. Our approach has been successfully applied to the Acute Leukemia dataset and compared with different methods proposed by other researchers.

Information Fusion | 2017

Exploiting the ensemble paradigm for stable feature selection

Barbara Pes; Nicoletta Dessì; Marta Angioni

We discuss the rationale of ensemble feature selection. We empirically evaluate the effectiveness of a data perturbation ensemble approach. Our study involves both univariate and multivariate selection algorithms. A special emphasis is given to the stability level of the selected feature subsets. Useful insight is gained from the analysis of high-dimensional genomic datasets. Ensemble classification is a well-established approach that involves fusing the decisions of multiple predictive models. A similar ensemble logic has been recently applied to challenging feature selection tasks aimed at identifying the most informative variables (or features) for a given domain of interest. In this work, we discuss the rationale of ensemble feature selection and evaluate the effects and the implications of a specific ensemble approach, namely the data perturbation strategy. Basically, it consists of combining multiple selectors that exploit the same core algorithm but are trained on different perturbed versions of the original data. The real potential of this approach, still an object of debate in the feature selection literature, is here investigated in conjunction with different kinds of core selection algorithms (both univariate and multivariate). In particular, we evaluate the extent to which the ensemble implementation improves the overall performance of the selection process, in terms of predictive accuracy and stability (i.e., robustness with respect to changes in the training data). Furthermore, we measure the impact of the ensemble approach on the final selection outcome, i.e. on the composition of the selected feature subsets. The results obtained on ten public genomic benchmarks provide useful insight into both the benefits and the limitations of such an ensemble approach, paving the way for the exploration of new and wider ensemble schemes.
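The data perturbation strategy described above can be sketched in a few lines. The details here are assumptions, not taken from the paper: perturbation is done by stratified bootstrap resampling, the core selector is a toy univariate scorer (absolute difference of class means), and the individual rankings are aggregated by summing rank positions.

```python
import random
from statistics import mean


def mean_difference_scores(X, y):
    """Toy univariate scorer: |mean(class 1) - mean(class 0)| per feature."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return scores


def ensemble_ranking(X, y, scorer, n_bootstraps=20, seed=0):
    """Run the same scorer on perturbed copies of the data and aggregate
    the resulting rankings by summed rank position (lower is better)."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    n_features = len(X[0])
    rank_sums = [0] * n_features
    for _ in range(n_bootstraps):
        # Stratified bootstrap: resample with replacement inside each class,
        # so every perturbed dataset still contains both classes.
        idx = [rng.choice(members)
               for members in by_class.values() for _ in members]
        scores = scorer([X[i] for i in idx], [y[i] for i in idx])
        order = sorted(range(n_features), key=lambda j: -scores[j])
        for rank, j in enumerate(order):
            rank_sums[j] += rank
    return sorted(range(n_features), key=lambda j: rank_sums[j])


# Made-up data: feature 0 separates the classes, features 1 and 2 do not.
X = [
    [0.0, 0.1, 0.5],
    [0.1, 0.3, 0.6],
    [0.2, 0.0, 0.5],
    [0.1, 0.2, 0.6],
    [1.0, 0.2, 0.6],
    [0.9, 0.3, 0.5],
    [1.1, 0.1, 0.6],
    [1.0, 0.3, 0.5],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = ensemble_ranking(X, y, mean_difference_scores)
print("ensemble ranking (best first):", ranking)
```

Any selector that returns one score per feature could be passed as `scorer`, which mirrors the point that the ensemble wraps a single core algorithm rather than mixing several.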

International Conference on Intelligent Information Processing | 2010

A Filter-Based Evolutionary Approach for Selecting Features in High-Dimensional Micro-array Data

Laura Maria Cannas; Nicoletta Dessì; Barbara Pes

Evolutionary algorithms have received much attention for extracting knowledge from high-dimensional micro-array data; a suitable definition of the search space of potential solutions is crucial to their success. In this paper, we present an evolutionary approach for selecting informative genes (features) to predict and diagnose cancer. We propose a procedure that combines the results of filter methods, which are commonly used in the field of data mining, to reduce the search space in which a genetic algorithm looks for solutions (i.e. gene subsets) with better classification performance. The quality (fitness) of each solution is evaluated by a classification method. The methodology is quite general because any classification algorithm could be incorporated, as well as a variety of filter methods. Extensive experiments on a public micro-array dataset are presented using four popular filter methods and SVM.
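One simple way to combine filter results into a reduced search space is to take the union of each filter's top-ranked genes. The abstract does not spell out the combination rule, so the union-of-top-k rule below is an assumption, and the filter names and gene rankings are hypothetical.

```python
def build_feature_pool(rankings, top_k):
    """Union of the top_k features of every filter, in first-seen order."""
    pool = []
    seen = set()
    for ranked in rankings.values():
        for feature in ranked[:top_k]:
            if feature not in seen:
                seen.add(feature)
                pool.append(feature)
    return pool


# Hypothetical gene rankings (best first) from four unnamed filter methods.
rankings = {
    "filter_1": ["g3", "g1", "g7", "g2"],
    "filter_2": ["g1", "g5", "g3", "g9"],
    "filter_3": ["g7", "g3", "g2", "g1"],
    "filter_4": ["g5", "g8", "g1", "g3"],
}

pool = build_feature_pool(rankings, top_k=2)
print("feature pool:", pool)
print("candidate subsets for the GA:", 2 ** len(pool))
```

A genetic algorithm would then search over binary masks of length `len(pool)` instead of masks over every gene in the dataset, which is the reduction in search space the abstract refers to.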

Intelligent Data Engineering and Automated Learning | 2007

Capturing heuristics and intelligent methods for improving micro-array data classification

Andrea Bosin; Nicoletta Dessì; Barbara Pes

Classification of micro-array data has been studied extensively, but little research has addressed the classification of micro-array data involving more than two classes. This paper proposes a learning strategy that builds a multi-target classifier and takes advantage of well-known data mining techniques. To address the intrinsic difficulty of selecting features that improve classification accuracy, the paper considers a set of binary classifiers, each of which is devoted to predicting a single class of the multi-class problem. These classifiers are similar to local experts whose knowledge (about the features that are most correlated to each class value) is taken into account by the learning strategy for selecting an optimal set of features. Results of the experiments performed on a publicly available dataset demonstrate the feasibility of the proposed approach.

Industrial and Engineering Applications of Artificial Intelligence and Expert Systems | 2005

Intelligent Bayesian classifiers in network intrusion detection

Andrea Bosin; Nicoletta Dessì; Barbara Pes

The aim of this paper is to explore the effectiveness of Bayesian classifiers in intrusion detection (ID). Specifically, we provide an experimental study that focuses on comparing the accuracy of different classification models showing that the Bayesian classification approach is reasonably effective and efficient in predicting attacks and in exploiting the knowledge required by a computational intelligent ID process.

Business Process Management | 2005

Applying enterprise models to design cooperative scientific environments

Andrea Bosin; Nicoletta Dessì; Maria Grazia Fugini; Diego Liberati; Barbara Pes

Scientific experiments are supported by activities that create, use, communicate and distribute information whose organizational dynamics is similar to processes performed by distributed cooperative enterprise units. On this premise, the aim of this paper is to apply existing enterprise models and processes for designing cooperative scientific experiments. The presented approach assumes the Service Oriented Architecture as the enacting paradigm to formalize experiments as cooperative services on various computational nodes of a network. Specifically, a framework is proposed that defines the responsibility of e-nodes in offering services, and the set of rules under which each service can be accessed by e-nodes through service invocation. By discussing a representative case study, the paper details how specific classes of experiments can be mapped into a service-oriented model whose implementation is carried out in a prototypical scientific environment.
