Fabio Stella | Researchain

Publication

Featured research published by Fabio Stella.

Intelligent Systems Design and Applications | 2009

Automatic Labeling of Topics

Davide Magatti; Silvia Calegari; Davide Ciucci; Fabio Stella

An algorithm for the automatic labeling of topics according to a hierarchy is presented. Its main ingredients are a set of similarity measures and a set of topic labeling rules. The labeling rules are specifically designed to find the labels on which the given topic and the hierarchy agree most. The hierarchy is obtained from the Google Directory service, extracted via an ad hoc software procedure and expanded through the use of the OpenOffice English Thesaurus. The performance of the proposed algorithm is investigated using a document corpus of 33,801 documents and a dictionary of 111,795 words. The results are encouraging, and some particularly interesting and significant labeling cases emerged.

Annals of Operations Research | 2000

Stochastic Nonstationary Optimization for Finding Universal Portfolios

Alexei A. Gaivoronski; Fabio Stella

We apply ideas from stochastic optimization to define universal portfolios. Universal portfolios are the class of portfolios constructed directly from the available observations of stock behavior, without any assumptions about their statistical properties. Cover [7] has shown that, using only observations of past stock prices, one can construct such a portfolio which generates the same asymptotic wealth growth as the best constant rebalanced portfolio constructed with full knowledge of future stock market behavior. In this paper we construct universal portfolios using a different set of ideas drawn from nonstationary stochastic optimization. Our portfolios yield the same asymptotic growth of wealth as the best constant rebalanced portfolio constructed with perfect knowledge of the future, and they are less demanding computationally than previously known universal portfolios. We also present computational evidence using New York Stock Exchange data which shows, among other things, superior performance of portfolios which explicitly take into account possible nonstationary market behavior.
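As a minimal sketch of the benchmark this abstract refers to, the wealth of a constant rebalanced portfolio is the product, over periods, of the weighted price relatives. The function names, the two-asset grid search, and the toy price data below are illustrative assumptions; this is not the authors' algorithm, nor their New York Stock Exchange experiment.

```python
def crp_wealth(price_relatives, weights):
    """Final wealth (starting from 1.0) of a constant rebalanced portfolio.

    price_relatives: sequence of per-period tuples, each entry being
                     price_today / price_yesterday for one asset.
    weights:         fixed portfolio weights, restored at every period.
    """
    wealth = 1.0
    for relatives in price_relatives:
        # Rebalancing to the same weights each period makes the growth
        # factor of the period a weighted average of the price relatives.
        wealth *= sum(b * x for b, x in zip(weights, relatives))
    return wealth


def best_crp_two_assets(price_relatives, steps=100):
    """Hindsight-optimal weight on asset 0 for two assets, by grid search.

    This uses the whole price history, so it is the benchmark with
    "full knowledge of the future", not an on-line policy.
    """
    best_b, best_wealth = 0.0, float("-inf")
    for k in range(steps + 1):
        b = k / steps
        wealth = crp_wealth(price_relatives, (b, 1.0 - b))
        if wealth > best_wealth:
            best_b, best_wealth = b, wealth
    return best_b, best_wealth


# Toy market (assumed data): asset 0 is flat, asset 1 doubles then halves.
toy_market = [(1.0, 2.0), (1.0, 0.5)]
print(crp_wealth(toy_market, (0.5, 0.5)))
print(best_crp_two_assets(toy_market))
```

In this toy market, holding either asset alone ends with wealth 1.0, whereas rebalancing to equal weights each period ends above 1.0, which is why the best constant rebalanced portfolio is a meaningful benchmark.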

Neurocomputing | 2012

Topic model validation

Eduardo H. Ramírez; Ramón F. Brena; Davide Magatti; Fabio Stella

In this paper the problem of performing external validation of the semantic coherence of topic models is considered. The Fowlkes-Mallows index, a known clustering validation metric, is generalized for the case of overlapping partitions and multi-labeled collections, thus making it suitable for validating topic modeling algorithms. In addition, we propose new probabilistic metrics inspired by the concepts of recall and precision. The proposed metrics also have clear probabilistic interpretations and can be applied to validate and compare other soft and overlapping clustering algorithms. The approach is exemplified by using the Reuters-21578 multi-labeled collection to validate LDA models, then using Monte Carlo simulations to show the convergence to the correct results. Additional statistical evidence is provided to better understand the relation of the metrics presented.
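For reference, the classical Fowlkes-Mallows index for hard (non-overlapping) partitions can be sketched as follows. This is only the standard pair-counting form that the paper takes as its starting point, not its generalization to overlapping partitions and multi-labeled collections; the function name and the sample labels are hypothetical.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Classical Fowlkes-Mallows index between two hard clusterings.

    Counts pairs of items: TP pairs share a cluster in both labelings,
    FP pairs share a cluster only in the predicted labeling, and FN pairs
    share a cluster only in the reference labeling.
    The index is TP / sqrt((TP + FP) * (TP + FN)).
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1

    if tp == 0:
        # No co-clustered pair is shared; define the index as 0
        # (this also avoids a division by zero).
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# Hypothetical reference classes and predicted clusters for four documents.
reference = [0, 0, 0, 1]
predicted = [0, 0, 1, 1]
print(fowlkes_mallows(reference, predicted))
```

The index is the geometric mean of pairwise precision and pairwise recall, which is one way to see the link to the recall- and precision-inspired metrics mentioned in the abstract.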

Journal of Economic Dynamics and Control | 2003

On-line portfolio selection using stochastic programming

Alexei A. Gaivoronski; Fabio Stella

This paper is dedicated to the problem of dynamic portfolio optimization when the number of decision periods is large and new information about the market arrives during each period. We propose a family of adaptive portfolio selection policies which rebalance the current portfolio during each decision period by adopting the portfolio, from a specified family, with the best performance on past data. In the absence of transaction costs, general conditions are found under which this policy yields asymptotically the same performance as the best portfolio from the same family constructed with full knowledge of the future. These results are extended to the case of nonzero transaction costs by introducing a class of threshold portfolio optimization policies which rebalance the current portfolio only when its performance differs from that of the best portfolio by a given threshold. The value of this threshold is adapted to changing market conditions. We show that it is possible to select a sequence of threshold values in such a way that the asymptotic influence of transaction costs on portfolio performance is negligible and overall portfolio performance is asymptotically the same as that of a portfolio constructed with perfect knowledge of the future. We assume neither a specific probabilistic structure of the market data nor their stationarity. Our theory is illustrated by numerical experiments with real data. Finally, we discuss the relevance of our results in the context of high performance computing.

Database and Expert Systems Applications | 2013

Towards Explaining Latent Factors with Topic Models in Collaborative Recommender Systems

Marco Rossetti; Fabio Stella; Markus Zanker

Latent factor models have proved to be the state of the art for the collaborative filtering approach in recommender systems. However, latent factors obtained with mathematical methods applied to the user-item matrix can hardly be interpreted by humans. In this paper we exploit topic models applied to textual data associated with items to find explanations for latent factors. Based on the MovieLens dataset and textual data about movies collected from Freebase, we run a user study with over a hundred participants to develop a reference dataset for evaluating different strategies towards more interpretable and portable latent factor models.

Journal of Biomedical Informatics | 2012

Continuous time Bayesian network classifiers

Fabio Stella; Y. Amer

The class of continuous time Bayesian network classifiers is defined; it solves the problem of supervised classification on multivariate trajectories evolving in continuous time. The trajectory consists of the values of discrete attributes that are measured in continuous time, while the predicted class is expected to occur in the future. Two instances from this class, namely the continuous time naive Bayes classifier and the continuous time tree augmented naive Bayes classifier, are introduced and analyzed. They implement a trade-off between computational complexity and classification accuracy. Learning and inference for the class of continuous time Bayesian network classifiers are addressed, in the case where complete data are available. A learning algorithm for the continuous time naive Bayes classifier and an exact inference algorithm for the class of continuous time Bayesian network classifiers are described. The performance of the continuous time naive Bayes classifier is assessed in the case where real-time feedback to neurological patients undergoing motor rehabilitation must be provided.

Journal of Systems and Software | 2015

On applying machine learning techniques for design pattern detection

Marco Zanoni; Francesca Arcelli Fontana; Fabio Stella

We apply machine learning to detect design patterns in software systems. We exploit a specific design pattern model to apply machine learning techniques. We compare the performance of several machine learning algorithms. We provide a large dataset containing manually checked design pattern instances.

The detection of design patterns is a useful activity that supports the comprehension and maintenance of software systems. Many approaches and tools have been proposed in the literature, providing different results. In this paper, we extend previous work on the application of machine learning techniques to design pattern detection by adding more extensive experimentation and enhancements to the analysis method. Here we exploit a combination of graph matching and machine learning techniques, implemented in a tool we developed, called MARPLE-DPD. Our approach allows the application of machine learning techniques by leveraging a model of design patterns that can represent pattern instances composed of a variable number of classes. We describe the experiments for the detection of five design patterns on 10 open source software systems, compare the performance obtained by different learning models with respect to a baseline, and discuss the issues encountered.

BMC Bioinformatics | 2011

Conformational and functional analysis of molecular dynamics trajectories by self-organising maps.

Domenico Fraccalvieri; Alessandro Pandini; Fabio Stella; Laura Bonati

Background: Molecular dynamics (MD) simulations are powerful tools to investigate the conformational dynamics of proteins, which is often a critical element of their function. Identification of functionally relevant conformations is generally done by clustering the large ensemble of structures that are generated. Recently, Self-Organising Maps (SOMs) were reported to perform more accurately and provide more consistent results than traditional clustering algorithms in various data mining problems. We present a novel strategy to analyse and compare conformational ensembles of protein domains using a two-level approach that combines SOMs and hierarchical clustering.

Results: The conformational dynamics of the α-spectrin SH3 protein domain and six single mutants were analysed by MD simulations. The Cartesian coordinates of the Cα atoms of conformations sampled in the essential space were used as input data vectors for SOM training, then complete linkage clustering was performed on the SOM prototype vectors. A specific protocol to optimize a SOM for structural ensembles was proposed: the optimal SOM was selected by means of a Taguchi experimental design plan applied to different data sets, and the optimal sampling rate of the MD trajectory was selected. The proposed two-level approach was applied to single trajectories of the SH3 domain independently as well as to groups of them at the same time. The results demonstrated the potential of this approach in the analysis of large ensembles of molecular structures: the possibility of producing a topological mapping of the conformational space in a simple 2D visualisation, as well as of effectively highlighting differences in the conformational dynamics directly related to biological functions.

Conclusions: The use of a two-level approach combining SOMs and hierarchical clustering for conformational analysis of structural ensembles of proteins was proposed. It can easily be extended to other study cases and to conformational ensembles from other sources.

Neural Networks | 1997

Some numerical aspects of the training problem for feed-forward neural nets

John J. McKeown; Fabio Stella; Gary Hall

This paper considers the feed-forward training problem from the numerical point of view, in particular the conditioning of the problem. It is well known that the feed-forward training problem is often ill-conditioned; this affects the behaviour of training algorithms, the choice of such algorithms and the quality of the solutions achieved. A geometric interpretation of ill-conditioning is explored and an example of function approximation is analysed in detail.

Conference on Advanced Information Systems Engineering | 2010

Dependency discovery in data quality

Daniele Barone; Fabio Stella; Carlo Batini

A conceptual framework for the automatic discovery of dependencies between data quality dimensions is described. Dependency discovery consists in recovering the dependency structure for a set of data quality dimensions measured on attributes of a database. This task is accomplished through a data mining methodology, by learning a Bayesian Network from a database. The Bayesian Network is used to analyze dependencies between data quality dimensions associated with different attributes. The proposed framework is instantiated on a real-world database. The task of dependency discovery is presented for the case in which the following data quality dimensions are considered: accuracy, completeness, and consistency. The Bayesian Network model shows how data quality can be improved while satisfying budget constraints.
