Publication


Featured research published by Gabriele Soffritti.


Computational Statistics & Data Analysis | 2014

A multivariate linear regression analysis using finite mixtures of t distributions

Giuliano Galimberti; Gabriele Soffritti

Recently, finite mixture models have been used to model the distribution of the error terms in multivariate linear regression analysis. In particular, Gaussian mixture models have been employed. A novel approach that assumes that the error terms follow a finite mixture of t distributions is introduced. This assumption allows for an extension of multivariate linear regression models, making these models more versatile and robust against the presence of outliers in the error term distribution. The issues of model identifiability and maximum likelihood estimation are addressed. In particular, identifiability conditions are provided and an Expectation-Maximisation algorithm for estimating the model parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments and compared to the estimators from the Gaussian mixture models. Results from the analysis of two real datasets are presented.
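The robustness mentioned in this abstract can be illustrated with the observation weight that appears in the standard EM treatment of t-distributed errors. The sketch below is not taken from the paper: it assumes a single univariate component, and the function name `t_weight` and the residual values are hypothetical. It only shows that the weight shrinks as the standardised residual grows, which is why gross outliers have little influence on the estimates.

```python
def t_weight(residual, scale, dof, dim=1):
    """E-step weight for one observation under a t-distributed error term.

    Assumption: univariate case, so (residual / scale) ** 2 plays the role
    of the squared Mahalanobis distance used in the multivariate formula.
    """
    squared_distance = (residual / scale) ** 2
    return (dof + dim) / (dof + squared_distance)


# Hypothetical residuals; the last one is a gross outlier.
residuals = [0.0, 0.5, -1.0, 10.0]
weights = [t_weight(r, scale=1.0, dof=4.0) for r in residuals]
```

With 4 degrees of freedom, a residual of 0 gets weight 1.25, while a residual of 10 gets a weight below 0.05. A Gaussian error model corresponds to weights that are constant across observations.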


Statistics and Computing | 2011

Multivariate linear regression with non-normal errors: a solution based on mixture models

Gabriele Soffritti; Giuliano Galimberti

In some situations, the distribution of the error terms of a multivariate linear regression model may depart from normality. This problem has been addressed, for example, by specifying a different parametric distribution family for the error terms, such as multivariate skewed and/or heavy-tailed distributions. A new solution is proposed, which is obtained by modelling the error term distribution through a finite mixture of multi-dimensional Gaussian components. The multivariate linear regression model is studied under this assumption. Identifiability conditions are proved and maximum likelihood estimation of the model parameters is performed using the EM algorithm. The number of mixture components is chosen through model selection criteria; when this number is equal to one, the proposal results in the classical approach. The performance of the proposed approach is evaluated through Monte Carlo experiments and compared with that of other approaches. In conclusion, the results obtained from the analysis of a real dataset are presented.
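As a rough illustration of the idea in this abstract, the sketch below evaluates a finite Gaussian mixture density for a univariate error term. It is a simplification under stated assumptions, not the paper's multivariate model: the function names, the weights, and the standard deviations are all hypothetical. It shows two points from the abstract: with one component the density is the classical Gaussian, and with more components it can carry much heavier tails.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_error_pdf(e, weights, means, sds):
    """Density of an error term modelled as a finite Gaussian mixture.

    Assumption: the weights are positive and sum to one.
    """
    return sum(w * normal_pdf(e, m, s) for w, m, s in zip(weights, means, sds))


# One component: identical to the classical Gaussian error model.
one_component = mixture_error_pdf(0.3, [1.0], [0.0], [1.0])

# Two components (hypothetical values): 10% of errors come from a wide component.
tail_mixture = mixture_error_pdf(6.0, [0.9, 0.1], [0.0, 0.0], [1.0, 5.0])
tail_gaussian = normal_pdf(6.0, 0.0, 1.0)
```

Here `tail_mixture` is several orders of magnitude larger than `tail_gaussian`, so an error of size 6 is far less surprising under the mixture than under a single standard normal.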


Health & Place | 2009

Evaluating patient satisfaction through latent class factor analysis.

Giulia Cavrini; Giuliano Galimberti; Gabriele Soffritti

This paper introduces Health and Place readers interested in studying the latent concept of satisfaction to the methodology of latent variable analysis. In particular, some suitable methods for analyzing individual opinions expressed on ordinal scales are illustrated. The basic theory behind these methods is explained and a step-by-step description of how they should be used in practice is given. The discussion of the subject starts with the simplest methods, in which opinions are grouped into two categories: typically positive and negative. Furthermore, more complex methods are presented to deal with opinions expressed on ordinal scales (e.g., very satisfied, somewhat satisfied, and not satisfied). All methods are described by showing various results obtained through the analysis of a dataset containing patients' opinions about their satisfaction with hospital care, collected through a survey conducted after their discharge from an Italian hospital. The database was created using a questionnaire covering different aspects of satisfaction and a five-point Likert scale. This represents an example of multi-level data: patients are clustered according to the hospital ward in which they were hospitalized. Thus, some specific latent variable methods able to deal with this particular structure of the data are also described.


Laterality | 2014

Assessment of handedness using latent class factor analysis

Franco Merni; Rocco Di Michele; Gabriele Soffritti

Recently several studies in which handedness was evaluated as a latent construct have been performed. In those studies, handedness was modelled using a qualitative latent variable (latent class models), a continuous latent variable (factor models), or both a qualitative latent variable and a continuous latent trait (mixed Rasch models). The aim of this study was to explore the usefulness and effectiveness of an approach in which handedness is treated as a qualitatively scaled latent variable with ordered categories (latent class factor models). This aim was pursued through an exploratory analysis of a dataset containing information on the hand used by 2236 young Italian sportspeople to perform 10 tasks. For comparison purposes, a latent class analysis was carried out. A cross-validation procedure was implemented. The results of all the analyses revealed that the best fit to the observed handedness patterns was obtained using a latent class factor model. Through this model, individuals were assigned to one of four ordered levels of handedness, and a quantitative index of left-handedness for each individual was computed by taking into account the different effect of the 10 tasks. These results provide support for the use of the latent class factor approach for handedness assessment.


Statistics and Computing | 2016

Using mixtures in seemingly unrelated linear regression models with non-normal errors

Giuliano Galimberti; Elena Scardovi; Gabriele Soffritti

Seemingly unrelated linear regression models are introduced in which the distribution of the errors is a finite mixture of Gaussian distributions. Identifiability conditions are provided. The score vector and the Hessian matrix are derived. Parameter estimation is performed using the maximum likelihood method and an Expectation–Maximisation algorithm is developed. The usefulness of the proposed methods and a numerical evaluation of their properties are illustrated through the analysis of simulated and real datasets.


Statistics and Computing | 2013

Using conditional independence for parsimonious model-based Gaussian clustering

Giuliano Galimberti; Gabriele Soffritti

In the framework of model-based cluster analysis, finite mixtures of Gaussian components represent an important class of statistical models widely employed for dealing with quantitative variables. Within this class, we propose novel models in which constraints on the component-specific variance matrices allow us to define Gaussian parsimonious clustering models. Specifically, the proposed models are obtained by assuming that the variables can be partitioned into groups that are conditionally independent within components, thus producing component-specific variance matrices with a block diagonal structure. This approach allows us to extend the methods for model-based cluster analysis and to make them more flexible and versatile. In this paper, Gaussian mixture models are studied under the above-mentioned assumption. Identifiability conditions are proved and the model parameters are estimated through the maximum likelihood method by using the Expectation-Maximization algorithm. The Bayesian information criterion is proposed for selecting the partition of the variables into conditionally independent groups. The consistency of the use of this criterion is proved under regularity conditions. In order to examine and compare models with different partitions of the set of variables, a hierarchical algorithm is suggested. A wide class of parsimonious Gaussian models is also presented by parameterizing the component-variance matrices according to their spectral decomposition. The effectiveness and usefulness of the proposed methodology are illustrated with two examples based on real datasets.
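The parsimony gained from a block diagonal variance matrix can be made concrete by counting free parameters. The sketch below is illustrative only: the function names and the group sizes are hypothetical, the count covers a single component's variance matrix, and the BIC is written in one common "larger is better" convention, which may differ from the one used in the paper.

```python
import math


def covariance_param_count(group_sizes):
    """Free parameters in a block diagonal variance matrix.

    Each group of g variables contributes a full symmetric g x g block,
    i.e. g * (g + 1) / 2 parameters; entries between groups are fixed at zero.
    """
    return sum(g * (g + 1) // 2 for g in group_sizes)


def bic(log_likelihood, n_params, n_obs):
    """BIC in the convention 2 * loglik - n_params * log(n_obs)."""
    return 2.0 * log_likelihood - n_params * math.log(n_obs)


# Five variables: one unrestricted block versus a partition into groups of 2 and 3.
full = covariance_param_count([5])
blocked = covariance_param_count([2, 3])

# Hypothetical case: if both models reached the same log-likelihood on 50
# observations, the criterion would favour the more parsimonious partition.
bic_full = bic(-100.0, full, 50)
bic_blocked = bic(-100.0, blocked, 50)
```

The unrestricted matrix needs 15 parameters and the partitioned one needs 9, so at equal fit the partitioned model has the higher BIC under this convention.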


Archive | 2011

Notes on the Robustness of Regression Trees Against Skewed and Contaminated Errors

Giuliano Galimberti; Marilena Pillati; Gabriele Soffritti

Regression trees represent one of the most popular tools in predictive data mining applications. However, previous studies have shown that their performance is not completely satisfactory when the dependent variable is highly skewed, and degrades severely in the presence of heavy-tailed error distributions, especially for grossly mis-measured values of the dependent variable. In this paper the lack of robustness of some classical regression trees is investigated by addressing the issue of highly-skewed and contaminated error distributions. In particular, the performance of some non-robust regression trees is evaluated through a Monte Carlo experiment and compared with that of some recently proposed trees, based on M-estimators, designed to robustify this kind of method. In conclusion, the results obtained from the analysis of a real dataset are presented.


Statistics and Computing | 2018

Modelling the role of variables in model-based cluster analysis

Giuliano Galimberti; Annamaria Manisi; Gabriele Soffritti

In the framework of cluster analysis based on Gaussian mixture models, it is usually assumed that all the variables provide information about the clustering of the sample units. Several variable selection procedures are available in order to detect the structure of interest for the clustering when this structure is contained in a variable sub-vector. Currently, in these procedures a variable is assumed to play one of (up to) three roles: (1) informative, (2) uninformative and correlated with some informative variables, (3) uninformative and uncorrelated with any informative variable. A more general approach for modelling the role of a variable is proposed by taking into account the possibility that the variable vector provides information about more than one structure of interest for the clustering. This approach is developed by assuming that such information is given by non-overlapping and possibly correlated sub-vectors of variables; it is also assumed that the model for the variable vector is equal to a product of conditionally independent Gaussian mixture models (one for each variable sub-vector). Details about model identifiability, parameter estimation and model selection are provided. The usefulness and effectiveness of the described methodology are illustrated using simulated and real datasets.


Statistical Modelling | 2010

Finite mixture models for clustering multilevel data with multiple cluster structures

Giuliano Galimberti; Gabriele Soffritti

Finite mixture models are useful tools for clustering two-way datasets within a sound statistical framework which can address some important questions, such as how many clusters there are in the data. Models that can also be used for clustering multilevel data have been proposed, with the intent to produce clusterings of units at every level on the basis of all the available variables, considering the hierarchical structure of the dataset. This paper introduces a new class of mixture models for datasets with two levels that makes it possible to discover a clustering of level 2 units and different clusterings of level 1 units corresponding to different subsets of the variables (multiple cluster structures). This new class is obtained by adapting a mixture model proposed to identify multiple cluster structures in a data matrix to the multilevel situation. The usefulness of the new method is shown using simulated data and a real example.


Journal of Statistical Software | 2012

Classification Trees for Ordinal Responses in R: The rpartScore Package

Giuliano Galimberti; Gabriele Soffritti; Matteo Di Maso
