Sergio Zani
University of Parma
Publication
Featured research published by Sergio Zani.
Archive | 1990
Andrea Cerioli; Sergio Zani
Well-known indices for the measurement of poverty traditionally refer to income (or to other economic variables, such as consumption) and to the conventional definition of a poverty line (see, e.g., Sen 1976; Carbonaro 1982; Foster 1984; Atkinson 1987; Hagenaars 1987; Pyatt 1987; Dagum et al. 1988; Cerioli 1989).
Computational Statistics & Data Analysis | 1998
Sergio Zani; Marco Riani; Aldo Corbellini
In this paper we suggest a simple way of constructing a bivariate boxplot based on convex hull peeling and B-spline smoothing. The proposed method shows some advantages over the one suggested by Goldberg and Iglewicz (1992). Our approach leads to defining a natural inner region which is completely nonparametric and smooth. Furthermore, it retains the correlation in the observations and adapts to the differing spread of the data in the different directions. The outer contour, which is based on a multiple of the distance of the inner region from the centre, is robust to the presence of clusters of outliers. We also show how the construction of a bivariate boxplot for each pair of variables can become a very useful tool for the detection of multivariate outliers.
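As a rough illustration of the convex hull peeling step mentioned in this abstract, the sketch below strips successive convex hulls from a 2-D point cloud until no more than half of the points remain, then returns the hull of what is left. The function names (`convex_hull`, `inner_hull`), the 50% stopping rule and the `fraction` parameter are assumptions made for this example; the B-spline smoothing and the outer contour described in the paper are not reproduced here.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inner_hull(points, fraction=0.5):
    """Peel hull layers until at most `fraction` of the points remain.

    The stopping rule is an assumption for this sketch, not the paper's rule.
    Peeling also stops early if fewer than three points would be left.
    """
    remaining = list(dict.fromkeys(points))  # drop duplicates, keep order
    limit = fraction * len(remaining)
    while len(remaining) > limit:
        hull = set(convex_hull(remaining))
        peeled = [p for p in remaining if p not in hull]
        if len(peeled) < 3:
            break
        remaining = peeled
    return convex_hull(remaining)


# Hypothetical data: three nested squares of four points each.
outer = [(0, 0), (6, 0), (6, 6), (0, 6)]
middle = [(1, 1), (5, 1), (5, 5), (1, 5)]
inner = [(2, 2), (4, 2), (4, 4), (2, 4)]
core = inner_hull(outer + middle + inner)
```

With these twelve points, two layers are peeled and the innermost square is returned as the inner region.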
45th Scientific Meeting of the Italian Statistical Society | 2013
Sergio Zani; Maria Adele Milioli; Isabella Morlini
Composite indicators should ideally measure multidimensional concepts which cannot be captured by a single variable. In this chapter, we suggest a method based on fuzzy set theory for the construction of a fuzzy synthetic index of a latent phenomenon (e.g., well-being, quality of life, etc.), using a set of manifest variables measured on different scales (quantitative, ordinal and binary). A few criteria for assigning values to the membership function are discussed, as well as criteria for defining the weights of the variables. For ordinal variables, we propose a fuzzy quantification method based on the sampling cumulative function and a weighting system which takes into account the relative frequency of each category. An application regarding the results of a survey on the users of a contact center is presented.
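One way a membership function for an ordinal variable could be built from the sampling cumulative function is sketched below. The normalisation used, (F_k - F_1) / (1 - F_1), which gives 0 to the lowest category and 1 to the highest, is an assumption borrowed for illustration and is not necessarily the formula adopted in the chapter; the function name `fuzzy_membership` and the sample data are hypothetical, and the weighting system is not covered.

```python
from collections import Counter


def fuzzy_membership(values, ordered_categories):
    """Map each ordinal category to a membership degree in [0, 1].

    Uses the empirical cumulative counts: mu_k = (C_k - C_1) / (n - C_1),
    where C_k is the number of observations in categories 1..k.
    This normalisation is an assumption made for the sketch.
    """
    counts = Counter(values)
    n = len(values)
    cumulative = []
    running = 0
    for category in ordered_categories:
        running += counts[category]
        cumulative.append(running)
    first = cumulative[0]
    if first == n:
        raise ValueError("all observations fall in the lowest category")
    return {
        category: (c - first) / (n - first)
        for category, c in zip(ordered_categories, cumulative)
    }


# Hypothetical survey answers on a three-level satisfaction scale.
answers = ["low"] * 2 + ["mid"] * 4 + ["high"] * 4
membership = fuzzy_membership(answers, ["low", "mid", "high"])
```

Because the degrees depend on the observed frequencies, the same category can receive a different membership value in a different sample.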
Journal of Classification | 2012
Isabella Morlini; Sergio Zani
We introduce new similarity measures between two subjects, with reference to variables with multiple categories. In contrast to traditionally used similarity indices, they also take into account the frequency of the categories of each attribute in the sample. This feature is useful when dealing with rare categories, since it makes sense to differently evaluate the pairwise presence of a rare category from the pairwise presence of a widespread one. A weighting criterion for each category derived from Shannon’s information theory is suggested. There are two versions of the weighted index: one for independent categorical variables and one for dependent variables. The suitability of the proposed indices is shown in this paper using both simulated and real world data sets.
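The idea of weighting a shared category by its rarity can be illustrated with a small sketch. Here each matching category contributes its Shannon information, -ln(p), where p is the relative frequency of that category in the sample. This unnormalised score and the names `category_information` and `weighted_similarity` are assumptions for illustration only; the sketch is not the index defined in the paper and it ignores dependence between variables.

```python
import math
from collections import Counter


def category_information(rows):
    """For each variable (column), map every category to -ln(relative frequency)."""
    n = len(rows)
    info = []
    for column in zip(*rows):
        counts = Counter(column)
        info.append({cat: -math.log(count / n) for cat, count in counts.items()})
    return info


def weighted_similarity(a, b, info):
    """Sum the information of the categories on which subjects a and b agree."""
    return sum(
        info[v][a[v]]
        for v in range(len(a))
        if a[v] == b[v]
    )


# Hypothetical sample: two categorical variables observed on eight subjects.
rows = [
    ("x", "r"), ("x", "r"),
    ("x", "c"), ("x", "c"), ("x", "c"), ("x", "c"),
    ("y", "c"), ("y", "c"),
]
info = category_information(rows)
rare_pair = weighted_similarity(rows[6], rows[7], info)    # share rare "y"
common_pair = weighted_similarity(rows[2], rows[3], info)  # share common "x"
no_match = weighted_similarity(rows[0], rows[6], info)     # agree on nothing
```

Both pairs agree on both variables, yet the pair sharing the rare category "y" receives the higher score, which is the behaviour the abstract motivates.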
Advanced Data Analysis and Classification | 2012
Isabella Morlini; Sergio Zani
In this paper we propose a new index Z for measuring the dissimilarity between two hierarchical clusterings (or dendrograms). This index is a metric since it satisfies the axioms of non-negativity, symmetry and triangle inequality. A desirable property of this index is that it can be decomposed into the contributions pertaining to each stage of the hierarchies. We show the relations of such components with the currently used criteria for comparing two partitions. We obtain a global similarity index as the complement to one of the suggested dissimilarity and we derive its adjustment for agreement due to chance. We obtain similarity indexes pertaining to each stage of the hierarchies as the complement to one of the additive parts of the global distance Z. We consider the use of the proposed distance for more than two dendrograms and its use for the consensus of classifications and variable selection in cluster analysis. A series of simulation experiments and an application to a real data set are presented.
Archive | 2001
Andrea Cerioli; Sergio Zani
In this paper we propose some simple diagnostics which can prove useful for detecting high density regions in ℜ^p, for p ≥ 2. Our approach does not require full estimation of the multivariate density and exploits the spatial contiguity information which can be attached to objects in ℜ^p. The suggested method could be routinely applied as a preliminary step in nonhierarchical cluster analysis, where it provides useful guidance both in choosing the appropriate number of clusters and in selecting the values of initial cluster seeds.
Archive | 1998
Marco Riani; Sergio Zani
In this paper we suggest a non parametric generalization of the Mahalanobis distance which takes into account the differing spread of the data in the different directions. The output is an easy to handle metric which can be conveniently used both in an exploratory stage of the analysis for the detection of multivariate outliers and subsequently as a tool for non parametric discriminant analysis, multidimensional scaling and cluster analysis. In addition, the use of this metric can provide information about multivariate transformations and multiple outliers.
GfKl | 2012
Isabella Morlini; Sergio Zani
In this paper we suggest a new index for measuring the distance between two hierarchical clusterings. This index can be decomposed into the contributions pertaining to each stage of the hierarchies. We show the relations of such components with the currently used criteria for comparing two partitions. We obtain a similarity index as the complement to one of the suggested distances and we propose its adjustment for agreement due to chance. We consider the extension of the proposed distance and similarity measures to more than two dendrograms and their use for the consensus of classification and variable selection in cluster analysis.
Archive | 2006
Isabella Morlini; Sergio Zani
Following our previous works where an improved dynamic time warping (DTW) algorithm has been proposed and motivated, especially in the multivariate case, for computing the dissimilarity between curves, in this paper we modify the classical DTW in order to obtain discrete warping functions and to estimate the structural mean of a sample of curves. With the suggested methodology we analyze series of daily measurements of some air pollutants in Emilia-Romagna (a region in Northern Italy). We compare results with those obtained with other flexible and non parametric approaches used in functional data analysis.
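For readers unfamiliar with the baseline, the classical DTW dissimilarity between two univariate series can be sketched with the standard dynamic programming recursion. This is the textbook algorithm only, with absolute difference as an assumed local cost; it does not include the modifications proposed in the paper (discrete warping functions, structural mean estimation), and the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Classical dynamic time warping cost between two numeric sequences.

    D[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j];
    each step may advance in a, in b, or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # local cost (an assumption)
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# A series and a time-stretched copy of it align at zero cost.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

The first call shows why DTW suits curves that differ mainly in timing: repeating a value in one series adds no cost, whereas a vertical shift between the series does.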
Archive | 2011
Isabella Morlini; Sergio Zani
In this paper we introduce new similarity indexes for binary and polytomous variables, employing the concept of “information content”. In contrast to traditionally used similarity measures, we suggest considering the frequency of the categories of each attribute in the sample. This feature is useful when dealing with rare categories, since it makes sense to evaluate the pairwise presence of a rare category differently from the pairwise presence of a widespread one. We also propose a weighted index for dependent categorical variables. The suitability of the proposed measures from a marketing research perspective is shown using two real data sets.