Tonio Di Battista
University of Chieti-Pescara
Publications
Featured research published by Tonio Di Battista.
Statistical Methods and Applications | 2011
Stefano Antonio Gattone; Tonio Di Battista
Adaptive cluster sampling (ACS) is a suitable sampling design for rare and clustered populations. In environmental and ecological applications, biological populations are generally animals or plants with a highly patchy spatial distribution. However, ACS is a less efficient design when the study population is not rare and shows low aggregation, since the final sample size can easily get out of control. In this paper, a new variant of ACS is proposed in order to improve the performance (in terms of precision and cost) of ACS versus simple random sampling (SRS). The idea is to detect the optimal sample size by means of a data-driven stopping rule that determines when to stop the adaptive procedure. Introducing a stopping rule violates the theoretical basis of ACS, so the behaviour of the ordinary estimators used in ACS is explored by means of Monte Carlo simulations. Results show that the proposed variant of ACS makes it possible to control the effective sample size and to prevent the excessive efficiency loss typical of ACS when the population is less clustered than anticipated. The proposed strategy may be recommended especially when no prior information about the population structure is available, as it does not require prior knowledge of the degree of rarity and clustering of the population of interest.
Environmental and Ecological Statistics | 2003
Tonio Di Battista
In many surveys of environmental and natural phenomena, the aim is to evaluate the heterogeneity and the skewness of the distribution of the number of point-objects in a study area suitably partitioned into sub-regions. For this purpose, this paper considers the estimation of dispersion indices under simple random sampling and under adaptive sampling with an initial simple random sample selected with or without replacement. Jackknife and bootstrap procedures are proposed in both cases to reduce bias. Finally, a simulation study and a case study on a biological population, concerning an Oidium tuckeri contamination in a growing vineyard, are performed to assess the accuracy of the proposed estimators.
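The bias-reduction step can be illustrated with a minimal sketch. It assumes the dispersion index is the variance-to-mean ratio of quadrat counts under simple random sampling; the abstract does not name the specific indices, and the function names and counts below are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio of quadrat counts (assumed index, not the paper's)."""
    return variance(counts) / mean(counts)


def jackknife(counts, estimator):
    """Return (bias-corrected estimate, jackknife standard error).

    Assumes every leave-one-out subsample is valid for the estimator
    (for the ratio above: at least two units and a non-zero mean).
    """
    n = len(counts)
    theta_hat = estimator(counts)
    # Leave-one-out replicates of the estimator.
    loo = [estimator(counts[:i] + counts[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    # Standard jackknife bias correction.
    corrected = n * theta_hat - (n - 1) * loo_mean
    se = ((n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)) ** 0.5
    return corrected, se


# Hypothetical counts of point-objects in ten sampled sub-regions.
counts = [0, 0, 1, 0, 7, 0, 2, 0, 0, 12]
raw = dispersion_index(counts)
corrected, se = jackknife(counts, dispersion_index)
```

For a linear statistic such as the sample mean the correction leaves the estimate unchanged, which is a quick sanity check on the implementation. The adaptive-sampling case treated in the paper would need design-based estimators and is not covered by this sketch.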
Environmental and Ecological Statistics | 2004
Tonio Di Battista; Stefano Antonio Gattone
Abundance vector estimation is a well-investigated problem in statistical ecology. The use of simple random sampling with replacement or replicated sampling ensures good asymptotic properties of the abundance vector estimators. However, real surveys are based on small sample sizes, and assuming any specific distribution of the abundance vector estimator may be hazardous. In this paper we focus our attention on situations where the population is not too large and the sample size is small. We propose bootstrap multivariate confidence regions based on data depth. Data depth is a geometrical concept for ordering data from the center outward in higher dimensions. The simplicial depth, Tukey's depth and the Mahalanobis depth are presented. In order to build confidence regions in the presence of a skewed distribution of the abundance vector estimator, the use of Tukey's depth is suggested. The proposed method has been applied to the benthic community of Lake Lesina. A comparison with the Mahalanobis depth and standard existing methods is reported.
International Journal of Bifurcation and Chaos | 2012
Angela De Sanctis; Tonio Di Battista
Assuming a parametric family of functional data, the problem of computing summary statistics of the same functional form is investigated. The central idea is to compute the statistics on the parameters instead of on the functions themselves. Under the hypothesis of a monotonic dependence on the parameters, we highlight the special features of these statistics.
Statistical Analysis and Data Mining | 2017
Tonio Di Battista; Francesca Fortuna
Biomonitoring techniques are widely used to assess environmental damage through the changes occurring in the composition of species communities. Among the living organisms used as bioindicators, epiphytic lichens are recognized as reliable indicators of air pollution. However, lichen biodiversity studies are generally based on the analysis of a scalar measure that omits the species composition. For this reason, we propose to analyze lichen data through diversity profiles and the functional data analysis approach. Indeed, diversity profiles may be naturally considered as functional data because they are expressed as functions of the species abundance vector in a fixed domain. The peculiarity of these data is that the functional space is constituted by a set of curves belonging to the same family. In this context, simultaneous confidence bands are obtained for the mean diversity profile through the Karhunen-Loève (KL) decomposition. The novelty of our method lies in exploiting the known form of the function underlying the data. This allows us to work directly on the functional space, avoiding smoothing techniques. The confidence band procedure is applied to a real data set concerning lichen data in the Tuscany region (central Italy).
Keywords: confidence bands; functional data analysis; intrinsic diversity profile; lichen data; mean function; KL expansion.
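To make the idea of a diversity profile as a curve of known form concrete, here is a minimal sketch. It assumes the Patil and Taillie β-profile, Δ_β = (1 - Σ p_i^(β+1)) / β for β in [-1, 1], as the parametric family; the abstract does not state which profile the paper uses, and the abundance vectors are hypothetical.

```python
import math


def beta_profile(abundances, beta):
    """Patil-Taillie diversity of order beta (assumed family, for illustration)."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(beta) < 1e-12:
        # Limit beta -> 0 is the Shannon index.
        return -sum(pi * math.log(pi) for pi in p)
    return (1.0 - sum(pi ** (beta + 1) for pi in p)) / beta


def mean_profile(sites, betas):
    """Pointwise mean of the site profiles on a grid of beta values."""
    return [sum(beta_profile(s, b) for s in sites) / len(sites) for b in betas]


# Hypothetical species abundance vectors for three sampling sites.
sites = [[10, 5, 1], [4, 4, 4], [12, 1, 1, 1]]
betas = [-1.0, -0.5, 0.0, 0.5, 1.0]
profile = mean_profile(sites, betas)
```

At β = -1 the profile equals species richness minus one and at β = 1 it equals the Simpson index, so each site yields a whole curve rather than a single scalar. The simultaneous confidence bands via the KL decomposition described in the abstract are not reproduced here.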
Environmental and Ecological Statistics | 2016
Stefano Antonio Gattone; Esha Mohamed; Tonio Di Battista
Adaptive cluster sampling (ACS) has received much attention in recent years since it yields more precise estimates than conventional sampling designs when applied to rare and clustered populations. These results, however, depend on the availability of some prior knowledge about the spatial distribution and the absolute abundance of the population under study. This prior information helps the researcher to select a suitable critical value that triggers the adaptive search, the neighborhood definition and the initial sample size. A poor setting of the ACS design would worsen the performance of the adaptive estimators. In particular, one of the greatest weaknesses of ACS is the inability to control the final sampling effort if, for example, the critical value is set too low. To overcome this drawback, one can introduce ACS with clusters selected without replacement, where the number of distinct clusters to be selected can be fixed in advance, or ACS with a stopping rule, which stops the adaptive sampling when a predetermined sample size limit is reached or when a given stopping condition is met. However, the stopping rule breaks down the theoretical basis for the unbiasedness of the ACS estimators, introducing an unknown amount of bias in the estimates. The current study improves the performance of ACS when applied to patchy and clustered but not rare populations and/or less clustered populations. This is done by combining the stopping rule with ACS without replacement of clusters, so as to further limit the sampling effort, in the form of travelling expenses, by avoiding repeat observations and by reducing the final sample size. The performance of the proposed design is investigated using simulated and real data.
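The simplest of the stopping rules mentioned above, a predetermined sample size limit, can be sketched as follows. This is an illustration under stated assumptions, not the design proposed in the paper: the grid, the four-cell neighbourhood, the critical value and the function names are all hypothetical, and no estimator is computed.

```python
from collections import deque


def neighbours(cell, n_rows, n_cols):
    """Four-cell (rook) neighbourhood, clipped at the grid border."""
    r, c = cell
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < n_rows and 0 <= j < n_cols]


def acs_with_cap(grid, initial, critical=1, max_size=None):
    """Expand the initial sample adaptively; stop once max_size units are observed."""
    n_rows, n_cols = len(grid), len(grid[0])
    sampled, seen = [], set()
    queue = deque(initial)  # initial units are observed first
    while queue:
        if max_size is not None and len(sampled) >= max_size:
            break  # stopping rule: sample size limit reached
        cell = queue.popleft()
        if cell in seen:
            continue  # each unit is observed at most once
        seen.add(cell)
        sampled.append(cell)
        r, c = cell
        if grid[r][c] >= critical:
            # The unit satisfies the condition, so its neighbours are added.
            queue.extend(nb for nb in neighbours(cell, n_rows, n_cols) if nb not in seen)
    return sampled


# Hypothetical 4x4 population with one small cluster.
grid = [
    [0, 0, 0, 0],
    [0, 3, 2, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
initial = [(0, 0), (1, 1)]
full = acs_with_cap(grid, initial)
capped = acs_with_cap(grid, initial, max_size=5)
```

Without the cap, the unit at (1, 1) pulls in its whole network and the surrounding edge units; with the cap, sampling halts partway through the cluster. As the abstract notes, truncating the procedure this way introduces bias into the usual ACS estimators.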
Archive | 2016
Tonio Di Battista; Angela De Sanctis; Francesca Fortuna
The curves in a functional data set often present a variety of distinctive patterns corresponding to different shapes that can be identified by clustering the functions. However, clustering functional data is a difficult task because the function space is, generally, of infinite dimension. Thus, the distance among functions may have infinitely many solutions and can be approximated in different ways, leading to different clustering results. The paper deals with this problem and focuses on cases in which the functional form of the observations is known in advance. In this setting, the approximation of the function underlying the data is not required and the functional distance may be computed directly from the explicit form of the functions. Moreover, we restrict the space of the functions to a closed and convex subset of a Hilbert space to achieve desirable properties. In the proposed framework, an \(L^2\) metric is applied in combination with clustering algorithms for finite dimensional data. The method is applied to a real data set concerning lichen biodiversity in the province of Genoa, North Western Italy.
Archive | 2011
Tonio Di Battista; Stefano Antonio Gattone; Angela De Sanctis
In many different research fields, such as medicine, physics and economics, the evaluation of real phenomena observed at each statistical unit is described by a curve or an assigned function. In this framework, a suitable statistical approach is Functional Data Analysis based on the use of basis functions. An alternative method, using Functional Analysis tools, is considered in order to estimate functional statistics. Assuming a parametric family of functional data, the problem of computing summary statistics of the same parametric form when the set of all functions having that parametric form does not constitute a linear space is investigated. The central idea is to compute statistics on the parameters instead of on the functions themselves.
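A small sketch shows why the parameter-based approach matters when the family is not a linear space. It assumes a hypothetical exponential-decay family f(x; a, b) = a·exp(-b·x), chosen only for illustration; the abstract does not specify a family.

```python
import math


def curve(a, b):
    """Member of the assumed family f(x; a, b) = a * exp(-b * x)."""
    return lambda x: a * math.exp(-b * x)


def parametric_mean(params):
    """Mean computed on the parameters; the result is again (a, b) of the family."""
    n = len(params)
    a_bar = sum(a for a, _ in params) / n
    b_bar = sum(b for _, b in params) / n
    return a_bar, b_bar


def pointwise_mean(params, x):
    """Ordinary functional mean, evaluated at x; generally not in the family."""
    return sum(curve(a, b)(x) for a, b in params) / len(params)


# Two hypothetical observed curves, given by their parameters.
params = [(1.0, 1.0), (3.0, 3.0)]
a_bar, b_bar = parametric_mean(params)
in_family = curve(a_bar, b_bar)(1.0)    # value of the parametric mean curve at x = 1
cross_mean = pointwise_mean(params, 1.0)  # value of the pointwise mean at x = 1
```

The two summaries agree at x = 0 but differ elsewhere, because the average of two exponentials with different rates is not itself a single exponential. Working on the parameters keeps the summary curve inside the family.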
Journal of Classification | 2011
Pasquale Valentini; Tonio Di Battista; Stefano Antonio Gattone
In this paper we deal with the problem of identifying a valid way to characterize heterogeneity in the analysis of customer satisfaction, observing the phenomenon from a new perspective. In the literature, the variability of a Customer Satisfaction index is measured by the standard deviation or the coefficient of variation. In this way, heterogeneity among customers may be masked. To overcome this drawback, we provide a new approach to the construction of a multi-dimensional measure of heterogeneity of the Customer Satisfaction index that does not depend on the choice of a particular heterogeneity index. The approach is based on heterogeneity profiles, which lead to a more detailed description of heterogeneity than alternative measures. Moreover, a latent class model is used for classifying individuals into distinct groups based on responses to a set of items. Once groups are formed, Customer Satisfaction researchers can draw conclusions about the level of satisfaction and the characteristics of groups in terms of heterogeneity.
International Symposium on Distributed Computing | 2017
Giulia Caruso; Stefano Antonio Gattone; Francesca Fortuna; Tonio Di Battista
Cluster analysis has long played an important role in a broad variety of areas, such as psychology, biology and computer science. It has established itself as a valuable tool for marketing and business, thanks to its capability to support decision-making processes. Traditionally, clustering approaches concentrate on purely numerical or purely categorical data. An important area of cluster analysis deals with mixed data, composed of both numerical and categorical attributes. Clustering mixed data is not simple, because there is a strong gap between the similarity metrics for these two kinds of data. In this review we provide some technical details about the kinds of distances that could be used with mixed data types. Finally, we emphasize that in most applications of cluster analysis practitioners focus either on numeric or on categorical variables, lessening the effectiveness of the method as a decision-making tool.
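One well-known distance for mixed data is Gower's, sketched below as an illustration. The abstract does not say which distances the review covers, so this choice is an assumption, and the records, attribute ranges and function name are hypothetical.

```python
def gower_distance(x, y, numeric_ranges):
    """Gower distance between two records with mixed attribute types.

    numeric_ranges maps the index of each numeric attribute to its range
    (max - min over the data set); all other attributes are treated as
    categorical and compared by simple matching.
    """
    total = 0.0
    for k, (xk, yk) in enumerate(zip(x, y)):
        if k in numeric_ranges:
            rng = numeric_ranges[k]
            # Range-normalised absolute difference, in [0, 1].
            total += abs(xk - yk) / rng if rng > 0 else 0.0
        else:
            # Categorical mismatch contributes 1, a match contributes 0.
            total += 0.0 if xk == yk else 1.0
    return total / len(x)


# Hypothetical customer records: (age, income, marital status, region).
a = (30, 50000.0, "single", "north")
b = (40, 50000.0, "married", "north")
ranges = {0: 40.0, 1: 80000.0}  # assumed ranges of age and income
d = gower_distance(a, b, ranges)
```

Every attribute contributes a value between 0 and 1, so numeric and categorical variables enter the distance on a common scale. The resulting distance matrix can then be passed to any clustering method that accepts precomputed dissimilarities.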