Featured Research

Statistics Theory

On the estimating equations and objective functions for parameters of exponential power distribution: Application for disorder

Efficient modeling of disorder in a phenomenon depends on the chosen score and objective functions. The main parameters in modeling are location, scale, and shape. The exponential power distribution, also known as the generalized Gaussian, is used extensively in modeling. In real-world data, observations may come from different parametric models, or the data set may contain disorder. In this study, estimating equations for the parameters of the exponential power distribution are derived to obtain robust and efficient M-estimators when the data set includes disorder or contamination. The robustness of the M-estimators for the parameters is examined. Fisher information matrices based on the derivatives of score functions from the log, log_q, and distorted log-likelihoods are proposed, using the Tsallis q-entropy, to obtain the variances of the M-estimators. The matrices derived from the score functions are shown to be positive semidefinite when certain conditions are satisfied. Information criteria inspired by the Akaike and Bayesian criteria are constructed by taking the absolute value of the score functions. The fitting performance of score functions from estimating equations and objective functions is assessed using volume, information criteria, and mean absolute error. Simulated and real data sets are used to compare the performance of the estimating equations and objective functions. In general, the distorted log-likelihood performs better than the other score and objective functions when estimating the parameters of the exponential power distribution for contaminated data sets.
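
For reference, one common parameterization of the exponential power (generalized Gaussian) density with location μ, scale σ > 0, and shape β > 0 is sketched below. The symbols and the parameterization are assumptions made here for illustration; the paper may use a different convention.

```latex
f(x;\mu,\sigma,\beta)
  = \frac{\beta}{2\sigma\,\Gamma(1/\beta)}
    \exp\!\left(-\left(\frac{|x-\mu|}{\sigma}\right)^{\beta}\right),
  \qquad x \in \mathbb{R}.
```

Under this convention, β = 2 gives a Gaussian shape (with variance σ²/2) and β = 1 gives the Laplace distribution.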

Statistics Theory

On the invertibility in periodic ARFIMA models

This paper characterizes the invertibility and causality conditions of periodic ARFIMA (PARFIMA) models. We first discuss the conditions in the multivariate case by considering the corresponding p-variate stationary ARFIMA models. Second, we construct the conditions using the univariate case and deduce a new infinite autoregressive representation for the PARFIMA model. The results are investigated through a simulation study.

Statistics Theory

On the minmax regret for statistical manifolds: the role of curvature

Model complexity plays an essential role in model selection: one seeks a model that fits the data and is also succinct. Two-part codes and the minimum description length have been successful in delivering procedures to single out the best models while avoiding overfitting. In this work, we pursue this approach and complement it by making further assumptions on the parameter space. Concretely, we assume that the parameter space is a smooth manifold and, using tools of Riemannian geometry, derive a sharper expression than the standard one given by the stochastic complexity, in which the scalar curvature of the Fisher information metric plays a dominant role. Furthermore, we derive the minmax regret for general statistical manifolds and apply our results to derive optimal dimensional reduction in the context of principal component analysis.

Statistics Theory

On the power of Chatterjee rank correlation

Chatterjee (2021) introduced a simple new rank correlation coefficient that has attracted much recent attention. The coefficient has the unusual appeal that it not only estimates a population quantity first proposed by Dette et al. (2013) that is zero if and only if the underlying pair of random variables is independent, but also is asymptotically normal under independence. This paper compares Chatterjee's new correlation coefficient to three established rank correlations that also facilitate consistent tests of independence, namely, Hoeffding's D, Blum-Kiefer-Rosenblatt's R, and Bergsma-Dassios-Yanagimoto's τ*. We contrast their computational efficiency in light of recent advances, and investigate their power against local rotation and mixture alternatives. Our main results show that Chatterjee's coefficient is unfortunately rate sub-optimal compared to D, R, and τ*. The situation is more subtle for a related earlier estimator of Dette et al. (2013). These results favor D, R, and τ* over Chatterjee's new correlation coefficient for the purpose of testing independence.
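
As a rough illustration of why the coefficient is called simple, here is a minimal Python sketch of Chatterjee's coefficient in the no-ties case. The function name `chatterjee_xi` is hypothetical, the code assumes continuous data without ties in either variable, and it is not taken from the paper.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation, assuming no ties in x or y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Sort the pairs by x, then rank the y values (ranks 1..n) in that order.
    order = np.argsort(x, kind="stable")
    ranks = np.argsort(np.argsort(y[order])) + 1
    # xi_n = 1 - 3 * sum |r_{i+1} - r_i| / (n^2 - 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)
```

For a strictly monotone relationship the sorted ranks change by one at each step, so the value is (n - 2)/(n + 1), which tends to 1 as n grows; for independent samples the value is close to 0.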

Statistics Theory

On the proliferation of support vectors in high dimensions

The support vector machine (SVM) is a well-established classification method whose name refers to the particular training examples, called support vectors, that determine the maximum margin separating hyperplane. The SVM classifier is known to enjoy good generalization properties when the number of support vectors is small compared to the number of training examples. However, recent research has shown that in sufficiently high-dimensional linear classification problems, the SVM can generalize well despite a proliferation of support vectors where all training examples are support vectors. In this paper, we identify new deterministic equivalences for this phenomenon of support vector proliferation, and use them to (1) substantially broaden the conditions under which the phenomenon occurs in high-dimensional settings, and (2) prove a nearly matching converse result.

Statistics Theory

On the relationship between beta-Bartlett and Uhlig extended processes

Stochastic volatility processes are used in multivariate time-series analysis to track time-varying patterns in covariance matrices. Uhlig extended and beta-Bartlett processes are especially convenient for analyzing high-dimensional time series because they are conjugate with Wishart likelihoods. In this article, we show that Uhlig extended and beta-Bartlett processes are closely related, but not equivalent: their hyperparameters can be matched so that they have the same forward-filtered posteriors and one-step-ahead forecasts, but different joint (smoothed) posterior distributions. Under this circumstance, Bayes factors cannot discriminate between the models, and alternative approaches to model comparison are needed. We illustrate these issues in a retrospective analysis of volatilities of returns of foreign exchange rates. Additionally, we provide a backward sampling algorithm for the beta-Bartlett process, for which retrospective analysis had not been developed.

Statistics Theory

On the robustness to adversarial corruption and to heavy-tailed data of the Stahel-Donoho median of means

We consider median of means (MOM) versions of the Stahel-Donoho outlyingness (SDO) [Stahel 1981, Donoho 1982] and of the Median Absolute Deviation (MAD) functions to construct subgaussian estimators of a mean vector under adversarial contamination and heavy-tailed data. We develop a single analysis of the MOM version of the SDO which covers all cases ranging from the Gaussian case to the L2 case. It is based on isomorphic and almost isometric properties of the MOM versions of SDO and MAD. This analysis also covers cases where the mean does not even exist but a location parameter does; in those cases we still recover the same subgaussian rates and the same price for adversarial contamination even though there is not even a first moment. These properties are achieved by the classical SDO median and are therefore the first non-asymptotic statistical bounds on the Stahel-Donoho median, complementing the √n-consistency [Maronna 1995] and asymptotic normality [Zuo, Cui, He, 2004] of the Stahel-Donoho estimators. We also show that the MOM version of MAD can be used to construct an estimator of the covariance matrix under only an L2-moment assumption, or of a scale parameter if a second moment does not exist.
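
To make the median-of-means idea concrete, the following Python sketch shows the basic univariate MOM estimator of a mean. It is only the elementary building block, not the Stahel-Donoho or MAD construction analyzed in the paper; the function name `median_of_means` and its parameters are hypothetical.

```python
import numpy as np

def median_of_means(x, n_blocks, seed=0):
    """Univariate median of means: split the data into blocks at random,
    average within each block, and return the median of the block means."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(x)                 # random block assignment
    blocks = np.array_split(shuffled, n_blocks)   # nearly equal-sized blocks
    return float(np.median([b.mean() for b in blocks]))
```

If fewer than half of the blocks contain a corrupted observation, the median of the block means is unaffected by how extreme those observations are, whereas the plain sample mean can be moved arbitrarily far.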

Statistics Theory

On the sum of ordered spacings

We provide the analytic forms of the distributions for the sum of ordered spacings. We do this both for the case where the boundaries are included in the calculation of the spacings and for the case where they are excluded. Both the probability densities and their cumulative distributions are provided. These results will have useful applications in the physical sciences and possibly elsewhere.
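
Analytic results of this kind can be sanity-checked by simulation. The Python sketch below draws the sum of the k largest spacings by Monte Carlo. It assumes the points are independent and uniform on [0, 1] and that "ordered" refers to spacings sorted by size; neither assumption is stated in the abstract, and the function name `sum_of_largest_spacings` is hypothetical.

```python
import numpy as np

def sum_of_largest_spacings(n_points, k, include_boundaries=True,
                            n_trials=10000, seed=0):
    """Monte Carlo draws of the sum of the k largest spacings between
    n_points uniform points on [0, 1] (uniformity is an assumption)."""
    rng = np.random.default_rng(seed)
    u = np.sort(rng.random((n_trials, n_points)), axis=1)
    if include_boundaries:
        # Add the endpoints 0 and 1, giving n_points + 1 spacings per trial.
        zeros = np.zeros((n_trials, 1))
        ones = np.ones((n_trials, 1))
        u = np.hstack([zeros, u, ones])
    spacings = np.sort(np.diff(u, axis=1), axis=1)  # ascending in each row
    if not 1 <= k <= spacings.shape[1]:
        raise ValueError("k must be between 1 and the number of spacings")
    return spacings[:, -k:].sum(axis=1)
```

With boundaries included, the mean of the largest of the n + 1 spacings is (1 + 1/2 + ... + 1/(n + 1)) / (n + 1), which gives a quick check of the simulation before comparing it with a closed-form density.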

Statistics Theory

On universally consistent and fully distribution-free rank tests of vector independence

Rank correlations have found many innovative applications in the last decade. In particular, suitable rank correlations have been used for consistent tests of independence between pairs of random variables. Using ranks is especially appealing for continuous data as tests become distribution-free. However, the traditional concept of ranks relies on ordering data and is, thus, tied to univariate observations. As a result, it has long remained unclear how one may construct distribution-free yet consistent tests of independence between random vectors. This is the problem addressed in this paper, in which we lay out a general framework for designing dependence measures that give tests of multivariate independence that are not only consistent and distribution-free but which we also prove to be statistically efficient. Our framework leverages the recently introduced concept of center-outward ranks and signs, a multivariate generalization of traditional ranks, and adopts a common standard form for dependence measures that encompasses many popular examples. In a unified study, we derive a general asymptotic representation of center-outward rank-based test statistics under independence, extending to the multivariate setting the classical Hájek asymptotic representation results. This representation permits direct calculation of limiting null distributions and facilitates a local power analysis that provides strong support for the center-outward approach by establishing, for the first time, the nontrivial power of center-outward rank-based tests over root-n neighborhoods within the class of quadratic mean differentiable alternatives.

Statistics Theory

One Hundred Probability and Statistics Inequalities

Herein we present one hundred inequalities culled from various corners of the probability, statistics, and combinatorics literature. We welcome new suggestions.
