Mathematics
Statistics Theory
Featured Research
Conditional empirical copula processes and generalized dependence measures
We study the weak convergence of conditional empirical copula processes when the conditioning event has nonzero probability. The validity of several bootstrap schemes is established, including the exchangeable bootstrap. We define general, possibly conditional, multivariate dependence measures and their estimators. Applying our theoretical results, we prove the asymptotic normality of some estimators of such dependence measures.
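The empirical copula underlying this kind of analysis can be sketched in a few lines: evaluate the joint distribution of the marginal ranks. The following is an illustrative implementation, not the authors' code; the function names are ours.

```python
# Hypothetical sketch of the bivariate empirical copula, evaluated from data.

def ranks(xs):
    """Return the 1-based rank of each observation (ties broken by order)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def empirical_copula(data, u, v):
    """C_n(u, v) = (1/n) * #{i : R_i/n <= u and S_i/n <= v},
    where R_i, S_i are the marginal ranks of (X_i, Y_i)."""
    n = len(data)
    rx = ranks([x for x, _ in data])
    ry = ranks([y for _, y in data])
    return sum(1 for i in range(n) if rx[i] / n <= u and ry[i] / n <= v) / n

sample = [(0.1, 0.2), (0.4, 0.5), (0.9, 0.7), (0.3, 0.1)]
print(empirical_copula(sample, 0.5, 0.5))  # 0.5 for this toy sample
```

A conditional version would simply restrict the sum to observations falling in the conditioning event, which is why a nonzero conditioning probability matters.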
Conditional probability and improper priors
The purpose of this paper is to present a mathematical theory that can be used as a foundation for statistics that includes improper priors. This theory includes improper laws in the initial axioms and has, in particular, Bayes' theorem as a consequence. Another consequence is that some of the usual calculation rules are modified. This is important in relation to common statistical practice, which usually includes improper priors but tends to use unaltered calculation rules. In some cases the results are valid, but in other cases inconsistencies may appear. The famous marginalization paradoxes exemplify this latter case. An alternative mathematical theory for the foundations of statistics can be formulated in terms of conditional probability spaces. In this case the appearance of improper laws is a consequence of the theory. It is proved here that the resulting mathematical structures for the two theories are equivalent. The conclusion is that the choice of the first or the second formulation for the initial axioms can be considered a matter of personal preference. Readers who initially have concerns regarding improper priors may be more open toward a formulation of the initial axioms in terms of conditional probabilities. The interpretation of an improper law is given by the corresponding conditional probabilities. Keywords: Axioms of statistics, Conditional probability space, Improper prior, Projective space
Conditional tail risk expectations for location-scale mixture of elliptical distributions
We present general results on the univariate tail conditional expectation (TCE) and the multivariate tail conditional expectation for location-scale mixtures of elliptical distributions. Examples include the location-scale mixture of normal distributions, of Student-t distributions, of logistic distributions, and of Laplace distributions. We also consider portfolio risk decomposition with TCE for location-scale mixtures of elliptical distributions.
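For the simplest member of this family, the normal distribution, the univariate TCE has the classical closed form TCE_q(X) = mu + sigma * phi(z_q) / (1 - q), where z_q is the standard normal q-quantile and phi its density. The sketch below illustrates that formula numerically; the quantile inversion by bisection is our own illustrative choice, not taken from the paper.

```python
import math

# Illustrative TCE (expected shortfall) for a normal random variable,
# a simple member of the location-scale elliptical family discussed above.

def std_normal_pdf(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def std_normal_quantile(q, lo=-10.0, hi=10.0):
    """Invert the standard normal CDF by bisection (simple, not fast)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if (1 + math.erf(mid / math.sqrt(2))) / 2 < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def tce_normal(mu, sigma, q):
    """TCE_q(X) = E[X | X > VaR_q(X)] = mu + sigma * phi(z_q) / (1 - q)."""
    z = std_normal_quantile(q)
    return mu + sigma * std_normal_pdf(z) / (1 - q)

print(round(tce_normal(0.0, 1.0, 0.95), 4))
```

The mixtures considered in the paper replace the normal kernel with a general elliptical density and a mixing distribution over location and scale; only the normal case is shown here.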
Conformal e-prediction for change detection
We adapt conformal e-prediction to change detection, defining analogues of the Shiryaev-Roberts and CUSUM procedures for detecting violations of the IID assumption. Asymptotically, the frequency of false alarms for these analogues does not exceed the usual bounds.
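The e-value analogues of the two classical detectors admit very simple recursions: given a stream of e-values (each with expected value at most 1 under the IID null), the Shiryaev-Roberts statistic follows R_t = (R_{t-1} + 1) e_t and the CUSUM statistic M_t = max(M_{t-1}, 1) e_t. The sketch below uses these standard e-detector updates with made-up input e-values; it is not the paper's conformal construction of the e-values themselves.

```python
# Hedged sketch of e-value analogues of Shiryaev-Roberts and CUSUM.

def shiryaev_roberts(e_values, threshold):
    """R_t = (R_{t-1} + 1) * e_t; alarm at the first t with R_t >= threshold."""
    r = 0.0
    for t, e in enumerate(e_values, start=1):
        r = (r + 1.0) * e
        if r >= threshold:
            return t
    return None  # no alarm raised

def cusum(e_values, threshold):
    """M_t = max(M_{t-1}, 1) * e_t; alarm at the first t with M_t >= threshold."""
    m = 1.0
    for t, e in enumerate(e_values, start=1):
        m = max(m, 1.0) * e
        if m >= threshold:
            return t
    return None

stream = [0.9, 1.1, 0.8, 3.0, 2.5, 2.0]  # e-values spike after a change
print(shiryaev_roberts(stream, 10.0), cusum(stream, 10.0))  # 4 6
```

The multiplicative updates are what make the false-alarm guarantees possible: under the null, both statistics are (up to the additive restart term) nonnegative supermartingales, so level crossings are controlled by Ville's inequality.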
Consistency analysis of bilevel data-driven learning in inverse problems
One fundamental problem when solving inverse problems is how to choose regularization parameters. This article considers solving this problem using data-driven bilevel optimization, i.e., we consider the adaptive learning of the regularization parameter from data by means of optimization. This approach can be interpreted as solving an empirical risk minimization problem, and we analyze its performance in the large-sample-size limit for general nonlinear problems. We demonstrate how to implement our framework on linear inverse problems, where we can further show that the accuracy of the inversion does not depend on the ambient space dimension. To reduce the associated computational cost, online numerical schemes are derived using the stochastic gradient descent method. We prove convergence of these numerical schemes under suitable assumptions on the forward problem. Numerical experiments are presented illustrating the theoretical results and demonstrating the applicability and efficiency of the proposed approaches for various linear and nonlinear inverse problems, including Darcy flow, the eikonal equation, and an image denoising example.
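The bilevel structure can be illustrated on a toy scalar inverse problem y = a x + noise: the inner problem is Tikhonov inversion with parameter lam, and the outer problem picks lam to minimize the empirical risk over training pairs (x_i, y_i). The sketch below uses a grid search for the outer problem; the paper's online SGD schemes are a more scalable alternative. All names and the data are illustrative.

```python
# Hedged sketch of data-driven (bilevel) choice of a regularization parameter.

def tikhonov(y, a, lam):
    """Inner problem: regularized inversion of the forward map x -> a * x."""
    return a * y / (a * a + lam)

def empirical_risk(pairs, a, lam):
    """Mean squared reconstruction error over training pairs (x_i, y_i)."""
    return sum((tikhonov(y, a, lam) - x) ** 2 for x, y in pairs) / len(pairs)

def learn_lambda(pairs, a, grid):
    """Outer problem: choose lam minimizing the empirical risk."""
    return min(grid, key=lambda lam: empirical_risk(pairs, a, lam))

a = 2.0
pairs = [(1.0, 2.3), (-0.5, -0.8), (2.0, 3.9), (0.0, 0.2)]   # (x_i, y_i)
grid = [10 ** (k / 4) / 100 for k in range(17)]               # ~0.01 .. 100
best = learn_lambda(pairs, a, grid)
print(best, empirical_risk(pairs, a, best))
```

In the large-sample limit, the minimizer of the empirical risk converges to the minimizer of the population risk, which is the consistency statement the paper makes precise.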
Consistent Bayesian Community Detection
Stochastic Block Models (SBMs) are a fundamental tool for community detection in network analysis, but little theoretical work exists on the statistical performance of Bayesian SBMs, especially when the community count is unknown. This paper studies a special class of SBMs whose community-wise connectivity probability matrix is diagonally dominant, i.e., members of the same community are more likely to connect with one another than with members of other communities. The diagonal dominance constraint is embedded within an otherwise weak prior, and, under mild regularity conditions, the resulting posterior distribution is shown to concentrate on the true community count and membership allocation as the network size grows to infinity. A reversible-jump Markov chain Monte Carlo posterior computation strategy is developed by adapting the allocation sampler of McDaid et al. (2013). Finite-sample properties are examined via simulation studies, in which the proposed method offers competitive estimation accuracy relative to existing methods under a variety of challenging scenarios.
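The diagonally dominant SBM assumed in the paper is easy to simulate: within-community edge probability p_in exceeds every between-community probability p_out. The generator below is our own illustration of that data-generating process, not the paper's code.

```python
import random

# Illustrative generator for an SBM with diagonally dominant connectivity
# (p_in > p_out), the assortative structure assumed in the paper.

def sample_sbm(labels, p_in, p_out, rng):
    """Return an undirected adjacency matrix for nodes with given labels."""
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj

rng = random.Random(0)
labels = [0] * 10 + [1] * 10
adj = sample_sbm(labels, p_in=0.9, p_out=0.05, rng=rng)
```

Posterior concentration then means that, as n grows, the posterior over (community count, labels) puts almost all its mass on the configuration that generated such a graph.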
Construction and Extension of Dispersion Models
There are two main classes of dispersion models studied in the literature: proper dispersion models (PDM) and exponential dispersion models (EDM). Dispersion models that are neither proper nor exponential are termed here non-standard dispersion models (NSDM). This paper presents a technique for constructing new PDMs and NSDMs. This construction provides a solution to an open question in the theory of dispersion models about the extension of non-standard dispersion models. Given a unit deviance function, a dispersion model is usually constructed by calculating a normalising function that makes the density function integrate to one. This calculation involves the solution of non-trivial integral equations. The main idea explored here is to use characteristic functions of real non-lattice symmetric probability measures to construct a family of unit deviances that are sufficiently regular to make the associated integral equations tractable. The integral equations associated with those unit deviances admit a trivial solution, in the sense that the normalising function is a constant function independent of the observed values. However, we show, using the machinery of distributions (i.e., generalised functions) and expansions of the normalising function with respect to specially constructed Riesz systems, that those integral equations also admit infinitely many non-trivial solutions, generating many NSDMs. We conclude that the cardinality of the class of non-standard dispersion models is larger than the cardinality of the class of real non-lattice symmetric probability measures.
Convergence Rates for Bayesian Estimation and Testing in Monotone Regression
Shape restrictions such as monotonicity on functions often arise naturally in statistical modeling. We consider a Bayesian approach to the problem of estimation of a monotone regression function and testing for monotonicity. We construct a prior distribution using piecewise constant functions. For estimation, a prior imposing monotonicity of the heights of these steps is sensible, but the resulting posterior is harder to analyze theoretically. We consider a "projection-posterior" approach, where a conjugate normal prior is used, but the monotonicity constraint is imposed on posterior samples by a projection map onto the space of monotone functions. We show that the resulting posterior contracts at the optimal rate n^(-1/3) under the L_1 metric and at a nearly optimal rate under the empirical L_p metrics for 0 < p ≤ 2. The projection-posterior approach is also computationally more convenient. We also construct a Bayesian test for the hypothesis of monotonicity using the posterior probability of a shrinking neighborhood of the set of monotone functions. We show that the resulting test has a universal consistency property and obtain the separation rate which ensures that the resulting power function approaches one.
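The projection step maps an unconstrained posterior draw of step heights to the nearest nondecreasing sequence, i.e., its isotonic regression, which the pool-adjacent-violators algorithm (PAVA) computes exactly. The sketch below is our own minimal PAVA; the "draw" is a placeholder where the conjugate-normal posterior sample would go.

```python
# Minimal pool-adjacent-violators algorithm: L2 projection of a sequence
# onto nondecreasing sequences (the projection used on posterior draws).

def pava(y):
    blocks = []  # each block holds [sum, count]; its mean is sum / count
    for v in y:
        blocks.append([v, 1])
        # merge while the previous block's mean exceeds the last block's mean
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out

draw = [0.2, 1.5, 0.9, 0.7, 2.0]  # placeholder unconstrained posterior draw
print(pava(draw))  # middle three values pooled to their mean
```

Because projection is applied sample by sample, the convenient conjugate posterior machinery is untouched; only the output is constrained, which is what makes the contraction-rate analysis tractable.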
Convergence Rates of Two-Component MCMC Samplers
Component-wise MCMC algorithms, including Gibbs and conditional Metropolis-Hastings samplers, are commonly used for sampling from multivariate probability distributions. A long-standing question regarding Gibbs algorithms is whether a deterministic-scan (systematic-scan) sampler converges faster than its random-scan counterpart. We answer this question when the samplers involve two components by establishing an exact quantitative relationship between the L_2 convergence rates of the two samplers. The relationship shows that the deterministic-scan sampler converges faster. We also establish qualitative relations among the convergence rates of two-component Gibbs samplers and some conditional Metropolis-Hastings variants. For instance, it is shown that if some two-component conditional Metropolis-Hastings samplers are geometrically ergodic, then so are the associated Gibbs samplers.
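The two scan orders being compared are easy to state concretely. For a bivariate normal target with correlation rho, each full conditional is N(rho * other, 1 - rho^2); a deterministic scan updates both coordinates every iteration in a fixed order, while a random scan updates one coordinate chosen by a coin flip. The toy sampler below illustrates the two schemes; the function names are ours.

```python
import random

# Toy two-component Gibbs sampler for a bivariate normal with correlation rho,
# supporting both deterministic-scan and random-scan updates.

def gibbs(rho, n_iter, rng, deterministic_scan=True):
    x = y = 0.0
    sd = (1 - rho * rho) ** 0.5
    xs = []
    for _ in range(n_iter):
        if deterministic_scan:
            x = rng.gauss(rho * y, sd)   # update x | y
            y = rng.gauss(rho * x, sd)   # then y | x
        else:
            if rng.random() < 0.5:       # random scan: update one coordinate
                x = rng.gauss(rho * y, sd)
            else:
                y = rng.gauss(rho * x, sd)
        xs.append(x)
    return xs

rng = random.Random(1)
xs = gibbs(rho=0.5, n_iter=20000, rng=rng, deterministic_scan=True)
mean = sum(xs) / len(xs)
```

The paper's result makes the intuition precise: per sweep, the deterministic scan cannot mix more slowly than the random scan in the L_2 sense for two components.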
Convex Regression in Multidimensions: Suboptimality of Least Squares Estimators
The least squares estimator (LSE) is shown to be suboptimal in squared error loss in the usual nonparametric regression model with Gaussian errors for d ≥ 5 for each of the following families of functions: (i) convex functions supported on a polytope (in fixed design), (ii) bounded convex functions supported on a polytope (in random design), and (iii) convex Lipschitz functions supported on any convex domain (in random design). For each of these families, the risk of the LSE is proved to be of the order n^(-2/d) (up to logarithmic factors), while the minimax risk is n^(-4/(d+4)), for d ≥ 5. In addition, the first rate-of-convergence results (worst case and adaptive) for the full convex LSE are established for polytopal domains for all d ≥ 1. Some new metric entropy results for convex functions, of independent interest, are also proved.