Publication


Featured research published by Maria L. Rizzo.


Annals of Statistics | 2007

Measuring and testing dependence by correlation of distances

Gábor J. Székely; Maria L. Rizzo; Nail K. Bakirov

Distance correlation is a new measure of dependence between random vectors. Distance covariance and distance correlation are analogous to product-moment covariance and correlation, but unlike the classical definition of correlation, distance correlation is zero only if the random vectors are independent. The empirical distance dependence measures are based on certain Euclidean distances between sample elements rather than sample moments, yet have a compact representation analogous to the classical covariance and correlation. Asymptotic properties and applications in testing independence are discussed. Implementation of the test and Monte Carlo results are also presented.
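The compact representation mentioned in the abstract can be sketched in a few lines. This is a minimal illustration, assuming the usual double-centered distance matrix form of the sample statistic; the function names are hypothetical and are not taken from the paper or from any package.

```python
import math

def centered_distances(points):
    """Double-centered Euclidean distance matrix of a list of equal-length tuples."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so column means equal row means
    return [[d[k][l] - row_means[k] - row_means[l] + grand_mean
             for l in range(n)] for k in range(n)]

def mean_product(a, b):
    """Average of the elementwise products of two n x n matrices."""
    n = len(a)
    return sum(a[k][l] * b[k][l] for k in range(n) for l in range(n)) / n ** 2

def distance_correlation(x, y):
    """Sample distance correlation; returns 0.0 when either sample is constant."""
    a, b = centered_distances(x), centered_distances(y)
    dcov_sq = mean_product(a, b)
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom <= 0:
        return 0.0
    return math.sqrt(max(dcov_sq, 0.0) / denom)

# y is a deterministic but nonlinear function of x; Pearson correlation is 0 here
x = [(-2.0,), (-1.0,), (0.0,), (1.0,), (2.0,)]
y = [(v[0] ** 2,) for v in x]
print(distance_correlation(x, y))
```

In this toy data the product-moment correlation is exactly zero, while the distance correlation is clearly positive, which is the behavior the abstract describes.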


The Annals of Applied Statistics | 2009

Brownian distance covariance

Gábor J. Székely; Maria L. Rizzo

We discuss briefly the very interesting concept of Brownian distance covariance developed by Székely and Rizzo (2009) and describe two possible extensions. The first extension is for high dimensional data that can be coerced into a Hilbert space, including certain high throughput screening and functional data settings. The second extension involves very simple modifications that may yield increased power in some settings. We commend Székely and Rizzo for their very interesting work and recognize that this general idea has potential to have a large impact on the way in which statisticians evaluate dependency in data.

Distance correlation is a new class of multivariate dependence coefficients applicable to random vectors of arbitrary and not necessarily equal dimension. Distance covariance and distance correlation are analogous to product-moment covariance and correlation, but generalize and extend these classical bivariate measures of dependence. Distance correlation characterizes independence: it is zero if and only if the random vectors are independent. The notion of covariance with respect to a stochastic process is introduced, and it is shown that population distance covariance coincides with the covariance with respect to Brownian motion; thus, both can be called Brownian distance covariance. In the bivariate case, Brownian covariance is the natural extension of product-moment covariance, as we obtain Pearson product-moment covariance by replacing the Brownian motion in the definition with identity. The corresponding statistic has an elegantly simple computing formula. Advantages of applying Brownian covariance and correlation vs the classical Pearson covariance and correlation are discussed and illustrated.
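For orientation, the computing formula referred to in the abstract is usually written as below. The abstract does not define any notation, so the symbols a_{kl}, A_{kl}, V_n and R_n are assumptions following common usage for a sample (X_k, Y_k), k = 1, ..., n; B_{kl} is built from b_{kl} = |Y_k - Y_l| in the same way as A_{kl}.

```latex
a_{kl} = |X_k - X_l|, \qquad
A_{kl} = a_{kl} - \bar a_{k\cdot} - \bar a_{\cdot l} + \bar a_{\cdot\cdot},
\\[4pt]
\mathcal{V}_n^2(X, Y) = \frac{1}{n^2} \sum_{k,l=1}^{n} A_{kl} B_{kl},
\qquad
\mathcal{R}_n^2(X, Y) =
  \frac{\mathcal{V}_n^2(X, Y)}{\sqrt{\mathcal{V}_n^2(X, X)\,\mathcal{V}_n^2(Y, Y)}}
  \quad \text{when the denominator is positive.}
```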


Journal of Multivariate Analysis | 2013

The distance correlation t-test of independence in high dimension

Gábor J. Székely; Maria L. Rizzo

Distance correlation is extended to the problem of testing the independence of random vectors in high dimension. Distance correlation characterizes independence and determines a test of multivariate independence for random vectors in arbitrary dimension. In this work, a modified distance correlation statistic is proposed, such that under independence the distribution of a transformation of the statistic converges to Student t, as dimension tends to infinity. Thus we obtain a distance correlation t-test for independence of random vectors in arbitrarily high dimension, applicable under standard conditions on the coordinates that ensure the validity of certain limit theorems. This new test is based on an unbiased estimator of distance covariance, and the resulting t-test is unbiased for every sample size greater than three and all significance levels. The transformed statistic is approximately normal under independence for sample size greater than nine, providing an informative sample coefficient that is easily interpretable for high dimensional data.
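The transformation to a t statistic can be sketched as follows. This assumes the commonly documented form T = sqrt(v - 1) * R / sqrt(1 - R^2) with v = n(n - 3)/2 and v - 1 degrees of freedom, where R is the modified (bias-corrected) distance correlation; the function name is hypothetical, and computing R itself is left out of this sketch.

```python
import math

def dcor_t_statistic(r_star, n):
    """Map a bias-corrected distance correlation to (t statistic, degrees of freedom).

    r_star : modified distance correlation, assumed to lie strictly in (-1, 1)
    n      : sample size, at least 4
    """
    if n < 4:
        raise ValueError("sample size must be at least 4")
    if not -1.0 < r_star < 1.0:
        raise ValueError("r_star must lie strictly between -1 and 1")
    v = n * (n - 3) / 2
    t = math.sqrt(v - 1) * r_star / math.sqrt(1 - r_star ** 2)
    return t, v - 1

t_value, df = dcor_t_statistic(0.5, 10)
print(t_value, df)
```

The returned pair would then be referred to a Student t distribution with the given degrees of freedom; that lookup needs a t quantile function, which the Python standard library does not provide.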


The Annals of Applied Statistics | 2010

DISCO analysis: A nonparametric extension of analysis of variance

Maria L. Rizzo; Gábor J. Székely

In classical analysis of variance, dispersion is measured by considering squared distances of sample elements from the sample mean. We consider a measure of dispersion for univariate or multivariate response based on all pairwise distances between sample elements, and derive an analogous distance components (DISCO) decomposition for powers of distance in (0, 2]. The ANOVA F statistic is obtained when the index (exponent) is 2. For each index in (0, 2), this decomposition determines a nonparametric test for the multi-sample hypothesis of equal distributions that is statistically consistent against general alternatives.
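A minimal sketch of the decomposition, assuming the usual form in which total, within-sample and between-sample dispersion are all built from mean pairwise distances raised to the index alpha. The function names and the dictionary keys are hypothetical; no permutation test is included.

```python
import math
from itertools import combinations

def mean_power_distance(a, b, alpha):
    """Average of |p - q|**alpha over all pairs p in a, q in b."""
    return sum(math.dist(p, q) ** alpha for p in a for q in b) / (len(a) * len(b))

def disco(samples, alpha=1.0):
    """Distance components for a list of samples (each a list of tuples)."""
    pooled = [p for s in samples for p in s]
    n_total, k = len(pooled), len(samples)
    total = n_total / 2 * mean_power_distance(pooled, pooled, alpha)
    within = sum(len(s) / 2 * mean_power_distance(s, s, alpha) for s in samples)
    between = 0.0
    for s, t in combinations(samples, 2):
        # two-sample distance between s and t at index alpha
        d = (2 * mean_power_distance(s, t, alpha)
             - mean_power_distance(s, s, alpha)
             - mean_power_distance(t, t, alpha))
        between += len(s) * len(t) / (2 * n_total) * d
    ratio = (between / (k - 1)) / (within / (n_total - k))
    return {"total": total, "within": within, "between": between, "ratio": ratio}

groups = [[(1.0,), (2.0,), (3.0,)], [(4.0,), (6.0,)], [(7.0,), (8.0,), (9.0,)]]
print(disco(groups, alpha=2.0))  # index 2 reproduces the classical sums of squares
print(disco(groups, alpha=1.0))
```

With index 2 and univariate data, the three components coincide with the total, within-group and between-group sums of squares of one-way ANOVA, so the ratio is the usual F statistic.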


Annals of Statistics | 2014

Partial distance correlation with methods for dissimilarities

Gábor J. Székely; Maria L. Rizzo

Distance covariance and distance correlation are scalar coefficients that characterize independence of random vectors in arbitrary dimension. Properties, extensions, and applications of distance correlation have been discussed in the recent literature, but the problem of defining the partial distance correlation has remained an open question of considerable interest. The problem of partial distance correlation is more complex than partial correlation partly because the squared distance covariance is not an inner product in the usual linear space. For the definition of partial distance correlation we introduce a new Hilbert space where the squared distance covariance is the inner product. We define the partial distance correlation statistics with the help of this Hilbert space, and develop and implement a test for zero partial distance correlation. Our intermediate results provide an unbiased estimator of squared distance covariance, and a neat solution to the problem of distance correlation for dissimilarities rather than distances.
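The construction can be sketched as follows, assuming the U-centered distance matrix form of the unbiased estimator and the familiar partial-correlation-style formula applied to bias-corrected distance correlations. Function names are hypothetical, samples need at least four points, and no test of zero partial distance correlation is included.

```python
import math

def u_centered(points):
    """U-centered Euclidean distance matrix (requires at least 4 points)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_sums = [sum(row) for row in d]
    total = sum(row_sums)
    u = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            if k != l:  # the diagonal stays zero by definition
                u[k][l] = (d[k][l] - row_sums[k] / (n - 2) - row_sums[l] / (n - 2)
                           + total / ((n - 1) * (n - 2)))
    return u

def inner_product(u, v):
    """Inner product of U-centered matrices; unbiased for squared distance covariance."""
    n = len(u)
    return sum(u[k][l] * v[k][l]
               for k in range(n) for l in range(n) if k != l) / (n * (n - 3))

def bias_corrected_dcor(x, y):
    u, v = u_centered(x), u_centered(y)
    denom = math.sqrt(inner_product(u, u) * inner_product(v, v))
    return inner_product(u, v) / denom if denom > 0 else 0.0

def partial_dcor(x, y, z):
    """Partial distance correlation of x and y with z removed."""
    r_xy = bias_corrected_dcor(x, y)
    r_xz = bias_corrected_dcor(x, z)
    r_yz = bias_corrected_dcor(y, z)
    denom = math.sqrt(max(0.0, 1 - r_xz ** 2)) * math.sqrt(max(0.0, 1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom if denom > 0 else 0.0

x = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
y = [(1.0,), (4.0,), (9.0,), (3.0,), (0.0,), (5.0,)]
z = [(2.0,), (7.0,), (1.0,), (8.0,), (2.5,), (9.0,)]
print(partial_dcor(x, y, z))
```

Unlike the ordinary sample distance correlation, the bias-corrected coefficient can be negative, so the partial coefficient is reported on the scale from -1 to 1.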


Astin Bulletin | 2009

New Goodness-of-Fit Tests for Pareto Distributions

Maria L. Rizzo

A new approach to goodness-of-fit for Pareto distributions is introduced. Based on Euclidean distances between sample elements, the family of statistics and tests is indexed by an exponent in (0,2) on Euclidean distance. The corresponding tests are statistically consistent and have excellent performance when applied to heavy-tailed distributions. The exponent can be tailored to the particular Pareto distribution. The goodness-of-fit statistic measures all types of differences between distributions, hence it is also applicable as a minimum distance estimator. Implementation of the test statistics is developed and applied to estimation of the tail index in three well-known examples of claims data, and compared with the classical EDF statistics.
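The idea of a distance-based fit statistic indexed by an exponent can be illustrated with a Monte Carlo stand-in. This sketch does not use the closed-form expected distances of the paper; instead it compares the data with a large simulated reference sample from the hypothesized Pareto distribution (scale 1, as drawn by Python's random.paretovariate). All function names and the default reference size are hypothetical, and the exponent s should be chosen below the tail index so that the expected distances are finite.

```python
import random

def mean_power_distance(a, b, s):
    """Average of |u - v|**s over all pairs u in a, v in b (univariate data)."""
    return sum(abs(u - v) ** s for u in a for v in b) / (len(a) * len(b))

def energy_distance(x, y, s):
    """Two-sample distance based on pairwise distances raised to the power s."""
    return (2 * mean_power_distance(x, y, s)
            - mean_power_distance(x, x, s)
            - mean_power_distance(y, y, s))

def pareto_energy_statistic(sample, shape, s, reference_size=400, seed=0):
    """n times the distance between the sample and a simulated Pareto(shape) reference."""
    rng = random.Random(seed)
    reference = [rng.paretovariate(shape) for _ in range(reference_size)]
    return len(sample) * energy_distance(sample, reference, s)

rng = random.Random(1)
data = [rng.paretovariate(3.0) for _ in range(60)]
shifted = [v + 5.0 for v in data]
print(pareto_energy_statistic(data, 3.0, 0.5))     # data drawn from the hypothesized model
print(pareto_energy_statistic(shifted, 3.0, 0.5))  # shifted data, a poor fit
```

Large values indicate a poor fit. Turning the statistic into a test would require a reference distribution, for example by parametric bootstrap, which is omitted here.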


International Journal of Information Technology and Decision Making | 2009

Pattern Recognition of Longitudinal Trial Data with Nonignorable Missingness: An Empirical Case Study

Hua Fang; Kimberly Andrews Espy; Maria L. Rizzo; Christian Stopp; Sandra A. Wiebe; Walter W. Stroup

Methods for identifying meaningful growth patterns of longitudinal trial data with both nonignorable intermittent and drop-out missingness are rare. In this study, a combined approach with statistical and data mining techniques is utilized to address the nonignorable missing data issue in growth pattern recognition. First, a parallel mixture model is proposed to model the nonignorable missing information from a real-world patient-oriented study and concurrently to estimate the growth trajectories of participants. Then, based on individual growth parameter estimates and their auxiliary feature attributes, a fuzzy clustering method is incorporated to identify the growth patterns. This case study demonstrates that the combined multi-step approach can achieve both statistical generality and computational efficiency for growth pattern recognition in longitudinal studies with nonignorable missing data.


Journal of Statistical Computation and Simulation | 2015

Variable selection in regression using maximal correlation and distance correlation

C. Deniz Yenigün; Maria L. Rizzo

In most regression problems the first task is to select the most influential predictors explaining the response and to remove the others from the model. These problems are usually referred to as variable selection problems in the statistical literature. Numerous methods have been proposed in this field, most of which address linear models. In this study we propose two variable selection criteria for regression based on two powerful dependence measures, maximal correlation and distance correlation. We focus on these two measures since they fully or partially satisfy the Rényi postulates for dependence measures, and thus they are able to detect nonlinear dependence structures. Therefore, our methods are considered to be appropriate in linear as well as nonlinear regression models. Both methods are easy to implement and they perform well. We illustrate the performances of the proposed methods via simulations, and compare them with two benchmark methods, stepwise Akaike information criterion and lasso. In several cases with linear dependence all four methods turned out to be comparable. In the presence of nonlinear or uncorrelated dependencies, we observed that our proposed methods may be favourable. An application of the proposed methods to a real financial data set is also provided.


Communications in Statistics - Theory and Methods | 2011

A Test of Independence in Two-Way Contingency Tables Based on Maximal Correlation

C. D. Yenigün; Gábor J. Székely; Maria L. Rizzo

Maximal correlation has several desirable properties as a measure of dependence, including the fact that it vanishes if and only if the variables are independent. Except for a few special cases, it is hard to evaluate maximal correlation explicitly. We focus on two-dimensional contingency tables and discuss a procedure for estimating maximal correlation, which we use for constructing a test of independence. We compare the maximal correlation test with other tests of independence by Monte Carlo simulations. When the underlying continuous variables are dependent but uncorrelated, we point out some cases for which the new test is more powerful.
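For a two-way table, a standard plug-in estimate of maximal correlation is the second-largest singular value of the matrix with entries p_ij / sqrt(p_i. p_.j), the largest singular value always being 1. The sketch below uses that fact with a simple power iteration; it is an illustration under that assumption and not necessarily the estimation procedure of the paper. The function name, the starting vector and the iteration count are hypothetical choices, and every row and column total is assumed to be positive.

```python
import math

def maximal_correlation(table, iterations=500):
    """Plug-in maximal correlation of a two-way table of counts."""
    rows, cols = len(table), len(table[0])
    total = sum(sum(row) for row in table)
    row_p = [sum(row) / total for row in table]
    col_p = [sum(table[i][j] for i in range(rows)) / total for j in range(cols)]
    q = [[table[i][j] / total / math.sqrt(row_p[i] * col_p[j])
          for j in range(cols)] for i in range(rows)]
    # Q^T Q with the trivial eigenpair (eigenvalue 1, eigenvector sqrt(col_p)) removed
    m = [[sum(q[i][a] * q[i][b] for i in range(rows)) - math.sqrt(col_p[a] * col_p[b])
          for b in range(cols)] for a in range(cols)]
    # power iteration; the start vector is arbitrary and could in principle
    # be orthogonal to the leading eigenvector
    vec = [float(j + 1) for j in range(cols)]
    eigenvalue = 0.0
    for _ in range(iterations):
        nxt = [sum(m[a][b] * vec[b] for b in range(cols)) for a in range(cols)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm < 1e-12:  # deflated matrix is numerically zero: no dependence found
            return 0.0
        eigenvalue = norm  # equals |M vec| for a unit vector after the first pass
        vec = [v / norm for v in nxt]
    return math.sqrt(eigenvalue)

print(maximal_correlation([[30, 10], [10, 30]]))
print(maximal_correlation([[10, 20], [20, 40]]))
```

For a 2 x 2 table this quantity reduces to the absolute value of the phi coefficient, which gives a convenient check. A test of independence would additionally need a null distribution, for instance from Monte Carlo simulation as in the abstract.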


Archive | 2012

Exploratory Data Analysis

Jim Albert; Maria L. Rizzo

Exploratory data analysis is the process by which a person manipulates data with the goal of learning about general patterns or tendencies and finding specific occurrences that deviate from the general patterns. The themes of Revelation, Resistance, Residuals, and Reexpression are illustrated in exploratory work.

Collaboration


Dive into Maria L. Rizzo's collaborations.

Top Co-Authors

Gábor J. Székely
National Science Foundation

Jim Albert
Bowling Green State University

Hua Fang
University of Massachusetts Medical School

Kimberly Andrews Espy
University of Nebraska–Lincoln

Nail K. Bakirov
Russian Academy of Sciences

Christian Stopp
University of Nebraska–Lincoln

Honggang Wang
University of Massachusetts Dartmouth

John T. Haman
Bowling Green State University