Archive | 2019

Statistical and Computational Properties of Some “User-Friendly” Methods for High-Dimensional Estimation

 

Abstract


Historically, the choice of method for a given statistical problem has been primarily driven by two criteria: a method’s statistical properties, and its computational properties. But as tools for high-dimensional estimation become ubiquitous, it is clear that there are other considerations, going beyond pure accuracy and computational efficiency, that are equally (if not more) important. One such consideration is a method’s “user-friendliness”, a term we use to encapsulate the various properties that make a method easy to work with in practice, exemplified by a method being (i) easy to implement, (ii) interpretable, and (iii) computationally cheap. In this thesis, we present new statistical and computational results for three different user-friendly methods in various high-dimensional estimation settings. First, we give conditions for the existence and uniqueness of solutions to the generalized lasso problem, which is a generalization of the standard lasso problem that allows the user to easily impose domain-appropriate structure onto the fitted coefficients. The conditions are very weak, and essentially guarantee uniqueness in many settings of practical interest, even in high dimensions; these results are useful from the points of view of interpretability as well as prediction. Second, we consider early-stopped gradient descent (as an estimator), giving a number of results that tightly couple the risk profile of the iterates generated by gradient descent, when run on the fundamental problem of least squares regression, to that of ridge regression. These results are favorable for gradient descent, as it is relatively easy to implement as well as computationally cheap. We also discuss extending the analysis to give a similar coupling for (the arguably even more user-friendly) stochastic gradient descent.
Finally, we present a new user-friendly, pseudolikelihood-based method for robust undirected graphical modeling that we call the Multiple Quantile Graphical Model (MQGM), showing that the MQGM recovers the population-level conditional independencies with high probability; this is again a useful result from an interpretability standpoint. We also give a highly efficient algorithm, based on the alternating direction method of multipliers, for fitting the MQGM to high-dimensional and potentially non-Gaussian data.
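
As a rough illustration of the second contribution, the sketch below runs gradient descent on a least squares problem and compares selected iterates with ridge regression solutions. It is a minimal example on simulated data, not the thesis's own code. The function names, the simulated design, and the calibration that pairs iteration k with ridge penalty 1/(k * step) are all assumptions made here for illustration.

```python
import numpy as np


def gradient_descent_path(X, y, step, n_iters):
    """Iterates of gradient descent on (1/2n)||y - X b||^2, started at zero."""
    n, p = X.shape
    beta = np.zeros(p)
    path = [beta.copy()]
    for _ in range(n_iters):
        grad = X.T @ (X @ beta - y) / n
        beta = beta - step * grad
        path.append(beta.copy())
    return np.array(path)


def ridge(X, y, lam):
    """Minimizer of (1/2n)||y - X b||^2 + (lam/2)||b||^2 in closed form."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)


# Simulated data (hypothetical setup, chosen only for illustration).
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + rng.standard_normal(n)

step = 0.01
path = gradient_descent_path(X, y, step, n_iters=2000)

# Assumed calibration: stopping after k steps is paired with penalty 1/(k*step).
for k in (10, 100, 1000):
    b_gd = path[k]
    b_ridge = ridge(X, y, 1.0 / (k * step))
    err_gd = np.sum((b_gd - beta_true) ** 2)
    err_ridge = np.sum((b_ridge - beta_true) ** 2)
    print(f"k={k:5d}  GD error={err_gd:.4f}  ridge error={err_ridge:.4f}")
```

Stopping early plays the role of regularization here: few iterations keep the estimate close to zero, much as a large ridge penalty does, while many iterations approach the unregularized least squares fit.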

DOI 10.1184/R1/8336750.V1
Language English
