Publication


Featured research published by Martin J. Wainwright.


Foundations and Trends® in Machine Learning | 2008

Graphical Models, Exponential Families, and Variational Inference

Martin J. Wainwright; Michael I. Jordan

The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms — among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
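
As a toy illustration of the variational viewpoint (not code from the monograph), the sketch below runs naive mean-field coordinate updates on a small pairwise Ising model with arbitrary parameters; the fully factorized variational family and the update m_i = tanh(θ_i + Σ_j J_ij m_j) are the standard textbook form of the mean-field methods mentioned above.

```python
import numpy as np

# Minimal naive mean-field sketch for a small pairwise Ising model
# p(x) ∝ exp(sum_i theta_i x_i + sum_{(i,j)} J_ij x_i x_j), x_i in {-1, +1}.
# Parameter values are arbitrary; this only illustrates the variational idea.

rng = np.random.default_rng(0)
n = 5
theta = rng.normal(scale=0.5, size=n)               # unary parameters
J = np.triu(rng.normal(scale=0.3, size=(n, n)), 1)  # pairwise couplings
J = J + J.T                                          # symmetric, zero diagonal

m = np.zeros(n)  # mean-field "magnetizations" E_q[x_i]
for _ in range(200):
    for i in range(n):
        # coordinate-ascent update on the fully factorized variational family
        m[i] = np.tanh(theta[i] + J[i] @ m)

print("approximate marginals P(x_i = +1):", 0.5 * (1 + m))
```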


IEEE Transactions on Image Processing | 2003

Image denoising using scale mixtures of Gaussians in the wavelet domain

Javier Portilla; Vasily Strela; Martin J. Wainwright; Eero P. Simoncelli

We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multiscale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the coefficient amplitudes. Under this model, the Bayesian least squares estimate of each coefficient reduces to a weighted average of the local linear estimates over all possible values of the hidden multiplier variable. We demonstrate through simulations with images contaminated by additive white Gaussian noise that the performance of this method substantially surpasses that of previously published methods, both visually and in terms of mean squared error.
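
A minimal one-dimensional sketch of the Bayes least-squares idea described above, assuming a toy scale-mixture model x = sqrt(z)·u with an arbitrary prior on the hidden multiplier z; the grid, prior, and noise level below are illustrative choices, not the ones used in the paper.

```python
import numpy as np

# Toy 1-D illustration of the Bayes least-squares estimate under a Gaussian
# scale mixture: x = sqrt(z) * u with u ~ N(0, 1), observed y = x + noise.
# The estimate is a weighted average of linear (Wiener) estimates over a grid
# of hidden-multiplier values z.

sigma_n = 0.5                       # noise standard deviation (illustrative)
z_grid = np.linspace(0.01, 10.0, 500)
prior_z = np.exp(-z_grid)           # illustrative exponential prior on z
prior_z /= prior_z.sum()

def bls_estimate(y):
    # likelihood of y for each z: y | z ~ N(0, z + sigma_n^2)
    var_y = z_grid + sigma_n ** 2
    lik = np.exp(-0.5 * y ** 2 / var_y) / np.sqrt(2 * np.pi * var_y)
    post = lik * prior_z
    post /= post.sum()
    # Wiener estimate of x given z, averaged under the posterior over z
    wiener = (z_grid / var_y) * y
    return np.sum(post * wiener)

print(bls_estimate(2.0))
```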


IEEE Transactions on Information Theory | 2010

Network Coding for Distributed Storage Systems

Alexandros G. Dimakis; P. Brighten Godfrey; Yunnan Wu; Martin J. Wainwright; Kannan Ramchandran

Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a single node failure is for a new node to reconstruct the whole encoded data object to generate just one encoded block. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to communicate functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
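
The sketch below evaluates the two extreme points of the storage/repair-bandwidth tradeoff (minimum-storage and minimum-bandwidth regenerating codes) using the closed forms standard in the regenerating-codes literature; the parameter names M, k, d and the exact constants are stated as assumptions to check against the paper rather than taken as its notation.

```python
# Rough sketch of the two extreme points of the storage/repair-bandwidth
# tradeoff for regenerating codes, using the standard closed forms from the
# regenerating-codes literature. M = file size, k = nodes needed to recover
# the file, d = helper nodes contacted during a repair. Treat the exact
# constants as assumptions to verify against the paper.

def msr_point(M: float, k: int, d: int) -> tuple[float, float]:
    """Minimum-storage regenerating point: (per-node storage, repair bandwidth)."""
    alpha = M / k
    gamma = M * d / (k * (d - k + 1))
    return alpha, gamma

def mbr_point(M: float, k: int, d: int) -> tuple[float, float]:
    """Minimum-bandwidth regenerating point: storage equals repair bandwidth."""
    alpha = 2 * M * d / (k * (2 * d - k + 1))
    return alpha, alpha

print(msr_point(M=1.0, k=5, d=9))   # e.g. (0.2, 0.36)
print(mbr_point(M=1.0, k=5, d=9))   # e.g. (0.257..., 0.257...)
```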


IEEE Transactions on Information Theory | 2009

Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using ℓ1-Constrained Quadratic Programming (Lasso)

Martin J. Wainwright

The problem of consistently estimating the sparsity pattern of a vector β* ∈ ℝ^p based on observations contaminated by noise arises in various contexts, including signal denoising, sparse approximation, compressed sensing, and model selection. We analyze the behavior of ℓ1-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern. Our main result is to establish precise conditions on the problem dimension p, the number k of nonzero elements in β*, and the number of observations n that are necessary and sufficient for sparsity pattern recovery using the Lasso. We first analyze the case of observations made using deterministic design matrices and sub-Gaussian additive noise, and provide sufficient conditions for support recovery and ℓ∞-error bounds, as well as results showing the necessity of incoherence and bounds on the minimum value. We then turn to the case of random designs, in which each row of the design is drawn from an N(0, Σ) ensemble. For a broad class of Gaussian ensembles satisfying mutual incoherence conditions, we compute explicit values of thresholds 0 < θ_ℓ(Σ) ≤ θ_u(Σ) < +∞ with the following properties: for any δ > 0, if n > 2(θ_u + δ) k log(p - k), then the Lasso succeeds in recovering the sparsity pattern with probability converging to one for large problems, whereas if n < 2(θ_ℓ - δ) k log(p - k), the probability of successful recovery converges to zero. For the special case of the uniform Gaussian ensemble (Σ = I_{p×p}), we show that θ_ℓ = θ_u = 1, so that the precise threshold n = 2k log(p - k) is exactly determined.
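
A quick simulation in the spirit of the threshold result, assuming the standard Gaussian ensemble: it estimates the probability of exact support recovery as n passes 2k log(p - k). The regularization level, noise scale, and trial counts are ad hoc choices, and sklearn's penalized Lasso is used as a stand-in solver rather than the exact sequences analyzed in the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Estimate the probability of exact support recovery around n = 2 k log(p - k)
# for a standard Gaussian design. All constants are illustrative.

rng = np.random.default_rng(0)
p, k = 512, 8
support = rng.choice(p, size=k, replace=False)
beta = np.zeros(p)
beta[support] = 1.0  # well-separated nonzero values

def recovery_rate(n, trials=20):
    hits = 0
    for _ in range(trials):
        X = rng.standard_normal((n, p))
        y = X @ beta + 0.5 * rng.standard_normal(n)
        lam = 2.0 * np.sqrt(np.log(p) / n)            # ~ sqrt(log p / n) scaling
        w = Lasso(alpha=lam, max_iter=10000).fit(X, y).coef_
        hits += set(np.flatnonzero(w)) == set(support)
    return hits / trials

threshold = int(2 * k * np.log(p - k))
for n in (threshold // 2, threshold, 2 * threshold):
    print(n, recovery_rate(n))
```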


Neural Information Processing Systems | 2009

A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers

Sahand Negahban; Bin Yu; Martin J. Wainwright; Pradeep Ravikumar

High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of structure (e.g., sparse vectors; block-structured matrices; low-rank matrices; Markov assumptions). In such settings, a general approach to estimation is to solve a regularized convex program (known as a regularized M-estimator) which combines a loss function (measuring how well the model fits the data) with some regularization function that encourages the assumed structure. The goal of this paper is to provide a unified framework for establishing consistency and convergence rates for such regularized M-estimators under high-dimensional scaling. We state one main theorem and show how it can be used to re-derive several existing results, and also to obtain several new results on consistency and convergence rates. Our analysis also identifies two key properties of loss and regularization functions, referred to as restricted strong convexity and decomposability, that ensure the corresponding regularized M-estimators have fast convergence rates.
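
One concrete instance of the framework, sketched below: squared-error loss plus the decomposable ℓ1 regularizer, minimized by plain proximal gradient descent (ISTA). Step size, penalty level, and the synthetic data are illustrative choices, not values prescribed by the paper.

```python
import numpy as np

# A regularized M-estimator: loss (1/2n)||y - Xw||^2 plus lam * ||w||_1,
# minimized by proximal gradient descent (ISTA). Illustrative only.

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iters=500):
    n = X.shape[0]
    lipschitz = np.linalg.norm(X, 2) ** 2 / n       # Lipschitz constant of the loss gradient
    step = 1.0 / lipschitz
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / n                # gradient of the squared-error loss
        w = soft_threshold(w - step * grad, step * lam)  # prox step for lam * ||w||_1
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 200))
w_true = np.zeros(200)
w_true[:5] = 3.0
y = X @ w_true + 0.1 * rng.standard_normal(100)
print("estimated support:", np.flatnonzero(lasso_ista(X, y, lam=0.3)))
```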


IEEE Transactions on Information Theory | 2005

MAP estimation via agreement on trees: message-passing and linear programming

Martin J. Wainwright; Tommi S. Jaakkola; Alan S. Willsky

We develop and analyze methods for computing provably optimal maximum a posteriori probability (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of tree-structured distributions, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is tight if and only if all the tree distributions share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original distribution. Next we develop two approaches to attempting to obtain tight upper bounds: a) a tree-relaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds; and b) a tree-reweighted max-product message-passing algorithm that is related to but distinct from the max-product algorithm. In this way, we establish a connection between a certain LP relaxation of the mode-finding problem and a reweighted form of the max-product (min-sum) message-passing algorithm.
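
To illustrate just the message-passing ingredient, the sketch below runs ordinary max-product (a Viterbi-style dynamic program) on a chain, where it is exact; the tree-reweighted updates and the LP relaxation developed in the paper are not implemented here, and the potentials are arbitrary.

```python
import numpy as np

# Minimal max-product (Viterbi-style) dynamic program on a chain MRF, where
# max-product computes an exact MAP configuration. Potentials are arbitrary.

rng = np.random.default_rng(2)
T, S = 6, 3                                   # chain length, states per node
unary = rng.normal(size=(T, S))               # log unary potentials
pair = rng.normal(size=(S, S))                # shared log pairwise potential

# forward pass: msg[t, j] = best log-score of x_0..x_t with x_t = j
msg = np.zeros((T, S))
back = np.zeros((T, S), dtype=int)
msg[0] = unary[0]
for t in range(1, T):
    scores = msg[t - 1][:, None] + pair + unary[t][None, :]
    back[t] = scores.argmax(axis=0)
    msg[t] = scores.max(axis=0)

# backward pass: trace the argmaxes to read off a MAP configuration
x = np.zeros(T, dtype=int)
x[-1] = msg[-1].argmax()
for t in range(T - 1, 0, -1):
    x[t - 1] = back[t, x[t]]
print("MAP configuration:", x)
```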


Annals of Statistics | 2010

High-dimensional Ising model selection using ℓ1-regularized logistic regression

Pradeep Ravikumar; Martin J. Wainwright; John D. Lafferty

We consider the problem of estimating the graph associated with a binary Ising Markov random field. We describe a method based on ℓ1-regularized logistic regression, in which the neighborhood of any given node is estimated by performing logistic regression subject to an ℓ1-constraint.
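
A rough sketch of the neighborhood-selection idea, assuming synthetic data from a small chain-structured Ising model: each node is regressed on the others with ℓ1-penalized logistic regression (sklearn's penalized form rather than the constrained form), and edges are kept where coefficients are nonzero. The Gibbs sampler, penalty level, and AND rule for symmetrizing are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Neighborhood-based graph estimation: for each node, run l1-penalized
# logistic regression on the remaining nodes and keep nonzero coefficients
# as estimated neighbors. Data come from a crude Gibbs sampler on a chain.

rng = np.random.default_rng(3)
n_samples, p, coupling = 2000, 6, 0.8
x = rng.choice([-1, 1], size=(n_samples, p))   # each row is an independent chain
for _ in range(30):                            # a few Gibbs sweeps per sample
    for i in range(p):
        field = np.zeros(n_samples)
        if i > 0:
            field += coupling * x[:, i - 1]
        if i < p - 1:
            field += coupling * x[:, i + 1]
        prob = 1.0 / (1.0 + np.exp(-2.0 * field))   # P(x_i = +1 | neighbors)
        x[:, i] = np.where(rng.random(n_samples) < prob, 1, -1)

adj = np.zeros((p, p), dtype=bool)
for i in range(p):
    others = np.delete(np.arange(p), i)
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    clf.fit(x[:, others], x[:, i])
    adj[i, others] = np.abs(clf.coef_[0]) > 1e-6

# AND rule: keep an edge only if both endpoints select each other
print("estimated edges:", np.argwhere(np.triu(adj & adj.T, 1)))
```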


IEEE Transactions on Automatic Control | 2012

Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling

John C. Duchi; Alekh Agarwal; Martin J. Wainwright



Electronic Journal of Statistics | 2011

High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence

Pradeep Ravikumar; Martin J. Wainwright; Garvesh Raskutti; Bin Yu



IEEE Transactions on Information Theory | 2005

A new class of upper bounds on the log partition function

Martin J. Wainwright; Tommi S. Jaakkola; Alan S. Willsky


Collaboration


Dive into Martin J. Wainwright's collaborations.

Top Co-Authors

Alan S. Willsky
Massachusetts Institute of Technology

Bin Yu
University of California

Aaditya Ramdas
Carnegie Mellon University

Tommi S. Jaakkola
Massachusetts Institute of Technology

Yuchen Zhang
University of California