Mingyuan Zhou
University of Texas at Austin
Publications
Featured research published by Mingyuan Zhou.
IEEE Transactions on Image Processing | 2012
Mingyuan Zhou; Haojun Chen; John Paisley; Lu Ren; Lingbo Li; Zhengming Xing; David B. Dunson; Guillermo Sapiro; Lawrence Carin
Nonparametric Bayesian methods are considered for recovery of imagery based upon compressive, incomplete, and/or noisy measurements. A truncated beta-Bernoulli process is employed to infer an appropriate dictionary for the data under test and also for image recovery. In the context of compressive sensing, significant improvements in image recovery are manifested using learned dictionaries, relative to using standard orthonormal image expansions. The compressive-measurement projections are also optimized for the learned dictionary. Additionally, we consider simpler (incomplete) measurements, defined by measuring a subset of image pixels, uniformly selected at random. Spatial interrelationships within imagery are exploited through use of the Dirichlet and probit stick-breaking processes. Several example results are presented, with comparisons to other methods in the literature.
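The beta-Bernoulli construction this abstract refers to can be written compactly. Below is a generic form of a truncated beta-Bernoulli dictionary model (the notation and hyperpriors are generic assumptions, not necessarily the paper's exact choices): each patch x_i is a sparse combination of K candidate dictionary atoms.

```latex
% Truncated beta-Bernoulli dictionary model (generic sketch):
x_i = D\,(z_i \odot s_i) + \varepsilon_i, \qquad
z_{ik} \sim \mathrm{Bernoulli}(\pi_k), \qquad
\pi_k \sim \mathrm{Beta}\!\left(\tfrac{a}{K},\, \tfrac{b(K-1)}{K}\right),
```

with Gaussian priors on the weights s_i and the noise ε_i. As K grows, this prior drives most π_k toward zero, so the number of dictionary atoms actually used is inferred from the data rather than fixed in advance.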
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015
Mingyuan Zhou; Lawrence Carin
The seemingly disjoint problems of count and mixture modeling are united under the negative binomial (NB) process. A gamma process is employed to model the rate measure of a Poisson process, whose normalization provides a random probability measure for mixture modeling and whose marginalization leads to an NB process for count modeling. A draw from the NB process consists of a Poisson distributed finite number of distinct atoms, each of which is associated with a logarithmic distributed number of data samples. We reveal relationships between various count- and mixture-modeling distributions and construct a Poisson-logarithmic bivariate distribution that connects the NB and Chinese restaurant table distributions. Fundamental properties of the models are developed, and we derive efficient Bayesian inference. It is shown that with augmentation and normalization, the NB process and gamma-NB process can be reduced to the Dirichlet process and hierarchical Dirichlet process, respectively. These relationships highlight theoretical, structural, and computational advantages of the NB process. A variety of NB processes, including the beta-geometric, beta-NB, marked-beta-NB, marked-gamma-NB and zero-inflated-NB processes, with distinct sharing mechanisms, are also constructed. These models are applied to topic modeling, with connections made to existing algorithms under Poisson factor analysis. Example results show the importance of inferring both the NB dispersion and probability parameters.
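The Poisson-logarithmic bivariate distribution mentioned above rests on the compound-Poisson representation of the negative binomial, a standard identity worth stating explicitly (generic notation):

```latex
m = \sum_{t=1}^{\ell} u_t, \qquad
u_t \overset{\mathrm{iid}}{\sim} \mathrm{Logarithmic}(p), \qquad
\ell \sim \mathrm{Poisson}\big(-r \ln(1-p)\big)
\;\Longrightarrow\; m \sim \mathrm{NB}(r, p).
```

The latent count ℓ plays the role of the table count in the Chinese restaurant table distribution; augmenting each NB count with its ℓ is what opens the door to closed-form Gibbs updates for the dispersion parameter r.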
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015
Gungor Polatkan; Mingyuan Zhou; Lawrence Carin; David M. Blei; Ingrid Daubechies
Super-resolution methods form high-resolution images from low-resolution images. In this paper, we develop a new Bayesian nonparametric model for super-resolution. Our method uses a beta-Bernoulli process to learn a set of recurring visual patterns, called dictionary elements, from the data. Because it is nonparametric, the number of elements found is also determined from the data. We test the results on both benchmark and natural images, comparing with several other models from the research literature. We perform large-scale human evaluation experiments to assess the visual quality of the results. In a first implementation, we use Gibbs sampling to approximate the posterior. However, this algorithm is not feasible for large-scale data. To circumvent this, we then develop an online variational Bayes (VB) algorithm. This algorithm finds high quality dictionaries in a fraction of the time needed by the Gibbs sampler.
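The online VB algorithm follows the general pattern of stochastic variational inference: each mini-batch yields an intermediate estimate of the global variational parameters, which is blended into the running estimate with a decaying step size (the Robbins-Monro schedule below is the standard choice, stated here as an assumption rather than the paper's exact schedule):

```latex
\lambda^{(t)} = (1 - \rho_t)\,\lambda^{(t-1)} + \rho_t\,\hat{\lambda}_t,
\qquad \rho_t = (t + \tau)^{-\kappa}, \quad \kappa \in (0.5, 1].
```

Since each update touches only one mini-batch, the per-step cost is independent of the dataset size, which is what lets the online algorithm find high-quality dictionaries in a fraction of the Gibbs sampler's time.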
Sensor Array and Multichannel Signal Processing Workshop | 2010
Mingyuan Zhou; Chunping Wang; Minhua Chen; John Paisley; David B. Dunson; Lawrence Carin
Beta-binomial processes are considered for inferring missing values in matrices. The model moves beyond the low-rank assumption, modeling the matrix columns as residing in a nonlinear subspace. Large-scale problems are considered via efficient Gibbs sampling, yielding predictions as well as a measure of confidence in each prediction. Algorithm performance is evaluated on several datasets, with encouraging results relative to existing approaches.
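The point that Gibbs sampling yields both predictions and per-entry confidence comes down to aggregating posterior draws of the completed matrix. A minimal sketch (the draws are faked here so the snippet runs standalone; a real sampler would supply them):

```python
import numpy as np

# Suppose `samples` holds S posterior draws of the completed matrix,
# one (n x m) matrix per retained Gibbs iteration. Faked here with
# Gaussian noise purely so the sketch is runnable.
S, n, m = 200, 10, 8
samples = np.random.default_rng(0).normal(size=(S, n, m))

# Posterior mean is the prediction for each missing entry; the
# posterior standard deviation is a per-entry confidence measure.
prediction = samples.mean(axis=0)
uncertainty = samples.std(axis=0)

# Entries with large posterior std are the ones the model is least
# sure about, e.g. candidates for flagging or further measurement.
least_confident = np.unravel_index(uncertainty.argmax(), uncertainty.shape)
print(prediction.shape, least_confident)
```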
IEEE Transactions on Biomedical Engineering | 2014
David E. Carlson; Joshua T. Vogelstein; Qisong Wu; Wenzhao Lian; Mingyuan Zhou; Colin R. Stoetzner; Daryl R. Kipke; Douglas J. Weber; David B. Dunson; Lawrence Carin
We propose a methodology for joint feature learning and clustering of multichannel extracellular electrophysiological data, across multiple recording periods, for action potential detection and classification (sorting). Our methodology improves over the previous state of the art principally in four ways. First, by sharing information across channels, we can better distinguish between single-unit spikes and artifacts. Second, our proposed “focused mixture model” (FMM) deals with units appearing, disappearing, or reappearing over multiple recording days, an important consideration for any chronic experiment. Third, by jointly learning features and clusters, we improve performance over previous attempts that proceeded via a two-stage learning process. Fourth, by directly modeling spike rate, we improve the detection of sparsely firing neurons. Moreover, our Bayesian methodology seamlessly handles missing data. We demonstrate state-of-the-art performance without manual tuning of hyperparameters, on both a public dataset with partial ground truth and a new experimental dataset.
Journal of the American Statistical Association | 2016
Mingyuan Zhou; Oscar Hernan Madrid Padilla; James G. Scott
We define a family of probability distributions for random count matrices with a potentially unbounded number of rows and columns. The three distributions we consider are derived from the gamma-Poisson, gamma-negative binomial, and beta-negative binomial processes, which we refer to generically as a family of negative-binomial processes. Because the models lead to closed-form update equations within the context of a Gibbs sampler, they are natural candidates for nonparametric Bayesian priors over count matrices. A key aspect of our analysis is the recognition that although the random count matrices within the family are defined by a row-wise construction, their columns can be shown to be independent and identically distributed (iid). This fact is used to derive explicit formulas for drawing all the columns at once. Moreover, by analyzing these matrices’ combinatorial structure, we describe how to sequentially construct a column-iid random count matrix one row at a time, and derive the predictive distribution of a new row count vector with previously unseen features. We describe the similarities and differences between the three priors, and argue that the greater flexibility of the gamma- and beta-negative binomial processes, especially their ability to model over-dispersed, heavy-tailed count data, makes them well suited to a wide variety of real-world applications. As an example of our framework, we construct a naive-Bayes text classifier to categorize a count vector into one of several existing random count matrices of different categories. The classifier supports an unbounded number of features and, unlike most existing methods, does not require a predefined finite vocabulary to be shared by all the categories, and needs neither feature selection nor parameter tuning. Both the gamma- and beta-negative binomial processes are shown to significantly outperform the gamma-Poisson process when applied to document categorization, with comparable performance to other state-of-the-art supervised text classification algorithms. Supplementary materials for this article are available online.
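The naive-Bayes classifier described above amounts to scoring a new count vector by its predictive log-probability under each category's count matrix and picking the best. The sketch below uses a Dirichlet-multinomial predictive over a fixed vocabulary as a simple stand-in; the paper's gamma/beta-negative-binomial predictives, which additionally handle previously unseen features, are not reproduced here.

```python
import numpy as np
from scipy.special import gammaln

def dm_loglik(x, counts, alpha=0.5):
    """Log predictive of count vector x under a category's aggregated
    counts, via a Dirichlet-multinomial (the multinomial coefficient
    is constant across categories and omitted)."""
    c = counts + alpha                          # smoothed category counts
    return (gammaln(c.sum()) - gammaln(c.sum() + x.sum())
            + np.sum(gammaln(c + x) - gammaln(c)))

# Toy corpus: per-category aggregated word counts on a shared vocabulary.
rng = np.random.default_rng(1)
categories = {k: rng.integers(0, 50, size=20) for k in ("sports", "politics")}
new_doc = rng.integers(0, 5, size=20)

scores = {k: dm_loglik(new_doc, v) for k, v in categories.items()}
print(max(scores, key=scores.get))              # most probable category
```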
Journal of the American Statistical Association | 2018
Siamak Zamani Dadaneh; Xiaoning Qian; Mingyuan Zhou
We perform differential expression analysis of high-throughput sequencing count data under a Bayesian nonparametric framework, removing sophisticated ad hoc pre-processing steps commonly required in existing algorithms. We propose to use the gamma (beta) negative binomial process, which takes into account different sequencing depths using sample-specific negative binomial probability (dispersion) parameters, to detect differentially expressed genes by comparing the posterior distributions of gene-specific negative binomial dispersion (probability) parameters. These model parameters are inferred by borrowing statistical strength across both the genes and samples. Extensive experiments on both simulated and real-world RNA sequencing count data show that the proposed differential expression analysis algorithms clearly outperform previously proposed ones in terms of the areas under both the receiver operating characteristic and precision-recall curves. Supplementary materials for this article are available online.
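In generic notation, the modeling idea is that the count for gene g in sample j follows a negative binomial whose sample-specific parameter absorbs sequencing depth, leaving the gene-specific parameter to carry condition-level signal. A schematic of the resulting decision rule (our paraphrase; the paper's exact construction and test statistic may differ):

```latex
n_{gj} \sim \mathrm{NB}\big(r_g^{(c_j)},\, p_j\big), \qquad
\text{flag gene } g \text{ as differentially expressed if }
\Pr\big(r_g^{(1)} > r_g^{(2)} \mid \text{data}\big)
\text{ is near } 0 \text{ or } 1,
```

where c_j ∈ {1, 2} is the condition of sample j and the tail probability is estimated from posterior samples of the gene-specific parameters.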
International Conference on Acoustics, Speech, and Signal Processing | 2012
Lingbo Li; Jorge Silva; Mingyuan Zhou; Lawrence Carin
The problem of learning a data-adaptive dictionary for a very large collection of signals is addressed. This paper proposes a statistical model and associated variational Bayesian (VB) inference for simultaneously learning the dictionary and performing sparse coding of the signals. The model builds upon beta process factor analysis (BPFA), with the number of factors automatically inferred, and posterior distributions are estimated for both the dictionary and the signals. Crucially, an online learning procedure is employed, allowing scalability to very large datasets which would be beyond the capabilities of existing batch methods. State-of-the-art performance is demonstrated by experiments with large natural images containing tens of millions of pixels.
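The stochastic update sketched after the super-resolution abstract above can be made concrete on a toy conjugate model. The snippet below runs online VB for a Beta-Bernoulli coin rather than for beta process factor analysis; the model, batch size, and step-size schedule are all our simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random(100_000) < 0.3        # stream of Bernoulli observations
N = data.size

lam = np.array([1.0, 1.0])              # Beta(alpha, beta) global params
tau, kappa, batch = 1.0, 0.7, 100       # Robbins-Monro step-size schedule

for t, start in enumerate(range(0, N, batch), start=1):
    x = data[start:start + batch]
    # Intermediate estimate: treat this mini-batch, rescaled to the
    # full data size, as if it were the whole dataset (conjugate update).
    scale = N / x.size
    lam_hat = np.array([1.0 + scale * x.sum(),
                        1.0 + scale * (x.size - x.sum())])
    rho = (t + tau) ** (-kappa)         # decaying step size
    lam = (1.0 - rho) * lam + rho * lam_hat

print(lam[0] / lam.sum())               # posterior mean, approx 0.3
```

Each step costs one mini-batch regardless of N, which is the property that makes this family of algorithms viable for images with tens of millions of pixels.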
International Conference on Acoustics, Speech, and Signal Processing | 2011
Lingbo Li; Mingyuan Zhou; Eric Wang; Lawrence Carin
A new Bayesian model is proposed, integrating dictionary learning and topic modeling into a unified framework. The model is applied to cluster multiple images, and a subset of the images may be annotated. Example results are presented on the MNIST digit data and on the Microsoft MSRC multi-scene image data. These results reveal the working mechanisms of the model and demonstrate state-of-the-art performance.
International Conference on Image Processing | 2010
John Paisley; Mingyuan Zhou; Guillermo Sapiro; Lawrence Carin
We present a Bayesian model for image interpolation and dictionary learning that uses two nonparametric priors for sparse signal representations: the beta process and the Dirichlet process. Additionally, the model uses spatial information within the image to encourage sharing of information within image subregions. We derive a hybrid MAP/Gibbs sampler, which performs Gibbs sampling for the latent indicator variables and MAP estimation for all other parameters. We present experimental results, where we show an improvement over other state-of-the-art algorithms in the low-measurement regime.
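The hybrid MAP/Gibbs idea can be illustrated on a toy spike-and-slab sparse-coding problem: binary atom indicators are resampled by Gibbs, then the continuous weights are point-estimated on the active set. Everything below (the fixed dictionary, known noise level, and ridge term) is our simplification, not the paper's beta/Dirichlet-process model.

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)
P, K, sigma, pi = 16, 8, 0.1, 0.3
D = rng.normal(size=(P, K)) / np.sqrt(P)        # fixed toy dictionary
z_true = rng.random(K) < pi
y = D @ (z_true * rng.normal(size=K)) + sigma * rng.normal(size=P)

z = np.zeros(K, dtype=bool)                     # atom indicators
s = np.zeros(K)                                 # atom weights

def loglik(zv, sv):
    r = y - D @ (zv * sv)
    return -0.5 * (r @ r) / sigma**2

for it in range(200):
    # Gibbs step: resample each indicator from its full conditional,
    # holding the current weights fixed.
    for k in range(K):
        lp = np.empty(2)
        for val in (0, 1):
            z[k] = bool(val)
            lp[val] = loglik(z, s) + np.log(pi if val else 1 - pi)
        z[k] = rng.random() < expit(lp[1] - lp[0])
    # MAP step: ridge least squares on the active atoms only.
    act = np.flatnonzero(z)
    if act.size:
        Da = D[:, act]
        s[act] = np.linalg.solve(Da.T @ Da + 1e-2 * np.eye(act.size),
                                 Da.T @ y)

print(z.astype(int), np.round(z * s, 2))
```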