Yizhe Zhang | Researchain

Publication

Featured research published by Yizhe Zhang.

Nucleic Acids Research | 2013

Composition-based classification of short metagenomic sequences elucidates the landscapes of taxonomic and functional enrichment of microorganisms

Jiemeng Liu; Haifeng Wang; Hongxing Yang; Yizhe Zhang; Jinfeng Wang; Fangqing Zhao; Ji Qi

Compared with traditional algorithms for long metagenomic sequence classification, characterizing microorganisms’ taxonomic and functional abundance based on tens of millions of very short reads is much more challenging. We describe an efficient composition- and phylogeny-based algorithm, Metagenome Composition Vector (MetaCV), to classify very short metagenomic reads (75–100 bp) into specific taxonomic and functional groups. We applied MetaCV to the Meta-HIT data (371 Gb of 75-bp reads from 109 human gut metagenomes), and this single-read-based, rather than assembly-based, classification characterizes the composition and structure of human gut microbiota at high resolution, especially for low-abundance species. Most strikingly, MetaCV took only 10 days to complete all the computation on a server with five 24-core nodes. To our knowledge, MetaCV, which benefits from the strategy of composition comparison, is the first algorithm that can classify millions of very short reads within affordable time.
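As a rough illustration of what "composition" means for a short read, the sketch below computes a normalized k-mer frequency vector for a nucleotide sequence. It is a generic example, not the MetaCV algorithm itself: the function name `composition_vector`, the choice of nucleotide k-mers, and the default `k=3` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def composition_vector(read, k=3):
    """Return the normalized k-mer frequencies of a read, in lexicographic order.

    Hypothetical helper for illustration; not taken from MetaCV.
    """
    alphabet = "ACGT"
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping window of length k in the read.
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    # Windows containing other symbols (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


vec = composition_vector("ACGTACGT", k=2)
```

Two reads can then be compared through any vector distance or similarity on their composition vectors, which avoids aligning the reads base by base.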

European Conference on Machine Learning | 2016

Laplacian Hamiltonian Monte Carlo

Yizhe Zhang; Changyou Chen; Ricardo Henao; Lawrence Carin

We propose a Hamiltonian Monte Carlo (HMC) method with Laplace kinetic energy, and demonstrate the connection between slice sampling and the proposed HMC method in one-dimensional cases. Based on this connection, one can perform slice sampling using a numerical integrator in an HMC fashion. We provide a theoretical analysis of the performance of such a sampler in several univariate cases. Furthermore, the proposed approach extends standard HMC by enabling sampling from discrete distributions. We compare our method with standard HMC on both synthetic and real data, and discuss its limitations and potential improvements.
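The idea of swapping the usual Gaussian kinetic energy for a Laplace one can be sketched for a univariate target as follows. This is a minimal sketch under stated assumptions, not the integrator from the paper: it assumes the kinetic energy K(p) = |p|, so the position moves at unit speed in the direction sign(p), and it uses a plain leapfrog scheme with a Metropolis correction. The names `laplace_hmc_1d`, `eps`, and `n_steps` are hypothetical.

```python
import math
import random


def sign(x):
    return (x > 0) - (x < 0)


def laplace_hmc_1d(U, grad_U, q0, n_samples=2000, eps=0.1, n_steps=10, seed=0):
    """Univariate HMC with assumed kinetic energy K(p) = |p| (illustrative only)."""
    rng = random.Random(seed)
    q = float(q0)
    samples = []
    for _ in range(n_samples):
        # Laplace(0, 1) momentum: exponential magnitude with a random sign.
        p = rng.expovariate(1.0) * rng.choice((-1.0, 1.0))
        # Jitter the step size; with a fixed step the position would stay on
        # the lattice q0 + k * eps because dq/dt = sign(p) is always +/-1.
        step_size = eps * rng.uniform(0.5, 1.5)
        q_new, p_new = q, p
        # Leapfrog: half step in momentum, full steps in position.
        p_new -= 0.5 * step_size * grad_U(q_new)
        for step in range(n_steps):
            q_new += step_size * sign(p_new)
            if step < n_steps - 1:
                p_new -= step_size * grad_U(q_new)
        p_new -= 0.5 * step_size * grad_U(q_new)
        # Metropolis correction with H(q, p) = U(q) + |p|.
        log_accept = (U(q) + abs(p)) - (U(q_new) + abs(p_new))
        if math.log(1.0 - rng.random()) < log_accept:
            q = q_new
        samples.append(q)
    return samples


# Example target: standard normal, U(q) = q^2 / 2.
draws = laplace_hmc_1d(lambda q: 0.5 * q * q, lambda q: q, q0=0.0)
```

Because the velocity depends only on the sign of the momentum, each trajectory sweeps the position back and forth at constant speed, which gives some intuition for the link to slice sampling that the abstract describes.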

International Conference on Data Mining | 2016

Triply Stochastic Variational Inference for Non-linear Beta Process Factor Analysis

Kai Fan; Yizhe Zhang; Ricardo Henao; Katherine A. Heller

We propose a non-linear extension to factor analysis with beta process priors for improved data representation ability. This non-linear Beta Process Factor Analysis (nBPFA) allows data to be represented as a non-linear transformation of a standard sparse factor decomposition. We develop a scalable variational inference framework, which builds upon the ideas of the variational auto-encoder, by allowing latent variables of the model to be sparse. Our framework can be readily used for real-valued, binary and count data. We show theoretically and with experiments that our training scheme, with additive or multiplicative noise on observations, improves performance and prevents overfitting. We benchmark our algorithms on image, text and collaborative filtering datasets. We demonstrate faster convergence rates and competitive performance compared to standard gradient-based approaches.

International Conference on Data Mining | 2016

Dynamic Poisson Factor Analysis

Yizhe Zhang; Yue Zhao; Lawrence A. David; Ricardo Henao; Lawrence Carin

We introduce a novel dynamic model for discrete time-series data, in which the temporal sampling may be nonuniform. The model is specified by constructing a hierarchy of Poisson factor analysis blocks, one for the transitions between latent states and the other for the emissions between latent states and observations. Latent variables are binary and linked to Poisson factor analysis via Bernoulli-Poisson specifications. The model is derived for count data but can be readily modified for binary observations. We derive efficient inference via Markov chain Monte Carlo, which scales with the number of non-zeros in the data and latent binary states, yielding significant acceleration compared to related models. Experimental results on benchmark data show that the proposed model achieves state-of-the-art predictive performance. Additional experiments on microbiome data demonstrate the applicability of the proposed model to interesting problems in computational biology where interpretability is of utmost importance.
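The Bernoulli-Poisson link mentioned above can be illustrated in isolation. In its usual form, a binary variable is obtained by thresholding a latent Poisson count, b = 1 if n >= 1 with n ~ Poisson(rate), so that P(b = 1) = 1 - exp(-rate). The sketch below shows only this link, not the dynamic model or its inference; the function names are hypothetical, and the Poisson sampler is Knuth's simple method, which is adequate for small rates.

```python
import math
import random


def bernoulli_poisson_prob(rate):
    """P(b = 1) when b = 1(n >= 1) and n ~ Poisson(rate)."""
    return -math.expm1(-rate)  # equals 1 - exp(-rate), stable for small rates


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def sample_binary(rate, rng):
    """Draw the binary variable by thresholding the latent count."""
    return 1 if sample_poisson(rate, rng) >= 1 else 0


rng = random.Random(0)
draws = [sample_binary(1.0, rng) for _ in range(5000)]
empirical = sum(draws) / len(draws)  # should be close to 1 - exp(-1)
```

A latent count is only needed where the binary variable is one, which is consistent with the abstract's remark that inference scales with the number of non-zeros.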

International Conference on Machine Learning | 2017