Publication


Featured research published by Chiyuan Zhang.


Knowledge Discovery and Data Mining | 2010

Unsupervised feature selection for multi-cluster data

Deng Cai; Chiyuan Zhang; Xiaofei He

In many data analysis tasks, one is often confronted with very high-dimensional data. Feature selection techniques are designed to find the relevant feature subset of the original features which can facilitate clustering, classification and retrieval. In this paper, we consider the feature selection problem in the unsupervised learning scenario, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. The feature selection problem is essentially a combinatorial optimization problem which is computationally expensive. Traditional unsupervised feature selection methods address this issue by selecting the top-ranked features based on certain scores computed independently for each feature. These approaches neglect the possible correlation between different features and thus cannot produce an optimal feature subset. Inspired by recent developments in manifold learning and L1-regularized models for subset selection, we propose in this paper a new approach, called Multi-Cluster Feature Selection (MCFS), for unsupervised feature selection. Specifically, we select those features such that the multi-cluster structure of the data can be best preserved. The corresponding optimization problem can be efficiently solved since it only involves a sparse eigen-problem and an L1-regularized least squares problem. Extensive experimental results over various real-life data sets have demonstrated the superiority of the proposed algorithm.
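The two stages named in the abstract above (an eigen-problem followed by L1-regularized least squares) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: a dense heat-kernel graph stands in for a sparse neighbor graph, the L1 problem is solved with plain proximal gradient descent (ISTA), and each feature is scored by its largest absolute coefficient. The function name `mcfs_scores`, its parameters, and the toy data are hypothetical.

```python
import numpy as np

def mcfs_scores(X, n_eigvecs=2, alpha=0.1, n_iter=300):
    """Score each column of X (samples x features); higher means more relevant."""
    n, d = X.shape
    # Step 1: dense heat-kernel affinity graph (assumption: replaces a sparse graph).
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq / np.median(sq))
    # Step 2: spectral embedding from the symmetric normalized graph Laplacian.
    inv_sqrt_deg = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(n) - inv_sqrt_deg[:, None] * W * inv_sqrt_deg[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    # Skip the trivial first eigenvector; rescaling gives solutions of L y = lambda D y.
    Y = inv_sqrt_deg[:, None] * vecs[:, 1:n_eigvecs + 1]
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    # Step 3: one L1-regularized least-squares fit per embedding column (ISTA).
    Xc = X - X.mean(axis=0)
    step = n / np.linalg.norm(Xc, 2) ** 2  # 1 / Lipschitz constant of the gradient
    A = np.zeros((d, n_eigvecs))
    for _ in range(n_iter):
        Z = A - step * (Xc.T @ (Xc @ A - Y)) / n
        A = np.sign(Z) * np.maximum(np.abs(Z) - step * alpha, 0.0)  # soft threshold
    # Step 4: a feature's score is its largest absolute coefficient.
    return np.abs(A).max(axis=1)

# Toy data (hypothetical): three clusters separated in features 0 and 1,
# while features 2 to 5 are pure noise.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [4.0, 7.0]])
X = rng.normal(size=(90, 6))
X[:, :2] += np.repeat(centers, 30, axis=0)
scores = mcfs_scores(X)
```

Because the regressions are fit jointly over all features, a redundant or noisy feature tends to receive a zero coefficient, which is the behavior that per-feature scoring cannot provide.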


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011

A Variance Minimization Criterion to Feature Selection Using Laplacian Regularization

Xiaofei He; Ming Ji; Chiyuan Zhang; Hujun Bao

In many information processing tasks, one is often confronted with very high-dimensional data. Feature selection techniques are designed to find the meaningful feature subset of the original features which can facilitate clustering, classification, and retrieval. In this paper, we consider the feature selection problem in unsupervised learning scenarios, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. Based on Laplacian regularized least squares, which finds a smooth function on the data manifold and minimizes the empirical loss, we propose two novel feature selection algorithms which aim to minimize the expected prediction error of the regularized regression model. Specifically, we select those features such that the size of the parameter covariance matrix of the regularized regression model is minimized. Motivated by experimental design, we use trace and determinant operators to measure the size of the covariance matrix. Efficient computational schemes are also introduced to solve the corresponding optimization problems. Extensive experimental results over various real-life data sets have demonstrated the superiority of the proposed algorithms.
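The trace and determinant measures mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the parameter covariance is taken to be proportional to the inverse of H = ZᵀZ + λ₁ZᵀLZ + λ₂I, where Z holds the selected feature columns and L is a graph Laplacian supplied by the caller, and a naive greedy search stands in for the efficient schemes the abstract refers to. The names `design_criteria`, `greedy_select`, `lam1`, and `lam2` are hypothetical.

```python
import numpy as np

def design_criteria(X, features, L=None, lam1=0.1, lam2=0.01):
    """Return (trace, log-determinant) of the assumed parameter covariance.

    X is samples x features; `features` lists the selected column indices.
    """
    Z = X[:, features]
    n = Z.shape[0]
    if L is None:
        L = np.zeros((n, n))  # no manifold term: reduces to ridge regression
    H = Z.T @ Z + lam1 * Z.T @ L @ Z + lam2 * np.eye(len(features))
    cov = np.linalg.inv(H)  # covariance up to the noise variance (assumption)
    _, logdet = np.linalg.slogdet(cov)
    return np.trace(cov), logdet

def greedy_select(X, k, **kwargs):
    """Greedily add the feature that minimizes the trace criterion."""
    chosen = []
    for _ in range(k):
        rest = [j for j in range(X.shape[1]) if j not in chosen]
        best = min(rest, key=lambda j: design_criteria(X, chosen + [j], **kwargs)[0])
        chosen.append(best)
    return chosen

# Toy usage: each column is scaled differently, so the criterion prefers
# the columns that carry the most energy.
X_demo = np.diag([1.0, 2.0, 3.0, 4.0])
picked = greedy_select(X_demo, 2, lam2=0.0)
```

The trace corresponds to A-optimality and the determinant to D-optimality in experimental design. Both values are only comparable between subsets of the same size, since adding a feature enlarges the covariance matrix.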


International Conference on Acoustics, Speech, and Signal Processing | 2014

A Deep Representation for Invariance and Music Classification

Chiyuan Zhang; Georgios Evangelopoulos; Stephen Voinea; Lorenzo Rosasco; Tomaso Poggio

Representations in the auditory cortex might be based on mechanisms similar to the visual ventral stream: modules for building invariance to transformations and multiple layers for compositionality and selectivity. In this paper we propose the use of such computational modules for extracting invariant and discriminative audio representations. Building on a theory of invariance in hierarchical architectures, we propose a novel, mid-level representation for acoustical signals, using the empirical distributions of projections on a set of templates and their transformations. Under the assumption that, by construction, this dictionary of templates is composed from similar classes, and samples the orbit of variance-inducing signal transformations (such as shift and scale), the resulting signature is theoretically guaranteed to be unique, invariant to transformations and stable to deformations. Modules of projection and pooling can then constitute layers of deep networks, for learning composite representations. We present the main theoretical and computational aspects of a framework for unsupervised learning of invariant audio representations, empirically evaluated on music genre classification.
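The projection-and-pooling idea described above can be sketched for the simplest case. The sketch assumes that circular shifts are the only transformation, that templates are random integer-valued vectors, and that pooling is a fixed-bin histogram. None of these choices come from the paper; the function name `signature` and its parameters are hypothetical.

```python
import numpy as np

def signature(x, templates, bins=8, value_range=(-150.0, 150.0)):
    """Concatenate, per template, a histogram of <x, g t> over all circular shifts g."""
    parts = []
    for t in templates:
        # Projections of x onto every circular shift of the template.
        proj = np.array([np.dot(x, np.roll(t, s)) for s in range(len(t))])
        hist, _ = np.histogram(proj, bins=bins, range=value_range)
        parts.append(hist / len(t))  # pooling step: empirical distribution
    return np.concatenate(parts)

# Toy usage: integer-valued signals keep the dot products exact, so the
# signatures of a signal and of its shifted copy can be compared directly.
rng = np.random.default_rng(0)
x = rng.integers(-3, 4, size=16).astype(float)
templates = [rng.integers(-3, 4, size=16).astype(float) for _ in range(2)]
sig = signature(x, templates)
sig_shifted = signature(np.roll(x, 5), templates)
```

Shifting the input only permutes the set of projections onto the shifted templates, so the histogram is unchanged. This is the sense in which such a signature is invariant to the transformations sampled by the template orbit.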


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

A-Optimal Projection for Image Representation

Xiaofei He; Chiyuan Zhang; Lijun Zhang; Xuelong Li

We consider the problem of image representation from the perspective of statistical design. Recent studies have shown that images are possibly sampled from a low-dimensional manifold despite the fact that the ambient space is usually very high-dimensional. Learning low-dimensional image representations is crucial for many image processing tasks such as recognition and retrieval. Most of the existing approaches for learning low-dimensional representations, such as principal component analysis (PCA) and locality preserving projections (LPP), aim at discovering the geometrical or discriminant structures in the data. In this paper, we take a different perspective from statistical experimental design, and propose a novel dimensionality reduction algorithm called A-Optimal Projection (AOP). AOP is based on a linear regression model. Specifically, AOP finds the optimal basis functions so that the expected prediction error of the regression model can be minimized if the new representations are used for training the model. Experimental results suggest that the proposed approach provides a better representation and achieves higher accuracy in image retrieval.


IEEE Transactions on Circuits and Systems for Video Technology | 2013

Image Compression by Learning to Minimize the Total Error

Chiyuan Zhang; Xiaofei He

In this paper, we consider the problem of lossy image compression. Recently, machine learning techniques have been introduced as effective mechanisms for image compression. The compression involves storing only the grayscale image and a few carefully selected color pixel seeds. For decompression, regression models are learned with the stored data to predict the missing colors. This reduces image compression to standard active learning and semisupervised learning problems. In this paper, we propose a novel algorithm that makes use of all the colors (instead of only the colors of the selected seeds) available during the encoding stage. By minimizing the total color prediction error, our method can achieve a better compression ratio and better colorization quality than previous methods. The experimental results demonstrate the effectiveness of our proposed algorithm.


Sino-Foreign-Interchange Conference on Intelligent Science and Intelligent Data Engineering | 2011

Orthogonal projection analysis

Binbin Lin; Chiyuan Zhang; Xiaofei He

In this paper, we propose a novel linear dimensionality reduction algorithm, called Orthogonal Projection Analysis (OPA), from a gradient field perspective. Our approach is based on the following two criteria. First, the linear map should preserve the metric of the ambient space, which is based on the assumption that the metric of the ambient space is reliable. The second is the well-known smoothness criterion, which is critical for clustering. Interestingly, the gradient field is a natural tool for connecting these two requirements. We give a continuous objective function based on gradient fields and discuss how to discretize it using tangent spaces. We also show the geometric meaning of our approach, which requires the gradient field to be as orthogonal as possible to the tangent spaces. The experimental results have demonstrated the effectiveness of our proposed approach.


arXiv: Distributed, Parallel, and Cluster Computing | 2015

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

Tianqi Chen; Mu Li; Yutian Li; Min Lin; Naiyan Wang; Minjie Wang; Tianjun Xiao; Bing Xu; Chiyuan Zhang; Zheng Zhang


International Conference on Learning Representations | 2017

Understanding deep learning requires rethinking generalization

Chiyuan Zhang; Samy Bengio; Moritz Hardt; Benjamin Recht; Oriol Vinyals


Neural Information Processing Systems | 2015

Learning with a Wasserstein loss

Charlie Frogner; Chiyuan Zhang; Hossein Mobahi; Mauricio Araya-Polo; Tomaso Poggio


Geophysics | 2017

Automated fault detection without seismic processing

Mauricio Araya-Polo; Taylor Dahlke; Charlie Frogner; Chiyuan Zhang; Tomaso Poggio; Detlef Hohl

Collaboration


Top co-authors of Chiyuan Zhang.

Tomaso Poggio, Massachusetts Institute of Technology
Lorenzo Rosasco, Massachusetts Institute of Technology
Stephen Voinea, Massachusetts Institute of Technology
Charlie Frogner, Massachusetts Institute of Technology
Alexander Rakhlin, University of Pennsylvania
Brando Miranda, Massachusetts Institute of Technology