Tianyi Zhou
University of Technology, Sydney
Publications
Featured research published by Tianyi Zhou.
IEEE Transactions on Image Processing | 2013
Tianyi Zhou; Dacheng Tao
Learning tasks such as classification and clustering usually perform better and cost less (in time and space) on compressed representations than on the original data. Previous work mainly compresses data via dimension reduction. In this paper, we propose “double shrinking” to compress image data in both dimensionality and cardinality, by building either sparse low-dimensional representations or a sparse projection matrix for dimension reduction. We formulate the double shrinking model (DSM) as an $\ell_1$-regularized variance maximization with the constraint $\|x\|_2 = 1$, and develop a double shrinking algorithm (DSA) to optimize DSM. DSA is a path-following algorithm that builds the whole solution path of locally optimal solutions at different sparsity levels. Each solution on the path is a “warm start” for searching the next, sparser one. In each iteration of DSA, the direction, the step size, and the Lagrangian multiplier are deduced from the Karush-Kuhn-Tucker conditions. The magnitudes of trivial variables are shrunk while the importance of critical variables is simultaneously augmented along the selected direction with the determined step length. Double shrinking can be applied to manifold learning and feature selection for better interpretation of features, and can be combined with classification and clustering to boost their performance. The experimental results suggest that double shrinking produces efficient and effective data compression.
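The shrink-and-renormalize step at the heart of DSM can be illustrated with a generic thresholded power iteration for the same $\ell_1$-regularized variance maximization. The sketch below is a hedged stand-in for the paper's path-following DSA, not a reimplementation of it; the function name and the fixed sparsity weight `lam` are illustrative assumptions.

```python
import numpy as np

def sparse_direction(X, lam=0.1, iters=200, seed=0):
    """Approximate  max_w  w' C w - lam * ||w||_1  s.t. ||w||_2 = 1
    by a shrink-and-renormalize power iteration (a generic stand-in
    for the paper's path-following DSA)."""
    rng = np.random.default_rng(seed)
    C = np.cov(X, rowvar=False)                 # sample covariance of the data
    w = rng.standard_normal(C.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(iters):
        g = C @ w                               # ascent direction of the variance term
        w = np.sign(g) * np.maximum(np.abs(g) - lam, 0.0)  # shrink trivial coordinates
        n = np.linalg.norm(w)
        if n == 0:                              # lam too large: everything was shrunk away
            return w
        w /= n                                  # enforce the ||w||_2 = 1 constraint
    return w
```

Sweeping `lam` from large to small, each solution warm-starting the next, mimics the solution-path idea described in the abstract.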
international symposium on information theory | 2012
Tianyi Zhou; Dacheng Tao
Low-rank structures have been extensively studied in data mining and machine learning. In this paper, we show that a low-rank approximation of a dense matrix $X$ can be rapidly built from its left and right random projections $Y_1 = XA_1$ and $Y_2 = X^T A_2$, termed bilateral random projections (BRP). We then show that a power scheme can further improve the precision. The deterministic, average, and deviation bounds of the proposed method and its power-scheme modification are proved theoretically. The effectiveness and efficiency of BRP-based low-rank approximation are empirically verified on both artificial and real datasets.
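The basic BRP construction is short enough to state directly. The sketch below implements $L = Y_1 (A_2^T Y_1)^{-1} Y_2^T$ from the two projections named in the abstract; the Gaussian test matrices and the omission of the power-scheme refinement are simplifying assumptions, not the paper's exact recipe.

```python
import numpy as np

def brp_lowrank(X, r, seed=0):
    """Rank-r approximation of X from bilateral random projections:
    Y1 = X A1, Y2 = X^T A2, then L = Y1 (A2^T Y1)^{-1} Y2^T.
    Minimal sketch; the paper's power scheme is omitted."""
    m, n = X.shape
    rng = np.random.default_rng(seed)
    A1 = rng.standard_normal((n, r))    # right test matrix
    A2 = rng.standard_normal((m, r))    # left test matrix
    Y1 = X @ A1                         # left random projection  (m x r)
    Y2 = X.T @ A2                       # right random projection (n x r)
    # exact (up to round-off) whenever rank(X) <= r
    return Y1 @ np.linalg.solve(A2.T @ Y1, Y2.T)

# e.g. a rank-5 matrix is recovered from just 2*5 projected columns:
# X = np.random.randn(500, 5) @ np.random.randn(5, 300)
# rel_err = np.linalg.norm(X - brp_lowrank(X, 5)) / np.linalg.norm(X)
```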
international conference on data mining | 2013
Tianyi Zhou; Wei Bian; Dacheng Tao
Nonnegative matrix factorization (NMF) becomes tractable in polynomial time with a unique solution under the separability assumption, which postulates that all the data points are contained in the conical hull of a few anchor data points. Recently developed linear programming and greedy pursuit methods can pick out the anchors from noisy data and result in a near-separable NMF, but their efficiency can be seriously weakened in high dimensions. In this paper, we show that the anchors can be precisely located from the low-dimensional geometry of the data points even when their high-dimensional features suffer from serious incompleteness. Our framework, entitled divide-and-conquer anchoring (DCA), divides the high-dimensional anchoring problem into a few cheaper sub-problems that seek anchors of data projections in low-dimensional random spaces, which can be solved in parallel by any near-separable NMF, and combines all the detected low-dimensional anchors via fast hypothesis testing to identify the original anchors. We further develop two non-iterative anchoring algorithms in 1D and 2D spaces for data in a convex hull and a conical hull, respectively. These two rapid algorithms in ultra-low dimensions suffice to generate a robust and efficient near-separable NMF for high-dimensional or incomplete data via DCA. Compared to existing methods, two vital advantages of DCA are its scalability to big data and its capability of handling incomplete, high-dimensional, noisy data. A rigorous analysis proves that DCA is able to find the correct anchors of a rank-$k$ matrix by solving $\mathcal{O}(k \log k)$ sub-problems. Finally, we show that DCA outperforms state-of-the-art methods on various datasets and tasks.
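For the convex-hull case the 1-D sub-problems are trivial: the two extremes of any 1-D random projection must be hull vertices. The sketch below combines many such sub-problems by simple vote counting, which is a simplified stand-in for the paper's hypothesis-testing combination step; the function name and vote threshold are illustrative.

```python
import numpy as np

def dca_convex_anchors(X, k, n_proj=200, seed=0):
    """Divide-and-conquer anchoring sketch (convex-hull case):
    each 1-D projection is solved exactly (its min and max are anchors
    of that sub-problem); detected indices are combined by vote counting."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    votes = np.zeros(n, dtype=int)
    for _ in range(n_proj):
        p = X @ rng.standard_normal(d)   # project all points to a random 1-D space
        votes[np.argmin(p)] += 1         # both 1-D extremes are hull vertices
        votes[np.argmax(p)] += 1
    return np.argsort(votes)[::-1][:k]   # the k most frequently detected anchors
```

Because each sub-problem touches only a 1-D projection, the loop parallelizes trivially and never needs the full high-dimensional features, matching the scalability claim above.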
international symposium on information theory | 2012
Tianyi Zhou; Dacheng Tao
We consider recovering a $d$-level quantization of a signal from a $k$-level quantization of its linear measurements. This problem has great potential in practical systems, but has not been fully addressed in compressed sensing (CS). We tackle it by proposing $k$-bit Hamming compressed sensing (HCS). It reduces decoding to a series of hypothesis tests of the bin in which the signal lies. Each test amounts to an independent nearest-neighbor search for a histogram estimated from the quantized measurements. The method is based on the fact that the distribution of the ratio between two random projections is determined by their intersection angle. Compared to CS and 1-bit CS, $k$-bit HCS leads to lower cost in both hardware and computation. It admits a trade-off between recovery/measurement resolution and measurement amount, and is thus more flexible than 1-bit HCS. A rigorous analysis establishes its error bound. An extensive empirical study further justifies its appealing accuracy, robustness, and efficiency.
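The geometric fact the decoder builds on, that agreement statistics of random projections are governed by the intersection angle, can be checked in a few lines. The sketch below demonstrates only this underlying principle (for 1-bit sign measurements, where the disagreement probability equals angle/π); it is not the k-bit HCS decoder itself.

```python
import numpy as np

def angle_from_signs(x, y, m=10000, seed=0):
    """For Gaussian a:  P[sign(a@x) != sign(a@y)] = angle(x, y) / pi,
    so an intersection angle can be estimated from 1-bit measurements.
    Illustrates the principle behind Hamming-distance decoding only."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, len(x)))
    hamming = np.mean(np.sign(A @ x) != np.sign(A @ y))  # fraction of disagreeing bits
    return hamming * np.pi                               # estimated angle in radians

# e.g. two unit vectors 60 degrees apart:
# x = np.array([1.0, 0.0]); y = np.array([0.5, np.sqrt(3) / 2])
# angle_from_signs(x, y)   # approx. pi/3 ~ 1.047
```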
IEEE Transactions on Image Processing | 2015
Dongjin Song; Wei Liu; Tianyi Zhou; Dacheng Tao; David A. Meyer
Conditional random fields (CRFs) are a flexible yet powerful probabilistic approach and have shown advantages in popular applications across various areas, including text analysis, bioinformatics, and computer vision. Traditional CRF models, however, are incapable of selecting relevant features or suppressing noise in the original features. Moreover, conventional optimization methods often converge slowly when training CRFs, and degrade significantly on tasks with a large number of samples and features. In this paper, we propose robust CRFs (RCRFs) to simultaneously select relevant features. An optimal gradient method (OGM) is further designed to train RCRFs efficiently. Specifically, the proposed RCRFs employ the $\ell_1$ norm of the model parameters to regularize the objective used by traditional CRFs, thereby enabling discovery of the relevant unary and pairwise features of CRFs. In each iteration of OGM, the gradient direction is determined jointly by the current gradient and the historical gradients, and the Lipschitz constant is leveraged to specify a proper step size. We show that OGM can tackle RCRF model training very efficiently, achieving the optimal convergence rate $O(1/k^2)$ (where $k$ is the number of iterations).
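The $O(1/k^2)$ rate quoted above is the hallmark of Nesterov-style accelerated gradient methods, which mix the current gradient with momentum from past iterates and set the step size from the Lipschitz constant. Below is a textbook sketch of such a method under those assumptions, not the paper's exact OGM for RCRFs.

```python
import numpy as np

def accelerated_gradient(grad, L, w0, iters=100):
    """Nesterov-type accelerated descent for an L-smooth convex objective,
    achieving the O(1/k^2) rate: the search point mixes the current iterate
    with momentum from the previous one, and 1/L fixes the step size.
    (Generic textbook sketch, not the paper's OGM.)"""
    w, w_prev, t = w0.copy(), w0.copy(), 1.0
    for _ in range(iters):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        v = w + ((t - 1.0) / t_next) * (w - w_prev)  # momentum from historical iterates
        w_prev = w
        w = v - grad(v) / L                          # gradient step of size 1/L
        t = t_next
    return w

# e.g. least squares: grad = lambda w: A.T @ (A @ w - b),
# with L the largest eigenvalue of A.T @ A.
```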
knowledge discovery and data mining | 2013
Yang Mu; Wei Ding; Tianyi Zhou; Dacheng Tao
IEEE Transactions on Neural Networks | 2014
Wei Bian; Tianyi Zhou; Aleix M. Martinez; George Baciu; Dacheng Tao
systems, man and cybernetics | 2009
Tianyi Zhou; Dacheng Tao
knowledge discovery and data mining | 2014
Tianyi Zhou; Dacheng Tao
international conference on machine learning | 2011
Tianyi Zhou; Dacheng Tao