Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Tongliang Liu is active.

Publication


Featured research published by Tongliang Liu.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Classification with Noisy Labels by Importance Reweighting

Tongliang Liu; Dacheng Tao

In this paper, we study a classification problem in which sample labels are randomly corrupted. In this scenario, there is an unobservable sample with noise-free labels. However, before being observed, the true labels are independently flipped with a probability p ∈ [0, 0.5), and the random label noise can be class-conditional. Here, we address two fundamental problems raised by this scenario. The first is how to best use the abundant surrogate loss functions designed for the traditional classification problem when there is label noise. We prove that any surrogate loss function can be used for classification with noisy labels by using importance reweighting, with consistency assurance that the label noise does not ultimately hinder the search for the optimal classifier of the noise-free sample. The other is the open problem of how to obtain the noise rate p. We show that the rate is upper bounded by the conditional probability P(Ŷ|X) of the noisy sample. Consequently, the rate can be estimated, because the upper bound can be easily reached in classification problems. Experimental results on synthetic and real datasets confirm the efficiency of our methods.
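
A minimal sketch of the importance-reweighting idea in the binary case, assuming the flip rates are already known or estimated. The names (reweighted_fit, rho_pos, rho_neg) are illustrative, and the weight formula follows the standard reweighting construction for class-conditional noise rather than being copied from the paper.

import numpy as np
from sklearn.linear_model import LogisticRegression

def reweighted_fit(X, y_noisy, rho_pos, rho_neg):
    """Fit a classifier on noisily labeled data via importance reweighting.

    rho_pos = P(noisy label = -1 | clean label = +1)
    rho_neg = P(noisy label = +1 | clean label = -1)
    Labels are assumed to be in {-1, +1}.
    """
    # 1) Estimate the noisy posterior P(observed label | x) with any
    #    probabilistic classifier.
    post_model = LogisticRegression().fit(X, y_noisy)
    proba = post_model.predict_proba(X)                     # columns ordered by classes_
    idx = np.searchsorted(post_model.classes_, y_noisy)
    p_noisy = proba[np.arange(len(y_noisy)), idx]           # P(observed label | x)

    # 2) Importance weights: ratio of clean to noisy label probability.
    rho_of_opposite = np.where(y_noisy == 1, rho_neg, rho_pos)
    weights = (p_noisy - rho_of_opposite) / ((1.0 - rho_pos - rho_neg) * p_noisy)
    weights = np.clip(weights, 0.0, None)                   # guard against estimation error

    # 3) Retrain any surrogate-loss classifier with the sample weights.
    return LogisticRegression().fit(X, y_noisy, sample_weight=weights)

Any other surrogate-loss learner that accepts sample weights could replace the logistic regression in the final step.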


IEEE Transactions on Image Processing | 2014

Decomposition-Based Transfer Distance Metric Learning for Image Classification

Yong Luo; Tongliang Liu; Dacheng Tao; Chao Xu

Distance metric learning (DML) is a critical factor for image analysis and pattern recognition. To learn a robust distance metric for a target task, we need abundant side information (i.e., the similarity/dissimilarity pairwise constraints over the labeled data), which is usually unavailable in practice due to the high labeling cost. This paper considers the transfer learning setting by exploiting the large quantity of side information from certain related, but different source tasks to help with target metric learning (with only a little side information). The state-of-the-art metric learning algorithms usually fail in this setting because the data distributions of the source task and target task are often quite different. We address this problem by assuming that the target distance metric lies in the space spanned by the eigenvectors of the source metrics (or other randomly generated bases). The target metric is represented as a combination of the base metrics, which are computed using the decomposed components of the source metrics (or simply a set of random bases); we call the proposed method decomposition-based transfer DML (DTDML). In particular, DTDML learns a sparse combination of the base metrics to construct the target metric by forcing the target metric to be close to an integration of the source metrics. The main advantage of the proposed method compared with existing transfer metric learning approaches is that we directly learn the base metric coefficients instead of the target metric. As a result, far fewer variables need to be learned. We therefore obtain more reliable solutions given the limited side information, and the optimization tends to be faster. Experiments on the popular handwritten image (digit, letter) classification and challenging natural image annotation tasks demonstrate the effectiveness of the proposed method.
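
A minimal sketch of the decomposition idea, assuming the base metrics come from eigendecomposing the source metrics and the combination weights are learned with a squared-hinge loss over target pairwise constraints, a sparsity term, and a pull toward the mean source metric. The names and the exact objective are illustrative, not the paper's.

import numpy as np

def base_metrics_from_sources(source_metrics):
    """Split each source metric M = sum_i lam_i u_i u_i^T into rank-one bases."""
    bases = []
    for M in source_metrics:
        lam, U = np.linalg.eigh(M)
        for i in range(len(lam)):
            if lam[i] > 1e-10:
                bases.append(np.outer(U[:, i], U[:, i]))
    return bases

def dtdml_sketch(bases, pairs, labels, source_mean, lr=1e-2, l1=1e-3, mu=0.1, iters=500):
    """Learn nonnegative combination weights theta over the base metrics.

    pairs  : array (n, 2, d) of point pairs from the target task
    labels : +1 for similar pairs, -1 for dissimilar pairs
    """
    diffs = pairs[:, 0, :] - pairs[:, 1, :]                         # (n, d)
    # Squared distance of each pair under each base metric.
    D = np.stack([np.einsum('nd,de,ne->n', diffs, B, diffs) for B in bases], axis=1)
    # Terms for the penalty || sum_k theta_k B_k - source_mean ||_F^2.
    G = np.array([[np.sum(Bi * Bj) for Bj in bases] for Bi in bases])
    t = np.array([np.sum(B * source_mean) for B in bases])
    theta = np.full(len(bases), 1.0 / len(bases))
    for _ in range(iters):
        dist = D @ theta
        hinge = np.maximum(0.0, labels * (dist - 1.0))              # squared-hinge residual
        grad = D.T @ (labels * hinge) / len(labels) + mu * (G @ theta - t)
        theta = np.maximum(theta - lr * (grad + l1), 0.0)           # prox step: L1 + nonnegativity
    return theta

Because only the coefficient vector theta is optimized, the number of learned variables is the number of base metrics rather than the d-by-d entries of a full target metric, which is the advantage the abstract describes.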


IEEE Transactions on Neural Networks | 2016

On the Performance of Manhattan Nonnegative Matrix Factorization

Tongliang Liu; Dacheng Tao

Extracting low-rank and sparse structures from matrices has been extensively studied in machine learning, compressed sensing, and conventional signal processing, and has been widely applied to recommendation systems, image reconstruction, visual analytics, and brain signal processing. Manhattan nonnegative matrix factorization (MahNMF) is an extension of the conventional NMF, which models the heavy-tailed Laplacian noise by minimizing the Manhattan distance between a nonnegative matrix X and the product of two nonnegative low-rank factor matrices. Fast algorithms have been developed to restore the low-rank and sparse structures of X in the MahNMF. In this paper, we study the statistical performance of the MahNMF within the framework of statistical learning theory. We decompose the expected reconstruction error of the MahNMF into the estimation error and the approximation error. The estimation error is bounded by the generalization error bounds of the MahNMF, while the approximation error is analyzed using the asymptotic results of the minimum distortion of vector quantization. The generalization error bound is valuable for determining the size of the training sample needed to guarantee a desirable upper bound for the defect between the expected and empirical reconstruction errors. Statistical performance analysis shows how the reduced dimensionality affects the estimation and approximation errors. Our framework can also be used for analyzing the performance of the NMF.
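
A minimal sketch of the MahNMF objective, minimizing ||X - WH||_1 subject to W, H >= 0 with plain alternating projected subgradient steps; the paper develops much faster specialized solvers, so this only makes the objective concrete.

import numpy as np

def mahnmf_sketch(X, rank, iters=2000, lr=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(iters):
        R = np.sign(X - W @ H)                 # subgradient of the Manhattan loss
        W = np.maximum(W + lr * (R @ H.T), 0.0)
        R = np.sign(X - W @ H)
        H = np.maximum(H + lr * (W.T @ R), 0.0)
    return W, H

# Usage: W, H = mahnmf_sketch(np.abs(np.random.randn(50, 40)), rank=5)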


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2017

Algorithm-Dependent Generalization Bounds for Multi-Task Learning

Tongliang Liu; Dacheng Tao; Mingli Song; Stephen J. Maybank

Often, tasks are collected for multi-task learning (MTL) because they share similar feature structures. Based on this observation, in this paper, we present novel algorithm-dependent generalization bounds for MTL by exploiting the notion of algorithmic stability. We focus on the performance of one particular task and the average performance over multiple tasks by analyzing the generalization ability of a common parameter that is shared in MTL. When focusing on one particular task, with the help of a mild assumption on the feature structures, we interpret the function of the other tasks as a regularizer that produces a specific inductive bias. The algorithm for learning the common parameter, as well as the predictor, is thereby uniformly stable with respect to the domain of the particular task and has a generalization bound with a fast convergence rate of order O(1/n), where n is the sample size of the particular task. When focusing on the average performance over multiple tasks, we prove that a similar inductive bias exists under certain conditions on the feature structures. Thus, the corresponding algorithm for learning the common parameter is also uniformly stable with respect to the domains of the multiple tasks, and its generalization bound is of the order O(1/T), where T is the number of tasks. These theoretical analyses naturally show that the similarity of feature structures in MTL will lead to specific regularizations for predicting, which enables the learning algorithms to generalize fast and correctly from a few examples.
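
A minimal sketch of the shared-parameter MTL setting the bounds concern: each task t predicts with x -> w_t^T(Ux), where U is the common parameter shared across tasks and, through joint training, plays the regularizing role discussed above. The ridge-regularized gradient-descent trainer and all names are illustrative assumptions, not the paper's algorithm.

import numpy as np

def mtl_shared_fit(tasks, shared_dim, lam=0.1, lr=1e-2, iters=1000, seed=0):
    """tasks: list of (X_t, y_t) regression tasks sharing a common feature dimension."""
    rng = np.random.default_rng(seed)
    d = tasks[0][0].shape[1]
    U = rng.normal(scale=0.1, size=(shared_dim, d))      # common parameter shared by all tasks
    W = [rng.normal(scale=0.1, size=shared_dim) for _ in tasks]
    for _ in range(iters):
        gU = lam * U
        for t, (X, y) in enumerate(tasks):
            Z = X @ U.T                                   # shared representation
            resid = Z @ W[t] - y
            gW = Z.T @ resid / len(y) + lam * W[t]
            gU += np.outer(W[t], X.T @ resid / len(y))    # each task contributes to the shared gradient
            W[t] = W[t] - lr * gW
        U = U - lr * gU
    return U, W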


IEEE Transactions on Image Processing | 2016

Local Rademacher Complexity for Multi-Label Learning

Chang Xu; Tongliang Liu; Dacheng Tao; Chao Xu

We analyze the local Rademacher complexity of empirical risk minimization-based multi-label learning algorithms, and in doing so propose a new algorithm for multi-label learning. Rather than using the trace norm to regularize the multi-label predictor, we instead minimize the tail sum of the singular values of the predictor in multi-label learning. Benefiting from the use of the local Rademacher complexity, our algorithm, therefore, has a sharper generalization error bound. Compared with methods that minimize over all singular values, concentrating on the tail singular values results in better recovery of the low-rank structure of the multi-label predictor, which plays an important role in exploiting label correlations. We propose a new conditional singular value thresholding algorithm to solve the resulting objective function. Moreover, a variance control strategy is employed to reduce the variance of variables in optimization. Empirical studies on real-world data sets validate our theoretical results and demonstrate the effectiveness of the proposed algorithm for multi-label learning.
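
A minimal sketch of the conditional singular value thresholding step described above: only the singular values beyond the leading k are soft-thresholded, which is the proximal operator of the tail-sum regularizer. The surrounding solver and the variance-control strategy from the paper are omitted.

import numpy as np

def conditional_svt(W, k, tau):
    """Soft-threshold all but the top-k singular values of W by tau."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_new = s.copy()
    s_new[k:] = np.maximum(s[k:] - tau, 0.0)     # shrink only the tail singular values
    return (U * s_new) @ Vt

# Inside a proximal-gradient loop one would use
#   W = conditional_svt(W - step * grad_loss(W), k, step * lam)
# in place of the full singular value thresholding used for the trace norm.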


Knowledge Discovery and Data Mining | 2015

Spectral Ensemble Clustering

Hongfu Liu; Tongliang Liu; Junjie Wu; Dacheng Tao; Yun Fu

Ensemble clustering, also known as consensus clustering, is emerging as a promising solution for multi-source and/or heterogeneous data clustering. The co-association matrix based method, which redefines the ensemble clustering problem as a classical graph partition problem, is a landmark method in this area. Nevertheless, the relatively high time and space complexity preclude it from real-life large-scale data clustering. We therefore propose SEC, an efficient Spectral Ensemble Clustering method based on the co-association matrix. We show that SEC has theoretical equivalence to weighted K-means clustering and results in vastly reduced algorithmic complexity. We then derive the latent consensus function of SEC, which, to the best of our knowledge, is among the first to bridge the co-association matrix based method to methods with explicit objective functions. The robustness and generalizability of SEC are then investigated to prove the superiority of SEC in theory. We finally extend SEC to meet the challenge arising from incomplete basic partitions, based on which a scheme for big data clustering can be formed. Experimental results on various real-world data sets demonstrate that SEC is an effective and efficient competitor to some state-of-the-art ensemble clustering methods and is also suitable for big data clustering.
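
A minimal sketch of the efficiency idea: the n-by-n co-association matrix S = BB^T / r (with B the concatenated one-hot encodings of the r basic partitions) is never materialized, and clustering is run on the sparse matrix B itself with weighted K-means. The row scaling and weights below are one common form of this equivalence and may differ from the paper's exact construction.

import numpy as np
from sklearn.cluster import KMeans

def sec_sketch(basic_partitions, n_clusters, seed=0):
    """basic_partitions: array (r, n) of integer cluster labels (0..k_t-1) per basic partition."""
    r, n = basic_partitions.shape
    blocks = []
    for labels in basic_partitions:                    # one-hot encode each basic partition
        k = labels.max() + 1
        blocks.append(np.eye(k)[labels])
    B = np.hstack(blocks)                              # (n, sum of cluster counts)
    degree = (B @ B.sum(axis=0)) / r                   # row sums of S, computed without forming S
    X = B / degree[:, None]                            # degree-scaled rows
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(X, sample_weight=degree)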


IEEE Transactions on Neural Networks | 2017

Large-Cone Nonnegative Matrix Factorization

Tongliang Liu; Mingming Gong; Dacheng Tao

Nonnegative matrix factorization (NMF) has been greatly popularized by its parts-based interpretation and the effective multiplicative updating rule for searching local solutions. In this paper, we study the problem of how to obtain an attractive local solution for NMF, which not only fits the given training data well but also generalizes well on the unseen test data. Based on the geometric interpretation of NMF, we introduce two large-cone penalties for NMF and propose large-cone NMF (LCNMF) algorithms. Compared with NMF, LCNMF will obtain bases comprising a larger simplicial cone, and therefore has three advantages: 1) the empirical reconstruction error of LCNMF could mostly be smaller; 2) the generalization ability of the proposed algorithm is much more powerful; and 3) the obtained bases of LCNMF have a low-overlapping property, which enables the bases to be sparse and makes the proposed algorithms very robust. Experiments on synthetic and real-world data sets confirm the efficiency of LCNMF.
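
A minimal sketch in the spirit of a large-cone penalty, assuming an illustrative surrogate that discourages overlap between basis columns (the sum of squared off-diagonal entries of W^T W); the paper's two large-cone penalties are defined geometrically and differ from this choice.

import numpy as np

def lcnmf_sketch(X, rank, gamma=0.1, lr=1e-3, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(iters):
        R = W @ H - X
        G = W.T @ W
        # Reconstruction gradient plus gradient of the overlap penalty sum_{i!=j} (w_i^T w_j)^2.
        grad_W = R @ H.T + gamma * 4.0 * W @ (G - np.diag(np.diag(G)))
        grad_H = W.T @ R
        W = np.maximum(W - lr * grad_W, 0.0)           # projected gradient steps keep W, H nonnegative
        H = np.maximum(H - lr * grad_H, 0.0)
    return W, H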


IEEE Transactions on Neural Networks | 2015

Deformed Graph Laplacian for Semisupervised Learning

Chen Gong; Tongliang Liu; Dacheng Tao; Keren Fu; Enmei Tu; Jie Yang

Graph Laplacian has been widely exploited in traditional graph-based semisupervised learning (SSL) algorithms to regulate the labels of examples that vary smoothly on the graph. Although it achieves a promising performance in both transductive and inductive learning, it is not effective for handling ambiguous examples (shown in Fig. 1). This paper introduces deformed graph Laplacian (DGL) and presents label prediction via DGL (LPDGL) for SSL. The local smoothness term used in LPDGL, which regularizes examples and their neighbors locally, is able to improve classification accuracy by properly dealing with ambiguous examples. Theoretical studies reveal that LPDGL obtains the globally optimal decision function, and the free parameters are easy to tune. The generalization bound is derived based on the robustness analysis. Experiments on a variety of real-world data sets demonstrate that LPDGL achieves top-level performance on both transductive and inductive settings by comparing it with popular SSL algorithms, such as harmonic functions, AnchorGraph regularization, linear neighborhood propagation, Laplacian regularized least square, and Laplacian support vector machine.
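
A minimal sketch of Laplacian-regularized label prediction using the standard deformed graph Laplacian Delta(s) = I - s*A + s^2*(D - I); the LPDGL objective additionally contains the local smoothness term discussed above, which this sketch omits, so it is only an assumption-laden baseline rather than the paper's method.

import numpy as np

def lpdgl_sketch(A, Y, s=0.5, alpha=1.0):
    """A: (n, n) symmetric adjacency matrix; Y: (n,) partial labels in {-1, 0, +1} (0 = unlabeled)."""
    n = A.shape[0]
    D = np.diag(A.sum(axis=1))
    deformed_L = np.eye(n) - s * A + s**2 * (D - np.eye(n))
    # Closed-form minimizer of ||f - Y||^2 + alpha * f^T Delta(s) f.
    f = np.linalg.solve(np.eye(n) + alpha * deformed_L, Y)
    return np.sign(f)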


Neural Computation | 2016

Dimensionality-Dependent Generalization Bounds for k-Dimensional Coding Schemes

Tongliang Liu; Dacheng Tao; Dong Xu

The k-dimensional coding schemes refer to a collection of methods that attempt to represent data using a set of representative k-dimensional vectors and include nonnegative matrix factorization, dictionary learning, sparse coding, k-means clustering, and vector quantization as special cases. Previous generalization bounds for the reconstruction error of the k-dimensional coding schemes are mainly dimensionality-independent. A major advantage of these bounds is that they can be used to analyze the generalization error when data are mapped into an infinite- or high-dimensional feature space. However, many applications use finite-dimensional data features. Can we obtain dimensionality-dependent generalization bounds for k-dimensional coding schemes that are tighter than dimensionality-independent bounds when data are in a finite-dimensional feature space? Yes. In this letter, we address this problem and derive a dimensionality-dependent generalization bound for k-dimensional coding schemes by bounding the covering number of the loss function class induced by the reconstruction error. The bound is of order O((mk ln(mkn)/n)^λ_n), where m is the dimension of features, k is the number of the columns in the linear implementation of coding schemes, and n is the size of the sample, with λ_n > 0.5 when n is finite and λ_n = 0.5 when n is infinite. We show that our bound can be tighter than previous results because it avoids inducing the worst-case upper bound on k of the loss function. The proposed generalization bound is also applied to some specific coding schemes to demonstrate that the dimensionality-dependent bound is an indispensable complement to the dimensionality-independent generalization bounds.
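
A minimal sketch of one k-dimensional coding scheme, vector quantization / k-means, whose empirical reconstruction error is the quantity the bounds control; the function name is illustrative.

import numpy as np
from sklearn.cluster import KMeans

def vq_reconstruction_error(X, k, seed=0):
    """Empirical reconstruction error when each sample is coded by its nearest of k codewords."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    codewords = km.cluster_centers_[km.labels_]        # nearest codeword per sample
    return np.mean(np.sum((X - codewords) ** 2, axis=1))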


International Conference on Information Science and Technology | 2014

On the Robustness and Generalization of Cauchy Regression

Tongliang Liu; Dacheng Tao

It was recently highlighted in a special issue of Nature [1] that the value of big data has yet to be effectively exploited for innovation, competition and productivity. To realize the full potential of big data, big learning algorithms need to be developed to keep pace with the continuous creation, storage and sharing of data. Least squares (LS) and least absolute deviation (LAD) have been successful regression tools used in business, government and society over the past few decades. However, these existing technologies are severely limited by noisy data because their breakdown points are both zero, i.e., they do not tolerate outliers. By appropriately setting the tuning constant of Cauchy regression (CR), the maximum possible value (50%) of the breakdown point can be attained. CR therefore has the capability to learn a robust model from noisy big data. Although the theoretical analysis of the breakdown point for CR has been comprehensively investigated, we propose a new approach by interpreting the optimization of an objective function as a sample-weighted procedure. We thereby clearly show the differences in robustness between LS, LAD and CR. We also study the statistical performance of CR. This study derives the generalization error bounds for CR by analyzing the covering number and Rademacher complexity of the hypothesis class, as well as showing how the scale parameter affects its performance.
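
A minimal sketch of the sample-weighted interpretation of Cauchy regression: an iteratively reweighted least squares loop in which each sample receives weight 1 / (1 + (r/c)^2), so large-residual outliers are progressively down-weighted; c plays the role of the tuning constant, and the solver details are illustrative rather than the paper's.

import numpy as np

def cauchy_regression_irls(X, y, c=1.0, iters=50):
    n, d = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # ordinary least squares start
    for _ in range(iters):
        r = y - X @ beta
        w = 1.0 / (1.0 + (r / c) ** 2)                   # Cauchy weights shrink with large residuals
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw + 1e-12 * np.eye(d), Xw.T @ y)
    return beta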

Collaboration


Dive into Tongliang Liu's collaborations.

Top Co-Authors

Xinmei Tian, University of Science and Technology of China
Xiyu Yu, University of Sydney
Kun Zhang, Carnegie Mellon University
Mingming Gong, University of Pittsburgh
Ya Li, University of Science and Technology of China
Yun Fu, Northeastern University