Publication


Featured research published by Fayao Liu.


Computer Vision and Pattern Recognition | 2015

Deep convolutional neural fields for depth estimation from a single image

Fayao Liu; Chunhua Shen; Guosheng Lin

We consider the problem of depth estimation from a single monocular image. It is a challenging task, as no reliable depth cues are available, e.g., stereo correspondences or motion. Previous efforts have focused on exploiting geometric priors or additional sources of information, all using hand-crafted features. Recently, there is mounting evidence that features from deep convolutional neural networks (CNNs) are setting new records for various vision applications. On the other hand, considering the continuous nature of depth values, depth estimation can be naturally formulated as a continuous conditional random field (CRF) learning problem. We therefore present a deep convolutional neural field model for estimating depth from a single image, aiming to jointly exploit the capacity of deep CNNs and continuous CRFs. Specifically, we propose a deep structured learning scheme which learns the unary and pairwise potentials of a continuous CRF in a unified deep CNN framework. The proposed method can be used for depth estimation of general scenes with no geometric priors or extra information injected. In our case, the integral of the partition function can be calculated analytically, so we can exactly solve the log-likelihood optimization. Moreover, solving the MAP problem to predict the depth of a new image is highly efficient, as closed-form solutions exist. We experimentally demonstrate that the proposed method outperforms state-of-the-art depth estimation methods on both indoor and outdoor scene datasets.
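
To make the closed-form claims concrete, here is a minimal sketch of a continuous-CRF energy of the kind described, in assumed notation (z_p is the CNN's unary depth regression for superpixel p, R_pq the learned pairwise similarity between neighbouring superpixels):

```latex
E(\mathbf{y};\mathbf{x}) = \sum_p (y_p - z_p)^2
  + \sum_{(p,q)} \tfrac{1}{2} R_{pq}\,(y_p - y_q)^2,
\qquad
\mathbf{y}^\star = A^{-1}\mathbf{z},
\quad A = \mathbf{I} + D - R,\ \ D_{pp} = \textstyle\sum_q R_{pq}.
```

Because the energy is quadratic in y, the integral defining the partition function is a Gaussian integral and hence analytic, which is what makes exact maximum-likelihood learning and closed-form MAP inference possible.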


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields

Fayao Liu; Chunhua Shen; Guosheng Lin; Ian D. Reid

In this article, we tackle the problem of depth estimation from single monocular images. Compared with depth estimation using multiple images, such as stereo depth perception, depth from monocular images is much more challenging. Prior work typically focuses on exploiting geometric priors or additional sources of information, mostly using hand-crafted features. Recently, there is mounting evidence that features from deep convolutional neural networks (CNNs) set new records for various vision applications. On the other hand, considering the continuous nature of depth values, depth estimation can be naturally formulated as a continuous conditional random field (CRF) learning problem. We therefore present a deep convolutional neural field model for estimating depths from single monocular images, aiming to jointly exploit the capacity of deep CNNs and continuous CRFs. In particular, we propose a deep structured learning scheme which learns the unary and pairwise potentials of a continuous CRF in a unified deep CNN framework. We then propose an equally effective model based on fully convolutional networks and a novel superpixel pooling method, which is about 10 times faster, to speed up the patch-wise convolutions in the deep model. With this more efficient model, we are able to design deeper networks in pursuit of better performance. Our proposed method can be used for depth estimation of general scenes with no geometric priors or extra information injected. In our case, the integral of the partition function can be calculated in closed form, such that we can exactly solve the log-likelihood maximization. Moreover, solving the inference problem to predict the depth of a test image is highly efficient, as closed-form solutions exist. Experiments on both indoor and outdoor scene datasets demonstrate that the proposed method outperforms state-of-the-art depth estimation approaches.
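
As a rough illustration of what a superpixel pooling layer computes, here is a minimal numpy sketch assuming average pooling of a convolutional feature map over precomputed superpixel labels; the function name and the choice of average (rather than, say, max) pooling are assumptions, not the paper's exact design:

```python
import numpy as np

def superpixel_pool(feat_map, sp_labels):
    """Average-pool a CNN feature map over superpixels.

    feat_map:  (H, W, C) convolutional feature map.
    sp_labels: (H, W) int array assigning each pixel to a superpixel.
    Returns:   (num_superpixels, C) pooled feature per superpixel.
    """
    H, W, C = feat_map.shape
    labels = sp_labels.ravel()
    feats = feat_map.reshape(-1, C)
    n_sp = int(labels.max()) + 1
    counts = np.maximum(np.bincount(labels, minlength=n_sp), 1)
    pooled = np.empty((n_sp, C))
    for c in range(C):  # one bincount per channel keeps memory modest
        pooled[:, c] = np.bincount(labels, weights=feats[:, c], minlength=n_sp)
    return pooled / counts[:, None]
```

Pooling each superpixel's unary feature from a single full-image convolutional pass replaces many overlapping patch-wise forward passes, which is plausibly where the reported roughly 10x speedup comes from.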


Pattern Recognition | 2015

CRF learning with CNN features for image segmentation

Fayao Liu; Guosheng Lin; Chunhua Shen

Conditional Random Fields (CRFs) have been widely applied to image segmentation. While most studies rely on hand-crafted features, we here propose to exploit a pre-trained large convolutional neural network (CNN) to generate deep features for CRF learning. The deep CNN is trained on the ImageNet dataset and transferred here to image segmentation for constructing potentials of superpixels. The CRF parameters are then learnt using a structured support vector machine (SSVM). To fully exploit context information in inference, we construct spatially related co-occurrence pairwise potentials and incorporate them into the energy function, as sketched below. This prefers labellings of object pairs that frequently co-occur in a certain spatial layout and at the same time avoids implausible labellings during inference. Extensive experiments on binary and multi-class segmentation benchmarks demonstrate the promise of the proposed method. We thus provide new baselines for segmentation performance on the Weizmann horse, Graz-02, MSRC-21, Stanford Background and PASCAL VOC 2011 datasets.

Highlights:
- A deep CNN pretrained on ImageNet generalizes well to various segmentation datasets.
- Deep features significantly outperform BoW and unsupervised feature learning.
- Combining deep CNN features with CRF yields new state-of-the-art results.
- Incorporating spatially related co-occurrence potentials further improves the accuracy.
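
One plausible form of an energy with such a co-occurrence term (an illustration in assumed notation, not necessarily the paper's exact parameterization) is:

```latex
E(\mathbf{y};\mathbf{x}) = \sum_p U(y_p, \mathbf{x})
  + \lambda \sum_{(p,q)} V_{\mathrm{co}}(y_p, y_q),
\qquad
V_{\mathrm{co}}(l, l') = -\log \hat{P}(l, l'),
```

where \hat{P}(l, l') is the empirical frequency with which labels l and l' co-occur in the given spatial layout on the training set, so label pairs that rarely co-occur incur a high pairwise cost.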


IEEE Journal of Biomedical and Health Informatics | 2014

Multiple Kernel Learning in the Primal for Multimodal Alzheimer’s Disease Classification

Fayao Liu; Luping Zhou; Chunhua Shen; Jianping Yin

To achieve effective and efficient detection of Alzheimer’s disease (AD), many machine learning methods have been introduced into this realm. However, the typical combination of limited training samples and heterogeneous feature representations makes the problem challenging. In this paper, we propose a novel multiple kernel learning framework to combine multimodal features for AD classification, which is scalable and easy to implement. Contrary to the usual way of solving the problem in the dual, we look at the optimization from a new perspective. By applying the Fourier transform to the Gaussian kernel, we explicitly compute the mapping function, which leads to a more straightforward solution of the problem in the primal. Furthermore, we impose a mixed L2,1-norm constraint on the kernel weights, known as group lasso regularization, to enforce group sparsity among the different feature modalities. This effectively performs feature modality selection while at the same time exploiting complementary information among the different kernels, and is thus able to extract the most discriminative features for classification. Experiments on the ADNI dataset demonstrate the effectiveness of the proposed method.
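
The explicit mapping alluded to is in the spirit of random Fourier features, where Bochner's theorem turns the Gaussian kernel into an expectation over random cosine features. A minimal sketch (the function name and parameterization are assumptions):

```python
import numpy as np

def gaussian_rff(X, n_features=500, gamma=1.0, seed=0):
    """Approximate explicit feature map for the Gaussian kernel
    k(x, y) = exp(-gamma * ||x - y||^2) via Bochner's theorem:
    z(x) = sqrt(2/D) * cos(W x + b) with W ~ N(0, 2*gamma*I) and
    b ~ Uniform[0, 2*pi), so that z(x) . z(y) ~= k(x, y).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

With one such mapping per modality, a linear model over the concatenated features can be trained directly in the primal, and a group lasso (mixed L2,1) penalty over the per-modality weight blocks performs the modality selection the abstract describes.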


IEEE Transactions on Neural Networks | 2014

Efficient Dual Approach to Distance Metric Learning

Chunhua Shen; Junae Kim; Fayao Liu; Lei Wang; Anton van den Hengel

Distance metric learning is of fundamental interest in machine learning because the employed distance metric can significantly affect the performance of many learning methods. Quadratic Mahalanobis metric learning is a popular approach to the problem, but typically requires solving a semidefinite programming (SDP) problem, which is computationally expensive. The worst-case complexity of solving an SDP problem involving a matrix variable of size D×D with O(D) linear constraints is about O(D^6.5) using interior-point methods, where D is the dimension of the input data. Thus, interior-point methods can only practically solve problems with fewer than a few thousand variables. Because the number of variables is D(D+1)/2, this limits the problems that can practically be solved to around a few hundred dimensions. The complexity of the popular quadratic Mahalanobis metric learning approach thus limits the size of problem to which metric learning can be applied. Here, we propose a significantly more efficient and scalable approach to the metric learning problem based on the Lagrange dual formulation of the problem. The proposed formulation is much simpler to implement, and therefore allows much larger Mahalanobis metric learning problems to be solved. The time complexity of the proposed method is roughly O(D^3), which is significantly lower than that of the SDP approach. Experiments on a variety of datasets demonstrate that the proposed method achieves an accuracy comparable with the state of the art, but is applicable to significantly larger problems. We also show that the proposed method can be applied to approximately solve more general Frobenius-norm-regularized SDP problems.
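
The dominant O(D^3) operation in a dual approach of this kind is typically a full eigendecomposition, for example to project an intermediate solution back onto the positive semidefinite cone. A minimal sketch of that step (the canonical projection, not necessarily the paper's exact update):

```python
import numpy as np

def project_psd(M):
    """Project a symmetric matrix onto the positive semidefinite cone
    by zeroing negative eigenvalues. The eigendecomposition costs
    O(D^3), versus roughly O(D^6.5) per interior-point SDP solve.
    """
    M = 0.5 * (M + M.T)                      # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(M)
    eigvals = np.clip(eigvals, 0.0, None)    # clamp negative eigenvalues
    return (eigvecs * eigvals) @ eigvecs.T
```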


Computer Vision and Pattern Recognition | 2015

Sequence searching with deep-learnt depth for condition- and viewpoint-invariant route-based place recognition

Michael Milford; Stephanie M. Lowry; Niko Sünderhauf; Sareh Shirazi; Edward Pepperell; Ben Upcroft; Chunhua Shen; Guosheng Lin; Fayao Liu; Cesar Cadena; Ian D. Reid

Vision-based localization on robots and vehicles remains unsolved when extreme appearance change and viewpoint change are present simultaneously. Current state-of-the-art approaches to this challenge either deal with only one of the two problems, for example FAB-MAP (viewpoint invariance) or SeqSLAM (appearance invariance), or use extensive training within the test environment, an impractical requirement in many application scenarios. In this paper we significantly improve the viewpoint invariance of the SeqSLAM algorithm by using state-of-the-art deep learning techniques to generate synthetic viewpoints. Our approach differs from other deep learning approaches in that it does not rely on the ability of the CNN to learn invariant features, but only to produce “good enough” depth images from day-time imagery alone. We evaluate the system on a new multi-lane day-night car dataset gathered specifically to test both appearance and viewpoint change simultaneously. Results demonstrate that the use of synthetic viewpoints improves the recall achieved at 100% precision by a factor of 2.2 and the maximum recall by a factor of 2.7, enabling correct place recognition across multiple road lanes and significantly reducing the time between correct localizations.
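
As a purely illustrative sketch of how a predicted depth map can be used to synthesize a laterally shifted viewpoint (assuming a pinhole camera and a pure sideways translation; this is not the paper's pipeline):

```python
import numpy as np

def synthesize_lateral_view(img, depth, focal_px, baseline_m):
    """Forward-warp an image to a laterally shifted viewpoint using a
    predicted depth map. Each pixel shifts horizontally by the
    disparity f * t / Z for focal length f (pixels), lateral shift t
    (metres) and depth Z (metres).
    """
    H, W = depth.shape
    out = np.zeros_like(img)
    disparity = focal_px * baseline_m / np.maximum(depth, 1e-6)
    for y in range(H):
        for x in range(W):
            xs = int(round(x + disparity[y, x]))
            if 0 <= xs < W:
                out[y, xs] = img[y, x]  # nearest-pixel splat; holes remain
    return out
```

The synthetic views can then be matched with the original sequence-search machinery, which is what restores viewpoint tolerance without retraining the network in the test environment.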


Image and Vision Computing | 2016

Online unsupervised feature learning for visual tracking

Fayao Liu; Chunhua Shen; Ian D. Reid; Anton van den Hengel

Highlights:
- An online feature learning based tracking method achieves state-of-the-art performance.
- A dictionary learned from the sequence is capable of capturing appearance changes.
- The feature learning method can be used in the structured learning tracking framework.

We propose a method for visual tracking-by-detection based on online feature learning. Our learning framework performs feature encoding with respect to an over-complete dictionary, followed by spatial pyramid pooling. We then learn a linear classifier on the resulting feature encoding. Unlike previous work, we learn the dictionary online and update it to help capture the appearance of the tracked target as well as the background. In more detail, given a test image window, we extract local image patches from it and encode each patch with respect to the dictionary. The encoded features are then pooled over a spatial pyramid to form an aggregated feature vector. Finally, a simple linear classifier is trained on these features. Our experiments show that the proposed powerful, albeit simple, tracker outperforms all the state-of-the-art tracking methods that we have tested. Moreover, we evaluate the performance of different dictionary learning and feature encoding methods in the proposed tracking framework, and analyze the impact of each component in the tracking scenario. In particular, we show that a small dictionary, learned and updated online, is as effective as, and more efficient than, a huge dictionary learned offline. We further demonstrate the flexibility of feature learning by showing how it can be used within a structured learning tracking framework. The outcome is one of the best trackers reported to date, combining the advantages of both feature learning and structured output prediction. We also implement a multi-object tracker, which achieves state-of-the-art performance.
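
A minimal numpy sketch of a feature pipeline of the kind described; the soft-threshold encoder and the 0.25 threshold are assumptions standing in for the paper's encoding choices:

```python
import numpy as np

def encode_and_pool(patches, positions, dictionary, levels=(1, 2, 4)):
    """Encode local patches against an over-complete dictionary, then
    max-pool the codes over a spatial pyramid.

    patches:    (N, d) local patch descriptors from one image window.
    positions:  (N, 2) patch centres, normalised to [0, 1).
    dictionary: (K, d) over-complete dictionary (K >> d).
    """
    K = dictionary.shape[0]
    codes = np.maximum(patches @ dictionary.T - 0.25, 0.0)  # soft threshold
    pooled = []
    for L in levels:
        cells = np.minimum((positions * L).astype(int), L - 1)
        for i in range(L):
            for j in range(L):
                in_cell = (cells[:, 0] == i) & (cells[:, 1] == j)
                pooled.append(codes[in_cell].max(axis=0)
                              if in_cell.any() else np.zeros(K))
    return np.concatenate(pooled)  # aggregated vector for the classifier
```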


IEEE Transactions on Image Processing | 2017

Discriminative Training of Deep Fully Connected Continuous CRFs With Task-Specific Loss

Fayao Liu; Guosheng Lin; Chunhua Shen

Recent work on deep conditional random fields (CRFs) has set new records on many vision tasks involving structured predictions. Here, we propose a fully connected deep continuous CRF model with task-specific losses for both discrete and continuous labeling problems. We exemplify the usefulness of the proposed model on the multi-class semantic labeling (discrete) and robust depth estimation (continuous) problems. In our framework, we model both the unary and pairwise potential functions as deep convolutional neural networks (CNNs), which are jointly learned in an end-to-end fashion. The proposed method retains the main advantage of continuously valued CRFs: a closed-form solution for maximum a posteriori (MAP) inference. To better account for the quality of the predicted estimates during the course of learning, instead of the commonly employed maximum-likelihood CRF parameter learning protocol, we propose task-specific loss functions for learning the CRF parameters, enabling direct optimization of the quality of the MAP estimates during learning. Specifically, we optimize the multi-class classification loss for the semantic labeling task and Tukey’s biweight loss for the robust depth estimation problem. Experimental results on the semantic labeling and robust depth estimation tasks demonstrate that the proposed method compares favorably against both baseline and state-of-the-art methods. In particular, we show that although the proposed deep CRF model is continuously valued, when equipped with a task-specific loss it achieves impressive results even on discrete labeling tasks.
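
For reference, Tukey's biweight loss has the standard form below; it behaves quadratically for small residuals and saturates for large ones, which bounds the influence of outlier depth errors. The tuning constant c = 4.685 is the classical choice, assumed here rather than taken from the paper:

```python
import numpy as np

def tukey_biweight(residual, c=4.685):
    """Tukey's biweight loss: rho(r) = (c^2/6) * (1 - (1 - (r/c)^2)^3)
    for |r| <= c, and the constant c^2/6 otherwise, so the gradient of
    large residuals vanishes and outliers have bounded influence.
    """
    r = np.abs(residual)
    inside = (c**2 / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    return np.where(r <= c, inside, c**2 / 6.0)
```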


International Journal of Computer Vision | 2017

Structured Learning of Binary Codes with Column Generation for Optimizing Ranking Measures

Guosheng Lin; Fayao Liu; Chunhua Shen; Jianxin Wu; Heng Tao Shen

Hashing methods aim to learn a set of hash functions which map the original features to compact binary codes while preserving similarity in the Hamming space. Hashing has proven a valuable tool for large-scale information retrieval. We propose a column-generation-based binary code learning framework for data-dependent hash function learning. Given a set of triplets that encode pairwise similarity comparison information, our method learns hash functions that preserve the relative comparison relations within the large-margin learning framework, iteratively learning the best hash functions during the column generation procedure. Existing hashing methods optimize simple objectives such as the reconstruction error or graph-Laplacian-related loss functions, rather than the performance evaluation criteria of interest: multivariate performance measures such as AUC and NDCG. Our column-generation-based method can be further generalized from the triplet loss to a general structured learning framework that allows one to directly optimize multivariate performance measures. For optimizing general ranking measures, the resulting optimization problem can involve exponentially or infinitely many variables and constraints, which is more challenging than standard structured output learning. We use a combination of column generation and cutting-plane techniques to solve the optimization problem. To speed up training, we further explore stage-wise training and propose to optimize a simplified NDCG loss for efficient inference. We demonstrate the generality of our method by applying it to ranking prediction and image retrieval, and show that it outperforms several state-of-the-art hashing methods.
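
A sketch of the triplet relation the learned codes should satisfy, phrased as a large-margin hinge on Hamming distances (illustrative only, not the paper's exact column-generation objective):

```python
import numpy as np

def triplet_hinge_loss(h_anchor, h_pos, h_neg):
    """For a triplet where the anchor is more similar to `pos` than to
    `neg`, penalise violations of the margin constraint
    d_H(anchor, neg) >= d_H(anchor, pos) + 1, where d_H is the
    Hamming distance between {0, 1} code vectors.
    """
    d_pos = np.sum(h_anchor != h_pos)
    d_neg = np.sum(h_anchor != h_neg)
    return max(0.0, 1.0 + d_pos - d_neg)
```

In column generation, each round adds the hash function (a new "column") whose addition most decreases the total of such losses over the training triplets.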


IEEE Transactions on Neural Networks | 2018

Structured Learning of Tree Potentials in CRF for Image Segmentation

Fayao Liu; Guosheng Lin; Ruizhi Qiao; Chunhua Shen

We propose a new approach to image segmentation that exploits the advantages of both conditional random fields (CRFs) and decision trees. In the literature, the potential functions of CRFs are mostly defined as a linear combination of some predefined parametric models, and methods such as structured support vector machines are then applied to learn those linear coefficients. We instead formulate the unary and pairwise potentials as nonparametric forests (ensembles of decision trees), and learn the ensemble parameters and the trees in a unified optimization problem within the large-margin framework. In this fashion, we easily achieve nonlinear learning of potential functions on both unary and pairwise terms in CRFs. Moreover, we learn class-wise decision trees for each object that appears in the image. Experimental results on several public segmentation datasets demonstrate the power of the learned nonlinear nonparametric potentials.
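
Abstractly, potentials of this kind can be written as weighted ensembles of trees (assumed notation, not the paper's exact formulation):

```latex
U(y_p, \mathbf{x}) = \sum_{t=1}^{T} w_t \, f_t(\mathbf{x}_p, y_p),
\qquad
V(y_p, y_q, \mathbf{x}) = \sum_{t=1}^{T} v_t \, g_t(\mathbf{x}_{pq}, y_p, y_q),
```

where f_t and g_t are decision trees and the weights w, v are learned jointly with the trees in the large-margin objective, giving nonlinear potentials without a predefined parametric form.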

Collaboration


Dive into Fayao Liu's collaboration.

Top Co-Authors

Ian D. Reid
University of Adelaide

Ben Upcroft
Queensland University of Technology

Ruizhi Qiao
University of Adelaide

Heng Tao Shen
University of Electronic Science and Technology of China