Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Lianli Gao is active.

Publication


Featured research published by Lianli Gao.


Computer Vision and Pattern Recognition | 2015

Optimal graph learning with partial tags and multiple features for image and video annotation

Lianli Gao; Jingkuan Song; Feiping Nie; Yan Yan; Nicu Sebe; Heng Tao Shen

In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available. This is often done by adding a geometrically based regularization term to the objective function of a supervised learning model. In this case, a similarity graph is indispensable to exploit the geometrical relationships among the training data points, and the graph construction scheme essentially determines the performance of these graph-based learning algorithms. However, most of the existing works construct the graph empirically and are usually based on a single feature without using the label information. In this paper, we propose a semi-supervised annotation approach by learning an optimal graph (OGL) from multiple cues (i.e., partial tags and multiple features), which can more accurately embed the relationships among the data points. We further extend our model to address out-of-sample and noisy-label issues. Extensive experiments on four public datasets show the consistent superiority of OGL over state-of-the-art methods by up to 12% in terms of mean average precision.
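
The graph-regularization scheme this abstract builds on can be made concrete with a short sketch. The code below is a generic graph-based label-propagation baseline, not the authors' OGL model: it builds a kNN similarity graph, forms the graph Laplacian, and solves min_F ||F − Y||² + α·tr(FᵀLF) in closed form. All function and variable names are illustrative.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def propagate_labels(X, Y, alpha=0.5, k=10):
    """Baseline graph-regularized label propagation (illustrative, not OGL).

    X: (n, d) feature matrix; Y: (n, c) tag matrix with zero rows for
    untagged points. Returns soft label scores F.
    """
    # Symmetric kNN similarity graph with RBF-style weights.
    W = kneighbors_graph(X, n_neighbors=k, mode='distance').toarray()
    W = np.exp(-W**2 / W[W > 0].mean()**2) * (W > 0)
    W = np.maximum(W, W.T)                      # symmetrize
    L = np.diag(W.sum(axis=1)) - W              # unnormalized Laplacian
    # Closed-form minimizer of ||F - Y||^2 + alpha * tr(F^T L F).
    return np.linalg.solve(np.eye(len(X)) + alpha * L, Y)
```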


PLOS ONE | 2014

Love Thy Neighbour: Automatic Animal Behavioural Classification of Acceleration Data Using the K-Nearest Neighbour Algorithm

Owen R. Bidder; Hamish A. Campbell; Agustina Gómez-Laich; Patricia Urgé; James S. Walker; Yuzhi Cai; Lianli Gao; Flavio Quintana; Rory P. Wilson

Researchers hoping to elucidate the behaviour of species that are not readily observed are able to do so using biotelemetry methods. Accelerometers in particular are proving highly effective and have been used on terrestrial, aquatic and volant species with success. In the past, behavioural modes were detected in accelerometer data through manual inspection, but with developments in technology, modern accelerometers now record at frequencies that make this impractical. In light of this, some researchers have suggested the use of various machine learning approaches as a means to classify accelerometer data automatically. We feel that uptake of this approach by the scientific community is inhibited for two reasons: 1) most machine learning algorithms require the selection of summary statistics, which obscures the decision mechanisms by which classifications are arrived at; and 2) they are difficult to implement without appreciable computational skill. We present a method which allows researchers to classify accelerometer data into behavioural classes automatically using a primitive machine learning algorithm, k-nearest neighbour (KNN). Raw acceleration data may be used in KNN without selection of summary statistics, and it is easily implemented using the freeware program R. The method is evaluated by detecting 5 behavioural modes in 8 species, with examples of quadrupedal, bipedal and volant species. Accuracy and precision were found to be comparable with other, more complex methods. In order to assist in the application of this method, the script required to run KNN analysis in R is provided. We envisage that the KNN method may be coupled with methods for investigating animal position, such as GPS telemetry or dead-reckoning, in order to implement an integrated approach to movement ecology research.
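
The paper provides its script in R; as a rough Python equivalent of the core idea — feeding raw acceleration windows to KNN with no hand-picked summary statistics — something like the following would do (file names and array shapes are assumptions):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Hypothetical data: each row is a flattened window of raw tri-axial
# acceleration samples; labels are behavioural modes (e.g. walking, resting).
X = np.load('acc_windows.npy')   # shape (n_windows, window_len * 3), assumed file
y = np.load('behaviours.npy')    # shape (n_windows,), assumed file

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Plain KNN on the raw samples -- no hand-picked summary statistics.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print('accuracy:', knn.score(X_test, y_test))
```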


IEEE Transactions on Image Processing | 2016

Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation

Jingkuan Song; Lianli Gao; Feiping Nie; Heng Tao Shen; Yan Yan; Nicu Sebe

In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available. This is often done by adding a geometry-based regularization term to the objective function of a supervised learning model. In this case, a similarity graph is indispensable to exploit the geometrical relationships among the training data points, and the graph construction scheme essentially determines the performance of these graph-based learning algorithms. However, most of the existing works construct the graph empirically and are usually based on a single feature without using the label information. In this paper, we propose a semi-supervised annotation approach by learning an optimized graph (OGL) from multiple cues (i.e., partial tags and multiple features), which can more accurately embed the relationships among the data points. Since OGL is a transductive method and cannot deal with novel data points, we further extend our model to address the out-of-sample issue. Extensive experiments on image and video annotation show the consistent superiority of OGL over the state-of-the-art methods.
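
The out-of-sample extension is the main addition over the conference version. In its simplest form, such an extension can be approximated by regressing from features to the transductively learned scores, so novel points are scored without rebuilding the graph; the sketch below shows that generic trick, not necessarily the paper's exact formulation.

```python
import numpy as np
from sklearn.linear_model import Ridge

def out_of_sample_extension(X_train, F_train, X_new, lam=1.0):
    """Score novel points from a transductively learned label matrix.

    F_train is the (n, c) soft-score matrix produced on the training graph
    (e.g. by propagate_labels above); a ridge regressor with
    X_train @ W ~ F_train then scores unseen data in one matrix product.
    """
    reg = Ridge(alpha=lam).fit(X_train, F_train)
    return reg.predict(X_new)   # (m, c) scores for the novel points
```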


Pattern Recognition | 2018

Quantization-based hashing: a general framework for scalable image and video retrieval

Jingkuan Song; Lianli Gao; Li Liu; Xiaofeng Zhu; Nicu Sebe

As far as we know, we are the first to propose a general framework to incorporate quantization-based methods into conventional similarity-preserving hashing, in order to improve the effectiveness of hashing methods. In theory, any quantization method can be adopted to reduce the quantization error of any similarity-preserving hashing method to improve its performance. This framework can be applied to both unsupervised and supervised hashing. We experimentally obtained the best performance compared to state-of-the-art supervised and unsupervised hashing methods on six popular datasets. We successfully show it to work on the huge SIFT1B dataset (1 billion data points) by utilizing graph approximation and an out-of-sample extension.

Nowadays, due to the exponential growth of user-generated images and videos, there is an increasing interest in learning-based hashing methods. In computer vision, the hash functions are learned in such a way that the hash codes can preserve essential properties of the original space (or label information). Then the Hamming distance of the hash codes can approximate the data similarity. On the other hand, vector quantization methods quantize the data into different clusters based on the criterion of minimal quantization error, and then perform the search using look-up tables. While hashing methods using Hamming distance can achieve faster search speed, their accuracy is often outperformed by quantization methods with the same code length, due to the latter's lower quantization error and more flexible distance look-ups. To improve the effectiveness of hashing methods, in this work we propose Quantization-based Hashing (QBH), a general framework which incorporates the advantages of quantization-error-reduction methods into conventional property-preserving hashing methods. The learned hash codes simultaneously preserve the properties of the original space and reduce the quantization error, and thus can achieve better performance. Furthermore, the hash functions and a quantizer can be jointly learned and iteratively updated in a unified framework, which can be readily used to generate hash codes or quantize new data points. Importantly, QBH is a generic framework that can be integrated with different property-preserving hashing methods and quantization strategies, and we apply QBH to both unsupervised and supervised hashing models as showcases in this paper. Experimental results on three large-scale unlabeled datasets (i.e., SIFT1M, GIST1M, and SIFT1B), three labeled datasets (i.e., ESPGAME, IAPRTC and MIRFLICKR) and one video dataset (UQ_VIDEO) demonstrate the superior performance of our QBH over existing unsupervised and supervised hashing methods.
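
The trade-off the abstract describes — fast Hamming ranking versus more accurate look-up-table distances — is easy to see side by side. The following sketch contrasts the two search schemes on toy data; it is the background contrast QBH builds on, not an implementation of QBH itself.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 32)).astype(np.float32)   # toy database
q = rng.normal(size=(32,)).astype(np.float32)          # toy query

# --- Hashing side: sign of random projections, ranked by Hamming distance.
P = rng.normal(size=(32, 32))
codes = (X @ P > 0)                        # boolean codes, 32 bits per point
q_code = (q @ P > 0)
hamming = (codes != q_code).sum(axis=1)    # XOR + popcount, very fast

# --- Quantization side: k-means codebook, ranked via a distance look-up table.
km = KMeans(n_clusters=256, n_init=1, random_state=0).fit(X)
lut = ((km.cluster_centers_ - q) ** 2).sum(axis=1)   # query-to-centroid table
quant_dist = lut[km.labels_]               # one table look-up per point

print('hashing top-1:     ', hamming.argmin())
print('quantization top-1:', quant_dist.argmin())
```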


Multimedia Systems | 2017

Learning in high-dimensional multimedia data: the state of the art

Lianli Gao; Jingkuan Song; Xingyi Liu; Junming Shao; Jiajun Liu; Jie Shao

During the last decade, the deluge of multimedia data has impacted a wide range of research areas, including multimedia retrieval, 3D tracking, database management, data mining, machine learning, social media analysis, medical imaging, and so on. Machine learning is largely involved in multimedia applications for building models for classification and regression tasks, etc., and the learning principle consists in designing the models based on the information contained in the multimedia dataset. While many paradigms exist and are widely used in the context of machine learning, most of them suffer from the ‘curse of dimensionality’, which means that strange phenomena appear when data are represented in a high-dimensional space. Given the high dimensionality and the high complexity of multimedia data, it is important to investigate new machine learning algorithms to facilitate multimedia data analysis. To deal with the impact of high dimensionality, an intuitive way is to reduce the dimensionality. On the other hand, some researchers have devoted themselves to designing effective learning schemes for high-dimensional data. In this survey, we cover feature transformation, feature selection and feature encoding, three approaches to fighting the consequences of the curse of dimensionality. Next, we briefly introduce some recent progress in effective learning algorithms. Finally, promising future trends in multimedia learning are envisaged.
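
To make the surveyed families concrete, here is a minimal, purely illustrative pipeline combining supervised feature selection with unsupervised feature transformation before a linear classifier (the data shapes are assumed, and this code is not from the survey):

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

# Feature selection (drop uninformative dimensions), then feature
# transformation (project to a compact subspace), then a classifier.
pipe = Pipeline([
    ('select', SelectKBest(f_classif, k=500)),   # supervised selection
    ('transform', PCA(n_components=64)),         # unsupervised transformation
    ('clf', LinearSVC()),
])
# pipe.fit(X_train, y_train)  # X_train assumed high-dimensional, e.g. (n, 4096)
```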


ACM Multimedia | 2015

Supervised Hashing with Pseudo Labels for Scalable Multimedia Retrieval

Jingkuan Song; Lianli Gao; Yan Yan; Dongxiang Zhang; Nicu Sebe

There is an increasing interest in using hash codes for efficient multimedia retrieval and data storage. The hash functions are learned in such a way that the hash codes can preserve essential properties of the original space or the label information. Then the Hamming distance of the hash codes can approximate the data similarity. Existing works have demonstrated the success of many supervised hashing models. However, labeling data is time- and labor-consuming, especially for scalable datasets. In order to utilize supervised hashing models to improve the discriminative power of hash codes, we propose Supervised Hashing with Pseudo Labels (SHPL), which uses the cluster centers of the training data to generate pseudo labels, based on which the hash codes can be generated using the criteria of supervised hashing. More specifically, we utilize linear discriminant analysis (LDA) with the trace ratio criterion as a showcase for hash function learning, and during the optimization we prove that the pseudo labels and the hash codes can be jointly learned and iteratively updated in a unified framework. The learned hash functions can harness the discriminant power of the trace ratio criterion, and thus can achieve better performance. Experimental results on three large-scale unlabeled datasets (i.e., SIFT1M, GIST1M, and SIFT1B) demonstrate the superior performance of our SHPL over existing hashing methods.
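
A stripped-down version of this pipeline can be sketched as follows: k-means cluster assignments stand in for labels, and an off-the-shelf LDA replaces the paper's trace-ratio criterion, with no joint iterative updating. This is only an illustration of the idea, not the authors' method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def pseudo_label_hashing(X, n_bits=32):
    """Pseudo-label hashing sketch: cluster ids stand in for class labels."""
    # Step 1: pseudo labels from cluster centers (needs > n_bits clusters,
    # since LDA yields at most n_classes - 1 discriminant directions).
    pseudo = KMeans(n_clusters=n_bits + 1, n_init=3, random_state=0).fit_predict(X)
    # Step 2: supervised projection learned against the pseudo labels.
    lda = LinearDiscriminantAnalysis(n_components=n_bits).fit(X, pseudo)
    # Step 3: binarize the projections into hash codes.
    return lda.transform(X) > 0

codes = pseudo_label_hashing(np.random.default_rng(0).normal(size=(2000, 128)))
```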


ACM Multimedia | 2015

Scalable Multimedia Retrieval by Deep Learning Hashing with Relative Similarity Learning

Lianli Gao; Jingkuan Song; Fuhao Zou; Dongxiang Zhang; Jie Shao

Learning-based hashing methods are becoming the mainstream for approximate scalable multimedia retrieval. They consist of two main components: hash code learning for the training data and hash function learning for new data points. Tremendous efforts have been devoted to designing novel methods for these two components, i.e., supervised and unsupervised methods for learning hash codes, and different models for inferring hash functions. However, there is little work integrating supervised and unsupervised hash code learning into a single framework. Moreover, the hash function learning component is usually based on hand-crafted visual features extracted from the training images. The performance of a content-based image retrieval system crucially depends on the feature representation, and such hand-crafted visual features may degrade the accuracy of the hash functions. In this paper, we propose a semi-supervised deep learning hashing (DLH) method for fast multimedia retrieval. More specifically, in the first component, we utilize both visual and label information to learn a relative similarity graph that can more precisely reflect the relationship among the training data, and then generate the hash codes based on the graph. In the second component, we apply a deep convolutional neural network (CNN) to simultaneously learn a good multimedia representation and the hash functions. Extensive experiments on three popular datasets demonstrate the superiority of our DLH over both supervised and unsupervised hashing methods.
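
The second component — a CNN that learns the representation and the hash functions together — can be sketched as a small network with a tanh-relaxed hashing layer fitted to precomputed target codes. The architecture and training loop below are invented for illustration and are far simpler than the paper's setup.

```python
import torch
import torch.nn as nn

class HashNet(nn.Module):
    """Toy CNN hashing head: features and hash functions learned together."""
    def __init__(self, n_bits=48):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.hash_layer = nn.Linear(64, n_bits)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return torch.tanh(self.hash_layer(h))    # relaxed codes in (-1, 1)

# Training sketch: fit relaxed codes to graph-derived target codes B,
# then binarize with sign() at retrieval time.
net = HashNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
images = torch.randn(8, 3, 64, 64)                 # stand-in batch
B = torch.randint(0, 2, (8, 48)).float() * 2 - 1   # stand-in target codes
opt.zero_grad()
loss = ((net(images) - B) ** 2).mean()
loss.backward()
opt.step()
```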


IEEE Transactions on Multimedia | 2018

Two-Stream 3-D convNet Fusion for Action Recognition in Videos With Arbitrary Size and Length

Xuanhan Wang; Lianli Gao; Peng Wang; Xiaoshuai Sun; Xianglong Liu

3-D convolutional neural networks (3-D convNets) have recently been proposed for action recognition in videos, and promising results have been achieved. However, existing 3-D convNets have two “artificial” requirements that may reduce the quality of video analysis: 1) they require a fixed-size (e.g., 112 × 112) input video; and 2) most 3-D convNets require a fixed-length input (i.e., video shots with a fixed number of frames). To tackle these issues, we propose an end-to-end pipeline named Two-stream 3-D convNet Fusion, which can recognize human actions in videos of arbitrary size and length using multiple features. Specifically, we decompose a video into spatial and temporal shots. By taking a sequence of shots as input, each stream is implemented using a spatial temporal pyramid pooling (STPP) convNet with a long short-term memory (LSTM) or CNN-E model, the softmax scores of which are combined by late fusion. We devise the STPP convNet to extract equal-dimensional descriptions for each variable-size shot, and we adopt the LSTM/CNN-E model to learn a global description for the input video using these time-varying descriptions. With these advantages, our method should improve all 3-D CNN-based video analysis methods. We empirically evaluate our method for action recognition in videos, and the experimental results show that our method outperforms the state-of-the-art methods (both 2-D and 3-D based) on three standard benchmark datasets (UCF101, HMDB51 and ACT).
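
The pooling trick that removes the fixed-size requirement can be sketched in a few lines: adaptive pooling over a pyramid of grids yields a descriptor whose length is independent of the input resolution. This is a generic spatial pyramid pooling layer, not the paper's exact STPP convNet.

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Pool a (N, C, H, W) feature map into a fixed-length vector.

    Each pyramid level partitions the map into level x level bins via
    adaptive max pooling, so the output size is independent of H and W --
    the property that removes the fixed-size input requirement.
    """
    pooled = [F.adaptive_max_pool2d(fmap, lvl).flatten(1) for lvl in levels]
    return torch.cat(pooled, dim=1)   # (N, C * (1 + 4 + 16))

# Two different spatial sizes, identical descriptor length:
a = spatial_pyramid_pool(torch.randn(2, 64, 28, 28))
b = spatial_pyramid_pool(torch.randn(2, 64, 17, 23))
assert a.shape == b.shape == (2, 64 * 21)
```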


The Journal of Experimental Biology | 2013

Creating a behavioural classification module for acceleration data: using a captive surrogate for difficult to observe species

Hamish A. Campbell; Lianli Gao; Owen R. Bidder; Jane Hunter; Craig E. Franklin


Ecological Informatics | 2013

A Web-based semantic tagging and activity recognition system for species' accelerometry data

Lianli Gao; Hamish A. Campbell; Owen R. Bidder; Jane Hunter


Collaboration


Dive into Lianli Gao's collaborations.

Top Co-Authors

Jingkuan Song
University of Electronic Science and Technology of China

Heng Tao Shen
University of Electronic Science and Technology of China

Xing Xu
University of Electronic Science and Technology of China

Jie Shao
University of Electronic Science and Technology of China

Jane Hunter
University of Queensland

Fumin Shen
University of Electronic Science and Technology of China

Dongxiang Zhang
University of Electronic Science and Technology of China

Fuhao Zou
Huazhong University of Science and Technology