Xiaolin Hu | Researchain

Publication

Featured research published by Xiaolin Hu.

computer vision and pattern recognition | 2015

Recurrent convolutional neural network for object recognition

Ming Liang; Xiaolin Hu

In recent years, the convolutional neural network (CNN) has achieved great success in many computer vision tasks. Partially inspired by neuroscience, CNN shares many properties with the visual system of the brain. A prominent difference is that CNN is typically a feed-forward architecture, while in the visual system recurrent connections are abundant. Inspired by this fact, we propose a recurrent CNN (RCNN) for object recognition by incorporating recurrent connections into each convolutional layer. Though the input is static, the activities of RCNN units evolve over time so that the activity of each unit is modulated by the activities of its neighboring units. This property enhances the ability of the model to integrate context information, which is important for object recognition. Like other recurrent neural networks, unfolding the RCNN through time can result in an arbitrarily deep network with a fixed number of parameters. Furthermore, the unfolded network has multiple paths, which can facilitate the learning process. The model is tested on four benchmark object recognition datasets: CIFAR-10, CIFAR-100, MNIST and SVHN. With fewer trainable parameters, RCNN outperforms the state-of-the-art models on all of these datasets. Increasing the number of parameters leads to even better performance. These results demonstrate the advantage of the recurrent structure over a purely feed-forward structure for object recognition.
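The recurrent convolutional layer described in this abstract can be illustrated with a minimal NumPy sketch. It assumes a single channel, an odd-sized kernel, zero padding and a ReLU nonlinearity; the names `conv2d_same` and `recurrent_conv_layer` are hypothetical, and details of the published model (multiple channels, normalization, pooling) are omitted.

```python
import numpy as np

def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding ('same' output size)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # zero padding by default
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def recurrent_conv_layer(u, w_ff, w_rec, bias=0.0, steps=3):
    """Unfold one recurrent convolutional layer for `steps` recurrent iterations.

    u     : static input map (H, W)
    w_ff  : feed-forward kernel applied to the input
    w_rec : recurrent kernel applied to the layer's own previous state
    """
    ff = conv2d_same(u, w_ff)            # feed-forward drive, computed once
    state = np.maximum(ff + bias, 0.0)   # t = 0: purely feed-forward response
    for _ in range(steps):
        # Each unit is modulated by the previous activity of its neighbours.
        state = np.maximum(ff + conv2d_same(state, w_rec) + bias, 0.0)
    return state
```

Each additional step deepens the unfolded network while reusing the same two kernels, so the parameter count stays fixed regardless of `steps`.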

computer vision and pattern recognition | 2014

A Reverse Hierarchy Model for Predicting Eye Fixations

Tianlin Shi; Ming Liang; Xiaolin Hu

A body of psychological and physiological evidence suggests that early visual attention works in a coarse-to-fine way, which lays the basis for the reverse hierarchy theory (RHT). This theory states that attention propagates from the top level of the visual hierarchy, which processes the gist and abstract information of the input, to the bottom level, which processes local details. Inspired by the theory, we develop a computational model for saliency detection in images. First, the original image is downsampled to different scales to constitute a pyramid. Then, saliency on each layer is defined as its unpredictability under super-resolution reconstruction from the layer above, that is, from coarse to fine. Finally, saliency on each layer of the pyramid is fused into stochastic fixations through a probabilistic model, where attention initiates from the top layer and propagates downward through the pyramid. Extensive experiments on two standard eye-tracking datasets show that the proposed method achieves results competitive with state-of-the-art models.
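The pyramid and the reconstruction-error notion of saliency in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: nearest-neighbour upsampling stands in for super-resolution reconstruction, the probabilistic fusion into fixations is left out, and the function names are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    even = img[:h - h % 2, :w - w % 2]  # drop a trailing odd row/column
    return even.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(img):
    """Double resolution by nearest-neighbour repetition
    (a simple stand-in for super-resolution reconstruction)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def pyramid_saliency(img, levels=3):
    """Return one saliency map per pyramid layer, finest layer first.

    Saliency is the absolute error when a layer is predicted from
    the next coarser layer.
    """
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        pyramid.append(downsample2(pyramid[-1]))
    saliency = []
    for fine, coarse in zip(pyramid[:-1], pyramid[1:]):
        recon = upsample2(coarse)
        fine_crop = fine[:recon.shape[0], :recon.shape[1]]
        saliency.append(np.abs(fine_crop - recon))
    return saliency
```

A uniform image is perfectly predictable from its coarser version and yields zero saliency everywhere, while an isolated bright pixel produces a peak at its own location on the finest layer.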

international symposium on neural networks | 2017

FxpNet: Training a deep convolutional neural network in fixed-point representation

Xi Chen; Xiaolin Hu; Hucheng Zhou; Ningyi Xu

We introduce FxpNet, a framework to train deep convolutional neural networks with low bit-width arithmetic in both the forward and backward passes. During training, FxpNet further reduces the bit-width of stored parameters (also known as primal parameters) by adaptively updating their fixed-point formats. In previous binarized and quantized neural networks, these primal parameters are usually represented as full-resolution floating-point values. In FxpNet, during the forward pass fixed-point primal weights and activations are first binarized before computation, while in the backward pass all gradients are represented as low-resolution fixed-point values and then accumulated to the corresponding fixed-point primal parameters. To enable highly efficient implementations on FPGAs, ASICs and other dedicated devices, FxpNet introduces Integer Batch Normalization (IBN) and Fixed-point ADAM (FxpADAM) methods to further reduce the required floating-point operations, which saves considerable power and chip area. Evaluation on the CIFAR-10 dataset shows that FxpNet with 12-bit primal parameters and 12-bit gradients achieves prediction accuracy comparable to state-of-the-art binarized and quantized neural networks.
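To make the fixed-point terminology concrete, the following sketch quantizes values to a signed fixed-point grid, picks a fractional length from the data, and binarizes weights for the forward pass. It is only an illustration: the format-selection rule in `choose_frac_bits` is a hypothetical stand-in for the adaptive update in the paper, and IBN and FxpADAM are not shown.

```python
import numpy as np

def to_fixed_point(x, bits=12, frac_bits=8):
    """Round to a signed fixed-point grid with `bits` total bits,
    `frac_bits` of them fractional, saturating on overflow."""
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = np.clip(np.round(np.asarray(x, dtype=float) * scale), qmin, qmax)
    return q / scale

def choose_frac_bits(x, bits=12):
    """Hypothetical format rule: the largest fractional length whose range
    still covers the largest magnitude in x (up to one rounding step)."""
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return bits - 1
    int_bits = max(0, int(np.floor(np.log2(max_abs))) + 1)
    return bits - 1 - int_bits  # one bit is reserved for the sign

def binarize(x):
    """Map values to {-1, +1}; zero is sent to +1."""
    return np.where(np.asarray(x) >= 0, 1.0, -1.0)
```

For example, a 12-bit tensor whose largest magnitude is 3.7 needs two integer bits, leaving nine fractional bits and a resolution of 1/512.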

international symposium on neural networks | 2017

Accelerating convolutional neural networks by group-wise 2D-filter pruning

Niange Yu; Shi Qiu; Xiaolin Hu; Jianmin Li

Network pruning is an effective way to accelerate Convolutional Neural Networks (CNNs). In recent years, structured pruning methods have been proposed in favor of unstructured methods because they have shown greater speedup in practical use. Existing structured methods prune along two main dimensions: 3D-filter wise, i.e., removing a 3D-filter as a whole, and filter-shape wise, i.e., removing the same position from all 3D-filters. In this work, we propose a new group-wise 2D-filter pruning approach that is orthogonal and complementary to the existing methods. The proposed approach removes a portion of the 2D-filters from each 3D-filter according to pruning patterns learned from the data, and leads to compressed models that do not require sophisticated implementations of convolution operations. A fine-tuning process follows to recover the accuracy. The knowledge distillation (KD) framework is explored in the fine-tuning process to improve the performance. We present our method for learning the pruning patterns as well as the fine-tuning strategy based on knowledge distillation. The proposed approach is validated on two representative CNN models, ZF and VGG16, pre-trained on ILSVRC12. Experimental results demonstrate the effectiveness of our approach. On VGG16, we obtain even higher accuracy after speeding up the network by 4 times.
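One plausible reading of group-wise 2D-filter pruning is sketched below: the 3D-filters (output channels) are split into groups, and every filter in a group keeps the same subset of 2D-filters (input channels), so each group remains a smaller dense convolution. The L1-norm score used here is an assumption standing in for the learned pruning patterns, and `groupwise_2d_prune` is a hypothetical name.

```python
import numpy as np

def groupwise_2d_prune(weights, num_groups=2, keep_ratio=0.5):
    """Zero out 2D-filters group by group.

    weights : array of shape (out_channels, in_channels, kh, kw)
    Returns the pruned weights and, per group, the kept input channels.
    """
    out_ch, in_ch, _, _ = weights.shape
    keep = max(1, int(round(in_ch * keep_ratio)))
    pruned = np.zeros_like(weights)
    kept_channels = []
    for group in np.array_split(np.arange(out_ch), num_groups):
        # Score each input channel by the L1 norm of its 2D-filters in this
        # group (stand-in for a pruning pattern learned from data).
        scores = np.abs(weights[group]).sum(axis=(0, 2, 3))
        top = np.sort(np.argsort(scores)[-keep:])
        # Copy only the kept 2D-filters; everything else stays zero.
        pruned[np.ix_(group, top)] = weights[np.ix_(group, top)]
        kept_channels.append(top)
    return pruned, kept_channels
```

Because all filters in a group share one pattern, the surviving weights of each group can be packed into a dense tensor of shape `(group_size, keep, kh, kw)` and run with an ordinary convolution over the kept input channels.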

arXiv: Computer Vision and Pattern Recognition | 2018

Adversarial Attacks and Defences Competition

Alexey Kurakin; Ian J. Goodfellow; Samy Bengio; Yinpeng Dong; Fangzhou Liao; Ming Liang; Tianyu Pang; Jun Zhu; Xiaolin Hu; Cihang Xie; Jianyu Wang; Zhishuai Zhang; Zhou Ren; Alan L. Yuille; Sangxia Huang; Yao Zhao; Yuzhe Zhao; Zhonglin Han; Junjiajia Long; Yerkebulan Berdibekov; Takuya Akiba; Seiya Tokui; Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them. In this chapter, we describe the structure and organization of the competition and the solutions developed by several of the top-placing teams.

international symposium on neural networks | 2015

Interlinked Convolutional Neural Networks for Face Parsing

Yisu Zhou; Xiaolin Hu; Bo Zhang

Face parsing is a basic task in face image analysis. It amounts to labeling each pixel with the appropriate facial part, such as eyes or nose. In this paper, we present an interlinked convolutional neural network (iCNN) for solving this problem in an end-to-end fashion. It consists of multiple convolutional neural networks (CNNs) taking input at different scales. A special interlinking layer is designed to allow the CNNs to exchange information, enabling them to integrate local and contextual information efficiently. The hallmark of iCNN is the extensive use of downsampling and upsampling in the interlinking layers, while traditional CNNs usually use downsampling only. A two-stage pipeline is proposed for face parsing, and both stages use iCNN. The first stage localizes facial parts in the size-reduced image, and the second stage labels the pixels in the identified facial parts in the original image. On a benchmark dataset we have obtained better results than the state-of-the-art methods.
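The information exchange performed by an interlinking layer can be pictured with the sketch below. It assumes feature maps of shape (C, H, W) with even spatial sizes, each scale half the size of the previous one, 2x2 max pooling for downsampling, nearest-neighbour upsampling, and channel concatenation as the way neighbouring scales are combined; these choices and the function names are illustrative assumptions rather than the published design.

```python
import numpy as np

def down2(x):
    """Halve the spatial size of a (C, H, W) map by 2x2 max pooling."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def up2(x):
    """Double the spatial size of a (C, H, W) map by nearest-neighbour repetition."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def interlink(maps):
    """Exchange information between adjacent scales.

    maps : list of (C, H, W) arrays, finest first.
    Each output stacks a scale's own map with its resampled neighbours.
    """
    out = []
    for i, m in enumerate(maps):
        parts = [m]
        if i > 0:
            parts.append(down2(maps[i - 1]))  # local detail from the finer scale
        if i + 1 < len(maps):
            parts.append(up2(maps[i + 1]))    # context from the coarser scale
        out.append(np.concatenate(parts, axis=0))
    return out
```

A convolution applied after this step would then see both local and contextual features at every scale.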

international symposium on neural networks | 2017

Reservoir Computing with a Small-World Network for Discriminating Two Sequential Stimuli

Ke Bai; Fangzhou Liao; Xiaolin Hu

Recently, a reservoir network was used to simulate the sequential stimuli discrimination process of monkeys. To deal with the inefficient memory of a randomly connected network, a winner-take-all (WTA) subnetwork was used. In this study, we show that a network with the small-world property makes the WTA subnetwork unnecessary. Using reinforcement learning in the output layer only, the proposed network successfully learns to accomplish the same discrimination task. In addition, the model neurons exhibit heterogeneous firing properties, which is consistent with physiological data.
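A reservoir with small-world connectivity can be sketched as follows. The construction is a Watts-Strogatz-style ring lattice with random rewiring, and the state update is a generic echo-state rule with `tanh` units; both are assumptions for illustration, the parameter values are arbitrary, and the reinforcement-learned readout from the paper is not included.

```python
import numpy as np

def small_world_adjacency(n, k=4, p=0.1, seed=0):
    """Ring lattice with k nearest neighbours per node; each lattice edge is
    redirected to a random target with probability p. Assumes n > k + 1."""
    rng = np.random.default_rng(seed)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            if rng.random() < p:
                # Rewire to a node that is neither i nor already a neighbour.
                candidates = np.flatnonzero(~adj[i] & (np.arange(n) != i))
                j = int(rng.choice(candidates))
            adj[i, j] = adj[j, i] = True
    return adj

def run_reservoir(adj, inputs, spectral_radius=0.9, seed=0):
    """Drive a fixed recurrent network with a scalar input sequence and
    return the state at every time step, shape (T, n)."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    w = adj * rng.normal(size=(n, n))  # weights only on existing edges
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    w_in = rng.normal(size=n)
    x = np.zeros(n)
    states = []
    for u in inputs:
        x = np.tanh(w @ x + w_in * u)
        states.append(x)
    return np.array(states)
```

Two stimuli separated by a delay can be presented as an input sequence such as `[1.0, 0.0, 0.0, 0.5, 0.0]`; a readout trained on the final state would then have to rely on whatever trace of the first stimulus the reservoir retains.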

computer vision and pattern recognition | 2018