Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Xiangmin Xu is active.

Publication


Featured research published by Xiangmin Xu.


IEEE Transactions on Image Processing | 2016

DehazeNet: An End-to-End System for Single Image Haze Removal

Bolun Cai; Xiangmin Xu; Kui Jia; Chunmei Qing; Dacheng Tao

Single image haze removal is a challenging ill-posed problem. Existing methods use various constraints/priors to obtain plausible dehazing solutions. The key to achieving haze removal is to estimate a medium transmission map for an input hazy image. In this paper, we propose a trainable end-to-end system called DehazeNet for medium transmission estimation. DehazeNet takes a hazy image as input and outputs its medium transmission map, which is subsequently used to recover a haze-free image via the atmospheric scattering model. DehazeNet adopts a convolutional neural network-based deep architecture, whose layers are specially designed to embody the established assumptions/priors in image dehazing. Specifically, layers of Maxout units are used for feature extraction, which can generate almost all haze-relevant features. We also propose a novel nonlinear activation function in DehazeNet, called the bilateral rectified linear unit, which improves the quality of the recovered haze-free image. We establish connections between the components of the proposed DehazeNet and those used in existing methods. Experiments on benchmark images show that DehazeNet achieves superior performance over existing methods while remaining efficient and easy to use.
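The recovery step the abstract mentions follows the standard atmospheric scattering model, I = J·t + A·(1 - t). As a simplified illustration in plain NumPy (not the paper's CNN; `recover_scene` and `brelu` are hypothetical names), the inversion and a BReLU-style clamp can be sketched as:

```python
import numpy as np

def brelu(x, t_min=0.05, t_max=1.0):
    """BReLU-style activation: clamp values to [t_min, t_max],
    keeping estimated transmission in a physically plausible range."""
    return np.clip(x, t_min, t_max)

def recover_scene(hazy, transmission, airlight, t0=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t).

    hazy: (H, W, 3) image, transmission: (H, W) map, airlight: scalar A.
    """
    t = np.maximum(transmission, t0)            # floor t to avoid amplifying noise
    return (hazy - airlight) / t[..., None] + airlight
```

The `t0` floor is the usual guard against division blow-up in nearly opaque regions.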


IEEE Transactions on Image Processing | 2016

Simple to Complex Transfer Learning for Action Recognition

Fang Liu; Xiangmin Xu; Shuoyang Qiu; Chunmei Qing; Dacheng Tao

Recognizing complex human actions is very challenging, since training a robust learning model requires a large amount of labeled data, which is difficult to acquire. Considering that each complex action is composed of a sequence of simple actions which can be easily obtained from existing data sets, this paper presents a simple to complex action transfer learning model (SCA-TLM) for complex human action recognition. SCA-TLM improves the performance of complex action recognition by leveraging the abundant labeled simple actions. In particular, it optimizes the weight parameters, enabling the complex actions to be reconstructed from simple actions. The optimal reconstruction coefficients are acquired by minimizing the objective function, and the target weight parameters are then represented as a combination of source weight parameters. The main advantage of the proposed SCA-TLM compared with existing approaches is that we exploit simple actions to recognize complex actions instead of only using complex actions as training samples. To validate the proposed SCA-TLM, we conduct extensive experiments on two well-known complex action data sets: 1) Olympic Sports data set and 2) UCF50 data set. The results show the effectiveness of the proposed SCA-TLM for complex action recognition.
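The idea of representing target weights as a combination of source weights can be sketched as an ordinary least-squares problem. The snippet below is a simplified illustration under that assumption (the helper name is hypothetical; the paper optimizes a richer objective):

```python
import numpy as np

def reconstruct_target_weights(W_source, w_target_init):
    """Find coefficients a minimizing ||w_target_init - W_source @ a||^2,
    then represent the target weight vector as the combination W_source @ a.

    W_source: (d, k) matrix whose columns are simple-action weight vectors.
    w_target_init: (d,) initial complex-action weight vector.
    """
    a, *_ = np.linalg.lstsq(W_source, w_target_init, rcond=None)
    return W_source @ a, a
```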


IEEE Transactions on Image Processing | 2015

Temporal Variance Analysis for Action Recognition

Jie Miao; Xiangmin Xu; Shuoyang Qiu; Chunmei Qing; Dacheng Tao

Slow feature analysis (SFA) extracts slowly varying signals from input data and has been used to model complex cells in the primary visual cortex (V1). It transmits information to both ventral and dorsal pathways to process appearance and motion information, respectively. However, SFA only uses slowly varying features for local feature extraction, because they represent appearance information more effectively than motion information. To better utilize temporal information, we propose temporal variance analysis (TVA) as a generalization of SFA. TVA learns a linear transformation matrix that projects multidimensional temporal data to temporal components with temporal variance. Inspired by the function of V1, we learn receptive fields by TVA and apply convolution and pooling to extract local features. Embedded in the improved dense trajectory framework, TVA for action recognition is proposed to: 1) extract appearance and motion features from grayscale frames using slow and fast filters, respectively; 2) extract additional motion features using slow filters from horizontal and vertical optical flows; and 3) separately encode extracted local features with different temporal variances and concatenate all the encoded features as final features. We evaluate the proposed TVA features on several challenging data sets and show that both slow and fast features are useful in the low-level feature extraction. Experimental results show that the proposed TVA features outperform the conventional histogram-based features, and excellent results can be achieved by combining all TVA features.
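The projection TVA learns can be illustrated with the classic slow-feature recipe it generalizes: whiten the data, then eigendecompose the covariance of temporal differences, so that components come out ordered from low to high temporal variance. A toy NumPy sketch, assuming full-rank data covariance (not the paper's exact formulation):

```python
import numpy as np

def tva_filters(X):
    """X: (T, d) multidimensional temporal signal. Returns projection
    directions ordered from slowest (smallest temporal variance of the
    projected signal) to fastest, via the generalized eigenproblem
    C_dot w = lambda C w, where C_dot is the covariance of differences."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(Xc)                 # data covariance
    dX = np.diff(Xc, axis=0)
    C_dot = dX.T @ dX / len(dX)             # temporal-difference covariance
    evals, U = np.linalg.eigh(C)            # assumes C is full rank
    W_white = U / np.sqrt(evals)            # columns whiten the data
    M = W_white.T @ C_dot @ W_white
    lam, V = np.linalg.eigh(M)              # ascending: slow -> fast
    return W_white @ V, lam
```

Projecting onto the first column then yields the slowest component; the last column gives the fastest, which is the extra motion-like signal TVA exploits.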


Multidimensional Systems and Signal Processing | 2016

Underwater video dehazing based on spatial-temporal information fusion

Chunmei Qing; Feng Yu; Xiangmin Xu; Wenyou Huang; Jianxiu Jin

In this paper, a novel multidimensional underwater video dehazing method is presented to restore and enhance underwater degraded videos. Underwater videos suffer from medium scattering and light absorption. The absorption of light traveling in water makes underwater hazy videos different from atmospheric hazy videos. In order to dehaze underwater videos, a spatial-temporal information fusion method is proposed which includes two main parts. One is transmission estimation, which is based on the correlation between adjacent frames of the video to keep color consistency, where fast tracking and the least squares method are used to reduce the influence of camera and object motion and water flow. The other is background light estimation, which keeps the atmospheric light values consistent across a video. Extensive experimental results demonstrate that the proposed algorithm has superior haze removal and color balancing capabilities.


IEEE Transactions on Neural Networks | 2018

GoDec+: Fast and Robust Low-Rank Matrix Decomposition Based on Maximum Correntropy

Kailing Guo; Liu Liu; Xiangmin Xu; Dong Xu; Dacheng Tao

GoDec is an efficient low-rank matrix decomposition algorithm. However, optimal performance depends on sparse errors and Gaussian noise. This paper aims to address the problem that a matrix is composed of a low-rank component and unknown corruptions. We introduce a robust local similarity measure called correntropy to describe the corruptions and, in doing so, obtain a more robust and faster low-rank decomposition algorithm: GoDec+. Based on half-quadratic optimization and the greedy bilateral paradigm, we deliver a solution to the maximum correntropy criterion (MCC)-based low-rank decomposition problem. Experimental results show that GoDec+ is efficient and robust to different corruptions including Gaussian noise, Laplacian noise, salt & pepper noise, and occlusion on both synthetic and real vision data. We further apply GoDec+ to more general applications including classification and subspace clustering. For classification, we construct an ensemble subspace from the low-rank GoDec+ matrix and introduce an MCC-based classifier. For subspace clustering, we utilize the low-rank matrix produced by GoDec+ for MCC-based self-expression and combine it with spectral clustering. Face recognition, motion segmentation, and face clustering experiments show that the proposed methods are effective and robust. In particular, we achieve the state-of-the-art performance on the Hopkins 155 data set and the first 10 subjects of extended Yale B for subspace clustering.
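The half-quadratic treatment of the correntropy (Welsch) loss can be sketched as alternating between correntropy weights and a truncated-SVD update. This is a minimal illustration of the principle, not the paper's optimized greedy bilateral solver:

```python
import numpy as np

def godec_plus_sketch(X, rank, sigma=1.0, iters=20):
    """Alternate between (1) a rank-r fit L of a reweighted surrogate and
    (2) half-quadratic weights w = exp(-(X - L)^2 / (2 sigma^2)) from the
    Welsch/correntropy loss, which down-weight corrupted entries."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]        # plain rank-r init
    for _ in range(iters):
        W = np.exp(-(X - L) ** 2 / (2 * sigma ** 2))  # correntropy weights
        Y = W * X + (1 - W) * L                        # weighted surrogate
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return L
```

Entries with large residuals get weights near zero, so a gross corruption is gradually replaced by the current low-rank estimate rather than fitted.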


Computer Vision and Pattern Recognition | 2017

Improving Training of Deep Neural Networks via Singular Value Bounding

Kui Jia; Dacheng Tao; Shenghua Gao; Xiangmin Xu

Deep learning methods have recently achieved great success on many computer vision problems. In spite of these practical successes, optimization of deep networks remains an active topic in deep learning research. In this work, we focus on investigation of the network solution properties that can potentially lead to good performance. Our research is inspired by theoretical and empirical results that use orthogonal matrices to initialize networks, but we are interested in investigating how orthogonal weight matrices perform when network training converges. To this end, we propose to constrain the solutions of weight matrices in the orthogonal feasible set during the whole process of network training, and achieve this by a simple yet effective method called Singular Value Bounding (SVB). In SVB, all singular values of each weight matrix are simply bounded in a narrow band around the value of 1. Based on the same motivation, we also propose Bounded Batch Normalization (BBN), which improves Batch Normalization by removing its potential risk of ill-conditioned layer transform. We present both theoretical and empirical results to justify our proposed methods. Experiments on benchmark image classification datasets show the efficacy of our proposed SVB and BBN. In particular, we achieve the state-of-the-art results of 3.06% error rate on CIFAR10 and 16.90% on CIFAR100, using off-the-shelf network architectures (Wide ResNets). Our preliminary results on ImageNet also show the promise in large-scale learning. We release the implementation code of our methods at www.aperture-lab.net/research/svb.
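The core SVB operation is simple to state: project each weight matrix so its singular values lie in a narrow band around 1. A minimal sketch of that projection (the paper applies it periodically during SGD; the function name and `eps` band are illustrative):

```python
import numpy as np

def singular_value_bound(W, eps=0.05):
    """Project a weight matrix so all singular values lie in the
    narrow band [1/(1+eps), 1+eps] around 1."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.clip(s, 1.0 / (1.0 + eps), 1.0 + eps)
    return (U * s) @ Vt
```

Keeping singular values near 1 makes each layer transform close to an isometry, which is the conditioning property the paper argues for.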


International Conference on Digital Signal Processing | 2015

Underwater image enhancement with an adaptive dehazing framework

Chunmei Qing; Wenyou Huang; Siqi Zhu; Xiangmin Xu

In this paper, we propose a novel method to restore and enhance underwater degraded images. Underwater images suffer from two major problems of distortion: scattering and color change. Scattering is caused by large suspended particles, and color distortion is caused by the varying degrees of attenuation encountered by light of different wavelengths traveling in the water, rendering some characteristics that differ from foggy images in the atmosphere. Our key contribution is an adaptive dehazing framework for underwater image enhancement, which includes two main parts: adaptive underwater brightness estimation and locally adaptive histogram equalization. The enhanced images are characterized by more accurate exposure, especially in dark regions, and significantly improved contrast that brings out details and edges. Extensive experimental results demonstrate that our method achieves higher quality than state-of-the-art methods.
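The histogram-equalization part can be illustrated with the simple global variant (the paper uses a locally adaptive one). Assuming 8-bit integer grayscale input, a standard sketch is:

```python
import numpy as np

def equalize_histogram(img, levels=256):
    """Simplified global histogram equalization: map intensities through
    the normalized cumulative histogram to spread contrast.

    img: integer array with values in [0, levels); assumes the image is
    not constant (otherwise the normalization below divides by zero)."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum().astype(float)
    cmin = cdf[cdf > 0].min()                   # lowest occupied bin
    cdf = (cdf - cmin) / (cdf[-1] - cmin)       # normalize to [0, 1]
    return np.round(cdf[img] * (levels - 1)).astype(np.uint8)
```

A locally adaptive version applies the same mapping per tile with interpolation, which is what recovers detail in dark underwater regions.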


Multimedia Tools and Applications | 2017

Robust object tracking based on sparse representation and incremental weighted PCA

Xiaofen Xing; Fuhao Qiu; Xiangmin Xu; Chunmei Qing; Yinrong Wu

Object tracking plays a crucial role in many applications of computer vision, but it is still a challenging problem due to variations of illumination, shape deformation, and occlusion. A new robust tracking method based on incremental weighted PCA and sparse representation is proposed. An iterative process consisting of a soft segmentation step and a foreground distribution update step is adopted to estimate the foreground distribution; cooperating with incremental weighted PCA, this yields the target appearance in terms of PCA components with less impact from the background in the target templates. In order to make the target appearance model more discriminative, trivial and background templates are both added to the dictionary for sparse representation of the target appearance. Experiments show that the proposed method, with some level of background awareness, is robust against illumination change, occlusion, and appearance variation, and outperforms several recent important tracking methods in terms of tracking performance.
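Sparse representation over a dictionary that stacks target and trivial templates is typically solved as an l1-regularized least-squares problem. The ISTA routine below is one standard solver for that formulation, shown as an illustration rather than necessarily the paper's solver:

```python
import numpy as np

def ista_sparse_code(D, y, lam=0.1, iters=200):
    """Solve min_x 0.5 * ||y - D x||^2 + lam * ||x||_1 by ISTA.
    In the tracker, D stacks target templates with trivial (identity)
    templates so occlusions are absorbed by the trivial coefficients."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):
        g = x + D.T @ (y - D @ x) / L        # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x
```

Large coefficients on the trivial part of the dictionary then signal occlusion, while the reconstruction error over the target part scores each candidate.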


Pacific Rim Conference on Multimedia | 2017

Single Image Super-Resolution Using Multi-scale Convolutional Neural Network

Xiaoyi Jia; Xiangmin Xu; Bolun Cai; Kailing Guo

Methods based on convolutional neural network (CNN) have demonstrated tremendous improvements on single image super-resolution. However, the previous methods mainly restore images from one single area in the low-resolution (LR) input, which limits the flexibility of models to infer various scales of details for high-resolution (HR) output. Moreover, most of them train a specific model for each up-scale factor. In this paper, we propose a multi-scale super-resolution (MSSR) network. Our network consists of multi-scale paths to make the HR inference, which can learn to synthesize features from different scales. This property helps reconstruct various kinds of regions in HR images. In addition, only one single model is needed for multiple up-scale factors, which is more efficient without loss of restoration quality. Experiments on four public datasets demonstrate that the proposed method achieves state-of-the-art performance at high speed.
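The multi-scale intuition (receptive fields of several sizes feeding one fusion stage) can be shown with a toy feature extractor. The box filters below are a stand-in for the learned convolutions of the actual network:

```python
import numpy as np

def multiscale_features(img, scales=(1, 3, 5)):
    """Filter a 2-D image with receptive fields of different sizes
    (simple box filters here) and stack the responses, so a later stage
    can fuse information from several scales."""
    H, W = img.shape
    feats = []
    for k in scales:
        pad = k // 2
        padded = np.pad(img, pad, mode='edge')
        out = np.zeros_like(img, dtype=float)
        for i in range(H):
            for j in range(W):
                out[i, j] = padded[i:i + k, j:j + k].mean()
        feats.append(out)
    return np.stack(feats, axis=0)   # (num_scales, H, W)
```

In the real network each path is a stack of learned convolutions and the fused maps feed the final HR reconstruction.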


Visual Communications and Image Processing | 2016

Image and video dehazing using view-based cluster segmentation

Feng Yu; Chunmei Qing; Xiangmin Xu; Bolun Cai

To avoid distortion in sky regions and make the sky and white objects clear, in this paper we propose a new image and video dehazing method utilizing view-based cluster segmentation. Firstly, a GMM (Gaussian Mixture Model) is utilized to cluster the depth map based on the distant view to estimate the sky region, and the transmission estimate is then modified to reduce distortion. Secondly, we propose using a GMM based on the Color Attenuation Prior to divide a single hazy image into K classes, so that the atmospheric light estimation is refined to improve global contrast. Finally, online GMM clustering is applied to video dehazing. Extensive experimental results demonstrate that the proposed algorithm has superior haze removal and color balancing capabilities.
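GMM clustering itself can be illustrated with a tiny 1-D EM fit, e.g. separating distant (sky-like) depth values from near-scene ones. This is a didactic sketch, not the paper's full view-based segmentation:

```python
import numpy as np

def gmm_1d_em(x, iters=50):
    """Fit a two-component 1-D Gaussian mixture by EM.
    Returns component means, variances, and mixing weights."""
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])            # spread-apart init
    var = np.array([x.var(), x.var()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities of each component for each point
        p = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
            / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: update mixture parameters
        n = r.sum(axis=0)
        pi = n / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n + 1e-6
    return mu, var, pi
```

Thresholding on the responsibilities then gives the sky/non-sky split that the transmission refinement relies on.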

Collaboration


Dive into Xiangmin Xu's collaborations.

Top Co-Authors

Chunmei Qing (South China University of Technology)
Bolun Cai (South China University of Technology)
Xiaofen Xing (South China University of Technology)
Jie Miao (South China University of Technology)
Jianxiu Jin (South China University of Technology)
Kailing Guo (South China University of Technology)
Fang Liu (South China University of Technology)
Tong Zhang (South China University of Technology)
Fuhao Qiu (South China University of Technology)