Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Bolun Cai is active.

Publication


Featured research published by Bolun Cai.


IEEE Transactions on Image Processing | 2016

DehazeNet: An End-to-End System for Single Image Haze Removal

Bolun Cai; Xiangmin Xu; Kui Jia; Chunmei Qing; Dacheng Tao

Single image haze removal is a challenging ill-posed problem. Existing methods use various constraints/priors to obtain plausible dehazing solutions. The key to haze removal is estimating a medium transmission map for the input hazy image. In this paper, we propose a trainable end-to-end system called DehazeNet for medium transmission estimation. DehazeNet takes a hazy image as input and outputs its medium transmission map, which is subsequently used to recover a haze-free image via the atmospheric scattering model. DehazeNet adopts a convolutional neural network-based deep architecture, whose layers are specially designed to embody the established assumptions/priors in image dehazing. Specifically, layers of Maxout units are used for feature extraction and can generate almost all haze-relevant features. We also propose a novel nonlinear activation function in DehazeNet, called the bilateral rectified linear unit, which improves the quality of the recovered haze-free image. We establish connections between the components of the proposed DehazeNet and those used in existing methods. Experiments on benchmark images show that DehazeNet achieves superior performance over existing methods while remaining efficient and easy to use.
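The recovery step described above follows the standard atmospheric scattering model I(x) = J(x)t(x) + A(1 - t(x)). A minimal NumPy sketch of inverting it, assuming the transmission map t and atmospheric light A have already been estimated (variable names are illustrative, not from the paper's code):

```python
import numpy as np

def recover_scene(I, t, A, t0=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1-t).

    I  : hazy image, H x W x 3, floats in [0, 1]
    t  : estimated medium transmission map, H x W
    A  : estimated global atmospheric light, length-3 vector
    t0 : lower bound on t to avoid amplifying noise in dense haze
    """
    t = np.clip(t, t0, 1.0)[..., None]   # broadcast over color channels
    J = (I - A) / t + A                  # solve for the scene radiance
    return np.clip(J, 0.0, 1.0)
```

As a sanity check, hazing a synthetic image with known t and A and then calling recover_scene returns the original image up to clipping.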


Pacific Rim Conference on Multimedia | 2017

Single Image Super-Resolution Using Multi-scale Convolutional Neural Network

Xiaoyi Jia; Xiangmin Xu; Bolun Cai; Kailing Guo

Methods based on convolutional neural networks (CNNs) have demonstrated tremendous improvements on single image super-resolution. However, previous methods mainly restore images from a single area in the low-resolution (LR) input, which limits the flexibility of models to infer various scales of detail for the high-resolution (HR) output. Moreover, most of them train a specific model for each up-scale factor. In this paper, we propose a multi-scale super-resolution (MSSR) network. Our network consists of multi-scale paths for HR inference, which can learn to synthesize features from different scales. This property helps reconstruct various kinds of regions in HR images. In addition, only one model is needed for multiple up-scale factors, which is more efficient without loss of restoration quality. Experiments on four public datasets demonstrate that the proposed method achieves state-of-the-art performance at fast speed.
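The multi-scale idea, parallel paths with different receptive fields whose outputs are fused, can be illustrated outside any deep-learning framework. A toy 1-D NumPy sketch (the kernel sizes and the averaging fusion are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

def multiscale_features(signal, kernel_sizes=(3, 7)):
    """Filter the same input at several scales and fuse by averaging.

    Each path here is a simple moving-average filter; in MSSR the
    paths are learned convolutional layers, but the fusion pattern
    of parallel scales feeding one output is the same.
    """
    paths = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k                       # one "path" per scale
        paths.append(np.convolve(signal, kernel, mode="same"))
    return np.mean(paths, axis=0)                     # fuse the scale paths
```

The output keeps the input length, so the fused multi-scale response can stand in wherever the single-scale response was used.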


Visual Communications and Image Processing | 2016

Image and video dehazing using view-based cluster segmentation

Feng Yu; Chunmei Qing; Xiangmin Xu; Bolun Cai

To avoid distortion in sky regions and keep the sky and white objects clear, in this paper we propose a new image and video dehazing method utilizing view-based cluster segmentation. Firstly, a Gaussian Mixture Model (GMM) is utilized to cluster the depth map based on the distant view to estimate the sky region, and the transmission estimation is then modified to reduce distortion. Secondly, we propose using a GMM based on the Color Attenuation Prior to divide a single hazy image into K classes, so that the atmospheric light estimation is refined to improve global contrast. Finally, online GMM clustering is applied to video dehazing. Extensive experimental results demonstrate that the proposed algorithm has superior haze-removal and color-balancing capabilities.
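The clustering step can be illustrated with a minimal two-component 1-D EM loop standing in for the paper's GMM over the depth map (a sketch under the assumption of scalar depth values; not the authors' implementation):

```python
import numpy as np

def em_gmm_1d(x, iters=50):
    """Fit a 2-component 1-D Gaussian mixture with plain EM.

    Returns the component means sorted ascending; in the dehazing
    method, the high-depth cluster would mark the distant/sky region.
    """
    mu = np.array([x.min(), x.max()], dtype=float)    # spread initialization
    sigma = np.full(2, x.std() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        d = (x[:, None] - mu) ** 2 / (2 * sigma**2)
        r = pi * np.exp(-d) / (sigma * np.sqrt(2 * np.pi))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate the mixture parameters
        n = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / n
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n) + 1e-6
        pi = n / len(x)
    return np.sort(mu)
```

On a depth map with a clearly more distant sky, the larger recovered mean identifies the cluster whose transmission estimate the method modifies.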


Pacific Rim Conference on Multimedia | 2016

Real-Time Video Dehazing Based on Spatio-Temporal MRF

Bolun Cai; Xiangmin Xu; Dacheng Tao

Video dehazing has a wide range of real-time applications, but the challenges mainly come from spatio-temporal coherence and computational efficiency. In this paper, a spatio-temporal optimization framework for real-time video dehazing is proposed, which reduces blocking and flickering artifacts and achieves high-quality enhanced results. We build a Markov Random Field (MRF) with an Intensity Value Prior (IVP) to handle spatial consistency and temporal coherence. By maximizing the MRF likelihood function, the proposed framework estimates the haze concentration and preserves the information optimally. Moreover, to facilitate real-time applications, an approximation based on the integral image technique is used to reduce the main computational burden. Experimental results demonstrate that the proposed framework effectively removes haze and flickering artifacts, and is sufficiently fast for real-time applications.
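The integral-image (summed-area table) trick mentioned above reduces any box filter to four table lookups per pixel. A minimal NumPy sketch of that generic technique (not the paper's code):

```python
import numpy as np

def box_filter(img, r):
    """Mean filter with a (2r+1) x (2r+1) window via a summed-area table.

    The cost per pixel is O(1) regardless of r, which is the property
    such approximations exploit to keep video dehazing real-time.
    """
    H, W = img.shape
    sat = np.zeros((H + 1, W + 1))
    sat[1:, 1:] = img.cumsum(0).cumsum(1)             # summed-area table
    out = np.empty_like(img, dtype=float)
    for y in range(H):
        y0, y1 = max(y - r, 0), min(y + r + 1, H)
        for x in range(W):
            x0, x1 = max(x - r, 0), min(x + r + 1, W)
            s = sat[y1, x1] - sat[y0, x1] - sat[y1, x0] + sat[y0, x0]
            out[y, x] = s / ((y1 - y0) * (x1 - x0))   # four lookups per window
    return out
```

Doubling r leaves the per-pixel work unchanged, unlike a direct sliding-window sum.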


International Conference on Digital Signal Processing | 2015

Multi-invariance appearance model for object tracking

Guicong Xu; Xiangmin Xu; Xiaofen Xing; Bolun Cai; Chunmei Qing

Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers rely on intensity information, texture information, or simple color representations for image description, which cannot provide all-around invariance to different scene conditions. Meanwhile, no single tracking approach can successfully handle all scenarios. Due to the complexity of the tracking problem, the combination of multiple features should be computationally efficient and possess a certain amount of robustness while maintaining high discriminative power. This paper combines intensity information (cross-bin distribution field, CDF), texture information (enhanced histograms of oriented gradients, EHOG), and color information (color names, CN) in a tracking-by-detection framework, in which a simple tracker called CSK is extended to multi-dimensional, multi-cue fusion. The proposed approach improves the baseline single-cue tracker by 4.4% in distance precision. Furthermore, we show that our approach, achieving 75.4%, outperforms most recent state-of-the-art tracking algorithms.
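Multi-cue fusion in a CSK-style tracker can be sketched as summing per-channel cross-correlations computed in the Fourier domain, the same frequency-domain machinery such trackers rely on. A toy localization demo with made-up feature channels (not the paper's tracker):

```python
import numpy as np

def fused_shift(template, search):
    """Locate `template` inside `search` by summing per-channel
    circular cross-correlations computed via the FFT.

    template, search : H x W x C stacks of feature channels (cues)
    returns (dy, dx), the circular shift with the strongest fused
    response across all cues.
    """
    resp = np.zeros(template.shape[:2])
    for c in range(template.shape[2]):
        F = np.fft.fft2(search[:, :, c]) * np.conj(np.fft.fft2(template[:, :, c]))
        resp += np.real(np.fft.ifft2(F))              # fuse the cue responses
    dy, dx = np.unravel_index(resp.argmax(), resp.shape)
    return dy, dx
```

Because every cue contributes to one response map, a cue that is unreliable in the current frame is outvoted by the others, which is the intuition behind combining CDF, EHOG, and CN.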


Archive | 2018

Local-Global Extraction Unit for Person Re-identification

Peng Wang; Chunmei Qing; Xiangmin Xu; Bolun Cai; Jianxiu Jin; Jinchang Ren

The huge variance of human pose and inaccurate detection significantly increase the difficulty of person re-identification. Existing deep learning methods mostly focus on extracting global features and local features, or combine them to learn a discriminative pedestrian descriptor. However, few methods have exploited the association of local and global features in convolutional neural networks (CNNs), and some important part-wise information is not sufficiently captured during training. In this paper, we propose a novel architecture called the Local-Global Extraction Unit (LGEU), which adaptively re-calibrates part-wise information by integrating channel-wise information. Extensive experiments on the Market-1501, CUHK03, and DukeMTMC-reID datasets show results competitive with state-of-the-art methods. On Market-1501, for instance, LGEU achieves 91.8% rank-1 accuracy and 88.0% mAP.
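The channel-wise re-calibration described is in the spirit of squeeze-and-excitation gating. A NumPy sketch of that generic pattern (the gating MLP and its sizes are illustrative assumptions, not the LGEU architecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_recalibrate(feat, W1, W2):
    """Re-weight feature channels from their own global statistics.

    feat : C x H x W feature map
    W1   : (C/r) x C squeeze projection (r = reduction ratio)
    W2   : C x (C/r) excitation projection
    """
    s = feat.mean(axis=(1, 2))                    # squeeze: global average pool
    g = sigmoid(W2 @ np.maximum(W1 @ s, 0.0))     # excitation: small gating MLP
    return feat * g[:, None, None]                # scale each channel by its gate
```

Each channel's gate lies in (0, 1), so uninformative channels are suppressed rather than removed, and the map's spatial layout is untouched.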


International Conference on Internet Multimedia Computing and Service | 2017

Progressive Lifelong Learning by Sharing Representations for Few Labeled Data

Guoxi Su; Xiangmin Xu; Chaowen Chen; Bolun Cai; Chunmei Qing

Lifelong Machine Learning (LML) has been receiving more and more attention in the past few years. It produces systems that are able to learn knowledge from consecutive tasks and refine the learned knowledge over a lifetime. In the optimization process of classical fully supervised LML systems, sufficient labeled data are required for extracting inter-task relationships before transferring. In order to leverage abundant unlabeled data and reduce the expense of labeling data, a progressive lifelong learning algorithm (PLLA) is proposed in this paper with unsupervised pre-training to learn shared representations that are more suitable as input to LML systems than the raw input data. Experiments show that the proposed PLLA is much more effective than many other LML methods when only few labeled data are available.
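The paper's unsupervised pre-training learns shared representations from unlabeled data. As a much simpler stand-in, PCA via the SVD illustrates how unlabeled examples alone can define a lower-dimensional input that later supervised tasks reuse (a sketch of the general idea, not PLLA itself):

```python
import numpy as np

def fit_pca(X_unlabeled, k):
    """Learn a k-dimensional representation from unlabeled data.

    Returns (mean, components); any later labeled task can reuse the
    same projection, which is the role the shared representation
    plays in the lifelong-learning setup.
    """
    mu = X_unlabeled.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_unlabeled - mu, full_matrices=False)
    return mu, Vt[:k]

def transform(X, mu, components):
    """Project new data into the shared low-dimensional space."""
    return (X - mu) @ components.T
```

Once fitted on unlabeled data, the same (mu, components) pair is applied to every subsequent task's inputs, so the representation is shared across tasks.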


International Conference on Internet Multimedia Computing and Service | 2017

Joint Latent Space and Multi-view Feature Learning

Kailing Guo; Xiangmin Xu; Bolun Cai; Tong Zhang

GoDec+ shows its robustness in low-rank matrix decomposition but only deals with single-view data. This paper extends GoDec+ to multi-view data by jointly learning a latent space and a multi-view fusion feature. The proposed method factorizes the low-rank matrix in GoDec+ into the product of a basis matrix of the latent space and a shared representation given by a transformation matrix. By constraining the basis matrix to be group-sparse, the proposed method treats the effects of different views differently. Extensive experiments show that the proposed method learns a good fusion feature and outperforms the compared methods in image classification and annotation.
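The core operation, factorizing a low-rank matrix into a latent-space basis and a shared representation, can be sketched with a truncated SVD (GoDec+ itself uses a robust decomposition with extra constraints; this is only the plain low-rank analogue):

```python
import numpy as np

def low_rank_factor(X, r):
    """Factor X ~= B @ Z with a rank-r basis B and representation Z.

    In the multi-view method, B plays the role of the latent-space
    basis and Z the shared representation; the group-sparsity
    constraint on B (not shown) weights the views differently.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    B = U[:, :r] * s[:r]       # basis of the latent space (scaled)
    Z = Vt[:r]                 # shared representation across columns
    return B, Z
```

For a matrix of exact rank r, the product B @ Z reproduces X; for noisy data it gives the best rank-r approximation in the least-squares sense.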


Pacific Rim Conference on Multimedia | 2016

Exploiting Local Feature Fusion for Action Recognition

Jie Miao; Xiangmin Xu; Xiaoyi Jia; Haoyu Huang; Bolun Cai; Chunmei Qing; Xiaofen Xing

Densely sampled local features with bag-of-words models have been widely applied to action recognition. Conventional approaches assume that different kinds of local features are totally uncorrelated, so they are separately processed, encoded, and then fused in the video-level representation. However, these local features are not totally uncorrelated in practice. To address this problem, multi-view local feature fusion is exploited for local descriptor fusion in action recognition. Specifically, tensor canonical correlation analysis (TCCA) is employed to obtain a fused local feature that carries the high-order correlation hidden among different types of local features. The high-order correlation feature improves on the conventional concatenation-based fusion approach. Experimental results on three challenging action recognition datasets validate the effectiveness of the proposed approach.
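TCCA generalizes canonical correlation analysis to more than two feature types; the classical two-view case conveys the idea of finding projections of each view that correlate maximally. A regularized NumPy sketch of plain two-view CCA, used here only as a simpler stand-in for the tensor version (not the paper's method):

```python
import numpy as np

def cca_first_pair(X, Y, reg=1e-6):
    """First pair of canonical directions for two views (rows = samples).

    Whitens each view, then the top singular pair of the whitened
    cross-covariance gives the maximally correlated projections.
    """
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = len(X)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # inverse square root via the eigendecomposition of a SPD matrix
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ U[:, 0], Wy @ Vt[0], s[0]   # directions and canonical correlation
```

Two views that share a common latent component yield a first canonical correlation near 1, which is the signal the fused local feature is built from.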


International Conference on Image Processing | 2015

BIT: Bio-inspired tracker

Bolun Cai; Xiangmin Xu; Xiaofen Xing; Chunmei Qing

Visual tracking is a challenging problem due to various factors such as deformation, rotation, and illumination. As is well known, given the superior tracking performance of human vision, bio-inspired models are expected to improve computer visual tracking. However, the design of a bio-inspired tracking framework is challenging, due to the incomplete comprehension and hyper-scale of senior neurons, which influence the effectiveness and real-time performance of the tracker. Following the ventral stream in the visual cortex, a novel bio-inspired tracker (BIT) is proposed, which simulates shallow neurons (S1 and C1) to extract low-level bio-inspired features for target appearance and imitates the senior learning mechanism (S2 and C2) to combine generative and discriminative models for position estimation. In addition, the Fast Fourier Transform (FFT) is adopted for real-time learning and detection in this framework. On a recent benchmark [1], extensive experimental results show that BIT performs favorably against state-of-the-art methods in terms of accuracy and robustness.
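The shallow S1 stage in this family of bio-inspired models is conventionally built from oriented Gabor filters. A minimal NumPy construction of such a bank (the size, wavelength, and bandwidth parameters are illustrative, not the values tuned in BIT):

```python
import numpy as np

def gabor_bank(size=11, orientations=4, wavelength=5.0, sigma=3.0, gamma=0.5):
    """Build oriented Gabor filters like an S1 simple-cell stage.

    Returns an array of shape (orientations, size, size); each filter
    responds most strongly to edges at its own orientation.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    bank = []
    for i in range(orientations):
        theta = i * np.pi / orientations
        xp = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
        yp = -x * np.sin(theta) + y * np.cos(theta)
        g = np.exp(-(xp**2 + gamma**2 * yp**2) / (2 * sigma**2)) \
            * np.cos(2 * np.pi * xp / wavelength)
        bank.append(g - g.mean())                     # zero-mean filters
    return np.array(bank)
```

Convolving an image with each filter and max-pooling the responses would correspond to the C1 stage that follows S1 in such models.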

Collaboration


Dive into Bolun Cai's collaborations.

Top Co-Authors

Xiangmin Xu | South China University of Technology
Chunmei Qing | South China University of Technology
Xiaofen Xing | South China University of Technology
Kailing Guo | South China University of Technology
Guicong Xu | South China University of Technology
Jie Miao | South China University of Technology
Tong Zhang | South China University of Technology
Xiaoyi Jia | South China University of Technology
Feng Yu | South China University of Technology