Shuhang Gu
Hong Kong Polytechnic University
Publications
Featured research published by Shuhang Gu.
Computer Vision and Pattern Recognition | 2014
Shuhang Gu; Lei Zhang; Wangmeng Zuo; Xiangchu Feng
As a convex relaxation of the low rank matrix factorization problem, the nuclear norm minimization has been attracting significant research interest in recent years. The standard nuclear norm minimization regularizes each singular value equally to pursue the convexity of the objective function. However, this greatly restricts its capability and flexibility in dealing with many practical problems (e.g., denoising), where the singular values have clear physical meanings and should be treated differently. In this paper we study the weighted nuclear norm minimization (WNNM) problem, where the singular values are assigned different weights. The solutions of the WNNM problem are analyzed under different weighting conditions. We then apply the proposed WNNM algorithm to image denoising by exploiting the image nonlocal self-similarity. Experimental results clearly show that the proposed WNNM algorithm outperforms many state-of-the-art denoising algorithms such as BM3D in terms of both quantitative measure and visual perception quality.
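When the weights are non-descending (larger weights on smaller singular values), the WNNM proximal step admits a closed-form solution by weighted soft-thresholding of the singular values. The NumPy sketch below illustrates only that core operator under this assumption; the patch grouping and aggregation of the full denoising pipeline are omitted, and the names here are illustrative.

```python
import numpy as np

def wnnm_proximal(Y, w):
    """Weighted singular value thresholding.

    Solves min_X 0.5*||Y - X||_F^2 + sum_i w_i * sigma_i(X) in closed form
    when the weights w are non-descending (smaller singular values are
    penalized more heavily).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - w, 0.0)      # weighted soft-thresholding of singular values
    return (U * s_shrunk) @ Vt             # reassemble the low-rank estimate

# Toy usage: recover a rank-2 matrix corrupted by Gaussian noise.
rng = np.random.default_rng(0)
L = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 64))
Y = L + 0.1 * rng.standard_normal((64, 64))
w = np.full(64, 0.5)                       # uniform weights reduce to standard SVT
X_hat = wnnm_proximal(Y, w)
```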
International Conference on Computer Vision | 2015
Shuhang Gu; Wangmeng Zuo; Qi Xie; Deyu Meng; Xiangchu Feng; Lei Zhang
Most of the previous sparse coding (SC) based super resolution (SR) methods partition the image into overlapped patches and process each patch separately. These methods, however, ignore the consistency of pixels in overlapped patches, which is a strong constraint for image reconstruction. In this paper, we propose a convolutional sparse coding (CSC) based SR (CSC-SR) method to address the consistency issue. Our CSC-SR involves three groups of parameters to be learned: (i) a set of filters to decompose the low resolution (LR) image into LR sparse feature maps, (ii) a mapping function to predict the high resolution (HR) feature maps from the LR ones, and (iii) a set of filters to reconstruct the HR images from the predicted HR feature maps via simple convolution operations. By working directly on the whole image, the proposed CSC-SR algorithm does not need to divide the image into overlapped patches, and can exploit the image global correlation to produce more robust reconstruction of image local structures. Experimental results clearly validate the advantages of CSC over patch-based SC in the SR application. Compared with state-of-the-art SR methods, the proposed CSC-SR method achieves highly competitive PSNR results, while demonstrating better edge and texture preservation performance.
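The sketch below illustrates only step (iii) of the pipeline described above, i.e., reconstructing an HR image as a sum of whole-image convolutions between predicted HR feature maps and HR filters. It assumes the LR feature maps have already been inferred by a CSC solver and mapped to HR maps; the filters, shapes, and maps used here are hypothetical stand-ins, not the learned parameters of the paper.

```python
import numpy as np
from scipy.signal import fftconvolve

def csc_sr_reconstruct(Z_hr, filters_hr):
    """Sum of full-image convolutions between HR feature maps and HR filters."""
    H = np.zeros_like(Z_hr[0])
    for z, f in zip(Z_hr, filters_hr):
        H += fftconvolve(z, f, mode="same")   # whole-image convolution, no patching
    return H

# Hypothetical shapes: K sparse feature maps at HR size, K small filters.
K, H_hr, W_hr = 16, 256, 256
Z_hr = [np.random.randn(H_hr, W_hr) * (np.random.rand(H_hr, W_hr) > 0.95)
        for _ in range(K)]                    # ~5% nonzeros to mimic sparse maps
filters_hr = [np.random.randn(5, 5) for _ in range(K)]
sr_image = csc_sr_reconstruct(Z_hr, filters_hr)
```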
International Journal of Computer Vision | 2017
Shuhang Gu; Qi Xie; Deyu Meng; Wangmeng Zuo; Xiangchu Feng; Lei Zhang
As a convex relaxation of the rank minimization model, the nuclear norm minimization (NNM) problem has been attracting significant research interest in recent years. The standard NNM regularizes each singular value equally, composing an easily calculated convex norm. However, this restricts its capability and flexibility in dealing with many practical problems, where the singular values have clear physical meanings and should be treated differently. In this paper, we study the weighted nuclear norm minimization (WNNM) problem, which adaptively assigns weights to different singular values. As the key step of solving general WNNM models, the theoretical properties of the weighted nuclear norm proximal (WNNP) operator are investigated. Albeit nonconvex, we prove that WNNP is equivalent to a standard quadratic programming problem with linear constraints, which facilitates solving the original problem with off-the-shelf convex optimization solvers. In particular, when the weights are sorted in a non-descending order, its optimal solution can be easily obtained in closed form. With WNNP, the solving strategies for multiple extensions of WNNM, including robust PCA and matrix completion, can be readily constructed under the alternating direction method of multipliers paradigm. Furthermore, inspired by the reweighted sparse coding scheme, we present an automatic weight setting method, which greatly facilitates the practical implementation of WNNM. The proposed WNNM methods achieve state-of-the-art performance in typical low level vision tasks, including image denoising, background subtraction and image inpainting.
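For reference, the WNNP subproblem described above, and its closed-form solution in the non-descending weight case, can be written as follows. The notation is reconstructed from the abstract, so read it as a sketch rather than the paper's exact statement.

```latex
% WNNP: proximal problem of the weighted nuclear norm
\hat{X} \;=\; \arg\min_{X}\; \tfrac{1}{2}\|Y - X\|_F^2 \;+\; \sum_{i} w_i\,\sigma_i(X),
\qquad 0 \le w_1 \le w_2 \le \cdots \le w_n .

% With the SVD  Y = U \Sigma V^{\top}, the closed-form solution is the
% weighted singular value thresholding
\hat{X} \;=\; U\,\mathcal{S}_{w}(\Sigma)\,V^{\top},
\qquad \big[\mathcal{S}_{w}(\Sigma)\big]_{ii} \;=\; \max\big(\sigma_i - w_i,\; 0\big).
```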
IEEE Transactions on Image Processing | 2016
Yuan Xie; Shuhang Gu; Yan Liu; Wangmeng Zuo; Wensheng Zhang; Lei Zhang
Low rank matrix approximation (LRMA), which aims to recover the underlying low rank matrix from its degraded observation, has a wide range of applications in computer vision. The latest LRMA methods resort to using the nuclear norm minimization (NNM) as a convex relaxation of the nonconvex rank minimization. However, NNM tends to over-shrink the rank components and treats the different rank components equally, limiting its flexibility in practical applications. We propose a more flexible model, namely, the weighted Schatten p-norm minimization (WSNM), which generalizes NNM to the Schatten p-norm minimization with weights assigned to different singular values. The proposed WSNM not only gives a better approximation to the original low-rank assumption, but also considers the importance of different rank components. We analyze the solution of WSNM and prove that, under a certain weight permutation, WSNM can be equivalently transformed into independent non-convex ℓp-norm subproblems, whose global optima can be efficiently obtained by the generalized iterated shrinkage algorithm. We apply WSNM to typical low-level vision problems, e.g., image denoising and background subtraction. Extensive experimental results show, both qualitatively and quantitatively, that the proposed WSNM can more effectively remove noise and model complex and dynamic scenes than state-of-the-art methods.
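The generalized soft-thresholding step used to solve the per-singular-value ℓp subproblems can be sketched as below. The threshold and fixed-point refinement follow the standard generalized iterated shrinkage scheme for 0 < p < 1; how WSNM assigns the per-value weights is not reproduced here, and the usage values are illustrative only.

```python
import numpy as np

def gst(y, lam, p, n_iter=10):
    """Generalized soft-thresholding for min_x 0.5*(x - y)^2 + lam*|x|^p, 0 < p < 1."""
    # Threshold below which the minimizer is exactly zero.
    tau = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p)) \
        + lam * p * (2.0 * lam * (1.0 - p)) ** ((p - 1.0) / (2.0 - p))
    y_abs = np.abs(y)
    x = np.where(y_abs > tau, y_abs, 0.0)          # initialize at |y| above the threshold
    for _ in range(n_iter):                        # fixed-point refinement toward the root
        x = np.where(y_abs > tau,
                     y_abs - lam * p * np.maximum(x, 1e-12) ** (p - 1.0),
                     0.0)
    return np.sign(y) * np.maximum(x, 0.0)

# Example: shrink singular values s with per-value weights w and power p = 0.7.
s = np.array([3.0, 1.2, 0.4, 0.05])
w = np.array([0.1, 0.3, 0.6, 1.0])
s_shrunk = gst(s, w, p=0.7)
```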
Computer Vision and Pattern Recognition | 2016
Qi Xie; Qian Zhao; Deyu Meng; Zongben Xu; Shuhang Gu; Wangmeng Zuo; Lei Zhang
Multispectral images (MSI) can deliver a more faithful representation of real scenes than the traditional image system, and enhance the performance of many computer vision tasks. In real cases, however, an MSI is always corrupted by various noises. In this paper, we propose a new tensor-based denoising approach by fully considering two intrinsic characteristics underlying an MSI, i.e., the global correlation along spectrum (GCS) and nonlocal self-similarity across space (NSS). Specifically, we construct a new tensor sparsity measure, called the intrinsic tensor sparsity (ITS) measure, which encodes the sparsity insights delivered by both the most typical Tucker and CANDECOMP/PARAFAC (CP) low-rank decompositions for a general tensor. We then build a new MSI denoising model by applying the proposed ITS measure to tensors formed by non-local similar patches within the MSI. The intrinsic GCS and NSS knowledge can then be efficiently explored under the regularization of this tensor sparsity measure to finely rectify the recovery of an MSI from its corruption. A series of experiments on simulated and real MSI denoising problems show that our method outperforms state-of-the-art methods under comprehensive quantitative performance measures.
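The tensors that the ITS regularizer operates on are formed by stacking non-local similar full-band patches. The sketch below shows only that grouping step (block matching by Euclidean distance) as illustrative scaffolding; the ITS measure itself and the denoising optimization are not implemented here, and all sizes and names are hypothetical.

```python
import numpy as np

def group_similar_patches(msi, ref_yx, patch=8, n_similar=30, search=20):
    """Stack the n_similar spatial patches (all bands) most similar to the
    reference patch into a 3rd-order tensor: (patch*patch, bands, n_similar)."""
    H, W, B = msi.shape
    ry, rx = ref_yx
    ref = msi[ry:ry + patch, rx:rx + patch, :]
    candidates = []
    for y in range(max(0, ry - search), min(H - patch, ry + search)):
        for x in range(max(0, rx - search), min(W - patch, rx + search)):
            cand = msi[y:y + patch, x:x + patch, :]
            candidates.append((np.sum((cand - ref) ** 2), cand))
    candidates.sort(key=lambda t: t[0])                      # most similar first
    stack = [c.reshape(patch * patch, B) for _, c in candidates[:n_similar]]
    return np.stack(stack, axis=2)

# Hypothetical usage on a random 64x64 MSI with 31 spectral bands.
msi = np.random.rand(64, 64, 31)
patch_tensor = group_similar_patches(msi, ref_yx=(20, 20))
```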
Computer Vision and Pattern Recognition | 2015
Wangmeng Zuo; Dongwei Ren; Shuhang Gu; Liang Lin; Lei Zhang
The maximum a posteriori (MAP)-based blind deconvolution framework generally involves two stages: blur kernel estimation and non-blind restoration. For blur kernel estimation, sharp edge prediction and carefully designed image priors are vital to the success of MAP. In this paper, we propose a blind deconvolution framework together with iteration-specific priors for better blur kernel estimation. The family of hyper-Laplacian priors (Pr(d) ∝ exp(−‖d‖_p^p/λ)) is adopted for modeling iteration-wise priors of image gradients, where each iteration has its own model parameters {λ(t), p(t)}. To avoid heavy parameter tuning, all iteration-wise model parameters can be learned with our principled discriminative learning model from a training set, and can be directly applied to other datasets and real blurry images. Interestingly, with the generalized shrinkage/thresholding operator, a negative p value (p < 0) is allowable, and we find that it contributes more to estimating the coarse shape of the blur kernel. Experimental results on synthetic and real-world images demonstrate that our method achieves better deblurring results than the existing gradient prior-based methods. Compared with the state-of-the-art patch prior-based method, our method is competitive in restoration results but is much more efficient.
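Schematically, the iteration-wise hyper-Laplacian prior can be folded into the usual alternating MAP objective as below. This rendering is reconstructed from the abstract with a simplified data term and a commonly used kernel regularizer, so treat it as a sketch of the idea rather than the paper's exact formulation.

```latex
% One outer iteration t of alternating MAP estimation (schematic);
% the kernel regularizer \gamma\|\mathbf{k}\|_2^2 is an assumed, common choice.
\min_{\mathbf{x},\,\mathbf{k}}\;
\|\mathbf{k} \otimes \mathbf{x} - \mathbf{y}\|_2^2
\;+\; \frac{1}{\lambda^{(t)}}\,\|\nabla \mathbf{x}\|_{p^{(t)}}^{p^{(t)}}
\;+\; \gamma\,\|\mathbf{k}\|_2^2,
\qquad
\Pr(\mathbf{d}) \;\propto\; e^{-\|\mathbf{d}\|_p^p/\lambda},
\quad \mathbf{d} = \nabla\mathbf{x}.
```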
IEEE Transactions on Image Processing | 2016
Wangmeng Zuo; Dongwei Ren; David Zhang; Shuhang Gu; Lei Zhang
Salient edge selection and time-varying regularization are two crucial techniques for guaranteeing the success of maximum a posteriori (MAP)-based blind deconvolution. However, the existing approaches usually rely on carefully designed regularizers and handcrafted parameter tuning to obtain a satisfactory estimate of the blur kernel. Many regularizers exhibit a structure-preserving smoothing capability, but fail to enhance salient edges. In this paper, under the MAP framework, we propose iteration-wise ℓp-norm regularizers together with a data-driven strategy to address these issues. First, we extend the generalized shrinkage-thresholding (GST) operator for ℓp-norm minimization to negative p values, which can sharpen salient edges while suppressing trivial details. Then, the iteration-wise GST parameters are specified to allow dynamic salient edge selection and time-varying regularization. Finally, instead of handcrafted tuning, a principled discriminative learning approach is proposed to learn the iteration-wise GST operators from the training dataset. Furthermore, a multi-scale scheme is developed to improve the efficiency of the algorithm. Experimental results show that a negative p value is more effective in estimating the coarse shape of the blur kernel at the early stage, and that the learned GST operators generalize well to other datasets and real-world blurry images. Compared with the state-of-the-art methods, our method achieves better deblurring results in terms of both quantitative metrics and visual quality, and it is much faster than the state-of-the-art patch-based blind deconvolution method.
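For orientation, a generic coarse-to-fine skeleton of the kind used by multi-scale MAP blind deconvolution is sketched below. The per-scale estimation routine is left as a placeholder callback (`estimate_kernel_at_scale`), so this reflects the general scheme rather than the paper's specific algorithm; all parameters are illustrative.

```python
import numpy as np
from scipy.ndimage import zoom

def coarse_to_fine(blurry, estimate_kernel_at_scale, n_scales=5, ratio=0.75,
                   kernel_size=31):
    """Estimate the kernel at the coarsest scale, then upsample the estimate
    to initialize each finer scale. `estimate_kernel_at_scale(img, k_init)`
    is a placeholder for one round of regularized kernel estimation."""
    scales = [ratio ** s for s in range(n_scales - 1, -1, -1)]   # coarse -> fine
    k = np.zeros((3, 3)); k[1, 1] = 1.0                          # delta-kernel init
    for s in scales:
        img_s = zoom(blurry, s, order=1)                         # downscaled blurry image
        size_s = max(3, int(round(kernel_size * s)) | 1)         # odd kernel size
        k = zoom(k, (size_s / k.shape[0], size_s / k.shape[1]), order=1)
        k = np.clip(k, 0, None); k /= max(k.sum(), 1e-12)        # keep k a valid PSF
        k = estimate_kernel_at_scale(img_s, k)
    return k
```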
Computer Vision and Pattern Recognition | 2016
Keze Wang; Liang Lin; Wangmeng Zuo; Shuhang Gu; Lei Zhang
Feature representation and object category classification are two key components of most object detection methods. While significant improvements have been achieved for deep feature representation learning, traditional SVM/softmax classifiers remain the dominant methods for the final object category classification. However, SVM/softmax classifiers lack the capacity to explicitly exploit the complex structure of deep features, as they are purely discriminative methods. The recently proposed discriminative dictionary pair learning (DPL) model involves a fidelity term to minimize the reconstruction loss and a discrimination term to enhance the discriminative capability of the learned dictionary pair, and is thus well suited to balancing representation and discrimination to boost object detection performance. In this paper, we propose a novel object detection system by unifying DPL with convolutional feature learning. Specifically, we incorporate DPL as a Dictionary Pair Classifier Layer (DPCL) into the deep architecture, and develop an end-to-end learning algorithm for optimizing the dictionary pairs and the neural network simultaneously. Moreover, we design a multi-task loss for guiding our model to accomplish three correlated tasks: objectness estimation, categoryness computation, and bounding box regression. Extensive experiments on the PASCAL VOC 2007/2012 benchmarks demonstrate that our approach substantially improves performance over popular existing object detection frameworks (e.g., R-CNN [13] and FRCN [12]), and achieves new state-of-the-art results.
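As background for the dictionary pair classifier, the sketch below shows the basic dictionary pair decision rule: code a feature with each class's analysis dictionary, reconstruct with the matching synthesis dictionary, and assign the class with the smallest residual. The layer parameters, joint training with the CNN, and the multi-task loss are not captured here; the dictionaries and dimensions are random placeholders.

```python
import numpy as np

def dpl_classify(x, D, P):
    """Dictionary pair classification by minimum reconstruction residual.

    D[c]: synthesis dictionary (feat_dim, n_atoms); P[c]: analysis dictionary
    (n_atoms, feat_dim); x: feature vector (feat_dim,)."""
    residuals = [np.linalg.norm(x - D[c] @ (P[c] @ x)) for c in range(len(D))]
    return int(np.argmin(residuals))

# Hypothetical usage with random dictionaries for 20 classes on a 4096-d feature.
rng = np.random.default_rng(1)
n_classes, feat_dim, n_atoms = 20, 4096, 30
D = [rng.standard_normal((feat_dim, n_atoms)) for _ in range(n_classes)]
P = [rng.standard_normal((n_atoms, feat_dim)) * 0.01 for _ in range(n_classes)]
label = dpl_classify(rng.standard_normal(feat_dim), D, P)
```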
Computer Vision and Pattern Recognition | 2017
Shuhang Gu; Wangmeng Zuo; Shi Guo; Yunjin Chen; Chongyu Chen; Lei Zhang
The depth images acquired by consumer depth sensors (e.g., Kinect and ToF) are usually of low resolution and insufficient quality. One natural solution is to incorporate a high resolution RGB camera and exploit the statistical correlation between the two modalities. However, most existing methods are intuitive and limited in characterizing the complex and dynamic dependency between intensity and depth images. To address these limitations, we propose a weighted analysis representation model for guided depth image enhancement, which advances the conventional methods in two aspects: (i) task-driven learning and (ii) dynamic guidance. First, we generalize the analysis representation model by including a guided weight function for dependency modeling. A task-driven learning formulation is then introduced to obtain the optimized guidance tailored to the specific enhancement task. Second, the depth image is gradually enhanced along with the iterations, and thus the guidance should also be dynamically adjusted to account for the updated depth image. To this end, stage-wise parameters are learned for dynamic guidance. Experiments on guided depth image upsampling and noisy depth image restoration validate the effectiveness of our method.
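A schematic form of a guided, weighted analysis model in the spirit described above is sketched below. The exact penalty, weight function, and stage-wise parameterization used in the paper are not reproduced, so the symbols should be read as placeholders.

```latex
% Stage t of guided depth enhancement (schematic); p_k, w, and \rho are placeholders.
\mathbf{x}^{(t+1)} \;=\; \arg\min_{\mathbf{x}}\;
\frac{1}{2}\|\mathbf{x} - \mathbf{y}\|_2^2
\;+\; \sum_{k}\sum_{i}
w^{(t)}_{k,i}(\mathbf{g})\;
\rho\big((\mathbf{p}^{(t)}_{k} \otimes \mathbf{x})_i\big)
```

Here y denotes the degraded depth image, g the guidance (intensity) image, p_k^{(t)} the stage-wise analysis filters, w^{(t)}(g) the guided weight function, and ρ a sparsity-inducing penalty.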
Neural Information Processing Systems | 2014
Shuhang Gu; Lei Zhang; Wangmeng Zuo; Xiangchu Feng