Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Jia-Bin Huang is active.

Publication


Featured research published by Jia-Bin Huang.


International Conference on Computer Vision | 2015

Hierarchical Convolutional Features for Visual Tracking

Chao Ma; Jia-Bin Huang; Xiaokang Yang; Ming-Hsuan Yang

Visual object tracking is challenging as target objects often undergo significant appearance changes caused by deformation, abrupt motion, background clutter and occlusion. In this paper, we exploit features extracted from deep convolutional neural networks trained on object recognition datasets to improve tracking accuracy and robustness. The outputs of the last convolutional layers encode the semantic information of targets and such representations are robust to significant appearance variations. However, their spatial resolution is too coarse to precisely localize targets. In contrast, earlier convolutional layers provide more precise localization but are less invariant to appearance changes. We interpret the hierarchies of convolutional layers as a nonlinear counterpart of an image pyramid representation and exploit these multiple levels of abstraction for visual tracking. Specifically, we adaptively learn correlation filters on each convolutional layer to encode the target appearance. We hierarchically infer the maximum response of each layer to locate targets. Extensive experimental results on a large-scale benchmark dataset show that the proposed algorithm performs favorably against state-of-the-art methods.
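
A minimal sketch of the response-fusion idea: correlation-filter responses from several (here hypothetical, single-channel) feature layers are combined, and the peak of the fused map gives the target location. The weighted sum below stands in for the paper's coarse-to-fine hierarchical inference.

    import numpy as np

    def correlation_response(feature, filt):
        # Correlation-filter response computed in the Fourier domain;
        # feature and filt are equal-size 2-D maps (one channel, for brevity).
        F = np.fft.fft2(feature)
        H = np.fft.fft2(filt)
        return np.real(np.fft.ifft2(F * np.conj(H)))

    def locate_target(layer_features, layer_filters, layer_weights):
        # Fuse per-layer response maps from deep (semantic) to shallow
        # (spatially precise) layers with a weighted sum, then take the
        # argmax as the target location. The fixed weights are an assumed
        # simplification of the paper's hierarchical inference.
        fused = sum(w * correlation_response(f, h)
                    for f, h, w in zip(layer_features, layer_filters, layer_weights))
        return np.unravel_index(np.argmax(fused), fused.shape)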


Computer Vision and Pattern Recognition | 2015

Single image super-resolution from transformed self-exemplars

Jia-Bin Huang; Abhishek Singh; Narendra Ahuja

Self-similarity based super-resolution (SR) algorithms are able to produce visually pleasing results without extensive training on external databases. Such algorithms exploit the statistical prior that patches in a natural image tend to recur within and across scales of the same image. However, the internal dictionary obtained from the given image may not always be sufficiently expressive to cover the textural appearance variations in the scene. In this paper, we extend self-similarity based SR to overcome this drawback. We expand the internal patch search space by allowing geometric variations. We do so by explicitly localizing planes in the scene and using the detected perspective geometry to guide the patch search process. We also incorporate additional affine transformations to accommodate local shape variations. We propose a compositional model to simultaneously handle both types of transformations. We extensively evaluate the performance in both urban and natural scenes. Even without using any external training databases, we achieve significantly superior results on urban scenes, while maintaining performance on natural scenes comparable to other state-of-the-art SR algorithms.
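
The core internal search can be sketched as follows; this translation-only version over a downscaled copy of the image is a simplification, since the paper additionally searches over detected perspective and affine transformations.

    import numpy as np

    def best_self_exemplar(search_img, patch, stride=2):
        # Exhaustive internal patch search over a downscaled version of the
        # input image (the "self-exemplar" pool). Translation-only matching
        # is an assumed simplification of the paper's transformed search.
        ph, pw = patch.shape
        best_pos, best_err = None, np.inf
        for y in range(0, search_img.shape[0] - ph + 1, stride):
            for x in range(0, search_img.shape[1] - pw + 1, stride):
                err = np.sum((search_img[y:y + ph, x:x + pw] - patch) ** 2)
                if err < best_err:
                    best_pos, best_err = (y, x), err
        return best_pos, best_err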


Asian Conference on Computer Vision | 2010

Exploiting self-similarities for single frame super-resolution

Chih-Yuan Yang; Jia-Bin Huang; Ming-Hsuan Yang

We propose a super-resolution method that exploits self-similarities and group structural information of image patches using only a single input frame. The super-resolution problem is posed as learning the mapping between pairs of low-resolution and high-resolution image patches. Instead of relying on an extrinsic set of training images, as often required in example-based super-resolution algorithms, we employ a method that generates image pairs directly from the image pyramid of a single frame. The generated patch pairs are clustered for training a dictionary by enforcing group sparsity constraints underlying the image patches. Super-resolution images are then constructed using the learned dictionary. Experimental results show the proposed method is able to achieve state-of-the-art performance.
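
A sketch of the self-supervised pair-generation step, assuming a grayscale image and scipy for resampling; the group-sparsity clustering and dictionary training themselves are omitted.

    import numpy as np
    from scipy.ndimage import zoom

    def pyramid_patch_pairs(img, scale=0.5, psize=8, step=4):
        # Build (low-res, high-res) training patch pairs from a single
        # frame's image pyramid, with no external training set. The LR
        # counterpart is simulated by downscaling and re-upscaling, an
        # assumed degradation model for illustration.
        low = zoom(zoom(img, scale), 1.0 / scale)
        h = min(low.shape[0], img.shape[0])   # guard against rounding
        w = min(low.shape[1], img.shape[1])
        pairs = []
        for y in range(0, h - psize + 1, step):
            for x in range(0, w - psize + 1, step):
                pairs.append((low[y:y + psize, x:x + psize].ravel(),
                              img[y:y + psize, x:x + psize].ravel()))
        return np.asarray(pairs)  # shape: (n_pairs, 2, psize * psize)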


Computer Vision and Pattern Recognition | 2009

Moving cast shadow detection using physics-based features

Jia-Bin Huang; Chu-Song Chen

Cast shadows induced by moving objects often cause serious problems for many vision applications. We present in this paper an online statistical learning approach to model the background appearance variations under cast shadows. Based on the bi-illuminant (i.e., direct light sources and ambient illumination) dichromatic reflection model, we derive physics-based color features under the assumptions of constant ambient illumination and light sources with common spectral power distributions. We first use one Gaussian mixture model (GMM) to learn the color features, which are constant regardless of the background surfaces or illuminant colors in a scene. Then, we build one pixel-based GMM for each pixel to learn the local shadow features. To overcome the slow convergence rate of conventional GMM learning, we update the pixel-based GMMs through confidence-rated learning. The proposed method can rapidly learn model parameters in an unsupervised way and adapt to changes in illumination conditions or the environment. Furthermore, we demonstrate that our method is robust to scenes with few foreground activities and videos captured at low or unsteady frame rates.
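
A rough sketch of a confidence-rated online GMM update for one pixel: the learning rate is scaled by an externally supplied classification confidence. The update rules below are a standard online-GMM form, not the paper's exact equations.

    import numpy as np

    def update_pixel_gmm(weights, means, variances, x, confidence, base_lr=0.01):
        # One online update of a per-pixel Gaussian mixture over a scalar
        # feature x. Scaling the rate by `confidence` is the
        # confidence-rated learning idea; thresholds are assumed values.
        lr = base_lr * confidence
        dist = np.abs(x - means) / np.sqrt(variances)
        k = int(np.argmin(dist))
        weights *= (1.0 - lr)
        if dist[k] < 2.5:                     # matched an existing component
            weights[k] += lr
            means[k] += lr * (x - means[k])
            variances[k] += lr * ((x - means[k]) ** 2 - variances[k])
        else:                                 # replace the weakest component
            j = int(np.argmin(weights))
            means[j], variances[j], weights[j] = x, 10.0, lr
        weights /= weights.sum()
        return weights, means, variances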


International Conference on Computer Graphics and Interactive Techniques | 2014

Image completion using planar structure guidance

Jia-Bin Huang; Sing Bing Kang; Narendra Ahuja; Johannes Kopf

We propose a method for automatically guiding patch-based image completion using mid-level structural cues. Our method first estimates planar projection parameters, softly segments the known region into planes, and discovers translational regularity within these planes. This information is then converted into soft constraints for the low-level completion algorithm by defining prior probabilities for patch offsets and transformations. Our method handles multiple planes, and in the absence of any detected planes falls back to a baseline fronto-parallel image completion algorithm. We validate our technique through extensive comparisons with state-of-the-art algorithms on a variety of scenes.
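
One way to picture the structural guidance, as a hedged sketch: translational regularity detected within a plane can be summarized as a histogram prior over patch offsets, which the low-level completion step then consults. The paper's actual priors also cover plane-induced transformations.

    from collections import Counter

    def offset_prior(matched_offsets, smoothing=1e-6):
        # Convert detected translational regularity into a soft prior over
        # patch offsets: offsets at which patches frequently repeat receive
        # higher probability. This histogram form is an illustrative
        # stand-in, not the paper's exact prior.
        counts = Counter(matched_offsets)     # e.g. [(dy, dx), ...]
        total = sum(counts.values()) + smoothing * len(counts)
        return {off: (c + smoothing) / total for off, c in counts.items()}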


IEEE Signal Processing Letters | 2007

Information Preserving Color Transformation for Protanopia and Deuteranopia

Jia-Bin Huang; Yu-Cheng Tseng; Se-In Wu; Sheng-Jyh Wang

In this letter, we propose a new recoloring method for people with protanopic and deuteranopic color deficiencies. We present a color transformation that aims to preserve the color information in the original images while keeping the recolored images as natural as possible. Two error functions are introduced and combined to form an objective function using the Lagrange multiplier with a user-specified parameter λ. This objective function is then minimized to obtain the optimal settings. Experimental results show that the proposed method can yield more comprehensible images for color-deficient viewers while maintaining the naturalness of the recolored images for standard viewers.
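
Schematically, with placeholder names for the two error terms (the letter defines them more specifically), the combined objective has the form:

    E(\theta) \;=\; E_{\mathrm{info}}(\theta) \;+\; \lambda\, E_{\mathrm{natural}}(\theta),
    \qquad
    \theta^{\ast} \;=\; \arg\min_{\theta} E(\theta)

where θ parameterizes the color transformation and λ trades information preservation against naturalness for standard viewers.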


International Conference on Image Processing | 2010

Single image deblurring with adaptive dictionary learning

Zhe Hu; Jia-Bin Huang; Ming-Hsuan Yang

We propose a motion deblurring algorithm that exploits sparsity constraints of image patches using a single frame. In our formulation, each image patch is encoded with sparse coefficients using an over-complete dictionary. The sparsity constraints facilitate recovering the latent image without solving an ill-posed deconvolution problem. In addition, the dictionary is learned and updated directly from the single frame without using additional images. The proposed method iteratively exploits the sparsity constraints to recover the latent image, estimates the blur kernel, and updates the dictionary, all from a single image. The final deblurred image is then recovered once the blur kernel has been estimated. Experiments show that the proposed algorithm achieves favorable results against the state-of-the-art methods.
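
The per-patch sparse coding step can be sketched with ISTA, a standard ℓ1 solver used here purely for illustration; the paper's alternating kernel and dictionary updates are not reproduced.

    import numpy as np

    def sparse_code(D, y, lam=0.1, iters=200):
        # Encode a vectorized image patch y with sparse coefficients over
        # an over-complete dictionary D (n_pixels x n_atoms) by minimizing
        # 0.5 * ||D a - y||^2 + lam * ||a||_1 with ISTA.
        a = np.zeros(D.shape[1])
        L = np.linalg.norm(D, 2) ** 2         # Lipschitz constant of the gradient
        for _ in range(iters):
            a = a - (D.T @ (D @ a - y)) / L   # gradient step on the data term
            a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft-threshold
        return a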


Computer Vision and Pattern Recognition | 2016

Weakly Supervised Object Localization with Progressive Domain Adaptation

Dong Li; Jia-Bin Huang; Yali Li; Shengjin Wang; Ming-Hsuan Yang

We address the problem of weakly supervised object localization where only image-level annotations are available for training. Many existing approaches tackle this problem through object proposal mining. However, a substantial amount of noise in object proposals causes ambiguities for learning discriminative object models. Such approaches are sensitive to model initialization and often converge to an undesirable local minimum. In this paper, we address this problem by progressive domain adaptation with two main steps: classification adaptation and detection adaptation. In classification adaptation, we transfer a pre-trained network to our multi-label classification task for recognizing the presence of a certain object in an image. In detection adaptation, we first use a mask-out strategy to collect class-specific object proposals and apply multiple instance learning to mine confident candidates. We then use these selected object proposals to fine-tune all the layers, resulting in a fully adapted detection network. We extensively evaluate the localization performance on the PASCAL VOC and ILSVRC datasets and demonstrate significant performance improvement over the state-of-the-art methods.
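
A hedged sketch of the mask-out strategy: a proposal counts as class-specific evidence if hiding its region noticeably lowers the classifier's confidence for that class. The classify function below is an assumed black box mapping an image to class probabilities.

    import numpy as np

    def maskout_evidence(classify, image, box, cls):
        # Score one proposal by the confidence drop caused by masking it
        # out. `classify` (hypothetical) maps an HxWx3 array to a vector
        # of class probabilities; filling with the mean color is an
        # assumed masking choice.
        y0, x0, y1, x1 = box
        masked = image.copy()
        masked[y0:y1, x0:x1] = image.mean(axis=(0, 1))
        return classify(image)[cls] - classify(masked)[cls]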


Computer Vision and Pattern Recognition | 2017

Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution

Wei-Sheng Lai; Jia-Bin Huang; Narendra Ahuja; Ming-Hsuan Yang

Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image super-resolution. In this paper, we propose the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images. At each pyramid level, our model takes coarse-resolution feature maps as input, predicts the high-frequency residuals, and uses transposed convolutions for upsampling to the finer level. Our method does not require bicubic interpolation as a pre-processing step and thus dramatically reduces the computational complexity. We train the proposed LapSRN with deep supervision using a robust Charbonnier loss function and achieve high-quality reconstruction. Furthermore, our network generates multi-scale predictions in one feed-forward pass through progressive reconstruction, thereby facilitating resource-aware applications. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of speed and accuracy.
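
The Charbonnier loss mentioned above is a differentiable variant of L1; one common form (with an assumed small constant eps) is:

    import torch

    def charbonnier_loss(pred, target, eps=1e-3):
        # Robust Charbonnier penalty rho(x) = sqrt(x^2 + eps^2),
        # averaged over all pixels of the predicted residual.
        return torch.mean(torch.sqrt((pred - target) ** 2 + eps ** 2))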


Computer Vision and Pattern Recognition | 2010

Fast sparse representation with prototypes

Jia-Bin Huang; Ming-Hsuan Yang

Sparse representation has found applications in numerous domains, and recent developments have focused on the convex relaxation of the ℓ0-norm minimization for sparse coding (i.e., the ℓ1-norm minimization). Nevertheless, the time and space complexities of these algorithms remain significantly high for large-scale problems. As signals in most problems can be modeled by a small set of prototypes, we propose an algorithm that exploits this property and show that the ℓ1-norm minimization problem can be reduced to a much smaller problem, thereby gaining significant speed-ups with much lower memory requirements. Experimental results demonstrate that our algorithm achieves double-digit gains in speed with much lower memory requirements than the state-of-the-art algorithms.
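
A hedged sketch of the reduction: replace the columns of a large dictionary with a small set of prototypes (k-means centroids here, an assumed choice) and solve the much smaller ℓ1 problem against the prototypes only.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import Lasso

    def prototype_sparse_code(D, y, n_prototypes=64, alpha=0.05):
        # Shrink the l1 problem: cluster the atoms of D (n_pixels x n_atoms)
        # into a few prototypes, then run Lasso over the prototypes. The
        # k-means construction is an illustrative assumption, not
        # necessarily the paper's prototype selection.
        km = KMeans(n_clusters=n_prototypes, n_init=10).fit(D.T)
        P = km.cluster_centers_.T             # n_pixels x n_prototypes
        coef = Lasso(alpha=alpha, max_iter=5000).fit(P, y).coef_
        return P, coef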

Collaboration


Dive into Jia-Bin Huang's collaborations.

Top Co-Authors

Wei-Sheng Lai
University of California

Aisling Kelliher
Carnegie Mellon University

Chao Ma
Shanghai Jiao Tong University