Hanlin Goh
Agency for Science, Technology and Research
Publications
Featured research published by Hanlin Goh.
European Conference on Computer Vision | 2012
Hanlin Goh; Nicolas Thome; Matthieu Cord; Joo-Hwee Lim
Recently, the coding of local features (e.g. SIFT) for image categorization tasks has been extensively studied. Incorporated within the Bag of Words (BoW) framework, these techniques optimize the projection of local features into the visual codebook, leading to state-of-the-art performance on many benchmark datasets. In this work, we propose a novel visual codebook learning approach using the restricted Boltzmann machine (RBM) as our generative model. Our contribution is three-fold. Firstly, we steer the unsupervised RBM learning using a regularization scheme, which decomposes into a combined prior for the sparsity of each feature's representation as well as the selectivity of each codeword. The codewords are then fine-tuned to be discriminative through supervised learning from top-down labels. Secondly, we evaluate the proposed method on the Caltech-101 and 15-Scenes datasets, either matching or outperforming state-of-the-art results. The codebooks are compact and inference is fast. Finally, we introduce an original method to visualize the codebooks and decipher what each visual codeword encodes.
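A minimal Python sketch of the sparsity-and-selectivity idea described above (not the authors' exact formulation): given a batch of RBM hidden activation probabilities, a cross-entropy penalty pushes the per-example means (sparsity) and the per-codeword means (selectivity) toward a low target rate. The name `target_rate` and the 5% default are illustrative assumptions.

```python
import numpy as np

def sparsity_selectivity_penalty(hidden_probs, target_rate=0.05):
    """Hedged sketch: cross-entropy penalty pushing RBM hidden activations
    toward a low target rate, averaged two ways.

    hidden_probs: (n_examples, n_codewords) matrix of activation probabilities.
    Sparsity    -> each example should activate few codewords (low row means).
    Selectivity -> each codeword should respond to few examples (low column means).
    """
    eps = 1e-8
    p = np.clip(hidden_probs, eps, 1.0 - eps)

    row_means = p.mean(axis=1)   # one value per example
    col_means = p.mean(axis=0)   # one value per codeword

    def cross_entropy(mean_act):
        t = target_rate
        return -(t * np.log(mean_act) + (1.0 - t) * np.log(1.0 - mean_act)).mean()

    return cross_entropy(row_means) + cross_entropy(col_means)


# Example: 128 SIFT descriptors encoded by a 256-codeword RBM.
activations = np.random.rand(128, 256)
print(sparsity_selectivity_penalty(activations))
```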
IEEE Transactions on Neural Networks | 2014
Hanlin Goh; Nicolas Thome; Matthieu Cord; Joo-Hwee Lim
In this paper, we propose a hybrid architecture that combines the image modeling strengths of the bag of words framework with the representational power and adaptability of deep learning architectures. Local gradient-based descriptors, such as SIFT, are encoded via a hierarchical coding scheme composed of spatial aggregating restricted Boltzmann machines (RBMs). For each coding layer, we regularize the RBM by encouraging representations to fit both sparse and selective distributions. Supervised fine-tuning is used to enhance the quality of the visual representation for the categorization task. We performed a thorough experimental evaluation using three image categorization data sets. The hierarchical coding scheme achieved competitive categorization accuracies of 79.7% and 86.4% on the Caltech-101 and 15-Scenes data sets, respectively. The visual representations learned are compact and model inference is fast, as compared with sparse coding methods. The low-level descriptor representations learned using this method are generic features that we empirically found to be transferable between different image data sets. Further analysis reveals the significance of supervised fine-tuning when the architecture has two layers of representations as opposed to a single layer.
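As an illustration of the spatial aggregation step, the following hedged sketch max-pools encoded local descriptors over a regular spatial grid and concatenates the cell vectors; the paper's spatial aggregating RBMs are more involved, and the grid size used here is an arbitrary assumption.

```python
import numpy as np

def spatial_max_pool(codes, positions, image_size, grid=(2, 2)):
    """Hedged sketch of spatial aggregation: max-pool local feature codes
    over a regular spatial grid, then concatenate the cell descriptors.

    codes:      (n_descriptors, code_dim) encoded local features (e.g. RBM outputs)
    positions:  (n_descriptors, 2) (x, y) keypoint locations in pixels
    image_size: (width, height)
    """
    w, h = image_size
    gx, gy = grid
    pooled = np.zeros((gx * gy, codes.shape[1]))
    cell_x = np.minimum((positions[:, 0] * gx / w).astype(int), gx - 1)
    cell_y = np.minimum((positions[:, 1] * gy / h).astype(int), gy - 1)
    cell_idx = cell_y * gx + cell_x
    for c in range(gx * gy):
        mask = cell_idx == c
        if mask.any():
            pooled[c] = codes[mask].max(axis=0)
    return pooled.ravel()


# Example: 500 descriptors with 256-d codes in a 640x480 image.
codes = np.random.rand(500, 256)
positions = np.random.rand(500, 2) * [640, 480]
print(spatial_max_pool(codes, positions, (640, 480)).shape)  # (1024,)
```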
International Conference on Image Processing | 2011
Hanlin Goh; Lukasz Kusmierz; Joo-Hwee Lim; Nicolas Thome; Matthieu Cord
Our objective is to learn invariant color features directly from data via unsupervised learning. In this paper, we introduce a method to regularize restricted Boltzmann machines during training to obtain features that are sparse and topographically organized. Upon analysis, the features learned are Gabor-like and demonstrate a coding of orientation, spatial position, frequency and color that varies smoothly with the topography of the feature map. There is also differentiation between monochrome and color filters, with some exhibiting color-opponent properties. We also found that the learned representation is more invariant to affine image transformations and changes in illumination color.
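A rough sketch of one way to encourage topographic organization (in the spirit of the regularizer described, not its exact form): hidden units are laid out on a 2D map and locally pooled squared activations are penalized, so neighboring units tend to learn smoothly varying features. The map shape and neighborhood size are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def topographic_penalty(hidden_probs, map_shape, neighborhood=3):
    """Hedged sketch of a topographic regularizer: lay the hidden units out on
    a 2D map and penalize the square root of locally pooled squared activations.

    hidden_probs: (n_examples, n_hidden) activations, with n_hidden == prod(map_shape)
    """
    penalty = 0.0
    for h in hidden_probs:
        grid = h.reshape(map_shape) ** 2
        pooled = uniform_filter(grid, size=neighborhood, mode='wrap')
        penalty += np.sqrt(pooled + 1e-8).sum()
    return penalty / len(hidden_probs)


# Example: a 16x16 map of hidden units over a batch of 32 image patches.
acts = np.random.rand(32, 256)
print(topographic_penalty(acts, (16, 16)))
```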
International Conference on Multimedia and Expo | 2008
Yiqun Li; Joo-Hwee Lim; Hanlin Goh
A two-stage cascaded classification approach with an optimal candidate selection scheme is proposed to recognize places using images taken by camera phones. An optimal acceptance threshold is chosen to maximize the probability of accepting more positives and rejecting more negatives at the first stage, so that an optimal number of candidates is selected. The first classifier is trained using simple color and texture features. The second classifier is trained using the scale-invariant feature transform (SIFT). For a query image, a number of matching candidates are selected using k nearest neighbors at the first stage and passed on to the second stage for a refining classification that selects the best matching result. The search range is narrowed down dynamically at the second stage depending on the output of the first stage. Experimental results show that this method is promising, improving recognition accuracy while reducing computation time.
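A hedged sketch of the two-stage cascade described above: a cheap k-nearest-neighbor pass with an acceptance threshold shortlists candidates, and only those candidates are re-scored with a more expensive matcher. The feature dimensions, threshold value, and stand-in second-stage distance are assumptions, not the paper's exact pipeline.

```python
import numpy as np

def two_stage_classify(query_cheap, query_expensive,
                       db_cheap, db_expensive, labels,
                       k=10, accept_threshold=1.5):
    """Hedged cascade sketch: shortlist with cheap features, refine the
    shortlist with a stand-in distance on 'expensive' features."""
    # Stage 1: k nearest neighbors on cheap color/texture features.
    d1 = np.linalg.norm(db_cheap - query_cheap, axis=1)
    candidates = np.argsort(d1)[:k]
    candidates = candidates[d1[candidates] < accept_threshold]
    if len(candidates) == 0:
        return None  # rejected at the first stage

    # Stage 2: refine only the shortlisted candidates.
    d2 = np.linalg.norm(db_expensive[candidates] - query_expensive, axis=1)
    return labels[candidates[np.argmin(d2)]]


# Toy usage: 100 database images, 8-d cheap features, 64-d expensive features.
rng = np.random.default_rng(0)
db_c, db_e = rng.random((100, 8)), rng.random((100, 64))
labels = np.arange(100) % 5
print(two_stage_classify(db_c[3], db_e[3], db_c, db_e, labels))
```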
Computer Vision and Pattern Recognition | 2008
Tat-Jun Chin; Hanlin Goh; Joo-Hwee Lim
We investigate the task of efficiently training classifiers to build a robust place recognition system. We advocate an approach which involves densely capturing the facades of buildings and landmarks with video recordings to greedily accumulate as much visual information as possible. Our contributions include (1) a preprocessing step that effectively exploits the temporal continuity intrinsic in the video sequences to dramatically increase training efficiency, (2) training sparse classifiers discriminatively with the resulting data using the AdaBoost principle for place recognition, and (3) methods to speed up recognition using scaled kd-trees and to perform geometric validation on the results. Compared to straightforwardly applying scene recognition methods, our method not only allows a much faster training phase but also yields more accurate classifiers. The sparsity of the classifiers also ensures good potential for recognition at high frame rates. We show extensive experimental results to validate our claims.
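For the recognition speed-up, the sketch below shows standard kd-tree accelerated descriptor matching with a Lowe-style ratio test (SciPy's `cKDTree`); the paper's scaled kd-trees and AdaBoost-trained sparse classifiers are not reproduced here, and the ratio value is an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def match_descriptors(query_desc, db_desc, ratio=0.8):
    """Hedged sketch of kd-tree accelerated matching: for each query
    descriptor, keep the nearest database descriptor only if it is clearly
    closer than the second nearest (ratio test)."""
    tree = cKDTree(db_desc)
    dists, idx = tree.query(query_desc, k=2)
    keep = dists[:, 0] < ratio * dists[:, 1]
    return np.flatnonzero(keep), idx[keep, 0]


# Example: match 200 query SIFT descriptors against a database of 5000.
q = np.random.rand(200, 128)
db = np.random.rand(5000, 128)
print(len(match_descriptors(q, db)[0]), "putative matches")
```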
International Conference on Pattern Recognition | 2008
Tat-Jun Chin; Hanlin Goh; Ngan Meng Tan
Using integral images for fast computation of sums over rectangular areas is very popular in computer vision. However, the method does not extend naturally to rotations at arbitrary angles. We propose a novel solution to elegantly compute integral images at generic angles. Our method is exact in the sense that no approximations are used to derive it, and it is vulnerable only to the unavoidable aliasing effects of discretization. Detailed experiments show that our method is more accurate than previously proposed ideas. We also demonstrate its usefulness by detecting 2D barcodes embedded in images.
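For context, the classic axis-aligned integral image and its constant-time rectangle sum are sketched below; the paper's contribution is the exact generalization of this construction to rectangles rotated by arbitrary angles, which is not shown here.

```python
import numpy as np

def integral_image(img):
    """Cumulative sum along both axes: ii[y, x] = sum of img[:y+1, :x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1+1, x0:x1+1] from four integral-image lookups
    (standard axis-aligned case only)."""
    total = ii[y1, x1]
    if y0 > 0:
        total -= ii[y0 - 1, x1]
    if x0 > 0:
        total -= ii[y1, x0 - 1]
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]
    return total


# Sanity check against a brute-force sum.
img = np.random.rand(100, 120)
ii = integral_image(img)
print(np.isclose(rect_sum(ii, 10, 20, 40, 60), img[10:41, 20:61].sum()))
```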
International Conference on Multimedia Retrieval | 2017
Jie Lin; Olivier Morère; Antoine Veillard; Ling-Yu Duan; Hanlin Goh; Vijay Chandrasekhar
This work focuses on representing very high-dimensional global image descriptors using very compact 64-1024 bit binary hashes for instance retrieval. We propose DeepHash: a hashing scheme based on deep networks. Key to making DeepHash work at extremely low bitrates are three important considerations -- regularization, depth and fine-tuning -- each requiring solutions specific to the hashing problem. In-depth evaluation shows that our scheme outperforms state-of-the-art methods over several benchmark datasets for both Fisher Vectors and Deep Convolutional Neural Network features, by up to 8.5% over other schemes. The retrieval performance with 256-bit hashes is close to that of the uncompressed floating point features -- a remarkable 512x compression.
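The sketch below illustrates the general compact-binary-hashing setting rather than DeepHash itself: global descriptors are thresholded at their per-dimension medians into bit vectors and ranked by Hamming distance. The 256-bit size and median thresholding are illustrative assumptions.

```python
import numpy as np

def binarize(features, thresholds=None):
    """Hedged sketch (not DeepHash itself): threshold each dimension of a
    global descriptor at its database median to get a compact binary hash."""
    if thresholds is None:
        thresholds = np.median(features, axis=0)
    return (features > thresholds).astype(np.uint8), thresholds

def hamming_rank(query_bits, db_bits):
    """Rank database hashes by Hamming distance to the query."""
    dists = (db_bits != query_bits).sum(axis=1)
    return np.argsort(dists), dists


# Example: hash 1000 database descriptors to 256 bits and query one of them.
db = np.random.randn(1000, 256)
db_bits, th = binarize(db)
q_bits, _ = binarize(db[42:43], th)
order, dists = hamming_rank(q_bits[0], db_bits)
print(order[0], dists[order[0]])  # expect index 42 at distance 0
```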
International Conference on Acoustics, Speech, and Signal Processing | 2008
Tat-Jun Chin; Hanlin Goh; Joo-Hwee Lim
We investigate the task of efficiently modeling a scene to build a robust place recognition system. We propose an approach which involves densely capturing a place with video recordings to greedily cover as many viewpoints of the place as possible. Our contribution is a framework to (1) effectively exploit the temporal continuity intrinsic in the video sequences to reduce the amount of data to process without losing the unique visual information which describes a place, and (2) train discriminative classifiers with the reduced data for place recognition. We show that our method is more efficient and effective than straightforwardly applying scene or object category recognition methods on the video frames.
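A minimal sketch of exploiting temporal continuity for data reduction, under the assumption that consecutive frames with near-identical feature vectors add little new visual information; the distance threshold and feature representation are illustrative, not the paper's actual criterion.

```python
import numpy as np

def select_keyframes(frame_features, min_distance=0.5):
    """Hedged sketch: walk through the video in order and keep a frame only
    when its feature vector has drifted far enough from the last kept frame,
    discarding near-duplicate viewpoints."""
    kept = [0]
    for i in range(1, len(frame_features)):
        if np.linalg.norm(frame_features[i] - frame_features[kept[-1]]) > min_distance:
            kept.append(i)
    return kept


# Example: 300 frames with slowly drifting 64-d features.
feats = np.cumsum(np.random.randn(300, 64) * 0.02, axis=0)
print(len(select_keyframes(feats)), "keyframes kept out of 300")
```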
IEEE International Conference on Cognitive Informatics | 2009
Hanlin Goh; Joo-Hwee Lim; Chai Quek
In this paper, we construct a neural-inspired computational model based on the representational capabilities of receptive fields. The proposed model, known as Shape Encoding Receptive Fields (SERF), is able to perform fast and accurate classification and regression of multi-dimensional data. A SERF is a histogram structure that encodes the shape of multi-dimensional data relative to its center, in a manner similar to the neural coding of sensory stimuli by receptive fields. The bins of this histogram represent local regions in an n-dimensional space. During the training phase, an ensemble of K SERF structures is initialized and data is summarized into the corresponding bins of each SERF structure. The collection of local data summaries makes each SERF a coarse nonlinear data predictor over the entire feature space. The output prediction for an unknown query is computed by the weighted aggregation of the hypotheses of the ensemble of K SERFs. In our series of experiments, we demonstrate the model's ability to perform fast and accurate data prediction.
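A loose Python sketch of a SERF-like predictor, under simplifying assumptions: each structure quantizes a point's offset from a random center into a bin, stores per-bin target means during training, and the ensemble aggregates hypotheses weighted by bin occupancy. The actual SERF encoding is richer than this; all names and parameter values here are illustrative.

```python
import numpy as np
from collections import defaultdict

class SerfSketch:
    """Hedged sketch of one SERF-like predictor: bin training points by their
    quantized offset from a random center and store the mean target per bin."""
    def __init__(self, center, bin_width=0.25):
        self.center = center
        self.bin_width = bin_width
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def _bin(self, x):
        return tuple(np.floor((x - self.center) / self.bin_width).astype(int))

    def fit(self, X, y):
        for xi, yi in zip(X, y):
            b = self._bin(xi)
            self.sums[b] += yi
            self.counts[b] += 1

    def predict_one(self, x):
        b = self._bin(x)
        n = self.counts.get(b, 0)
        return (self.sums[b] / n, n) if n else (0.0, 0)


def ensemble_predict(serfs, x):
    """Aggregate the ensemble, weighting each SERF's hypothesis by how much
    training data fell into the queried bin."""
    preds, weights = zip(*(s.predict_one(x) for s in serfs))
    total = sum(weights)
    return sum(p * w for p, w in zip(preds, weights)) / total if total else 0.0


# Example: regress y = sum(x) with K = 10 SERF-like predictors.
rng = np.random.default_rng(0)
X = rng.random((2000, 3))
y = X.sum(axis=1)
serfs = [SerfSketch(rng.random(3)) for _ in range(10)]
for s in serfs:
    s.fit(X, y)
print(ensemble_predict(serfs, np.array([0.3, 0.6, 0.1])))  # roughly 1.0
```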
International Symposium on Neural Networks | 2008
Hanlin Goh; Joo-Hwee Lim; Chai Quek
Fuzzy associative conjuncted maps (FASCOM) is a fuzzy neural network that represents information by conjuncting fuzzy sets and associates them through a combination of unsupervised and supervised learning. The network first quantizes input and output feature maps using fuzzy sets. These are subsequently conjuncted to form antecedents and consequences, and associated to form fuzzy if-then rules. The associations are learnt through a learning process consisting of three consecutive phases. First, an unsupervised phase initializes the fuzzy membership functions that partition each feature map based on information density. Next, a supervised Hebbian learning phase encodes the synaptic weights of the input-output associations. Finally, a supervised error-reduction phase fine-tunes the network and discovers the varying influence of an input dimension across the output feature space. FASCOM was benchmarked against other prominent architectures using data from three nonlinear data estimation tasks and a real-world road traffic density prediction problem. The promising results show significant improvements over the state of the art for all four data prediction tasks.
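A hedged sketch of the first (unsupervised) phase only: fuzzy set centers are placed at quantiles of the observed feature values, so densely populated regions of the feature map receive a finer partition, and Gaussian memberships are then computed for an input. The number of sets and the membership shape are assumptions, not FASCOM's exact formulation.

```python
import numpy as np

def density_based_fuzzy_sets(values, n_sets=5):
    """Hedged sketch: place fuzzy set centers at quantiles of the observed
    feature values, so dense regions get closer centers and narrower sets."""
    centers = np.quantile(values, np.linspace(0, 1, n_sets))
    widths = np.gradient(centers)  # wider sets where data is sparse
    return centers, np.maximum(widths, 1e-6)

def memberships(x, centers, widths):
    """Gaussian membership of a scalar input in each fuzzy set."""
    return np.exp(-0.5 * ((x - centers) / widths) ** 2)


# Example: partition a skewed traffic-density-like feature into 5 fuzzy sets.
values = np.random.gamma(shape=2.0, scale=1.5, size=10_000)
centers, widths = density_based_fuzzy_sets(values)
print(np.round(memberships(2.0, centers, widths), 3))
```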