Bohan Zhuang
University of Adelaide
Publication
Featured research published by Bohan Zhuang.
computer vision and pattern recognition | 2016
Bohan Zhuang; Guosheng Lin; Chunhua Shen; Ian D. Reid
In this paper, we aim to learn a mapping (or embedding) from images to a compact binary space in which Hamming distances correspond to a ranking measure for the image retrieval task. We make use of a triplet loss because this has been shown to be most effective for ranking problems. However, training in previous works can be prohibitively expensive because optimization is performed directly on the triplet space, where the number of possible triplets for training is cubic in the number of training examples. To address this issue, we propose to formulate high-order binary code learning as a multi-label classification problem by explicitly separating learning into two interleaved stages. To solve the first stage, we design a large-scale high-order binary code inference algorithm that reduces the high-order objective to a standard binary quadratic problem, so that graph cuts can be used to efficiently infer the binary codes that serve as the labels of each training datum. In the second stage, we propose to map the original image to compact binary codes via carefully designed deep convolutional neural networks (CNNs), so that fitting the hashing function reduces to training binary CNN classifiers. An incremental/interleaved optimization strategy is proposed to ensure that these two steps interact with each other during training for better accuracy. We conduct experiments on several benchmark datasets, which demonstrate both improved training time (by as much as two orders of magnitude) and state-of-the-art hashing performance on various retrieval tasks.
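As a rough illustration of the quantities this abstract mentions, the sketch below computes Hamming distances between binary codes, a hinge-style triplet ranking loss on those distances, and the cubic growth in the number of candidate triplets. The function names, the margin value, and the hinge form of the loss are assumptions made for illustration; the abstract does not specify the exact loss used in the paper.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def triplet_hinge_loss(anchor, positive, negative, margin=1):
    """Hinge-style triplet loss on Hamming distances (assumed form).

    The loss is zero when the positive code is closer to the anchor
    than the negative code by at least `margin` bits.
    """
    d_pos = hamming_distance(anchor, positive)
    d_neg = hamming_distance(anchor, negative)
    return max(0, margin + d_pos - d_neg)


def num_ordered_triplets(n):
    """Upper bound on (anchor, positive, negative) triplets from n examples."""
    return n * (n - 1) * (n - 2)


anchor = [1, 0, 1, 1, 0, 0, 1, 0]
positive = [1, 0, 1, 0, 0, 0, 1, 0]  # differs from the anchor in 1 bit
negative = [0, 1, 1, 1, 1, 0, 0, 0]  # differs from the anchor in 4 bits

well_ranked_loss = triplet_hinge_loss(anchor, positive, negative)
mis_ranked_loss = triplet_hinge_loss(anchor, negative, positive)
triplets_for_1000 = num_ordered_triplets(1000)
```

With 1,000 training examples the bound is already close to a billion triplets, which is the cost that the two-stage formulation described above is meant to avoid.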
computer vision and pattern recognition | 2017
Bohan Zhuang; Lingqiao Liu; Yao Li; Chunhua Shen; Ian D. Reid
Large-scale datasets have driven the rapid development of deep neural networks for visual recognition. However, annotating a massive dataset is expensive and time-consuming. Web images and their labels are, in comparison, much easier to obtain, but direct training on such automatically harvested images can lead to unsatisfactory performance, because the noisy labels of Web images adversely affect the learned recognition models. To address this drawback we propose an end-to-end weakly-supervised deep learning framework which is robust to the label noise in Web images. The proposed framework relies on two unified strategies – random grouping and attention – to effectively reduce the negative impact of noisy Web image annotations. Specifically, random grouping stacks multiple images into a single training instance and thus increases the labeling accuracy at the instance level. Attention, on the other hand, suppresses the noisy signals from both incorrectly labeled images and less discriminative image regions. By conducting intensive experiments on two challenging datasets, including a newly collected fine-grained dataset with Web images of different car models, the superior performance of the proposed methods over competitive baselines is clearly demonstrated.
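One way to see why random grouping can raise instance-level labeling accuracy is a small probability calculation. The sketch below assumes that label noise is independent across images and that a group's label counts as correct when at least one image in the group truly belongs to the labeled class. Both assumptions, and the function name, are illustrative and are not stated in the abstract.

```python
def group_label_accuracy(image_label_accuracy, group_size):
    """Probability that a group of images contains at least one correctly
    labeled image, assuming independent label noise (illustrative model)."""
    if not 0.0 <= image_label_accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return 1.0 - (1.0 - image_label_accuracy) ** group_size


single_image = group_label_accuracy(0.8, 1)
group_of_three = group_label_accuracy(0.8, 3)
```

Under these assumptions, a per-image label accuracy of 0.8 rises to roughly 0.992 for groups of three images, which matches the direction of the effect described above.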
computer vision and pattern recognition | 2017
Yao Li; Guosheng Lin; Bohan Zhuang; Lingqiao Liu; Chunhua Shen; Anton van den Hengel
Recognizing the identities of people in everyday photos is still a very challenging problem for machine vision, due to issues such as non-frontal faces and changes in clothing, location, and lighting. Recent studies have shown that rich relational information between people in the same photo can help in recognizing their identities. In this work, we propose to model the relational information between people as a sequence prediction task. At the core of our work is a novel recurrent network architecture, in which relational information between instance labels and appearance are modeled jointly. In addition to relational cues, scene context is incorporated in our sequence prediction model at no additional cost. In this sense, our approach is a unified framework for modeling both contextual cues and visual appearance of person instances. Our model is trained end-to-end with a sequence of annotated instances in a photo as inputs, and a sequence of corresponding labels as targets. We demonstrate that this simple but elegant formulation achieves state-of-the-art performance on the newly released People In Photo Albums (PIPA) dataset.
Plant Methods | 2017
Hao Lu; Zhiguo Cao; Yang Xiao; Bohan Zhuang; Chunhua Shen
Background: Accurately counting maize tassels is important for monitoring the growth status of maize plants. This tedious task, however, is still mainly done by manual efforts. In the context of modern plant phenotyping, automating this task is required to meet the need of large-scale analysis of genotype and phenotype. In recent years, computer vision technologies have experienced a significant breakthrough due to the emergence of large-scale datasets and increased computational resources. Naturally image-based approaches have also received much attention in plant-related studies. Yet a fact is that most image-based systems for plant phenotyping are deployed under controlled laboratory environment. When transferring the application scenario to unconstrained in-field conditions, intrinsic and extrinsic variations in the wild pose great challenges for accurate counting of maize tassels, which goes beyond the ability of conventional image processing techniques. This calls for further robust computer vision approaches to address in-field variations.
Results: This paper studies the in-field counting problem of maize tassels. To our knowledge, this is the first time that a plant-related counting problem is considered using computer vision technologies under unconstrained field-based environment. With 361 field images collected in four experimental fields across China between 2010 and 2015 and corresponding manually-labelled dotted annotations, a novel Maize Tassels Counting (MTC) dataset is created and will be released with this paper. To alleviate the in-field challenges, a deep convolutional neural network-based approach termed TasselNet is proposed. TasselNet can achieve good adaptability to in-field variations via modelling the local visual characteristics of field images and regressing the local counts of maize tassels. Extensive results on the MTC dataset demonstrate that TasselNet outperforms other state-of-the-art approaches by large margins and achieves the overall best counting performance, with a mean absolute error of 6.6 and a mean squared error of 9.6 averaged over 8 test sequences.
Conclusions: TasselNet can achieve robust in-field counting of maize tassels with a relatively high degree of accuracy. Our experimental evaluations also suggest several good practices for practitioners working on maize-tassel-like counting problems. It is worth noting that, though the counting errors have been greatly reduced by TasselNet, in-field counting of maize tassels remains an open and unsolved problem.
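For readers unfamiliar with the counting metrics quoted above, the following sketch computes a mean absolute error and a mean squared error over per-image counts. The function names and the example counts are hypothetical, and the exact metric definitions used in the paper (for instance, whether the squared error is reported as a root) are not given in the abstract, so this is only an assumed, textbook form.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def mean_squared_error(predicted, actual):
    """Average squared difference between predicted and true counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical per-image tassel counts, not taken from the MTC dataset.
predicted_counts = [10, 22, 35]
true_counts = [12, 20, 30]

mae = mean_absolute_error(predicted_counts, true_counts)
mse = mean_squared_error(predicted_counts, true_counts)
```

Squared error penalizes large per-image miscounts more heavily than absolute error, which is why counting papers often report both.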
international conference on computer vision | 2017
Bohan Zhuang; Lingqiao Liu; Chunhua Shen; Ian D. Reid
arXiv: Computer Vision and Pattern Recognition | 2017
Bohan Zhuang; Lingqiao Liu; Chunhua Shen; Ian D. Reid
computer vision and pattern recognition | 2018
Bohan Zhuang; Chunhua Shen; Mingkui Tan; Lingqiao Liu; Ian D. Reid
computer vision and pattern recognition | 2018
Bohan Zhuang; Qi Wu; Chunhua Shen; Ian D. Reid; Anton van den Hengel
Archive | 2017
Bohan Zhuang; Qi Wu; Chunhua Shen; Ian D. Reid; Anton van den Hengel
neural information processing systems | 2018
Zhuangwei Zhuang; Mingkui Tan; Bohan Zhuang; Jing Liu; Yong Guo; Qingyao Wu; Junzhou Huang; Jinhui Zhu