Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Huei-Fang Yang is active.

Publication


Featured research published by Huei-Fang Yang.


Computer Vision and Pattern Recognition | 2015

Deep learning of binary hash codes for fast image retrieval

Kevin Lin; Huei-Fang Yang; Jen-Hao Hsiao; Chu-Song Chen

Approximate nearest neighbor search is an efficient strategy for large-scale image retrieval. Encouraged by the recent advances in convolutional neural networks (CNNs), we propose an effective deep learning framework to generate binary hash codes for fast image retrieval. Our idea is that when the data labels are available, binary codes can be learned by employing a hidden layer for representing the latent concepts that dominate the class labels. The utilization of the CNN also allows for learning image representations. Unlike other supervised methods that require pair-wise inputs for binary code learning, our method learns hash codes and image representations in a point-wise manner, making it suitable for large-scale datasets. Experimental results show that our method outperforms several state-of-the-art hashing algorithms on the CIFAR-10 and MNIST datasets. We further demonstrate its scalability and efficacy on a large-scale dataset of 1 million clothing images.
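The retrieval pipeline the abstract describes can be illustrated in miniature: threshold the hidden-layer activations into binary codes, then rank a database by Hamming distance. This is a hedged sketch with made-up helper names, not the authors' implementation (the real system uses CNN activations, not hand-picked numbers):

```python
def binarize(activations, threshold=0.5):
    """Turn real-valued latent-layer activations into a binary hash code."""
    return tuple(1 if a > threshold else 0 for a in activations)

def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming(query_code, database_codes[i]))

q = binarize([0.9, 0.2, 0.7, 0.1])                  # -> (1, 0, 1, 0)
db = [binarize(v) for v in ([0.8, 0.1, 0.6, 0.2],   # distance 0 to q
                            [0.1, 0.9, 0.2, 0.8],   # distance 4
                            [0.9, 0.3, 0.2, 0.1])]  # distance 1
print(retrieve(q, db))  # -> [0, 2, 1]
```

Because the codes are short bit strings, the Hamming ranking above can be done with bitwise operations over millions of items, which is what makes the approach attractive at scale.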


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2018

Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks

Huei-Fang Yang; Kevin Lin; Chu-Song Chen

This paper presents a simple yet effective supervised deep hash approach that constructs binary hash codes from labeled data for large-scale image search. We assume that the semantic labels are governed by several latent attributes, each of which is on or off, and that classification relies on these attributes. Based on this assumption, our approach, dubbed supervised semantics-preserving deep hashing (SSDH), constructs hash functions as a latent layer in a deep network, and the binary codes are learned by minimizing an objective function defined over classification error and other desirable properties of the hash codes. With this design, SSDH has the appealing characteristic that classification and retrieval are unified in a single learning model. Moreover, SSDH performs joint learning of image representations, hash codes, and classification in a point-wise manner, and is thus scalable to large-scale datasets. SSDH is simple and can be realized by a slight enhancement of an existing deep architecture for classification; yet it is effective and outperforms other hashing approaches on several benchmarks and large datasets. Compared with state-of-the-art approaches, SSDH achieves higher retrieval accuracy without sacrificing classification performance.
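The shape of such an objective can be sketched as classification error plus penalties encouraging near-binary and balanced latent activations. The weights `alpha` and `beta` and the exact penalty forms below are illustrative assumptions, not the paper's definitions:

```python
import math

def ssdh_style_loss(latent, logits, label, alpha=1.0, beta=1.0):
    """Sketch of an SSDH-style objective: classification cross-entropy
    plus terms nudging sigmoid activations `latent` (values in [0, 1])
    toward binary and balanced codes. alpha/beta are hypothetical."""
    # Classification term: cross-entropy of softmax(logits) vs. the label.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    ce = -math.log(exps[label] / sum(exps))
    # Push activations toward 0/1: reward mean squared distance from 0.5.
    binarize_pen = -sum((h - 0.5) ** 2 for h in latent) / len(latent)
    # Balance term: mean activation should sit near 0.5 (half the bits on).
    balance_pen = (sum(latent) / len(latent) - 0.5) ** 2
    return ce + alpha * binarize_pen + beta * balance_pen
```

With this kind of combined loss, nearly-binary, balanced activations score lower than activations stuck at 0.5, so gradient descent drives the latent layer toward usable hash bits while the cross-entropy term keeps the codes class-discriminative.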


International Conference on Multimedia Retrieval | 2015

Rapid Clothing Retrieval via Deep Learning of Binary Codes and Hierarchical Search

Kevin Lin; Huei-Fang Yang; Kuan-Hsien Liu; Jen-Hao Hsiao; Chu-Song Chen

This paper deals with the problem of clothing retrieval in a recommendation system. We develop a hierarchical deep search framework to tackle this problem. In module 1, we use a pre-trained network model that has learned rich mid-level visual representations. Then, in module 2, we add a latent layer to the network and have the neurons in this layer learn hash-like representations while fine-tuning the network on the clothing dataset. Finally, module 3 achieves fast clothing retrieval using the learned hash codes and representations via a coarse-to-fine strategy. We use a large clothing dataset in which 161,234 clothing images were collected and labeled. Experiments demonstrate the potential of our proposed framework for clothing retrieval in a large corpus.
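The coarse-to-fine strategy of module 3 can be sketched as a two-stage search: shortlist by Hamming distance on the binary codes, then re-rank the shortlist by Euclidean distance on the real-valued features. The function below is a minimal illustration under those assumptions, not the paper's code:

```python
import math

def hamming(a, b):
    """Bit differences between two binary codes."""
    return sum(x != y for x, y in zip(a, b))

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def coarse_to_fine(query_code, query_feat, db_codes, db_feats, pool=2):
    """Coarse stage: keep the `pool` nearest items by Hamming distance on
    binary codes. Fine stage: re-rank that shortlist by Euclidean distance
    on real-valued features. Returns database indices, best first."""
    shortlist = sorted(range(len(db_codes)),
                       key=lambda i: hamming(query_code, db_codes[i]))[:pool]
    return sorted(shortlist,
                  key=lambda i: euclidean(query_feat, db_feats[i]))
```

The coarse stage is cheap (bit operations) and prunes most of the corpus; the expensive Euclidean comparison only touches the small candidate pool, which is what makes the hierarchy fast.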


British Machine Vision Conference | 2015

Automatic Age Estimation from Face Images via Deep Ranking

Huei-Fang Yang; Bo-Yao Lin; Kuang-Yu Chang; Chu-Song Chen

This paper focuses on automatic age estimation (AAE) from face images, which amounts to determining the exact age or age group of a face image according to features from faces. Although great effort has been devoted to AAE [1, 4, 6], it remains a challenging problem. The difficulties are due to large facial appearance variations resulting from a number of factors, e.g., aging and facial expressions. AAE algorithms need to overcome heterogeneity in facial appearance changes to provide accurate age estimates. To this end, we propose a generic, deep network model for AAE (see Figure 1). Given a face image, our network first extracts features from the face through a 3-layer scattering network (ScatNet) [2], then reduces the feature dimension by principal component analysis (PCA), and finally predicts the age via category-wise rankers constructed as a 3-layer fully-connected network. The contributions are: (1) Our ranking method is point-wise and thus is easily scaled up to large-scale datasets; (2) our deep ranking model is general and can be applied to age estimation from faces with large facial appearance variations as a result of aging or facial expression changes; and (3) we show that the high-level concepts learned from large-scale neutral faces can be transferred to estimating ages from faces under expression changes, leading to improved performance. Our model has the following characteristics: (1) The scattering features are invariant to translation and small deformations. ScatNet is a deep convolutional network with specific characteristics. It uses predefined wavelets and computes scattering representations via a cascade of wavelet transforms and modulus pooling operators from shallow to deep layers. With the nonlinear modulus and averaging operators, ScatNet can produce representations that are discriminative as well as invariant to translation and small deformations.
As ScatNet provides fundamentally invariant representations for discriminative feature extraction, only the weights of the fully-connected layers are learned in our network model, which considerably reduces the training time. (2) The rank labels encoded in the network exploit the ordering relation among labels. Each category-wise ranker is an ordinal regression ranker. We encode the age rank based on the reduction framework [5]. Given a set of training samples X = {(x_i, y_i), i = 1, ..., N}, let x_i ∈ R^D be the input image and y_i ∈ {1, ..., K} its rank label, where K is the number of age ranks. For rank k, we separate X into two subsets, X_k^+ and X_k^-, as follows: X_k^+ = {(x_i, +1) | y_i > k} and X_k^- = {(x_i, -1) | y_i ≤ k}. (1)
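The reduction framework above turns one K-rank problem into K-1 binary ones, and the standard way to decode the binary outputs back into a rank is to count the "above rank k" votes. A small sketch of both steps (the helper names are ours):

```python
def rank_subsets(samples, k):
    """Reduction framework: split (x, y) training pairs into positives
    (y > k) and negatives (y <= k) for the binary ranker at rank k."""
    pos = [(x, +1) for x, y in samples if y > k]
    neg = [(x, -1) for x, y in samples if y <= k]
    return pos, neg

def predict_rank(binary_outputs):
    """Aggregate the K-1 binary ranker outputs (+1/-1) into a rank:
    one plus the number of rankers voting 'above k'."""
    return 1 + sum(1 for o in binary_outputs if o > 0)
```

For example, with K = 5 and rankers for k = 1..4 answering (+1, +1, -1, -1), the sample is above ranks 1 and 2 but not 3 and 4, so the predicted rank is 3.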


International Symposium on Biomedical Imaging | 2015

Detection and tracking of Golgi outposts in microscopy data

Huei-Fang Yang; Chu-Song Chen; Xavier Descombes

Golgi outposts (GOPs), which transport proteins in both the anterograde and retrograde directions, play an important role in determining the dendritic morphology of developing neurons. To capture their heterogeneous motion patterns, we present a data-association-based framework that first detects the GOPs and then links the detection responses. In the GOP detection stage, we introduce a multi-scale Markov Point Process (MPP) based particle detector that models GOP appearances with multi-scale blobness images obtained by the Laplacian of Gaussian (LoG). This reduces the number of missed detections compared to using image intensity for GOP appearances. In the linking stage, we associate detection responses to form reliable tracklets and link the tracklets into long, complete tracks. As such, high-level information (e.g., motion) is encoded in building the affinity model. We evaluate our approach on microscopy data sets of dendritic arborization (da) sensory neurons in Drosophila larvae, and the results demonstrate the effectiveness of our method.
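The linking stage's first step, growing tracklets out of per-frame detections, can be illustrated with a greedy nearest-neighbor association under a gating threshold. This is a deliberately simplified 1-D sketch (the paper's affinity model also uses motion cues), with hypothetical names:

```python
def link_detections(frames, max_dist=5.0):
    """Greedily associate detections (1-D positions here, for brevity)
    across consecutive frames: each tracklet claims its nearest unmatched
    detection if it lies within `max_dist`; leftover detections start
    new tracklets."""
    tracklets = [[p] for p in frames[0]]
    for dets in frames[1:]:
        unused = list(dets)
        for tr in tracklets:
            if not unused:
                break
            best = min(unused, key=lambda p: abs(p - tr[-1]))
            if abs(best - tr[-1]) <= max_dist:
                tr.append(best)
                unused.remove(best)
        # Unmatched detections seed new tracklets.
        tracklets.extend([p] for p in unused)
    return tracklets
```

Two well-separated particles stay in separate tracklets, and a detection far from every existing tracklet opens a new one; a real tracker would replace the greedy matching with a global assignment and then stitch tracklets into full tracks.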


ACM Transactions on Multimedia Computing, Communications, and Applications | 2018

Joint Estimation of Age and Expression by Combining Scattering and Convolutional Networks

Huei-Fang Yang; Bo-Yao Lin; Kuang-Yu Chang; Chu-Song Chen

This article tackles the problem of joint estimation of human age and facial expression. This is an important yet challenging problem because expressions can alter face appearance in a manner similar to human aging. Different from previous approaches that deal with the two tasks independently, our approach trains a convolutional neural network (CNN) model that unifies ordinal regression and multi-class classification in a single framework. We demonstrate experimentally that our method performs favorably against state-of-the-art approaches.


acm multimedia | 2016

Cross-batch Reference Learning for Deep Classification and Retrieval

Huei-Fang Yang; Kevin Lin; Chu-Song Chen

Learning feature representations for image retrieval is essential to multimedia search and mining applications. Recently, deep convolutional networks (CNNs) have gained much attention due to their impressive performance on object detection and image classification, and the feature representations learned from a large-scale generic dataset (e.g., ImageNet) can be transferred to or fine-tuned on datasets of other domains. However, when the feature representations learned with a deep CNN are applied to image retrieval, the performance is still not as good as when they are used for classification, which restricts their applicability to relevant image search. To ensure the retrieval capability of the learned feature space, we introduce a new idea called cross-batch reference (CBR) to enhance the stochastic-gradient-descent (SGD) training of CNNs. In each iteration of our training process, the network adjustment relies not only on the training samples in a single batch, but also on the information passed by the samples in the other batches. This inter-batch communication mechanism is formulated as a cross-batch retrieval process based on the mean average precision (MAP) criterion, where relevant and irrelevant samples are encouraged to be placed at the top and the rear of the retrieval list, respectively. The learned feature space is not only discriminative across classes, but samples that are relevant to each other or of the same class are also enforced to be centralized. To maximize the cross-batch MAP, we design a loss function that is an approximate lower bound of the MAP on the feature layer of the network, which is differentiable and easier to optimize. By combining the intra-batch classification and inter-batch cross-reference losses, the learned features are effective for both classification and retrieval tasks. Experimental results on various benchmarks demonstrate the effectiveness of our approach.
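As a point of reference for the MAP criterion the CBR loss is built around, plain (non-differentiable) average precision for one ranked list can be computed as follows; the paper's contribution is a differentiable lower-bound surrogate of this quantity, which the sketch does not attempt:

```python
def average_precision(ranked_relevance):
    """Average precision of one retrieval list: `ranked_relevance` marks
    each returned item as relevant (True) or not, in ranked order.
    AP averages the precision at each rank where a relevant item appears."""
    hits, precisions = 0, []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0
```

For the list (relevant, irrelevant, relevant), the relevant items sit at ranks 1 and 3, so AP = (1/1 + 2/3) / 2 = 5/6; pushing relevant items toward the top raises AP, which is exactly the behavior the CBR loss rewards across batches.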


arXiv: Computer Vision and Pattern Recognition | 2015

Supervised Learning of Semantics-Preserving Hashing via Deep Neural Networks for Large-Scale Image Search

Huei-Fang Yang; Kevin Lin; Chu-Song Chen


IEEE Signal Processing Letters | 2018

Equivalent Scanning Network of Unpadded CNNs

Huei-Fang Yang; Ting-Yen Chen; Cheng-Hao Tu; Chu-Song Chen


Archive | 2011

New Results - Biological imagery

Mikael Carlavan; Laure Blanc-Féraud; Xavier Descombes; Saima Ben Hadj; Florence Besse; Alejandro Mottini; Huei-Fang Yang

Collaboration


Dive into Huei-Fang Yang's collaborations.

Top Co-Authors

Kevin Lin, National Taiwan University
Jen-Hao Hsiao, National Taiwan University
Kuan-Hsien Liu, Center for Information Technology
Josiane Zerubia, Hungarian Academy of Sciences
Didier Auroux, University of Nice Sophia Antipolis
Eric Debreuve, University of Nice Sophia Antipolis