Hong-Wei Hao | Researchain

Publication

Featured research published by Hong-Wei Hao.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Multi-Orientation Scene Text Detection with Adaptive Clustering

Xu-Cheng Yin; Wei-Yi Pei; Jun Zhang; Hong-Wei Hao

Text detection in natural scene images is an important prerequisite for many content-based image analysis tasks, while most current research efforts only focus on horizontal or near horizontal scene text. In this paper, first we present a unified distance metric learning framework for adaptive hierarchical clustering, which can simultaneously learn similarity weights (to adaptively combine different feature similarities) and the clustering threshold (to automatically determine the number of clusters). Then, we propose an effective multi-orientation scene text detection system, which constructs text candidates by grouping characters based on this adaptive clustering. Our text candidates construction method consists of several sequential coarse-to-fine grouping steps: morphology-based grouping via single-link clustering, orientation-based grouping via divisive hierarchical clustering, and projection-based grouping also via divisive clustering. The effectiveness of our proposed system is evaluated on several public scene text databases, e.g., ICDAR Robust Reading Competition data sets (2011 and 2013), MSRA-TD500 and NEOCR. Specifically, on the multi-orientation text data set MSRA-TD500, the f-measure of our system is [...] percent, much better than the state-of-the-art performance. We also construct and release a practical challenging multi-orientation scene text data set (USTB-SV1K), which is available at http://prir.ustb.edu.cn/TexStar/MOMV-text-detection/.

Neurocomputing | 2016

Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification

Peng Wang; Bo Xu; Jiaming Xu; Guanhua Tian; Cheng-Lin Liu; Hong-Wei Hao

Knowledge Based Systems | 2015

Selecting feature subset with sparsity and low redundancy for unsupervised learning

Jiuqi Han; Zhengya Sun; Hong-Wei Hao

Neurocomputing | 2017

Joint entity and relation extraction based on a hybrid neural network

Suncong Zheng; Yuexing Hao; Dongyuan Lu; Hongyun Bao; Jiaming Xu; Hong-Wei Hao; Bo Xu

Neurocomputing | 2015

DE2: Dynamic ensemble of ensembles for learning nonstationary data

Xu-Cheng Yin; Kaizhu Huang; Hong-Wei Hao

Conference on Information and Knowledge Management | 2014

Social Book Search Reranking with Generalized Content-Based Filtering

Bo-Wen Zhang; Xu-Cheng Yin; Xiao-Ping Cui; Jiao Qu; Bin Geng; Fang Zhou; Li Song; Hong-Wei Hao

Text classification can help users to effectively handle and exploit useful information hidden in large-scale documents. However, the sparsity of data and the semantic sensitivity to context often hinder the classification performance of short texts. To overcome these weaknesses, we propose a unified framework to expand short texts based on word embedding clustering and a convolutional neural network (CNN). Empirically, semantically related words are usually close to each other in embedding spaces. Thus, we first discover semantic cliques via fast clustering. Then, by using additive composition over word embeddings from context with variable window width, the representations of multi-scale semantic units in short texts are computed. (Semantic units are defined as n-grams that carry the dominant meaning of the text; with n varying, multi-scale contextual information can be exploited.) In embedding spaces, the restricted nearest word embeddings (NWEs) of the semantic units are chosen to constitute expanded matrices, where the semantic cliques are used as supervision information. (To prevent outliers, a Euclidean distance threshold is preset between semantic cliques and semantic units and used as the restricting condition.) Finally, for a short text, the projected matrix and expanded matrices are combined and fed into the CNN in parallel. (The projected matrix is obtained by table lookup and encodes unigram-level features.) Experimental results on two open benchmarks validate the effectiveness of the proposed method.
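The additive composition and restricted nearest-embedding steps in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vectors, and the use of a plain vocabulary dictionary in place of the clustered semantic cliques are all assumptions made for illustration.

```python
import math

def compose_units(embeddings, window):
    """Additive composition: sum every run of `window` consecutive word vectors."""
    if not embeddings:
        return []
    dim = len(embeddings[0])
    units = []
    for start in range(len(embeddings) - window + 1):
        unit = [0.0] * dim
        for vec in embeddings[start:start + window]:
            for i, value in enumerate(vec):
                unit[i] += value
        units.append(unit)
    return units

def restricted_nearest(unit, vocab, threshold):
    """Return the vocabulary word nearest to `unit` (Euclidean distance),
    or None when even the nearest word lies beyond `threshold`.
    `vocab` is a hypothetical stand-in for the members of the semantic cliques."""
    best_word, best_dist = None, float("inf")
    for word, vec in vocab.items():
        dist = math.dist(unit, vec)
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word if best_dist <= threshold else None

# Toy usage: three 2-dimensional word vectors, bigram window.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bigram_units = compose_units(sentence, window=2)
```

Calling `compose_units` with several window widths gives the multi-scale units; the threshold in `restricted_nearest` plays the role of the outlier guard mentioned in the abstract.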

Pattern Recognition | 2015

l0-norm based structural sparse least square regression for feature selection

Jiuqi Han; Zhengya Sun; Hong-Wei Hao

Feature selection techniques are attracting more and more attention with the growing number of domains that produce high dimensional data. Due to the absence of class labels, many researchers focus on the unsupervised scenario, attempting to find an optimal feature subset that preserves the original data distribution. However, the existing methods either fail to achieve sparsity or ignore the potential redundancy among features. In this paper, we propose a novel unsupervised feature selection algorithm, which retains the preserving power and achieves high sparsity and low redundancy in a unified manner. On the one hand, to preserve the data structure of the whole feature set, we build the graph Laplacian matrix and learn the pseudo class labels through spectral analysis. By finding a feature weight matrix, we can map the original data into a low dimensional space based on the pseudo labels. On the other hand, to ensure sparsity and low redundancy simultaneously, we introduce a novel regularization term into the objective function with nonnegative constraints imposed, which can be viewed as the combination of the matrix norms $\|\cdot\|_{m1}$ and $\|\cdot\|_{m2}$ on the weights of features. An iterative multiplicative algorithm with proved convergence is accordingly designed to efficiently solve the constrained optimization problem. Extensive experimental results on different real world data sets demonstrate the promising performance of our proposed method over the state of the art.
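As a rough illustration of the first step described above, the sketch below builds an unnormalized graph Laplacian L = D - W from a fully connected Gaussian affinity graph. The affinity function, the `sigma` parameter, and the choice of the unnormalized Laplacian are assumptions; the abstract does not say which graph construction the authors use, and the spectral analysis and regularized optimization are not shown.

```python
import math

def gaussian_affinity(points, sigma=1.0):
    """Symmetric affinity matrix W with W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2)).
    The diagonal is left at zero (no self-loops)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = math.dist(points[i], points[j]) ** 2
            w[i][j] = w[j][i] = math.exp(-sq_dist / (2 * sigma ** 2))
    return w

def graph_laplacian(w):
    """Unnormalized Laplacian L = D - W, where D is the diagonal degree matrix."""
    n = len(w)
    lap = [[-w[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] = sum(w[i]) - w[i][i]
    return lap

# Toy usage with three 2-dimensional samples.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
laplacian = graph_laplacian(gaussian_affinity(samples))
```

Every row of the resulting matrix sums to zero, which is the property that makes its low-order eigenvectors usable as cluster indicators in spectral methods.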

Computational Intelligence and Neuroscience | 2015

Learning document semantic representation with hybrid deep belief network

Yan Yan; Xu-Cheng Yin; Sujian Li; Mingyuan Yang; Hong-Wei Hao

Entity and relation extraction is a task that combines detecting entity mentions and recognizing entities' semantic relationships from unstructured text. We propose a hybrid neural network model to extract entities and their relationships without any handcrafted features. The hybrid neural network contains a novel bidirectional encoder-decoder LSTM module (BiLSTM-ED) for entity extraction and a CNN module for relation classification. The contextual information of entities obtained in BiLSTM-ED is further passed to the CNN module to improve the relation classification. We conduct experiments on the public dataset ACE05 (Automatic Content Extraction program) to verify the effectiveness of our method. The proposed method achieves state-of-the-art results on the entity and relation extraction task.

PLOS ONE | 2016

ISART: A Generic Framework for Searching Books with Social Information

Xu-Cheng Yin; Bo-Wen Zhang; Xiao-Ping Cui; Jiao Qu; Bin Geng; Fang Zhou; Li Song; Hong-Wei Hao

Learning nonstationary data with concept drift has received much attention in machine learning and has been an active topic in ensemble learning. Specifically, batch growing ensemble methods present one important direction for dealing with concept drift involved in nonstationary data. However, current batch growing ensemble methods only combine the available component classifiers, each trained independently from a batch of nonstationary data. They simply discard interim ensembles and hence may lose useful information obtained from the fine-tuned interim ensembles. Distinctively, we introduce a comprehensive hierarchical approach called Dynamic Ensemble of Ensembles (DE2). The novel method combines classifiers as an ensemble of all the interim ensembles dynamically from consecutive batches of nonstationary data. DE2 includes two key stages: component classifiers and interim ensembles are dynamically trained; and the final ensemble is then learned by exponentially-weighted averaging with available experts, i.e., interim ensembles. Moreover, we engage Sparsity Learning to choose component classifiers selectively and intelligently. We also incorporate the techniques of Dynamic Weighted Majority and Learn++.NSE for better integrating different classifiers dynamically. We perform experiments with two benchmark test sets in real nonstationary environments, and compare our DE2 method to other conventional competitive ensemble methods. Experimental results confirm that our approach consistently leads to better performance and has promising generalization ability for learning in nonstationary environments.
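The final-stage idea of exponentially-weighted averaging over experts can be sketched in a few lines. This is a generic textbook form, not the DE2 algorithm itself: the learning rate `eta`, the use of cumulative loss per expert, and the weighted-vote combination are assumptions chosen for illustration.

```python
import math

def exp_weights(cumulative_losses, eta=1.0):
    """Weight each expert proportionally to exp(-eta * cumulative loss).
    Losses are shifted by their minimum for numerical stability; the
    normalized weights are unchanged by the shift."""
    shift = min(cumulative_losses)
    raw = [math.exp(-eta * (loss - shift)) for loss in cumulative_losses]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_vote(predictions, weights):
    """Combine the experts' class predictions by summing weights per label."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

# Toy usage: three interim ensembles (experts) with different past losses.
weights = exp_weights([0.0, 1.0, 2.0])
decision = weighted_vote(["drift", "stable", "stable"], weights)
```

Experts that have accumulated more loss on recent batches lose influence exponentially fast, which is why this kind of scheme suits data whose distribution changes over time.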

Computational and Mathematical Methods in Medicine | 2015

A Video-Based Intelligent Recognition and Decision System for the Phacoemulsification Cataract Surgery

Shu Tian; Xu-Cheng Yin; Zhi-Bin Wang; Fang Zhou; Hong-Wei Hao

Semantically searching and navigating products (e.g., on Taobao.com or Amazon.com) with professional metadata and user-generated content from social media is a hot topic in information retrieval and recommendation systems, while most existing methods are specifically designed as pure search systems. In this paper, taking Social Book Search as an example, we propose a general search-recommendation hybrid system for this topic. Firstly, we propose a Generalized Content-Based Filtering (GCF) model. In this model, a preference value, which flexibly ranges from 0 to 1, is defined to describe a user's preference for each item to be recommended, unlike the conventional use of a set of preferable items. We also design a weighting formulation for the measure of recommendation. Next, assuming that the query in a search system acts as a user in a recommendation system, a general reranking model is constructed with GCF to rerank the initial result list by utilizing a variety of rich social information. Afterwards, we propose a general search-recommendation hybrid framework for Social Book Search, where learning-to-rank is used to adaptively combine all reranking results. Finally, our proposed system is extensively evaluated on the INEX 2012 and 2013 Social Book Search datasets, and has the best performance (NDCG@10) on both datasets compared to other state-of-the-art systems. Moreover, our system recently won the INEX 2014 Social Book Search Evaluation.
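For readers unfamiliar with the NDCG@10 metric cited above, a minimal sketch follows. It uses the common linear-gain form with a log2 rank discount and computes the ideal ordering from the same list of judged items; whether the INEX evaluation uses exactly this variant is an assumption, and the relevance grades in the usage line are made up.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1) over the top k ranks."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Toy usage: graded relevance of a reranked result list, best rank first.
score = ndcg_at_k([3, 2, 3, 0, 1, 2])
```

A reranker improves NDCG@10 when it moves highly relevant items toward the top ranks, since the logarithmic discount penalizes relevant results that appear late in the list.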
