Hongxi Wei | Researchain

Publication

Featured research published by Hongxi Wei.

international conference on document analysis and recognition | 2015

A multiple instances approach to improving keyword spotting on historical Mongolian document images

Hongxi Wei; Guanglai Gao; Xiangdong Su

For keyword spotting on historical Mongolian document images, performance varies considerably when the user provides different instance images for the same query keyword. This paper proposes an approach to this problem. Specifically, the keyword spotting procedure is divided into two stages. The first stage generates multiple ranking lists for a query keyword, and the second stage merges these lists into a final ranking. In the first stage, a ranking list for the query keyword is first returned by traditional image matching, and a number of instances of the query keyword are then obtained using pseudo-relevance feedback. Each instance of the query keyword then returns its own ranking list. In the second stage, the ranking lists from the multiple instances are combined using a data fusion technique, and the final ranking is taken as the retrieval result for the query keyword. Experimental results show that the proposed approach significantly improves keyword spotting performance on historical Mongolian document images.
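
The second stage above can be sketched as follows. The abstract does not name the data fusion technique, so this example swaps in reciprocal rank fusion as an assumption; the function name, the constant `k=60`, and the toy word-image ids are all hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranking_lists, k=60):
    """Merge several ranked lists of word-image ids into one final ranking.

    Assumption: reciprocal rank fusion stands in for the unspecified
    data fusion technique; k is a smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in ranking_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))

# One ranking list per instance of the query keyword (toy data).
lists = [
    ["w3", "w1", "w2"],
    ["w1", "w3", "w4"],
    ["w1", "w2", "w3"],
]
final_ranking = reciprocal_rank_fusion(lists)
```

A candidate that ranks well across most instances rises in the fused list even if a single instance ranks it poorly, which is the motivation for using several instances.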

international conference on document analysis and recognition | 2011

A Method for Removing Inflectional Suffixes in Word Spotting of Mongolian Kanjur

Hongxi Wei; Guanglai Gao; Yulai Bao

Based on the characteristics of Mongolian word formation, this paper proposes a method for removing inflectional suffixes from word images of the Mongolian Kanjur. Removing inflectional suffixes may reduce the number of clusters, which serve as indexing terms in word spotting. To this end, it must be determined whether a word image contains an inflectional suffix. If it does, the suffix is segmented from the word image. The proposed method works as follows. First, several parts are segmented from the bottom of the word image according to the cutting positions of the inflectional suffixes. Then, the segmented parts are represented by a number of profile features and classified by multiple BP (back-propagation) neural networks. Finally, the outputs of the BP networks are confirmed by template matching using dynamic time warping (DTW). Experimental results on our data set demonstrate the feasibility of the proposed method.
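
The final confirmation step relies on DTW template matching. The following is a minimal sketch of classic DTW over 1-D profile sequences; the template names and feature values are invented for illustration and are not taken from the paper.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def best_template(profile, templates):
    """Return the name of the template with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw_distance(profile, templates[name]))

# Hypothetical profile features of a segmented part and two suffix templates.
profile = [1, 2, 3, 3, 2]
templates = {"suffix_a": [1, 2, 3, 2], "suffix_b": [3, 1, 1, 3]}
match = best_template(profile, templates)
```

Because DTW allows one sequence to be locally stretched, the repeated `3` in `profile` aligns with the single `3` in `suffix_a` at no extra cost.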

international conference on advanced computer theory and engineering | 2010

An efficient binarization method for ancient Mongolian document images

Hongxi Wei; Guanglai Gao; Yulai Bao; Yali Wang

Recognizing and retrieving the Mongolian Kanjur images requires a number of preprocessing tasks. This paper concentrates on the binarization of the Mongolian Kanjur images and proposes an efficient binarization method for them. The proposed method is applied to each image as follows. First, preprocessing tasks including grayscaling and smoothing are executed. Second, three well-known global thresholding methods are used to extract regions of interest (ROIs) from every gray-level image. Then, each ROI is processed by a modified Sauvola's algorithm with varying window sizes. Experimental results show that the proposed binarization method outperforms the original Sauvola's algorithm.
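
For reference, the original Sauvola rule computes a local threshold T = m * (1 + k * (s / R - 1)) from the window mean m and standard deviation s. The sketch below implements that original rule only, not the paper's modified version; the window size and the values k = 0.5 and R = 128 are commonly used defaults and are assumptions here.

```python
import math

def sauvola_binarize(gray, window=3, k=0.5, R=128.0):
    """Binarize a gray-level image given as a list of rows (values 0..255).

    Returns 1 for foreground (dark ink) and 0 for background.
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Local window around (x, y), clipped at the image border.
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(vals) / len(vals)
            std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            threshold = mean * (1.0 + k * (std / R - 1.0))
            out[y][x] = 1 if gray[y][x] < threshold else 0
    return out

# Toy image: one dark pixel on a bright background.
image = [[200, 200, 200],
         [200,  20, 200],
         [200, 200, 200]]
binary = sauvola_binarize(image)
```

A per-ROI variant with several window sizes, as the abstract describes, could call this function once per window size, but how the paper combines those results is not stated in the abstract.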

international conference on neural information processing | 2016

LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents

Hongxi Wei; Guanglai Gao; Xiangdong Su

The original Bag-of-Visual-Words (BoVW) approach discards the spatial relations of the visual words. In this paper, an LDA-based topic model is adopted to obtain the semantic relations of visual words for each word image. Because an LDA-based topic model usually hurts retrieval performance when used on its own, it is linearly combined with a visual language model for each word image in this study. The basic query likelihood model is then used for retrieval. Experimental results on our dataset show that the proposed LDA-based representation approach achieves efficient and accurate keyword spotting on a collection of historical Mongolian documents. The proposed approach also significantly outperforms the original BoVW approach.
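
The linear combination and query likelihood scoring can be sketched as below. The interpolation weight `lam`, the smoothing constant `eps`, and all toy probabilities are assumptions for illustration; the abstract does not give the actual weights or the training procedure.

```python
import math

def lda_prob(word, topic_word, doc_topics):
    """p_LDA(word | image) = sum_z p(word | z) * p(z | image)."""
    return sum(p_z * topic_word[z].get(word, 0.0)
               for z, p_z in enumerate(doc_topics))

def combined_prob(word, doc_lm, topic_word, doc_topics, lam=0.7):
    """Linear interpolation of the visual language model and the topic model."""
    return (lam * doc_lm.get(word, 0.0)
            + (1.0 - lam) * lda_prob(word, topic_word, doc_topics))

def query_log_likelihood(query, doc_lm, topic_word, doc_topics, lam=0.7, eps=1e-12):
    """Sum of log-probabilities of the query's visual words; eps avoids log(0)."""
    return sum(math.log(combined_prob(w, doc_lm, topic_word, doc_topics, lam) + eps)
               for w in query)

# Toy data: two topics over three visual words, and two word images.
topic_word = [{"v1": 0.5, "v2": 0.5}, {"v3": 1.0}]
image_a = {"lm": {"v1": 0.5, "v2": 0.5}, "topics": [1.0, 0.0]}
image_b = {"lm": {"v3": 1.0}, "topics": [0.0, 1.0]}
query = ["v1", "v2"]

score_a = query_log_likelihood(query, image_a["lm"], topic_word, image_a["topics"])
score_b = query_log_likelihood(query, image_b["lm"], topic_word, image_b["topics"])
```

Word images are then ranked by descending score, so `image_a`, which shares visual words and topics with the query, ranks above `image_b`.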

international conference on multimedia and expo | 2017

Representing word image using visual word embeddings and RNN for keyword spotting on historical document images

Hongxi Wei; Hui Zhang; Guanglai Gao

Visual words in the Bag-of-Visual-Words (BoVW) framework are independent of each other, which not only discards the spatial order of visual words but also loses semantic information. Inspired by word embeddings, this study applies a similar embedding procedure to a large number of visual words. In this way, an embedding vector is obtained for each visual word. For a word image, the average of the embedding vectors of all visual words within the image is taken as its embedding vector. Moreover, a Recurrent Neural Network (RNN) is used to encode each word image into an embedding, in the manner of an auto-encoder. The RNN embeddings and the visual word embeddings are complementary. In this study, all word images are represented by combining visual word embeddings and RNN embeddings. Experimental results show that the proposed representation approach is superior to the traditional BoVW, spatial pyramid matching and latent Dirichlet allocation.
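
The averaging step can be sketched as follows. This covers only the visual word embedding part, not the RNN embeddings; the 2-D vectors are invented, and comparing word images by cosine similarity is an assumption rather than something the abstract states.

```python
import math

def average_embedding(visual_words, embeddings):
    """Embedding of a word image: the mean of its visual words' vectors."""
    vectors = [embeddings[v] for v in visual_words]
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical visual word embeddings.
embeddings = {"v1": [1.0, 0.0], "v2": [0.0, 1.0], "v3": [1.0, 1.0]}

query_vec = average_embedding(["v1", "v2"], embeddings)
candidate_vec = average_embedding(["v3"], embeddings)
similarity = cosine_similarity(query_vec, candidate_vec)
```

Two word images with no visual word in common can still be close in the embedding space, which plain BoVW histograms cannot express.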

China National Conference on Chinese Computational Linguistics | 2016

A Novel Approach to Improve the Mongolian Language Model Using Intermediate Characters

Xiaofei Yan; Feilong Bao; Hongxi Wei; Xiangdong Su

In the Mongolian language, many words share the same presentation form but are in fact different words with different codes. Because typists usually enter words according to their presentation forms and sometimes cannot distinguish the codes, Mongolian corpora contain many coding errors. This makes statistics and retrieval on such corpora very difficult. To solve this problem, this paper proposes a method that merges words with the same presentation form using Intermediate characters and then uses the corpus in Intermediate-character form to build a Mongolian language model. Experimental results show that the proposed method reduces the perplexity and the word error rate of the 3-gram language model by 41% and 30%, respectively, compared with a model trained on the unprocessed corpus. The proposed approach significantly improves the performance of the Mongolian language model and greatly enhances the accuracy of Mongolian speech recognition.

pacific rim conference on multimedia | 2017

Integrating Visual Word Embeddings into Translation Language Model for Keyword Spotting on Historical Mongolian Document Images

Hongxi Wei; Hui Zhang; Guanglai Gao

The Bag-of-Visual-Words (BoVW) framework lacks semantic relatedness between visual words. This paper therefore proposes a visual word embedding approach, similar to the word embedding technique in natural language processing (NLP). First, a large number of visual words are extracted and collected from a word image collection under the BoVW framework. Then, a deep learning procedure maps the visual words to embedding vectors in a semantic space. Finally, the visual word embeddings are integrated into a translation language model to perform keyword spotting in the query-by-example scenario. Experimental results show that the proposed translation language model based on visual word embeddings outperforms various state-of-the-art methods for keyword spotting, including BoVW, language model (LM), translation language model with mutual information (TLM-MI) and latent Dirichlet allocation (LDA).
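
A translation language model estimates p(w | image) as the sum over visual words u of p_t(w | u) * p(u | image). The sketch below illustrates that general form. How the paper derives the translation probabilities p_t from the embeddings is not given in the abstract, so normalizing clipped cosine similarities is purely an assumption, as are all names and vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def translation_table(embeddings):
    """p_t(w | u): similarity of w to u, normalized over all w (assumed construction)."""
    table = {}
    for u, eu in embeddings.items():
        sims = {w: max(cosine(ew, eu), 0.0) for w, ew in embeddings.items()}
        total = sum(sims.values())  # at least 1.0, since cosine(eu, eu) == 1
        table[u] = {w: s / total for w, s in sims.items()}
    return table

def translated_prob(w, doc_lm, table):
    """p(w | image) = sum_u p_t(w | u) * p(u | image)."""
    return sum(table[u][w] * p_u for u, p_u in doc_lm.items())

# Hypothetical embeddings: v2 is close to v1, v3 is orthogonal to v1.
embeddings = {"v1": [1.0, 0.0], "v2": [0.8, 0.6], "v3": [0.0, 1.0]}
table = translation_table(embeddings)
doc_lm = {"v1": 1.0}  # a word image containing only v1

p_v2 = translated_prob("v2", doc_lm, table)
p_v3 = translated_prob("v3", doc_lm, table)
```

Here `v2` receives non-zero probability although it never occurs in the word image, because it is semantically close to `v1`; the unrelated `v3` receives none.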

international conference on neural information processing | 2017

Using Word Mover’s Distance with Spatial Constraints for Measuring Similarity Between Mongolian Word Images

Hongxi Wei; Hui Zhang; Guanglai Gao; Xiangdong Su

In the bag-of-visual-words framework, visual words are independent of each other, which discards their spatial relations and loses their semantic information. To capture the semantic information of visual words, a deep learning procedure similar to the word embedding technique is used to map visual words to embedding vectors in a semantic space. Word mover’s distance (WMD) is then used to measure the similarity between two word images; it calculates the minimum traveling distance from the visual word embeddings of one word image to those of the other. Moreover, word images are partitioned in advance into several equal-sized sub-regions along rows and columns. WMDs are then computed separately for the corresponding sub-regions of the two word images, and the similarity between the two word images is the sum of these WMDs. Experimental results show that the proposed method outperforms various baseline and state-of-the-art methods, including spatial pyramid matching, latent Dirichlet allocation, average visual word embeddings and the original word mover’s distance.
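
The sub-region scheme can be sketched as below. To keep the example free of a linear programming solver, it swaps in a relaxed WMD, in which each visual word sends all of its mass to its nearest counterpart; this gives a lower bound on the exact WMD and is not the distance used in the paper. The data layout, the 2 x 2 grid, and the treatment of empty sub-regions are also assumptions.

```python
import math

def relaxed_wmd(words_a, words_b, emb):
    """Lower bound on WMD with uniform word weights (not the exact distance)."""
    if not words_a or not words_b:
        return 0.0  # assumption: an empty sub-region contributes nothing

    def one_way(src, dst):
        return sum(min(math.dist(emb[s], emb[d]) for d in dst) for s in src) / len(src)

    return max(one_way(words_a, words_b), one_way(words_b, words_a))

def split_regions(image, rows, cols):
    """Group an image's visual words by the grid cell their keypoint falls in."""
    width, height = image["size"]
    regions = {}
    for x, y, word in image["words"]:
        r = min(int(y * rows / height), rows - 1)
        c = min(int(x * cols / width), cols - 1)
        regions.setdefault((r, c), []).append(word)
    return regions

def spatial_distance(img_a, img_b, emb, rows=2, cols=2):
    """Sum of per-sub-region distances between corresponding grid cells."""
    ra, rb = split_regions(img_a, rows, cols), split_regions(img_b, rows, cols)
    return sum(relaxed_wmd(ra.get((r, c), []), rb.get((r, c), []), emb)
               for r in range(rows) for c in range(cols))

emb = {"v1": [0.0, 0.0], "v2": [3.0, 4.0]}  # hypothetical embeddings
img_a = {"size": (10, 10), "words": [(1, 1, "v1"), (8, 8, "v2")]}
img_b = {"size": (10, 10), "words": [(2, 2, "v1"), (7, 9, "v2")]}
img_c = {"size": (10, 10), "words": [(2, 2, "v2"), (7, 9, "v1")]}  # same words, swapped positions

same_layout = spatial_distance(img_a, img_b, emb)
swapped_layout = spatial_distance(img_a, img_c, emb)
```

Without the partition, `img_a` and `img_c` contain the same visual words and would be indistinguishable; with it, the swapped layout yields a larger distance.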

international conference on computer application and system modeling | 2010

A two-stage binarization method for the Mongolian Kanjur images

Hongxi Wei; Guanglai Gao; Yali Wang

This paper presents a two-stage binarization method for the Mongolian Kanjur images. In the first stage, three popular global thresholding methods are used to remove the background regions from a gray-level Mongolian Kanjur image. In the second stage, the remaining regions of the image are processed by an improved Sauvola's algorithm. The corresponding binary image is obtained after these two stages. Experimental results on a number of Mongolian Kanjur images show that the proposed binarization method outperforms the original Sauvola's algorithm.
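
The first stage can be illustrated with a single global threshold. The abstract does not name the three global methods, so Otsu's method is used here as an assumed stand-in, and the pixel values are invented.

```python
def otsu_threshold(pixels):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break     # no bright pixels left
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy gray levels: three dark ink pixels and three bright background pixels.
pixels = [20, 22, 25, 200, 210, 205]
t = otsu_threshold(pixels)
# Stage 1: keep candidate text pixels, discard the background.
candidate_mask = [p <= t for p in pixels]
```

In a two-stage scheme, only the pixels kept by `candidate_mask` would then be passed to the local (Sauvola-style) thresholding of the second stage.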

international conference on document analysis and recognition | 2017