Publication


Featured research published by Zhi-Ming Xu.


international conference on machine learning and cybernetics | 2005

Using multiple features and statistical model to calculate text units similarity

Yong-Dong Xu; Zhi-Ming Xu; Xiaolong Wang; Yuanchao Liu; Tao Liu

In many NLP applications, identifying similar information in a set of related documents is a common problem. In this paper, the similarity between two Chinese text units is determined by multiple features extracted from these units, including word statistical features, part-of-speech features, semantic features, a word density feature, and text discourse structure features. In addition, a statistical method based on a logistic regression model is proposed to automatically fuse these features and calculate the similarity between text paragraphs. An experiment comparing this method with two widely used methods shows the effectiveness of the approach.
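As a rough illustration of the fusion step described in this abstract, the sketch below passes a vector of per-feature scores through a logistic function to obtain a single similarity value. The feature values, weights, and bias are hypothetical placeholders, not figures from the paper; in practice the weights would be fitted on labeled pairs of text units.

```python
import math

def fuse_similarity(features, weights, bias):
    """Map a feature vector to a similarity score in (0, 1) with a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical per-feature scores for one pair of text units, each in [0, 1]:
# word statistics, part of speech, semantics, word density, discourse structure.
pair_features = [0.62, 0.55, 0.70, 0.40, 0.30]

# Hypothetical weights and bias (assumed values, not taken from the paper).
weights = [2.0, 0.5, 1.5, 0.8, 0.6]
bias = -2.5

score = fuse_similarity(pair_features, weights, bias)
print(f"fused similarity: {score:.3f}")
```

The logistic form keeps the fused score bounded between 0 and 1, so it can be thresholded to decide whether two units count as similar.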


asia information retrieval symposium | 2008

A full distributed web crawler based on structured network

Kunpeng Zhu; Zhi-Ming Xu; Xiaolong Wang; Yuming Zhao

Distributed Web crawlers have recently received increasing attention from researchers. A fully decentralized crawler without a centralized managing server is an interesting architectural paradigm for realizing large-scale information collecting systems because of its scalability, failure resilience, and increased autonomy of nodes. This paper presents a novel fully distributed Web crawler system based on a structured network, in which a distributed crawling model is developed and applied to improve system performance. Important issues such as task assignment and scalability are discussed. Finally, an experimental study is used to verify the advantages of the system, and the results are comparatively satisfactory.
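The abstract does not say how tasks are assigned to nodes. One common technique in structured overlay networks is consistent hashing, sketched below as an assumption rather than as the authors' method: each crawler node and each URL host is hashed onto a ring, and a host is crawled by the first node at or after its position. The class name `CrawlerRing` and the node identifiers are hypothetical.

```python
import bisect
import hashlib
from urllib.parse import urlparse

def ring_position(key: str) -> int:
    """Hash a string to an integer position on the ring."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)

class CrawlerRing:
    """Assigns URL hosts to crawler nodes without a central server (assumed design)."""

    def __init__(self, node_ids):
        self._ring = sorted((ring_position(n), n) for n in node_ids)
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, url: str) -> str:
        # Hash the host, so all pages of one site go to the same node.
        host = urlparse(url).netloc
        idx = bisect.bisect_right(self._positions, ring_position(host))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring

nodes = ["node-a", "node-b", "node-c"]
ring = CrawlerRing(nodes)
print(ring.node_for("http://example.com/index.html"))
```

With this scheme, a node that leaves the network only hands off the hosts it owned; assignments for all other hosts stay unchanged, which is one way to obtain the scalability and failure resilience the abstract mentions.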


international conference on machine learning and cybernetics | 2006

An Open Domain Question Answering System Based on Improved System Similarity Model

Yuming Zhao; Zhi-Ming Xu; Yi Guan; Xiaolong Wang

Question answering has recently received increasing attention from researchers. It is widely regarded as an advanced stage of information retrieval. This paper presents a novel domain-independent question-answering system based on information retrieval over a large-scale collection of texts, in which an improved system similarity model is developed and applied to improve system performance. Many natural language processing technologies are adopted to increase the accuracy of the system. Several useful tools are incorporated as external auxiliary resources. In addition, external knowledge, such as knowledge from the Internet, is widely used in this system. The test data collection and evaluation methodology from the 2006 Text Retrieval Conference Question Answering Track are used to evaluate the system, and the results are comparatively satisfactory.


international conference on machine learning and cybernetics | 2005

Using category-based semantic field for text categorization

Qiang Wang; Yi Guan; Xiaolong Wang; Zhi-Ming Xu

This paper proposes a new document representation method for text categorization. It applies category-based semantic field (CBSF) theory to gain a more efficient representation of documents. Lexical chains are introduced to compute the CBSF, and HowNet is used as the lexical database. In particular, the title of each document functions as a clue to forecast the potential CBSF of the test document. Combined with a classifier, this approach is examined on text categorization, and the result indicates that it performs better on the same task than conventional methods whose features are chosen on a bag-of-words (BOW) basis.


international conference on machine learning and cybernetics | 2009

A text classifier based on biomimetic pattern recognition

Ji-bin Zhang; Shuai Cong; Zhi-Ming Xu; Qi-shu Pan

In this paper, a novel text classification method based on biomimetic pattern recognition (BPR) is proposed. We implement a BPR text classifier with a multi-weight neural network; the model of the three-weight neural network and its construction method are described in detail. Text classification experiments were conducted to test the performance of the BPR text classifier; the experimental results show that the three-weight neural network achieves good performance on the text classification task.


international conference on machine learning and cybernetics | 2006

Integration Algorithm of English-Chinese Word Segmentation and Alignment

Zhi-Ming Xu; Chunyu Kit; Jonathan J. Webster

This paper proposes an algorithm that integrates English-Chinese word segmentation and alignment. In this algorithm, bilingual word segmentation and alignment work synchronously and interactively. Given sentence-aligned bitext, it can not only use bilingual word alignment information to guide the resolution of word segmentation ambiguities, but also prevent word segmentation errors from being transferred into word alignment. Experimental results show that it distinctly improves the accuracy of both word segmentation and alignment.


international conference on machine learning and cybernetics | 2009

Web site classification based on key resources

Zhi-Ming Xu; Xin-Bo Gao; Meng Lei

Automatic web site classification has wide application prospects. However, there has been little research on web site classification. Many methods represent a web site as normal text and still use text classification methods. But a web site is a combination of many web pages connected via hyperlinks, so text classification methods are not suitable for web sites. This paper proposes a new approach to web site classification. First, we obtain the key resources of the web site through a reasonable pruning strategy. Then we extract the topic vector of the web site from the key resources, according to the web site's structure information and content information. To reflect the structure information of the web site, we use an improved vector space model that includes both structure feature words and content feature words to represent the topic vector of the web site.
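The abstract does not give the weighting scheme of the improved vector space model. The sketch below is only one plausible reading: content feature words and structure feature words share a single sparse vector, structure words receive a hypothetical boost (`structure_weight`), and a site is compared with a category by cosine similarity. All names and values here are assumptions for illustration.

```python
import math
from collections import Counter

def topic_vector(content_words, structure_words, structure_weight=2.0):
    """Build a sparse topic vector; structure words get an assumed extra weight."""
    vec = Counter()
    for word in content_words:
        vec[word] += 1.0
    for word in structure_words:
        vec[word] += structure_weight
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-like objects."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical site: "sports" appears in page text and in the site structure.
site = topic_vector(["news", "sports"], ["sports"])
sports_category = topic_vector(["sports"], [])
news_category = topic_vector(["news"], [])

print(cosine(site, sports_category), cosine(site, news_category))
```

Because the structure word is boosted, the hypothetical site scores closer to the sports category than to the news category, which is the kind of effect a structure-aware representation is meant to produce.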


international conference on machine learning and cybernetics | 2008

Multiple features fusion method for identifying text topic boundaries

Yong-Dong Xu; Guang-Ri Quan; Ya-Dong Wang; Zhi-Ming Xu

In general, a document can be regarded as being formed of coherent units called discourse segments. Discovering the segment boundaries is an important task for many natural language processing applications. In this paper, we propose a new method for identifying Chinese text topic boundaries based on multiple-feature fusion. Our approach first extracts multiple features of topic shift from the text. For each feature, we adopt a corresponding F-dotplotting model to calculate the boundary values of neighboring sentences. Subsequently, the useful features among these cues are automatically selected and combined to determine topic boundaries by a statistical method based on logistic regression analysis. The experimental results show that the F-dotplotting method is more effective than the common dotplotting method, and that the multiple-feature fusion method based on the logistic regression model can effectively improve Chinese text topic segmentation performance.


international conference on machine learning and cybernetics | 2007

A Peer-to-Peer Information Retrieval System Based on Semantic Similarity Model

Kunpeng Zhu; Zhi-Ming Xu; Xiaolong Wang; Yuming Zhao

Peer-to-peer (P2P) networks have received increasing attention from researchers. P2P is an interesting architectural paradigm for realizing large-scale information retrieval systems because of its scalability, failure resilience, and increased autonomy of nodes. This paper presents a novel peer-to-peer network system based on information retrieval over a large-scale collection of texts, in which a semantic similarity model is developed and applied to improve system performance. Some natural language processing technologies are adopted to increase the accuracy of the system. Several useful tools are incorporated as external auxiliary resources. In addition, feedback knowledge, such as query information from peers, is widely used to direct the flooding of query messages through a semantic routing mechanism. Finally, an experimental study is used to verify the advantages of the system, and the results are comparatively satisfactory.


Int. J. of Asian Lang. Proc. | 2010

A Web Site Classification Approach Based On Its Topological Structure.

Ji-bin Zhang; Zhi-Ming Xu; Kun-li Xiu; Qi-shu Pan

Collaboration


Dive into Zhi-Ming Xu's collaborations.

Top Co-Authors

Xiaolong Wang

Harbin Institute of Technology

Yi Guan

Harbin Institute of Technology

Yuming Zhao

Harbin Institute of Technology

Kunpeng Zhu

Harbin Institute of Technology

Chunyu Kit

City University of Hong Kong

Jonathan J. Webster

City University of Hong Kong

Bingquan Liu

Harbin Institute of Technology

Ji-bin Zhang

Harbin Institute of Technology

Qi-shu Pan

Harbin Institute of Technology

Qiang Wang

Harbin Institute of Technology
