Zhongyuan Han | Researchain

Publication

Featured research published by Zhongyuan Han.

NLPCC | 2013

Feature Analysis in Microblog Retrieval Based on Learning to Rank

Zhongyuan Han; Xuwei Li; Muyun Yang; Haoliang Qi; Sheng Li

Learning to rank, which can fuse various features, performs well in microblog retrieval. However, it is still unclear how the individual features function in microblog ranking. To address this issue, this paper examines the contribution of each single feature, together with the contribution of feature combinations, using ranking SVM for microblog retrieval modeling. The experimental results on the TREC microblog collection show that textual features, i.e. content relevance between a query and a microblog, contribute most to retrieval performance. Combining certain non-textual features with textual features can further enhance retrieval performance, though non-textual features alone produce rather weak results.

international conference on internet computing for science and engineering | 2015

A Method for Microblog Search by Adjusting the Language Model with Time

Song Li; Hui Ning; Zhongyuan Han; Haoliang Qi

Time is a useful signal for microblog search. Traditional retrieval models use only content to match microblogs against a query. Many studies assume that relevant microblogs are most likely to appear close to the time of the query. However, analysis of microblog data shows that relevant information may also appear far from the query time. These two views conflict with each other. Therefore, this paper reports a modified language model that weights the time difference appropriately, so that the ranking of microblogs can be readjusted. Experimental results on microblog data show that the model works well.
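
Illustrative sketch (not taken from the paper): one way a content-based language-model score might be blended with a time-difference weight. The Dirichlet smoothing, the exponential-decay time prior, and the names `lm_log_likelihood`, `time_adjusted_score`, `weight`, and `rate` are all assumptions made for this example; the abstract does not specify the actual weighting scheme.

```python
import math
from collections import Counter


def lm_log_likelihood(query_terms, doc_terms, collection_tf, collection_len, mu=100.0):
    """Dirichlet-smoothed query log-likelihood, log P(q | d)."""
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        # Floor the collection probability so unseen terms never give log(0).
        p_coll = (collection_tf.get(term, 0) + 1) / (collection_len + len(collection_tf) + 1)
        p = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p)
    return score


def time_adjusted_score(lm_score, query_time, post_time, weight=0.3, rate=0.1):
    """Blend the content score with a hypothetical log time prior.

    Times are Unix timestamps in seconds. `weight` controls how much the
    time difference matters; weight=0 falls back to the content-only score.
    """
    age_days = abs(query_time - post_time) / 86400.0
    log_time_prior = -rate * age_days  # assumed exponential decay, in log space
    return (1.0 - weight) * lm_score + weight * log_time_prior
```

Keeping `weight` small lets an older but strongly matching post still outrank a recent weak match, which is one way to accommodate both observations described in the abstract.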

international conference for young computer scientists | 2016

Time-Based Microblog Search System

Zhongyuan Han; Wenhao Qiao; Shuo Cui; Leilei Kong

This demo presents a time-based microblog search system that uses the time profile to estimate the query model, the document model, and the ranking function for microblog search. The system exploits the time profile to boost the performance of microblog search. The time-based query model, time-based document model, and time-based similarity score are briefly described, along with the indexing strategy for temporal microblog search. Example search results are demonstrated on the TREC 2011 and TREC 2012 microblog retrieval collections.

forum for information retrieval evaluation | 2016

Sentence Paraphrase Detection Using Classification Models

Liuyang Tian; Hui Ning; Leilei Kong; Kaisheng Chen; Haoliang Qi; Zhongyuan Han

In this paper, we address the task of sentence paraphrase detection, which focuses on deciding whether two sentences are paraphrases of each other. We describe a supervised learning strategy in which sentence pairs are classified to decide the paraphrase relationship, using only lexical features at the n-gram level as classification features. Gradient Boosting, K-Nearest Neighbor, Decision Tree, and Support Vector Machine are chosen as the classifiers. The performance of the classifiers is compared, and the features are analyzed to determine which are most important for paraphrase detection. Evaluation is performed on the corpus of the 2016 Detecting Paraphrase in Indian Languages task proposed by the Forum for Information Retrieval Evaluation (DPIL-FIRE2016). The experimental results show that Gradient Boosting achieves the highest Overall Score. Using the learned classifier, we obtained the highest F1 measure for both Task1 and Task2 on Malayalam and Tamil, and the highest F1 measure for Task2 on Punjabi in DPIL-FIRE2016.
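
Illustrative sketch (not taken from the paper): lexical n-gram features for a sentence pair could be computed along the following lines and then passed to any of the classifiers named above. The whitespace tokenization, the Jaccard overlap measure, and the function names `ngrams` and `ngram_overlap_features` are assumptions for this example; the abstract does not state which n-gram features were actually used.

```python
def ngrams(tokens, n):
    """Return the set of word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap_features(sent_a, sent_b, max_n=3):
    """Jaccard overlap of word n-grams for n = 1..max_n (one feature per n)."""
    a = sent_a.lower().split()
    b = sent_b.lower().split()
    features = []
    for n in range(1, max_n + 1):
        grams_a, grams_b = ngrams(a, n), ngrams(b, n)
        union = grams_a | grams_b
        # An empty union means both sentences are shorter than n tokens.
        features.append(len(grams_a & grams_b) / len(union) if union else 0.0)
    return features
```

Each sentence pair then becomes a short fixed-length vector, for example `ngram_overlap_features("the cat sat", "the cat ran")`, which a supervised classifier can consume together with the paraphrase label.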

international conference for young computer scientists | 2015

LRC Sousou: A Lyrics Retrieval System

Yong Han; Li Min; Yu Zou; Zhongyuan Han; Song Li; Leilei Kong; Haoliang Qi; Wenhao Qiao; Shuo Cui; Hong Deng

Lyrics retrieval is one of the frequently used retrieval functions of search engines. However, existing lyrics retrieval systems neglect diversified information needs. This paper introduces a lyrics retrieval system named LRC Sousou, which automatically corrects erroneous characters, supports mixed queries of Chinese words and Pinyin, and handles English phoneme queries effectively. Natural language processing, information retrieval, and machine learning techniques are applied in the system, enhancing the practicality and efficiency of lyrics search and improving the user experience.

international conference on machine learning and cybernetics | 2012

An empirical study on query expansion and document expansion in information retrieval

Xuwei Li; Muyun Yang; Haoliang Qi; Sheng Li; Zhongyuan Han

Query expansion (QE) and document expansion (DE) have been proven effective for improving retrieval performance in the language modeling approach. However, the question of which expansion technique is more effective in information retrieval (IR) has not been well studied. To address this issue, this paper performs an empirical study on QE and DE to examine their effects. Moreover, since QE and DE exploit different corpus structures, we also examine the potential effectiveness of combining QE and DE. Experimental results on several TREC test collections show that both QE and DE significantly outperform the classical language model, but their effectiveness varies across retrieval settings. In addition, combining QE with DE does not always yield the best performance.

fuzzy systems and knowledge discovery | 2008

Online Linear Discriminative Learning for Spam Filter

Haoliang Qi; Xiaoning He; Muyun Yang; Jun Li; Guohua Lei; Zhongyuan Han; Sheng Li

This paper describes a simple but effective discriminative learner for spam filtering. We statically derive the features within a Bayesian framework, cluster them into groups according to their position, and then assign weights to the groups respectively. The model is evaluated on the TREC Spam corpus and compared with the TREC results. Experimental results show that our linear discriminative model produces competitive results.

text retrieval conference | 2012