Biyun Hu
Beihang University
Publication
Featured research published by Biyun Hu.
International Conference on Machine Learning and Cybernetics | 2009
Lin Li; Xia Hu; Biyun Hu; Jun Wang; Yiming Zhou
This paper proposes determining sentence similarity from several aspects. Based on the information people get from a sentence, four measures are defined: Objects-Specified Similarity, Objects-Property Similarity, Objects-Behavior Similarity, and Overall Similarity. Experiments show that the proposed method compares sentences more precisely and gives more reasonable results, which are closer to how people understand the meaning of the sentences.
WRI Global Congress on Intelligent Systems | 2009
Biyun Hu; Jun Wang; Yiming Zhou
News that reports events without commentary is usually very short, so the Vector Space Model (VSM), which is used to represent long text documents, is not suitable for describing news. In addition, VSM cannot represent the basic questions in news: who, what, when, and where. A new kind of text model is needed to handle news. In this paper, a news ontology incorporating some metadata from OpenCyc, EventsML-G2, NewsML-G2, and the News Industry Text Format (NITF) is first designed. More importantly, relations in the news domain are also described in the ontology. A model for news analysis is then proposed. The process of news analysis is examined from three main aspects: instancing, reasoning, and finding similar events. The results indicate that the main benefit of the news ontology is its ability to describe news, reason about news, and support the whole process of news analysis.
International Symposium on Neural Networks | 2009
Jun Wang; Yiming Zhou; Lin Li; Biyun Hu; Xia Hu
Most traditional text clustering methods are based on the bag-of-words representation, which ignores important information about the semantic relationships between key terms. To overcome this problem, researchers have recently proposed several methods that improve short text clustering accuracy by enriching the short text representation. However, these methods, which expand the words appearing in short texts, are usually time-consuming. In this paper, we improve on previous work by enriching the short text representation with keyword expansion. Empirical results show that the proposed method can greatly save time without sacrificing clustering accuracy.
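The limitation of bag-of-words matching on short texts can be illustrated with a minimal sketch. The expansion table `EXPANSION` and the helper names below are hypothetical and are not taken from the paper, whose actual keyword expansion method is not described in the abstract.

```python
import math
from collections import Counter

def bow_cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical expansion table: each keyword maps to related terms.
EXPANSION = {
    "cheap": ["low", "cost"],
    "flights": ["airline"],
    "paris": ["france"],
}

def expand(text: str, table: dict) -> str:
    """Append the related terms of every token that has an entry in the table."""
    tokens = text.lower().split()
    extra = [w for t in tokens for w in table.get(t, [])]
    return " ".join(tokens + extra)

text_a = "cheap flights to paris"
text_b = "low cost airline tickets france"

# The two texts share no words, so plain bag-of-words sees no similarity.
plain_similarity = bow_cosine(text_a, text_b)
# After expanding the first text, the related terms overlap.
expanded_similarity = bow_cosine(expand(text_a, EXPANSION), text_b)
```

Here the two related texts have zero similarity under plain bag-of-words and a positive similarity once one of them is expanded.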
European Conference on Information Retrieval | 2010
Biyun Hu; Zhoujun Li; Jun Wang
Memory-based collaborative filtering is one of the most popular methods used in recommendation systems. It predicts a user’s preference based on his or her similarity to other users. Traditionally, the Pearson correlation coefficient is used to compute the similarity between users. In this paper we develop a novel memory-based approach that incorporates the user’s latent interest. The interest level of a user is first estimated from his or her ratings for items through a latent trait model, and is then used for computing the similarity between users. Experimental results show that the proposed method outperforms the traditional memory-based one.
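The traditional baseline mentioned above can be sketched as follows. This is a minimal illustration of Pearson-based user similarity and a mean-centered neighbor prediction, not the paper's latent trait approach; the function names and the toy ratings are assumptions made for the example.

```python
import math

def pearson(ratings_u: dict, ratings_v: dict) -> float:
    """Pearson correlation over the items both users have rated."""
    common = ratings_u.keys() & ratings_v.keys()
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)

def predict(user: str, item: str, ratings: dict) -> float:
    """Mean-centered, similarity-weighted average of other users' ratings.

    The result is not clipped to the rating scale.
    """
    own = ratings[user]
    mean_user = sum(own.values()) / len(own)
    num = 0.0
    den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = pearson(own, theirs)
        mean_other = sum(theirs.values()) / len(theirs)
        num += weight * (theirs[item] - mean_other)
        den += abs(weight)
    if den == 0:
        return mean_user
    return mean_user + num / den

# Toy data (hypothetical): alice has not rated item "d".
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
}
sim_alice_bob = pearson(ratings["alice"], ratings["bob"])
sim_alice_carol = pearson(ratings["alice"], ratings["carol"])
prediction = predict("alice", "d", ratings)
```

In this toy data, alice agrees with bob and disagrees with carol, so the prediction for item "d" lands above alice's own mean rating.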
Conference on Information and Knowledge Management | 2011
Jun Wang; Xia Hu; Zhoujun Li; Wenhan Chao; Biyun Hu
This paper is concerned with the problem of question recommendation in the setting of Community Question Answering (CQA). Given a question as the query, our goal is to rank all of the retrieved questions according to their likelihood of being good recommendations for the query. We propose a notion of public interest and show how public interest can boost the performance of question recommendation. In particular, to model public interest in question recommendation, we build a language model that combines a relevance score with respect to the query and a popularity score for the question. Experimental results on the Yahoo! Answers dataset demonstrate that the performance of question recommendation can be greatly improved by considering public interest.
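One common way to combine relevance and popularity in a language model is to multiply a smoothed query likelihood by a popularity-based prior. The sketch below assumes that form; the abstract does not give the paper's actual model, and the `rank_questions` function, the `popularity` counts, and the smoothing weight `lam` are all hypothetical.

```python
import math
from collections import Counter

def rank_questions(query: str, questions: dict, popularity: dict, lam: float = 0.5) -> list:
    """Rank question ids by log P(query | question) + log P(question).

    P(query | question) is a unigram model with Jelinek-Mercer smoothing;
    P(question) is proportional to a positive popularity count.
    """
    docs = {qid: Counter(text.lower().split()) for qid, text in questions.items()}
    collection = Counter()
    for counts in docs.values():
        collection.update(counts)
    collection_len = sum(collection.values())
    total_popularity = sum(popularity.values())

    scores = {}
    for qid, counts in docs.items():
        doc_len = sum(counts.values())
        log_relevance = 0.0
        for term in query.lower().split():
            p_doc = counts[term] / doc_len if doc_len else 0.0
            p_coll = collection[term] / collection_len if collection_len else 0.0
            p = lam * p_doc + (1 - lam) * p_coll
            if p == 0:
                continue  # term absent from the whole collection: ignore it
            log_relevance += math.log(p)
        log_prior = math.log(popularity[qid] / total_popularity)
        scores[qid] = log_relevance + log_prior
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (hypothetical): q1 and q2 are equally relevant, q2 is more popular.
questions = {
    "q1": "how to learn python quickly",
    "q2": "how to learn python fast",
    "q3": "best pizza in town",
}
popularity = {"q1": 20, "q2": 50, "q3": 30}
ranking = rank_questions("learn python", questions, popularity)
```

With equal relevance for q1 and q2, the popularity prior breaks the tie in favor of q2, while the off-topic q3 stays last despite being more popular than q1.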
Fuzzy Systems and Knowledge Discovery | 2009
Lei Shen; Yiming Zhou; Chao Xu; Xia Hu; Biyun Hu
Web sites are making great efforts to understand user behavior in order to become easier to use and to increase their profits. This paper presents a method for predicting users’ buying behavior based on a psychological model. We apply the method to online store data, treating clicking and buying as the user’s attitude and behavior. A new model for predicting the user’s future buying behavior is then built on attitude-behavior relationship theory. We verify the model and simultaneously estimate its parameters by path analysis. Our method is evaluated against the traditional naive Bayes classification algorithm. Experimental results show that our model is more effective at predicting buying behavior and at finding users who are more profitable to web sites.
Fuzzy Systems and Knowledge Discovery | 2010
Jun Wang; Zhoujun Li; Biyun Hu
There is an increasing number of community-based question and answer (cQA) services on the Web. Many tasks in cQA services involve determining the similarity between questions. However, finding similar questions is not trivial. In this paper we propose a new method based on a query expansion technique to tackle the similar question matching problem. We use the information provided by the corresponding answers and by question-answer pairs from cQA services to determine the similarity between questions. We show empirically that this method can find semantically similar questions even when they share no terms.
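The idea of expanding a question with terms from its answer can be sketched in a few lines. This is an illustration under simple assumptions (a tiny hand-picked stopword list and Jaccard overlap), not the paper's method; all names and example texts are hypothetical.

```python
# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "the", "to", "how", "do", "i", "on", "with"}

def terms(text: str) -> set:
    """Lowercased content words of a text."""
    return {t for t in text.lower().split() if t not in STOPWORDS}

def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def expanded_terms(question: str, answer: str) -> set:
    """Represent a question by its own terms plus the terms of its answer."""
    return terms(question) | terms(answer)

q1, a1 = "how do i fix a flat tire", "use a patch kit on the inner tube"
q2, a2 = "repairing punctured bicycle wheel", "patch the inner tube with a repair kit"

# The questions alone share no content words.
plain = jaccard(terms(q1), terms(q2))
# Their answers use the same vocabulary, so the expanded sets overlap.
expanded = jaccard(expanded_terms(q1, a1), expanded_terms(q2, a2))
```

The two questions have no terms in common, yet their answer-expanded representations overlap on "patch", "kit", "inner", and "tube".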
International Symposium on Neural Networks | 2009
Biyun Hu; Yiming Zhou; Jun Wang; Lin Li; Lei Shen
Although many approaches to collaborative filtering have been proposed, few have considered the data quality of recommender systems. Measurement is imprecise, and the rating data given by users is a distorted version of their true preferences. This paper describes how item response theory, specifically the rating scale model, may be applied to correct the ratings. The theoretically true preferences were then substituted for the actual ratings to produce recommendations. This approach was applied to the Jester dataset and the traditional k-Nearest Neighbors (k-NN) collaborative filtering algorithm. Experiments demonstrated that the rating scale model can enhance the recommendation quality of the k-NN algorithm. Analysis also showed that our approach can predict true preferences, which k-NN cannot do. The results have important implications for improving the recommendation quality of other collaborative filtering algorithms by first finding the true user preference.
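For readers unfamiliar with the rating scale model, the sketch below computes category probabilities in the standard Andrich form, given a user trait `theta`, an item location `delta`, and shared thresholds `taus`. The parameter values are hypothetical, parameter estimation is not shown, and using the expected rating as a stand-in for the "true preference" is an assumption of this example rather than something the abstract states.

```python
import math

def rsm_probabilities(theta: float, delta: float, taus: list) -> list:
    """Probabilities of rating categories 0..m under the rating scale model.

    taus holds the m thresholds tau_1..tau_m shared by all items.
    Category x has log-weight sum_{k=1..x} (theta - delta - tau_k),
    and category 0 has log-weight 0.
    """
    log_weights = [0.0]
    for tau in taus:
        log_weights.append(log_weights[-1] + (theta - delta - tau))
    peak = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rating(theta: float, delta: float, taus: list) -> float:
    """Model-implied mean rating on the 0..m scale."""
    probs = rsm_probabilities(theta, delta, taus)
    return sum(x * p for x, p in enumerate(probs))

# Hypothetical parameters: a four-category scale (0..3) with symmetric thresholds.
taus = [-1.0, 0.0, 1.0]
probs = rsm_probabilities(theta=0.0, delta=0.0, taus=taus)
neutral = expected_rating(theta=0.0, delta=0.0, taus=taus)
interested = expected_rating(theta=1.5, delta=0.0, taus=taus)
```

With symmetric thresholds and `theta` equal to `delta`, the expected rating sits at the midpoint of the scale, and it rises as `theta` increases.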
Asia-Pacific Web Conference | 2012
Biyun Hu; Zhoujun Li; Wenhan Chao
It is traditionally assumed that data sparsity is a major problem for the user-based collaborative filtering algorithm. However, this analysis is based only on data quantity and does not consider data quality, which is an important characteristic of data. Sparse, high-quality data may be good for the algorithm, so the analysis is one-sided. In this paper, the effects of training ratings with different levels of sparsity on recommendation quality are first investigated on a real-world dataset. Preliminary experimental results show that data sparsity can have positive effects on both recommendation accuracy and coverage. Next, a measurement of data noise is introduced. Then, taking data noise into consideration, the effects of data sparsity on the recommendation quality of the algorithm are re-evaluated. Experimental results show that if sparsity implies high data quality (low noise), then sparsity is good for both recommendation accuracy and coverage. This result shows that the traditional analysis of the effect of data sparsity is one-sided, and implies that recommendation quality can be improved substantially by choosing high-quality data.
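Sparsity in a rating dataset is usually measured as the fraction of user-item cells that have no rating. The sketch below uses that common definition, which the abstract does not spell out; the function name and toy data are hypothetical.

```python
def sparsity(ratings: dict) -> float:
    """Fraction of empty cells in the user-item rating matrix.

    ratings maps each user to a dict of {item: rating}. The item set is
    taken to be every item rated by at least one user.
    """
    items = {item for user_ratings in ratings.values() for item in user_ratings}
    cells = len(ratings) * len(items)
    if cells == 0:
        return 0.0
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1 - observed / cells

# Toy data (hypothetical): 3 users, 4 items, 6 observed ratings.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "c": 2},
    "u3": {"b": 1, "d": 4},
}
level = sparsity(ratings)
```

Here 6 of the 12 possible cells are filled, so the sparsity level is 0.5.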
Journal of Electronics Information & Technology | 2011
Jun Wang; Zhou-jun Li; Xia Hu; Biyun Hu
Question retrieval plays an important role in question answering systems. The main problem is how to measure the similarity between candidate questions and the query question. This paper presents a tree kernel based method, named the weighted tree kernel, to calculate the similarity of sentence structures, and proposes improvements to the original tree kernel algorithm. To reduce the effect that syntactic parsing has on the tree kernel, a composite kernel is proposed based on the weighted tree kernel and two other string kernels. The composite kernel captures syntactic, part-of-speech, and lexical information about a sentence and is used to calculate the semantic similarity between question sentences. Experimental results on the Yahoo! Answers dataset show that the proposed method outperforms traditional vector space model based methods by 24.02% in question retrieval accuracy.