Kangmiao Liu
Zhejiang University
Publication
Featured research published by Kangmiao Liu.
international acm sigir conference on research and development in information retrieval | 2007
Guang Qiu; Kangmiao Liu; Jiajun Bu; Chun Chen; Zhiming Kang
Query ambiguity prevents existing retrieval systems from returning reasonable results for every query. Since much work has already been done on resolving ambiguity, vague queries could be handled separately with the corresponding approaches if they can be identified in advance. Quantifying the degree of (lack of) ambiguity lays the groundwork for this identification. In this poster, we propose such a measure using query topics based on the topic structure selected from the Open Directory Project (ODP) taxonomy. We introduce a clarity score to quantify the lack of ambiguity with respect to data sets constructed from the TREC collections, and the rank correlation test results demonstrate a strong positive association between the clarity scores and retrieval precision for queries.
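As a rough illustration of the idea in this abstract, the sketch below computes a clarity-style score as the Kullback-Leibler divergence between a query's topic distribution and a collection-wide topic distribution. The abstract does not state the formula, so the KL form, the four topic names, the probabilities, and the function name `clarity_score` are all assumptions made for illustration, not the authors' definition.

```python
import math

def clarity_score(query_topics, collection_topics):
    """KL divergence (base 2) of a query's topic distribution from the collection's.

    Both arguments map topic name -> probability.
    Assumed form for illustration; not taken from the poster itself.
    """
    score = 0.0
    for topic, p in query_topics.items():
        if p > 0.0:  # terms with zero probability contribute nothing
            score += p * math.log2(p / collection_topics[topic])
    return score

# Hypothetical collection-wide distribution over four ODP-like topics.
collection = {"Arts": 0.25, "Science": 0.25, "Sports": 0.25, "Business": 0.25}

# A query concentrated on one topic versus one spread evenly over all topics.
focused = {"Arts": 0.0, "Science": 1.0, "Sports": 0.0, "Business": 0.0}
ambiguous = {"Arts": 0.25, "Science": 0.25, "Sports": 0.25, "Business": 0.25}

print(clarity_score(focused, collection))    # higher: the query is unambiguous
print(clarity_score(ambiguous, collection))  # lower: the query mirrors the collection
```

Under this assumed form, a query whose topics mirror the collection scores zero, and the score grows as the query concentrates on fewer topics.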
international workshop on data mining and audience intelligence for advertising | 2007
Guang Qiu; Kangmiao Liu; Jiajun Bu; Chun Chen; Zhiming Kang
Previous work on opinion/sentiment mining focuses only on sentiment classification, under the assumption that topics are identified a priori. However, this assumption often fails in practice. In advertising, the topics on which users are commenting are crucial, as corresponding advertisements can only be promoted when advertisers know what users are referring to. In this paper, we propose a rule-based approach to extracting topics from opinion sentences, given that these sentences have been identified from texts in advance. We build a sentiment dictionary and define several rules based on the syntactic roles of words using Dependency Grammar, which is considered more suitable for Chinese natural language parsing. The experiments show encouraging results.
meeting of the association for computational linguistics | 2007
Keke Cai; Jiajun Bu; Chun Chen; Kangmiao Liu
This paper explores term dependence in sentence retrieval. Adjacent terms appearing in a query are assumed to be related to each other. These assumed dependences among query terms are then validated for each sentence, and sentences that exhibit strong syntactic relationships among query terms are considered more relevant. Experimental results demonstrate the promise of the proposed models in improving sentence retrieval effectiveness.
signal-image technology and internet-based systems | 2007
Peng Huang; Jiajun Bu; Chun Chen; Kangmiao Liu; Guang Qiu
Automatic image annotation is a promising methodology for image retrieval. However, most current annotation models are not yet sophisticated enough to produce high-quality annotations. Given an image, they produce some keywords that are irrelevant to the image content, which is a primary obstacle to high-quality image retrieval. In this paper, an approach is proposed to improve automatic image annotation in two directions. One is to combine the annotation keywords produced by three underlying classic image annotation models (the translation model, the continuous-space relevance model, and the multiple Bernoulli relevance model), in the hope of increasing the number of potentially correct annotation keywords. The other is to remove keywords irrelevant to image semantics based on semantic similarity calculation using WordNet. To verify the proposed hybrid annotation model, we carried out experiments on the widely used Corel image data set, and the reported experimental results showed that the proposed approach improved image annotation to some extent.
international acm sigir conference on research and development in information retrieval | 2007
Keke Cai; Chun Chen; Kangmiao Liu; Jiajun Bu; Peng Huang
This poster studies term context dependence in sentence retrieval. Based on the Markov Random Field (MRF), three forms of dependence among query terms are considered. Under different assumptions about the term dependence relationship, three feature functions are defined in order to use association features between query terms in a sentence to evaluate the relevance of that sentence. Experimental results have proven the efficiency of the proposed retrieval models in improving the performance of sentence retrieval.
asia information retrieval symposium | 2008
Peng Huang; Jiajun Bu; Chun Chen; Kangmiao Liu; Guang Qiu
Automatic image annotation techniques have been proposed to overcome the so-called semantic gap between low-level image features and high-level concepts in content-based image retrieval systems. Due to the limitations of these techniques, current state-of-the-art automatic image annotation models still produce some concepts irrelevant to image semantics, which are an obstacle to high-quality image retrieval. In this paper, we focus on improving image annotation to facilitate web image retrieval. The novelty of our work is to use both WordNet and textual information in web documents to refine the original coarse annotations produced by the classic Continuous Relevance Model (CRM). Each keyword in the annotations is associated with a certain weight: the larger the weight, the more closely the corresponding concept is related to image semantics. The experimental results show that the refined annotations improve image retrieval to some extent compared to the original coarse annotations.
international world wide web conferences | 2007
Keke Cai; Jiajun Bu; Chun Chen; Kangmiao Liu; Wei Chen
This paper investigates in depth the application of Bayesian networks to sentence retrieval and introduces three Bayesian-network-based sentence retrieval models, with or without consideration of term relationships. Term relationships in this paper are considered from two perspectives: relationships between pairs of terms and relationships between terms and term sets. Experiments have proven the efficiency of Bayesian networks in sentence retrieval. In particular, retrieval that takes the second kind of term relationship into account performs better in improving retrieval precision.
advances in multimedia | 2007
Kangmiao Liu; Guang Qiu; Jiajun Bu; Chun Chen
Blogs have received a lot of attention since the Web 2.0 revolution and have attracted millions of users to publish information on them. As time goes by, information seeking in this new medium has become an emergent issue. In our paper, we take multiple features unique to blogs into account and propose a novel algorithm to rank blog posts in blog search. Coherence between the query type and blogger interest, document relevance, and freshness are combined linearly to produce the final ranking score of a post. Specifically, we introduce a user modeling method to capture the interests of bloggers. In our experiments, we invite volunteers to complete several tasks, and the time they spend on the tasks is taken as the primary criterion to evaluate performance. The experimental results show that our algorithm outperforms traditional ones.
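The linear combination described in this abstract might look roughly like the sketch below. The feature scales, the weights (0.4, 0.4, 0.2), the example posts, and the names `PostFeatures` and `ranking_score` are hypothetical; the abstract does not report the actual weights or how each feature is computed.

```python
from dataclasses import dataclass

@dataclass
class PostFeatures:
    coherence: float  # match between query type and blogger interest, assumed in [0, 1]
    relevance: float  # document relevance to the query, assumed in [0, 1]
    freshness: float  # recency of the post, assumed in [0, 1]

def ranking_score(f, w_coherence=0.4, w_relevance=0.4, w_freshness=0.2):
    """Weighted linear combination of the three features (weights are assumptions)."""
    return (w_coherence * f.coherence
            + w_relevance * f.relevance
            + w_freshness * f.freshness)

# Hypothetical feature values for two posts.
posts = {
    "post_a": PostFeatures(coherence=0.9, relevance=0.6, freshness=0.2),
    "post_b": PostFeatures(coherence=0.3, relevance=0.8, freshness=0.5),
}

# Rank post names by descending score.
ranked = sorted(posts, key=lambda name: ranking_score(posts[name]), reverse=True)
print(ranked)
```

With these made-up values, the post from a blogger whose interests match the query type ranks first even though its relevance and freshness are lower, which is the kind of trade-off a linear combination allows.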
advances in multimedia | 2006
Kangmiao Liu; Wei Chen; Chun Chen; Jiajun Bu; Can Wang; Peng Huang
The explosive growth of the World Wide Web has already made it the biggest image repository. Although some image search engines provide convenient access to web images, they frequently yield unwanted results. Locating needed and relevant images remains a challenging task. This paper proposes a novel ranking model named EagleRank for web image search engines. In EagleRank, multiple sources of evidence related to the images are considered, including the text passages surrounding an image, terms in special HTML tags, the website types of the images, the hypertextual structure of the web pages, and even user feedback. Meanwhile, the flexibility of EagleRank allows it to incorporate other potential factors as well. Based on the inference network model, EagleRank also gives sufficient support to the Boolean AND and OR operators. Our experimental results indicate that EagleRank performs better than traditional approaches that consider only the text from web pages.
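For readers unfamiliar with inference networks, the sketch below shows the closed forms commonly used for Boolean AND and OR in the general inference network retrieval model, where each input is a belief in [0, 1]. Whether EagleRank uses exactly these forms is an assumption, as the abstract does not say; the function names and the example belief values are hypothetical.

```python
from math import prod

def belief_and(beliefs):
    """AND node: product of the parent beliefs."""
    return prod(beliefs)

def belief_or(beliefs):
    """OR node: one minus the probability that every parent is false."""
    return 1.0 - prod(1.0 - b for b in beliefs)

# Hypothetical beliefs that an image matches two query terms,
# e.g. derived from surrounding text and from terms in HTML tags.
term_beliefs = [0.5, 0.5]

print(belief_and(term_beliefs))  # 0.25
print(belief_or(term_beliefs))   # 0.75
```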
international conference on convergence information technology | 2007
Peng Huang; Jiajun Bu; Chun Chen; Kangmiao Liu; Wei Chen
Retrieving images in response to textual queries requires some knowledge of the semantics of images. Accordingly, an efficient image annotation and retrieval system is highly desirable for this purpose. However, current image annotation techniques are not satisfactory, as they often include noisy keywords. To improve image annotation, we propose a hybrid Web image annotation model (HIAM) consisting of two basic submodules, HMIAM and IARM. The former, based on a hidden Markov model, associates an image with some keywords like other traditional models, while the latter utilizes textual information in Web documents to evaluate each keyword's importance to image semantics: each keyword is associated with a certain weight to quantify its similarity to image semantics. Keywords with low weights can then be removed as noisy data. The experimental results show that the annotations post-processed by our model are better than the original ones.
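The final pruning step described in this abstract can be sketched as a simple threshold filter over weighted keywords. The threshold value, the example keywords and weights, and the function name `prune_keywords` are hypothetical; the abstract does not specify how the cutoff is chosen or how IARM computes the weights.

```python
def prune_keywords(weighted_keywords, threshold=0.3):
    """Keep keywords whose weight meets the threshold, ordered by descending weight.

    weighted_keywords maps keyword -> weight; the threshold is an assumed value.
    """
    kept = [(kw, w) for kw, w in weighted_keywords.items() if w >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [kw for kw, _ in kept]

# Hypothetical annotation weights for one image.
annotations = {"grass": 0.5, "tiger": 0.8, "car": 0.1}

print(prune_keywords(annotations))  # ['tiger', 'grass']
```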