
Qinmin Hu

East China Normal University

Publication

Featured research published by Qinmin Hu.

international acm sigir conference on research and development in information retrieval | 2009

A bayesian learning approach to promoting diversity in ranking for biomedical information retrieval

Xiangji Huang; Qinmin Hu

In this paper, we propose a Bayesian learning approach to promoting diversity for information retrieval in biomedicine and a re-ranking model to improve retrieval performance in the biomedical domain. First, the re-ranking model computes the maximum posterior probability of the hidden property corresponding to each retrieved passage. Then it iteratively groups the passages into subsets according to their properties. Finally, these passages are re-ranked from the subsets as our output. Our proposed method does not need any external biomedical resource. We evaluate our Bayesian learning approach by conducting extensive experiments on the TREC 2004-2007 Genomics data sets. The experimental results show the effectiveness of the proposed Bayesian learning approach for promoting diversity in ranking for biomedical information retrieval on four years of TREC data sets.
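The group-then-re-rank step described in the abstract above can be illustrated with a minimal sketch. It assumes each passage already carries a hidden-property label (the posterior estimation itself is not shown) and uses a simple round-robin over the groups, which is an illustrative choice and not necessarily the authors' procedure. The function name `diversify` and the tuple layout are hypothetical.

```python
def diversify(passages):
    """Re-rank passages so that different hidden properties surface early.

    passages: list of (passage_id, score, label) tuples, where label is an
    assumed, pre-computed hidden-property assignment for the passage.
    """
    # Group passage ids by label, visiting passages in descending score order.
    # Dicts keep insertion order, so groups end up ordered by their best score.
    groups = {}
    for pid, _score, label in sorted(passages, key=lambda p: -p[1]):
        groups.setdefault(label, []).append(pid)

    # Round-robin: take the best remaining passage from each group in turn.
    queues = list(groups.values())
    reranked = []
    while queues:
        remaining = []
        for queue in queues:
            reranked.append(queue.pop(0))
            if queue:
                remaining.append(queue)
        queues = remaining
    return reranked


example = [
    ("p1", 0.9, "A"),
    ("p2", 0.8, "A"),
    ("p3", 0.7, "B"),
    ("p4", 0.6, "A"),
    ("p5", 0.5, "C"),
]
print(diversify(example))
```

In this toy input, passages from the less common properties "B" and "C" move ahead of the second and third "A" passages, which is the kind of diversity promotion the abstract describes.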

intelligent information systems | 2010

Passage extraction and result combination for genomics information retrieval

Qinmin Hu; Jimmy Xiangji Huang

In this paper, we first propose algorithms for passage extraction to build indices for the purpose of generating more accurate passages as query answers. Second, we propose a basic result combination method and an improved result combination method to combine the retrieved results from different indices for the purpose of selecting and merging relevant passages as outputs. For passage extraction, three new algorithms are proposed, namely paragraphParsed, sentenceParsed and wordSentenceParsed. For result combination, a novel method is proposed, in which we use factor analysis to generate a better baseline result for combination by finding some hidden common factors that can be used to estimate the importance of keywords and keyword associations. Finally, we report the experimental results that confirm the effectiveness and superiority of the factor analysis based method for result combination. Our proposed approaches achieve excellent results on the TREC 2006 and 2007 Genomics data sets, which provide a promising avenue for constructing high performance information retrieval systems in biomedicine.

BMC Bioinformatics | 2011

A robust approach to optimizing multi-source information for enhancing genomics retrieval performance

Qinmin Hu; Jimmy Xiangji Huang; Jun Miao

Background: Users want short, specific answers to their questions, placed in context by links to the original sources in the biomedical literature. Information systems use information retrieval technologies to index and retrieve data with a variety of pre-defined search techniques and functions, so different ranking strategies are designed for different sources. In this paper, we propose a robust approach to optimizing multi-source information for improving genomics retrieval performance.

Results: In the proposed approach, we first consider a common scenario for a metasearch system that has access to multiple baselines, each retrieving and ranking documents/passages with its own model. Then, given selected baselines from multiple sources, we investigate three modified fusion methods in the proposed approach, reciprocal, CombMNZ and CombSUM, to re-rank the candidates as the outputs for evaluation. Our empirical study on both the 2007 and 2006 genomics data sets demonstrates the viability of the proposed approach for obtaining better performance. Furthermore, the experimental results show that the reciprocal method provides notable improvements over the individual baseline, especially on the passage2-level MAP and the aspect-level MAP.

Conclusions: From the extensive experiments on two TREC genomics data sets, we draw the following conclusions. Of the three fusion methods proposed in the robust approach, the reciprocal method clearly outperforms the CombMNZ and CombSUM methods, and CombSUM works well at the passage2 level when compared with CombMNZ. Based on the multiple sources of DFR, BM25 and language model, we can observe that the alliance of giants achieves the best result. Meanwhile, under the same combination, the better the baseline performance is, the more the baseline contributes. These conclusions are useful for guiding fusion work in the field of biomedical information retrieval.
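For readers unfamiliar with the fusion methods named in the abstract above, here is a minimal sketch of the textbook versions of CombSUM, CombMNZ and reciprocal rank fusion. The paper evaluates modified variants whose details are not given here, so this is not the authors' implementation. The min-max normalization and the constant `k=60` are assumptions, and all function names are hypothetical.

```python
from collections import defaultdict


def min_max(run):
    """Scale one run's scores to [0, 1]; a run maps doc id -> raw score."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_sum(runs):
    """CombSUM: sum of a document's normalized scores across runs."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max(run).items():
            fused[doc] += score
    return dict(fused)


def comb_mnz(runs):
    """CombMNZ: CombSUM multiplied by the number of runs retrieving the doc."""
    sums = comb_sum(runs)
    hits = defaultdict(int)
    for run in runs:
        for doc in run:
            hits[doc] += 1
    return {doc: sums[doc] * hits[doc] for doc in sums}


def reciprocal_rank_fusion(runs, k=60):
    """Reciprocal fusion: sum of 1 / (k + rank) over runs (k is an assumption)."""
    fused = defaultdict(float)
    for run in runs:
        ordered = sorted(run, key=run.get, reverse=True)
        for rank, doc in enumerate(ordered, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)


def ranking(fused):
    """Document ids sorted by fused score, best first."""
    return sorted(fused, key=fused.get, reverse=True)


runs = [{"d1": 3.0, "d2": 1.0}, {"d2": 2.0, "d3": 1.0}]
print(ranking(comb_mnz(runs)))
print(ranking(reciprocal_rank_fusion(runs)))
```

CombSUM and CombMNZ operate on scores, so they depend on how each baseline's scores are normalized, while the reciprocal method uses only rank positions.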

knowledge discovery and data mining | 2009

Boosting Biomedical Information Retrieval Performance through Citation Graph: An Empirical Study

Xiaoshi Yin; Xiangji Huang; Qinmin Hu; Zhoujun Li

This paper presents an empirical study of the combination of content-based information retrieval results with linkage-based document importance scores to improve retrieval performance on TREC biomedical literature datasets. In our study, content-based information comes from the state-of-the-art probability-model-based Okapi information retrieval system. Linkage-based information, on the other hand, comes from a citation graph generated from the REFERENCES sections of a biomedical literature dataset. Three well-known linkage-based ranking algorithms (PageRank, HITS and InDegree) are applied to the citation graph to calculate document importance scores. We use the TREC 2007 Genomics dataset for evaluation, which contains 162,259 biomedical articles. Our approach achieves the best document-based MAP among all results that have been reported so far. Our major findings can be summarized as follows. First, without hyperlinks, linkage information extracted from REFERENCES sections can be used to improve the effectiveness of biomedical information retrieval. Second, the performance of the integrated system is sensitive to the linkage-based ranking algorithm, and a simpler algorithm, InDegree, is more suitable for biomedical literature retrieval.
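As an illustration of the idea in the abstract above, the following sketch computes InDegree scores from a small citation graph and blends them with content-based scores. The abstract does not state how the two signals are combined, so the linear interpolation, the max-normalization and the weight `lam` are assumptions. All names are hypothetical.

```python
def in_degree(citations):
    """Count incoming citations; citations maps citing doc -> list of cited docs."""
    counts = {}
    for citing, cited_docs in citations.items():
        counts.setdefault(citing, 0)
        for cited in set(cited_docs):  # count each citing document once
            counts[cited] = counts.get(cited, 0) + 1
    return counts


def max_normalize(scores):
    """Divide by the maximum so the values fall in [0, 1]."""
    top = max(scores.values(), default=0)
    if top == 0:
        return {doc: 0.0 for doc in scores}
    return {doc: value / top for doc, value in scores.items()}


def combine(content_scores, link_scores, lam=0.5):
    """Assumed blend: (1 - lam) * content + lam * linkage, both normalized."""
    content = max_normalize(content_scores)
    link = max_normalize(link_scores)
    return {
        doc: (1 - lam) * content[doc] + lam * link.get(doc, 0.0)
        for doc in content
    }


citations = {"a": ["b", "c"], "b": ["c"], "c": []}
content_scores = {"a": 2.0, "b": 1.6, "c": 1.0}  # e.g. scores from a text ranker

final = combine(content_scores, in_degree(citations))
reranked = sorted(final, key=final.get, reverse=True)
print(reranked)
```

In this toy example the most-cited document "c" moves to the top even though its content score is the lowest, which shows how strongly the linkage weight can reshape a ranking.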

BMC Bioinformatics | 2012

Modeling and mining term association for improving biomedical information retrieval performance

Qinmin Hu; Jimmy Xiangji Huang; Xiaohua Hu

Background: The growth of biomedical information requires most information retrieval systems to provide short and specific answers in response to complex user queries. Semantic information in the form of free text is structured in a way that makes it straightforward for humans to read but more difficult for computers to interpret automatically and search efficiently. One of the reasons is that most traditional information retrieval models assume terms are conditionally independent given a document/passage. Therefore, we are motivated to consider term associations within different contexts to help the models understand semantic information and use it for improving biomedical information retrieval performance.

Results: We propose a term association approach to discover term associations among the keywords from a query. The experiments are conducted on the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. The proposed approach is promising and achieves superiority over the baselines and the GSP results. The parameter settings and different indices are investigated: the sentence-based index produces the best document-level results, the word-based index the best passage-level results, and the paragraph-based index the best passage2-level results. Furthermore, the best term association results always come from the best baseline. The tuning number k in the proposed recursive re-ranking algorithm is discussed and locally optimized to be 10.

Conclusions: First, modelling term association for improving biomedical information retrieval using factor analysis is one of the major contributions of our work. Second, the experiments confirm that term association considering co-occurrence and dependency among the keywords can produce better results than the baselines treating the keywords independently. Third, the baselines are re-ranked according to the importance and reliance of latent factors behind term associations. These latent factors are decided by the proposed model and their term appearances in the first round retrieved passages.

cross language evaluation forum | 2009

An integrated approach for medical image retrieval through combining textual and visual features

Zheng Ye; Xiangji Huang; Qinmin Hu; Hongfei Lin

In this paper, we present an empirical study for monolingual medical image retrieval. In particular, we present a series of experiments in the ImageCLEFmed 2009 task. There are three main goals. First, we evaluate traditional well-known weighting models in the text retrieval domain, such as BM25, TFIDF and Language Model (LM), for context-based image retrieval. Second, we evaluate statistical-based feedback models and ontology-based feedback models. Third, we investigate how content-based image retrieval can be integrated with these two basic technologies in the traditional text retrieval domain. The experimental results show that: 1) traditional weighting models work well in the context-based medical image retrieval task, especially when the parameters are tuned properly; 2) statistical-based feedback models can further improve the retrieval performance when a small number of documents are used for feedback; however, medical image retrieval cannot benefit from the ontology-based query expansion method used in this paper; 3) the retrieval performance can be slightly boosted via an integrated retrieval approach.

pacific-asia conference on knowledge discovery and data mining | 2015

Learning Topic-Oriented Word Embedding for Query Classification

Hebin Yang; Qinmin Hu; Liang He

In this paper, we propose a topic-oriented word embedding approach to address the query classification problem. First, the topic information is encoded to generate query categories. Then, the user click-through information is also incorporated in the modified word embedding algorithms. After that, the short and ambiguous queries are enriched to be classified in a supervised learning way. Our unique contribution is that we present four neural network strategies based on the proposed model. The experiments are conducted on two open data sets from Baidu and Sogou, two well-known commercial search companies. Our evaluation results show that the proposed approach is promising on both large data sets. Under the four proposed strategies, we achieve performance as high as 95.73% in terms of Precision and 97.79% in terms of the F1 measure.

international acm sigir conference on research and development in information retrieval | 2008

A reranking model for genomics aspect search

Qinmin Hu; Xiangji Huang

In this paper, we propose a reranking model to improve the aspect-level performance in the biomedical domain. This model iteratively computes the maximum hidden aspect for every retrieved passage and then reranks these passages from aspect subsets. The experimental results show improvements in aspect-level performance of up to 27.14% for the 2006 Genomics topics and 27.09% for the 2007 Genomics topics.

pacific-asia conference on knowledge discovery and data mining | 2015

An Empirical Study of Personal Factors and Social Effects on Rating Prediction

Zhijin Wang; Yan Yang; Qinmin Hu; Liang He

In social networks, the link between a pair of friends has been reported to be effective in improving recommendation accuracy. Previous studies are mainly based on the assumption that any pair of friends has similar interests, and they reduce the error of rating prediction by minimizing the gap between a user’s taste and the average (or similar) taste of this user’s friends. However, these methods ignore the diversity of a user’s taste. In this paper, we focus on learning the diversity of a user’s taste and the effects from this user’s friends in terms of rating behavior. We propose a novel recommendation approach, namely Personal factors with Weighted Social effects Matrix Factorization (PWS), which utilizes both the user’s taste and social effects to provide recommendations. Experimental results on three datasets show the effectiveness of the proposed approach.

international acm sigir conference on research and development in information retrieval | 2016

SG++: Word Representation with Sentiment and Negation for Twitter Sentiment Classification

Qinmin Hu; Yijun Pei; Qin Chen; Liang He

Here we propose an advanced Skip-gram model to incorporate both word sentiment and negation information. In particular, there is a softmax layer for the word sentiment polarity on top of the Skip-gram model. Then, two parallel embedding layers are set up in the same embedding space, one for the affirmative context and the other for the negated context, followed by their loss functions. We evaluate our proposed model on the 2013 and 2014 SemEval data sets. The experimental results show that the proposed approach achieves better performance and learns informative higher-dimensional word embeddings on large-scale data.
