Jianyi Guo
Kunming University of Science and Technology
Publication
Featured research published by Jianyi Guo.
Pattern Recognition Letters | 2010
Zhengtao Yu; Lei Su; Lina Li; Quan Zhao; Cunli Mao; Jianyi Guo
In statistical question classification, semi-supervised learning, which can exploit abundant unlabeled samples, has received substantial attention in recent years. In this paper, a novel question classification approach based on co-training style semi-supervised learning is proposed. In particular, the method extracts high-frequency keywords as classification features and uses word semantic similarity to adjust the feature weights. The classifiers are initially trained on labeled data, and the learned models are then refined using unlabeled data, which are labeled when the classifiers agree on the labeling. Experiments on a Chinese question answering system in the tourism domain were conducted with different feature selections, different supervised and semi-supervised algorithms, different feature dimensions and different unlabeled rates. The experimental results show that the proposed method can effectively improve classification accuracy. Specifically, with a 40% unlabeled rate in the training set, the average accuracy reaches 88.9% on coarse types and 78.2% on fine types, an improvement of around 2-4 percentage points.
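The agreement-based labeling loop described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the two feature views (`question_word`, `content_words`), the vote-counting classifier and the toy questions are all assumptions made for the example, and the keyword selection and semantic-similarity weighting from the paper are omitted.

```python
from collections import Counter, defaultdict

def train(samples, view):
    """Count how often each feature of the given view co-occurs with each label."""
    counts = defaultdict(Counter)
    for tokens, label in samples:
        for feat in view(tokens):
            counts[feat][label] += 1
    return counts

def predict(model, tokens, view):
    """Return the label with the most feature votes, or None if no feature is known."""
    votes = Counter()
    for feat in view(tokens):
        votes.update(model.get(feat, {}))
    return votes.most_common(1)[0][0] if votes else None

def co_train(labeled, unlabeled, view_a, view_b, rounds=5):
    """Grow the labeled set with unlabeled samples on which both classifiers agree."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        model_a, model_b = train(labeled, view_a), train(labeled, view_b)
        remaining = []
        for tokens in pool:
            label_a = predict(model_a, tokens, view_a)
            label_b = predict(model_b, tokens, view_b)
            if label_a is not None and label_a == label_b:
                labeled.append((tokens, label_a))  # both views agree: accept the label
            else:
                remaining.append(tokens)
        if len(remaining) == len(pool):  # nothing new was labeled, stop early
            break
        pool = remaining
    return labeled

# Hypothetical views: the question word versus the rest of the question.
def question_word(tokens):
    return tokens[:1]

def content_words(tokens):
    return tokens[1:]

labeled = [(["where", "is", "temple"], "LOC"), (["when", "does", "park", "open"], "TIME")]
unlabeled = [["where", "is", "lake"], ["when", "does", "lake", "open"], ["how", "much", "ticket"]]
grown = co_train(labeled, unlabeled, question_word, content_words)
final_model = train(grown, question_word)
```

In this toy run the third question stays unlabeled because neither view has seen any of its features, which mirrors the idea that only confidently agreed samples are added.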
International Journal of Machine Learning and Cybernetics | 2015
Jin Jiang; Xin Yan; Zhengtao Yu; Jianyi Guo; Wei Tian
To use the associated relationships in expert pages efficiently, we introduce a Chinese expert disambiguation method based on semi-supervised graph clustering that integrates various associated relationships. First, the correlation characteristics of the expert attributes are extracted through correlation analysis of the expert pages. Second, a similarity matrix between the documents on different expert pages is constructed using the attribute characteristics and the associated relationships of the expert pages. Finally, with the attribute correlations as semi-supervised constraints, an expert disambiguation model is constructed using a graph-based clustering approach, and the model is solved with a kernel-based method to achieve expert name disambiguation. Comparative experiments on Chinese expert disambiguation show that the disambiguation effect is much better with the semi-supervised graph clustering method that integrates the expert-associated relationships.
International Conference on Machine Learning and Cybernetics | 2005
Zhengtao Yu; Zhiyun Zheng; Shi-Ping Tang; Jianyi Guo
In document retrieval, query expansion is normally based on the semantic relations of the query words. When natural language questions are used to retrieve documents, a question carries richer semantic relations than bare query words, so precision can be improved by expanding the query according to question characteristics. This paper puts forward a query expansion method for answer document retrieval in a Chinese question answering system. Words related to the question type are obtained by analyzing question-answer pairs, and the query is expanded with these related words. To verify the validity of the query expansion method, a similarity computation method for questions and documents based on the minimal match span is implemented. The word frequency and position information of the query words and expansion words in the document are taken into consideration. The experimental results show that retrieval performance improves substantially with query expansion.
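The abstract above does not give the similarity formula, so the following is only a sketch of the underlying idea of a minimal match span: the shortest window of a document that covers every query term occurring in it. The function name `minimal_match_span` and the toy tokens are assumptions for illustration; how the span is combined with word frequency and expansion words is left out.

```python
def minimal_match_span(doc_tokens, query_terms):
    """Return (start, end) token indices of the shortest window that covers every
    query term present in the document, or None if no query term occurs."""
    present = set(query_terms) & set(doc_tokens)
    if not present:
        return None
    counts = {}   # occurrences of each matched term inside the current window
    covered = 0   # number of distinct matched terms inside the current window
    best = None
    left = 0
    for right, token in enumerate(doc_tokens):
        if token in present:
            counts[token] = counts.get(token, 0) + 1
            if counts[token] == 1:
                covered += 1
        # Shrink the window from the left while it still covers all terms.
        while covered == len(present):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            left_token = doc_tokens[left]
            if left_token in present:
                counts[left_token] -= 1
                if counts[left_token] == 0:
                    covered -= 1
            left += 1
    return best

doc = ["kunming", "has", "a", "lake", "and", "kunming", "lake", "park"]
span = minimal_match_span(doc, ["kunming", "lake", "tickets"])
```

A shorter span suggests that the matched query and expansion words sit close together, which is the position information a span-based score can reward.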
Archive | 2013
Wei Tian; Tao Shen; Zhengtao Yu; Jianyi Guo; Yantuan Xian
To address the problems of name repetition and representation diversity among Chinese experts, a Chinese expert name disambiguation approach based on spectral clustering with expert page-associated relationships is proposed. First, the TF-IDF algorithm is used to calculate word-based feature weights, and cosine similarity is then used to compute the similarity between evidence pages, giving the initial similarity matrix of expert evidence pages. Second, the expert page-associated relationship features are taken as semi-supervised constraint information to correct the initial similarity matrix, and a spectral clustering-based method is then used to build the expert disambiguation model. Finally, comparative experiments on a manually labeled corpus of Chinese expert evidence pages show that semi-supervised spectral clustering with the expert page-associated relationships achieves an average F-value increase of 9.02% over the same method without the associated constraint information.
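The first two steps of the abstract above (TF-IDF weights, cosine similarity, then constraint-based correction of the matrix) can be sketched as follows. This is an assumption-laden illustration: the abstract does not say how the constraints modify the matrix, so setting constrained pairs to 1.0 via a hypothetical `must_link` argument is one simple choice, and the spectral clustering step itself is omitted.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Weight each term by term frequency times log inverse document frequency."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(docs, must_link=()):
    """Initial cosine similarity matrix, corrected with pairwise constraints.
    Assumption: a must-link pair (same expert) is forced to similarity 1.0."""
    vecs = tfidf_vectors(docs)
    n = len(docs)
    sim = [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]
    for i, j in must_link:
        sim[i][j] = sim[j][i] = 1.0
    return sim

# Toy evidence pages for one shared name (hypothetical tokens).
pages = [
    ["yu", "kunming", "nlp"],
    ["yu", "kunming", "translation"],
    ["yu", "beijing", "physics"],
]
initial = similarity_matrix(pages)
corrected = similarity_matrix(pages, must_link=[(0, 2)])
```

The corrected matrix would then be passed to a clustering step; pages 0 and 2 share only the ambiguous name, so without the constraint their similarity is zero.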
International Symposium on Information Processing | 2010
Shao-min Zhang; Jianyi Guo; Zhengtao Yu; Chun-ya Lei; Cunli Mao; Hai-xiong Wang
Ontology is a research hotspot in knowledge engineering. Building an ontology-based domain knowledge base is an effective means of sharing domain knowledge. Manual construction requires a lot of repetitive work, which is inefficient and hard to migrate from one field to another. To solve these problems, this paper presents an approach to domain ontology construction based on a resource model and Jena. First, we use a top-down strategy to build the structure of the domain concepts, then discover the relations between concepts, instances and their properties, and axioms, and store them in documents as a resource model. Second, the algorithm presented in this paper is used to encode the ontology based on Jena. An experiment in the Yunnan tourism domain was carried out to demonstrate the validity of the approach.
Workshop on Chinese Lexical Semantics | 2013
Lei Li; Zhengtao Yu; Cunli Mao; Jianyi Guo
To address the loss of alignment information caused by the different syntactic structures of Chinese and the Naxi language, we put forward a template extraction method based on an improved dependency tree-to-string model, which improves the quality of translation templates. First, the method analyzes the dependency tree of the source language and the string of the target language and, according to the alignment relationships, maps the dependency tree of the source language to the string of the target language. Then, according to the dependency relationships of the words in the source language, we identify the aligned and unaligned words in the source language, and merge and extend the unaligned words into the aligned words. Finally, the translation templates are extracted recursively. We also carry out comparative experiments between the improved method and the former method. The experimental results show that the improved method performs better: accuracy increases by 5.14%. In the translation process, the method can effectively resolve the loss of unaligned words.
Computational Intelligence and Security | 2009
Cunli Mao; Lina Li; Zhengtao Yu; Lu Han; Jianyi Guo; Xiong-Li Lei
Domain knowledge has a direct impact on the results of question answering (Q&A) in a restricted-domain question answering (QA) system. In this paper, an answer extraction method for domain Chinese Q&A is proposed. Based on analysis of the interrogative sentence and the answer type, it performs text retrieval with the help of domain knowledge and obtains the paragraphs relevant to the question as candidate answers. For questions of numeral or list entity type, the domain entities related to the question center are extracted as answers using named entity recognition. For definition questions, the sentences or paragraphs with higher relevance are extracted as the answer, based on a relevance ranking between the candidate sentences or paragraphs and the question that combines keyword weighting with semantic similarity between sentences and questions. Experiments on answer extraction in the Yunnan tourism domain show that the proposed answer extraction method achieves good results.
International Conference on Machine Learning and Cybernetics | 2008
Zhengtao Yu; Lu Han; Cunli Mao; Jianyi Guo; Xiangyan Meng; Zhikun Zhang
Traditional text classification models use statistical methods to obtain features. However, when discriminating between domain and non-domain text categories, these methods do not take domain knowledge relations into account. A domain text classification model is presented in this paper. The model uses the support vector machine learning algorithm, obtains domain classification feature words by combining statistics with domain words, and constructs the domain classification feature space. With the help of domain knowledge relations, it computes the relevance between domain concepts to obtain the domain classification feature weights. Finally, domain text classification is realized. An experiment in the Yunnan tourism domain confirmed that domain knowledge relations have a positive influence on domain text classification. The classification accuracy increased by 0.04 compared with the improved TFIDF method.
International Symposium on Communications and Information Technologies | 2005
Zhengtao Yu; Zhiyun Zheng; Shengxiang Gao; Jianyi Guo
Personalized recommendation is implemented by computing the similarity between user interests and resources. Most current recommendation systems compute this similarity based on keywords, which is simple to implement but loses much semantic information. Taking personalized recommendation in a digital library as an example, this paper proposes a method for implementing a personalized recommendation service. User profiles and resource features are represented with the vector space model. These keyword vectors are then extended at the conceptual level by means of a domain ontology and the HowNet knowledge base, generating conceptual space vectors for users and resources. The personalized recommendation service is then provided to users according to the similarity of the conceptual space vectors. Experimental data show that concept-based similarity is more effective than keyword-based similarity in the personalized recommendation service.
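The contrast between keyword-based and concept-based similarity in the abstract above can be illustrated with a small sketch. The `CONCEPT_OF` table below is a hypothetical stand-in for the domain ontology and HowNet lookups, which are not reproduced here; the mapping, the keywords and the function names are all assumptions for the example.

```python
import math
from collections import Counter

# Hypothetical keyword-to-concept table standing in for an ontology lookup.
CONCEPT_OF = {
    "svm": "machine_learning",
    "neural": "machine_learning",
    "sql": "database",
    "index": "database",
}

def concept_vector(keywords, concept_of=CONCEPT_OF):
    """Lift a keyword list to concept space; unmapped keywords are kept as they are."""
    return Counter(concept_of.get(k, k) for k in keywords)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

user_keywords = ["svm", "sql"]
resource_keywords = ["neural", "index"]

keyword_sim = cosine(Counter(user_keywords), Counter(resource_keywords))
concept_sim = cosine(concept_vector(user_keywords), concept_vector(resource_keywords))
```

Here the user and the resource share no keywords, so the keyword-level similarity is zero, while both map onto the same two concepts and match closely in concept space.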
International Journal of Machine Learning and Cybernetics | 2017
Shengxiang Gao; Xiuzhen Yang; Zhengtao Yu; Xiao Pan; Jianyi Guo
To make effective use of the syntactic characteristics of the Naxi language, this paper proposes a Chinese-Naxi machine translation method based on a Naxi dependency language model. First, the method performs dependency parsing on the Chinese sentence and extracts Chinese-Naxi dependency tree-to-string translation templates. Second, it decodes the Naxi sentence using the templates and generates n-best candidate sentences. Third, it performs dependency parsing on each candidate Naxi sentence and obtains a node sequence corresponding to the dependency relationships. Finally, it builds a Naxi dependency language model and, with the help of the model, scores and reorders each node from the sequence and selects the final target sentence. Comparative experiments on the proposed method, the existing tree-to-string translation method, Bruin and Mo-tse show that the Chinese-Naxi machine translation method combined with the Naxi dependency language model can effectively improve translation accuracy, and its BLEU-2 score increases by 2 points compared with the method without the Naxi dependency language model.