Cunli Mao
Kunming University of Science and Technology
Publication
Featured research published by Cunli Mao.
Pattern Recognition Letters | 2010
Zhengtao Yu; Lei Su; Lina Li; Quan Zhao; Cunli Mao; Jianyi Guo
In statistical question classification, semi-supervised learning, which can exploit abundant unlabeled samples, has received substantial attention in recent years. In this paper, a novel question classification approach based on co-training style semi-supervised learning is proposed. In particular, the method extracts high-frequency keywords as classification features and uses word semantic similarity to adjust the feature weights. The classifiers are initially trained on labeled data, and the learned models are then refined using unlabeled data, which are labeled if the classifiers agree on the labeling. Experiments on a Chinese question answering system in the tourism domain were conducted with different feature selections, different supervised and semi-supervised algorithms, different feature dimensions and different unlabeled rates. The experimental results show that the proposed method can effectively improve classification accuracy. Specifically, with 40% of the training set unlabeled, the average accuracy reaches 88.9% on coarse types and 78.2% on fine types, an improvement of roughly 2 to 4 percentage points.
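The agreement-based labeling step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy keyword-vote classifier, the split of each question into two feature views, and the names `train`, `predict` and `co_train` are all assumptions made for the example, and the semantic-similarity weighting from the paper is left out.

```python
from collections import Counter, defaultdict


def train(samples):
    """Toy classifier (assumption): count how often each word occurs with each label."""
    counts = defaultdict(Counter)
    for words, label in samples:
        for w in words:
            counts[w][label] += 1
    return counts


def predict(model, words):
    """Return the label with the most keyword votes, or None if no word is known."""
    votes = Counter()
    for w in words:
        votes.update(model.get(w, {}))
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def co_train(labeled, unlabeled, rounds=5):
    """Co-training style loop over two hypothetical feature views.

    labeled:   list of ((view1_words, view2_words), label)
    unlabeled: list of (view1_words, view2_words)
    An unlabeled sample joins the training set only when both
    classifiers predict the same label for it.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        m1 = train([(v1, y) for (v1, _), y in labeled])
        m2 = train([(v2, y) for (_, v2), y in labeled])
        still_unlabeled = []
        for v1, v2 in pool:
            p1, p2 = predict(m1, v1), predict(m2, v2)
            if p1 is not None and p1 == p2:
                labeled.append(((v1, v2), p1))
            else:
                still_unlabeled.append((v1, v2))
        if len(still_unlabeled) == len(pool):
            break  # nothing new was labeled in this round
        pool = still_unlabeled
    # Retrain both classifiers on the enlarged training set.
    m1 = train([(v1, y) for (v1, _), y in labeled])
    m2 = train([(v2, y) for (_, v2), y in labeled])
    return m1, m2, labeled
```

In this sketch a sample that one classifier cannot label yet stays in the pool and may be picked up in a later round, once newly labeled samples have extended the other view's vocabulary.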
international symposium on information processing | 2010
Shao-min Zhang; Jianyi Guo; Zhengtao Yu; Chun-ya Lei; Cunli Mao; Hai-xiong Wang
Ontology is a research hotspot in knowledge engineering. Building an ontology-based domain knowledge base is an effective means of sharing domain knowledge. Manual construction requires a great deal of repetitive work, which is inefficient and hard to transfer from one field to another. To solve these problems, this paper presents an approach to domain ontology construction based on a resource model and Jena. First, we use a top-down strategy to build the structure of the domain concepts, then discover the relations between concepts, instances and their properties, and axioms, and store them in documents as a resource model. Second, the algorithm presented in this paper is used to encode the ontology with Jena. We conducted an experiment in the Yunnan tourism domain to demonstrate the validity of the approach.
workshop on chinese lexical semantics | 2013
Lei Li; Zhengtao Yu; Cunli Mao; Jianyi Guo
To address the loss of alignment information caused by the different syntactic structures of Chinese and the Naxi language, we put forward a template extraction method based on an improved dependency tree-to-string model, which improves the quality of the extracted translation templates. First, the method analyzes the dependency tree of the source language and the string of the target language and, according to the alignment relationship, maps the source dependency tree to the target string. Then, according to the dependency relationships among the source-language words, we identify the aligned and unaligned words in the source language, and merge and extend the unaligned words into the aligned words. Finally, we extract the translation templates recursively. We also ran comparative experiments on the improved and the original method. The experimental results show that the improved method performs better: accuracy increases by 5.14%. During translation, the method effectively addresses the loss of unaligned words.
computational intelligence and security | 2009
Cunli Mao; Lina Li; Zhengtao Yu; Lu Han; Jianyi Guo; Xiong-Li Lei
Domain knowledge has a direct impact on the results of question answering (QA) in restricted-domain QA systems. In this paper, a method of answer extraction for domain-specific Chinese question answering is proposed. Based on an analysis of the interrogative sentence and the answer type, the method performs text retrieval with the help of domain knowledge and obtains the paragraphs relevant to the question as candidate answers. For questions of numeral or list-entity type, the domain entities related to the question focus are extracted as answers by named entity recognition. For definition questions, the sentences or paragraphs with the highest relevance are extracted as the answer, based on a relevance ranking between the candidate sentences or paragraphs and the question that combines keyword weighting with the semantic similarity between sentences and questions. Experiments on answer extraction in the Yunnan tourism domain show that the proposed method achieves notable results.
international conference on machine learning and cybernetics | 2008
Zhengtao Yu; Lu Han; Cunli Mao; Jianyi Guo; Xiangyan Meng; Zhikun Zhang
Traditional text classification models use statistical methods to obtain features. However, when discriminating between domain and non-domain text categories, these methods have not taken domain knowledge relations into account. A domain text classification model is presented in this paper. The model uses the support vector machine learning algorithm, obtains domain classification feature words by combining statistics with domain words, and constructs the domain classification feature space. With the help of domain knowledge relations, it computes the relevance between domain concepts to obtain the domain classification feature weights, and finally performs domain text classification. An experiment in the Yunnan tourism domain confirms that domain knowledge relations have a positive influence on domain text classification: the classification accuracy is 0.04 higher than that of the improved TF-IDF method.
international conference on machine learning and cybernetics | 2007
Zhengtao Yu; Yan-Xia Qiu; Jin-Hui Deng; Lu Han; Cunli Mao; Xiangyan Meng
FAQ (frequently asked questions) is a good question-and-answer model for realizing a business advisory system in a restricted domain. A FAQ question answering system model is presented in this paper. Drawing on the idea of ontology, a knowledge base is constructed for the domain. Using KDML (Knowledge Database Mark-up Language) of HowNet, the domain ontology and its relationships are defined and described, and the domain knowledge base (domain HowNet) is fused with the common knowledge base (HowNet). On this basis, a question similarity calculation method is implemented that makes use of the characteristics of domain questions and combines the lexical relationships, syntactic dependency relationships and domain semantic relationships among question sentences. Based on this question similarity calculation, related questions are retrieved from the candidate question set and answers are extracted. The results of an experiment with the Yunnan tourism question-answer model show that this method is feasible and effective.
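A question similarity that combines several relationship scores, as described in this abstract, might look like the sketch below. The abstract does not say how the three scores are combined, so the weighted sum, the weights, the Jaccard overlap used for the lexical part, and the names `jaccard`, `question_similarity` and `best_match` are all assumptions. The syntactic and semantic scorers are passed in as placeholder callables because the HowNet-based measures are not specified here.

```python
def jaccard(a, b):
    """Lexical overlap between two token lists (stand-in for the lexical relationship)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def question_similarity(q1, q2, syntactic_sim, semantic_sim,
                        weights=(0.4, 0.3, 0.3)):
    """Weighted sum of lexical, syntactic and semantic similarity.

    syntactic_sim and semantic_sim are caller-supplied functions
    (q1, q2) -> float in [0, 1]; the weights are illustrative only.
    """
    w_lex, w_syn, w_sem = weights
    return (w_lex * jaccard(q1, q2)
            + w_syn * syntactic_sim(q1, q2)
            + w_sem * semantic_sim(q1, q2))


def best_match(query, faq, syntactic_sim, semantic_sim):
    """Return the (question_tokens, answer) pair most similar to the query."""
    return max(
        faq,
        key=lambda qa: question_similarity(query, qa[0],
                                           syntactic_sim, semantic_sim),
    )
```

Retrieval then reduces to scoring the user question against every candidate FAQ question and returning the answer attached to the highest-scoring one.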
Archive | 2012
Zejian Wu; Zhengtao Yu; Jianyi Guo; Cunli Mao; Youmin Zhang
To address the problem that existing methods for Chinese named entity recognition (NER) fail to consider long-distance dependencies, which are common in documents, this paper proposes a method for Chinese NER based on Markov Logic Networks (MLNs) that fuses long-distance dependencies. The method comprehensively uses local, short-distance dependency and long-distance dependency features, taking advantage of first-order logic to represent knowledge, and integrates all the features into a Markov network for Chinese NER with the help of MLNs. The validity of the proposed method is verified in both open and restricted domains, and the experimental results show that it achieves better results.
international conference natural language processing | 2011
Zhengtao Yu; Tao Zhang; Jianyi Guo; Cunli Mao; Jian Li
To build a Naxi-English bilingual word-aligned corpus, a Naxi-English bilingual word alignment method is proposed that targets the syntactic characteristics of the Naxi language. The method uses a log-linear model and introduces feature functions based on the characteristics of the Naxi language, namely an English-Naxi interval switching function and a Naxi-English bilingual word position transformation function. Using a manually labeled Naxi-English word alignment corpus, the parameters of the model are trained under a minimum error criterion. The Naxi-English bilingual words are then aligned automatically by this model. The experiments take IBM Model 3 as the baseline and gradually add constraints based on the characteristics of the Naxi language on top of it. The final experimental results show that Naxi-English bilingual word alignment accuracy can be improved significantly with feature functions based on the characteristics of Naxi.
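For reference, a log-linear alignment model of the kind this abstract mentions is usually written in the generic form below. The notation is an assumption, not taken from the paper: $n$ is the Naxi sentence, $e$ the English sentence, $a$ a candidate alignment, $h_k$ the feature functions (such as the two Naxi-specific functions named above), and $\lambda_k$ their trained weights.

```latex
P(a \mid n, e) =
  \frac{\exp\left(\sum_{k=1}^{K} \lambda_k \, h_k(a, n, e)\right)}
       {\sum_{a'} \exp\left(\sum_{k=1}^{K} \lambda_k \, h_k(a', n, e)\right)},
\qquad
\hat{a} = \arg\max_{a} \sum_{k=1}^{K} \lambda_k \, h_k(a, n, e)
```

Because the denominator does not depend on $a$, the best alignment $\hat{a}$ can be found from the weighted feature sum alone.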
workshop on chinese lexical semantics | 2013
Xiuzhen Yang; Zhengtao Yu; Jianyi Guo; Xiao Pan; Cunli Mao
This paper proposes a Naxi-Chinese bilingual word alignment method based on an entity constraint, exploiting the property that entities align with entities in bilingual alignment. First, we annotate a corpus with Naxi word segmentation and entity labels, and use Conditional Random Fields to build a model for Naxi word segmentation and entity recognition. Then, using the entity alignment constraint between Naxi and Chinese sentences, we mark the bilingual entities in each Naxi-Chinese sentence pair with a specific marker. Finally, we introduce features such as fertility probabilities and distortion probabilities, and build the Naxi-Chinese bilingual alignment model with GIZA++. Naxi-Chinese bilingual word alignment experiments show that the proposed method performs better than using the IBM Models alone.
international symposium on information processing | 2010
Chun-ya Lei; Jianyi Guo; Zhengtao Yu; Shao-min Zhang; Cunli Mao; Chao-shen Zhang
To address the difficulty of automatic entity relation extraction in a specific field, this paper proposes a method that applies binary classification combined with reasoning rules to extract field-specific entity relations. The feature set is constructed by comprehensively considering the context information of the entities, the entity types and combinations of these characteristics, and is used to build a binary classifier for entity relation extraction. Then, taking full advantage of the field-specific characteristics of entity relations, reasoning rules are applied to obtain the type of each entity relation. Experiments on a manually collected tourism corpus of 600 texts show that the binary classifier combined with reasoning outperforms multiple classifiers, improving the F-score by 3%.