Publication


Featured research published by Yixin Zhong.


international conference on machine learning and cybernetics | 2008

A survey on learning to rank

Chuan He; Cong Wang; Yixin Zhong; Rui-Fan Li

Ranking is a key problem in information retrieval and other text applications. Recently, ranking methods based on machine learning, known as learning to rank, have become a focus for researchers and practitioners. The main idea of these methods is to apply existing, effective machine learning algorithms to ranking. However, as a learning problem, ranking differs from classical problems such as classification and regression. In this paper, we survey the important papers in this direction, analyze the pros and cons of recently proposed frameworks and algorithms for ranking, and discuss the relationships among them. Finally, we point out promising directions for practice.
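
One way to see how ranking differs from plain classification is that supervision is relative within a query rather than absolute per document. The sketch below is a hypothetical illustration, not a method taken from the survey: it assumes a pairwise formulation, and the function name `pairwise_examples` and the toy feature vectors are invented for the example. It turns graded documents for a single query into preference pairs that an ordinary binary classifier could consume.

```python
from itertools import combinations


def pairwise_examples(docs):
    """Turn graded documents for ONE query into pairwise preference examples.

    docs: list of (feature_vector, relevance_grade) tuples.
    Returns a list of (difference_vector, label), where label is +1 when the
    first document of the pair should rank higher and -1 otherwise.
    """
    examples = []
    for (xa, ya), (xb, yb) in combinations(docs, 2):
        if ya == yb:
            continue  # equal grades express no preference, so skip the pair
        diff = [a - b for a, b in zip(xa, xb)]
        examples.append((diff, 1 if ya > yb else -1))
    return examples


# Toy data (assumed): three documents for one query, grades 2, 1, 1.
docs = [([1.0, 0.2], 2), ([0.5, 0.1], 1), ([0.4, 0.9], 1)]
pairs = pairwise_examples(docs)
labels = [label for _, label in pairs]
```

Only two pairs survive here, because the two documents with grade 1 are tied and carry no ordering information.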


international conference natural language processing | 2007

Semantic Role Labeling for multi-VP clauses in Chinese

Jie Cai; Caixia Yuan; Xiaojie Wang; Yixin Zhong

We have built a semantic role labeling (SRL) system for Chinese clauses that uses selected information about the main predicate of each clause and the relevant functional slots. We take the system's performance on simple clauses as its baseline. For more complex clauses (which we define as clauses with more than one verb phrase), it is difficult to identify the main predicate and the larger functional chunks, and the final labeling accuracy drops sharply as a result. In this paper we propose a novel method that replaces the automatic chunker for clauses with more than one verb phrase (multi-VP). Multi-VP clauses are transformed into 1-VP forms, which our SRL system handles easily. We show that, after transformation, multi-VP clauses can be labeled with almost the same accuracy as the baseline for 1-VP clauses. Beyond semantic role labeling, the pattern-transformation method is also promising for fields where a syntactic clause-pattern description would help but resources with full syntactic parsing are not available.


international conference natural language processing | 2008

Dimensionality reduction for text using LLE

Chuan He; Zhe Dong; Ruifan Li; Yixin Zhong

Dimensionality reduction is a necessary preprocessing step in many fields of information processing, such as information retrieval, pattern recognition, and data compression. Its goal is to discover the representative or discriminative information residing in raw data. Locally linear embedding (LLE), an effective manifold learning algorithm, addresses this problem by computing low-dimensional, neighborhood-preserving embeddings of high-dimensional data. The embedding is derived from the symmetries of locally linear reconstructions, and in implementation its computation is related to an eigenproblem. Since LLE was proposed, it has been applied only to image data, because it originated in facial recognition. However, the curse of dimensionality is a very prevalent problem, so here we apply the algorithm to text processing. In this paper, we briefly introduce LLE, analyze its advantages and potential disadvantages, and then discuss the relationship between LSI and LLE within the graph embedding framework from a theoretical viewpoint. Finally, experimental results are shown on the Reuters21578 and TDT2 datasets.
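
For readers unfamiliar with LLE, the two steps mentioned above (locally linear reconstruction, then an eigenproblem) can be written out as follows. This is the standard formulation from the general LLE literature, not text from the paper itself. The notation is chosen here for illustration: x_i are the high-dimensional inputs, N(i) is the neighbor set of x_i, y_i are the low-dimensional outputs, and Y is the matrix whose rows are the y_i.

```latex
% Step 1: reconstruct each point from its neighbors.
% W_{ij} = 0 whenever j is not in N(i).
\min_{W} \; \varepsilon(W) = \sum_{i} \Bigl\| x_i - \sum_{j \in N(i)} W_{ij}\, x_j \Bigr\|^2
\quad \text{subject to} \quad \sum_{j} W_{ij} = 1 \;\; \text{for every } i

% Step 2: keep W fixed and find the embedding that preserves the same weights.
\min_{Y} \; \Phi(Y) = \sum_{i} \Bigl\| y_i - \sum_{j} W_{ij}\, y_j \Bigr\|^2
= \operatorname{tr}\bigl( Y^{\top} M Y \bigr),
\qquad M = (I - W)^{\top} (I - W)
```

The sum-to-one constraint makes the weights invariant to translations, rotations, and rescalings of each neighborhood, which is the symmetry the embedding relies on. With the usual centering and unit-covariance constraints on Y, the minimizer is given by the eigenvectors of M with the smallest eigenvalues, after discarding the constant eigenvector whose eigenvalue is zero.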


international conference natural language processing | 2008

Semantic keywords-based duplicated web pages removing

Yunhe Weng; Lei Li; Yixin Zhong

Because many duplicated web pages exist on the web, search engines need to find and remove them, both to save processing time and hardware resources and to ensure that users get results without many replicas. In this paper, we propose a method for finding and removing duplicated Chinese web pages for search engines. We first describe a scheme based on semantic keywords combined with sentence overlap, and then present an implemented prototype, with experimental results suggesting that the prototype works well under a proper setting.
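
The abstract does not spell out the scheme, so the following is only a minimal sketch of how a keyword check combined with sentence overlap could be wired together. Everything in it is an assumption made for illustration: the Jaccard measure, the two thresholds, the page dictionary layout, and all function names. English toy text is used for readability.

```python
import re


def jaccard(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sentences(text):
    """Split text into a set of stripped sentences (Latin and CJK punctuation)."""
    parts = re.split(r"[.!?。！？]+", text)
    return {p.strip() for p in parts if p.strip()}


def sentence_overlap(text_a, text_b):
    """Share of the shorter page's sentences that also occur in the other page."""
    sa, sb = sentences(text_a), sentences(text_b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


def is_duplicate(page_a, page_b, kw_threshold=0.5, sent_threshold=0.8):
    """Cheap keyword filter first, then the stricter sentence-level check.

    Both thresholds are hypothetical defaults, not values from the paper.
    """
    if jaccard(page_a["keywords"], page_b["keywords"]) < kw_threshold:
        return False
    return sentence_overlap(page_a["text"], page_b["text"]) >= sent_threshold


page_a = {"keywords": ["search", "engine", "duplicate"],
          "text": "Search engines crawl pages. Duplicates waste space. Remove them."}
page_b = {"keywords": ["search", "engine", "duplicate", "web"],
          "text": "Duplicates waste space. Search engines crawl pages. Remove them. Visit our site"}
page_c = {"keywords": ["football", "score"],
          "text": "The match ended two to one."}
```

Running the keyword comparison first keeps the more expensive sentence comparison off page pairs that clearly cover different topics.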


international conference natural language processing | 2007

Identification of Noun Phrase with Various Granularities

Ying Qin; Xiaojie Wang; Yixin Zhong

Since noun phrases are the most common phrases in text, noun phrase identification is a vital subtask of natural language processing. Chinese noun phrases generally have hierarchical inner structures. This paper proposes an approach that defines several levels of granularity for noun phrases to cater for different application demands. Three levels are proposed: concept noun phrase, base noun phrase, and entire noun phrase. The task of noun phrase identification is to label word sequences with phrase tags. Identification at every granularity is cast as a classification problem under certain encoding schemes. The experimental dataset is derived empirically from Chinese Penn Treebank 5.1. The F-measure for concept, base, and entire noun phrase identification reaches 92.12%, 84.13%, and 85.32%, respectively.
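
The abstract does not say which encoding schemes were used. As a hedged illustration of how phrase identification becomes per-word classification, the sketch below assumes the common BIO scheme (B = begins a phrase, I = inside a phrase, O = outside). The function name `spans_to_bio` and the English toy sentence are hypothetical.

```python
def spans_to_bio(n_tokens, spans, label="NP"):
    """Encode phrase spans as one BIO tag per token.

    spans: list of (start, end) token indices, end exclusive, non-overlapping.
    """
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = f"B-{label}"      # first token of the phrase
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the phrase
    return tags


tokens = ["the", "red", "car", "hit", "a", "wall"]
np_spans = [(0, 3), (4, 6)]             # "the red car" and "a wall"
tags = spans_to_bio(len(tokens), np_spans)
```

Once every token carries a tag, each granularity level can be trained as an ordinary classifier that predicts one tag per word.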


international conference natural language processing | 2003

N-best speech hypothesis reordering based on comprehensive information theory

Jianyi Liu; Yixin Zhong

This paper proposes a hypothesis reordering technique, based on a newly established theory called comprehensive information theory, to improve the accuracy of speech recognition in a man-machine dialog system. For each hypothesis, we calculate the amount of comprehensive information that the hypothesis provides and then reorder the N-best hypotheses according to that amount. Experimental results show its effectiveness.
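
The reordering step itself is simple once a score is available. The sketch below shows only that step. The abstract does not define how the amount of comprehensive information is computed, so `toy_score` (a count of in-domain words) is a stand-in assumption, and `DOMAIN_WORDS` and the sample hypotheses are invented for the example.

```python
def reorder_nbest(hypotheses, info_score):
    """Return hypotheses sorted by descending score; ties keep recognizer order."""
    return sorted(hypotheses, key=info_score, reverse=True)


# Placeholder scoring: NOT the paper's measure, just a stand-in so the
# reordering can be demonstrated end to end.
DOMAIN_WORDS = {"book", "flight", "beijing"}


def toy_score(hypothesis):
    return sum(1 for word in hypothesis.split() if word in DOMAIN_WORDS)


nbest = [
    "look a fright to beijing",
    "book a flight to beijing",
    "book a fight to paging",
]
reordered = reorder_nbest(nbest, toy_score)
```

Because Python's sort is stable, hypotheses with equal scores stay in the order the recognizer produced them.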


international conference natural language processing | 2008

IE based SMS semantic orientation recognition

Lei Li; Yonggui Yang; Yixin Zhong

This paper is an exploratory study on applying information extraction (IE) technology to information content security, focusing on semantic orientation recognition for short message service (SMS) messages. We introduce an experimental SMS monitoring system in which rule-based and hidden Markov model (HMM) based IE patterns are integrated to recognize the semantic orientation of Chinese SMS in two specific domains. Test results show that the performance is good and the method is feasible and promising.


international conference natural language processing | 2007

Orientation Identification for Chinese Short Text

Shen-zheng Zuo; Yanquan Zhou; Yixin Zhong

With the rapid development of information technology, huge amounts of data have accumulated, and a vast share of these data appears as short text. Orientation identification for short text is very useful. However, traditional statistics-based text filtering technology is usually ineffective at handling orientation, especially for Chinese short text. This paper proposes a novel method for Chinese short text orientation identification that simulates human cognition. The approach makes full use of domain knowledge, combines a tendency dictionary with semantic rules, and takes into account the sentiment orientation of words, which is estimated by a Naive Bayesian model. Experiments show that the proposed method works well for orientation identification of Chinese short text.


international conference natural language processing | 2003

Online classifiers for Chinese text classification and filtering

Yanhui Guo; Jianyi Liu; Cong Wang; Yixin Zhong

In text classification systems, the quality and quantity of the training documents is one of the most important factors affecting performance. However, gathering, filtering, and classifying training documents is very difficult. This paper employs a feedback mechanism for text classification systems and reduces the need for labeled training documents by unifying the strengths of k-NN and linear classifiers. It collects various types of example documents provided by users and sets up user profiles. As documents from the stream are matched against the profiles, the relevant documents are used to improve the user profile. The online approach offers the advantage of continuous learning on the batch-adaptive filtering task.


international conference natural language processing | 2005

Chinese pronominal coreference resolution using decision tree plus filter rules

Zhiqiang Wang; Lei Li; Ruifan Li; Yixin Zhong

With the advancement of natural language understanding (NLU), coreference resolution, as a main part of information extraction (IE), has received much more attention. For this task, neither rule-based nor statistics-based methods alone can meet the needs of large-scale text processing. An integrated method based on decision trees for Chinese pronominal coreference is proposed. The basic idea is that filter rules can, to some degree, compensate for the decision tree's drawback of ignoring the relationships between attributes. The performance of the proposed method is evaluated on the Chinese Treebank. In our experiments, the attributes and coreferences are manually labeled, and the filter rules are then applied to the feature vectors following the C4.5 decision tree. The success rate is 82.59%, with rates of 87.60% and 75.21% for personal pronouns and demonstrative pronouns, respectively.
