Dongmo Zhang
Shanghai Jiao Tong University
Publication
Featured research published by Dongmo Zhang.
international conference on machine learning and cybernetics | 2002
Dongmo Zhang; Huanye Sheng; Fang Li; Tianfang Yao
A multilingual natural language interface to databases (NLIDB) is a primary component of a multilingual information retrieval system. This paper presents a case-based multilingual NLIDB model, motivated by the idea of case-based reasoning in machine learning. The model avoids the difficulty of constructing a parser for every supported natural language by storing each query pattern and its solution as a case in a casebase. A query sentence entered by the user is compared syntactically with the cases in the casebase, and the solution of the most similar case is reused to query the database. Each case is represented as an XML document fragment, and the casebase is a valid XML document. The facilities provided by XML greatly enhance the maintainability and scalability of the model. The model has been implemented in a multilingual NLIDB for a stock market information retrieval system.
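As a rough illustration of the retrieval step described in the abstract above, the following Python sketch stores cases in an XML casebase and reuses the solution of the most similar case. Everything specific here is an assumption made for illustration: the element names (`casebase`, `case`, `pattern`, `solution`), the SQL templates, and the token-level similarity measure (Python's `difflib.SequenceMatcher`) do not come from the paper, which does not specify them in its abstract.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Hypothetical casebase: element names, attributes, and SQL are illustrative only.
CASEBASE_XML = """
<casebase>
  <case lang="en">
    <pattern>what is the price of STOCK</pattern>
    <solution>SELECT price FROM quotes WHERE name = :stock</solution>
  </case>
  <case lang="en">
    <pattern>show the volume of STOCK</pattern>
    <solution>SELECT volume FROM quotes WHERE name = :stock</solution>
  </case>
  <case lang="de">
    <pattern>wie hoch ist der kurs von STOCK</pattern>
    <solution>SELECT price FROM quotes WHERE name = :stock</solution>
  </case>
</casebase>
"""


def load_cases(xml_text):
    """Parse the casebase into (pattern_tokens, solution) pairs."""
    root = ET.fromstring(xml_text)
    return [
        (case.findtext("pattern").split(), case.findtext("solution"))
        for case in root.findall("case")
    ]


def similarity(tokens_a, tokens_b):
    """Token-sequence similarity in [0, 1]; a stand-in for the paper's measure."""
    return SequenceMatcher(None, tokens_a, tokens_b).ratio()


def retrieve(query, cases):
    """Return the solution of the case whose pattern is most similar to the query."""
    tokens = query.lower().split()
    _, solution = max(cases, key=lambda case: similarity(tokens, case[0]))
    return solution


if __name__ == "__main__":
    cases = load_cases(CASEBASE_XML)
    print(retrieve("What is the price of IBM", cases))
```

Because no language-specific parser is involved, supporting another language in this sketch only requires adding more `case` elements to the XML document.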
international conference natural language processing | 2003
Yi Zhang; Dongmo Zhang
We present a logic-based approach to answer validation in Chinese question answering (QA). The idea of logic form (LF) representation has been used successfully in English QA. Our work extends the LF representation to Chinese. A rule-based logic form transformation (LFT) algorithm is introduced and implemented. Lexical knowledge extraction from HowNet for logic proving is discussed. The answer validation algorithm based on LFT is illustrated. An experiment on LFT is performed and reported. Finally, the possibility of cross-lingual QA based on LFT is considered.
international conference on natural computation | 2005
Yan Dang; Yulei Zhang; Dongmo Zhang; Liping Zhao
This paper presents a novel approach to high-precision biological species categorization based mainly on the KNN algorithm. KNN has been used successfully in natural language processing (NLP). Our work extends the learning method to biological data. We view the DNA or RNA sequences of a species as special natural language texts. The approach for constructing composition vectors of DNA and RNA sequences is described. A learning method based on the KNN algorithm is proposed. An experimental system for biological species categorization is implemented. Forty-three bacterial organisms selected at random from EMBL are used for evaluation. Preliminary experiments show promising precision.
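The idea of treating sequences as text and classifying them with KNN can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' method: the composition vector is taken here to be the normalised frequency of overlapping k-mers over the alphabet ACGT, the distance is Euclidean, and the toy sequences and labels are invented for illustration.

```python
import math
from collections import Counter
from itertools import product


def composition_vector(seq, k=2):
    """Normalised frequencies of overlapping k-mers over ACGT (assumed definition)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def knn_predict(query_seq, training, n_neighbors=3, k=2):
    """Label a sequence by majority vote among its nearest training sequences."""
    q = composition_vector(query_seq, k)
    ranked = sorted(
        training,
        key=lambda item: math.dist(q, composition_vector(item[0], k)),
    )
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical toy data; real input would be full genome or gene sequences.
TRAINING = [
    ("ATATATATAT", "at_rich"),
    ("AATTAATTAA", "at_rich"),
    ("TATATTATAA", "at_rich"),
    ("GCGCGCGCGC", "gc_rich"),
    ("GGCCGGCCGG", "gc_rich"),
    ("CGCGGCGCCG", "gc_rich"),
]

if __name__ == "__main__":
    print(knn_predict("ATTATATAAT", TRAINING))
```

For RNA input, the same sketch would apply with U in place of T in the alphabet.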
international conference on machine learning and cybernetics | 2005
Yan Dang; Yulei Zhang; Dongmo Zhang
Predicting the secondary structure of RNA molecules from the primary structure (the sequence of bases) is still a challenging task. This paper presents a novel, efficient statistical parser for predicting RNA secondary structure. The parser is based on the Role Inverse algorithm, which combines the advantages of the traditional chart algorithm and the LR algorithm, saving both space and time. The Role Inverse algorithm is revised into a probabilistic version to implement the parser. A probabilistic context-free grammar (PCFG) is then established according to the base-pairing structure to predict RNA secondary structure.
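To show how a PCFG built around base pairing can yield a secondary structure, here is a small sketch. It does not implement the Role Inverse algorithm from the abstract above; it swaps in a plain memoised, chart-style Viterbi search. The three-rule grammar, the rule probabilities, and the restriction to Watson-Crick pairs are all assumptions chosen for illustration.

```python
import math
from functools import lru_cache

# Watson-Crick pairs only (assumption; G-U wobble pairs are ignored here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# Hypothetical toy grammar and rule probabilities (they sum to 1):
#   S -> x S y S   where x pairs with y
#   S -> x S       where x is unpaired
#   S -> epsilon
P_PAIR, P_UNPAIRED, P_END = 0.6, 0.2, 0.2


def predict_structure(seq):
    """Return (log-probability, dot-bracket string) of the most probable parse."""

    @lru_cache(maxsize=None)
    def best(i, j):
        # Most probable derivation of seq[i:j] from S.
        if i == j:
            return math.log(P_END), ""
        # Rule S -> x S: leave base i unpaired.
        lp, structure = best(i + 1, j)
        best_lp, best_st = math.log(P_UNPAIRED) + lp, "." + structure
        # Rule S -> x S y S: pair base i with some base k.
        for k in range(i + 1, j):
            if (seq[i], seq[k]) in PAIRS:
                lp_in, st_in = best(i + 1, k)
                lp_out, st_out = best(k + 1, j)
                cand = math.log(P_PAIR) + lp_in + lp_out
                if cand > best_lp:
                    best_lp, best_st = cand, "(" + st_in + ")" + st_out
        return best_lp, best_st

    return best(0, len(seq))


if __name__ == "__main__":
    log_prob, structure = predict_structure("GGGAAACCC")
    print(structure)  # prints (((...)))
```

The sketch enforces no minimum hairpin loop length and uses recursion, so it is suitable only for short sequences; a trained grammar would also carry base-specific emission probabilities.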
international conference natural language processing | 2003
Wei Hu; Dongmo Zhang
In this paper, we propose a cluster-based, brute-correcting method for learning grammatical rules, which draws on some conclusions from cognitive linguistics. First, instances of a grammatical category are mapped to graphic vectors, and a distance between two vectors is defined. The set of vectors and the defined distance are proved to form a distance space. Next, this space is mapped to Euclidean space and a simple clustering algorithm is applied to obtain clusters. Then, grammatical rules are learned to describe each cluster. Finally, a brute-correcting process helps to refine the rules. After describing the method, we informally compare the brute-correcting process with Eric Brill's transformation-based learning approach [E. Brill, 1995] and present an application in Chinese named entity recognition.
web information systems engineering | 2004
Wei Hu; Dongmo Zhang; Huan-ye Sheng
The task of related news detection is to find news articles that discuss events reported in earlier articles. In this paper, the notion of an “event” in news is extended to a “vague event”, and a news article is represented as a vector of vague event trees. An approach to vague event-based related news detection is then presented, and an experiment on Chinese sports news detection is designed.
international conference on machine learning and cybernetics | 2003
Yi Zhang; Dongmo Zhang
This paper presents a novel high-precision answer extraction algorithm for a Chinese natural language question answering system. Dependency structure (DS) is introduced to provide more syntactic information. A hybrid approach for constructing the DS from statistical parsing results and rule-based headword detection is described. An answer extraction algorithm based on the similarity of dependency structures is proposed. An experimental system for Chinese question answering is implemented. A TREC-like Chinese QA test set is built for evaluation. Preliminary experimental results show a promising improvement in precision.
international conference natural language processing | 2003
Xinhua Mao; Dongmo Zhang
The principles and techniques in analyzing rhetorical relations are discussed, and RR-based means and the process of answer extraction and formulation in the Chinese QA system are introduced. We have implemented an experimental system in Java. Experimental results are given in detail.
Archive | 2002
Fang Li; Huanye Sheng; Dongmo Zhang; Tianfang Yao
Electronic information grows rapidly as the Internet becomes widely used in daily life, and people can easily obtain information from it. To identify key points in investment news automatically, a multilingual investment information extraction system based on templates and patterns has been implemented. The system consists of three parts: user query processing, extraction based on templates and patterns, and dynamic acquisition. It features uniform processing for different languages and combines predefined templates with dynamically generated templates. The system currently processes queries in Chinese, English, and German and extracts Chinese investment news from the Internet. German and English investment news will be added in the future.
Lecture Notes in Computer Science | 2002
Fang Li; Huanye Sheng; Dongmo Zhang