Mansuk Song
Yonsei University
Publication
Featured research published by Mansuk Song.
Natural Language Engineering | 2001
Juntae Yoon; Key-Sun Choi; Mansuk Song
The syntactic structure of a nominal compound must be analyzed before the compound can be interpreted semantically. The syntactic analysis of nominal compounds is also very useful for NLP applications such as information extraction, since a nominal compound often has a linguistic structure similar to that of a simple sentence and expresses the concrete, composite meaning of an object by combining several nouns. In this paper, we present a novel model for the structural analysis of nominal compounds that couples linguistic and statistical knowledge through lexical information. That is, the syntactic relations defined between nouns (complement-predicate and modifier-head relations) are obtained from large corpora and then used to analyze the structures of nominal compounds and identify the underlying relations between nouns. Experiments show that the model gives good results and can be used effectively in application systems that do not require deep semantic information.
Computers and The Humanities | 2001
Seonho Kim; Juntae Yoon; Mansuk Song
In this paper, we propose a statistical method to automatically extract collocations from a Korean POS-tagged corpus. Since a large portion of language is represented by collocation patterns, collocational knowledge provides a valuable resource for NLP applications. One difficulty of collocation extraction is that Korean has a partially free word order, which also appears in collocations. In this work, we exploit four statistics, ‘frequency’, ‘randomness’, ‘convergence’, and ‘correlation’, in order to take into account the flexible word order of Korean collocations. We separate meaningful bigrams using an evaluation function based on the four statistics and extend the bigrams to n-gram collocations using a fuzzy relation. Experiments show that this method works well for Korean collocations.
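To illustrate why flexible word order matters when scoring candidate bigrams, here is a minimal sketch that counts co-occurrences as unordered pairs within a small window and scores them with pointwise mutual information (PMI). PMI is a standard association measure swapped in for illustration; it is not one of the four statistics named in the abstract, and the function name `order_free_pmi`, the `window` parameter, and the toy tokens are assumptions.

```python
import math
from collections import Counter


def order_free_pmi(sentences, window=3):
    """Score unordered word pairs by PMI (illustrative stand-in measure).

    `sentences` is a list of token lists. A pair is counted whenever two
    different tokens occur within `window` positions of each other, and
    (a, b) and (b, a) are merged by storing the pair as a frozenset.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    pair_counts[frozenset((left, right))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for pair, n in pair_counts.items():
        a, b = tuple(pair)
        p_pair = n / total_pairs
        p_a = word_counts[a] / total_words
        p_b = word_counts[b] / total_words
        scores[pair] = math.log2(p_pair / (p_a * p_b))
    return scores


if __name__ == "__main__":
    # Toy data: "x y" and "y x" are treated as the same candidate pair.
    toy = [["x", "y", "z"], ["y", "x", "w"]]
    for pair, score in order_free_pmi(toy, window=1).items():
        print(sorted(pair), round(score, 3))
```

Storing each pair as a `frozenset` is the design choice that makes the count insensitive to order; a method aimed at a real corpus would also need the POS tags and frequency thresholds that this sketch omits.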
Journal of Applied Mathematics and Computing | 1997
KyoungJoong Kim; Mansuk Song
An algorithm to get an optimal choice for the number of symmetric quadrature points is given to find symmetric quadrature formulas over a unit disk with a minimal number of points even when a high degree of polynomial precision is required. The symmetric quadrature formulas for numerical integration over a unit disk of complete polynomial functions up to degree 19 are presented.
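As a small illustration of what polynomial precision means for a symmetric quadrature formula over the unit disk, the sketch below checks a well-known four-point rule (points at distance 1/√2 from the origin on the coordinate axes, each with weight π/4) against exact monomial integrals. This rule is a textbook example chosen for brevity, not one of the formulas from the paper, and the function names are assumptions.

```python
import math


def disk_monomial_integral(a, b):
    """Exact integral of x**a * y**b over the unit disk."""
    if a % 2 or b % 2:
        return 0.0  # odd powers cancel by symmetry
    return (2.0 * math.gamma((a + 1) / 2) * math.gamma((b + 1) / 2)
            / ((a + b + 2) * math.gamma((a + b) / 2 + 1)))


# Four-point symmetric rule: nodes on the axes, equal weights.
R = 1 / math.sqrt(2)
POINTS = [(R, 0.0), (-R, 0.0), (0.0, R), (0.0, -R)]
WEIGHTS = [math.pi / 4] * 4


def quadrature(f):
    """Apply the rule to a function f(x, y)."""
    return sum(w * f(x, y) for w, (x, y) in zip(WEIGHTS, POINTS))


def polynomial_precision(max_degree=10, tol=1e-12):
    """Highest total degree d such that all monomials up to d are exact."""
    degree = -1
    for d in range(max_degree + 1):
        for a in range(d + 1):
            b = d - a
            approx = quadrature(lambda x, y: x**a * y**b)
            if abs(approx - disk_monomial_integral(a, b)) > tol:
                return degree
        degree = d
    return degree


if __name__ == "__main__":
    print(polynomial_precision())  # prints 3
```

The rule fails first on the monomial x²y², whose exact integral is π/24 while every node gives zero, so its precision is degree 3. Reaching a degree such as 19 with few points requires choosing the number and placement of symmetric point orbits carefully, which is the problem the abstract addresses.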
international conference on computational linguistics | 2000
Seonho Kim; Juntae Yoon; Mansuk Song
When aligning texts in very different languages such as Korean and English, structural features beyond the word or phrase level give useful information. In this paper, we present a method for selecting structural features of two languages, from which we construct a model that assigns conditional probabilities to corresponding tag sequences in bilingual English-Korean corpora. For tag sequence mapping between two languages, we first define a structural feature function which represents statistical properties of the empirical distribution of a set of training samples. The system, based on the maximum entropy concept, selects only features that produce large increases in the log-likelihood of the training samples. These structurally mapped features provide more informative knowledge for statistical machine translation between English and Korean. The information can also help to reduce the parameter space of statistical alignment by eliminating syntactically unlikely alignments.
international conference on computational linguistics | 2000
Juntae Yoon; Yoonkwan Kim; Mansuk Song
Accurate analysis of temporal expressions is crucial for Korean text processing applications such as information extraction and chunking for efficient syntactic analysis. It is a complicated problem since temporal expressions often have ambiguous syntactic roles. This paper discusses two problems: (1) representing and identifying the temporal expression, and (2) distinguishing the syntactic function of the temporal expression when it has a dual syntactic role. In this paper, temporal expressions and the context used for disambiguation, called the local context, are represented using lexical data extracted from a corpus and a finite state transducer. Experiments show that the method is effective for temporal expression analysis. In particular, our approach shows that corpus-based work can produce promising results for the problem in a restricted domain, in that we can effectively deal with a large amount of lexical data.
Archive | 2000
Juntae Yoon; Seonho Kim; Mansuk Song
This chapter presents a new parsing method using statistical information extracted from a corpus, especially for Korean. In Korean, structural ambiguities occur in the dependency relations between words. Lexical associations play an important role in determining the correct dependency. Our parser uses statistical co-occurrence data to compute lexical associations. We show that sentences can be parsed deterministically by means of global management of the associations. The global association table (GAT) is defined, and we describe how the associations between words are recorded in the GAT. We present a hybrid semi-deterministic parser that is controlled by the association values between phrases. Whenever the expectation of the parser fails, it chooses an alternative using a chart to avoid backtracking.
international conference on the computer processing of oriental languages | 1999
Juntae Yoon; Key-Sun Choi; Mansuk Song
empirical methods in natural language processing | 1999
Seonho Kim; Zooil Yang; Mansuk Song; Jung-Ho Ahn
international workshop/conference on parsing technologies | 1997
Juntae Yoon; Mansuk Song; Seonho Kim
empirical methods in natural language processing | 1999
Juntae Yoon; Key-Sun Choi; Mansuk Song