Dong-Yul Ra
Yonsei University
Publication
Featured research published by Dong-Yul Ra.
Information Processing and Management | 2005
Eui-Kyu Park; Dong-Yul Ra; Myung-Gil Jang
This paper describes several schemes for improving retrieval effectiveness in the named page finding task of web information retrieval (Overview of the TREC-2002 web track. In: Proceedings of the Eleventh Text Retrieval Conference TREC-2002, NIST Special Publication #500-251, 2003). These methods were applied on top of the basic information retrieval model as additional mechanisms to improve the system. Use of the titles of web pages was found to be effective. It was confirmed that the anchor texts of incoming links were beneficial, as suggested in other work. Sentence-query similarity is a new type of information proposed by us and was identified as the most useful information to exploit. Stratifying and re-ranking the retrieval list based on the maximum count of index terms shared between a sentence and the query resulted in a significant improvement in performance. To demonstrate these results, a large-scale web information retrieval system was developed and used for experimentation.
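As a rough illustration of the sentence-query similarity idea described above, the hedged sketch below stratifies a retrieval list by the maximum number of index terms any single sentence of a document shares with the query, keeping the basic model's order within each stratum. The tokenization, function names, and document structure are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch: stratify and re-rank a retrieval list by the maximum number
# of index terms a single sentence shares with the query. Names such as
# rerank_by_sentence_overlap and doc["sentences"] are illustrative only.

def term_set(text):
    """Very rough index-term extraction: lowercase whitespace tokens."""
    return set(text.lower().split())

def rerank_by_sentence_overlap(query, ranked_docs):
    """ranked_docs: list of dicts with a "sentences" field, best-first."""
    q_terms = term_set(query)

    def max_overlap(doc):
        return max((len(q_terms & term_set(s)) for s in doc["sentences"]),
                   default=0)

    # Stable sort: ties keep the basic model's original ranking.
    return sorted(ranked_docs, key=max_overlap, reverse=True)

if __name__ == "__main__":
    docs = [
        {"id": 1, "sentences": ["anchor text of incoming links helps retrieval"]},
        {"id": 2, "sentences": ["TREC web track overview", "named page finding"]},
    ]
    for d in rerank_by_sentence_overlap("named page finding task", docs):
        print(d["id"])   # doc 2 is promoted above doc 1
```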
Information Processing Letters | 1999
Dong-Yul Ra; George C. Stockman
The Inside–Outside algorithm is a well-known method for estimating the rule probabilities of stochastic context-free grammars. Stolcke developed a method that can be regarded as an extension of the Inside–Outside algorithm. Both the original method and Stolcke's extension require two passes. In this paper we present a one-pass algorithm. Analysis shows the new algorithm to be of the same order as Stolcke's; however, an experiment shows it to be twice as fast in practice.
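For context, the sketch below shows the inside pass of the classical two-pass Inside–Outside setting the abstract refers to, computing inside probabilities for a toy stochastic CFG in Chomsky normal form. It is a hedged illustration of the standard approach, not the one-pass algorithm proposed in the paper, and the grammar encoding is an assumption.

```python
# Hedged sketch of the inside pass of the standard Inside-Outside procedure
# for a stochastic CFG in Chomsky normal form. Grammar encoding and names are
# illustrative assumptions, not the paper's one-pass method.
from collections import defaultdict

def inside_probs(words, lex_rules, bin_rules):
    """
    lex_rules: {(A, word): prob} for rules A -> word
    bin_rules: {(A, B, C): prob} for rules A -> B C
    Returns beta[(i, j, A)] = P(A derives words[i:j+1]).
    """
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):                      # width-1 spans
        for (A, word), p in lex_rules.items():
            if word == w:
                beta[(i, i, A)] += p
    for width in range(2, n + 1):                      # wider spans, CKY order
        for i in range(n - width + 1):
            j = i + width - 1
            for k in range(i, j):                      # split point
                for (A, B, C), p in bin_rules.items():
                    beta[(i, j, A)] += p * beta[(i, k, B)] * beta[(k + 1, j, C)]
    return beta

if __name__ == "__main__":
    lex = {("N", "time"): 1.0, ("V", "flies"): 1.0}
    bins = {("S", "N", "V"): 1.0}
    print(inside_probs(["time", "flies"], lex, bins)[(0, 1, "S")])  # 1.0
```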
Information Processing Letters | 1996
Dong-Yul Ra; Jonghyun Kim
Most of the efforts on designing parallel algorithms for parsing context-free languages have used a large number of simple machines, i.e., n² machines (n is the length of the input string) [3,4,6]. These algorithms require O(n²) or O(n) time. However, they cannot operate on multiprocessors with a fixed number of processors. To solve this problem, Ibarra et al. [5] gave an algorithm that operates on the hypercube. Their algorithm uses a one-way one-dimensional array of p processors and requires O(n³/p) time, where p ≤ n. However, this algorithm is based on the CYK algorithm [1], so the grammar must be given in Chomsky normal form [1]. We propose a parallel algorithm that can handle arbitrary context-free grammars (CFGs) since it is based on Earley's algorithm [1]. Our algorithm can operate on any loosely coupled multiprocessor. It uses a one-way ring of p processors, where p ≤ n, and requires O(n³/p) time in the worst case, like the algorithm of Ibarra et al. Experiments show that our algorithm yields performance comparable to that of Ibarra et al.
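The brief sketch below only illustrates the processor layout the abstract describes, mapping Earley item sets round-robin onto a one-way ring of p processors. It is an assumed, simplified picture of the communication pattern, not the parsing algorithm itself.

```python
# Hedged sketch: the n+1 Earley item sets S_0..S_n are mapped round-robin onto
# a one-way ring of p processors, and each processor forwards results only to
# its right neighbour. Scheduling/communication illustration only.

def owner(item_set_index, p):
    """Round-robin assignment of item set S_i to a processor 0..p-1."""
    return item_set_index % p

def next_on_ring(proc, p):
    """One-way ring: processor proc can send only to (proc + 1) mod p."""
    return (proc + 1) % p

if __name__ == "__main__":
    n, p = 8, 3                       # input length and processor count (p <= n)
    for i in range(n + 1):
        src = owner(i, p)
        print(f"S_{i} lives on P{src}; completions flow to P{next_on_ring(src, p)}")
```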
Pattern Recognition Letters | 2013
Soojong Lim; Changki Lee; Dong-Yul Ra
Semantic Role Labeling (SRL) systems aim at determining the semantic role labels of the arguments of the predicates in natural language text. SRL systems are usually built to work on the result of either constituent analysis (constituent-based) or dependency parsing (dependency-based), and can use either classification or sequence labeling as the main processing mechanism. In this paper, we show that a dependency-based SRL system using sequence labeling can achieve state-of-the-art performance when a new structural SVM adapted from the Pegasos algorithm is exploited for sequence labeling.
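As background for the training family named above, the hedged sketch below shows a generic Pegasos-style primal update for a structural SVM, with a brute-force loss-augmented argmax on a toy sequence-labeling task. The feature map, label set, and inference routine are illustrative assumptions and do not reproduce the paper's adapted algorithm.

```python
# Hedged sketch of a Pegasos-style primal update for a structural SVM; the
# paper's own variant is not reproduced. feat() and the brute-force
# loss-augmented argmax are toy assumptions for illustration only.
import itertools
import numpy as np

LABELS = (0, 1)          # toy label set (e.g. argument / non-argument)

def feat(x, y):
    """Toy joint feature map: counts of (token-id, label) pairs."""
    v = np.zeros(len(LABELS) * 3)          # 3 distinct toy token ids
    for tok, lab in zip(x, y):
        v[tok * len(LABELS) + lab] += 1.0
    return v

def hamming(y1, y2):
    return sum(a != b for a, b in zip(y1, y2))

def loss_aug_argmax(w, x, y_gold):
    """Brute-force loss-augmented inference (fine for toy-sized sequences)."""
    return max(itertools.product(LABELS, repeat=len(x)),
               key=lambda y: w @ feat(x, y) + hamming(y_gold, y))

def pegasos_step(w, x, y_gold, t, lam):
    """One stochastic update of the structural hinge objective (t is 1-based)."""
    eta = 1.0 / (lam * t)
    y_hat = loss_aug_argmax(w, x, y_gold)
    active = hamming(y_gold, y_hat) - w @ (feat(x, y_gold) - feat(x, y_hat)) > 0
    w = (1.0 - eta * lam) * w              # regularization shrinkage
    if active:                             # hinge term contributes a subgradient
        w = w + eta * (feat(x, y_gold) - feat(x, y_hat))
    return w

if __name__ == "__main__":
    w = np.zeros(len(LABELS) * 3)
    x, y = [0, 1, 2], (1, 0, 1)            # token ids and gold labels
    for t in range(1, 51):
        w = pegasos_step(w, x, y, t, lam=0.1)
    pred = max(itertools.product(LABELS, repeat=len(x)),
               key=lambda yy: w @ feat(x, yy))
    print(pred == y)                       # True on this toy example
```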
Journal of KIISE | 2015
Soojong Lim; Yongjin Bae; Hyunki Kim; Dong-Yul Ra
Developing a high-performance Semantic Role Labeling (SRL) system for a domain requires a large amount of manually annotated training data in that domain. However, SRL training data of sufficient size is available only for a few domains. The performance of Korean SRL degrades by almost 15% or more when a model is applied directly to another domain with relatively little training data. This paper proposes two techniques to minimize this performance degradation under domain transfer. First, a domain adaptation algorithm for Korean SRL is proposed based on the prior model, one of the standard domain adaptation paradigms. Second, we propose using simplified features related to morphological and syntactic tags when only small target-domain data is available, in order to suppress the problem of data sparseness. Other domain adaptation techniques were experimentally compared with ours, using news as the source domain and Wikipedia as the target domain. The highest performance was achieved when our two techniques were applied together; our system's F1 score of 64.3% is 2.4~3.1% higher than that of methods from other research.
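To make the prior-model idea concrete, the hedged sketch below regularizes target-domain training toward previously learned source-domain weights instead of toward zero. Logistic regression, SGD, and all names are illustrative stand-ins, not the paper's Korean SRL model.

```python
# Hedged sketch of the "prior model" idea for domain adaptation: when training
# on small target-domain data, penalize lambda/2 * ||w - w_src||^2 so the
# weights stay close to the source-domain model. Illustrative stand-in only.
import numpy as np

def adapt_with_prior(w_src, X_tgt, y_tgt, lam=1.0, lr=0.1, epochs=20):
    """w_src: source-domain weights; X_tgt, y_tgt: small target-domain data."""
    w = w_src.copy()
    for _ in range(epochs):
        for x, y in zip(X_tgt, y_tgt):              # y in {0, 1}
            p = 1.0 / (1.0 + np.exp(-(w @ x)))      # logistic prediction
            grad = (p - y) * x + lam * (w - w_src)  # prior pulls w toward w_src
            w -= lr * grad
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_src = np.array([1.0, -1.0, 0.5])              # pretend source-domain model
    X = rng.normal(size=(10, 3))                    # tiny target-domain sample
    y = (X @ np.array([1.0, -0.5, 0.5]) > 0).astype(float)
    print(adapt_with_prior(w_src, X, y))
```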
Lecture Notes in Computer Science | 2003
Yun Sik Kim; Dong-Yul Ra
We construct a relation table that compares the general vocabulary with the chatting vocabulary. This study also proposes a method for constructing a comprehensive chatting dictionary that includes several sublists of Internet chatting vocabulary, the relations between those sublists, and information about the attributes of the chatters. Using the constructed dictionary, this study presents a method for resolving the ambiguity that occurs when mapping the chatting language onto the standard language.
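A toy, hedged sketch of the kind of lookup such a dictionary supports is shown below: each chat token maps to one or more standard-language candidates, and a simple context rule resolves ambiguous entries. The English example entries and the rule are assumptions for illustration, not the paper's Korean resources or its actual disambiguation method.

```python
# Hedged, toy sketch: a chatting dictionary maps chat tokens to standard-
# language candidates; ambiguous entries are resolved with a crude context
# rule. Entries and the rule are illustrative assumptions only.

CHAT_DICT = {
    "u":   ["you"],
    "r":   ["are", "our"],        # ambiguous entry: needs context to resolve
    "gr8": ["great"],
}

def normalize(tokens):
    out = []
    for tok in tokens:
        candidates = CHAT_DICT.get(tok.lower(), [tok])
        if len(candidates) == 1:
            out.append(candidates[0])
        else:
            # Toy context rule: "r" right after a pronoun is the verb "are".
            prev = out[-1].lower() if out else ""
            out.append("are" if prev in {"you", "we", "they"} else candidates[-1])
    return out

if __name__ == "__main__":
    print(" ".join(normalize("u r gr8".split())))   # -> "you are great"
```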
Conference on Intelligent Text Processing and Computational Linguistics | 2004
Seonho Kim; Juntae Yoon; Dong-Yul Ra
As part of work on aligning an English-Korean parallel corpus, this paper presents a statistical translation model that incorporates linguistic knowledge of syntactic and phrasal information for better translations. We propose three models. First, we incorporate syntactic information such as part of speech into word-based lexical alignment. Based on this model, we propose a second model that finds phrasal correspondences in the parallel corpus; phrasal mapping through chunk-based shallow parsing makes it possible to resolve mismatches between meaningful units in the two languages. Lastly, we develop a two-level alignment model by combining these two models in order to construct both word-based and phrase-based translation models. Model parameters are automatically estimated from a set of bilingual sentence pairs by applying the EM algorithm. Experiments show that the structural relationship helps construct a better translation model for structurally different languages such as Korean and English.
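As a baseline reference point for the word-based lexical alignment mentioned above, the hedged sketch below runs EM training of an IBM Model 1 style translation table; the paper's POS-enriched and phrase-level models are not reproduced, and the toy corpus is an assumption. Tokens could equally be (word, POS) pairs to fold in syntactic tags.

```python
# Hedged sketch of EM training for a word-based lexical translation table in
# the style of IBM Model 1, the kind of baseline the abstract extends with POS
# and phrasal information. Corpus and names are illustrative only.
from collections import defaultdict

def train_model1(pairs, iterations=10):
    """pairs: list of (source_tokens, target_tokens). Returns t[(e, f)] = t(f|e)."""
    t = defaultdict(lambda: 1.0)                # uniform-ish initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:                  # E-step: expected alignment counts
            for f in tgt:
                z = sum(t[(e, f)] for e in src)
                for e in src:
                    c = t[(e, f)] / z
                    count[(e, f)] += c
                    total[e] += c
        for (e, f), c in count.items():         # M-step: renormalize
            t[(e, f)] = c / total[e]
    return t

if __name__ == "__main__":
    corpus = [(["the", "house"], ["das", "haus"]),
              (["the", "book"], ["das", "buch"])]
    t = train_model1(corpus)
    print(round(t[("the", "das")], 3))          # dominates t(haus|the), t(buch|the)
```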
Archive | 2002
Chung Hee Lee; Myung-Gil Jang; Sang Kyu Park; Dong-Yul Ra; Eui-Kyu Park; Jung-sik Jang
Text Retrieval Conference | 2002
Eui-Kyu Park; Seong-In Moon; Dong-Yul Ra; Myung-Gil Jang
ETRI Journal | 2014
Soojong Lim; Changki Lee; Pum-Mo Ryu; Hyunki Kim; Sang Kyu Park; Dong-Yul Ra