Tae-Gil Noh | Researchain

Publication

Featured research published by Tae-Gil Noh.

Meeting of the Association for Computational Linguistics | 2014

The Excitement Open Platform for Textual Inferences

Bernardo Magnini; Roberto Zanoli; Ido Dagan; Kathrin Eichler; Guenter Neumann; Tae-Gil Noh; Sebastian Padó; Asher Stern; Omer Levy

This paper presents the Excitement Open Platform (EOP), a generic architecture and a comprehensive implementation for textual inference in multiple languages. The platform includes state-of-the-art algorithms, a large number of knowledge resources, and facilities for experimenting and testing innovative approaches. The EOP is distributed as open source software.

Natural Language Engineering | 2015

Design and realization of a modular architecture for textual entailment

Sebastian Padó; Tae-Gil Noh; Asher Stern; Rui Wang; Roberto Zanoli

A key challenge at the core of many Natural Language Processing (NLP) tasks is the ability to determine which conclusions can be inferred from a given natural language text. This problem, called the Recognition of Textual Entailment (RTE), has initiated the development of a range of algorithms, methods, and technologies. Unfortunately, research on Textual Entailment (TE), like semantics research more generally, is fragmented into studies focussing on various aspects of semantics such as world knowledge, lexical and syntactic relations, or more specialized kinds of inference. This fragmentation has problematic practical consequences. Notably, interoperability among the existing RTE systems is poor, and reuse of resources and algorithms is mostly infeasible. This also makes systematic evaluations very difficult to carry out. Finally, textual entailment presents a wide array of approaches to potential end users with little guidance on which to pick. Our contribution to this situation is the novel EXCITEMENT architecture, which was developed to enable and encourage the consolidation of methods and resources in the textual entailment area. It decomposes RTE into components with strongly typed interfaces. We specify (a) a modular linguistic analysis pipeline and (b) a decomposition of the ‘core’ RTE methods into top-level algorithms and subcomponents. We identify four major subcomponent types, including knowledge bases and alignment methods. The architecture was developed with a focus on generality, supporting all major approaches to RTE and encouraging language independence. We illustrate the feasibility of the architecture by constructing mappings of major existing systems onto the architecture. The practical implementation of this architecture forms the EXCITEMENT open platform. It is a suite of textual entailment algorithms and components which contains the three systems named above, including linguistic-analysis pipelines for three languages (English, German, and Italian), and comprises a number of linguistic resources. By addressing the problems outlined above, the platform provides a comprehensive and flexible basis for research and experimentation in textual entailment and is available as open source software under the GNU General Public License.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2009

An automatic translation of tags for multimedia contents using folksonomy networks

Tae-Gil Noh; Seong-Bae Park; Hee-Geun Yoon; Sang-Jo Lee; Se-Young Park

This paper proposes a novel method to translate tags attached to multimedia contents for cross-language retrieval. The main issue in this problem is the sense disambiguation of tags given with few textual contexts. In order to solve this problem, the proposed method represents both a tag and its translation candidates as networks of co-occurring tags, since a network allows richer expression of contexts than other expressions such as co-occurrence vectors. The method translates a tag by selecting the optimal one from possible candidates based on a network similarity even when neither the textual contexts nor sophisticated language resources are available. The experiments on the MIR Flickr-2008 test set show that the proposed method achieves 90.44% accuracy in translating tags from English into German, which is significantly higher than the baseline methods of a frequency-based translation and a co-occurrence-based translation.
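The candidate-selection idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the network similarity, so a weighted Jaccard overlap of edges is swapped in here, and a toy word-level dictionary is assumed for relating nodes across languages. All function names, the tag sets, and the dictionary are hypothetical.

```python
from collections import Counter
from itertools import combinations

def context_network(tag, tagged_items):
    """Edges between tags that co-occur on items carrying `tag` (the tag itself is left out)."""
    edges = Counter()
    for tags in tagged_items:
        if tag in tags:
            others = sorted(set(tags) - {tag})
            for a, b in combinations(others, 2):
                edges[(a, b)] += 1
    return edges

def map_nodes(edges, dictionary):
    """Relabel nodes with a word-level dictionary; edges with unknown nodes are dropped."""
    mapped = Counter()
    for (a, b), weight in edges.items():
        if a in dictionary and b in dictionary:
            x, y = sorted((dictionary[a], dictionary[b]))
            if x != y:
                mapped[(x, y)] += weight
    return mapped

def edge_overlap(e1, e2):
    """Weighted Jaccard similarity over edges (an assumed stand-in for the paper's measure)."""
    keys = set(e1) | set(e2)
    inter = sum(min(e1[k], e2[k]) for k in keys)
    union = sum(max(e1[k], e2[k]) for k in keys)
    return inter / union if union else 0.0

def translate_tag(tag, candidates, src_items, tgt_items, dictionary):
    """Pick the candidate whose target-language network best matches the source network."""
    src = map_nodes(context_network(tag, src_items), dictionary)
    return max(candidates, key=lambda c: edge_overlap(src, context_network(c, tgt_items)))

# Toy data (hypothetical): each set is the tag set of one photo.
english_items = [
    {"spring", "flower", "garden", "sun"},
    {"spring", "flower", "sun"},
]
german_items = [
    {"Frühling", "Blume", "Sonne"},
    {"Frühling", "Blume", "Garten"},
    {"Quelle", "Wasser", "Wald"},
    {"Feder", "Metall"},
]
dictionary = {"flower": "Blume", "garden": "Garten", "sun": "Sonne",
              "water": "Wasser", "forest": "Wald"}

best = translate_tag("spring", ["Frühling", "Quelle", "Feder"],
                     english_items, german_items, dictionary)
```

In this toy setting the season sense is selected because only its network shares edges with the mapped source network; the other two candidates share none.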

Intelligent User Interfaces | 2010

A natural language interface of thorough coverage by concordance with knowledge bases

Yong-Jin Han; Tae-Gil Noh; Seong-Bae Park; Se Young Park; Sang-Jo Lee

One of the critical problems in natural language interfaces is the discordance between the expressions covered by the interface and those covered by the knowledge base. In a graph-based knowledge base such as an ontology, all possible queries can be prepared in advance. As a solution to the discordance problem in natural language interfaces, this paper proposes a method that translates a natural language query into a formal language query such as SPARQL. A user query is translated into the formal language by choosing the most appropriate query from the prepared queries. The experimental results show a high accuracy and coverage for the given knowledge base.

Journal of Web Semantics | 2010

Learning the emergent knowledge from annotated blog postings

Tae-Gil Noh; Seong-Bae Park; Se-Young Park; Sang-Jo Lee

Emergent knowledge does not come from a particular document or a particular knowledge source, but comes from a collection of documents or knowledge sources. This paper proposes a system which combines social web content and semantic web technology to process the emergent knowledge from the blogosphere. The proposed system regards blog postings as experiences of people on particular topics. By annotating postings in the selected domains with ontology vocabularies, the system collects experiences from various people into an ontology about people and experiences. The system processes this ontology with semantic rules to find the emergent knowledge. Users can access previously unavailable facts, concepts and trends which are emerging from social web content by using the proposed system.

Joint Conference on Lexical and Computational Semantics | 2015

Multi-Level Alignments As An Extensible Representation Basis for Textual Entailment Algorithms

Tae-Gil Noh; Sebastian Padó; Vered Shwartz; Ido Dagan; Vivi Nastase; Kathrin Eichler; Lili Kotlerman; Meni Adler

A major problem in research on Textual Entailment (TE) is the high implementation effort for TE systems. Recently, interoperable standards for annotation and preprocessing have been proposed. In contrast, the algorithmic level remains unstandardized, which makes component re-use in this area very difficult in practice. In this paper, we introduce multi-level alignments as a central, powerful representation for TE algorithms that encourages modular, reusable, multilingual algorithm development. We demonstrate that a pilot open-source implementation of multi-level alignment with minimal features competes with state-of-the-art open-source TE engines in three languages.

Engineering Applications of Artificial Intelligence | 2013

An application for plagiarized source code detection based on a parse tree kernel

Jeong Woo Son; Tae-Gil Noh; Hyun-Je Song; Seong-Bae Park

Program plagiarism detection is the task of detecting plagiarized code pairs among a set of source codes. In this paper, we propose a code plagiarism detection system that uses a parse tree kernel. Our parse tree kernel calculates a similarity value between two source codes in terms of their parse tree similarity. Since parse trees contain the essential syntactic structure of source codes, the system effectively handles structural information. The contributions of this paper are two-fold. First, we propose a parse tree kernel that is optimized for program source code. The evaluation shows that our system based on this kernel outperforms well-known baseline systems. Second, we collected a large number of real-world Java source codes from a university programming class. This test set was manually analyzed and tagged by two independent human annotators to mark plagiarized codes. It can be used to evaluate the performance of various detection systems in real-world environments. The experiments with the test set show that the performance of our plagiarism detection system reaches 93% of the level of human annotators.
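To make the idea of a parse tree kernel concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the kernel proposed in the paper: the abstract does not describe the source-code optimizations, so a generic subtree kernel in the style of Collins and Duffy is used instead, and Python's standard `ast` module stands in for the Java parse trees the authors worked with. The function names and the decay parameter `lam` are hypothetical.

```python
import ast
import math

def to_tree(node):
    """Convert an ast node into (label, children), keeping node types only (no identifiers)."""
    return (type(node).__name__, [to_tree(c) for c in ast.iter_child_nodes(node)])

def nodes(tree):
    """Yield every subtree root in pre-order."""
    yield tree
    for child in tree[1]:
        yield from nodes(child)

def production(tree):
    """A node's label together with the labels of its direct children."""
    return (tree[0], tuple(child[0] for child in tree[1]))

def common_subtrees(t1, t2, lam):
    """Decayed count of common subtrees rooted at t1 and t2 (0 if productions differ)."""
    if production(t1) != production(t2):
        return 0.0
    score = lam
    for c1, c2 in zip(t1[1], t2[1]):
        score *= 1.0 + common_subtrees(c1, c2, lam)
    return score

def tree_kernel(t1, t2, lam=0.5):
    """Sum the common-subtree scores over all pairs of nodes."""
    return sum(common_subtrees(a, b, lam) for a in nodes(t1) for b in nodes(t2))

def similarity(src1, src2, lam=0.5):
    """Kernel value normalized so that a program compared with itself scores 1."""
    t1, t2 = to_tree(ast.parse(src1)), to_tree(ast.parse(src2))
    k12 = tree_kernel(t1, t2, lam)
    return k12 / math.sqrt(tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam))

original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def acc(values):\n    result = 0\n    for v in values:\n        result += v\n    return result\n"
unrelated = "print('hello')\n"

sim_renamed = similarity(original, renamed)
sim_unrelated = similarity(original, unrelated)
```

Because the tree labels ignore identifier names, a copy with renamed variables has the same tree as the original and scores essentially 1, while structurally different code scores lower. This is the kind of structural robustness the abstract attributes to parse trees.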

Computational Science and Engineering | 2009

Experience Search: Accessing the Emergent Knowledge from Annotated Blog Postings

Tae-Gil Noh; Yong-Jin Han; Jeong-Woo Son; Hyun-Jae Song; Hee-Geun Yoon; Jae-Ahn Lee; Sang-Do Lee; Kye-Sung Kim; Young-Hwa Lee; Seong-Bae Park; Se-Young Park; Sang-Jo Lee

Emergent knowledge does not come from a particular document or a particular knowledge source, but comes from a collection of documents or knowledge sources. This paper proposes a system which combines the social web contents and the semantic web technology to process the emergent knowledge from the blogosphere. The proposed system regards blog postings as experiences of people on particular topics. By annotating postings in the selected domains with ontology vocabularies, the system collects experiences from various people into an ontology about people and experiences. The system processes this ontology with semantic rules to find the emergent knowledge. Users can access previously unavailable facts, concepts and trends which are emerging from the system.

Proceedings of the ACM fourth international workshop on Data and text mining in biomedical informatics | 2010

Unsupervised word sense disambiguation in biomedical texts with co-occurrence network and graph kernel

Tae-Gil Noh; Seong-Bae Park; Sang-Jo Lee

This paper proposes an unsupervised word sense disambiguation method for the biomedical domain. In this paper, a network representation of co-occurrence data is first defined to represent both word senses and word contexts. The representation expresses textual context observed around a certain term as a network, where nodes are terms and edges are weighted by the number of co-occurrences between connected terms. A graph kernel is adopted as a similarity measure between terms and senses represented in networks. Candidate senses and ambiguous contexts are then compared directly in the representation space to resolve the word sense. The method needs only the sense definitions and a large amount of unlabeled texts. The experiments in the biomedical domain show that the method outperforms a baseline method of vector representation. The performance of the proposed method is comparable to the state-of-the-art unsupervised word sense disambiguation methods.
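The representation described in this abstract, with terms as nodes and co-occurrence counts on edges, can be sketched in a few lines of Python. The sketch rests on several assumptions: the abstract does not say which graph kernel is used, so a very simple normalized dot product over shared edges is substituted, and the window size, function names, and example sentences are all hypothetical.

```python
from collections import Counter
import math

def cooccurrence_network(sentences, window=2):
    """Edges are unordered term pairs; weights count co-occurrences within `window` tokens."""
    edges = Counter()
    for tokens in sentences:
        for i, a in enumerate(tokens):
            for b in tokens[i + 1 : i + 1 + window]:
                if a != b:
                    edges[tuple(sorted((a, b)))] += 1
    return edges

def edge_kernel(g1, g2):
    """Dot product over weighted edges (an assumed stand-in for the paper's graph kernel)."""
    return sum(weight * g2[edge] for edge, weight in g1.items())

def normalized_kernel(g1, g2):
    """Scale the kernel so that identical non-empty networks score 1."""
    denom = math.sqrt(edge_kernel(g1, g1) * edge_kernel(g2, g2))
    return edge_kernel(g1, g2) / denom if denom else 0.0

def disambiguate(context_sentences, sense_texts):
    """Return the sense whose network is most similar to the context network."""
    context = cooccurrence_network(context_sentences)
    scores = {sense: normalized_kernel(context, cooccurrence_network(texts))
              for sense, texts in sense_texts.items()}
    return max(scores, key=scores.get)

# Toy example (hypothetical): two senses of the ambiguous term "cold".
sense_texts = {
    "common_cold": [["cold", "virus", "infection", "nose"]],
    "low_temperature": [["cold", "temperature", "low", "exposure"]],
}
context = [["patient", "cold", "virus", "infection"]]
chosen = disambiguate(context, sense_texts)
```

Here the context network shares three edges with the network built from the infection sense and none with the temperature sense, so the former is chosen. In the setting the abstract describes, the sense networks would be built from sense definitions and a large unlabeled corpus rather than from single toy sentences.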

International Conference on the Computer Processing of Oriental Languages | 2009

Processing of Korean Natural Language Queries Using Local Grammars

Tae-Gil Noh; Yong-Jin Han; Seong-Bae Park; Se-Young Park

For casual web users, a natural language is more accessible than formal query languages. However, understanding a natural language query is not trivial for computer systems. This paper proposes a method to parse and understand Korean natural language queries with local grammars. A local grammar is a formalism that can model syntactic structures and synonymous phrases. With local grammars, the system directly extracts users' intentions from natural language queries. With 163 hand-crafted local grammar graphs, the system could attain a good level of accuracy and meaningful coverage over the IT company/people domain.
