Hanmin Jung
Pohang University of Science and Technology
Publication
Featured research published by Hanmin Jung.
Information Processing and Management | 2003
Dongseok Kim; Hanmin Jung; Gary Geunbae Lee
This paper presents a new extraction pattern, called modified Document Type Definition (mDTD), which relies on analytical interpretation to identify extraction targets in the contents of Web documents. Starting from the conventional DTD of XML documents, we develop two major extensions: first, we introduce an extended content model with type-specific operators and keywords, and second, we refine the way conventional DTD rules are interpreted. As a result, our mDTD can freely represent HTML structures and extraction targets. The goal of mDTD is to overcome the current major barriers in information extraction, namely domain portability (with minimal human intervention) and high performance. Human experts compose an mDTD as seed rules, and our system then automatically extracts a set of instances from structured documents on the Web using the mDTD. We use the extracted instances as input to the Sequential mDTD Learner (SmL), which generates new mDTD rules based on part-of-speech tags and features for lexical similarity. This process does not require any hand-annotated corpus. We experimented with 330 Korean and 220 English Web documents from audio and video shopping sites. The average extraction precision is 91.3% for Korean and 81.9% for English.
Multimedia Tools and Applications | 2014
Sung-Pil Choi; Seungwoo Lee; Hanmin Jung; Sa-Kwang Song
Relation extraction refers to a method of efficiently detecting and identifying predefined semantic relationships within a set of entities in text documents. Numerous relation extraction techniques have been developed thus far, owing to their innate importance in the domain of information extraction and text mining. The majority of the relation extraction methods proposed to date are based on supervised learning, which requires learning collections; such learning methods can be classified into feature-based, semi-supervised, and kernel-based techniques. This paper carries out a case analysis of kernel-based relation extraction, considered the most successful of the three approaches. Although some previous survey papers on this topic have been published, they failed to select the most essential of the currently available kernel-based relation extraction approaches or to provide an in-depth comparative analysis of them. Unlike existing case studies, the study described in this paper is based on a close analysis of the operating principles and individual characteristics of five representative kernel-based relation extraction methods. In addition, we present the results of an in-depth comparative analysis of these methods. Furthermore, to support further research on higher-performance kernel-based relation extraction and general high-level kernel studies for linguistic processing and text mining, some additional approaches, including feature-based methods based on various criteria, are introduced.
Multimedia Tools and Applications | 2013
Myunggwon Hwang; Do-Heon Jeong; Jinhyung Kim; Sa-Kwang Song; Hanmin Jung; Juhyun Shin; Pankoo Kim
The importance of research on knowledge management is growing due to recent issues surrounding Big Data. One of the most fundamental steps in knowledge management is the extraction of terminology. Terms are often expressed in various forms, and these variations become an obstacle that causes knowledge systems to extract unnecessary terms. To solve this problem, we propose a method of term normalization that finds the normalized form (the original, standard form defined in dictionaries) of variant terms. The method employs two characteristics of terms: appearance similarity, which measures how similar the terms look, and context similarity, which measures how many clue words they share. Through experiments, we show the positive influence of both similarities on term normalization.
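The two similarities named in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption for illustration, not the authors' method: appearance similarity is approximated with `difflib.SequenceMatcher`, context similarity with Jaccard overlap of clue-word sets, and the two are combined with a hypothetical `weight` parameter.

```python
from difflib import SequenceMatcher


def appearance_similarity(a, b):
    # Assumed measure: character-level similarity ratio in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def context_similarity(clues_a, clues_b):
    # Assumed measure: Jaccard overlap of the clue words shared by two terms.
    a, b = set(clues_a), set(clues_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def normalize(variant, variant_clues, dictionary, weight=0.5):
    """Return the dictionary term that best matches the variant.

    dictionary maps each normalized term to its clue words (must be non-empty).
    weight is a hypothetical mixing factor between the two similarities.
    """
    def score(item):
        term, clues = item
        return (weight * appearance_similarity(variant, term)
                + (1 - weight) * context_similarity(variant_clues, clues))

    best_term, _ = max(dictionary.items(), key=score)
    return best_term


if __name__ == "__main__":
    # Toy dictionary invented for this example.
    toy_dictionary = {
        "database": {"query", "table", "index"},
        "databus": {"signal", "wire"},
    }
    print(normalize("data-base", {"query", "index"}, toy_dictionary))
```

In this toy run the variant "data-base" both looks more like "database" and shares clue words with it, so both similarities point to the same normalized form.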
international conference on human interface and management of information | 2011
Hanmin Jung; Mikyoung Lee; Pyung Kim; Won-Kyung Sung
This paper describes a decision-making support system focused on technologies, R&D agents, and R&D results. To deal with heterogeneous literature and metadata, we introduce text mining and Semantic Web-based service platforms. InSciTe, the decision-making support system we developed, provides an end-to-end process covering analysis as well as ETL, verifies search and analysis results, connects its information with Semantic Web open sources at the RDF level, and automatically generates summary reports. The system is significant in that it was implemented about a year earlier than similar projects such as CUBIST and FUSE.
symposium on human interface on human interface and management of information | 2009
Hanmin Jung; Mikyoung Lee; Won-Kyung Sung; Beom-Jong You
This paper presents two methods for enhancing auto-complete, which provides the search keywords that the user wants. The first is to display only search keywords that can guarantee a successful search result in real time, regardless of document insertion, deletion, and update. The second is to display search keywords with their entity types, such as person, institution, and topic. To accomplish this, we introduce an auto-complete table that stores the entities extracted and indexed from input documents together with their document frequency (DF). An auto-complete manager checks whether each entity in the table can guarantee a successful search result by considering its DF, and provides the proper entities with their types to the user. To verify the effect of the auto-complete, we are designing a comparative experiment. OntoFrame 2007, which lacks these functions, will be compared with OntoFrame 2008, which includes them, to determine the effect of our auto-complete on the reliability of Semantic Web services.
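The auto-complete table described in this abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the OntoFrame implementation: the class `AutoCompleteTable`, its method names, and the in-memory dictionary storage are all hypothetical, and "guarantees a successful search result" is interpreted simply as DF greater than zero.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    entity_type: str  # e.g. "person", "institution", "topic"
    df: int = 0       # document frequency: indexed documents mentioning the entity


class AutoCompleteTable:
    """Hypothetical in-memory table; the abstract does not specify the storage."""

    def __init__(self):
        self._entities = {}

    def add_document(self, entities):
        # entities: iterable of (name, entity_type) pairs extracted from one document.
        for name, entity_type in set(entities):
            key = (name.lower(), entity_type)
            entry = self._entities.setdefault(key, Entity(name, entity_type))
            entry.df += 1

    def remove_document(self, entities):
        # Deleting a document lowers the DF of every entity it contained.
        for name, entity_type in set(entities):
            entry = self._entities.get((name.lower(), entity_type))
            if entry is not None and entry.df > 0:
                entry.df -= 1

    def suggest(self, prefix, limit=10):
        # Offer only entities that still occur in at least one document (DF > 0),
        # so that selecting a suggestion cannot lead to an empty result.
        prefix = prefix.lower()
        hits = [e for (name, _), e in self._entities.items()
                if name.startswith(prefix) and e.df > 0]
        hits.sort(key=lambda e: (-e.df, e.name))
        return [(e.name, e.entity_type) for e in hits[:limit]]


if __name__ == "__main__":
    table = AutoCompleteTable()
    doc1 = [("Semantic Web", "topic"), ("Sensor Lab", "institution")]
    doc2 = [("Semantic Web", "topic")]
    table.add_document(doc1)
    table.add_document(doc2)
    print(table.suggest("se"))
    table.remove_document(doc1)
    print(table.suggest("se"))
```

After `doc1` is removed, the made-up entity "Sensor Lab" drops to DF 0 and is no longer suggested, which mirrors the abstract's requirement that suggestions stay valid across document insertion, deletion, and update.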
Multimedia Tools and Applications | 2015
Sung-Pil Choi; Sung-Ho Shin; Hanmin Jung; Daesung Lee
Technical terms play an important role as effective queries for many users searching scientific databases. However, authors of scientific literature often employ alternative expressions, known as Terminological Paraphrases (TPs), to convey the meanings of specific terms, which results in relevant documents that are not captured by the conventional terms. In this paper, we propose an effective way to retrieve “de facto relevant documents”, which contain only those TPs and cannot be found by conventional models in an environment with only controlled vocabularies, by adapting the Predicate Argument Tuple (PAT). The experiment confirms that PAT-based document retrieval is an effective and promising method for discovering such documents and for improving the recall of terminology-based scientific information access models.
Journal of Applied Mathematics | 2014
Hyeok-June Jeong; Myunggwon Hwang; Hanmin Jung; Young-Guk Ha
This paper proposes a multilayered quadrotor control method that can move the quadrotor to the desired goal while resisting disturbance. The proposed control system is modular, convenient to design and verify, and easy to extend. It comprises three layers: a physical layer, a displacement control layer, and an attitude control layer. The displacement control layer considers the movement of the vehicle, while the attitude control layer controls its attitude. The physical layer deals with the physical operation of the vehicle. The two control layers use a mathematical method to provide minute step-by-step control. The proposed control system effectively combines the three layers to achieve drift stabilization.
the internet of things | 2017
Joschka Kersting; Michaela Geierhos; Hanmin Jung; Taehong Kim
In this paper, we present an IoT architecture that handles streaming sensor data on air pollution. Particle pollution is known to be a serious threat to human health. Building on developments in wireless sensors and the IoT, we propose an architecture that flexibly measures and processes stream data collected in real time by movable, low-cost IoT sensors. It thus enables a widespread network of wireless sensors that can follow changes in human behavior. Apart from stating the reasons such a development is needed and its requirements, we provide a conceptual design as well as a technological design of such an architecture. The technological design consists of Kaa and Apache Storm, which can collect air pollution information in real time and solve various data-processing problems such as missing data and synchronization. This enables us to add a simulation in which we present issues that might come up when our architecture is in use. Together with these issues, we state reasons for choosing specific modules among candidates. Our architecture combines wireless sensors with the Kaa IoT framework, an Apache Kafka pipeline and an Apache Storm Data Stream Management System, among others. We even provide open-government data sets that are freely available.
Journal of Applied Mathematics | 2014
Dongmin Seo; Hanmin Jung; Won-Kyung Sung; Dukyun Nam
By 2026, Korea is expected to surpass the UN’s definition of an aged society and reach the level of a superaged society. With an aging population come increased disorders involving the spine. To prevent unnecessary spinal surgery and support scientific diagnosis of spinal disease and systematic prediction of treatment outcomes, we have been developing e-Spine, a computer simulation model of the human spine. In this paper, we present the Korean spine database and an automatic surface mesh intersection algorithm for constructing e-Spine. To date, the Korean spine database has collected spine data from 77 cadavers and 298 patients. The spine data consist of 2D images from CT, MRI, or X-ray, 3D shapes, geometry data, and property data. The volume and quality of the Korean spine database are now the highest in the world. In addition, our triangular surface mesh intersection algorithm automatically remeshes the spine-implant intersection model to make it valid for finite element analysis (FEA). This makes it possible to run the FEA using the spine-implant mesh model without any manual effort. Our database and surface mesh intersection algorithm will offer great value and utility in the diagnosis, treatment, and rehabilitation of patients suffering from spinal diseases.
ieee international conference on green computing and communications | 2013
Sebastian Kastner; Sung-Pil Choi; Hanmin Jung
Technology trend analysis systems use data mining to process vast amounts of papers, patents and news articles in order to analyze and predict the life cycles of technologies, products and other kinds of entities. Some systems can also extract relations between entities such as technologies, authors and products. To establish precise relations between entities, entity disambiguation has to be performed. In this study, we focused on author disambiguation in the context of technology trend analysis. We used Random Forests and SVM to learn a pairwise similarity function that decides whether or not two articles were written by the same author. Besides comparing common features such as article titles and author affiliations, we also studied features built from the analyses produced by KISTI's InSciTe system. For training and evaluation, a corpus containing 24,750 pairwise article similarities was manually constructed using data from InSciTe. Using this corpus, Random Forests outperformed SVM and reached an accuracy of 98.31%. Using only the newly introduced features, an accuracy of 94.79% was achieved, demonstrating their usefulness.
