Harksoo Kim
Kangwon National University
Publication
Featured research published by Harksoo Kim.
IEEE Intelligent Systems | 2008
Harksoo Kim; Jungyun Seo
A high-performance FAQ retrieval system uses query-log clustering to resolve lexical-disagreement problems. The proposed system outperforms traditional information-retrieval systems in FAQ retrieval.
Information Processing and Management | 2013
Maengsik Choi; Harksoo Kim
We propose a social relation extraction system using dependency-kernel-based support vector machines (SVMs). The proposed system classifies input sentences containing two people's names according to whether or not they describe a social relation between the two people. The system then extracts relation names (i.e., social-related keywords) from the sentences that describe social relations. We propose new tree kernels, called dependency trigram kernels, for effectively implementing these processes using SVMs. Experiments showed that the proposed kernels delivered better performance than the existing dependency kernel. On the basis of the experimental evidence, we suggest that the proposed system can serve as a useful tool for automatically constructing social networks from unstructured texts.
Pattern Recognition Letters | 2010
Sangwoo Kang; Harksoo Kim; Jungyun Seo
In a multidomain dialogue, identifying speech acts is not easy because of interference between input features. To overcome this problem, we propose a two-step model for speech act classification. In the first step, the proposed model detects the dialogue domain associated with an input utterance. In the second step, the proposed model determines the speech act of the input utterance by using only statistical information about input features in the detected dialogue domain. In the experiment, the precision of the proposed model was 5.5% higher than that of the baseline system without domain selection. On the basis of this experimental result, we conclude that reducing the interference between input features by using a domain detection process is effective in improving the precision of speech act classification in multiple domains.
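The two-step idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' model: the class name `TwoStepSpeechActClassifier`, the add-one-smoothed word scoring, and the toy training data are all assumptions made for the example. The only point it demonstrates is that the second step consults statistics from the detected domain alone.

```python
from collections import Counter, defaultdict


class TwoStepSpeechActClassifier:
    """Toy two-step classifier (hypothetical): detect the domain first,
    then score speech acts using only that domain's word statistics."""

    def __init__(self):
        # domain -> word counts
        self.domain_words = defaultdict(Counter)
        # domain -> speech act -> word counts
        self.act_words = defaultdict(lambda: defaultdict(Counter))

    def train(self, examples):
        """examples: iterable of (utterance, domain, speech_act) triples."""
        for text, domain, act in examples:
            words = text.lower().split()
            self.domain_words[domain].update(words)
            self.act_words[domain][act].update(words)

    @staticmethod
    def _best(label_counts, words):
        """Return the label whose add-one-smoothed word scores multiply highest."""
        def score(counts):
            total = sum(counts.values())
            vocab = len(counts)
            result = 1.0
            for word in words:
                result *= (counts[word] + 1) / (total + vocab + 1)
            return result

        # sorted() makes tie-breaking deterministic.
        return max(sorted(label_counts), key=lambda label: score(label_counts[label]))

    def predict(self, text):
        words = text.lower().split()
        domain = self._best(self.domain_words, words)   # step 1: domain detection
        act = self._best(self.act_words[domain], words)  # step 2: in-domain speech act
        return domain, act


clf = TwoStepSpeechActClassifier()
clf.train([
    ("book a flight to seoul", "travel", "request"),
    ("when does the flight leave", "travel", "ask"),
    ("book a table for two", "restaurant", "request"),
    ("when does the kitchen close", "restaurant", "ask"),
])
print(clf.predict("when does the flight arrive"))
```

Because step 2 never sees counts from other domains, words that signal different speech acts in different domains cannot interfere with one another, which is the effect the abstract attributes to domain detection.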
Pattern Recognition Letters | 2014
Choong-Nyoung Seon; Hyun Jung Lee; Harksoo Kim; Jungyun Seo
To reduce lengthy and rigid interactions of menu-driven navigation and keyword searches, dialogue systems based on a natural language interface have been developed. Domain action classification is an essential part of a dialogue system because speakers’ intentions are determined through the classification process. Although a domain action consists of a tightly associated speech act and a concept sequence, previous studies have dealt with speech acts and concept sequences independently in order to simplify the models, and this simplification has caused a decrease in performance. To resolve this problem, a retraining method for improving domain action classification performance is proposed. The proposed method divides a domain action classification model into a speech act classification model and a concept sequence classification model. The speech act classification model repeatedly uses the outputs of the concept sequence classification model as inputs during training. In experiments with goal-oriented dialogues, the proposed method achieved 0.6% higher accuracy and a 1.7% higher macro F1-measure than the SVM and ME models that dealt with speech acts and concept sequences separately. Based on the experimental results, it was determined that the proposed method can improve the performance of some representative machine learning models for domain action classification.
Journal of Information Processing Systems | 2013
Hyun Jung Lee; Harksoo Kim; Jungyun Seo
A speaker’s intentions can be represented by domain actions (domain-independent speech act and domain-dependent concept sequence pairs). Therefore, it is essential that domain actions be determined when implementing dialogue systems because a dialogue system should determine users’ intentions from their utterances and should create counterpart intentions to the users’ intentions. In this paper, a neural network model is proposed for classifying a user’s domain actions and planning a system’s domain actions. An integrated neural network model is proposed for simultaneously determining user and system domain actions using the same framework. The proposed model performed better than previous non-integrated models in an experiment using a goal-oriented dialogue corpus. This result shows that the proposed integration method contributes to improving domain action determination performance.
Keywords: Domain Action, Speech Act, Concept Sequence, Neural Network
Pattern Recognition Letters | 2011
Choong-Nyoung Seon; Harksoo Kim; Jungyun Seo
With the rapid evolution of the mobile environment, demand for information extraction on mobile devices is increasing. This paper proposes an information extraction system that is designed for mobile devices with limited hardware resources. The proposed system extracts temporal instances (dates and times) and named instances (locations and titles) from Korean short messages in an appointment management domain. To efficiently extract temporal instances, which have a limited number of surface forms, the proposed system uses well-refined finite state automata. To effectively extract the various surface forms of named instances with limited hardware resources, the proposed system uses a modified hidden Markov model (HMM) based on character n-grams. In the experiment on instance boundary labeling, the proposed system performed comparably to representative conventional classifiers. The proposed system was implemented in a commercial mobile phone to test its ability to automatically extract appointment information from a short message and store the information in a schedule database. The system performed well with a reasonable response time.
International Journal on Artificial Intelligence Tools | 2012
Choong-Nyoung Seon; Harksoo Kim; Jungyun Seo
Visiting a foreign country is now much easier than it was in the past. This has led to a consequent increase in the need for translation services during these visits. To satisfy this need, a reliable translation assistance system based on sentence retrieval techniques is proposed. When a user inputs a sentence in his/her native language, the proposed system retrieves sentences similar to the input sentence from a pre-constructed bilingual corpus and returns pairs of sentences in the native and foreign languages. To reduce the lexical disagreement problems that inevitably occur in this sentence retrieval application, the proposed system uses multi-level linguistic information (i.e., keywords, sentence types, and concepts) with different weights as indexing terms. In addition, the proposed system uses clustering information from sentences with similar meanings to smooth the retrieval target sentences. In an experiment, the proposed system outperformed traditional IR systems. Based on various experiments, it was found that multi-level information was effective at alleviating critical lexical disagreement problems in sentence retrieval. It was also found that the proposed system was suitable for sentence retrieval applications such as translation assistance systems.
Cluster Computing | 2012
Chongmyung Park; Harksoo Kim; Inbum Jung
In wireless sensor networks, when a sensor node detects events in the surrounding environment, the sensing period for learning detailed information is likely to be short. However, a short sensing cycle increases the data traffic of the sensor nodes in a routing path. Since a high traffic load causes data queue overflow in the sensor nodes, important information about urgent events could be lost. In addition, since the battery energy of the sensor nodes is quickly exhausted, the overall lifetime of the wireless sensor network is shortened. In this paper, to address these issues, a new routing protocol based on a lightweight genetic algorithm is proposed. In the proposed method, the sensor nodes are aware of the data traffic rate so that they can monitor network congestion. In addition, the fitness function is designed from both the average and the standard deviation of the traffic rates of sensor nodes. Based on dominant gene sets in a genetic algorithm, the proposed method selects suitable data-forwarding sensor nodes to avoid heavy traffic congestion. In experiments, the proposed method demonstrates efficient data transmission with much less queue overflow and supports fair data transmission for all sensor nodes. The results show that the proposed method not only enhances the reliability of data transmission but also distributes energy consumption across the wireless sensor network.
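The abstract states only that the fitness function combines the average and the standard deviation of node traffic rates; it does not give the formula. The sketch below assumes one plausible form, `1 / (1 + mean + std)`, purely for illustration. The names `path_fitness` and `select_path` are hypothetical, and the genetic search itself (selection, crossover, mutation) is omitted.

```python
from statistics import mean, pstdev


def path_fitness(traffic_rates):
    """Score a candidate set of forwarding nodes by their traffic rates.

    Assumed form (not from the paper): 1 / (1 + mean + population std).
    Lower average traffic and a more even spread both raise the score.
    """
    if not traffic_rates:
        raise ValueError("traffic_rates must not be empty")
    return 1.0 / (1.0 + mean(traffic_rates) + pstdev(traffic_rates))


def select_path(candidate_paths):
    """Pick the candidate (dict of node id -> traffic rate) with the highest fitness."""
    return max(candidate_paths, key=lambda path: path_fitness(list(path.values())))


balanced = {"n1": 2.0, "n2": 2.0, "n3": 2.0}   # same mean, no variation
bursty = {"n1": 0.0, "n2": 4.0}                # same mean, one congested node
print(path_fitness(list(balanced.values())))   # 1 / (1 + 2 + 0)
print(path_fitness(list(bursty.values())))     # 1 / (1 + 2 + 2)
print(select_path([bursty, balanced]) is balanced)
```

Including the standard deviation penalizes candidates that route through a single congested node even when their average load looks acceptable, which matches the abstract's stated goal of fair transmission and less queue overflow.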
International Journal on Artificial Intelligence Tools | 2016
Hyeokju Ahn; Harksoo Kim
With the rapid evolution of the smart home environment, the demand for spoken information retrieval (e.g., voice-activated FAQ retrieval) on information appliances is increasing. In spoken information retrieval, users’ spoken queries are converted into text queries using automatic speech recognition (ASR) engines. If the top-1 results of the ASR engines are incorrect, the errors are propagated to the information retrieval systems. If the document collection is a small set of sentences such as frequently asked questions (FAQs), the errors have an additional effect on the performance of the information retrieval systems. To improve the performance of such a sentence retrieval system, we propose a post-processing model for an ASR engine. The post-processing model consists of a re-ranking model and a query term generation model. The re-ranking model rearranges the top-n outputs of the ASR engine using a ranking support vector machine (Ranking SVM). The query term generation model extracts meaningful content words from the re-ranked queries based on term frequencies and query rankings. In the experiments, the re-ranking model improved the top-1 results of an underlying ASR engine, with 4.4% higher precision and a 6.4% higher recall rate. The query term generation model improved the results of an underlying information retrieval system, with 2.4% to 2.6% higher accuracy. These experimental results show that the proposed model can improve the performance of a spoken sentence retrieval system in a restricted domain.
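The query term generation step, which selects content words from re-ranked ASR hypotheses using term frequencies and query rankings, might look roughly like the sketch below. The reciprocal-rank weighting, the tiny `STOPWORDS` set, and the function name `generate_query_terms` are assumptions for illustration; the abstract does not specify the scoring formula, and the Ranking SVM re-ranker is assumed to have already ordered the input.

```python
from collections import defaultdict

# Hypothetical stopword list standing in for a real content-word filter.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "my", "i", "do", "how"}


def generate_query_terms(ranked_hypotheses, top_k=3):
    """Score words across n-best ASR hypotheses (best first).

    Each occurrence of a word adds 1 / rank to its score, so a word is
    rewarded both for appearing often and for appearing in highly
    ranked hypotheses. This weighting is an assumption, not the paper's.
    """
    scores = defaultdict(float)
    for rank, hypothesis in enumerate(ranked_hypotheses, start=1):
        for word in hypothesis.lower().split():
            if word not in STOPWORDS:
                scores[word] += 1.0 / rank
    # Sort by descending score, then alphabetically for deterministic ties.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ordered[:top_k]]


n_best = [
    "how do i reset my password",
    "how do i reset my pass word",
    "how to i recent my password",
]
print(generate_query_terms(n_best, top_k=2))  # ['reset', 'password']
```

Pooling terms over several hypotheses lets a word that was misrecognized in the top-1 result still reach the retrieval system if lower-ranked hypotheses contain it.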
International Journal of Distributed Sensor Networks | 2014
Sangwoo Kang; Harksoo Kim; Hyun-Kyu Kang; Jungyun Seo
With the rapid evolution of the smart home environment, the demand for natural language processing (NLP) applications on information appliances is increasing. However, it is not easy to embed NLP-based applications in information appliances because most information appliances have hardware constraints such as small memory, limited battery capacity, and restricted processing power. In this paper, we propose a lightweight morphological analysis model, which provides the first step module of NLP for many languages. To overcome hardware constraints, the proposed model modifies a well-known left-longest-match-preference (LLMP) model and simplifies a conventional hidden Markov model (HMM). In the experiments, the proposed model exhibited good performance (a response time of 0.0195 sec per sentence, a memory usage of 1.85 MB, a precision of 92%, and a recall rate of 90%) in terms of the various evaluation measures. On the basis of these experiments, we conclude that the proposed model is suitable for natural language interfaces of information appliances with many hardware limitations because it requires less memory and consumes less battery power.
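For readers unfamiliar with longest-match segmentation, the sketch below shows the generic greedy left-to-right idea that left-longest-match-preference models build on. It is not the modified LLMP model or the simplified HMM from the paper: the function name `longest_match_segment`, the English toy lexicon, and the single-character fallback for unknown text are all assumptions made for the example.

```python
def longest_match_segment(text, lexicon):
    """Greedy left-to-right segmentation: at each position take the longest
    lexicon entry that matches, falling back to a single character."""
    max_len = max((len(entry) for entry in lexicon), default=1)
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                pieces.append(candidate)
                i += length
                break
    return pieces


lexicon = {"un", "happi", "happiness", "ness"}
print(longest_match_segment("unhappiness", lexicon))  # ['un', 'happiness']
```

A lookup of this kind needs only a lexicon set and a bounded inner loop, which is one reason longest-match approaches are attractive on memory- and power-constrained devices.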