Hyoil Han | Researchain

Publication

Featured research published by Hyoil Han.

International Conference on Management of Data | 2006

A survey on ontology mapping

Namyoun Choi; Il-Yeol Song; Hyoil Han

Ontology is increasingly seen as a key factor for enabling interoperability across heterogeneous systems and semantic web applications. Ontology mapping is required for combining distributed and heterogeneous ontologies. Developing such ontology mapping has been a core issue of recent ontology research. This paper presents ontology mapping categories, describes the characteristics of each category, compares these characteristics, and surveys tools, systems, and related work based on each category of ontology mapping. We believe this paper provides readers with a comprehensive understanding of ontology mapping and points to various research topics about the specific roles of ontology mapping.

ACM Symposium on Applied Computing | 2005

Survey of semantic annotation platforms

Lawrence H. Reeve; Hyoil Han

The realization of the Semantic Web requires the widespread availability of semantic annotations for existing and new documents on the Web. Semantic annotations tag ontology class instance data and map it into ontology classes. The fully automatic creation of semantic annotations is an unsolved problem. Instead, current systems focus on the semi-automatic creation of annotations. The Semantic Web also requires facilities for the storage of annotations and ontologies, user interfaces, access APIs, and other features to fully support annotation usage. This paper examines current Semantic Web annotation platforms that provide annotation and related services, and reviews their architecture, approaches, and performance.

Information Processing and Management | 2007

The use of domain-specific concepts in biomedical text summarization

Lawrence H. Reeve; Hyoil Han; Ari D. Brooks

Text summarization is a method for data reduction. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high volume of publications. This paper presents two independent methods (BioChain and FreqDist) for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources. Our semantic-based method (BioChain) is effective at identifying thematic sentences, while our frequency-distribution method (FreqDist) removes information redundancy. The two methods are then combined to form a hybrid method (ChainFreq). An evaluation of each method is performed using the ROUGE system to compare system-generated summaries against a set of manually generated summaries. The BioChain and FreqDist methods outperform some common summarization systems, while the ChainFreq method improves upon the base approaches. Our work shows that the best performance is achieved when the two methods are combined. The paper also presents a brief physician's evaluation of three randomly selected papers from an evaluation corpus to show that the author's abstract does not always reflect the entire contents of the full text.
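The ROUGE comparison mentioned in the abstract above can be illustrated with a minimal sketch of unigram recall (ROUGE-1 recall). This is not the ROUGE toolkit used in the paper: the function name `rouge1_recall`, the whitespace tokenization, and the example sentences are assumptions made only for illustration.

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Share of reference unigrams that also occur in the candidate.

    Counts are clipped, so a token repeated in the reference is only
    matched as many times as it appears in the candidate.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    if not ref_counts:
        return 0.0
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    return overlap / sum(ref_counts.values())


# Hypothetical system summary and manually written summary.
system_summary = "the drug reduced tumor size"
manual_summary = "the drug reduced the tumor size significantly"
score = rouge1_recall(system_summary, manual_summary)
print(f"ROUGE-1 recall: {score:.3f}")
```

A higher score means the system summary covers more of the wording of the manual summary; it says nothing about fluency or factual correctness.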

ACM Symposium on Applied Computing | 2006

Approaches to text mining for clinical medical records

Xiaohua Zhou; Hyoil Han; Isaac Chankai; Ann A. Prestrud; Ari D. Brooks

Clinical medical records contain a wealth of information, largely in free-text form. Extracting structured information from free-text records is an important research endeavor. In this paper, we describe a MEDical Information Extraction (MedIE) system that extracts and mines a variety of information about patients with breast complaints from free-text clinical records. MedIE is part of a medical text mining project being conducted at Drexel University. Three approaches are proposed to solve different IE tasks, and very good performance (precision and recall) was achieved. A graph-based approach that uses the parsing result of a link-grammar parser was invented for relation extraction; high accuracy was achieved. A simple but efficient ontology-based approach was adopted to extract medical terms of interest. Finally, an NLP-based feature extraction method coupled with an ID3-based decision tree was used to perform text classification.
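ID3, named in the abstract above, chooses the feature to split on by information gain. The sketch below shows only that selection criterion on a toy data set; it is not the MedIE implementation, and the feature names (`mentions_mass`, `mentions_pain`) and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


# Toy, made-up features extracted from four hypothetical records.
rows = [
    {"mentions_mass": True, "mentions_pain": True},
    {"mentions_mass": True, "mentions_pain": False},
    {"mentions_mass": False, "mentions_pain": True},
    {"mentions_mass": False, "mentions_pain": False},
]
labels = ["abnormal", "abnormal", "normal", "normal"]

gains = {f: information_gain(rows, labels, f) for f in rows[0]}
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

In this toy set `mentions_mass` separates the two classes perfectly, so ID3 would split on it first; a full ID3 tree repeats the same selection recursively on each resulting subset.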

Data Warehousing and Knowledge Discovery | 2005

XML-OLAP: a multidimensional analysis framework for XML warehouses

Byung-Kwon Park; Hyoil Han; Il-Yeol Song

A large number of XML documents have recently become available on the Internet. This trend has motivated many researchers to analyze them multi-dimensionally in the same way as relational data. In this paper, we propose a new framework for multidimensional analysis of XML documents, which we call XML-OLAP. We base XML-OLAP on XML warehouses where all fact data as well as dimension data are stored as XML documents. We build XML cubes from XML warehouses. We propose a new multidimensional expression language for XML cubes, which we call XML-MDX. XML-MDX statements target XML cubes and use XQuery expressions to designate the measure data. They specify text mining operators for aggregating text constituting the measure data. We evaluate XML-OLAP by applying it to a U.S. patent XML warehouse. We use XML-MDX queries, which demonstrate that XML-OLAP is effective for multi-dimensionally analyzing the U.S. patents.

International Conference on Data Engineering | 2005

Converting Semi-structured Clinical Medical Records into Information and Knowledge

Xiaohua Zhou; Hyoil Han; Isaac Chankai; Ann A. Prestrud; Ari D. Brooks

Clinical medical records contain a wealth of information, largely in free-text form. Thus, extracting structured information from free-text records has become an important research endeavor. In this paper, we propose and implement an information extraction system that extracts three types of information (numeric values, medical terms, and categorical values) from semi-structured patient records. Three approaches are proposed to solve the problems posed by each of the three types of values, respectively, and very good performance (precision and recall) is achieved. A novel link-grammar-based approach was invented to associate a feature with a number in a sentence, and extremely high accuracy was achieved. A simple but efficient approach, using POS-based patterns and a domain ontology, was adopted to extract medical terms of interest. Finally, an NLP-based feature extraction method coupled with an ID3-based decision tree is used to classify and extract categorical cases. This preliminary approach to categorical fields has, so far, proven to be quite effective.

ACM Symposium on Applied Computing | 2007

Semantically enhanced user modeling

Palakorn Achananuparp; Hyoil Han; Olfa Nasraoui; R. M. Johnson

Content-based implicit user modeling techniques usually employ a traditional term vector as a representation of the user's interest. However, due to the problem of dimensionality in the vector space model, a simple term vector is not a sufficient representation of the user model, as it ignores the semantic relations between terms. In this paper, we present a novel method to enhance a traditional term-based user model with WordNet-based semantic similarity techniques. To achieve this, we use word definitions and relationship hierarchies in WordNet to perform word sense disambiguation and employ domain-specific concepts as category labels for the derived user models. We tested our method on Windows to the Universe, a public educational website covering subjects in the Earth and Space Sciences, and performed an evaluation of our semantically enhanced user models against human judgment. Our approach is distinguishable from existing work because we automatically narrow down the set of domain-specific concepts from initial domain concepts obtained from Wikipedia and because we automatically create semantically enhanced user models.
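The limitation of plain term vectors described above can be seen in a small sketch: cosine similarity over raw term counts gives no credit to related words that do not match literally. The profile and page contents below are hypothetical, and the sketch shows only the baseline problem, not the WordNet-based enhancement proposed in the paper.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user profile and two pages, as bags of words.
profile = Counter("planet orbit star".split())
page_shared_terms = Counter("planet orbit moon".split())
page_related_terms = Counter("world trajectory sun".split())

sim_shared = cosine(profile, page_shared_terms)
sim_related = cosine(profile, page_related_terms)
print(sim_shared, sim_related)
```

The second page is topically close to the profile but shares no literal term with it, so its similarity is zero. A semantically enhanced model would need some mapping from terms to related concepts to close that gap.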

Computer-Based Medical Systems | 2006

An Infrastructure of Stream Data Mining, Fusion and Management for Monitored Patients

Hyoil Han; Han C. Ryoo; Herbert Patrick

This paper proposes an infrastructure for data mining, fusion, and patient care management using continuous stream data monitored from critically ill patients. Stream data mining, fusion, and management provide efficient ways to increase data utilization and to support knowledge discovery, which can be utilized in many clinical areas to improve the quality of patient care services. The primary goal of our work is to establish a customized infrastructure model designed for critical care services at hospitals. However, this structure can be easily expanded to other clinical specialties.

ACM Symposium on Applied Computing | 2006

BioChain: lexical chaining methods for biomedical text summarization

Lawrence H. Reeve; Hyoil Han; Ari D. Brooks

Lexical chaining is a technique for identifying semantically related terms in text. We propose concept chaining to link semantically related concepts within biomedical text together. The resulting concept chains are then used to identify candidate sentences useful for extraction. The extracted sentences are used to produce a summary of the biomedical text. The concept chaining process is adapted from existing lexical chaining approaches, which focus on chaining semantically related terms rather than semantically related concepts. The Unified Medical Language System (UMLS) Metathesaurus and Semantic Network are used as semantic resources. The UMLS MetaMap Transfer tool is used to perform text-to-concept mapping. The goal is to propose concept chaining and develop a novel concept chaining system for the biomedical domain using the UMLS lexicon and the ideas of lexical chaining. The resulting concept chains from the full text are evaluated against the concepts of a human summary (the paper's abstract). Precision is measured at 0.90 and recall at 0.92. The resulting concept chains are used to summarize the text. We also evaluate generated summaries using existing summarization systems using sentence matching, and confirm the generated summaries are useful to a domain expert. Our results show that the proposed concept chaining is a promising methodology for biomedical text summarization.
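The precision and recall figures above compare the concepts found in the chains with the concepts of the human summary. A minimal sketch of that kind of set comparison follows; the concept identifiers are placeholders rather than real UMLS concept identifiers, and the numbers it produces are unrelated to the results reported in the paper.

```python
def precision_recall(predicted: set, gold: set):
    """Precision and recall of a predicted concept set against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Placeholder identifiers (hypothetical, not real UMLS concepts).
chain_concepts = {"C001", "C002", "C003", "C004"}
abstract_concepts = {"C002", "C003", "C004", "C005", "C006"}

precision, recall = precision_recall(chain_concepts, abstract_concepts)
print(precision, recall)
```

Precision falls when the chains contain concepts absent from the abstract, and recall falls when concepts in the abstract are missed by the chains.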

International Journal of Business Intelligence and Data Mining | 2005

Temporal rule induction for clinical outcome analysis

Xiaohua Hu; Il-Yeol Song; Hyoil Han; Illhoi Yoo; Ann A. Prestrud; Murray F. Brennan; Ari D. Brooks

Clinical outcomes analysis normally covers a particular time period. The sample under study is constantly changing as patients are censored, leave the study, or die. In this paper, we present a novel data mining approach to mine temporal rules that reflect characteristics of outcomes analysis. We apply our temporal rule induction algorithm to a set of cancer patients' clinical records that were prospectively collected for 20 years. We analyse clinical data not only based on the static event, such as local recurrence for survival analysis, but also based on the temporal event with censored data for each time unit. The rules extracted from our temporal rule induction algorithm are compared to results from statistical analysis. The importance of this paper is that this novel temporal rule induction algorithm provides valuable insights for clinical data assessment and complements traditional statistical analysis.
