Lawrence H. Reeve | Researchain

Publication

Featured research published by Lawrence H. Reeve.

ACM Symposium on Applied Computing | 2005

Survey of semantic annotation platforms

Lawrence H. Reeve; Hyoil Han

The realization of the Semantic Web requires the widespread availability of semantic annotations for existing and new documents on the Web. Semantic annotation tags ontology class instance data and maps it into ontology classes. The fully automatic creation of semantic annotations is an unsolved problem. Instead, current systems focus on the semi-automatic creation of annotations. The Semantic Web also requires facilities for the storage of annotations and ontologies, user interfaces, access APIs, and other features to fully support annotation usage. This paper examines current Semantic Web annotation platforms that provide annotation and related services, and reviews their architecture, approaches and performance.

Information Processing and Management | 2007

The use of domain-specific concepts in biomedical text summarization

Lawrence H. Reeve; Hyoil Han; Ari D. Brooks

Text summarization is a method for data reduction. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high volume of publications. This paper presents two independent methods (BioChain and FreqDist) for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources. Our semantic-based method (BioChain) is effective at identifying thematic sentences, while our frequency-distribution method (FreqDist) removes information redundancy. The two methods are then combined to form a hybrid method (ChainFreq). An evaluation of each method is performed using the ROUGE system to compare system-generated summaries against a set of manually-generated summaries. The BioChain and FreqDist methods outperform some common summarization systems, while the ChainFreq method improves upon the base approaches. Our work shows that the best performance is achieved when the two methods are combined. The paper also presents a brief physician's evaluation of three randomly-selected papers from an evaluation corpus to show that the author's abstract does not always reflect the entire contents of the full-text.
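
As a rough illustration of the kind of comparison ROUGE performs, the sketch below computes ROUGE-N recall of one system summary against one reference summary. It is a simplification under stated assumptions: whitespace tokenization, a single reference, and no stemming or stopword handling. The function name `rouge_n_recall` is hypothetical, and this is not the evaluation setup used in the paper.

```python
from collections import Counter


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Share of reference n-grams that also occur in the candidate.

    Counts are clipped, so a repeated n-gram in the candidate is only
    credited as often as it occurs in the reference.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()  # assumption: naive whitespace tokens
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

A higher value means the system summary recovers more of the wording of the manually generated summary.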

ACM Symposium on Applied Computing | 2006

BioChain: lexical chaining methods for biomedical text summarization

Lawrence H. Reeve; Hyoil Han; Ari D. Brooks

Lexical chaining is a technique for identifying semantically-related terms in text. We propose concept chaining to link semantically-related concepts within biomedical text together. The resulting concept chains are then used to identify candidate sentences useful for extraction. The extracted sentences are used to produce a summary of the biomedical text. The concept chaining process is adapted from existing lexical chaining approaches, which focus on chaining semantically-related terms, rather than semantically-related concepts. The Unified Medical Language System (UMLS) Metathesaurus and Semantic Network are used as semantic resources. The UMLS MetaMap Transfer tool is used to perform text-to-concept mapping. The goal is to propose concept chaining and develop a novel concept chaining system for the biomedical domain using the UMLS lexicon and the ideas of lexical chaining. The resulting concept chains from the full-text are evaluated against the concepts of a human summary (the paper's abstract). Precision is measured at 0.90 and recall at 0.92. The resulting concept chains are used to summarize the text. We also compare the generated summaries with those of existing summarization systems using sentence matching, and confirm the generated summaries are useful to a domain expert. Our results show that the proposed concept chaining is a promising methodology for biomedical text summarization.
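
The general idea of concept chaining can be sketched as follows. This is a minimal illustration, not the BioChain implementation: it assumes each sentence has already been mapped to (concept, semantic type) pairs, groups concepts into one chain per semantic type, and treats the chain with the most concept occurrences as the strongest. The scoring rule, the function names, and the hand-written annotations in `doc` are all assumptions made for the example.

```python
from collections import defaultdict


def build_chains(annotated_sentences):
    """Group concepts into chains keyed by semantic type.

    annotated_sentences: one list per sentence of (concept, semantic_type)
    pairs, standing in for the output of a text-to-concept mapper.
    """
    chains = defaultdict(list)
    for idx, concepts in enumerate(annotated_sentences):
        for concept, semantic_type in concepts:
            chains[semantic_type].append((idx, concept))
    return dict(chains)


def strongest_chain(chains):
    """Assumed scoring: the chain with the most concept occurrences wins."""
    return max(chains, key=lambda semantic_type: len(chains[semantic_type]))


def candidate_sentences(chains, semantic_type):
    """Indices of sentences that contribute a concept to the given chain."""
    return sorted({idx for idx, _ in chains[semantic_type]})


# Toy, hand-annotated document (illustrative only).
doc = [
    [("Breast Carcinoma", "Neoplastic Process"),
     ("Tamoxifen", "Pharmacologic Substance")],
    [("Breast Carcinoma", "Neoplastic Process"),
     ("Metastasis", "Neoplastic Process")],
    [("Questionnaire", "Intellectual Product")],
]
chains = build_chains(doc)
top = strongest_chain(chains)
summary_candidates = candidate_sentences(chains, top)
```

Here the sentences at indices 0 and 1 become extraction candidates because they carry the concepts of the dominant chain.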

Conference on Information and Knowledge Management | 2006

Concept frequency distribution in biomedical text summarization

Lawrence H. Reeve; Hyoil Han; Saya V. Nagori; Jonathan C. Yang; Tamara A. Schwimmer; Ari D. Brooks

Text summarization is a data reduction process. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high volume of publications. Our contribution is two-fold: 1) to propose the frequency of domain concepts as a method to identify important sentences within a full-text; and 2) to propose a novel frequency distribution model and algorithm for identifying important sentences based on term or concept frequency distribution. An evaluation of several existing summarization systems using biomedical texts is presented in order to determine a performance baseline. For domain concept comparison, a recent high-performing frequency-based algorithm using terms is adapted to use concepts and evaluated using both terms and concepts. It is shown that the use of concepts performs comparably to the use of terms for sentence selection. Our proposed frequency distribution model and algorithm outperform a state-of-the-art approach.
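
One way to picture sentence selection driven by a frequency distribution is a greedy loop that keeps adding the sentence whose inclusion makes the summary's term or concept distribution most similar to that of the source. The sketch below is an assumption-laden illustration rather than the published algorithm: the abstract does not name a similarity measure, so cosine similarity is used as a stand-in, and `select_sentences` is a hypothetical name.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two frequency distributions."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_sentences(sentence_units, k):
    """Greedily pick k sentences whose pooled distribution tracks the source.

    sentence_units: one list of terms or concepts per sentence.
    Ties are broken in favour of the earliest sentence.
    """
    source = Counter(unit for units in sentence_units for unit in units)
    summary, chosen = Counter(), []
    remaining = set(range(len(sentence_units)))
    while remaining and len(chosen) < k:
        best = max(
            sorted(remaining),
            key=lambda i: cosine(summary + Counter(sentence_units[i]), source),
        )
        chosen.append(best)
        summary += Counter(sentence_units[best])
        remaining.remove(best)
    return sorted(chosen)
```

With the toy input `[["a", "b"], ["a", "b"], ["c"], ["a"]]` and `k=2`, the loop picks sentence 0 and then prefers sentence 3 over the exact duplicate at index 1, because a second copy adds nothing that moves the summary closer to the source distribution.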

Data Integration in the Life Sciences | 2007

CONANN: an online biomedical concept annotator

Lawrence H. Reeve; Hyoil Han

We describe our biomedical concept annotator designed for online environments, CONANN, which takes a biomedical source phrase and finds the best-matching biomedical concept from a domain resource. Domain concepts are defined in resources such as the United States National Library of Medicine's Unified Medical Language System Metathesaurus. CONANN uses an incremental filtering approach to narrow down a list of candidate phrases before deciding on a best match. We show that this approach has the advantage of improving annotation speed over an existing state-of-the-art concept annotator, facilitating the use of concept annotation in online environments. Our main contributions are 1) the design of a phrase-unit concept annotator more readily usable in online environments than existing systems, 2) the introduction of a model which uses semantically focused words in a given ontology (e.g., UMLS) to measure coverage, called Inverse Phrase Frequency, and 3) the use of two different filters to measure coverage and coherence between a source phrase and a domain-specific candidate phrase. An intrinsic evaluation comparing CONANN's concept output to a state-of-the-art concept annotator shows our system has an annotation precision ranging from 90% for exact concept matching to 95% for relaxed concept matching, while average phrase annotation time is eighteen times faster. In addition, an extrinsic evaluation using the generated concepts in a text summarization task shows no significant degradation when using CONANN.
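
The two-stage filtering idea can be sketched as below. Everything here is an assumption for illustration, not CONANN's actual design: Inverse Phrase Frequency is modelled by analogy with inverse document frequency over a small phrase list standing in for the ontology, coverage is the IPF-weighted share of a candidate's words found in the source phrase, and coherence is the share of the candidate's adjacent word pairs that also occur in the source. All function names and the 0.4 threshold are hypothetical.

```python
import math


def inverse_phrase_frequency(phrases):
    """Weight each word by how few phrases contain it (IDF-style, +1 smoothing)."""
    tokenized = [set(p.lower().split()) for p in phrases]
    vocab = set().union(*tokenized)
    n = len(tokenized)
    return {w: math.log(n / sum(w in t for t in tokenized)) + 1.0 for w in vocab}


def coverage(source, candidate, ipf):
    """IPF-weighted share of the candidate's words that appear in the source."""
    src = set(source.lower().split())
    cand = candidate.lower().split()
    total = sum(ipf.get(w, 1.0) for w in cand)
    matched = sum(ipf.get(w, 1.0) for w in cand if w in src)
    return matched / total if total else 0.0


def coherence(source, candidate):
    """Share of the candidate's adjacent word pairs also found in the source."""
    def bigrams(text):
        tokens = text.lower().split()
        return list(zip(tokens, tokens[1:]))

    cand_bigrams = bigrams(candidate)
    if not cand_bigrams:  # single-word candidate: fall back to membership
        return 1.0 if candidate.lower() in source.lower().split() else 0.0
    src_bigrams = set(bigrams(source))
    return sum(b in src_bigrams for b in cand_bigrams) / len(cand_bigrams)


def best_match(source, candidates, ipf, min_coverage=0.4):
    """Filter 1 drops low-coverage candidates; filter 2 ranks by coherence."""
    survivors = [c for c in candidates if coverage(source, c, ipf) >= min_coverage]
    if not survivors:
        return None
    return max(survivors, key=lambda c: (coherence(source, c), coverage(source, c, ipf)))


# Toy candidate list standing in for ontology phrases (illustrative only).
candidates = ["lung cancer", "cancer of lung", "skin cancer", "screening procedure"]
ipf = inverse_phrase_frequency(candidates)
match = best_match("lung cancer screening", candidates, ipf)
```

Narrowing the list with a cheap coverage test before applying a second check reflects the incremental filtering described in the abstract; the specific measures chosen here are only one plausible reading.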

Data Mining in Bioinformatics | 2007

Biomedical text summarisation using concept chains

Lawrence H. Reeve; Hyoil Han; Ari D. Brooks

BioChainSumm is a biomedical text summariser utilising concept chaining (called BioChain) to link semantically-related concepts within biomedical text together. The BioChain process is adapted from existing lexical chaining approaches which chain semantically-related terms rather than concepts. The BioChain concept chains are used to identify salient candidate sentences which are extracted to produce a summary of the biomedical text. The Unified Medical Language System Metathesaurus and Semantic Network semantic resources identify related biomedical concepts. BioChainSumm is evaluated using the ROUGE system along with several existing, publicly-available summarisers. Our results show BioChain provides a promising methodology for biomedical text summarisation.

Archive | 2006

Information Visualization and the Semantic Web

Lawrence H. Reeve; Hyoil Han; Chaomei Chen

The Semantic Web emphasizes that data should be machine-understandable, whereas information visualization aims to maximize our perceptual and cognitive abilities to make sense of visual-spatial representations of abstract information structures. One of the fundamental requirements of the Semantic Web is to annotate Web data with ontologies to achieve a machine-understandable Web. Can the two fields work well together? Our illustrative example is intended to demonstrate that, on the one hand, the Semantic Web can largely simplify some information visualization tasks today, and semantic annotation can be utilized for semantic visualization; on the other hand, the two fields differ from their philosophical groundings to tactical approaches to individual problems such as knowledge modeling and representation. A lot of theoretical and practical work remains to be done to find the right track for the two fields to work together harmoniously.

Bioinformatics and Biomedicine | 2008

Online Biomedical Concept Annotation Using Language Model Mapping

Lawrence H. Reeve; Hyoil Han; Ari D. Brooks

We report the results of applying language technology to the bioinformatics problem of online concept annotation of biomedical text. We extend our concept annotator, CONANN, to find biomedical concepts in text using concept language models. The goal of CONANN is to improve annotation speed without losing annotation accuracy as compared to offline systems, facilitating the use of concept annotation in online environments. Intrinsic and extrinsic evaluations show accuracy competitive with a state-of-the-art biomedical text concept annotator with a speed improvement of more than four times.

Archive | 2005