Frank W. Hartel
National Institutes of Health
Publication
Featured research published by Frank W. Hartel.
Journal of Biomedical Informatics | 2007
Nicholas Sioutos; Sherri de Coronado; Margaret W. Haber; Frank W. Hartel; Wen-Ling Shaiu; Lawrence W. Wright
Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning these efforts. It is designed to meet the growing need for accurate, comprehensive, and shared terminology, covering topics including: cancers, findings, drugs, therapies, anatomy, genes, pathways, cellular and subcellular processes, proteins, and experimental organisms. The NCI Thesaurus provides a partial model of how these things relate to each other, responding to actual user needs and implemented in a deductive logic framework that can help maintain the integrity and extend the informational power of what is provided. This paper presents the semantic model for cancer diseases and its uses in integrating clinical and molecular knowledge, more briefly examines the models and uses for drug, biochemical pathway, and mouse terminology, and discusses limits of the current approach and directions for future work.
Journal of Web Semantics | 2003
Jennifer Golbeck; Gilberto Fragoso; Frank W. Hartel; James A. Hendler; Jim Oberthaler; Bijan Parsia
The NCI Thesaurus is a public domain description logic-based terminology produced by the National Cancer Institute, and distributed as a component of the NCI Center for Bioinformatics caCORE distribution. It is deep and complex compared to most broad clinical vocabularies, implementing rich semantic interrelationships between the nodes of its taxonomies. The semantic relationships in the Thesaurus are intended to facilitate translational research and to support the bioinformatics infrastructure of the Institute.
Bioinformatics | 2003
Peter A. Covitz; Frank W. Hartel; Carl F. Schaefer; Sherri de Coronado; Gilberto Fragoso; Himanso Sahni; Scott Gustafson; Kenneth H. Buetow
MOTIVATION Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets require annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications.
RESULTS We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources.
AVAILABILITY caCORE downloads and web interfaces can be accessed from links on the caCORE web site (http://ncicb.nci.nih.gov/core). caBIO software is distributed under an open source license that permits unrestricted academic and commercial use. Vocabulary and metadata content in the EVS and caDSR, respectively, is similarly unrestricted, and is available through web applications and FTP downloads.
SUPPLEMENTARY INFORMATION http://ncicb.nci.nih.gov/core/publications contains links to the caBIO 1.0 class diagram and the caCORE 1.0 Technical Guide, which provide detailed information on the present caCORE architecture, data sources and APIs. Updated information appears on a regular basis on the caCORE web site (http://ncicb.nci.nih.gov/core).
Journal of Biomedical Informatics | 2005
Frank W. Hartel; Sherri de Coronado; Robert Dionne; Gilberto Fragoso; Jennifer Golbeck
The National Cancer Institute has developed the NCI Thesaurus, a biomedical vocabulary for cancer research, covering terminology across a wide range of cancer research domains. A major design goal of the NCI Thesaurus is to facilitate translational research. We describe the features of Ontylog, a description logic used to build NCI Thesaurus; our methodology for enhancing the terminology through collaboration between ontologists and domain experts, and for addressing certain real world challenges arising in modeling the Thesaurus; and the conversion of NCI Thesaurus from Ontylog into Web Ontology Language Lite. Ontylog has proven well suited for constructing big biomedical vocabularies. We have capitalized on the Ontylog constructs Kind and Role in the collaboration process described in this paper to facilitate communication between ontologists and domain experts. The artifacts and processes developed by NCI for collaboration may be useful in other biomedical terminology development efforts.
Comparative and Functional Genomics | 2004
Gilberto Fragoso; Sherri de Coronado; Margaret Haber; Frank W. Hartel; Larry Wright
The NCI Thesaurus is a reference terminology covering areas of basic and clinical science, built with the goal of facilitating translational research in cancer. It contains nearly 110,000 terms in approximately 36,000 concepts, partitioned in 20 subdomains, which include diseases, drugs, anatomy, genes, gene products, techniques, and biological processes, among others, all with a cancer-centric focus in content, and originally designed to support coding activities across the National Cancer Institute. Each concept represents a unit of meaning and contains a number of annotations, such as synonyms and preferred name, as well as annotations such as textual definitions and optional references to external authorities. In addition, concepts are modelled with description logic (DL) and defined by their relationships to other concepts; there are currently approximately 90 types of named relations declared in the terminology. The NCI Thesaurus is produced by the Enterprise Vocabulary Services project, a collaborative effort between the NCI Center for Bioinformatics and the NCI Office of Communications, and is part of the caCORE infrastructure stack (http://ncicb.nci.nih.gov/NCICB/core). It can be accessed programmatically through the open caBIO API and browsed via the web (http://nciterms.nci.nih.gov). A history of editing changes is also accessible through the API. In addition, the Thesaurus is available for download in various file formats, including OWL, the web ontology language, to facilitate its utilization by others.
Applied Ontology | 2008
Natalya Fridman Noy; Sherri de Coronado; Harold R. Solbrig; Gilberto Fragoso; Frank W. Hartel; Mark A. Musen
The National Cancer Institute's (NCI) Thesaurus is a biomedical reference ontology. The NCI Thesaurus is represented using description logic, more specifically Ontylog, a description logic implemented by Apelon, Inc. We are exploring the use of the DL species of the Web Ontology Language (OWL DL), a W3C recommended standard for ontology representation, instead of Ontylog for representing the NCI Thesaurus. We have studied the requirements for knowledge representation of the NCI Thesaurus, and considered how OWL DL (and its implementation in Protégé-OWL) satisfies these requirements. In this paper, we discuss the areas where OWL DL was sufficient for representing required components, where tool support that would hide some of the complexity and extra levels of indirection would be required, and where language expressiveness is not sufficient given the representation requirements. Because many of the knowledge-representation issues that we encountered are very similar to the issues in representing other biomedical terminologies and ontologies in general, we believe that the lessons that we learned and the approaches that we developed will prove useful and informative for other researchers.
Medical Informatics Europe | 2003
Michael N. Cantor; Indra Neil Sarkar; Rony Gelman; Frank W. Hartel; Olivier Bodenreider; Yves A. Lussier
Integration of disparate biomedical terminologies is becoming increasingly important as links between biological science and clinical medicine grow. Mapping concepts in the Gene Ontology (GO) to the UMLS may help further this integration and allow for more efficient information exchange among researchers. Using a gold standard of GO term-to-UMLS concept mappings provided by the NCI, we examined the performance of various published and combined mapping techniques, in order to maximize precision and recall. We found that for the previously published techniques, precision ranged from 0.61 to 0.95 and recall from 0.65 to 0.90, whereas for the hybrid techniques, precision ranged from 0.66 to 0.97 and recall from 0.59 to 0.93. Our study reveals the benefits of using mapping techniques that incorporate domain knowledge, and provides a basis for future approaches to mapping between distinct biomedical vocabularies.
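As an illustration of how the precision and recall figures in the abstract above are conventionally computed against a gold standard, here is a minimal Python sketch. It is not the authors' evaluation code: the function name `precision_recall` and the term and concept identifiers are hypothetical placeholders, and mappings are assumed to be comparable as exact (GO term, UMLS concept) pairs.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted mappings against a gold standard.

    Both arguments are iterables of (source_term, target_concept) pairs.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of proposed mappings that are in the gold standard.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold-standard mappings that were proposed.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical pairs; the identifiers are placeholders, not real GO or UMLS codes.
gold = {("GO:A", "C1"), ("GO:B", "C2"), ("GO:C", "C3"), ("GO:D", "C4")}
predicted = {("GO:A", "C1"), ("GO:B", "C2"), ("GO:C", "C9")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the three proposed mappings are correct and two of the four gold-standard mappings are found, which shows how a technique can trade recall for precision.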
Pacific Symposium on Biocomputing | 2003
Indra Neil Sarkar; Michael N. Cantor; Rony Gelman; Frank W. Hartel; Yves A. Lussier