Publication


Featured research published by Christopher Ochs.


Journal of the American Medical Informatics Association | 2015

Scalable quality assurance for large SNOMED CT hierarchies using subject-based subtaxonomies

Christopher Ochs; James Geller; Yehoshua Perl; Yan Chen; Junchuan Xu; Hua Min; James T. Case; Zhi Wei

OBJECTIVE Standards terminologies may be large and complex, making their quality assurance challenging. Some terminology quality assurance (TQA) methodologies are based on abstraction networks (AbNs), compact terminology summaries. We have tested AbNs and the performance of related TQA methodologies on small terminology hierarchies. However, some standards terminologies, for example, SNOMED, are composed of very large hierarchies. Scaling AbN TQA techniques to such hierarchies poses a significant challenge. We present a scalable subject-based approach for AbN TQA. METHODS An innovative technique is presented for scaling TQA by creating a new kind of subject-based AbN called a subtaxonomy for large hierarchies. New hypotheses about concentrations of erroneous concepts within the AbN are introduced to guide scalable TQA. RESULTS We test the TQA methodology for a subject-based subtaxonomy for the Bleeding subhierarchy in SNOMED's large Clinical finding hierarchy. To test the error concentration hypotheses, three domain experts reviewed a sample of 300 concepts. A consensus-based evaluation identified 87 erroneous concepts. The subtaxonomy-based TQA methodology was shown to uncover statistically significantly more erroneous concepts when compared to a control sample. DISCUSSION The scalability of TQA methodologies is a challenge for large standards systems like SNOMED. We demonstrated innovative subject-based TQA techniques by identifying groups of concepts with a higher likelihood of having errors within the subtaxonomy. Scalability is achieved by reviewing a large hierarchy by subject. CONCLUSIONS An innovative methodology for scaling the derivation of AbNs and a TQA methodology was shown to perform successfully for the largest hierarchy of SNOMED.


Journal of the American Medical Informatics Association | 2015

A tribal abstraction network for SNOMED CT target hierarchies without attribute relationships

Christopher Ochs; James Geller; Yehoshua Perl; Yan Chen; Ankur Agrawal; James T. Case; George Hripcsak

OBJECTIVE Large and complex terminologies, such as Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT), are prone to errors and inconsistencies. Abstraction networks are compact summarizations of the content and structure of a terminology. Abstraction networks have been shown to support terminology quality assurance. In this paper, we introduce an abstraction network derivation methodology which can be applied to SNOMED CT target hierarchies whose classes are defined using only hierarchical relationships (ie, without attribute relationships) and similar description-logic-based terminologies. METHODS We introduce the tribal abstraction network (TAN), based on the notion of a tribe, a subhierarchy rooted at a child of a hierarchy root, assuming only the existence of concepts with multiple parents. The TAN summarizes a hierarchy that does not have attribute relationships using sets of concepts, called tribal units, that belong to exactly the same multiple tribes. Tribal units are further divided into refined tribal units which contain closely related concepts. A quality assurance methodology that utilizes TAN summarizations is introduced. RESULTS A TAN is derived for the Observable entity hierarchy of SNOMED CT, summarizing its content. A TAN-based quality assurance review of the concepts of the hierarchy is performed, and erroneous concepts are shown to appear more frequently in large refined tribal units than in small refined tribal units. Furthermore, more erroneous concepts appear in large refined tribal units of more tribes than of fewer tribes. CONCLUSIONS In this paper we introduce the TAN for summarizing SNOMED CT target hierarchies. A TAN was derived for the Observable entity hierarchy of SNOMED CT. A quality assurance methodology utilizing the TAN was introduced and demonstrated.
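The grouping idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the hierarchy is an acyclic child-to-parents map, and the function name `tribal_units` and the toy concept names are hypothetical. For simplicity it groups every concept by its set of tribes; the groups whose key contains two or more tribes correspond to the tribal units described above. Refined tribal units are not modeled.

```python
from collections import defaultdict

def tribal_units(parents, root):
    """Group concepts by the exact set of tribes they belong to.

    parents: dict mapping each concept to a list of its parents (assumed acyclic).
    root:    the hierarchy root; each child of the root is the root of one tribe.
    """
    cache = {}

    def tribes_of(concept):
        # A concept's tribes are the children of the root found among its ancestors
        # (a child of the root belongs to its own tribe).
        if concept in cache:
            return cache[concept]
        found = set()
        for parent in parents.get(concept, []):
            if parent == root:
                found.add(concept)
            else:
                found |= tribes_of(parent)
        cache[concept] = frozenset(found)
        return cache[concept]

    units = defaultdict(set)
    for concept in parents:
        if concept != root:
            units[tribes_of(concept)].add(concept)
    return dict(units)

# Hypothetical toy hierarchy: AB1 has parents in two tribes, AB2 inherits both.
toy = {
    "Root": [],
    "A": ["Root"], "B": ["Root"],
    "A1": ["A"], "B1": ["B"],
    "AB1": ["A", "B"], "AB2": ["AB1"],
}
for tribes, members in tribal_units(toy, "Root").items():
    print(sorted(tribes), sorted(members))
```

In the toy hierarchy, `AB1` and `AB2` end up in the same group because both descend from the tribe roots `A` and `B`.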


Journal of Biomedical Informatics | 2016

Utilizing a structural meta-ontology for family-based quality assurance of the BioPortal ontologies

Christopher Ochs; Zhe He; Ling Zheng; James Geller; Yehoshua Perl; George Hripcsak; Mark A. Musen

An Abstraction Network is a compact summary of an ontology's structure and content. In previous research, we showed that Abstraction Networks support quality assurance (QA) of biomedical ontologies. The development of an Abstraction Network and its associated QA methodologies, however, is a labor-intensive process that previously was applicable only to one ontology at a time. To improve the efficiency of the Abstraction-Network-based QA methodology, we introduced a QA framework that uses uniform Abstraction Network derivation techniques and QA methodologies that are applicable to whole families of structurally similar ontologies. For the family-based framework to be successful, it is necessary to develop a method for classifying ontologies into structurally similar families. We now describe a structural meta-ontology that classifies ontologies according to certain structural features that are commonly used in the modeling of ontologies (e.g., object properties) and that are important for Abstraction Network derivation. Each class of the structural meta-ontology represents a family of ontologies with identical structural features, indicating which types of Abstraction Networks and QA methodologies are potentially applicable to all of the ontologies in the family. We derive a collection of 81 families, corresponding to classes of the structural meta-ontology, that enable a flexible, streamlined family-based QA methodology, offering multiple choices for classifying an ontology. The structure of 373 ontologies from the NCBO BioPortal is analyzed and each ontology is classified into multiple families modeled by the structural meta-ontology.


Journal of Bioinformatics and Computational Biology | 2016

Quality assurance of the gene ontology using abstraction networks

Christopher Ochs; Yehoshua Perl; Michael Halper; James Geller; Jane Lomax

The gene ontology (GO) is used extensively in the field of genomics. Like other large and complex ontologies, quality assurance (QA) efforts for GO's content can be laborious and time consuming. Abstraction networks (AbNs) are summarization networks that reveal and highlight high-level structural and hierarchical aggregation patterns in an ontology. They have been shown to successfully support QA work in the context of various ontologies. Two kinds of AbNs, called the area taxonomy and the partial-area taxonomy, are developed for GO hierarchies and derived specifically for the biological process (BP) hierarchy. Within this framework, several QA heuristics, based on the identification of groups of anomalous terms which exhibit certain taxonomy-defined characteristics, are introduced. Such groups are expected to have higher error rates when compared to other terms. Thus, by focusing QA efforts on anomalous terms one would expect to find relatively more erroneous content. By automatically identifying these potential problem areas within an ontology, time and effort will be saved during manual reviews of GO's content. BP is used as a testbed, with samples of three kinds of anomalous BP terms chosen for a taxonomy-based QA review. Additional heuristics for QA are demonstrated. From the results of this QA effort, it is observed that different kinds of inconsistencies in the modeling of GO can be exposed with the use of the proposed heuristics. For comparison, the results of QA work on a sample of terms chosen from GO's general population are presented.


Bioinformatics and Biomedicine | 2015

Using aggregate taxonomies to summarize SNOMED CT evolution

Christopher Ochs; Yehoshua Perl; James Geller; Mark A. Musen

Terminologies are typically large and complex knowledge systems. It is difficult to obtain an orientation into their structure and content. In previous research we designed compact summary networks called partial-area taxonomies to provide a structural summary of a terminology. The sizes of a terminology and of its partial-area taxonomy are defined as their numbers of nodes. While a partial-area taxonomy is typically smaller than the original terminology, it is often not compact enough to provide a clear “big picture,” due to too many nodes that summarize only a small number of terminology concepts. The display of such a partial-area taxonomy is still overwhelming. In this paper, we introduce a more compact summary of a terminology, called an aggregate taxonomy, obtained by aggregating small partial-area taxonomy nodes into larger nodes. We present a parametrized technique to study the design of such an aggregate taxonomy and apply it to the Specimen hierarchy of SNOMED CT. A software tool for creating and displaying aggregate taxonomies is described. We illustrate how aggregate taxonomies derived across multiple SNOMED CT releases can be used to summarize the evolution of the Specimen hierarchy's content over eight years of SNOMED CT releases.


Bioinformatics and Biomedicine | 2015

Algorithmic detection of inconsistent modeling among SNOMED CT concepts by combining lexical and structural indicators

Ankur Agrawal; Yehoshua Perl; Christopher Ochs; Gai Elhanan

SNOMED CT is important for clinical applications, such as Electronic Health Record (EHR) encoding. However, inconsistency in modeling its concepts may prevent SNOMED CT from providing proper support for clinical use. This study provides an effective methodology for locating inconsistently modeled SNOMED CT concepts. One can expect lexically similar concepts to be modeled similarly. Positional similarity sets, sets of lexically similar concepts having only one different word at the same position of their names, are introduced. Concepts in such sets have a higher likelihood of being unjustifiably inconsistently modeled. A technique to incorporate three structural indicators into the selected sets is provided to further improve the likelihood of finding inconsistently modeled concepts. An analysis of a sample of 50 such sets and for each of these three indicators is performed. The sample of positional similarity sets is found to have 18.6% inconsistent concepts. The use of structural indicators is shown to further improve the likelihood of finding inconsistently modeled concepts up to 41.6% with high statistical significance when compared to the previous sample of positional similarity sets. Positional similarity sets with different structural indicators are shown to help identify inconsistencies in concept modeling with high likelihood. Furthermore, such sets enable the comparison of concept modeling in the context of other lexically similar concepts, which enhances the effectiveness of corrections by auditors. Such quality assurance methods can be used to supplement IHTSDO's own efforts in order to improve the quality of SNOMED CT.
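The definition of a positional similarity set given in this abstract can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the authors' algorithm: names are tokenized on whitespace and compared case-insensitively, the function name `positional_similarity_sets` is hypothetical, the sample names are invented for the example, and the three structural indicators are not modeled.

```python
from collections import defaultdict

def positional_similarity_sets(names):
    """Group names that have the same number of words and differ in exactly
    one word, located at the same position in every member of the group."""
    buckets = defaultdict(set)
    for name in names:
        words = name.lower().split()
        for i in range(len(words)):
            # Key: the position being varied plus all the other words, in order.
            key = (i, tuple(words[:i] + words[i + 1:]))
            buckets[key].add(name)
    # Only groups with at least two distinct names are of interest.
    return sorted(sorted(group) for group in buckets.values() if len(group) > 1)

# Hypothetical concept names, for illustration only.
sample = [
    "Fracture of left femur",
    "Fracture of right femur",
    "Fracture of left tibia",
    "Open wound of hand",
]
for group in positional_similarity_sets(sample):
    print(group)
```

A name can appear in more than one set, once for each word position at which it has lexically similar neighbors.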


Artificial Intelligence in Medicine | 2017

From SNOMED CT to Uberon: Transferability of evaluation methodology between similarly structured ontologies

Gai Elhanan; Christopher Ochs; Jose L. V. Mejino; Hao Liu; Christopher J. Mungall; Yehoshua Perl

OBJECTIVE To examine whether disjoint partial-area taxonomy, a semantically-based evaluation methodology that has been successfully tested in SNOMED CT, will perform with similar effectiveness on Uberon, an anatomical ontology that belongs to a structurally similar family of ontologies as SNOMED CT. METHOD A disjoint partial-area taxonomy was generated for Uberon. One hundred randomly selected test concepts that overlap between partial-areas were matched to a same-size control sample of non-overlapping concepts. The samples were blindly inspected for non-critical issues and presumptive errors first by a general domain expert whose results were then confirmed or rejected by a highly experienced anatomical ontology domain expert. Reported issues were subsequently reviewed by Uberon's curators. RESULTS Overlapping concepts in Uberon's disjoint partial-area taxonomy exhibited a significantly higher rate of all issues. Clear-cut presumptive errors trended similarly but did not reach statistical significance. A sub-analysis of overlapping concepts with three or more relationship types indicated a much higher rate of issues. CONCLUSIONS Overlapping concepts from Uberon's disjoint abstraction network are quite likely (up to 28.9%) to exhibit issues. The results suggest that the methodology can transfer well between same-family ontologies. Although Uberon exhibited relatively few overlapping concepts, the methodology can be combined with other semantic indicators to expand the process to other concepts within the ontology that will generate high yields of discovered issues.


Annals of the New York Academy of Sciences | 2017

Introducing the Big Knowledge to Use (BK2U) challenge

Yehoshua Perl; James Geller; Michael Halper; Christopher Ochs; Ling Zheng; Joan Kapusnik-Uner

The purpose of the Big Data to Knowledge initiative is to develop methods for discovering new knowledge from large amounts of data. However, if the resulting knowledge is so large that it resists comprehension, referred to here as Big Knowledge (BK), how can it be used properly and creatively? We call this secondary challenge, Big Knowledge to Use. Without a high‐level mental representation of the kinds of knowledge in a BK knowledgebase, effective or innovative use of the knowledge may be limited. We describe summarization and visualization techniques that capture the big picture of a BK knowledgebase, possibly created from Big Data. In this research, we distinguish between assertion BK and rule‐based BK (rule BK) and demonstrate the usefulness of summarization and visualization techniques of assertion BK for clinical phenotyping. As an example, we illustrate how a summary of many intracranial bleeding concepts can improve phenotyping, compared to the traditional approach. We also demonstrate the usefulness of summarization and visualization techniques of rule BK for drug–drug interaction discovery.


Journal of Biomedical Informatics | 2017

Quality assurance of chemical ingredient classification for the National Drug File – Reference Terminology

Ling Zheng; Hasan Yumak; Ling Chen; Christopher Ochs; James Geller; Joan Kapusnik-Uner; Yehoshua Perl

The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of action, dosage form and physiological effects. Within NDF-RT such information is represented using tens of thousands of roles connecting drugs to classifications. In previous studies, we have introduced various kinds of Abstraction Networks to summarize the content and structure of terminologies in order to facilitate their visual comprehension, and support quality assurance of terminologies. However, these previous kinds of Abstraction Networks are not appropriate for summarizing the NDF-RT classification hierarchies, due to its unique structure. In this paper, we present the novel Ingredient Abstraction Network (IAbN) to summarize, visualize and support the audit of NDF-RT's Chemical Ingredients hierarchy and its associated drugs. A common theme in our quality assurance framework is to use characterizations of sets of concepts, revealed by the Abstraction Network structure, to capture concepts, the modeling of which is more complex than for other concepts. For the IAbN, we characterize drug ingredient concepts as more complex if they belong to IAbN groups with multiple parent groups. We show that such concepts have a statistically significantly higher rate of errors than a control sample and identify two especially common patterns of errors.


Journal of Biomedical Informatics | 2017

An empirical analysis of ontology reuse in BioPortal

Christopher Ochs; Yehoshua Perl; James Geller; Sivaram Arabandi; Tania Tudorache; Mark A. Musen

Biomedical ontologies often reuse content (i.e., classes and properties) from other ontologies. Content reuse enables a consistent representation of a domain and reusing content can save an ontology author significant time and effort. Prior studies have investigated the existence of reused terms among the ontologies in the NCBO BioPortal, but as of yet there has not been a study investigating how the ontologies in BioPortal utilize reused content in the modeling of their own content. In this study we investigate how 355 ontologies hosted in the NCBO BioPortal reuse content from other ontologies for the purposes of creating new ontology content. We identified 197 ontologies that reuse content. Among these ontologies, 108 utilize reused classes in the modeling of their own classes and 116 utilize reused properties in class restrictions. Current utilization of reuse and quality issues related to reuse are discussed.

Collaboration


Top Co-Authors

Yehoshua Perl

New Jersey Institute of Technology

James Geller

New Jersey Institute of Technology

Michael Halper

New Jersey Institute of Technology

Gai Elhanan

New Jersey Institute of Technology

Ling Zheng

New Jersey Institute of Technology

Yan Chen

City University of New York

Zhe He

Florida State University

James T. Case

National Institutes of Health
