Publication


Featured research published by Gai Elhanan.


Artificial Intelligence in Medicine | 2004

Auditing concept categorizations in the UMLS

Huanying Gu; Yehoshua Perl; Gai Elhanan; Hua Min; Li Zhang; Yi Peng

The Unified Medical Language System (UMLS) integrates about 880,000 concepts from 100 biomedical terminologies. Each concept is categorized to at least one semantic type of the Semantic Network. During the integration, it is unavoidable that some categorization errors and inconsistencies will be introduced. In this paper, we present an auditing technique to find such errors and inconsistencies. Our technique is based on an expert reviewing the pure intersections of meta-semantic types of a metaschema, a compact abstract view of the UMLS Semantic Network. We use a divide-and-conquer approach, handling small pure intersections differently from medium to large ones. With this approach, we limit the review to concepts for which we expect a high percentage of errors. We reviewed all concepts in 657 pure intersections containing one to 10 concepts. Various kinds of errors are identified, and the analysis of the results is presented in the paper. We also checked the pure intersections containing more than 10 concepts for semantic soundness; the semantically suspicious pure intersections are presented in the paper and their concepts are reviewed.
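
The grouping step in this abstract can be illustrated with a minimal Python sketch. It assumes that a pure intersection is the set of concepts assigned exactly one given combination of two or more meta-semantic types; the abstract does not define the term, so this reading is an assumption. The function names and the toy input are hypothetical; only the size threshold of 10 comes from the abstract.

```python
from collections import defaultdict


def pure_intersections(assignments):
    """Group concepts by the exact combination of types assigned to them.

    assignments: dict mapping a concept id to an iterable of type names.
    Only combinations of two or more types are kept (assumed definition).
    """
    groups = defaultdict(set)
    for concept, types in assignments.items():
        key = frozenset(types)
        if len(key) > 1:
            groups[key].add(concept)
    return dict(groups)


def split_by_size(groups, small_max=10):
    """Separate small groups (full review) from larger ones (soundness check)."""
    small = {k: v for k, v in groups.items() if len(v) <= small_max}
    large = {k: v for k, v in groups.items() if len(v) > small_max}
    return small, large
```

Under this reading, the small groups would go to an expert for concept-by-concept review, while the larger groups would first be screened as a whole for semantic soundness.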


Journal of Biomedical Informatics | 2009

Relationship auditing of the FMA ontology

Huanying Helen Gu; Duo Wei; José L. V. Mejino; Gai Elhanan

The Foundational Model of Anatomy (FMA) ontology is a domain reference ontology based on a disciplined modeling approach. Due to its large size, semantic complexity and manual data entry process, errors and inconsistencies are unavoidable and might remain within the FMA structure without detection. In this paper, we present computable methods to highlight candidate concepts for various relationship assignment errors. The process starts by locating structures formed by transitive structural relationships (part_of, tributary_of, branch_of) and then examines their assignments in the context of the IS-A hierarchy. The algorithms were designed to detect five major categories of possible incorrect relationship assignments: circular, mutually exclusive, redundant, inconsistent, and missed entries. A domain expert reviewed samples of these presumptive errors to confirm the findings. Seven thousand and fifty-two presumptive errors were detected, the largest proportion related to part_of relationship assignments. The results highlight the fact that errors are unavoidable in complex ontologies and that well-designed algorithms can help domain experts focus on concepts with a high likelihood of errors and maximize their effort to ensure consistency and reliability. In the future, similar methods might be integrated with data entry processes to offer real-time error detection.
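
Two of the five categories named above, circular and redundant assignments, lend themselves to a short graph sketch in Python. This is not the authors' algorithm: it ignores the IS-A hierarchy, and it assumes that "redundant" means a direct assignment that is already implied by transitivity through another direct assignment. The function names and the edge-list input format are hypothetical.

```python
def build_graph(edges):
    """edges: iterable of (source, target) pairs for one transitive relationship."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())
    return graph


def reachable(graph, start):
    """Return every node reachable from start by following one or more edges."""
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def circular_nodes(edges):
    """Nodes that can reach themselves, e.g. A part_of B and B part_of A."""
    graph = build_graph(edges)
    return {n for n in graph if n in reachable(graph, n)}


def redundant_edges(edges):
    """Direct edges whose target is also reachable through a sibling edge."""
    graph = build_graph(edges)
    flagged = set()
    for source, targets in graph.items():
        for target in targets:
            others = targets - {target}
            if any(target in reachable(graph, o) for o in others):
                flagged.add((source, target))
    return flagged
```

For the toy edges `[("finger", "hand"), ("hand", "arm"), ("finger", "arm")]`, `redundant_edges` flags `("finger", "arm")`, since that assignment already follows from the other two.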


Journal of Biomedical Informatics | 2012

A study of terminology auditors' performance for UMLS semantic type assignments

Huanying Gu; Gai Elhanan; Yehoshua Perl; George Hripcsak; James J. Cimino; Julia Xu; Yan Chen; James Geller; C. Paul Morrey

Auditing healthcare terminologies for errors requires human experts. In this paper, we present a study of the performance of auditors looking for errors in the semantic type assignments of complex UMLS concepts. In this study, concepts are considered complex whenever they are assigned combinations of semantic types. Past research has shown that complex concepts have a higher likelihood of errors. The results of this study indicate that individual auditors are not reliable when auditing such concepts and their performance is low, according to various metrics. These results confirm the outcomes of an earlier pilot study. They imply that to achieve an acceptable level of reliability and performance, when auditing such concepts of the UMLS, several auditors need to be assigned the same task. A mechanism is then needed to combine the possibly differing opinions of the different auditors into a final determination. In the current study, in contrast to our previous work, we used a majority mechanism for this purpose. For a sample of 232 complex UMLS concepts, the majority opinion was found reliable and its performance for accuracy, recall, precision and the F-measure was found statistically significantly higher than the average performance of individual auditors.
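
The majority mechanism and the four reported metrics can be sketched in Python as follows. The abstract does not say how ties are resolved; this sketch assumes a strict majority, so a tie counts as "no error". The function names and the boolean encoding (True means the auditor flags an error) are hypothetical.

```python
def majority_vote(votes):
    """Return True when more than half of the auditors flag an error."""
    return 2 * sum(votes) > len(votes)


def performance(predicted, reference):
    """Compare predicted error flags with reference flags (non-empty lists)."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }
```

Applying `majority_vote` to each concept's list of auditor votes and passing the result to `performance`, alongside each individual auditor's flags, gives the kind of comparison the study reports.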


Online Journal of Public Health Informatics | 2014

Sculpting the UMLS Refined Semantic Network

Zhe He; C. Paul Morrey; Yehoshua Perl; Gai Elhanan; Ling Chen; Yan Chen; James Geller

Background: The Refined Semantic Network (RSN) for the UMLS was previously introduced to complement the UMLS Semantic Network (SN). The RSN partitions the UMLS Metathesaurus (META) into disjoint groups of concepts. Each such group is semantically uniform. However, the RSN was initially an order of magnitude larger than the SN, which is undesirable since, to be useful, a semantic network should be compact. Most semantic types in the RSN represent combinations of semantic types in the UMLS SN. Such a “combination semantic type” is called an Intersection Semantic Type (IST). Many ISTs are assigned to very few concepts. Moreover, when reviewing those concepts, many semantic type assignment inconsistencies were found. After correcting those inconsistencies, many ISTs, among them some that contradicted UMLS rules, disappeared, which made the RSN smaller. Objective: The authors performed a longitudinal study with the goal of making the RSN compact by reducing its size. This goal was achieved by correcting inconsistencies and errors in the IST assignments in the UMLS, which additionally helped identify and correct ambiguities, inconsistencies, and errors in source terminologies widely used in the realm of public health. Methods: In this paper, we discuss the process and steps employed in this longitudinal study and the intermediate results for different stages. The sculpting process includes removing redundant semantic type assignments, expanding semantic type assignments, and removing illegitimate ISTs by auditing ISTs of small extents. However, the emphasis of this paper is not on the auditing methodologies employed during the process, since they were introduced in earlier publications, but on the strategy of employing them in order to transform the RSN into a compact network. For this paper we also performed a comprehensive audit of 168 “small ISTs” in the 2013AA version of the UMLS to finalize the longitudinal study. Results: Over the years it was found that the editors of the UMLS introduced some new inconsistencies that resulted in the reintroduction of unwarranted ISTs that had already been eliminated as a result of their previous corrections. Because of that, the transformation of the RSN into a compact network covering all necessary categories for the UMLS was slowed down. The corrections suggested by an audit of the 2013AA version of the UMLS achieve a compact RSN of the same magnitude as the UMLS SN. The number of ISTs has been reduced to 336. We also demonstrate how auditing the semantic type assignments of UMLS concepts can expose other modeling errors in the UMLS source terminologies, e.g., SNOMED CT, LOINC, and RxNorm, that are important for health informatics. Such errors would otherwise stay hidden. Conclusions: It is hoped that the UMLS curators will implement all required corrections and use the RSN along with the SN when maintaining and extending the UMLS. When used correctly, the RSN will help prevent the accidental introduction of inconsistent semantic type assignments into the UMLS. Furthermore, in this way the RSN will help expose other hidden errors and inconsistencies in health informatics terminologies that are sources of the UMLS. Notably, the development of the RSN materializes the deeper, more refined Semantic Network for the UMLS that its designers envisioned originally but had not implemented.


Bioinformatics and Biomedicine | 2015

Algorithmic detection of inconsistent modeling among SNOMED CT concepts by combining lexical and structural indicators

Ankur Agrawal; Yehoshua Perl; Christopher Ochs; Gai Elhanan

SNOMED CT is important for clinical applications, such as Electronic Health Record (EHR) encoding. However, inconsistency in modeling its concepts may prevent SNOMED CT from providing proper support for clinical use. This study provides an effective methodology for locating inconsistently modeled SNOMED CT concepts. One can expect lexically similar concepts to be modeled similarly. Positional similarity sets, sets of lexically similar concepts having only one different word at the same position of their names, are introduced. Concepts in such sets have a higher likelihood of being unjustifiably inconsistently modeled. A technique to incorporate three structural indicators into the selected sets is provided to further improve the likelihood of finding inconsistently modeled concepts. An analysis of a sample of 50 such sets and for each of these three indicators is performed. The sample of positional similarity sets is found to have 18.6% inconsistent concepts. The use of structural indicators is shown to further improve the likelihood of finding inconsistently modeled concepts up to 41.6% with high statistical significance when compared to the previous sample of positional similarity sets. Positional similarity sets with different structural indicators are shown to help identify inconsistencies in concept modeling with high likelihood. Furthermore, such sets enable the comparison of concept modeling in the context of other lexically similar concepts, which enhances the effectiveness of corrections by auditors. Such quality assurance methods can be used to supplement IHTSDO's own efforts in order to improve the quality of SNOMED CT.
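
The definition of a positional similarity set given above can be sketched in Python by replacing one word position at a time with a wildcard and grouping names that share the resulting pattern. This covers only the lexical step; the three structural indicators are not modeled. Case-insensitive, whitespace-based tokenization is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict


def positional_similarity_sets(names):
    """Group names that differ in exactly one word at the same position.

    Returns a dict mapping a wildcard pattern (a tuple with None at the
    varying position) to the set of names that match it.
    """
    buckets = defaultdict(set)
    for name in names:
        words = name.lower().split()
        for i in range(len(words)):
            pattern = tuple(words[:i]) + (None,) + tuple(words[i + 1:])
            buckets[pattern].add(name)
    # A pattern shared by a single name does not form a set.
    return {p: members for p, members in buckets.items() if len(members) > 1}
```

With illustrative names such as "Fracture of left femur", "Fracture of right femur" and "Fracture of left tibia" (made up for this example, not taken from the study), the first two share a set that varies at the third word, and the first and third share a set that varies at the fourth.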


IEEE EMBS International Conference on Biomedical and Health Informatics | 2012

Questionable relationship triples in the UMLS

Huanying Gu; Gai Elhanan; Michael Halper; Zhe He

The relationships of the UMLS Metathesaurus are used to describe the nature of the connections between pairs of concepts. The occurrence of multiple relationships between a given pair of concepts may be indicative of some kind of inconsistency. A methodology to algorithmically identify and then categorize all pairs of concepts with multiple relationships is presented. These potentially problematic concept pairs are grouped into four categories, including those that are possibly conflicting from a hierarchical standpoint and those that may violate a mutual-exclusion constraint. Samples of the identified concept pairs are reviewed. Some of the errors and inconsistencies found during the review are reported. The findings indicate that questionable UMLS relationship triples can be easily detected by algorithmic approaches, are common, and at times can be corrected in the UMLS itself.
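
The identification step, finding concept pairs connected by more than one relationship, can be sketched in Python as follows. The abstract does not specify whether pairs are treated as ordered; this sketch assumes ordered pairs. The function name, the triple layout and any relationship labels used with it are hypothetical, and the four-way categorization is not modeled.

```python
from collections import defaultdict


def pairs_with_multiple_relationships(triples):
    """triples: iterable of (concept_1, relationship, concept_2).

    Returns a dict mapping each ordered concept pair to its set of
    relationships, keeping only pairs with two or more distinct ones.
    """
    by_pair = defaultdict(set)
    for concept_1, relationship, concept_2 in triples:
        by_pair[(concept_1, concept_2)].add(relationship)
    return {pair: rels for pair, rels in by_pair.items() if len(rels) > 1}
```

The flagged pairs would then be sorted into categories, for example checking whether a pair carries both a hierarchical relationship and one that should exclude it, before samples are handed to a reviewer.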


Artificial Intelligence in Medicine | 2017

From SNOMED CT to Uberon: Transferability of evaluation methodology between similarly structured ontologies

Gai Elhanan; Christopher Ochs; Jose L. V. Mejino; Hao Liu; Christopher J. Mungall; Yehoshua Perl

OBJECTIVE: To examine whether disjoint partial-area taxonomy, a semantically-based evaluation methodology that has been successfully tested in SNOMED CT, will perform with similar effectiveness on Uberon, an anatomical ontology that belongs to the same family of structurally similar ontologies as SNOMED CT. METHOD: A disjoint partial-area taxonomy was generated for Uberon. One hundred randomly selected test concepts that overlap between partial-areas were matched to a same-size control sample of non-overlapping concepts. The samples were blindly inspected for non-critical issues and presumptive errors, first by a general domain expert, whose results were then confirmed or rejected by a highly experienced anatomical ontology domain expert. Reported issues were subsequently reviewed by Uberon's curators. RESULTS: Overlapping concepts in Uberon's disjoint partial-area taxonomy exhibited a significantly higher rate of all issues. Clear-cut presumptive errors trended similarly but did not reach statistical significance. A sub-analysis of overlapping concepts with three or more relationship types indicated a much higher rate of issues. CONCLUSIONS: Overlapping concepts from Uberon's disjoint abstraction network are quite likely (up to 28.9%) to exhibit issues. The results suggest that the methodology can transfer well between same-family ontologies. Although Uberon exhibited relatively few overlapping concepts, the methodology can be combined with other semantic indicators to expand the process to other concepts within the ontology that will generate high yields of discovered issues.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2012

Clinical clarity versus terminological order: the readiness of SNOMED CT concept descriptors for primary care

Zhe He; Michael Halper; Yehoshua Perl; Gai Elhanan

As SNOMED usage becomes more ingrained within applications, its range of concept descriptors, and particularly its synonym adequacy, becomes more important. A simulated clinical scenario involving various term-based concept searches is used to assess whether SNOMED's concept descriptors provide sufficient differentiation to enable possible concept selection between similar terms. Four random samples from different SNOMED concept populations are utilized. Of particular interest are concepts mapped duplicately into UMLS concepts due to shared term patterns. While overall synonym problems are rare (1%), some concept populations exhibited a high rate of potential problems for clinical use (17-62%). The vast majority of issues are due to SNOMED's inherent structure and fine granularity. Many findings hint at a lack of clear delineation between reference and interface terminological qualities. Closer attention should be given to practical clinical use-case scenarios. Reducing SNOMED's structural complexity may alleviate many of the described findings and encourage clinical adoption.


Data Mining in Bioinformatics | 2016

A contextual auditing method for SNOMED CT concepts

Ankur Agrawal; Yehoshua Perl; Christopher Ochs; Gai Elhanan

SNOMED CT has been regarded as the most prominent clinical health terminology to be used in Electronic Health Records. However, modelling inconsistencies are preventing SNOMED CT from providing proper support for clinical use. This study introduces positional similarity sets as an effective contextual technique to identify such inconsistencies and improve the modelling of SNOMED CT concepts. Positional similarity sets are sets of lexically similar concepts having only one different word at the same position of their names. A technique to incorporate three structural indicators into the selected sets is provided to improve the likelihood of finding inconsistently modelled concepts. The results show that the likelihood of finding inconsistencies using such positional similarity sets is up to 41.6%. Such quality assurance methods can be used to supplement IHTSDO's own efforts in order to improve the quality of SNOMED CT.


Methods of Information in Medicine | 2018

Validating UMLS Semantic Type Assignments Using SNOMED CT Semantic Tags

Huanying Gu; Zhe He; Duo Wei; Gai Elhanan; Yan Chen

BACKGROUND: The UMLS assigns semantic types to all its integrated concepts. The semantic types are widely used in various natural language processing tasks in the biomedical domain, such as named entity recognition, semantic disambiguation, and semantic annotation. Due to the size of the UMLS, erroneous semantic type assignments are hard to detect. It is imperative to devise automated techniques to identify errors and inconsistencies in semantic type assignments. OBJECTIVES: To design a methodology that performs programmatic checks to detect semantic type assignment errors for UMLS concepts with one or more SNOMED CT terms, and to evaluate concepts in a selected set of SNOMED CT hierarchies to verify our hypothesis that UMLS semantic type assignment errors may exist in concepts residing in semantically inconsistent groups. METHODS: Our methodology is a four-stage process: 1) partitioning concepts in a SNOMED CT hierarchy into semantically uniform groups based on their assigned semantic tags; 2) partitioning the concepts in each group from 1) into disjoint sub-groups based on their semantic type assignments; 3) mapping all SNOMED CT semantic tags into one or more semantic types in the UMLS; 4) identifying semantically inconsistent groups that have inconsistent assignments between semantic tags and semantic types according to the mapping from 3), and providing concepts in such groups to domain experts for review. RESULTS: We applied our method to the UMLS 2013AA release. Concepts of the semantically inconsistent groups in the PHYSICAL FORCE and RECORD ARTIFACT hierarchies have error rates of 33% and 62.5%, respectively, far higher than the error rates of 0.6% and 1% in the semantically consistent groups of the two hierarchies. CONCLUSION: Concepts in semantically inconsistent groups are more likely to contain semantic type assignment errors. Our methodology can make auditing more efficient by focusing auditing resources on concepts of semantically inconsistent groups.
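
The four stages described in this abstract can be condensed into a short Python sketch. It assumes that a group is inconsistent when at least one of its semantic types falls outside the types mapped to its semantic tag; the abstract does not spell out the criterion, so this is an assumption. The function name, the input layout and any tag-to-type mapping passed in are hypothetical.

```python
from collections import defaultdict


def inconsistent_groups(concepts, tag_to_types):
    """Flag groups whose semantic types disagree with their semantic tag.

    concepts: iterable of (concept_id, semantic_tag, semantic_types).
    tag_to_types: dict mapping a semantic tag to its set of allowed types.
    """
    # Stages 1 and 2: partition by tag, then by exact type combination.
    groups = defaultdict(set)
    for concept_id, tag, types in concepts:
        groups[(tag, frozenset(types))].add(concept_id)

    # Stages 3 and 4: apply the mapping and keep groups that violate it.
    flagged = {}
    for (tag, types), members in groups.items():
        allowed = tag_to_types.get(tag, set())
        if not types <= allowed:
            flagged[(tag, types)] = members
    return flagged
```

Only the concepts in the flagged groups would be passed to domain experts, which is how the methodology narrows the auditing effort.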

Collaboration

Top Co-Authors

Yehoshua Perl
New Jersey Institute of Technology

Michael Halper
New Jersey Institute of Technology

James Geller
New Jersey Institute of Technology

Yan Chen
Borough of Manhattan Community College

Duo Wei
Richard Stockton College of New Jersey

James J. Cimino
National Institutes of Health

Zhe He
Florida State University

Christopher Ochs
New Jersey Institute of Technology