Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Harold R. Solbrig is active.

Publication


Featured research published by Harold R. Solbrig.


Studies in health technology and informatics | 2004

Modeling guidelines for integration into clinical workflow.

Samson W. Tu; Mark A. Musen; Ravi D. Shankar; James J. Campbell; Karen M. Hrabak; James C. McClay; Stanley M. Huff; Robert C. McClure; Craig G. Parker; Roberto A. Rocha; Robert M. Abarbanel; Nick Beard; Julie Glasgow; Guy Mansfield; Prabhu Ram; Qin Ye; Eric Mays; Tony Weida; Christopher G. Chute; Kevin McDonald; David Molu; Mark A. Nyman; Sidna M. Scheitel; Harold R. Solbrig; David A. Zill; Mary K. Goldstein

The success of clinical decision-support systems requires that they be seamlessly integrated into clinical workflow. In the SAGE project, which aims to create the technological infrastructure for implementing computable clinical practice guidelines in enterprise settings, we created a deployment-driven methodology for developing guideline knowledge bases. It involves (1) identification of usage scenarios of guideline-based care in clinical workflow, (2) distillation and disambiguation of guideline knowledge relevant to these usage scenarios, (3) formalization of data elements and vocabulary used in the guideline, and (4) encoding of usage scenarios and guideline knowledge using an executable guideline model. This methodology makes explicit the points in the care process where guideline-based decision aids are appropriate and the roles of clinicians for whom the guideline-based assistance is intended. We have evaluated the methodology by simulating the deployment of an immunization guideline in a real clinical information system and by reconstructing the workflow context of a deployed decision-support system for guideline-based care. We discuss the implications of deployment-driven guideline encoding for the shareability of executable guidelines.


Journal of the American Medical Informatics Association | 2013

Normalization and standardization of electronic health records for high-throughput phenotyping: the SHARPn consortium

Jyotishman Pathak; Kent R. Bailey; Calvin Beebe; Steven Bethard; David Carrell; Pei J. Chen; Dmitriy Dligach; Cory M. Endle; Lacey Hart; Peter J. Haug; Stanley M. Huff; Vinod Kaggal; Dingcheng Li; Hongfang D Liu; Kyle Marchant; James J. Masanz; Timothy A. Miller; Thomas A. Oniki; Martha Palmer; Kevin J. Peterson; Susan Rea; Guergana Savova; Craig Stancl; Sunghwan Sohn; Harold R. Solbrig; Dale Suesse; Cui Tao; David P. Taylor; Les Westberg; Stephen T. Wu

RESEARCH OBJECTIVE To develop scalable informatics infrastructure for normalization of both structured and unstructured electronic health record (EHR) data into a unified, concept-based model for high-throughput phenotype extraction. MATERIALS AND METHODS Software tools and applications were developed to extract information from EHRs. Representative and convenience samples of both structured and unstructured data from two EHR systems (Mayo Clinic and Intermountain Healthcare) were used for development and validation. Extracted information was standardized and normalized to meaningful use (MU) conformant terminology and value set standards using Clinical Element Models (CEMs). These resources were used to demonstrate semi-automatic execution of MU clinical-quality measures modeled using the Quality Data Model (QDM) and an open-source rules engine. RESULTS Using CEMs and open-source natural language processing and terminology services engines, namely the Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) and Common Terminology Services (CTS2), we developed a data-normalization platform that ensures data security, end-to-end connectivity, and reliable data flow within and across institutions. We demonstrated the applicability of this platform by executing a QDM-based MU quality measure, which determines the percentage of patients between 18 and 75 years of age with diabetes whose most recent low-density lipoprotein cholesterol test result during the measurement year was <100 mg/dL, on a randomly selected cohort of 273 Mayo Clinic patients. The platform identified 21 and 18 patients for the denominator and numerator of the quality measure, respectively. Validation results indicate that all identified patients meet the QDM-based criteria.
CONCLUSIONS End-to-end automated systems for extracting clinical information from diverse EHR systems require extensive use of standardized vocabularies and terminologies, as well as robust information models for storing, discovering, and processing that information. This study demonstrates the application of modular and open-source resources for enabling secondary use of EHR data through normalization into a standards-based, comparable, and consistent format for high-throughput phenotyping to identify patient cohorts.
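The QDM-based measure described above reduces to a denominator/numerator computation over patient records. A minimal sketch in Python, with invented patient fields and values (not the CEM or QDM representations used in the study):

```python
# Toy patient records; the fields and values are invented, not CEM data.
patients = [
    {"id": 1, "age": 64, "has_diabetes": True,  "last_ldl": 92},
    {"id": 2, "age": 70, "has_diabetes": True,  "last_ldl": 128},
    {"id": 3, "age": 45, "has_diabetes": False, "last_ldl": 88},
    {"id": 4, "age": 80, "has_diabetes": True,  "last_ldl": 95},
]

def in_denominator(p):
    # Denominator: patients aged 18-75 with diabetes.
    return 18 <= p["age"] <= 75 and p["has_diabetes"]

def in_numerator(p):
    # Numerator: denominator patients whose most recent LDL is <100 mg/dL.
    return in_denominator(p) and p["last_ldl"] < 100

denominator = [p["id"] for p in patients if in_denominator(p)]
numerator = [p["id"] for p in patients if in_numerator(p)]
print(denominator, numerator)  # [1, 2] [1]
```

In the platform itself these criteria are evaluated by a rules engine over normalized CEM instances rather than hand-written predicates.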


international conference on semantic systems | 2014

Shape expressions: an RDF validation and transformation language

Eric Prud'hommeaux; José Emilio Labra Gayo; Harold R. Solbrig

RDF is a graph-based data model widely used in semantic web and linked data applications. In this paper we describe a Shape Expression definition language that enables RDF validation through the declaration of constraints on the RDF model. Shape Expressions can be used to validate RDF data, communicate expected graph patterns for interfaces, and generate user interface forms. We describe the syntax and the formal semantics of Shape Expressions using inference rules. Shape Expressions can be seen as a domain-specific language for defining shapes of RDF graphs based on regular expressions. Attached to Shape Expressions are semantic actions, which provide an extension point for validation or for arbitrary code execution, such as those in parser generators. Using semantic actions, it is possible to augment the validation expressiveness of Shape Expressions and to transform RDF graphs in an easy way. We have implemented several validation tools that check whether an RDF graph matches a Shape Expressions schema and infer the corresponding shapes. We have also implemented two extensions, GenX and GenJ, that leverage the predictability of the graph traversal and create ordered, closed-content XML/JSON documents, providing a simple, declarative mapping from RDF data to XML and JSON documents.
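The core idea of shape-based validation, checking a node's neighborhood against declared constraints, can be sketched in Python. This toy does not use ShEx syntax or semantic actions; the triples and the shape are invented:

```python
# Toy shape-style validation: a "shape" maps required predicates to a
# value test. This is a simplification for illustration, not ShEx.
triples = {
    ("issue1", "reportedBy", "user2"),
    ("issue1", "reportedOn", "2013-01-23"),
    ("user2", "name", "Alice"),
}

issue_shape = {
    # reportedBy must point to a node that itself has a name.
    "reportedBy": lambda v: any(s == v and p == "name" for s, p, o in triples),
    # reportedOn must look like an ISO date (crude length check).
    "reportedOn": lambda v: len(v) == 10,
}

def conforms(node, shape):
    # Node matches the shape if every required predicate is present
    # and each of its values passes the associated constraint.
    for pred, test in shape.items():
        values = [o for s, p, o in triples if s == node and p == pred]
        if not values or not all(test(v) for v in values):
            return False
    return True

print(conforms("issue1", issue_shape))  # True
```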


Journal of the American Medical Informatics Association | 2009

LexGrid: A Framework for Representing, Storing, and Querying Biomedical Terminologies from Simple to Sublime

Jyotishman Pathak; Harold R. Solbrig; James D. Buntrock; Thomas M. Johnson; Christopher G. Chute

Many biomedical terminologies, classifications, and ontological resources, such as the NCI Thesaurus (NCIT), the International Classification of Diseases (ICD), the Systematized Nomenclature of Medicine (SNOMED), Current Procedural Terminology (CPT), and the Gene Ontology (GO), have been developed and used to build a variety of IT applications in biology, biomedicine, and health care settings. However, virtually all of these resources involve incompatible formats, are based on different modeling languages, and lack appropriate tooling and programming interfaces (APIs), which hinders their wide-scale adoption and usage in a variety of application contexts. The Lexical Grid (LexGrid) project introduced in this paper is an ongoing community-driven initiative, coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics, designed to bridge this gap using a common terminology model called the LexGrid model. A key aspect of the model is that it accommodates multiple vocabulary and ontology distribution formats and supports multiple data stores for federated vocabulary distribution. The model provides a foundation for building consistent and standardized APIs to access multiple vocabularies that support lexical search queries, hierarchy navigation, and a rich set of features such as recursive subsumption (e.g., get all the children of the concept penicillin). Existing LexGrid implementations include the LexBIG API as well as a reference implementation of the HL7 Common Terminology Services (CTS) specification, providing programmatic access via Java, Web, and Grid services.
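Recursive subsumption of the kind described above (e.g., all children of the concept penicillin) can be sketched over a toy hierarchy; the concept names and structure are illustrative and this is not the LexBIG API:

```python
# Toy terminology hierarchy: each concept maps to its direct children.
# The drug names below are for illustration only.
children = {
    "antibiotic": ["penicillin", "cephalosporin"],
    "penicillin": ["amoxicillin", "ampicillin"],
    "amoxicillin": [],
    "ampicillin": [],
    "cephalosporin": [],
}

def descendants(concept):
    # Recursive subsumption: every concept subsumed by `concept`,
    # found by walking direct children transitively.
    result = []
    for child in children.get(concept, []):
        result.append(child)
        result.extend(descendants(child))
    return result

print(descendants("penicillin"))  # ['amoxicillin', 'ampicillin']
```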


international semantic web conference | 2010

Time-oriented question answering from clinical narratives using semantic-web techniques

Cui Tao; Harold R. Solbrig; Deepak K. Sharma; Wei Qi Wei; Guergana Savova; Christopher G. Chute

The ability to answer time-oriented questions based on clinical narratives is essential to clinical research. The temporal dimension in medical data analysis enables clinical research in many areas, such as disease progression, individualized treatment, and decision support. The Semantic Web provides a suitable environment for representing the temporal dimension of clinical data and reasoning about it. In this paper, we introduce a Semantic Web-based framework that provides an API for querying temporal information from clinical narratives. The framework is centered on an OWL ontology called CNTRO (Clinical Narrative Temporal Relation Ontology) and contains three major components: a time normalizer, an SWRL-based reasoner, and an OWL-DL-based reasoner. We also discuss how we adopted these three components in the clinical domain, their limitations, and the extensions we found necessary or desirable to achieve the purpose of querying time-oriented data from real-world clinical narratives.
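The job of a time normalizer, resolving relative temporal expressions against an anchor so narrative events can be ordered and queried, can be sketched as follows. The event names, date formats, and parsing logic are invented for illustration and are not CNTRO's:

```python
from datetime import datetime, timedelta

# Anchor date the relative expressions resolve against, e.g. the
# admission date extracted from the narrative (invented).
anchor = datetime(2010, 3, 1)

events = {
    "admission": "2010-03-01",
    "chest pain": "2 days before admission",
    "discharge": "5 days after admission",
}

def normalize(expr):
    # Absolute dates parse directly; otherwise treat the expression as
    # "<n> days before|after admission" and offset from the anchor.
    try:
        return datetime.strptime(expr, "%Y-%m-%d")
    except ValueError:
        n, _unit, direction, _ref = expr.split(" ", 3)
        delta = timedelta(days=int(n))
        return anchor - delta if direction == "before" else anchor + delta

# Order the events on the normalized timeline.
timeline = sorted(events, key=lambda e: normalize(events[e]))
print(timeline)  # ['chest pain', 'admission', 'discharge']
```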


Journal of the American Medical Informatics Association | 2000

Embedded Structures and Representation of Nursing Knowledge

Marcelline R. Harris; Judith R. Graves; Harold R. Solbrig; Peter L. Elkin; Christopher G. Chute

Nursing Vocabulary Summit participants were challenged to consider whether reference terminology and information models might be a way to move toward better capture of data in electronic medical records. A requirement of such reference models is fidelity to representations of domain knowledge. This article discusses embedded structures in three different approaches to organizing domain knowledge: scientific reasoning, expertise, and standardized nursing languages. The concept of pressure ulcer is presented as an example of the various ways lexical elements used in relation to a specific concept are organized across systems. Different approaches to structuring information (the clinical information system, minimum data sets, and standardized messaging formats) are similarly discussed. Recommendations include identification of the polyhierarchies and categorical structures required within a reference terminology, systematic evaluations of the extent to which structured information accurately and completely represents domain knowledge, and modifications or extensions to existing multidisciplinary efforts. J Am Med Inform Assoc. 2000;7:539-549.


Applied Ontology | 2008

Representing the NCI Thesaurus in OWL DL: Modeling tools help modeling languages

Natalya Fridman Noy; Sherri de Coronado; Harold R. Solbrig; Gilberto Fragoso; Frank W. Hartel; Mark A. Musen

The National Cancer Institute's (NCI) Thesaurus is a biomedical reference ontology. The NCI Thesaurus is represented using Description Logic, more specifically Ontylog, a Description Logic implemented by Apelon, Inc. We are exploring the use of the DL species of the Web Ontology Language (OWL DL), a W3C-recommended standard for ontology representation, instead of Ontylog for representing the NCI Thesaurus. We have studied the requirements for knowledge representation of the NCI Thesaurus and considered how OWL DL (and its implementation in Protégé-OWL) satisfies these requirements. In this paper, we discuss the areas where OWL DL was sufficient for representing the required components, where tool support would be needed to hide some of the complexity and extra levels of indirection, and where the language's expressiveness is not sufficient given the representation requirements. Because many of the knowledge-representation issues that we encountered are very similar to the issues in representing other biomedical terminologies and ontologies in general, we believe that the lessons we learned and the approaches we developed will prove useful and informative for other researchers.


international conference on database theory | 2015

Complexity and Expressiveness of ShEx for RDF

Slawomir Staworko; Iovka Boneva; José Emilio Labra Gayo; Samuel Hym; Eric Prud'hommeaux; Harold R. Solbrig

We study the expressiveness and complexity of Shape Expression Schema (ShEx), a novel schema formalism for RDF currently under development by the W3C. ShEx assigns types to the nodes of an RDF graph and makes it possible to constrain the admissible neighborhoods of nodes of a given type with regular bag expressions (RBEs). We formalize and investigate two alternative semantics, multi- and single-type, depending on whether or not a node may have more than one type. We study the expressive power of ShEx and the complexity of the validation problem. We show that the single-type semantics is strictly more expressive than the multi-type semantics, that single-type validation is generally intractable, and that multi-type validation is feasible for a small (yet practical) subclass of RBEs. To curb the high computational complexity of validation, we propose a natural notion of determinism and show that multi-type validation for the class of deterministic schemas using single-occurrence regular bag expressions (SORBEs) is tractable.
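In the tractable SORBE case, where each symbol occurs at most once in the expression, checking a node's neighborhood reduces to comparing label counts against intervals. A toy sketch under that assumption (the labels and intervals are invented):

```python
from collections import Counter

# A single-occurrence expression modeled as label -> (min, max)
# occurrence interval. This is a simplification for illustration.
sorbe = {"name": (1, 1), "email": (0, 2), "phone": (0, 1)}

def bag_matches(labels, expr):
    # The bag of outgoing edge labels matches if every label is allowed
    # by the expression and each count falls within its interval.
    counts = Counter(labels)
    if any(label not in expr for label in counts):
        return False
    return all(lo <= counts.get(label, 0) <= hi
               for label, (lo, hi) in expr.items())

print(bag_matches(["name", "email", "email"], sorbe))  # True
print(bag_matches(["email"], sorbe))                   # False: 'name' missing
```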


Journal of Biomedical Informatics | 2013

Terminology representation guidelines for biomedical ontologies in the semantic web notations

Cui Tao; Jyotishman Pathak; Harold R. Solbrig; Wei Qi Wei; Christopher G. Chute

Terminologies and ontologies are increasingly prevalent in healthcare and biomedicine. However, they suffer from inconsistent renderings, distribution formats, and syntax, which makes applications built on common terminology services challenging. To address the problem, one could posit a shared representation syntax, associated schema, and tags. We identified a set of commonly used elements in biomedical ontologies and terminologies based on our experience with the Common Terminology Services 2 (CTS2) Specification as well as the Lexical Grid (LexGrid) project. We propose guidelines for precisely such a shared terminology model and recommend tags assembled from SKOS, OWL, Dublin Core, RDF Schema, and DCMI meta-terms. We divide these guidelines into lexical information (e.g., synonyms and definitions) and semantic information (e.g., hierarchies), and distinguish the latter for use by informal terminologies vs. formal ontologies. We then evaluate the guidelines against a spectrum of widely used terminologies and ontologies to examine how the lexical guidelines are implemented and whether our proposed guidelines would enhance interoperability.
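The lexical vs. semantic split that the guidelines draw can be illustrated by partitioning a concept's properties by tag. The SKOS property names are real, but the concept data and the assignment of tags to categories are simplified for illustration:

```python
# Tags treated as lexical information (labels, synonyms, definitions);
# anything else here is treated as semantic (e.g., hierarchy links).
LEXICAL_TAGS = {"skos:prefLabel", "skos:altLabel", "skos:definition"}

# An invented concept rendered with SKOS-style properties.
concept = {
    "skos:prefLabel": "Myocardial infarction",
    "skos:altLabel": ["Heart attack", "MI"],
    "skos:definition": "Necrosis of the myocardium caused by ischemia.",
    "skos:broader": ["Ischemic heart disease"],
}

def split_properties(props):
    # Partition a concept's properties into lexical vs semantic groups.
    lexical = {k: v for k, v in props.items() if k in LEXICAL_TAGS}
    semantic = {k: v for k, v in props.items() if k not in LEXICAL_TAGS}
    return lexical, semantic

lexical, semantic = split_properties(concept)
print(sorted(semantic))  # ['skos:broader']
```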


Journal of the American Medical Informatics Association | 2012

Quality evaluation of value sets from cancer study common data elements using the UMLS semantic groups

Guoqian Jiang; Harold R. Solbrig; Christopher G. Chute

Objective The objective of this study is to develop an approach to evaluate the quality of terminological annotations on the value set (i.e., enumerated value domain) components of the common data elements (CDEs) in the context of clinical research, using both Unified Medical Language System (UMLS) semantic types and groups. Materials and methods The CDEs of the National Cancer Institute (NCI) Cancer Data Standards Repository, the NCI Thesaurus (NCIt) concepts, and the UMLS semantic network were integrated using a semantic web-based framework for a SPARQL-enabled evaluation. First, the set of CDE permissible values with corresponding meanings in external controlled terminologies was isolated. The corresponding value meanings were then evaluated against their NCI- or UMLS-generated semantic network mapping to determine whether all of the meanings fell within the same semantic group. Results Of the enumerated CDEs in the Cancer Data Standards Repository, 3093 (26.2%) had elements drawn from more than one UMLS semantic group. A random sample (n=100) of this set of elements indicated that 17% of them were likely to have been misclassified. Discussion The use of existing semantic web tools can support a high-throughput mechanism for evaluating the quality of large CDE collections. This study demonstrates that the involvement of multiple semantic groups in an enumerated value domain of a CDE is an effective anchor for triggering an auditing point in quality evaluation activities. Conclusion This approach produces a useful quality assurance mechanism for a clinical study CDE repository.
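The auditing trigger, flagging any enumerated value domain whose value meanings span more than one semantic group, can be sketched in Python; the concepts, group assignments, and value sets below are invented, not real UMLS or repository data:

```python
# Invented concept-to-semantic-group assignments for illustration.
semantic_group = {
    "Lung Carcinoma": "Disorders",
    "Breast Carcinoma": "Disorders",
    "Cisplatin": "Chemicals & Drugs",
}

# Invented enumerated value domains of two hypothetical CDEs.
value_sets = {
    "PrimaryDiagnosis": ["Lung Carcinoma", "Breast Carcinoma"],
    "DiseaseOrAgent": ["Lung Carcinoma", "Cisplatin"],
}

def spans_multiple_groups(name):
    # A value set is flagged for audit when its value meanings map to
    # more than one semantic group.
    groups = {semantic_group[m] for m in value_sets[name]}
    return len(groups) > 1

flagged = [vs for vs in value_sets if spans_multiple_groups(vs)]
print(flagged)  # ['DiseaseOrAgent']
```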

Collaboration


Dive into Harold R. Solbrig's collaborations.

Top Co-Authors

Eric Prud'hommeaux (Massachusetts Institute of Technology)
Cui Tao (University of Texas Health Science Center at Houston)
James R. Campbell (University of Nebraska Medical Center)