Martin Boeker
University of Freiburg
Publication
Featured research published by Martin Boeker.
International Journal of Medical Informatics | 2009
Stefan Schulz; Boontawee Suntisrivaraporn; Franz Baader; Martin Boeker
After a critical review of the present architecture of SNOMED CT, addressing both logical and ontological issues, we present a roadmap toward an overall improvement and recommend the following actions: SNOMED CT's ontology, dictionary, and information model components should be kept separate. SNOMED CT's upper level should be re-arranged according to a standard upper level ontology. SNOMED CT concepts should be assigned to the four disjoint groups: classes, instances, relations, and meta-classes. SNOMED CT's binary relations should be reduced to a set of canonical ones, following existing recommendations. Taxonomies should be cleansed and split into disjoint partitions. The number of full definitions should be increased. Finally, new approaches are proposed for modeling part-whole hierarchies, as well as the integration of qualifier relations into a unified framework. All proposed modifications can be expressed by the computationally tractable description logic EL++.
BMC Medical Research Methodology | 2013
Martin Boeker; Werner Vach; Edith Motschall
Background: Recent research indicates a high recall in Google Scholar searches for systematic reviews. These reports raised high expectations of Google Scholar as a unified and easy-to-use search interface. However, studies on the coverage of Google Scholar rarely used the search interface in a realistic approach but instead merely checked for the existence of gold standard references. In addition, the severe limitations of the Google Search interface must be taken into consideration when comparing it with professional literature retrieval tools. The objectives of this work are to measure the relative recall and precision of searches with Google Scholar under conditions derived from the structured search procedures conventional in scientific literature retrieval, and to provide an overview of current advantages and disadvantages of the Google Scholar search interface in scientific literature retrieval. Methods: General and MEDLINE-specific search strategies were retrieved from 14 Cochrane systematic reviews. The Cochrane systematic review search strategies were translated into Google Scholar search expressions as closely as possible, preserving the original search semantics. The references of the included studies from the Cochrane reviews were checked for their inclusion in the result sets of the Google Scholar searches. Relative recall and precision were calculated. Results: We investigated Cochrane reviews with between 11 and 70 included references each, for a total of 396 references. The Google Scholar searches returned result sets of between 4,320 and 67,800 hits, for a total of 291,190 hits. The relative recall of the Google Scholar searches had a minimum of 76.2% and a maximum of 100% (7 searches). The precision of the Google Scholar searches had a minimum of 0.05% and a maximum of 0.92%. The overall relative recall for all searches was 92.9%, and the overall precision was 0.13%. Conclusion: The reported relative recall must be interpreted with care. It is a quality indicator of Google Scholar confined to an experimental setting which is unavailable in systematic retrieval due to the severe limitations of the Google Scholar search interface. Currently, Google Scholar does not provide necessary elements for systematic scientific literature retrieval such as tools for incremental query optimization, export of a large number of references, a visual search builder or a history function. Google Scholar is not ready as a professional searching tool for tasks where structured retrieval methodology is necessary.
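The two measures reported in this abstract can be illustrated with a minimal sketch. It assumes that relative recall is the share of gold standard references found among the hits, and that precision is the share of hits that are gold standard references. The function name `evaluate_search` and all reference identifiers are hypothetical and are not data from the study.

```python
def evaluate_search(gold_references, retrieved_hits):
    """Return (relative_recall, precision) for one search.

    gold_references: references included in the review (the gold standard).
    retrieved_hits: references returned by the search engine.
    Both are treated as sets, so duplicate hits are counted once.
    """
    gold = set(gold_references)
    hits = set(retrieved_hits)
    if not gold or not hits:
        raise ValueError("both the gold standard and the hit list must be non-empty")
    found = gold & hits
    relative_recall = len(found) / len(gold)
    precision = len(found) / len(hits)
    return relative_recall, precision


# Hypothetical toy data, not taken from the study.
gold = ["ref_a", "ref_b", "ref_c", "ref_d"]
hits = ["ref_a", "x1", "ref_c", "x2", "x3", "ref_d", "x4", "x5"]

recall, prec = evaluate_search(gold, hits)
print(f"relative recall = {recall:.2%}, precision = {prec:.2%}")
```

In this toy case three of the four gold references appear among eight hits, so a high recall coexists with a much lower precision, which is the pattern the abstract describes at a far larger scale.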
Intelligent Systems in Molecular Biology | 2008
Stefan Schulz; Holger Stenzhorn; Martin Boeker
Motivation: The classification of biological entities in terms of species and taxa is an important endeavor in biology. Although a large number of statements encoded in current biomedical ontologies are taxon-dependent, there is no obvious or standard way of introducing taxon information into an integrative ontology architecture, supposedly because of ongoing controversies about the ontological nature of species and taxa. Results: In this article, we discuss different approaches to representing biological taxa using existing standards for biomedical ontologies such as the description logic OWL DL and the Open Biomedical Ontologies Relation Ontology. We demonstrate how hidden ambiguities of the species concept can be dealt with and existing controversies can be overcome. A novel approach is to envisage taxon information as qualities that inhere in biological organisms, organism parts and populations. Availability: The presented methodology has been implemented in the domain top-level ontology BioTop, openly accessible at http://purl.org/biotop. BioTop may help to improve the logical and ontological rigor of biomedical ontologies and further provides a clear architectural principle to deal with biological taxa information. Contact: [email protected]
Journal of Biomedical Semantics | 2011
Stefan Schulz; Kent A. Spackman; Andrew James; Cristian Cocos; Martin Boeker
Background: The realm of pathological entities can be subdivided into pathological dispositions, pathological processes, and pathological structures. The latter are the bearers of dispositions, which can then be realized by their manifestations, namely pathological processes. Despite its ontological soundness, implementing this model via purpose-oriented domain ontologies will likely require considerable effort, both in ontology construction and maintenance, which constitutes a considerable problem for SNOMED CT, presently the largest biomedical ontology. Results: We describe an ontology design pattern which allows ontologists to make assertions that blur the distinctions between dispositions, processes, and structures until necessary. Based on the domain upper-level ontology BioTop, it permits ascriptions of location and participation in the definition of pathological phenomena even without an ontological commitment to a distinction between these three categories. An analysis of SNOMED CT revealed that numerous classes in the findings/disease hierarchy are ambiguous with respect to process vs. disposition. Here our proposed approach can easily be applied to create unambiguous classes. No ambiguities could be defined regarding the distinction of structure and non-structure classes, but here we have found problematic duplications. Conclusions: We defend a judicious use of disjunctive, and therefore ambiguous, classes in biomedical ontologies during the process of ontology construction and in the practice of ontology application. The use of these classes is permitted to span across several top-level categories, provided it contributes to ontology simplification and supports the intended reasoning scenarios.
Philosophical Transactions of the Royal Society A | 2008
Martin Hofmann-Apitius; Juliane Fluck; Laura I. Furlong; Fornes O; Corinna Kolarik; Susanne Hanser; Martin Boeker; Stefan Schulz; Ferran Sanz; Roman Klinger; Mevissen T; Gattermayer T; Baldo Oliva; Christoph M. Friedrich
In essence, the virtual physiological human (VPH) is a multiscale representation of human physiology spanning from the molecular level via cellular processes and multicellular organization of tissues to complex organ function. The different scales of the VPH deal with different entities, relationships and processes, and in consequence the models used to describe and simulate biological functions vary significantly. Here, we describe methods and strategies to generate knowledge environments representing molecular entities that can be used for modelling the molecular scale of the VPH. Our strategy to generate knowledge environments representing molecular entities is based on the combination of information extraction from scientific text and the integration of information from biomolecular databases. We introduce @neuLink, a first prototype of an automatically generated, disease-specific knowledge environment combining biomolecular, chemical, genetic and medical information. Finally, we provide a perspective for the future implementation and use of knowledge environments representing molecular entities for the VPH.
Journal of Biomedical Semantics | 2012
Daniel Schober; Ilinca Tudose; Vojtech Svátek; Martin Boeker
Background: Although policy providers have outlined minimal metadata guidelines and naming conventions, ontologies of today still display inter- and intra-ontology heterogeneities in class labelling schemes and metadata completeness. This fact is at least partially due to missing or inappropriate tools. Software support can ease this situation and contribute to overall ontology consistency and quality by helping to enforce such conventions. Objective: We provide a plugin for the Protégé Ontology editor to allow for easy checks on compliance with ontology naming conventions and metadata completeness, as well as curation where violations are found. Implementation: In a requirement analysis, derived from a prior standardization approach carried out within the OBO Foundry, we investigate the capabilities that software tools need in order to check, curate and maintain class naming conventions. A Protégé tab plugin was implemented accordingly using the Protégé 4.1 libraries. The plugin was tested on six different ontologies. Based on these test results, the plugin was refined, partly by integrating new functionalities. Results: The new Protégé plugin, OntoCheck, allows for ontology tests to be carried out on OWL ontologies. In particular, the OntoCheck plugin helps to clean up an ontology with regard to lexical heterogeneity, i.e. enforcing naming conventions and metadata completeness, meeting most of the requirements outlined for such a tool. Violations found by the tests can be corrected to foster consistency in entity naming and meta-annotation within an artefact. Once specified, check constraints like name patterns can be stored and exchanged for later re-use. Here we describe a first version of the software, illustrate its capabilities and use within running ontology development efforts and briefly outline improvements resulting from its application. Further, we discuss OntoCheck's capabilities in the context of related tools and highlight potential future expansions. Conclusions: The OntoCheck plugin facilitates labelling error detection and curation, contributing to lexical quality assurance in OWL ontologies. Ultimately, we hope this Protégé extension will ease ontology alignments as well as lexical post-processing of annotated data and hence can increase overall secondary data usage by humans and computers.
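The kind of check described in this abstract (a stored name pattern plus a metadata completeness test) can be sketched in a few lines. This is only an illustration and not the OntoCheck implementation or the Protégé API: entities are plain dictionaries, and the pattern `LABEL_PATTERN`, the required annotation keys, and the example identifiers are all assumptions chosen for the example.

```python
import re

# Assumed naming convention: lower-case words separated by single spaces.
LABEL_PATTERN = re.compile(r"[a-z0-9]+( [a-z0-9]+)*")

# Assumed minimal metadata: every entity needs a label and a definition.
REQUIRED_ANNOTATIONS = ("label", "definition")


def check_entity(entity):
    """Return a list of violation messages for one entity (empty if compliant)."""
    violations = []
    for key in REQUIRED_ANNOTATIONS:
        if not entity.get(key):
            violations.append(f"missing {key}")
    label = entity.get("label")
    if label and not LABEL_PATTERN.fullmatch(label):
        violations.append("label violates naming pattern")
    return violations


def check_ontology(entities):
    """Map each non-compliant entity identifier to its violations."""
    report = {}
    for identifier, entity in entities.items():
        violations = check_entity(entity)
        if violations:
            report[identifier] = violations
    return report


# Hypothetical entities, not taken from any real ontology.
toy_ontology = {
    "ex:0001": {"label": "heart valve", "definition": "A valve of the heart."},
    "ex:0002": {"label": "Heart_Valve"},
}

for identifier, violations in check_ontology(toy_ontology).items():
    print(identifier, violations)
```

Keeping the pattern and the required keys as data rather than code mirrors the idea that check constraints can be stored and exchanged for later re-use.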
BMC Medical Informatics and Decision Making | 2014
Michael Braun; Alexander U. Brandt; Stefan Schulz; Martin Boeker
Background: Numerous information models for electronic health records, such as openEHR archetypes, are available. The quality of such clinical models is important to guarantee standardised semantics and to facilitate their interoperability. However, validation aspects have not yet received sufficient attention. The objective of this report is to investigate the feasibility of archetype development and its community-based validation process, presuming that this review process is a practical way to ensure high-quality information models amending the formal reference model definitions. Methods: A standard archetype development approach was applied to a case set of three clinical tests for multiple sclerosis assessment. After an analysis of the tests, the obtained data elements were organised and structured. The appropriate archetype class was selected and the data elements were implemented in an iterative refinement process. Clinical and information modelling experts validated the models in a structured review process. Results: Four new archetypes were developed and publicly deployed in the openEHR Clinical Knowledge Manager, an online platform provided by the openEHR Foundation. Afterwards, these four archetypes were validated by domain experts in a team review. The review was a formalised process, organised in the Clinical Knowledge Manager. Both the development and the review process turned out to be time-consuming tasks, mostly due to difficult selection processes between alternative modelling approaches. The archetype review was a straightforward team process with the goal of validating archetypes pragmatically. Conclusions: The quality of medical information models is crucial to guarantee standardised semantic representation in order to improve interoperability. The validation process is a practical way to better harmonise models that diverge due to necessary flexibility left open by the underlying formal reference model definitions. This case study provides evidence that both community- and tool-enabled review processes, structured in the Clinical Knowledge Manager, ensure archetype quality. It offers a pragmatic but feasible way to reduce variation in the representation of clinical information models towards a more unified and interoperable model.
Methods of Information in Medicine | 2009
Stefan Schulz; Martin Boeker; Holger Stenzhorn; Jörg M. Niggemann
OBJECTIVES The application of upper ontologies has been repeatedly advocated to support interoperability between different domain ontologies and to facilitate the shared use of data within and across disciplines. BioTop is an upper domain ontology that aims at aligning more specialized biomolecular and biomedical ontologies. The integration of BioTop and the upper ontology Basic Formal Ontology (BFO) is the objective of this study. METHODS BFO was manually integrated into BioTop, observing both its free text and formal definitions. BioTop classes were attached to BFO classes as children and BFO classes were reused in the formal definitions of BioTop classes. A description logics reasoner was used to check the logical consistency of this integration. The domain adequacy was checked manually by domain experts. RESULTS Logical inconsistencies were found by the reasoner when applying the BFO classes for fiat and aggregated objects in some of the BioTop class definitions. We discovered that the definition of those particular classes in BFO was dependent on the notion of physical connectedness. Hence we suggest ignoring a BFO subbranch in order not to hinder cross-granularity integration. CONCLUSION Without introducing a more sophisticated theory of granularity, the described problems cannot be properly dealt with. Whereas we argue that an upper ontology should be granularity-independent, we illustrate how granularity-dependent domain ontologies can still be embedded into the framework of BioTop in combination with BFO.
Journal of the American Medical Informatics Association | 2012
Pablo López-García; Martin Boeker; Arantza Illarramendi; Stefan Schulz
OBJECTIVES To study ontology modularization techniques when applied to SNOMED CT in a scenario in which no previous corpus of information exists and to examine if frequency-based filtering using MEDLINE can reduce subset size without discarding relevant concepts. MATERIALS AND METHODS Subsets were first extracted using four graph-traversal heuristics and one logic-based technique, and were subsequently filtered with frequency information from MEDLINE. Twenty manually coded discharge summaries from cardiology patients were used as signatures and test sets. The coverage, size, and precision of extracted subsets were measured. RESULTS Graph-traversal heuristics provided high coverage (71-96% of terms in the test sets of discharge summaries) at the expense of subset size (17-51% of the size of SNOMED CT). Pre-computed subsets and logic-based techniques extracted small subsets (1%), but coverage was limited (24-55%). Filtering reduced the size of large subsets to 10% while still providing 80% coverage. DISCUSSION Extracting subsets to annotate discharge summaries is challenging when no previous corpus exists. Ontology modularization provides valuable techniques, but the resulting modules grow as signatures spread across subhierarchies, yielding a very low precision. CONCLUSION Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available.
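The pipeline described in this abstract (extract a subset by graph traversal, filter it by term frequency, then measure coverage) can be sketched on a toy hierarchy. The upward traversal shown here is only one conceivable heuristic and is not claimed to be one of the four used in the study. The hierarchy, the frequency counts, the threshold, and all function names are hypothetical.

```python
from collections import deque


def upward_closure(signature, parents):
    """Collect the signature concepts plus all their ancestors (breadth-first)."""
    subset = set(signature)
    queue = deque(signature)
    while queue:
        concept = queue.popleft()
        for parent in parents.get(concept, ()):
            if parent not in subset:
                subset.add(parent)
                queue.append(parent)
    return subset


def frequency_filter(subset, frequency, min_count, keep=()):
    """Drop concepts seen fewer than min_count times, except those in keep."""
    keep = set(keep)
    return {c for c in subset if c in keep or frequency.get(c, 0) >= min_count}


def coverage(subset, test_concepts):
    """Fraction of the test concepts that are present in the subset."""
    test = set(test_concepts)
    return len(test & subset) / len(test) if test else 0.0


# Hypothetical is-a hierarchy (child -> parents) and corpus frequencies.
parents = {
    "myocardial infarction": ["heart disease"],
    "angina": ["heart disease"],
    "heart disease": ["cardiovascular disease"],
    "cardiovascular disease": ["disease"],
}
frequency = {
    "myocardial infarction": 200,
    "heart disease": 500,
    "cardiovascular disease": 3,
    "disease": 900,
}

signature = ["myocardial infarction"]
extracted = upward_closure(signature, parents)
filtered = frequency_filter(extracted, frequency, min_count=10, keep=signature)

print(sorted(filtered))
print(coverage(filtered, ["heart disease", "angina"]))
```

In this toy run the filter removes the rarely used ancestor while the signature concept is always retained, and the sibling concept that the traversal never reached lowers the coverage, which illustrates the trade-off between subset size and coverage.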
IEEE International Conference on Information Technology and Applications in Biomedicine | 2009
Dimitris K. Iakovidis; Daniel Schober; Martin Boeker; Stefan Schulz
Ontologies are an effective means to formally specify and constrain knowledge. They have proved their utility in various data mining applications, especially in annotating text to render it machine interpretable. More challenging research perspectives arise when ontologies are used to annotate images where the information is encoded in numeric pixel values rather than in natural language. Current approaches to bridge the gap between the pixel-based foundational representation and high level image semantics include the utilization of taxonomies describing 2D spatial relations between the depicted objects and hence linking image features with semantics. To this end we present a novel ontological approach that formalizes concepts and relations regarding image representations for medical image mining. It provides descriptors for pixels, image regions, image features, and clusters. It extends previous approaches by including assertions of spatial relations between clusters in multidimensional feature spaces. The relational assertions enable the linkage between a given image, image region and feature(s) to the object they represent. The proposed approach is more general than most current approaches and can be easily extended to support multimodal data mining.