Hilmar Lapp | Researchain


Publication

Featured research published by Hilmar Lapp.

PLOS Biology | 2015

Finding Our Way through Phenotypes

Andrew R. Deans; Suzanna E. Lewis; Eva Huala; Salvatore S. Anzaldo; Michael Ashburner; James P. Balhoff; David C. Blackburn; Judith A. Blake; J. Gordon Burleigh; Bruno Chanet; Laurel Cooper; Mélanie Courtot; Sándor Csösz; Hong Cui; Wasila M. Dahdul; Sandip Das; T. Alexander Dececchi; Agnes Dettai; Rui Diogo; Robert E. Druzinsky; Michel Dumontier; Nico M. Franz; Frank Friedrich; George V. Gkoutos; Melissa Haendel; Luke J. Harmon; Terry F. Hayamizu; Yongqun He; Heather M. Hines; Nizar Ibrahim

Imagine if we could compute across phenotype data as easily as genomic data; this article calls for efforts to realize this vision and discusses the potential benefits.

PLOS ONE | 2010

Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature

Wasila M. Dahdul; James P. Balhoff; Jeffrey M. Engeman; Terry Grande; Eric J. Hilton; Cartik R. Kothari; Hilmar Lapp; John G. Lundberg; Peter E. Midford; Monte Westerfield; Paula M. Mabee

Background: The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies.

Methodology/Principal Findings: We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, knifefishes) and their relatives with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish, http://zfin.org). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (Teleost Anatomy Ontology and Teleost Taxonomy Ontology) in combination with terms from a quality ontology (Phenotype and Trait Ontology). Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators.

Conclusions/Significance: The challenges we encountered and many of the curation standards and methods for improving consistency that we developed are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant and wild type, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies is selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics.
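
As a rough illustration of the Entity-Quality idea in the abstract above, a free-text character state can be modeled as a taxon paired with an anatomical entity term and a quality term. This is only a minimal sketch: the class names, the helper function, and all term IDs and labels below are hypothetical placeholders, not actual identifiers from the Teleost Anatomy Ontology, the Teleost Taxonomy Ontology, or the Phenotype and Trait Ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term. IDs used in this sketch are made-up placeholders."""
    term_id: str
    label: str


@dataclass(frozen=True)
class EQStatement:
    """One Entity-Quality annotation attached to a taxon."""
    taxon: Term
    entity: Term   # anatomical structure being described
    quality: Term  # quality that the structure exhibits

    def describe(self) -> str:
        return f"{self.taxon.label}: {self.entity.label} is {self.quality.label}"


def group_by_entity(statements):
    """Index statements by entity term ID so annotations can be compared."""
    index = {}
    for statement in statements:
        index.setdefault(statement.entity.term_id, []).append(statement)
    return index


# Hypothetical example terms (not real ontology identifiers).
dorsal_fin = Term("EX:0001", "dorsal fin")
absent = Term("EX:0002", "absent")
elongated = Term("EX:0003", "elongated")
species_a = Term("EX:1001", "example species A")
species_b = Term("EX:1002", "example species B")

statements = [
    EQStatement(species_a, dorsal_fin, absent),
    EQStatement(species_b, dorsal_fin, elongated),
]
by_entity = group_by_entity(statements)
print(statements[0].describe())
```

Because both statements point at the same entity term rather than at free text, they land in the same group, which is the property that makes cross-study comparison computable.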

PLOS ONE | 2010

Phenex: Ontological Annotation of Phenotypic Diversity

James P. Balhoff; Wasila M. Dahdul; Cartik R. Kothari; Hilmar Lapp; John G. Lundberg; Paula M. Mabee; Peter E. Midford; Monte Westerfield

Background: Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge.

Methodology/Principal Findings: Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices.

Conclusions/Significance: Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.

Systematic Biology | 2010

The Teleost Anatomy Ontology: Anatomical Representation for the Genomics Age

Wasila M. Dahdul; John G. Lundberg; Peter E. Midford; James P. Balhoff; Hilmar Lapp; Melissa Haendel; Monte Westerfield; Paula M. Mabee

The rich knowledge of morphological variation among organisms reported in the systematic literature has remained in free-text format, impractical for use in large-scale synthetic phylogenetic work. This noncomputable format has also precluded linkage to the large knowledgebase of genomic, genetic, developmental, and phenotype data in model organism databases. We have undertaken an effort to prototype a curated, ontology-based evolutionary morphology database that maps to these genetic databases (http://kb.phenoscape.org) to facilitate investigation into the mechanistic basis and evolution of phenotypic diversity. Among the first requirements in establishing this database was the development of a multispecies anatomy ontology with the goal of capturing anatomical data in a systematic and computable manner. An ontology is a formal representation of a set of concepts with defined relationships between those concepts. Multispecies anatomy ontologies in particular are an efficient way to represent the diversity of morphological structures in a clade of organisms, but they present challenges in their development relative to single-species anatomy ontologies. Here, we describe the Teleost Anatomy Ontology (TAO), a multispecies anatomy ontology for teleost fishes derived from the Zebrafish Anatomical Ontology (ZFA) for the purpose of annotating varying morphological features across species. To facilitate interoperability with other anatomy ontologies, TAO uses the Common Anatomy Reference Ontology as a template for its upper level nodes, and TAO and ZFA are synchronized, with zebrafish terms specified as subtypes of teleost terms. We found that the details of ontology architecture have ramifications for querying, and we present general challenges in developing a multispecies anatomy ontology, including refinement of definitions, taxon-specific relationships among terms, and representation of taxonomically variable developmental pathways.

Machine Vision and Applications | 2003

Robust DNA microarray image analysis

Norbert Brändle; Horst Bischof; Hilmar Lapp

DNA microarrays are an increasingly important tool that allows biologists to gain insight into the function of thousands of genes in a single experiment. Common to all array-based approaches is the necessity to analyze digital images of the scanned DNA array. The ultimate image analysis goal is to automatically quantify every individual array element (spot), providing information about the amount of DNA bound to a spot. Irrespective of the quantification strategy, the preliminary information to extract about a spot includes the mapping between its location in the digital image and its possibly distorted position in the spot array (gridding). We present a gridding approach divided into a spot-amplification step (matched filter), a rotation estimation step (Radon transform), and a grid spanning step. Quantification of the spots is performed by robust fitting of a parametric model to pixel intensities with the help of M-estimators. The main advantage of parametric spot fitting is its ability to cope with overlapping spots. If the goodness of fit is too poor, a semiparametric spot fitting is employed. We show that our approach is superior to simple quantification strategies such as averaging of the pixel intensities. The system was extensively tested on 1740 images resulting from two DNA libraries.
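
The contrast between averaging and robust estimation mentioned in the abstract above can be illustrated with a small sketch. The code below is not the authors' parametric spot model; it only compares a plain mean with a Huber M-estimate of location, computed by iteratively reweighted averaging, on made-up pixel intensities. The function name, the tuning constant 1.345, and the MAD-based scale estimate are conventional choices assumed here for illustration.

```python
import statistics


def huber_location(values, c=1.345, tol=1e-6, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted averaging."""
    values = list(values)
    mu = statistics.median(values)
    # Robust scale: median absolute deviation, rescaled for normal data.
    mad = statistics.median([abs(v - mu) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        return mu  # more than half the values are identical
    for _ in range(max_iter):
        weights = []
        for v in values:
            r = abs(v - mu) / scale
            # Full weight for small residuals, down-weight large ones.
            weights.append(1.0 if r <= c else c / r)
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Made-up spot intensities with one contaminated pixel (400).
pixels = [100, 102, 98, 101, 99, 103, 97, 400]
plain_mean = statistics.mean(pixels)
robust_estimate = huber_location(pixels)
print(plain_mean, robust_estimate)
```

On this toy data the single outlier pulls the mean far above the bulk of the intensities, while the M-estimate stays close to them, which is the general behavior that motivates robust fitting.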

Systematic Biology | 2012

NeXML: Rich, Extensible, and Verifiable Representation of Comparative Data and Metadata

Rutger A. Vos; James P. Balhoff; Jason Caravas; Mark T. Holder; Hilmar Lapp; Wayne P. Maddison; Peter E. Midford; Anurag Priyam; Jeet Sukumaran; Xuhua Xia; Arlin Stoltzfus

In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input–output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.

Journal of Applied Ichthyology | 2012

500,000 fish phenotypes: The new informatics landscape for evolutionary and developmental biology of the vertebrate skeleton

Paula M. Mabee; James P. Balhoff; Wasila M. Dahdul; Hilmar Lapp; Peter E. Midford; Monte Westerfield

The rich phenotypic diversity that characterizes the vertebrate skeleton results from evolutionary changes in regulation of genes that drive development. Although relatively little is known about the genes that underlie the skeletal variation among fish species, significant knowledge of genetics and development is available for zebrafish. Because developmental processes are highly conserved, this knowledge can be leveraged for understanding the evolution of skeletal diversity. We developed the Phenoscape Knowledgebase (KB; http://kb.phenoscape.org) to yield testable hypotheses of candidate genes involved in skeletal evolution. We developed a community anatomy ontology for fishes and ontology-based methods to represent complex free-text character descriptions of species in a computable format. With these tools, we populated the KB with comparative morphological data from the literature on over 2500 teleost fishes (mainly Ostariophysi) resulting in over 500,000 taxon phenotype annotations. The KB integrates these data with similarly structured phenotype data from zebrafish genes (http://zfin.org). Using ontology-based reasoning, candidate genes can be inferred for the phenotypes that vary across taxa, thereby uniting genetic and phenotypic data to formulate evo-devo hypotheses. The morphological data in the KB can be browsed, sorted, and aggregated in ways that provide unprecedented possibilities for data mining and discovery.

Journal of Biomedical Semantics | 2014

BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

Toshiaki Katayama; Mark D. Wilkinson; Kiyoko F. Aoki-Kinoshita; Shuichi Kawashima; Yasunori Yamamoto; Atsuko Yamaguchi; Shinobu Okamoto; Shin Kawano; Jin Dong Kim; Yue Wang; Hongyan Wu; Yoshinobu Kano; Hiromasa Ono; Hidemasa Bono; Simon Kocbek; Jan Aerts; Yukie Akune; Erick Antezana; Kazuharu Arakawa; Bruno Aranda; Joachim Baran; Jerven T. Bolleman; Raoul J. P. Bonnal; Pier Luigi Buttigieg; Matthew Campbell; Yi An Chen; Hirokazu Chiba; Peter J. A. Cock; K. Bretonnel Cohen; Alexandru Constantin

The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.

Evolution | 2009

Linking Big: The Continuing Promise of Evolutionary Synthesis

Brian L. Sidlauskas; Ganeshkumar Ganapathy; Einat Hazkani-Covo; Kristin P. Jenkins; Hilmar Lapp; Lauren W. McCall; Samantha A. Price; Ryan Scherle; Paula Ann Spaeth; David M. Kidd

Synthetic science promises an unparalleled ability to find new meaning in old data, extant results, or previously unconnected methods and concepts, but pursuing synthesis can be a difficult and risky endeavor. Our experience as biologists, informaticians, and educators at the National Evolutionary Synthesis Center has affirmed that synthesis can yield major insights, but also revealed that technological hurdles, prevailing academic culture, and general confusion about the nature of synthesis can hamper its progress. By presenting our view of what synthesis is, why it will continue to drive progress in evolutionary biology, and how to remove barriers to its progress, we provide a map to a future in which all scientists can engage productively in synthetic research.

Systematic Biology | 2015

Toward Synthesizing Our Knowledge of Morphology: Using Ontologies and Machine Reasoning to Extract Presence/Absence Evolutionary Phenotypes across Studies

T. Alexander Dececchi; James P. Balhoff; Hilmar Lapp; Paula M. Mabee

The reality of larger and larger molecular databases and the need to integrate data scalably have presented a major challenge for the use of phenotypic data. Morphology is currently primarily described in discrete publications, entrenched in non-computer-readable text, and requires enormous investments of time and resources to integrate across large numbers of taxa and studies. Here we present a new methodology, using ontology-based reasoning systems working with the Phenoscape Knowledgebase (KB; kb.phenoscape.org), to automatically integrate large amounts of evolutionary character state descriptions into a synthetic character matrix of neomorphic (presence/absence) data. Using the KB, which includes more than 55 studies of sarcopterygian taxa, we generated a synthetic supermatrix of 639 variable characters scored for 1051 taxa, resulting in over 145,000 populated cells. Of these characters, over 76% were made variable through the addition of inferred presence/absence states derived by machine reasoning over the formal semantics of the source ontologies. Inferred data reduced the missing data in the variable character-subset from 98.5% to 78.2%. Machine reasoning also enables the isolation of conflicts in the data, that is, cells where both presence and absence are indicated; reports regarding conflicting data provenance can be generated automatically. Further, reasoning enables quantification and new visualizations of the data, here for example, allowing identification of character space that has been undersampled across the fin-to-limb transition. The approach and methods demonstrated here to compute synthetic presence/absence supermatrices are applicable to any taxonomic and phenotypic slice across the tree of life, provided the data are semantically annotated. Because such data can also be linked to model organism genetics through computational scoring of phenotypic similarity, they open a rich set of future research questions into phenotype-to-genome relationships.
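
The kind of inference and conflict detection described in the abstract above can be sketched on a toy example. This is not the Phenoscape reasoner or its ontologies: the three-term part hierarchy, the taxon names, and the two rules used (presence of a part implies presence of the whole; absence of the whole implies absence of its parts) are simplifying assumptions made purely for illustration.

```python
# Hypothetical part-of hierarchy: each structure maps to the whole it is part of.
PART_OF = {
    "digit": "autopod",
    "autopod": "limb",
}


def ancestors(structure):
    """All wholes that the structure is (transitively) part of."""
    found = []
    while structure in PART_OF:
        structure = PART_OF[structure]
        found.append(structure)
    return found


def descendants(structure):
    """All structures that are (transitively) part of the given structure."""
    return [s for s in PART_OF if structure in ancestors(s)]


def infer(asserted):
    """Expand asserted presence/absence cells and report conflicting cells.

    asserted maps (taxon, structure) to True (present) or False (absent).
    Returns (states, conflicts); states maps each cell to a set of booleans.
    """
    states = {}
    for (taxon, structure), present in asserted.items():
        states.setdefault((taxon, structure), set()).add(present)
        # Presence propagates up to wholes; absence propagates down to parts.
        implied = ancestors(structure) if present else descendants(structure)
        for other in implied:
            states.setdefault((taxon, other), set()).add(present)
    conflicts = sorted(cell for cell, vals in states.items() if len(vals) == 2)
    return states, conflicts


# Made-up assertions, as if drawn from different studies.
asserted = {
    ("taxon A", "digit"): True,
    ("taxon B", "limb"): False,
    ("taxon C", "digit"): True,
    ("taxon C", "autopod"): False,
}
states, conflicts = infer(asserted)
print(len(states), conflicts)
```

Four asserted cells expand to nine populated cells here, and the two contradictory assertions for taxon C surface as conflicts, mirroring in miniature how inferred states can fill a sparse matrix and flag cells where both presence and absence are indicated.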
