Publication


Featured research published by Jasmin Saric.


Nature Reviews Genetics | 2006

Literature mining for the biologist: from information retrieval to biological discovery

Lars Juhl Jensen; Jasmin Saric; Peer Bork

For the average biologist, hands-on literature mining currently means a keyword search in PubMed. However, methods for extracting biomedical facts from the scientific literature have improved considerably, and the associated tools will probably soon be used in many laboratories to automatically annotate and analyse the growing number of system-wide experimental data sets. Owing to the increasing body of text and the open-access policies of many journals, literature mining is also becoming useful for both hypothesis generation and biological discovery. However, the latter will require the integration of literature and high-throughput data, which should encourage close collaborations between biologists and computational linguists.


Bioinformatics | 2006

Extraction of regulatory gene/protein networks from Medline

Jasmin Saric; Lars Juhl Jensen; Rossitza Ouzounova; Isabel Rojas; Peer Bork

MOTIVATION: We have previously developed a rule-based approach for extracting information on the regulation of gene expression in yeast. The biomedical literature, however, contains information on several other equally important regulatory mechanisms, in particular phosphorylation, which we have now extended our rule-based system to extract as well. RESULTS: This paper presents new results for extraction of relational information from biomedical text. We have improved our system, STRING-IE, to capture both new types of linguistic constructs and new types of biological information [i.e. (de-)phosphorylation]. The precision remains stable with a slight increase in recall. From almost one million PubMed abstracts related to four model organisms, we manage to extract regulatory networks and binary phosphorylations comprising 3,319 relation chunks. The accuracy is 83-90% and 86-95% for gene expression and (de-)phosphorylation relations, respectively. To achieve this, we made use of an organism-specific resource of gene/protein names considerably larger than those used in most other biology-related information extraction approaches. These names were included in the lexicon when retraining the part-of-speech (POS) tagger on the GENIA corpus. For the domain in question, an accuracy of 96.4% was attained on POS tags. It should be noted that the rules were developed for yeast and successfully applied to both abstracts and full-text articles related to other organisms with comparable accuracy. AVAILABILITY: The revised GENIA corpus, the POS tagger, the extraction rules and the full sets of extracted relations are available from http://www.bork.embl.de/Docu/STRING-IE
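As a rough illustration of what lexicon-driven, rule-based relation extraction can look like, here is a minimal Python sketch. It is not the STRING-IE system: the name lexicon, the trigger verbs, the single surface pattern and the function name `extract_relations` are all hypothetical, and the real system relies on POS tagging and far richer syntacto-semantic rules than one regular expression.

```python
import re

# Hypothetical toy lexicon; the resource described in the abstract is
# organism-specific and much larger.
GENE_NAMES = {"Gal4", "GAL1", "Cdc28", "Swe1", "Mig1", "SUC2"}

# Hypothetical trigger verbs mapped to the relation type they signal.
TRIGGERS = {
    "activates": "expression_activation",
    "represses": "expression_repression",
    "phosphorylates": "phosphorylation",
    "dephosphorylates": "dephosphorylation",
}

# Longer alternatives first so the regex prefers the most specific match.
NAME = "|".join(sorted(map(re.escape, GENE_NAMES), key=len, reverse=True))
VERB = "|".join(sorted(TRIGGERS, key=len, reverse=True))

# One surface rule: <name> <trigger verb> [the] [expression of] <name>
PATTERN = re.compile(
    rf"\b(?P<agent>{NAME})\b\s+(?P<verb>{VERB})\s+"
    rf"(?:the\s+)?(?:expression\s+of\s+)?\b(?P<target>{NAME})\b"
)


def extract_relations(sentence):
    """Return (agent, relation_type, target) triples found in one sentence."""
    return [
        (m["agent"], TRIGGERS[m["verb"]], m["target"])
        for m in PATTERN.finditer(sentence)
    ]


print(extract_relations("Gal4 activates the expression of GAL1."))
print(extract_relations("Cdc28 phosphorylates Swe1 in vivo."))
```

Restricting both arguments to names from the lexicon is what keeps such a rule precise; a sentence that mentions an unknown name simply yields no relation.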


Data Integration in the Life Sciences | 2006

SABIO-RK: integration and curation of reaction kinetics data

Ulrike Wittig; Martin Golebiewski; Renate Kania; Olga Krebs; Saqib Mir; Andreas Weidemann; Stefanie Anstein; Jasmin Saric; Isabel Rojas

Simulating networks of biochemical reactions requires reliable kinetic data. To facilitate access to such kinetic data, we have developed SABIO-RK, a curated database with information about biochemical reactions and their kinetic properties. The data are manually extracted from the literature and verified by curators with respect to standards, formats and controlled vocabularies. This process is supported by tools in a semi-automatic manner. SABIO-RK contains and merges information about reactions, such as reactants and modifiers, organism, tissue and cellular location, as well as the kinetic properties of the reactions. The type of the kinetic mechanism, modes of inhibition or activation, and corresponding rate equations are presented together with their parameters and measured values, specifying the experimental conditions under which these were determined. Links to other databases enable the user to gather further information and to refer to the original publication. Information about reactions and their kinetic data can be exported to an SBML file, allowing users to employ the information as the basis for their simulation models.
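To make the idea of a rate equation stored together with its parameters and experimental conditions concrete, here is a minimal Python sketch. The `KineticEntry` fields, the function name and every numeric value are hypothetical placeholders chosen for illustration; they do not reflect SABIO-RK's actual schema or any measured data. The Michaelis-Menten rate law is used only as a familiar example of a kinetic mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class KineticEntry:
    """Hypothetical record: a reaction plus its kinetic description."""
    reaction: str
    organism: str
    mechanism: str
    parameters: dict = field(default_factory=dict)   # e.g. Vmax, Km
    conditions: dict = field(default_factory=dict)   # e.g. pH, temperature


def michaelis_menten_rate(vmax, km, substrate):
    """Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)


# Made-up values, for illustration only.
entry = KineticEntry(
    reaction="S -> P",
    organism="example organism",
    mechanism="Michaelis-Menten",
    parameters={"Vmax": 10.0, "Km": 2.0},
    conditions={"pH": 7.0, "temperature_C": 25.0},
)

rate = michaelis_menten_rate(
    entry.parameters["Vmax"], entry.parameters["Km"], substrate=2.0
)
print(rate)  # at [S] = Km the rate is half of Vmax
```

Keeping the conditions next to the parameters matters because kinetic constants are only meaningful for the conditions under which they were determined.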


Data and Knowledge Engineering | 2005

Ontology-driven discourse analysis for information extraction

Philipp Cimiano; Uwe Reyle; Jasmin Saric

This paper presents a novel approach to discourse analysis within information extraction systems. It uses Discourse Representation Theory (DRT) as a formal representation of the linguistic context, and a domain-specific ontology as the basis for computing conceptual relations between extracted events, thus establishing discourse coherence. The approach has been implemented within GenIE, an information extraction system that aims to extract information about biochemical pathways and about the sequences, structures and functions of genomes and proteins. The approach is evaluated against a semantically hand-annotated set of Swiss-Prot protein function descriptions and shows very promising results.


Drug Discovery Today | 2011

Empowering industrial research with shared biomedical vocabularies

Lee Harland; Christopher Larminie; Susanna-Assunta Sansone; Sorana Popa; M. Scott Marshall; Michael Braxenthaler; Michael N. Cantor; Wendy Filsell; Mark J. Forster; Enoch S. Huang; Andreas Matern; Mark A. Musen; Jasmin Saric; Ted Slater; Jabe Wilson; Nick Lynch; John Wise; Ian Dix

The life science industries (including pharmaceuticals, agrochemicals and consumer goods) are exploring new business models for research and development that focus on external partnerships. In parallel, there is a desire to make better use of data obtained from sources such as human clinical samples to inform and support early research programmes. Success in both areas depends upon the successful integration of heterogeneous data from multiple providers and scientific domains, something that is already a major challenge within the industry. This issue is exacerbated by the absence of agreed standards that unambiguously identify the entities, processes and observations within experimental results. In this article we highlight the risks to future productivity that are associated with incomplete biological and chemical vocabularies and suggest a new model to address this long-standing issue.


Comparative and Functional Genomics | 2003

Developing a protein‐interactions ontology

Esther Ratsch; Jörg Schultz; Jasmin Saric; Philipp Cimiano Lavin; Ulrike Wittig; Uwe Reyle; Isabel Rojas

The prediction and analysis of a protein’s function is an ongoing challenge in the field of genomics. With upcoming datasets on protein interactions [9], it is becoming evident that the function of a protein can only be understood when taking its interaction with other molecules into account. Most current approaches to the classification and description of protein function, such as the Gene Ontology [8], focus on single proteins. These annotation efforts should be paralleled by the development of ontologies dealing with the interactions of a protein with other biomolecules. Currently, most approaches to building such ontologies focus on metabolism [3,6]. So far, for interactions, only high-level classifications have been created [4], developed to assist information extraction from text. In addition to assisting text mining, a more fine-grained (in comparison to these classifications) ontology on protein interactions could be helpful in database development and information mining. As an ontology captures domain knowledge in a computer-understandable way, it can be used for inferencing, i.e. deriving new knowledge from existing data. There are two important points to consider in developing such a formal ontology: (a) it should be independent of its final use; and (b) it should not only restrict itself to a controlled vocabulary but the concepts should be related to each other in a semantically consistent manner, and rules governing these definitions and relations should be incorporated whenever necessary. Here we describe our approach for developing such an ontology.


Meeting of the Association for Computational Linguistics | 2004

Extracting Regulatory Gene Expression Networks from PubMed

Jasmin Saric; Lars Juhl Jensen; Peer Bork; Rossitza Ouzounova; Isabel Rojas

We present an approach using syntacto-semantic rules for the extraction of relational information from biomedical abstracts. The results show that by overcoming the hurdle of technical terminology, high-precision results can be achieved. From 58,664 abstracts related to baker's yeast, we manage to extract a regulatory network comprising 441 pairwise relations with an accuracy of 83-90%. To achieve this, we made use of a resource of gene/protein names considerably larger than those used in most other biology-related information extraction approaches. This list of names was included in the lexicon of our retrained part-of-speech tagger for use on molecular biology abstracts. For the domain in question, an accuracy of 93.6-97.7% was attained on POS tags. The method is easily adapted to organisms other than yeast, allowing us to extract many more biologically relevant relations.


Journal of Integrative Bioinformatics | 2007

SABIO-RK: A data warehouse for biochemical reactions and their kinetics

Olga Krebs; Martin Golebiewski; Renate Kania; Saqib Mir; Jasmin Saric; Andreas Weidemann; Ulrike Wittig; Isabel Rojas

Systems biology is an emerging field that aims at obtaining a system-level understanding of biological processes. The modelling and simulation of networks of biochemical reactions have great and promising application potential but require reliable kinetic data. In order to support the systems biology community with such data we have developed SABIO-RK (System for the Analysis of Biochemical Pathways - Reaction Kinetics), a curated database with information about biochemical reactions and their kinetic properties, which allows researchers to obtain and compare kinetic data and to integrate them into models of biochemical networks. SABIO-RK is freely available for academic use at http://sabio.villa-bosch.de/SABIORK/.


Methods in Molecular Biology | 2008

Discovering Biomedical Knowledge from the Literature

Jasmin Saric; Henriette Engelken; Uwe Reyle

Biomedical knowledge is to a very large extent represented only in textual form. To make this knowledge accessible to humans and/or further automatic processing, text mining applications have been developed. At the end of this chapter we present an overview of the most important open access applications and their functionality. The main part of the paper is devoted to the major problems with which all such applications have to deal. The first problem is terminology processing, i.e., recognizing biomedical terms and identifying their meanings, at least to a certain degree. The second problem is to bring together information units that are distributed over more than one sentence. The task of coreference resolution consists of identifying the entities to which the text refers in different sentences and in different ways. The third problem we discuss is that of information extraction, in particular, extraction of relational information. The representation of the domain knowledge is an indispensable component of any text mining application. We discuss different types and depths of ontological modeling and how this knowledge helps to accomplish the tasks described above. An overview of ontological resources is given at the end of the chapter.


International Joint Conference on Artificial Intelligence | 2005

Unsupervised learning of semantic relations between concepts of a molecular biology ontology

Massimiliano Ciaramita; Aldo Gangemi; Esther Ratsch; Jasmin Saric; Isabel Rojas

Collaboration


Dive into Jasmin Saric's collaborations.

Top Co-Authors

Isabel Rojas

Heidelberg Institute for Theoretical Studies

Ulrike Wittig

Heidelberg Institute for Theoretical Studies

Uwe Reyle

University of Stuttgart

Peer Bork

University of Würzburg

Martin Golebiewski

California Institute of Technology

Renate Kania

Heidelberg Institute for Theoretical Studies

Rossitza Ouzounova

European Bioinformatics Institute

Andreas Weidemann

Heidelberg Institute for Theoretical Studies
