Sumit Madan
Fraunhofer Society
Publication
Featured research published by Sumit Madan.
Database | 2016
Qinghua Wang; Shabbir Syed Abdul; Lara Monteiro Almeida; Sophia Ananiadou; Yalbi Itzel Balderas-Martínez; Riza Theresa Batista-Navarro; David Campos; Lucy Chilton; Hui-Jou Chou; Gabriela Contreras; Laurel Cooper; Hong-Jie Dai; Barbra Ferrell; Juliane Fluck; Socorro Gama-Castro; Nancy George; Georgios V. Gkoutos; Afroza Khanam Irin; Lars Juhl Jensen; Silvia Jimenez; Toni Rose Jue; Ingrid M. Keseler; Sumit Madan; Sérgio Matos; Peter McQuilton; Marija Milacic; Matthew Mort; Jeyakumar Natarajan; Evangelos Pafilis; Emiliano Pereira
Fully automated text mining (TM) systems promote efficient literature searching, retrieval, and review but are not sufficient to produce ready-to-consume curated documents. These systems are not meant to replace biocurators, but instead to assist them in one or more literature curation steps. To do so, the user interface is an important aspect that needs to be considered for tool adoption. The BioCreative Interactive task (IAT) is a track designed for exploring user-system interactions, promoting development of useful TM tools, and providing a communication channel between the biocuration and the TM communities. In BioCreative V, the IAT track followed a format similar to previous interactive tracks, where the utility and usability of TM tools, as well as the generation of use cases, have been the focal points. The proposed curation tasks are user-centric and formally evaluated by biocurators. In BioCreative V IAT, seven TM systems and 43 biocurators participated. Two levels of user participation were offered to broaden curator involvement and obtain more feedback on usability aspects. The full level participation involved training on the system, curation of a set of documents with and without TM assistance, tracking of time-on-task, and completion of a user survey. The partial level participation was designed to focus on usability aspects of the interface and not the performance per se. In this case, biocurators navigated the system by performing pre-designed tasks and then were asked whether they were able to achieve the task and the level of difficulty in completing the task. In this manuscript, we describe the development of the interactive task, from planning to execution and discuss major findings for the systems tested. Database URL: http://www.biocreative.org
Database | 2015
Justyna Szostak; Sam Ansari; Sumit Madan; Juliane Fluck; Marja Talikka; Anita R. Iskandar; Hector De Leon; Martin Hofmann-Apitius; Manuel C. Peitsch; Julia Hoeng
Capture and representation of scientific knowledge in a structured format are essential to improve the understanding of biological mechanisms involved in complex diseases. Biological knowledge and knowledge about standardized terminologies are difficult to capture from literature in a usable form. A semi-automated knowledge extraction workflow is presented that was developed to allow users to extract causal and correlative relationships from scientific literature and to transcribe them into the computable and human readable Biological Expression Language (BEL). The workflow combines state-of-the-art linguistic tools for recognition of various entities and extraction of knowledge from literature sources. Unlike most other approaches, the workflow outputs the results to a curation interface for manual curation and converts them into BEL documents that can be compiled to form biological networks. We developed a new semi-automated knowledge extraction workflow that was designed to capture and organize scientific knowledge and reduce the required curation skills and effort for this task. The workflow was used to build a network that represents the cellular and molecular mechanisms implicated in atherosclerotic plaque destabilization in an apolipoprotein-E-deficient (ApoE−/−) mouse model. The network was generated using knowledge extracted from the primary literature. The resultant atherosclerotic plaque destabilization network contains 304 nodes and 743 edges supported by 33 PubMed referenced articles. A comparison between the semi-automated and conventional curation processes showed similar results, but significantly reduced curation effort for the semi-automated process. Creating structured knowledge from unstructured text is an important step for the mechanistic interpretation and reusability of knowledge. Our new semi-automated knowledge extraction workflow reduced the curation skills and effort required to capture and organize scientific knowledge. The atherosclerotic plaque destabilization network that was generated is a causal network model for vascular disease demonstrating the usefulness of the workflow for knowledge extraction and construction of mechanistically meaningful biological networks.
Database | 2016
Juliane Fluck; Sumit Madan; Sam Ansari; Alpha Tom Kodamullil; Reagon Karki; Majid Rastegar-Mojarad; Natalie L. Catlett; William S. Hayes; Justyna Szostak; Julia Hoeng; Manuel C. Peitsch
Success in extracting biological relationships is mainly dependent on the complexity of the task as well as the availability of high-quality training data. Here, we describe the new corpora in the systems biology modeling language BEL for training and testing biological relationship extraction systems that we prepared for the BioCreative V BEL track. BEL was designed to capture relationships not only between proteins or chemicals, but also complex events such as biological processes or disease states. A BEL nanopub is the smallest unit of information and represents a biological relationship with its provenance. In BEL relationships (called BEL statements), the entities are normalized to defined namespaces mainly derived from public repositories, such as sequence databases, MeSH or publicly available ontologies. In the BEL nanopubs, the BEL statements are associated with citation information and supportive evidence such as a text excerpt. To enable the training of extraction tools, we prepared BEL resources and made them available to the community. We selected a subset of these resources focusing on a reduced set of namespaces, namely, human and mouse genes, ChEBI chemicals, MeSH diseases and GO biological processes, as well as relationship types ‘increases’ and ‘decreases’. The published training corpus contains 11 000 BEL statements from over 6000 supportive text excerpts. For method evaluation, we selected and re-annotated two smaller subcorpora containing 100 text excerpts. For this re-annotation, the inter-annotator agreement was measured by the BEL track evaluation environment and resulted in a maximal F-score of 91.18% for full statement agreement. In addition, for a set of 100 BEL statements, we provide not only the gold standard expert annotations but also text excerpts pre-selected by two automated systems. Those text excerpts were evaluated and manually annotated as true or false supportive in the course of the BioCreative V BEL track task. Database URL: http://wiki.openbel.org/display/BIOC/Datasets
Database | 2016
Fabio Rinaldi; Tilia Ellendorff; Sumit Madan; Simon Clematide; Adrian van der Lek; Theo Mevissen; Juliane Fluck
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text.
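The idea of giving credit at several levels can be illustrated with a minimal sketch. This is not the track's evaluation code: the level names, the helper functions `prf` and `evaluate_by_level`, and the example BEL fragments are all assumptions made for illustration. Each level is scored independently with precision, recall and F-score, so a prediction that finds the right entities but the wrong relation still earns credit at the entity level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    precision: float
    recall: float
    f1: float


def prf(predicted: set, gold: set) -> Scores:
    """Precision, recall and F1 of a predicted set against a gold set."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return Scores(precision, recall, f1)


def evaluate_by_level(predicted: dict, gold: dict) -> dict:
    """Score every gold level on its own (hypothetical level names)."""
    return {
        level: prf(predicted.get(level, set()), gold_items)
        for level, gold_items in gold.items()
    }


# Illustrative fragments only; they are not taken from the track corpora.
TNF = "p(HGNC:TNF)"
APOPTOSIS = 'bp(GOBP:"apoptotic process")'

gold = {
    "entity": {TNF, APOPTOSIS},
    "relation": {(TNF, "increases", APOPTOSIS)},
}
predicted = {
    "entity": {TNF, APOPTOSIS},
    "relation": {(TNF, "decreases", APOPTOSIS)},  # wrong relation type
}

result = evaluate_by_level(predicted, gold)
for level, scores in result.items():
    print(f"{level}: P={scores.precision:.2f} R={scores.recall:.2f} F1={scores.f1:.2f}")
```

In this toy case the prediction scores perfectly at the entity level and zero at the relation level, which is the kind of partial credit the abstract describes.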
Journal of Alzheimer's Disease | 2017
Alpha Tom Kodamullil; Anandhi Iyappan; Reagon Karki; Sumit Madan; Erfan Younesi; Martin Hofmann-Apitius
Perturbations in inflammatory pathways have been identified as one of the major factors leading to neurodegenerative diseases (NDD). Owing to the limited access to human brain tissue and the immense complexity of the brain, animal models, specifically mouse models, play a key role in advancing the NDD field. However, many of these mouse models fail to reproduce the clinical manifestations and end points of the disease. NDD drugs that passed the efficacy test in mice have repeatedly failed in clinical trials. Numerous studies both support and oppose the applicability of mouse models in neuroinflammation and NDD. In this paper, we assess to what extent a mouse can mimic the cellular and molecular interactions in humans at a mechanism level. Based on our mechanistic modeling approach, we investigate the failure of a neuroinflammation-targeted drug in the late phases of clinical trials using comparative analyses between the two species.
Database | 2016
Sumit Madan; Sven Hodapp; Philipp Senger; Sam Ansari; Justyna Szostak; Julia Hoeng; Manuel C. Peitsch; Juliane Fluck
Network-based approaches have become extremely important in systems biology to achieve a better understanding of biological mechanisms. For network representation, the Biological Expression Language (BEL) is well designed to collate findings from the scientific literature into biological network models. To facilitate encoding and biocuration of such findings in BEL, a BEL Information Extraction Workflow (BELIEF) was developed. BELIEF provides a web-based curation interface, the BELIEF Dashboard, that incorporates text mining techniques to support the biocurator in the generation of BEL networks. The underlying UIMA-based text mining pipeline (BELIEF Pipeline) uses several named entity recognition processes and relationship extraction methods to detect concepts and BEL relationships in literature. The BELIEF Dashboard allows easy curation of the automatically generated BEL statements and their context annotations. Resulting BEL statements and their context annotations can be syntactically and semantically verified to ensure consistency in the BEL network. In summary, the workflow supports experts in different stages of systems biology network building. Based on the BioCreative V BEL track evaluation, we show that the BELIEF Pipeline automatically extracts relationships with an F-score of 36.4% and fully correct statements can be obtained with an F-score of 30.8%. Participation in the BioCreative V Interactive task (IAT) track with BELIEF revealed a System Usability Scale (SUS) score of 67. Considering the complexity of the task for new users (learning BEL, working with a completely new interface, and performing complex curation), a score so close to the overall SUS average highlights the usability of BELIEF. Database URL: BELIEF is available at http://www.scaiview.com/belief/
International Conference on Electronic Participation | 2012
Roman Klinger; Philipp Senger; Sumit Madan; Michal Jacovi
Policy decisions in governmental models often depend on how they are perceived and accepted by the general public. Traditional methods for harvesting opinions, such as telephone or street surveys, are time-intensive and costly, and direct interaction between a member of government and the population is limited. Social media offer the chance to easily gather a large number of opinions and proposals in the form of poll participation or contributions to interactive debates.
bioRxiv | 2017
Raza-Ur Rahman; Abdul Sattar; Maksims Fiosins; Abhivyakti Gautam; Daniel Sumner Magruder; Joern Bethune; Sumit Madan; Juliane Fluck; Stefan Bonn
Small RNAs (sRNAs) are important biomolecules that exert vital functions in organismal health and disease, from viruses to plants, animals, and humans. Given the ever-increasing amounts of sRNA deep sequencing data in online repositories and their potential roles in disease therapy and diagnosis, it is important to enable federated sRNA expression querying across samples, organisms, tissues, cell types, and diseases. Here we present the sRNA Expression Atlas (SEA), a web application that allows for the search of known and novel small RNAs across ten organisms using standardized search terms and ontologies. SEA contains re-analyzed sRNA expression information for over 2000 published samples, including many disease datasets and over 700 novel, high-quality predicted miRNAs. We believe that SEA's simple interface and fast search, in combination with its detailed interactive reports, will enable researchers to better understand the potential function and diagnostic value of sRNAs across tissues, diseases, and organisms. Availability and Implementation: SEA is implemented in Java, J2EE, Python, R, PHP and JavaScript. It is freely available at http://sea.dzne.de
Meeting of the Association for Computational Linguistics | 2013
Juliane Fluck; Alexander Klenner; Sumit Madan; Sam Ansari; Tamara Bobić; Julia Hoeng; Martin Hofmann-Apitius; Manuel C. Peitsch
F1000Research | 2017
Justyna Szostak; Sumit Madan; William S. Hayes; Jens Doerpinghaus; Juliane Fluck; Marja Talikka; Manuel C. Peitsch; Julia Hoeng