Publication


Featured research published by Nona Naderi.


BMC Genomics | 2010

Algorithms and semantic infrastructure for mutation impact extraction and grounding

Jonas B. Laurila; Nona Naderi; René Witte; Alexandre Riazanov; Alexandre Kouznetsov; Christopher J. O. Baker

Background: Mutation impact extraction is a hitherto unaccomplished task in state of the art mutation extraction systems. Protein mutations and their impacts on protein properties are hidden in scientific literature, making them poorly accessible for protein engineers and inaccessible for phenotype-prediction systems that currently depend on manually curated genomic variation databases. Results: We present the first rule-based approach for the extraction of mutation impacts on protein properties, categorizing their directionality as positive, negative or neutral. Furthermore protein and mutation mentions are grounded to their respective UniProtKB IDs and selected protein properties, namely protein functions to concepts found in the Gene Ontology. The extracted entities are populated to an OWL-DL Mutation Impact ontology facilitating complex querying for mutation impacts using SPARQL. We illustrate retrieval of proteins and mutant sequences for a given direction of impact on specific protein properties. Moreover we provide programmatic access to the data through semantic web services using the SADI (Semantic Automated Discovery and Integration) framework. Conclusion: We address the problem of access to legacy mutation data in unstructured form through the creation of novel mutation impact extraction methods which are evaluated on a corpus of full-text articles on haloalkane dehalogenases, tagged by domain experts. Our approaches show state of the art levels of precision and recall for Mutation Grounding and respectable level of precision but lower recall for the task of Mutant-Impact relation extraction. The system is deployed using text mining and semantic web technologies with the goal of publishing to a broad spectrum of consumers.


Bioinformatics | 2011

OrganismTagger: detection, normalization and grounding of organism entities in biomedical documents

Nona Naderi; Thomas Kappler; Christopher J. O. Baker; René Witte

MOTIVATION Semantic tagging of organism mentions in full-text articles is an important part of literature mining and semantic enrichment solutions. Tagged organism mentions also play a pivotal role in disambiguating other entities in a text, such as proteins. A high-precision organism tagging system must be able to detect the numerous forms of organism mentions, including common names as well as the traditional taxonomic groups: genus, species and strains. In addition, such a system must resolve abbreviations and acronyms, assign the scientific name and if possible link the detected mention to the NCBI Taxonomy database for further semantic queries and literature navigation. RESULTS We present the OrganismTagger, a hybrid rule-based/machine learning system to extract organism mentions from the literature. It includes tools for automatically generating lexical and ontological resources from a copy of the NCBI Taxonomy database, thereby facilitating system updates by end users. Its novel ontology-based resources can also be reused in other semantic mining and linked data tasks. Each detected organism mention is normalized to a canonical name through the resolution of acronyms and abbreviations and subsequently grounded with an NCBI Taxonomy database ID. In particular, our system combines a novel machine-learning approach with rule-based and lexical methods for detecting strain mentions in documents. On our manually annotated OT corpus, the OrganismTagger achieves a precision of 95%, a recall of 94% and a grounding accuracy of 97.5%. On the manually annotated corpus of Linnaeus-100, the results show a precision of 99%, recall of 97% and grounding accuracy of 97.4%. AVAILABILITY The OrganismTagger, including supporting tools, resources, training data and manual annotations, as well as end user and developer documentation, is freely available under an open-source license at http://www.semanticsoftware.info/organism-tagger. CONTACT [email protected].


BMC Genomics | 2012

Automated extraction and semantic analysis of mutation impacts from the biomedical literature.

Nona Naderi; René Witte

Background: Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Accessing the mutational information and their impacts on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually curating the rich and fast growing repository of biomedical literature is expensive and time-consuming. As a solution, text mining approaches have increasingly been deployed in the biomedical domain. While the detection of single-point mutations is well covered by existing systems, challenges still exist in grounding impacts to their respective mutations and recognizing the affected protein properties, in particular kinetic and stability properties together with physical quantities. Results: We present an ontology model for mutation impacts, together with a comprehensive text mining system for extracting and analysing mutation impact information from full-text articles. Organisms, as sources of proteins, are extracted to help disambiguation of genes and proteins. Our system then detects mutation series to correctly ground detected impacts using novel heuristics. It also extracts the affected protein properties, in particular kinetic and stability properties, as well as the magnitude of the effects and validates these relations against the domain ontology. The output of our system can be provided in various formats, in particular by populating an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on our manually annotated corpora. In the impact detection task, our system achieves a precision of 70.4%-71.1%, a recall of 71.3%-71.5%, and grounds the detected impacts with an accuracy of 76.5%-77%. The developed system, including resources, evaluation data and end-user and developer documentation is freely available under an open source license at http://www.semanticsoftware.info/open-mutation-miner. Conclusion: We present Open Mutation Miner (OMM), the first comprehensive, fully open-source approach to automatically extract impacts and related relevant information from the biomedical literature. We assessed the performance of our work on manually annotated corpora and the results show the reliability of our approach. The representation of the extracted information into a structured format facilitates knowledge management and aids in database curation and correction. Furthermore, access to the analysis results is provided through multiple interfaces, including web services for automated data integration and desktop-based solutions for end user interactions.


International World Wide Web Conferences | 2013

Vaccine attitude surveillance using semantic analysis: constructing a semantically annotated corpus

Stephanie Brien; Nona Naderi; Arash Shaban-Nejad; Luke Mondor; Doerthe Kroemker; David L. Buckeridge

This paper reports work in progress to semantically annotate blog posts about vaccines to use in the Vaccine Attitude Surveillance using Semantic Analysis (VASSA) framework. The VASSA framework combines semantic web and natural language processing (NLP) tools and techniques to provide a coherent semantic layer across online social media for assessment and analysis of vaccination attitudes and beliefs. We describe how the blog posts were sampled and selected, our schema to semantically annotate concepts defined in our ontology, details of the annotation process, and inter-annotator agreement on a sample of blog posts.


PRIMA Workshops | 2014

Argumentation Mining in Parliamentary Discourse

Nona Naderi; Graeme Hirst

We examine whether using frame choices in forum statements can help us identify framing strategies in parliamentary discourse. In this analysis, we show how features based on embedding representations can improve the discovery of various frames in argumentative political speech. Given the complex nature of the parliamentary discourse, the initial results that are presented here are promising. We further present a manually annotated corpus for frame recognition in parliamentary discourse.


World Congress on Medical and Health Informatics, MEDINFO | 2013

PHIO: a knowledge base for interpretation and calculation of public health indicators.

Arash Shaban-Nejad; Anya Okhmatovskaia; Masoumeh T. Izadi; Nona Naderi; Luke Mondor; Christian Jauvin; David L. Buckeridge

Existing population health indicators tend to be out-of-date, not fully available at local levels of geography, and not developed in a coherent/consistent manner, which hinders their use in public health. The PopHR platform aims to deliver an electronic repository that contains multiple aggregated clinical, administrative, and environmental data sources to provide a coherent view of the health status of populations in the province of Quebec, Canada. This platform is designed to provide representative information in near-real time with high geographical resolution, thereby assisting public health professionals, analysts, clinicians and the public in decision-making. This paper presents our ongoing efforts to develop an integrated population health indicator ontology (PHIO) that captures the knowledge required for calculation and interpretation of health indicators within a PopHR semantic framework.


Conference on Information and Knowledge Management | 2011

Semantic text mining for lignocellulose research

Marie-Jean Meurs; Caitlin Murphy; Ingo Morgenstern; Nona Naderi; Greg Butler; Justin Powlowski; Adrian Tsang; René Witte

Semantic technologies, including natural language processing (NLP), ontologies, semantic web services and web-based collaboration tools, promise to support users in dealing with complex data, thereby facilitating knowledge-intensive tasks. An ongoing challenge is to select the appropriate technologies and combine them in a coherent system that brings measurable improvements to the users. We present our ongoing development of a semantic infrastructure in support of genomics-based lignocellulose research. Part of this effort is the automated curation of knowledge from information on enzymes from fungi that is available in the literature and genome resources. Fungi naturally break down lignocellulose, hence the identification and characterization of the enzymes that they use in lignocellulose hydrolysis is an important part of research and development of biomass-derived products and fuels. Working closely with the biology researchers who manually curate the existing literature, we developed ontological NLP pipelines integrated in a Web-based interface to help them in two main tasks: mining the literature for relevant information, and at the same time providing rich and semantically linked information.


Meeting of the Association for Computational Linguistics | 2017

Argumentation Quality Assessment: Theory vs. Practice.

Henning Wachsmuth; Nona Naderi; Ivan Habernal; Yufang Hou; Graeme Hirst; Iryna Gurevych; Benno Stein

Argumentation quality is viewed differently in argumentation theory and in practical assessment approaches. This paper studies to what extent the views match empirically. We find that most observations on quality phrased spontaneously are in fact adequately represented by theory. Even more, relative comparisons of arguments in practice correlate with absolute quality ratings based on theory. Our results clarify how the two views can learn from each other.


Recent Advances in Natural Language Processing | 2017

Classifying Frames at the Sentence Level in News Articles.

Nona Naderi; Graeme Hirst

Previous approaches to generic frame classification analyze frames at the document level. Here, we propose a supervised approach based on deep neural networks and distributional representations for classifying frames at the sentence level in news articles. We conduct our experiments on the publicly available Media Frames Corpus compiled from U.S. newspapers. Using (B)LSTMs and GRU networks to represent the meaning of frames, we demonstrate that our approach yields an improvement of at least 14 points over several baseline methods.


Canadian Journal of Political Science | 2017

Digitization of the Canadian Parliamentary debates

Kaspar Beelen; T. Alberdingk Thijm; Christopher Cochrane; K. Halvemaan; Graeme Hirst; M. Kimmins; S. Lijbrink; Maarten Marx; Nona Naderi; L. Rheault; R. Polyanovsky; T. Whyte

This paper describes the digitization and enrichment of the Canadian House of Commons English Debates from 1901 to the present. We start by laying out the general framework in which this project took place and then present the structure of the database and provide guidelines to prospective users. The paper concludes with the introduction of www.lipad.ca, an online platform designed as a hub for archiving Canadian political data, with the parliamentary proceedings at the centre of its architecture.
