Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Isabel Segura-Bedmar is active.

Publication


Featured researches published by Isabel Segura-Bedmar.


Journal of Cheminformatics | 2015

The CHEMDNER corpus of chemicals and drugs and its annotation principles

Martin Krallinger; Obdulia Rabal; Florian Leitner; Miguel Vazquez; David Salgado; Zhiyong Lu; Robert Leaman; Yanan Lu; Donghong Ji; Daniel M. Lowe; Roger A. Sayle; Riza Theresa Batista-Navarro; Rafal Rak; Torsten Huber; Tim Rocktäschel; Sérgio Matos; David Campos; Buzhou Tang; Hua Xu; Tsendsuren Munkhdalai; Keun Ho Ryu; S. V. Ramanan; Senthil Nathan; Slavko Žitnik; Marko Bajec; Lutz Weber; Matthias Irmer; Saber A. Akhondi; Jan A. Kors; Shuo Xu

The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/


Journal of Biomedical Informatics | 2011

Using a shallow linguistic kernel for drug-drug interaction extraction

Isabel Segura-Bedmar; Paloma Martínez; César de Pablo-Sánchez

A drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. Information Extraction (IE) techniques can provide health care professionals with an interesting way to reduce time spent reviewing the literature for potential drug-drug interactions. Nevertheless, no approach has been proposed to the problem of extracting DDIs in biomedical texts. In this article, we study whether a machine learning-based method is appropriate for DDI extraction in biomedical texts and whether the results provided are superior to those obtained from our previously proposed pattern-based approach. The method proposed here for DDI extraction is based on a supervised machine learning technique, more specifically, the shallow linguistic kernel proposed in Giuliano et al. (2006). Since no benchmark corpus was available to evaluate our approach to DDI extraction, we created the first such corpus, DrugDDI, annotated with 3169 DDIs. We performed several experiments varying the configuration parameters of the shallow linguistic kernel. The model that maximizes the F-measure was evaluated on the test data of the DrugDDI corpus, achieving a precision of 51.03%, a recall of 72.82% and an F-measure of 60.01%. To the best of our knowledge, this work has proposed the first full solution for the automatic extraction of DDIs from biomedical texts. Our study confirms that the shallow linguistic kernel outperforms our previous pattern-based approach. Additionally, it is our hope that the DrugDDI corpus will allow researchers to explore new solutions to the DDI extraction problem.


Journal of Biomedical Informatics | 2013

The DDI corpus

María Herrero-Zazo; Isabel Segura-Bedmar; Paloma Martínez; Thierry Declerck

The management of drug-drug interactions (DDIs) is a critical issue resulting from the overwhelming amount of information available on them. Natural Language Processing (NLP) techniques can provide an interesting way to reduce the time spent by healthcare professionals on reviewing biomedical literature. However, NLP techniques rely mostly on the availability of the annotated corpora. While there are several annotated corpora with biological entities and their relationships, there is a lack of corpora annotated with pharmacological substances and DDIs. Moreover, other works in this field have focused in pharmacokinetic (PK) DDIs only, but not in pharmacodynamic (PD) DDIs. To address this problem, we have created a manually annotated corpus consisting of 792 texts selected from the DrugBank database and other 233 Medline abstracts. This fined-grained corpus has been annotated with a total of 18,502 pharmacological substances and 5028 DDIs, including both PK as well as PD interactions. The quality and consistency of the annotation process has been ensured through the creation of annotation guidelines and has been evaluated by the measurement of the inter-annotator agreement between two annotators. The agreement was almost perfect (Kappa up to 0.96 and generally over 0.80), except for the DDIs in the MedLine database (0.55-0.72). The DDI corpus has been used in the SemEval 2013 DDIExtraction challenge as a gold standard for the evaluation of information extraction techniques applied to the recognition of pharmacological substances and the detection of DDIs from biomedical texts. DDIExtraction 2013 has attracted wide attention with a total of 14 teams from 7 different countries. For the task of recognition and classification of pharmacological names, the best system achieved an F1 of 71.5%, while, for the detection and classification of DDIs, the best result was F1 of 65.1%. These results show that the corpus has enough quality to be used for training and testing NLP techniques applied to the field of Pharmacovigilance. The DDI corpus and the annotation guidelines are free for use for academic research and are available at http://labda.inf.uc3m.es/ddicorpus.


Drug Discovery Today | 2008

Drug name recognition and classification in biomedical texts: A case study outlining approaches underpinning automated systems

Isabel Segura-Bedmar; Paloma Martínez; María Segura-Bedmar

This article presents a system for drug name recognition and classification in biomedical texts. The system combines information obtained by the Unified Medical Language System (UMLS) MetaMap Transfer (MMTx) program and nomenclature rules recommended by the World Health Organization (WHO) International Nonproprietary Names (INNs) Program to identify and classify pharmaceutical substances. Moreover, the system is able to detect possible candidates for drug names that have not been detected by MMTx program by applying these rules, achieving, in this way, a broader coverage. This work is the first step in a method for automatic detection of drug interactions from biomedical texts, a specific type of adverse drug event of special interest in patient safety.


Journal of Biomedical Informatics | 2014

Lessons learnt from the DDIExtraction-2013 Shared Task

Isabel Segura-Bedmar; Paloma Martínez; María Herrero-Zazo

The DDIExtraction Shared Task 2013 is the second edition of the DDIExtraction Shared Task series, a community-wide effort to promote the implementation and comparative assessment of natural language processing (NLP) techniques in the field of the pharmacovigilance domain, in particular, to address the extraction of drug-drug interactions (DDI) from biomedical texts. This edition has been the first attempt to compare the performance of Information Extraction (IE) techniques specific for each of the basic steps of the DDI extraction pipeline. To attain this aim, two main tasks were proposed: the recognition and classification of pharmacological substances and the detection and classification of drug-drug interactions. DDIExtraction 2013 was held from January to June 2013 and attracted wide attention with a total of 14 teams (6 of the teams participated in the drug name recognition task, while 8 participated in the DDI extraction task) from 7 different countries. For the task of the recognition and classification of pharmacological names, the best system achieved an F1 of 71.5%, while, for the detection and classification of DDIs, the best result was an F1 of 65.1%. The results show advances in the state of the art and demonstrate that significant challenges remain to be resolved. This paper focuses on the second task (extraction of DDIs) and examines its main challenges, which have yet to be resolved.


BMC Bioinformatics | 2011

A linguistic rule-based approach to extract drug-drug interactions from pharmacological documents

Isabel Segura-Bedmar; Paloma Martínez; César de Pablo-Sánchez

BackgroundA drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. The increasing volume of the scientific literature overwhelms health care professionals trying to be kept up-to-date with all published studies on DDI.MethodsThis paper describes a hybrid linguistic approach to DDI extraction that combines shallow parsing and syntactic simplification with pattern matching. Appositions and coordinate structures are interpreted based on shallow syntactic parsing provided by the UMLS MetaMap tool (MMTx). Subsequently, complex and compound sentences are broken down into clauses from which simple sentences are generated by a set of simplification rules. A pharmacist defined a set of domain-specific lexical patterns to capture the most common expressions of DDI in texts. These lexical patterns are matched with the generated sentences in order to extract DDIs.ResultsWe have performed different experiments to analyze the performance of the different processes. The lexical patterns achieve a reasonable precision (67.30%), but very low recall (14.07%). The inclusion of appositions and coordinate structures helps to improve the recall (25.70%), however, precision is lower (48.69%). The detection of clauses does not improve the performance.ConclusionsInformation Extraction (IE) techniques can provide an interesting way of reducing the time spent by health care professionals on reviewing the literature. Nevertheless, no approach has been carried out to extract DDI from texts. To the best of our knowledge, this work proposes the first integral solution for the automatic extraction of DDI from biomedical texts.


BMC Bioinformatics | 2010

Extracting drug-drug interactions from biomedical texts

Isabel Segura-Bedmar; Paloma Martínez; César de Pablo-Sánchez

Background A drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. The detection of drug interactions is an important research area in patient safety since these interactions can become very dangerous and increase health care costs. Although there are different databases supporting health care professionals in the detection of drug interactions, this kind of resources is rarely complete. Drug interactions are frequently reported in journals of clinical pharmacology, making medical literature the most effective source for the detection of drug interactions. However, the increasing volume of the literature overwhelms health care professionals trying to keep an up-to-date collection of all reported DDIs. The development of automatic methods for collecting, maintaining and


Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi) | 2014

Detecting drugs and adverse events from Spanish social media streams

Isabel Segura-Bedmar; Ricardo Revert; Paloma Martínez

To the best of our knowledge, this is the first work that does drug and adverse event detection from Spanish posts collected from a health social media. First, we created a goldstandard corpus annotated with drugs and adverse events from social media. Then, Textalytics, a multilingual text analysis engine, was applied to identify drugs and possible adverse events. Overall recall and precision were 0.80 and 0.87 for drugs, and 0.56 and 0.85 for adverse events.


BMC Medical Informatics and Decision Making | 2015

Exploring Spanish health social media for detecting drug effects

Isabel Segura-Bedmar; Paloma Martínez; Ricardo Revert; Julián Moreno-Schneider

BackgroundAdverse Drug reactions (ADR) cause a high number of deaths among hospitalized patients in developed countries. Major drug agencies have devoted a great interest in the early detection of ADRs due to their high incidence and increasing health care costs. Reporting systems are available in order for both healthcare professionals and patients to alert about possible ADRs. However, several studies have shown that these adverse events are underestimated. Our hypothesis is that health social networks could be a significant information source for the early detection of ADRs as well as of new drug indications.MethodsIn this work we present a system for detecting drug effects (which include both adverse drug reactions as well as drug indications) from user posts extracted from a Spanish health forum. Texts were processed using MeaningCloud, a multilingual text analysis engine, to identify drugs and effects. In addition, we developed the first Spanish database storing drugs as well as their effects automatically built from drug package inserts gathered from online websites. We then applied a distant-supervision method using the database on a collection of 84,000 messages in order to extract the relations between drugs and their effects. To classify the relation instances, we used a kernel method based only on shallow linguistic information of the sentences.ResultsRegarding Relation Extraction of drugs and their effects, the distant supervision approach achieved a recall of 0.59 and a precision of 0.48.ConclusionsThe task of extracting relations between drugs and their effects from social media is a complex challenge due to the characteristics of social media texts. These texts, typically posts or tweets, usually contain many grammatical errors and spelling mistakes. Moreover, patients use lay terminology to refer to diseases, symptoms and indications that is not usually included in lexical resources in languages other than English.


empirical methods in natural language processing | 2015

Exploring Word Embedding for Drug Name Recognition

Isabel Segura-Bedmar; Víctor Suárez-Paniagua; Paloma Martínez

This paper describes a machine learningbased approach that uses word embedding features to recognize drug names from biomedical texts. As a starting point, we developed a baseline system based on Conditional Random Field (CRF) trained with standard features used in current Named Entity Recognition (NER) systems. Then, the system was extended to incorporate new features, such as word vectors and word clusters generated by the Word2Vec tool and a lexicon feature from the DINTO ontology. We trained the Word2vec tool over two different corpus: Wikipedia and MedLine. Our main goal is to study the effectiveness of using word embeddings as features to improve performance on our baseline system, as well as to analyze whether the DINTO ontology could be a valuable complementary data source integrated in a machine learning NER system. To evaluate our approach and compare it with previous work, we conducted a series of experiments on the dataset of SemEval-2013 Task 9.1 Drug Name Recognition.

Collaboration


Dive into the Isabel Segura-Bedmar's collaboration.

Top Co-Authors

Avatar

Daniel Sanchez-Cisneros

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Janna Hastings

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar

Adrián Carruana

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar

P. Fernández

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar

Paloma Mart

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Ali Mehdizadeh Naderi

Polytechnic University of Catalonia

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Florian Leitner

Technical University of Madrid

View shared research outputs
Researchain Logo
Decentralizing Knowledge