Publication


Featured research published by Adrien Coulet.


Journal of Biomedical Informatics | 2010

Using text to build semantic networks for pharmacogenomics

Adrien Coulet; Nigam H. Shah; Yael Garten; Mark A. Musen; Russ B. Altman

Most pharmacogenomics knowledge is contained in the text of published studies, and is thus not available for automated computation. Natural Language Processing (NLP) techniques for extracting relationships in specific domains often rely on hand-built rules and domain-specific ontologies to achieve good performance. In a new and evolving field such as pharmacogenomics (PGx), rules and ontologies may not be available. Recent progress in syntactic NLP parsing in the context of a large corpus of pharmacogenomics text provides new opportunities for automated relationship extraction. We describe an ontology of PGx relationships built starting from a lexicon of key pharmacogenomic entities and a syntactic parse of more than 87 million sentences from 17 million MEDLINE abstracts. We used the syntactic structure of PGx statements to systematically extract commonly occurring relationships and to map them to a common schema. Our extracted relationships have a 70-87.7% precision and involve not only key PGx entities such as genes, drugs, and phenotypes (e.g., VKORC1, warfarin, clotting disorder), but also critical entities that are frequently modified by these key entities (e.g., VKORC1 polymorphism, warfarin response, clotting disorder treatment). The result of our analysis is a network of 40,000 relationships between more than 200 entity types with clear semantics. This network is used to guide the curation of PGx knowledge and provide a computable resource for knowledge discovery.


Pharmacogenomics | 2010

Recent progress in automatically extracting information from the pharmacogenomic literature

Yael Garten; Adrien Coulet; Russ B. Altman

The biomedical literature holds our understanding of pharmacogenomics, but it is dispersed across many journals. In order to integrate our knowledge, connect important facts across publications and generate new hypotheses, we must organize and encode the contents of the literature. By creating databases of structured pharmacogenomic knowledge, we can make the value of the literature much greater than the sum of the individual reports. We can, for example, generate candidate gene lists or interpret surprising hits in genome-wide association studies. Text mining automatically adds structure to the unstructured knowledge embedded in millions of publications, and recent years have seen a surge in work on biomedical text mining, some specific to pharmacogenomics literature. These methods enable extraction of specific types of information and can also provide answers to general, systemic queries. In this article, we describe the main tasks of text mining in the context of pharmacogenomics, summarize recent applications and anticipate the next phase of text mining applications.


Journal of Biomedical Semantics | 2011

Integration and publication of heterogeneous text-mined relationships on the Semantic Web

Adrien Coulet; Yael Garten; Michel Dumontier; Russ B. Altman; Mark A. Musen; Nigam H. Shah

Background: Advances in Natural Language Processing (NLP) techniques enable the extraction of fine-grained relationships mentioned in biomedical text. The variability and complexity of natural language in expressing similar relationships cause the extracted relationships to be highly heterogeneous, which makes the construction of knowledge bases difficult and poses a challenge in using these for data mining or question answering. Results: We report on the semi-automatic construction of the PHARE relationship ontology (the PHArmacogenomic RElationships Ontology) consisting of 200 curated relations from over 40,000 heterogeneous relationships extracted via text-mining. These heterogeneous relations are then mapped to the PHARE ontology using synonyms, entity descriptions and hierarchies of entities and roles. Once mapped, relationships can be normalized and compared using the structure of the ontology to identify relationships that have similar semantics but different syntax. We compare and contrast the manual procedure with a fully automated approach using WordNet to quantify the degree of integration enabled by iterative curation and refinement of the PHARE ontology. The result of such integration is a repository of normalized biomedical relationships, named PHARE-KB, which can be queried using Semantic Web technologies such as SPARQL and can be visualized in the form of a biological network. Conclusions: The PHARE ontology serves as a common semantic framework to integrate more than 40,000 relationships pertinent to pharmacogenomics. The PHARE ontology forms the foundation of a knowledge base named PHARE-KB. Once populated with relationships, PHARE-KB (i) can be visualized in the form of a biological network to guide human tasks such as database curation and (ii) can be queried programmatically to guide bioinformatics applications such as the prediction of molecular interactions. PHARE is available at http://purl.bioontology.org/ontology/PHARE.
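
The normalization step described in this abstract can be illustrated with a minimal sketch. It assumes a hand-written synonym table; the relation names, the `normalize` helper and the handling of passive forms are hypothetical and are not taken from the PHARE ontology or its tooling.

```python
# Hypothetical synonym table: each surface form maps to one canonical relation.
# These names are illustrative and are not taken from the PHARE ontology.
CANONICAL_RELATION = {
    "inhibits": "inhibits",
    "blocks": "inhibits",
    "suppresses": "inhibits",
    "metabolizes": "metabolizes",
    "is metabolized by": "metabolizes",  # passive form, arguments are swapped
}

# Surface forms whose subject and object must be swapped when normalized.
INVERTED_FORMS = {"is metabolized by"}


def normalize(subject, relation, obj):
    """Map a raw (subject, relation, object) triple to a canonical triple."""
    key = relation.strip().lower()
    canonical = CANONICAL_RELATION.get(key)
    if canonical is None:
        return None  # unknown relation: left for manual curation
    if key in INVERTED_FORMS:
        subject, obj = obj, subject
    return (subject, canonical, obj)


# Four differently worded statements that express only two distinct facts.
raw = [
    ("CYP2C9", "metabolizes", "warfarin"),
    ("warfarin", "is metabolized by", "CYP2C9"),
    ("warfarin", "blocks", "VKORC1"),
    ("warfarin", "inhibits", "VKORC1"),
]
normalized = {normalize(*triple) for triple in raw}
```

After normalization, statements with different syntax but the same semantics collapse into a single triple, which is what makes them comparable.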


International Conference on Move to Meaningful Internet Systems | 2006

Suggested ontology for pharmacogenomics (SO-Pharm): modular construction and preliminary testing

Adrien Coulet; Malika Smaïl-Tabbone; Amedeo Napoli; Marie-Dominique Devignes

Pharmacogenomics studies the involvement of interindividual variations of DNA sequence in different drug responses (especially adverse drug reactions). The Knowledge Discovery in Databases (KDD) process is a means for discovering new pharmacogenomic knowledge in biological databases. However, data complexity makes it necessary to guide the KDD process by a representation of domain knowledge. At least three domains are concerned: genotype, drug and phenotype. The approach described here aims at reusing existing domain knowledge whenever possible in order to build a modular formal representation of domain knowledge in pharmacogenomics. The resulting ontology is called SO-Pharm, for Suggested Ontology for Pharmacogenomics. Various situations encountered during the construction process are analyzed and discussed. A preliminary validation is provided by representing some well-known examples of pharmacogenomic knowledge with SO-Pharm concepts.


Data Integration in the Life Sciences | 2006

SNP-Converter: an ontology-based solution to reconcile heterogeneous SNP descriptions for pharmacogenomic studies

Adrien Coulet; Malika Smaïl-Tabbone; Pascale Benlian; Amedeo Napoli; Marie-Dominique Devignes

Pharmacogenomics explores the impact of individual genomic variations in health problems such as adverse drug reactions. Records of millions of genomic variations, mostly known as Single Nucleotide Polymorphisms (SNP), are available today in various overlapping and heterogeneous databases. Selecting and extracting from these databases or from private sources a proper set of polymorphisms are the first steps of a KDD (Knowledge Discovery in Databases) process in pharmacogenomics. It is however a tedious task hampered by the heterogeneity of SNP nomenclatures and annotations. Standards for representing genomic variants have been proposed by the Human Genome Variation Society (HGVS). The SNP-Converter application is aimed at converting any SNP description into an HGVS-compliant pivot description and vice versa. Used in the frame of a knowledge system, the SNP-Converter application contributes as a wrapper to semantic data integration and enrichment.
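
To make the idea of an HGVS-compliant pivot description concrete, here is a minimal sketch of a two-way conversion. It is not the SNP-Converter implementation: the function names `to_hgvs` and `from_hgvs` are hypothetical, and the sketch only covers single-nucleotide substitutions on a coding (`c.`) or genomic (`g.`) reference sequence.

```python
import re

# Matches simple substitutions such as "NM_004006.2:c.4375C>T".
HGVS_SUBSTITUTION = re.compile(
    r"^(?P<ref_seq>[^:]+):(?P<kind>[cg])\.(?P<pos>\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def to_hgvs(ref_seq, kind, pos, ref, alt):
    """Build an HGVS-style substitution string from its components."""
    if kind not in ("c", "g"):
        raise ValueError("only coding (c) and genomic (g) coordinates are handled")
    if ref not in "ACGT" or alt not in "ACGT" or len(ref) != 1 or len(alt) != 1:
        raise ValueError("only single-nucleotide substitutions are handled")
    return f"{ref_seq}:{kind}.{pos}{ref}>{alt}"


def from_hgvs(description):
    """Parse an HGVS-style substitution string back into its components."""
    match = HGVS_SUBSTITUTION.match(description)
    if match is None:
        raise ValueError(f"not a simple HGVS substitution: {description!r}")
    return (
        match["ref_seq"],
        match["kind"],
        int(match["pos"]),
        match["ref"],
        match["alt"],
    )


pivot = to_hgvs("NM_004006.2", "c", 4375, "C", "T")
components = from_hgvs(pivot)
```

A real converter would also need to handle insertions, deletions and the mapping between reference sequences, which this sketch deliberately leaves out.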


Advances in Experimental Medicine and Biology | 2011

Ontology-Based Knowledge Discovery in Pharmacogenomics

Adrien Coulet; Malika Smaïl-Tabbone; Amedeo Napoli; Marie-Dominique Devignes

One current challenge in biomedicine is to analyze large amounts of complex biological data for extracting domain knowledge. This work focuses on the use of knowledge-based techniques such as knowledge discovery (KD) and knowledge representation (KR) in pharmacogenomics, where knowledge units represent genotype-phenotype relationships in the context of a given treatment. An objective is to design a knowledge base (KB, here also referred to as an ontology) and then to use it in the KD process itself. A method is proposed for dealing with two main tasks: (1) building a KB from heterogeneous data related to genotype, phenotype, and treatment, and (2) applying KD techniques on knowledge assertions for extracting genotype-phenotype relationships. An application was carried out on a clinical trial concerned with the variability of drug response to montelukast treatment. Genotype-genotype and genotype-phenotype associations were retrieved together with new associations, allowing the extension of the initial KB. This experiment shows the potential of KR and KD processes, especially for designing a KB, checking KB consistency, and reasoning for problem solving.


International Conference on Formal Concept Analysis | 2013

Using pattern structures for analyzing ontology-based annotations of biomedical data

Adrien Coulet; Florent Domenach; Mehdi Kaytoue; Amedeo Napoli

Annotating data with concepts of an ontology is a common practice in the biomedical domain. Resulting annotations, i.e., data-concept relationships, are useful for data integration whereas the reference ontology can guide the analysis of integrated data. Then the analysis of annotations can provide relevant knowledge units to consider for extracting and understanding possible correlations between data. Formal Concept Analysis (FCA), which builds a concept lattice from a binary context, can be used for such a knowledge discovery task. However, annotated biomedical data are usually not binary, and a scaling procedure is required as a preprocessing step for using FCA, leading to problems of expressivity, ranging from loss of information to the generation of a large number of additional binary attributes. By contrast, pattern structures offer a general FCA-based framework for building a concept lattice from complex data, e.g., a set of objects with partially ordered descriptions. In this paper, we show how to instantiate this general framework when descriptions are ordered by an ontology. We illustrate our approach with the analysis of annotations of drug related documents, and we show the capabilities of the approach for knowledge discovery.
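
As a rough illustration of descriptions ordered by an ontology, the sketch below computes a similarity (meet) between two sets of annotations by generalizing pairs of concepts to their closest common ancestor. It is a simplification under stated assumptions: the toy hierarchy `PARENT` is invented for the example, it is a tree rather than a general ontology, and the operator is not claimed to be the exact one defined in the paper.

```python
# Toy hierarchy (child -> parent), invented for illustration only.
PARENT = {
    "anticoagulant": "drug",
    "analgesic": "drug",
    "warfarin": "anticoagulant",
    "heparin": "anticoagulant",
    "aspirin": "analgesic",
}


def ancestors(concept):
    """Return the concept followed by its ancestors, most specific first."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def closest_common_ancestor(a, b):
    """Return the most specific concept that subsumes both a and b."""
    ancestors_of_b = set(ancestors(b))
    for concept in ancestors(a):
        if concept in ancestors_of_b:
            return concept
    raise ValueError(f"{a!r} and {b!r} share no ancestor")


def similarity(description_1, description_2):
    """Generalize two annotation sets to their most specific shared concepts."""
    candidates = {
        closest_common_ancestor(a, b)
        for a in description_1
        for b in description_2
    }
    # Drop any candidate that is a strict ancestor of another candidate,
    # so that only the most specific generalizations remain.
    return {
        c
        for c in candidates
        if not any(c != other and c in ancestors(other) for other in candidates)
    }


shared = similarity({"warfarin", "aspirin"}, {"heparin"})
```

Because the generalization is computed directly on the hierarchy, no binary scaling of the annotations is needed beforehand.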


Journal of Biomedical Semantics | 2017

Discovering associations between adverse drug events using pattern structures and ontologies

Gabin Personeni; Emmanuel Bresso; Marie-Dominique Devignes; Michel Dumontier; Malika Smaïl-Tabbone; Adrien Coulet

Background: Patient data, such as electronic health records or adverse event reporting systems, constitute an essential resource for studying Adverse Drug Events (ADEs). We explore an original approach to identify frequently associated ADEs in subgroups of patients. Results: Because ADEs have complex manifestations, we use formal concept analysis and its pattern structures, a mathematical framework that allows generalization using domain knowledge formalized in medical ontologies. Results obtained with three different settings and two different datasets show that this approach is flexible and allows extraction of association rules at various levels of generalization. Conclusions: The chosen approach permits an expressive representation of a patient's ADEs. Extracted association rules point to distinct ADEs that occur in the same group of patients, and could serve as a basis for a recommendation system. The proposed representation is flexible and can be extended to make use of additional ontologies and various patient records.
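
For readers unfamiliar with association rules, the following sketch shows how the support and confidence of a rule between ADEs can be computed over patient records. It uses plain itemset counting, not the pattern-structure method of the paper, and the patient records are invented toy data.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(1 for record in records if items <= record) / len(records)


def confidence(records, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    antecedent_support = support(records, antecedent)
    if antecedent_support == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(records, both) / antecedent_support


# Invented toy data: one set of observed ADEs per patient.
patients = [
    {"nausea", "headache"},
    {"nausea", "headache", "rash"},
    {"nausea"},
    {"rash"},
]

rule_support = support(patients, {"nausea", "headache"})
rule_confidence = confidence(patients, {"nausea"}, {"headache"})
```

Here the rule "nausea implies headache" is supported by two of the four patients and holds for two of the three patients who experienced nausea.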


Journal of Biomedical Semantics | 2017

Learning from biomedical linked data to suggest valid pharmacogenes

Kevin Dalleau; Yassine Marzougui; Sébastien Da Silva; Patrice Ringot; Ndeye Coumba Ndiaye; Adrien Coulet

Background: A standard task in pharmacogenomics research is identifying genes that may be involved in drug response variability, i.e., pharmacogenes. Because genomic experiments tended to generate many false positives, computational approaches based on the use of background knowledge have been proposed. Until now, only molecular networks or the biomedical literature were used, whereas many other resources are available. Method: We propose here to consume a diverse and larger set of resources using linked data related either to genes, drugs or diseases. One of the advantages of linked data is that they are built on a standard framework that facilitates the joint use of various sources, and thus facilitates considering features of various origins. We propose a selection and linkage of data sources relevant to pharmacogenomics, including for example DisGeNET and ClinVar. We use machine learning to identify and prioritize the pharmacogenes that are most likely to be valid, considering the selected linked data. This identification relies on the classification of gene–drug pairs as either pharmacogenomically associated or not and was tested with two machine learning methods, random forest and graph kernel, whose results are compared in this article. Results: We assembled a set of linked data relative to pharmacogenomics, of 2,610,793 triples, coming from six distinct resources. Learning from these data, random forest enables identifying valid pharmacogenes with an F-measure of 0.73 on a 10-fold cross-validation, whereas graph kernel achieves an F-measure of 0.81. A list of top candidates proposed by both approaches is provided and how they were obtained is discussed.
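
The F-measure reported in this abstract combines precision and recall into one score. The sketch below shows the computation for a classifier that labels gene–drug pairs as associated; the pairs are placeholder toy data, not results from the paper, and the balanced F1 variant is assumed.

```python
def f_measure(predicted, actual):
    """Balanced F-measure (F1) of a set of predicted positives."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Placeholder gene-drug pairs: four known associations, three predictions,
# two of which are correct.
known_pairs = {
    ("geneA", "drugX"),
    ("geneB", "drugX"),
    ("geneC", "drugY"),
    ("geneD", "drugZ"),
}
predicted_pairs = {
    ("geneA", "drugX"),
    ("geneB", "drugX"),
    ("geneE", "drugY"),
}

score = f_measure(predicted_pairs, known_pairs)
```

In a 10-fold cross-validation, this score would be computed on each held-out fold and then averaged.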


Data Integration in the Life Sciences | 2014

Mining Linked Open Data: A Case Study with Genes Responsible for Intellectual Disability

Gabin Personeni; Simon Daget; Céline Bonnet; Philippe Jonveaux; Marie-Dominique Devignes; Malika Smaïl-Tabbone; Adrien Coulet

Linked Open Data (LOD) constitute a unique dataset that is in a standard format, partially integrated, and facilitates connections with domain knowledge represented within semantic web ontologies. Increasing amounts of biomedical data provided as LOD consequently offer novel opportunities for knowledge discovery in biomedicine. However, most data mining methods are neither adapted to LOD format, nor adapted to consider domain knowledge. We propose in this paper an approach for selecting, integrating, and mining LOD with the goal of discovering genes responsible for a disease. The selection step relies on a set of choices made by a domain expert to isolate relevant pieces of LOD. Because these pieces are potentially not linked, an integration step is required to connect unlinked pieces. The resulting graph is subsequently mined using Inductive Logic Programming (ILP) that presents two main advantages. First, the input format compliant with ILP is close to the format of LOD. Second, domain knowledge can be added to this input and considered by ILP. We have implemented and applied this approach to the characterization of genes responsible for intellectual disability. On the basis of this real-world use case, we present an evaluation of our mining approach and discuss its advantages and drawbacks for the mining of biomedical LOD.

Collaboration


Top Co-Authors

Marie-Dominique Devignes

Centre national de la recherche scientifique

Yannick Toussaint

Free University of Bozen-Bolzano
