Publication


Featured research published by Clement Jonquet.


Information Retrieval | 2016

Biomedical term extraction: overview and a new methodology

Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire

Terminology extraction is an essential task in domain knowledge acquisition, as well as for information retrieval. It is also a mandatory first step for building or enriching terminologies and ontologies. As often proposed in the literature, existing terminology extraction methods feature linguistic and statistical aspects and solve some, but not all, of the problems related to term extraction, e.g. noise, silence, low frequency, large corpora, and the complexity of the multi-word term extraction process. In contrast, we propose a cutting-edge methodology to extract and rank biomedical terms that addresses all of the mentioned problems. This methodology offers several measures based on linguistic, statistical, graph and web aspects. These measures extract and rank candidate terms with excellent precision: we demonstrate that they outperform previously reported precision results for automatic term extraction, and work with different languages (English, French, and Spanish). We also demonstrate how using graphs and the web to assess the significance of a term candidate enables us to improve precision results. We evaluated our methodology on the biomedical GENIA and LabTestsOnline corpora and compared it with previously reported measures.
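
The measures themselves are not spelled out in the abstract; as a point of reference, the sketch below implements the classic C-value measure (Frantzi et al.), the purely statistical baseline this line of work compares against, not the authors' new measures. The candidate terms and frequencies are invented for illustration.

```python
from collections import defaultdict
from math import log2

def c_value(candidate_freqs):
    """Rank multi-word candidate terms with the classic C-value measure:
    longer and more frequent candidates score higher, while candidates that
    mostly occur nested inside longer candidates are penalised.

    candidate_freqs: dict mapping a candidate term (tuple of words) to its
    corpus frequency."""
    # For each candidate, collect the longer candidates that contain it.
    nests = defaultdict(list)
    for longer in candidate_freqs:
        for shorter in candidate_freqs:
            if len(shorter) < len(longer) and is_nested(shorter, longer):
                nests[shorter].append(longer)

    scores = {}
    for term, freq in candidate_freqs.items():
        # log2(1) = 0, so give single-word candidates a small constant weight.
        length_weight = log2(len(term)) if len(term) > 1 else 0.1
        containers = nests[term]
        if containers:
            nested_freq = sum(candidate_freqs[c] for c in containers) / len(containers)
            scores[term] = length_weight * (freq - nested_freq)
        else:
            scores[term] = length_weight * freq
    return scores

def is_nested(short, long_):
    """True if `short` appears as a contiguous word sequence inside `long_`."""
    n = len(short)
    return any(long_[i:i + n] == short for i in range(len(long_) - n + 1))

if __name__ == "__main__":
    freqs = {
        ("blood", "cell"): 12,
        ("red", "blood", "cell"): 9,
        ("white", "blood", "cell"): 6,
    }
    for term, score in sorted(c_value(freqs).items(), key=lambda x: -x[1]):
        print(" ".join(term), round(score, 2))
```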


International Conference on Natural Language Processing | 2014

Yet Another Ranking Function for Automatic Multiword Term Extraction

Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire

Term extraction is an essential task in domain knowledge acquisition. We propose two new measures to extract multiword terms from domain-specific text. The first measure combines linguistic and statistical information. The second is graph-based, allowing assessment of the importance of a multiword term within a domain. Existing measures solve some, but not all, of the problems related to term extraction, e.g., noise, silence, low frequency, large corpora, and the complexity of the multiword term extraction process. Instead, we focus on managing the entire set of problems, e.g., detecting rare terms and overcoming the low-frequency issue. We show that the two proposed measures outperform precision results previously reported for automatic multiword term extraction by comparing them with state-of-the-art reference measures.


Web Science | 2011

Negotiating the web science curriculum through shared educational artefacts

Su White; Madalina Croitoru; Stéphane B. Bazan; Stefano A. Cerri; Hugh C. Davis; Raffaella Folgieri; Clement Jonquet; François Scharffe; Steffen Staab; Thanassis Tiropanis; Michalis N. Vafopoulos

The far-reaching impact of the Web on society is widely recognised. The interdisciplinary study of this impact has crystallised in the field of study known as Web Science. However, defining an agreed, shared understanding of what constitutes web science requires complex negotiation and translations of understandings across component disciplines, national cultures and educational traditions. Some individual institutions have already established particular curricula, and discussions in the Web Science Curriculum Workshop series have marked the territory to some extent. This paper reports on a process being adopted across a consortium of partners to systematically create a shared understanding of what constitutes web science. It records and critiques the processes instantiated to agree a common curriculum, and presents a framework for future discussion and development.


Knowledge Acquisition, Modeling and Management | 2016

Selection and Combination of Heterogeneous Mappings to Enhance Biomedical Ontology Matching

Amina Annane; Zohra Bellahsene; Faiçal Azouaou; Clement Jonquet

This paper presents a novel background knowledge approach that selects and combines existing mappings from a given biomedical ontology repository to improve ontology alignment. Current background knowledge approaches usually select, either manually or automatically, a limited number of ontologies and use them as a whole as background knowledge. In contrast, our approach picks out only the relevant concepts and the relevant existing mappings linking these concepts, and combines them into a specific, customized background knowledge graph. Paths within this graph help to discover new mappings. We have implemented and evaluated our approach using the content of the NCBO BioPortal repository and the Anatomy benchmark from the Ontology Alignment Evaluation Initiative. We used the mapping gain measure to assess how much our final background knowledge graph improves the results of state-of-the-art alignment systems. Furthermore, the evaluation shows that our approach produces a high-quality alignment and discovers mappings that have not been found by state-of-the-art systems.
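
As an illustration of the core idea, that paths through a background knowledge graph built from existing mappings suggest new mappings, here is a minimal Python sketch. The concept identifiers, the graph content and the simple shortest-path rule are assumptions for illustration, not the selection and combination strategy evaluated in the paper.

```python
import networkx as nx

# Nodes are concept URIs, edges are existing mappings pulled from a repository
# such as NCBO BioPortal; the identifiers below are invented for illustration.
bk_graph = nx.Graph()
existing_mappings = [
    ("source:Heart", "UMLS:C0018787"),   # source ontology -> background concept
    ("UMLS:C0018787", "FMA:7088"),       # background -> background
    ("FMA:7088", "target:Heart"),        # background concept -> target ontology
]
bk_graph.add_edges_from(existing_mappings)

def derive_mappings(graph, source_prefix, target_prefix, max_length=3):
    """Propose a source->target mapping whenever a path of at most
    `max_length` existing mappings connects a source concept to a target
    concept through the background knowledge graph."""
    derived = []
    sources = [n for n in graph if n.startswith(source_prefix)]
    targets = [n for n in graph if n.startswith(target_prefix)]
    for s in sources:
        for t in targets:
            if nx.has_path(graph, s, t):
                path = nx.shortest_path(graph, s, t)
                if len(path) - 1 <= max_length:
                    derived.append((s, t, path))
    return derived

for s, t, path in derive_mappings(bk_graph, "source:", "target:"):
    print(f"candidate mapping: {s} = {t} via {' -> '.join(path)}")
```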


Extending Database Technology | 2016

A Way to Automatically Enrich Biomedical Ontologies

Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire

Biomedical ontologies play an important role in information extraction in the biomedical domain. We present a four-step workflow for automatically updating biomedical ontologies and detail two contributions concerning concept extraction and the semantic linkage of the extracted terminology.


bioRxiv | 2018

Agronomic Linked Data (AgroLD): a Knowledge-based System to Enable Integrative Biology in Agronomy

Aravind Venkatesan; Gildas Tagny Ngompe; Nordine El Hassouni; Imène Chentli; Valentin Guignon; Clement Jonquet; Manuel Ruiz; Pierre Larmande

Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of omics data produced in plant science. This increase, in conjunction with the heterogeneity and variability of the data, presents a major challenge for adopting an integrative research approach. We are facing an urgent need to effectively integrate and assimilate complementary datasets to understand the biological system as a whole. The Semantic Web offers technologies for the integration of heterogeneous data and its transformation into explicit knowledge thanks to ontologies. We have developed Agronomic Linked Data (AgroLD, www.agrold.org), a knowledge-based system relying on Semantic Web technologies and exploiting standard domain ontologies, to integrate data about plant species of high interest for the plant science community, e.g., rice, wheat and Arabidopsis. We present some integration results of the project, which initially focused on genomics, proteomics and phenomics. AgroLD is now an RDF knowledge base of 100M triples created by annotating and integrating more than 50 datasets coming from 10 data sources (such as Gramene.org and TropGeneDB) with 10 ontologies (such as the Gene Ontology and the Plant Trait Ontology). Our objective is to offer a domain-specific knowledge platform to answer complex biological and agronomical questions related to the implication of genes/proteins in, for instance, plant disease resistance or high-yield traits. We expect the resolution of these questions to facilitate the formulation of new scientific hypotheses to be validated with a knowledge-oriented approach.
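
Since AgroLD is an RDF knowledge base, it can be queried with SPARQL. The sketch below shows what programmatic access could look like in Python; the endpoint URL and query pattern are assumptions for illustration, and the actual endpoint and vocabulary should be checked on www.agrold.org.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint URL, used only to illustrate the access pattern.
endpoint = SPARQLWrapper("http://www.agrold.org/sparql")
endpoint.setReturnFormat(JSON)
endpoint.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX obo:  <http://purl.obolibrary.org/obo/>

    # Illustrative query: entities annotated with the Gene Ontology term
    # "response to water deprivation" (GO:0009414), whatever the property used.
    SELECT ?protein ?label WHERE {
        ?protein ?annotation obo:GO_0009414 ;
                 rdfs:label ?label .
    } LIMIT 10
""")

results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["protein"]["value"], row["label"]["value"])
```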


Web Intelligence, Mining and Semantics | 2016

Multilingual Mapping Reconciliation between English-French Biomedical Ontologies

Amina Annane; Vincent Emonet; Faiçal Azouaou; Clement Jonquet

Even if multilingual ontologies are now more common, in the biomedical domain many ontologies or terminologies have, for historical reasons, been translated from one natural language to another, resulting in two potentially aligned ontologies, each with its own specificities (e.g., format, developers, and versions). Most often, there is no formal representation of the translation links between the translated ontologies and the original ones, and those mappings are not formally available as linked data. However, these mappings are very important for the interoperability and integration of multilingual biomedical data. In this paper, we propose an approach to represent translation mappings between ontologies based on the NCBO BioPortal format. We have reconciled more than 228K mappings between ten English ontologies hosted on NCBO BioPortal and their French translations. We have then stored both the translated ontologies and the mappings on a French customized version of the platform, called the SIFR BioPortal, making the whole available in RDF. Reconciling the mappings turned out to be more complex than expected because, as discussed in this paper, the translations are rarely exactly the same as the original ontologies.
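
As a rough illustration of what a translation mapping looks like once published as linked data, here is a minimal rdflib sketch. The concept URIs are invented and SKOS is used as a generic mapping vocabulary, whereas the SIFR BioPortal actually stores mappings in the NCBO BioPortal mapping format with richer metadata.

```python
from rdflib import Graph, URIRef, Literal
from rdflib.namespace import SKOS, DCTERMS

g = Graph()
# Hypothetical URIs for an English ontology class and its French translation.
en_concept = URIRef("http://purl.bioontology.org/ontology/MEDDRA/10019211")
fr_concept = URIRef("http://purl.lirmm.fr/ontology/MDRFRE/10019211")

# One translation mapping, expressed here with SKOS as a generic stand-in
# for the BioPortal mapping format.
g.add((en_concept, SKOS.exactMatch, fr_concept))
g.add((en_concept, DCTERMS.source, Literal("EN-FR translation mapping (illustrative)")))

print(g.serialize(format="turtle"))
```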


International Database Engineering and Applications Symposium | 2014

Integration of linguistic and web information to improve biomedical terminology extraction

Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire

Comprehensive terminology is essential for a community to describe, exchange, and retrieve data. In many domains, the explosion of produced text data has reached a level at which automatic terminology extraction and enrichment is mandatory. Automatic Term Extraction (or Recognition) methods use natural language processing to do so. Methods featuring linguistic and statistical aspects, as often proposed in the literature, solve some of the problems related to term extraction, such as low frequency, the complexity of multi-word term extraction, and the human effort needed to validate candidate terms. In contrast, we present two new measures for extracting and ranking multi-word terms from domain-specific corpora that cover all the mentioned problems. In addition, we demonstrate how using the Web to evaluate the significance of a multi-word term candidate helps us to outperform precision results obtained on the biomedical GENIA corpus with previously reported measures such as C-value.
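
The abstract does not detail the web-based measures; as one illustration of the general idea, the sketch below scores multi-word candidates with a Dice-style association measure computed from web hit counts. The hit counts are invented, a real run would obtain them from a search-engine API, and this is not the exact measure proposed in the paper.

```python
def web_dice(hits_joint, hits_words):
    """Dice-style association score from web hit counts: how often the whole
    multi-word candidate is found as a phrase relative to its words found
    separately. Higher scores suggest a genuine term rather than a chance
    co-occurrence."""
    denom = sum(hits_words)
    return (len(hits_words) * hits_joint) / denom if denom else 0.0

# Hypothetical hit counts: (phrase hits, [per-word hits]).
candidates = {
    "inflammatory bowel disease": (820_000, [45_000_000, 60_000_000, 210_000_000]),
    "disease of the bowel":       (12_000,  [210_000_000, 6_000_000_000, 60_000_000]),
}
for term, (joint, words) in sorted(candidates.items(),
                                   key=lambda kv: -web_dice(*kv[1])):
    print(f"{term}: {web_dice(joint, words):.6f}")
```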


bioRxiv | 2018

PGxO and PGxLOD: a reconciliation of pharmacogenomic knowledge of various provenances, enabling further comparison

Pierre Monnin; Joël Legrand; Graziella Husson; Patrice Ringot; Andon Tchechmedjiev; Clement Jonquet; Amedeo Napoli; Adrien Coulet

Background: Pharmacogenomics (PGx) studies how genomic variations impact variations in drug response phenotypes. Knowledge in pharmacogenomics is typically composed of units that take the form of ternary relationships gene variant – drug – adverse event. Such a relationship states that an adverse event may occur for patients having the specified gene variant and being exposed to the specified drug. State-of-the-art knowledge in PGx is mainly available in reference databases such as PharmGKB and reported in the scientific biomedical literature. But PGx knowledge can also be discovered from clinical data, such as Electronic Health Records (EHRs), and in this case may either correspond to new knowledge or confirm state-of-the-art knowledge that lacks a “clinical counterpart” or validation. For this reason, there is a need for automatic comparison of knowledge units from distinct sources. Results: In this article, we propose an approach, based on Semantic Web technologies, to represent and compare PGx knowledge units. To this end, we developed PGxO, a simple ontology that represents PGx knowledge units and their components. Combined with PROV-O, an ontology developed by the W3C to represent provenance information, PGxO enables encoding and associating provenance information with PGx relationships. Additionally, we introduce a set of rules to reconcile PGx knowledge, i.e. to identify when two relationships, potentially expressed using different vocabularies and levels of granularity, refer to the same, or to different, knowledge units. We evaluated our ontology and rules by populating PGxO with knowledge units extracted from PharmGKB (2,701), the literature (65,720) and discoveries reported in EHR analysis studies (only 10, manually extracted), and by testing their similarity. We called the resulting knowledge base, which represents and reconciles knowledge units of those various origins, PGxLOD (PGx Linked Open Data). Conclusions: The proposed ontology and reconciliation rules constitute a first step toward a more complete framework for knowledge comparison in PGx. In this direction, the experimental instantiation of PGxO, named PGxLOD, illustrates the ability and difficulties of reconciling various existing knowledge sources.
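
To make the modelling idea concrete, here is a minimal rdflib sketch of one PGx knowledge unit encoded as a ternary relationship with PROV-O provenance. The PGXO namespace and the class and property names are hypothetical stand-ins for illustration, not the actual PGxO vocabulary.

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import RDF, PROV

PGXO = Namespace("http://example.org/pgxo#")   # hypothetical namespace
EX = Namespace("http://example.org/data#")

g = Graph()
rel = EX["relation_1"]

# One knowledge unit: gene variant – drug – adverse event.
g.add((rel, RDF.type, PGXO.PharmacogenomicRelationship))
g.add((rel, PGXO.hasGeneVariant, EX["CYP2C9_star3"]))
g.add((rel, PGXO.hasDrug, EX["warfarin"]))
g.add((rel, PGXO.hasAdverseEvent, EX["bleeding"]))

# PROV-O: record where this knowledge unit comes from, so that units of
# different provenances can later be compared and reconciled.
g.add((rel, PROV.wasDerivedFrom, URIRef("https://www.pharmgkb.org/")))

print(g.serialize(format="turtle"))
```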


Database | 2018

AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture

Lisa C. Harper; Jacqueline D. Campbell; Ethalinda K. S. Cannon; Sook Jung; Monica Poelchau; Ramona L. Walls; Carson M. Andorf; Elizabeth Arnaud; Tanya Z. Berardini; Clayton Birkett; Steve Cannon; James A. Carson; Bradford Condon; Laurel Cooper; Nathan Dunn; Christine G. Elsik; Andrew D. Farmer; Stephen P. Ficklin; David Grant; Emily Grau; Nic Herndon; Zhi-Liang Hu; Jodi L. Humann; Pankaj Jaiswal; Clement Jonquet; Marie-Angélique Laporte; Pierre Larmande; Gerard R. Lazo; Fiona M. McCarthy; Naama Menda

The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.

Collaboration


Dive into Clement Jonquet's collaborations.

Top Co-Authors

Maguelonne Teisseire (Centre national de la recherche scientifique)
Mathieu Roche (University of Montpellier)
Amina Annane (University of Montpellier)
Faiçal Azouaou (École Normale Supérieure)
Pierre Larmande (Institut de recherche pour le développement)
Juan Antonio Lossio-Ventura (Centre national de la recherche scientifique)
Aravind Venkatesan (Norwegian University of Science and Technology)