Juan Antonio Lossio-Ventura
University of Montpellier
Publication
Featured research published by Juan Antonio Lossio-Ventura.
Information Retrieval | 2016
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire
Terminology extraction is an essential task in domain knowledge acquisition, as well as for information retrieval. It is also a mandatory first step in building or enriching terminologies and ontologies. Existing terminology extraction methods, as often proposed in the literature, feature linguistic and statistical aspects and solve some, but not all, of the problems related to term extraction, e.g. noise, silence, low frequency, large corpora, and the complexity of the multi-word term extraction process. In contrast, we propose a cutting-edge methodology to extract and rank biomedical terms that covers all the mentioned problems. This methodology offers several measures based on linguistic, statistical, graph and web aspects. These measures extract and rank candidate terms with excellent precision: we demonstrate that they outperform previously reported precision results for automatic term extraction, and that they work with different languages (English, French, and Spanish). We also demonstrate how using graphs and the web to assess the significance of a term candidate enables us to outperform previously reported precision results. We evaluated our methodology on the biomedical GENIA and LabTestsOnline corpora and compared it with previously reported measures.
international conference natural language processing | 2014
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire
Term extraction is an essential task in domain knowledge acquisition. We propose two new measures to extract multiword terms from a domain-specific text. The first measure is both linguistically and statistically based. The second measure is graph-based, allowing assessment of the importance of a multiword term within a domain. Existing measures often solve some, but not all, of the problems related to term extraction, e.g., noise, silence, low frequency, large corpora, and the complexity of the multiword term extraction process. Instead, we focus on managing the entire set of problems, e.g., detecting rare terms and overcoming the low frequency issue. By comparing the two proposed measures with state-of-the-art reference measures, we show that they outperform the precision results previously reported for automatic multiword term extraction.
extending database technology | 2016
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire
Biomedical ontologies play an important role in information extraction in the biomedical domain. We present a four-step workflow for automatically updating biomedical ontologies. We detail two contributions concerning the concept extraction and semantic linkage of the extracted terminology.
international database engineering and applications symposium | 2014
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire
Comprehensive terminology is essential for a community to describe, exchange, and retrieve data. In many domains, the explosion of text data has reached a level at which automatic terminology extraction and enrichment are mandatory. Automatic Term Extraction (or Recognition) methods use natural language processing to do so. Methods featuring linguistic and statistical aspects, as often proposed in the literature, solve some problems related to term extraction, such as low frequency, the complexity of multi-word term extraction, and the human effort needed to validate candidate terms. In contrast, we present two new measures for extracting and ranking multi-word terms from domain-specific corpora, covering all the mentioned problems. In addition, we demonstrate how using the Web to evaluate the significance of a multi-word term candidate helps us outperform the precision results obtained on the biomedical GENIA corpus with previously reported measures such as C-value.
acm symposium on applied computing | 2016
Juan Antonio Lossio-Ventura; Hakim Hacid; Mathieu Roche; Pascal Poncelet
In this paper, we propose to handle the problem of overload in social interactions by grouping messages according to three important dimensions: (i) content (textual and hashtags), (ii) users, and (iii) time difference. We evaluated our approach on a Twitter data set and compared it to other existing approaches; the results are promising.
international semantic web conference | 2014
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire
language resources and evaluation | 2016
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire
Cahiers Agricultures | 2015
Mathieu Roche; Sophie Fortuno; Juan Antonio Lossio-Ventura; Amira Akli; Salim Belkebir; Thinhinan Lounis; Serigne Toure
CORIA: COnférence en Recherche d’Information et Applications | 2015
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire
Symposium on Information Management and Big Data - SIMBig 2014 | 2014
Juan Antonio Lossio-Ventura; Clement Jonquet; Mathieu Roche; Maguelonne Teisseire