Cyril Labbé
Joseph Fourier University
Publication
Featured research published by Cyril Labbé.
Journal of Quantitative Linguistics | 2001
Cyril Labbé; Dominique Labbé
The calculation proposed in this paper measures neighbourhood between several texts. It leads to a normalized metric and a distance scale which can be used for authorship attribution. An experiment is presented on one of the famous cases in French literature: Corneille and Molière. The calculation clearly makes the difference between the two works but it also demonstrates that Corneille contributed to many of Molière’s masterpieces.
Scientometrics | 2013
Cyril Labbé; Dominique Labbé
Two kinds of bibliographic tools are used to retrieve scientific publications and make them available online. For one kind, access is free as they store information made publicly available online. For the other kind, access fees are required as they are compiled on information provided by the major publishers of scientific literature. The former can easily be interfered with, but it is generally assumed that the latter guarantee the integrity of the data they sell. Unfortunately, duplicate and fake publications are appearing in scientific conferences and, as a result, in the bibliographic services. We demonstrate a software method of detecting these duplicate and fake publications. Both the free services (such as Google Scholar and DBLP) and the charged-for services (such as IEEE Xplore) accept and index these publications.
Literary and Linguistic Computing | 2005
Cyril Labbé; Dominique Labbé
How can proximities and oppositions be measured in large text corpora? Intertextual distance provides a simple and interesting solution. Its properties make it a good tool for text classification, and especially for tree analysis, which is fully presented and discussed here. In order to measure the quality of this classification, two indices are proposed. The method presented provides an accurate tool for literary studies, as is demonstrated by applying it to two areas of French literature: Racine's tragedies and an authorship attribution experiment.
Journal of Quantitative Linguistics | 2004
Cyril Labbé; Dominique Labbé; Pierre Hubert
Segmentation of large textual corpora is one of the major questions asked of literary studies. We present a combination of two relevant methods. First, vocabulary growth analysis highlights the main discontinuities in a work. Second, these results are supplemented with the analysis of variations in vocabulary diversity within corpora. A segmentation algorithm, associated with a test of validity, indicates the optimal succession in distinct stages. This method is applied to Racine's works and various other works in French.
Language Resources and Evaluation | 2005
Cyril Labbé; Dominique Labbé
We present a new method to describe the contextual meaning of a key word in a corpus. The vocabulary of the sentences containing this word is compared to that of the entire corpus in order to highlight the words which are significantly overutilized in the neighbourhood of this key word (they are associated in the author’s mind) and the ones which are significantly underutilized (they are mutually exclusive). This method provides an interesting tool for lexicography and literary studies as is shown by applying it to the word amour (love) in the work of Pierre Corneille, the most famous French playwright of the 17th century.
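The comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' calculation: the abstract does not say which significance test is used, so the z-score on a binomial approximation, the 1.96 threshold, the tokenizer, and the function names `tokenize` and `keyword_associations` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a sentence and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def keyword_associations(sentences, keyword, threshold=1.96):
    """Return (over, under): words over- or under-used in sentences containing keyword.

    Each word's count inside the keyword sentences is compared with the count
    expected from its frequency in the whole corpus. The z-score test is an
    assumption, not the test used in the paper.
    """
    keyword = keyword.lower()
    corpus_counts = Counter()
    context_counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        corpus_counts.update(tokens)
        if keyword in tokens:
            context_counts.update(tokens)

    corpus_size = sum(corpus_counts.values())
    context_size = sum(context_counts.values())
    if context_size == 0 or context_size == corpus_size:
        return [], []  # no contrast possible between context and corpus

    share = context_size / corpus_size  # fraction of tokens in keyword sentences
    over, under = [], []
    for word, total in corpus_counts.items():
        if word == keyword:
            continue
        expected = total * share
        std = math.sqrt(total * share * (1 - share))
        z = (context_counts[word] - expected) / std
        if z >= threshold:
            over.append((word, z))
        elif z <= -threshold:
            under.append((word, z))

    over.sort(key=lambda item: -item[1])
    under.sort(key=lambda item: item[1])
    return over, under
```

Called as `keyword_associations(sentences, "amour")`, the function returns two lists of `(word, z)` pairs. A real study would likely lemmatize the text first and use an exact test for small counts.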
Scientometrics | 2018
Nguyen Minh Tien; Cyril Labbé
Automatically generated papers have been used to manipulate bibliography indexes on numerous occasions. This paper examines different means of generating text, such as recurrent neural networks, Markov models, and probabilistic context-free grammars, and whether such texts can be detected using a current approach. It then focuses on probabilistic context-free grammar (PCFG) as the one most used. Although there have been multiple approaches to detecting such papers, they all work at the document level and are unable to detect a small amount of generated text inside a larger body of genuinely written text. We therefore present the grammatical structure similarity measurement to detect sentences or short fragments of automatically generated text from known PCFG generators. The proposed approach is tested against a pattern checker and various common machine learning methods. The ability to detect a modified PCFG generator is also tested.
Scientometrics | 2017
Jennifer A. Byrne; Cyril Labbé
Comparing five publications from China that described knockdowns of the human TPD52L2 gene in human cancer cell lines identified unexpected similarities between these publications, flaws in experimental design, and mismatches between some described experiments and the reported results. Following communications with journal editors, two of these TPD52L2 publications have been retracted. One retraction notice stated that while the authors claimed that the data were original, the experiments had been outsourced to a biotechnology company. Using search engine queries, automatic text analysis, different similarity measures, and further visual inspection, we identified 48 examples of highly similar papers describing single gene knockdowns in 1–2 human cancer cell lines that were all published by investigators from China. The incorrect use of a particular TPD52L2 shRNA sequence as a negative or non-targeting control was identified in 30/48 (63%) of these publications, using a combination of Google Scholar searches and visual inspection. Overall, these results suggest that some publications describing the effects of single gene knockdowns in human cancer cell lines may include the results of experiments that were not performed by the authors. This has serious implications for the validity of such results, and for their application in future research.
Corpus | 2003
Cyril Labbé; Dominique Labbé
11th International Conference on Textual Data Statistical Analysis | 2012
Cyril Labbé; Dominique Labbé
Archive | 2008
Cyril Labbé; Dominique Labbé