Yann Guermeur
Centre national de la recherche scientifique
Publication
Featured research published by Yann Guermeur.
BMC Bioinformatics | 2008
Bernhard Gschloessl; Yann Guermeur; J. Mark Cock
Background: The heterokonts are a particularly interesting group of eukaryotic organisms; they include many key species of planktonic and coastal algae and several important pathogens. To understand the biology of these organisms, it is necessary to be able to predict the subcellular localisation of their proteins, but this is not straightforward, particularly in photosynthetic heterokonts, which possess a complex chloroplast acquired as the result of a secondary endosymbiosis. This is because the bipartite target peptides that deliver proteins to these chloroplasts can easily be confused with the signal peptides of secreted proteins, causing currently available algorithms to make erroneous predictions. HECTAR, a subcellular targeting prediction method which takes into account the specific properties of heterokont proteins, has been developed to address this problem.
Results: HECTAR is a statistical prediction method designed to assign proteins to five different categories of subcellular targeting: signal peptides, type II signal anchors, chloroplast transit peptides, mitochondrion transit peptides and proteins which do not possess any N-terminal target peptide. The recognition rate of HECTAR is 96.3%, with Matthews correlation coefficients ranging from 0.67 to 0.95. The method is based on a hierarchical architecture which implements the divide and conquer approach to identify the different possible target peptides one at a time. At each node of the hierarchy, the most relevant outputs of various existing subcellular prediction methods are combined by a Support Vector Machine.
Conclusion: The HECTAR method is able to predict the subcellular localisation of heterokont proteins with high accuracy. It also efficiently predicts the subcellular localisation of proteins from cryptophytes, a group that is phylogenetically close to the heterokonts. A variant of HECTAR, called HECTARSEC, can be used to identify signal peptide and type II signal anchor sequences in proteins from any eukaryotic organism. Both HECTAR and HECTARSEC are available as a web application at the following address: http://www.sb-roscoff.fr/hectar/.
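The abstract above reports per-category Matthews correlation coefficients. As a minimal sketch of how such a figure can be computed, the following treats each category one-versus-rest and applies the standard MCC formula to the resulting confusion counts. The category labels and the toy predictions are hypothetical and are not HECTAR data.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when one of the marginal counts is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def per_class_mcc(y_true, y_pred, labels):
    """One-versus-rest MCC for each label."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        tn = sum(t != label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        scores[label] = matthews_corrcoef(tp, tn, fp, fn)
    return scores


# Toy labels (hypothetical, for illustration only).
y_true = ["SP", "SP", "cTP", "mTP", "other", "other"]
y_pred = ["SP", "cTP", "cTP", "mTP", "other", "other"]
scores = per_class_mcc(y_true, y_pred, ["SP", "cTP", "mTP", "other"])
```

A category that is always predicted correctly obtains an MCC of 1, while confusions between two categories lower the score of both.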
International Journal of Intelligent Information and Database Systems | 2012
Yann Guermeur
Roughly speaking, there is one main model of pattern recognition support vector machine, with several variants of lower popularity. On the contrary, among the different multi-class support vector machines which can be found in the literature, none is clearly favoured. On the one hand, they exhibit distinct statistical properties. On the other hand, multiple comparative studies between multi-class support vector machines and decomposition methods have highlighted the fact that in practice, each model has its advantages and drawbacks. In this article, we introduce a generic model of multi-class support vector machine. It provides the first unifying definition of all the machines of this kind published so far. This contribution makes it possible to devise new machines meeting specific requirements as well as to analyse globally the statistical properties of the multi-class support vector machines.
Optics Express | 2012
Faiza Abdat; Marine Amouroux; Yann Guermeur; Walter Blondel
This paper deals with multi-class classification of skin pre-cancerous stages based on bimodal spectroscopic features combining spatially resolved AutoFluorescence (AF) and Diffuse Reflectance (DR) measurements. A new hybrid method to extract and select features is presented. It is based on the Discrete Cosine Transform (DCT) applied to AF spectra and on Mutual Information (MI) applied to DR spectra. The classification is performed by means of a multi-class SVM: the M-SVM2. Its performance is compared with that of the One-Versus-All (OVA) decomposition method involving bi-class SVMs as base classifiers. The results of this study show that bimodality and the choice of an adequate spatial resolution allow for a significant increase in diagnostic accuracy. This accuracy can get as high as 81.7% when combining different distances in the case of bimodality.
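As an illustration of DCT-based feature extraction, the sketch below computes an unnormalised DCT-II of a spectrum and keeps its first few coefficients as a compact feature vector. Keeping the leading low-frequency coefficients, the function names and the toy intensities are all assumptions made for this example; the abstract does not state which coefficients the authors retain.

```python
import math


def dct2(signal):
    """Unnormalised DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)
    ]


def dct_features(spectrum, n_coeffs):
    """Keep the first n_coeffs DCT coefficients (assumed selection rule)."""
    return dct2(spectrum)[:n_coeffs]


# Toy spectrum (hypothetical intensities, not measured data).
spectrum = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.5]
features = dct_features(spectrum, n_coeffs=3)
```

The coefficient of index 0 is the sum of the samples, and a constant spectrum has all other coefficients equal to zero, which is why a few leading coefficients can summarise a smooth spectrum.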
Communications in Statistics - Theory and Methods | 2010
Yann Guermeur
Bounds on the risk play a crucial role in statistical learning theory. As a capacity measure of the model studied, they usually involve the VC dimension or one of its extensions. In classification, such “VC dimensions” exist for models taking values in {0, 1}, [[1, Q]], and ℝ. We introduce the generalizations appropriate for the missing case, that of models with values in ℝ^Q. This provides us with a new guaranteed risk for M-SVMs. For those models, a sharper bound is obtained by using the Rademacher complexity.
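For readers unfamiliar with the Rademacher complexity, the sketch below computes the standard empirical version for a finite class of real-valued functions by enumerating every sign vector. It illustrates the usual textbook definition only; it is not the ℝ^Q-valued setting or the bound of the article, and the toy class is hypothetical.

```python
from itertools import product


def empirical_rademacher(function_outputs):
    """Exact empirical Rademacher complexity of a finite function class.

    function_outputs[j][i] is the value of the j-th function on the i-th
    sample point. The cost grows as 2^m, so this suits tiny samples only.
    """
    m = len(function_outputs[0])
    total = 0.0
    # Average, over all 2^m sign vectors, the supremum of the signed mean.
    for signs in product((-1, 1), repeat=m):
        total += max(
            sum(s * v for s, v in zip(signs, outputs)) / m
            for outputs in function_outputs
        )
    return total / 2 ** m


# Toy class (hypothetical): two functions evaluated on two sample points.
toy_class = [[1.0, 1.0], [-1.0, -1.0]]
complexity = empirical_rademacher(toy_class)
```

A class reduced to a single function has complexity zero, since the random signs cancel on average; richer classes can correlate with more sign patterns and obtain larger values.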
Communications in Statistics - Theory and Methods | 2013
Yann Guermeur
Roughly speaking, there is one main model of pattern recognition support vector machine, with several variants of lower popularity. On the contrary, among the different multi-class support vector machines which can be found in the literature, none is clearly favoured. On the one hand, they exhibit distinct statistical properties. On the other hand, multiple comparative studies between multi-class support vector machines and decomposition methods have highlighted the fact that each model has its advantages and drawbacks. These observations call for the evaluation of combinations of multi-class support vector machines. In this article, we study the combination of multi-class support vector machines with linear ensemble methods. Their sample complexity is low, which should prevent them from overfitting, and the outputs of two of them are estimates of the class posterior probabilities.
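One simple instance of a linear combiner is a convex combination of the class-posterior estimates produced by several classifiers, sketched below. The fixed weights and the toy outputs are hypothetical; the abstract does not specify which linear ensemble methods are studied or how their coefficients are learned.

```python
def combine_posteriors(member_outputs, weights):
    """Convex combination of class-posterior estimates from several classifiers."""
    if len(member_outputs) != len(weights):
        raise ValueError("one weight per ensemble member is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n_classes = len(member_outputs[0])
    return [
        sum(w * outputs[k] for w, outputs in zip(weights, member_outputs))
        for k in range(n_classes)
    ]


# Hypothetical outputs of three classifiers on one example with three classes.
outputs = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]
combined = combine_posteriors(outputs, weights=[0.5, 0.3, 0.2])
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

Because the weights are non-negative and sum to one, the combined vector is again a probability distribution over the classes, so it can still be read as a posterior estimate.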
Biomedical spectroscopy and imaging | 2015
Faiza Abdat; Marine Amouroux; Yann Guermeur; Walter Blondel
The current study deals with new perspectives to perform more efficient classification of mouse skin precancerous stages by means of a decision fusion scheme based on belief functions and exploiting the spatial resolution of the autofluorescence and diffuse reflectance spectroscopic data.
International Conference Laser Optics | 2014
Walter Blondel; F. Abdat; Marine Amouroux; Yann Guermeur
The current study deals with new perspectives to perform more efficient classification of mouse skin precancerous stages by exploiting the spatial resolution of multimodal spectroscopic data in a decision fusion scheme based on belief functions.
Trans. Computational Collective Intelligence | 2013
Rémi Bonidal; Samy Tindel; Yann Guermeur
For a support vector machine, model selection consists in selecting the kernel function, the values of its parameters, and the amount of regularization. To set the value of the regularization parameter, one can minimize an appropriate objective function over the regularization path. A priori, this requires the availability of two elements: the objective function and an algorithm computing the regularization path at a reduced cost. The literature provides us with several upper bounds and estimates for the leave-one-out cross-validation error of the l2-SVM. However, no algorithm was available so far for fitting the entire regularization path of this machine. In this article, we introduce the first algorithm of this kind. It is involved in the specification of new methods to tune the corresponding penalization coefficient, whose objective function is a leave-one-out error bound or estimate. From a computational point of view, these methods appear especially appropriate when the Gram matrix is of low rank. A comparative study involving state-of-the-art alternatives provides us with an empirical confirmation of this advantage.
Biomedical spectroscopy and imaging | 2011
Faiza Abdat; Marine Amouroux; Yann Guermeur; Walter Blondel
This paper deals with multi-class classification of skin precancerous stages based on bimodal spectroscopy combining AutoFluorescence (AF) and Diffuse Reflectance (DR) measurements. The proposed data processing method is based on the Discrete Cosine Transform (DCT) to extract discriminant spectral features and on a Support Vector Machine to classify them. Results show that DCT gives better results for AF spectra than for DR spectra. This study shows that bimodality and monitoring spectral resolution together allow an increase in diagnostic accuracy. The choice of an adequate spectral resolution always implies an increase in diagnostic accuracy. This accuracy can get as high as 79.0% when combining different distances between collecting and exciting optical fibers.
BMC Bioinformatics | 2006
Nicolas Sapay; Yann Guermeur; Gilbert Deléage