Publication


Featured research published by Philippe Besse.


BMC Bioinformatics | 2011

Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems

Kim-Anh Lê Cao; Simon Boitard; Philippe Besse

Background: Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. Results: A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. Conclusions: sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.
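The idea of extending sparse PLS to classification can be illustrated with a minimal sketch, under stated assumptions: the class labels are dummy-coded into an indicator matrix, and one sparse loading vector is obtained by alternating soft-thresholding on the cross-product matrix. The function names, the penalty `lam`, and the toy data are hypothetical, and this is not the mixOmics implementation.

```python
import math

def soft_threshold(vec, lam):
    # Shrink every entry toward zero by lam; entries with |x| <= lam become 0.
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in vec]

def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def center_columns(rows):
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def splsda_first_component(X, labels, lam, n_iter=50):
    """Return (u, v): a sparse feature loading vector and a class loading vector."""
    classes = sorted(set(labels))
    # Dummy-code the class labels (assumption: one indicator column per class).
    Y = [[1.0 if lab == c else 0.0 for c in classes] for lab in labels]
    Xc, Yc = center_columns(X), center_columns(Y)
    p, k = len(X[0]), len(classes)
    # M = Xc^T Yc, a (features x classes) cross-product matrix.
    M = [[sum(xr[j] * yr[c] for xr, yr in zip(Xc, Yc)) for c in range(k)]
         for j in range(p)]
    # Start from the row of M with the largest norm.
    v = normalize(max(M, key=lambda row: sum(x * x for x in row)))
    u = [0.0] * p
    for _ in range(n_iter):
        # Only the feature loadings are penalized, which performs the selection.
        u = normalize(soft_threshold(
            [sum(M[j][c] * v[c] for c in range(k)) for j in range(p)], lam))
        v = normalize([sum(M[j][c] * u[j] for j in range(p)) for c in range(k)])
    return u, v

# Toy data: feature 0 separates the classes strongly, feature 3 only weakly.
X = [[5.0, 0.1, 1.0, 0.4], [5.2, -0.1, 1.1, 0.3], [4.8, 0.0, 0.9, 0.5],
     [1.0, 0.1, 1.0, 0.2], [1.2, -0.1, 1.1, 0.4], [0.8, 0.0, 0.9, 0.3]]
labels = ["a", "a", "a", "b", "b", "b"]
u, v = splsda_first_component(X, labels, lam=1.0)
selected = [j for j, w in enumerate(u) if w != 0.0]
print(selected)  # indices of the features kept by the penalty
```

A larger `lam` keeps fewer features. If `lam` exceeds every entry of `M v`, the loading vector collapses to zero, so in practice the penalty would be tuned, for example by cross-validation.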


Psychometrika | 1986

Principal components analysis of sampled functions

Philippe Besse; James O. Ramsay

This paper describes a technique for principal components analysis of data consisting of n functions each observed at p argument values. This problem arises particularly in the analysis of longitudinal data in which some behavior of a number of subjects is measured at a number of points in time. In such cases information about the behavior of one or more derivatives of the function being sampled can often be very useful, as for example in the analysis of growth or learning curves. It is shown that the use of derivative information is equivalent to a change of metric for the row space in classical principal components analysis. The reproducing kernel for the Hilbert space of functions plays a central role, and defines the best interpolating functions, which are generalized spline functions. An example is offered of how sensitivity to derivative information can reveal interesting aspects of the data.


Scandinavian Journal of Statistics | 2000

Autoregressive Forecasting of Some Functional Climatic Variations

Philippe Besse; Hervé Cardot; David B. Stephenson

Many variations such as the annual cycle in sea surface temperatures can be considered to be smooth functions and are appropriately described using methods from functional data analysis. This study defines a class of functional autoregressive (FAR) models which can be used as robust predictors for making forecasts of entire smooth functions in the future. The methods are illustrated and compared with pointwise predictors such as SARIMA by applying them to forecasting the entire annual cycle of climatological El Niño–Southern Oscillation (ENSO) time series one year ahead. Forecasts for the period 1987–1996 suggest that the FAR functional predictors show some promising skill, compared to traditional scalar SARIMA forecasts which perform poorly.


BMC Bioinformatics | 2009

Sparse canonical methods for biological data integration: application to a cross-platform study

Kim-Anh Lê Cao; Pascal Martin; Christèle Robert-Granié; Philippe Besse

Background: In the context of systems biology, few sparse approaches have been proposed so far to integrate several data sets. It is however an important and fundamental issue that will be widely encountered in post genomic studies, when simultaneously analyzing transcriptomics, proteomics and metabolomics data using different platforms, so as to understand the mutual interactions between the different data sets. In this high dimensional setting, variable selection is crucial to give interpretable results. We focus on a sparse Partial Least Squares approach (sPLS) to handle two-block data sets, where the relationship between the two types of variables is known to be symmetric. Sparse PLS has been developed either for a regression or a canonical correlation framework and includes a built-in procedure to select variables while integrating data. To illustrate the canonical mode approach, we analyzed the NCI60 data sets, where two different platforms (cDNA and Affymetrix chips) were used to study the transcriptome of sixty cancer cell lines. Results: We compare the results obtained with two other sparse or related canonical correlation approaches: CCA with Elastic Net penalization (CCA-EN) and Co-Inertia Analysis (CIA). The latter does not include a built-in procedure for variable selection and requires a two-step analysis. We stress the lack of statistical criteria to evaluate canonical correlation methods, which makes biological interpretation absolutely necessary to compare the different gene selections. We also propose comprehensive graphical representations of both samples and variables to facilitate the interpretation of the results. Conclusion: sPLS and CCA-EN selected highly relevant genes and complementary findings from the two data sets, which enabled a detailed understanding of the molecular characteristics of several groups of cell lines. These two approaches were found to bring similar results, although they highlighted the same phenomena with a different priority. They outperformed CIA, which tended to select redundant information.


Hepatology | 2007

Novel aspects of PPARα-mediated regulation of lipid and xenobiotic metabolism revealed through a nutrigenomic study

Pascal Martin; Hervé Guillou; F. Lasserre; Sébastien Déjean; Annaïg Lan; Jean-Marc Pascussi; Magali SanCristobal; Philippe Legrand; Philippe Besse; Thierry Pineau

Peroxisome proliferator‐activated receptor‐α (PPARα) is a major transcriptional regulator of lipid metabolism. It is activated by diverse chemicals such as fatty acids (FAs) and regulates the expression of numerous genes in organs displaying high FA catabolic rates, including the liver. The role of this nuclear receptor as a sensor of whole dietary fat intake has been inferred, mostly from high‐fat diet studies. To delineate its function under low fat intake conditions (4.8% w/w), we studied the effects of five regimens with contrasted FA compositions on liver lipids and hepatic gene expression in wild‐type and PPARα‐deficient mice. Diets containing polyunsaturated FAs reduced hepatic fat stores in wild‐type mice. Only sunflower, linseed, and fish oil diets lowered hepatic lipid stores in PPARα−/− mice, a model of progressive hepatic triglyceride accumulation. These beneficial effects were associated, in particular, with dietary regulation of Δ9‐desaturase in both genotypes, and with a newly identified PPARα‐dependent regulation of lipin. Furthermore, hepatic levels of 18‐carbon essential FAs (C18:2ω6 and C18:3ω3) were elevated in PPARα−/− mice, possibly due to the observed reduction in expression of the Δ6‐desaturase and of enoyl‐coenzyme A isomerases. Effects of diet and genotype were also observed on the xenobiotic metabolism‐related genes Cyp3a11 and CAR. Conclusion: Together, our results suggest that dietary FAs represent, even under low fat intake conditions, a beneficial strategy to reduce hepatic steatosis. Under such conditions, we established the role of PPARα as a dietary FA sensor and highlighted its importance in regulating hepatic FA content and composition. (HEPATOLOGY 2007;45:767–777.)


Statistics & Probability Letters | 1992

PCA stability and choice of dimensionality

Philippe Besse

A criterion of stability for PCA scatterplots is defined based on a classical distance between projectors. It is constructed as a risk function and can be estimated by bootstrap or jackknife methods. Furthermore, perturbation theory is used to write down a Taylor expansion of the jackknife estimate for reasons of computational cost and in order to obtain an analytic expression for the approximation. The comparative study of these three estimates on real data shows that the last one is easy to compute, sufficiently accurate and helpful in choosing dimensionality in PCA.
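As a rough illustration of the idea, the sketch below measures the stability of the first principal axis by the distance between rank-one projectors, 1 - (u·û)², which equals half the squared Frobenius norm of the difference between the two projectors. The distance is averaged over leave-one-out samples. This plain leave-one-out average computed with power iteration is an assumption made for brevity; it is not the paper's risk estimator or its Taylor approximation, and all function names and data are hypothetical.

```python
import math

def covariance(rows):
    n, p = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, means)] for r in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_axis(cov, n_iter=500):
    # Power iteration for the first eigenvector (assumes the start vector
    # is not orthogonal to it).
    p = len(cov)
    u = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * u[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return u
        u = [x / norm for x in w]
    return u

def projector_distance(u, v):
    # Half the squared Frobenius distance between u u^T and v v^T (unit vectors).
    dot = sum(a * b for a, b in zip(u, v))
    return max(0.0, 1.0 - dot * dot)

def jackknife_instability(rows):
    # Average distance between the full-sample axis and each leave-one-out axis.
    full = leading_axis(covariance(rows))
    dists = []
    for i in range(len(rows)):
        loo = rows[:i] + rows[i + 1:]
        dists.append(projector_distance(full, leading_axis(covariance(loo))))
    return sum(dists) / len(dists)

# Strongly elongated cloud: the first axis barely moves when a point is removed.
stable = [[float(t), float(t) + 0.05 * (-1) ** t] for t in range(10)]
# Perfectly isotropic cloud: the first axis is not identifiable.
isotropic = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
print(jackknife_instability(stable), jackknife_instability(isotropic))
```

A value near 0 suggests the retained axis is well separated from the rest. Larger values suggest that the chosen dimension cuts through nearly equal eigenvalues.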


Computational Statistics & Data Analysis | 1997

Simultaneous non-parametric regressions of unbalanced longitudinal data

Philippe Besse; Hervé Cardot; Frédéric Ferraty

The aim of this paper is to simultaneously estimate n curves corrupted by noise, that is, several observations of a random process. The non-parametric estimation of the sampled paths leads to a new kind of functional principal components analysis which simultaneously takes into account a dimensionality and a smoothness constraint. Furthermore, the use of B-spline approximation to estimate the curves allows the study of unbalanced longitudinal data. The relationship between the choice of the smoothing parameter and that of dimensionality is discussed. A simulation study shows that the proposed estimate behaves well compared to n independent smoothing splines under generalized cross-validation. Finally, the methodology of this paper is illustrated by its application to a real world data set.


BMC Genomics | 2008

Growth rate regulated genes and their wide involvement in the Lactococcus lactis stress responses

Clémentine Dressaire; Emma Redon; Hélène Milhem; Philippe Besse; Pascal Loubière; Muriel Cocaign-Bousquet

Background: The development of transcriptomic tools has allowed exhaustive description of stress responses. These responses always superimpose a general response associated with growth rate decrease and a specific one corresponding to the stress. The exclusive growth rate response can be achieved through chemostat cultivation, enabling all parameters to remain constant except the growth rate. Results: We analysed metabolic and transcriptomic responses of Lactococcus lactis in continuous cultures at different growth rates ranging from 0.09 to 0.47 h⁻¹. Growth rate was conditioned by isoleucine supply. Although carbon metabolism was constant and homolactic, a widespread transcriptomic response involving 30% of the genome was observed. The expression of genes encoding physiological functions associated with biogenesis increased with growth rate (transcription, translation, fatty acid and phospholipids metabolism). Many phage, prophage and transposon related genes were down regulated as growth rate increased. The growth rate response was compared to carbon and amino-acid starvation transcriptomic responses, revealing constant and significant involvement of growth rate regulations in these two stressful conditions (overlap 27%). Two regulators potentially involved in the growth rate regulations, llrE and yabB, have been identified. Moreover it was established that genes positively regulated by growth rate are preferentially located in the vicinity of the replication origin while those negatively regulated are mainly encountered at the opposite, thus indicating the relationship between gene expression and gene location on the chromosome. Although the stringent response mechanism is considered as the one governing growth deceleration in bacteria, the rigorous comparison of the two transcriptomic responses clearly indicated the mechanisms are distinct. Conclusion: This work of integrative biology was performed at the global level using transcriptomic analysis obtained in various growth conditions. It raised the importance of growth rate regulations in bacteria but also contributed to the elucidation of the involved mechanism. Though the mechanism controlling growth rate is not yet fully understood in L. lactis, one expected regulatory mechanism has been ruled out, two potential regulators have been pointed out and gene location on the chromosome has also been found to be involved in the expression regulation of these growth related genes.


Eurasip Journal on Bioinformatics and Systems Biology | 2007

Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives

Sébastien Déjean; Pascal Martin; Alain Baccini; Philippe Besse

Microarray data acquired during time-course experiments allow the temporal variations in gene expression to be monitored. An original postprandial fasting experiment was conducted in the mouse and the expression of 200 genes was monitored with a dedicated macroarray at 11 time points between 0 and 72 hours of fasting. The aim of this study was to provide a relevant clustering of gene expression temporal profiles. This was achieved by focusing on the shapes of the curves rather than on the absolute level of expression. Specifically, we combined spline smoothing and first derivative computation with hierarchical and partitioning clustering. A heuristic approach was proposed to tune the spline smoothing parameter using both statistical and biological considerations. Clusters are illustrated a posteriori through principal component analysis and heatmap visualization. Most results were found to be in agreement with the literature on the effects of fasting on the mouse liver and provide promising directions for future biological investigations.


Journal of Biological Systems | 2009

Highlighting relationships between heterogeneous biological data through graphical displays based on regularized canonical correlation analysis

Ignacio González; Sébastien Déjean; Pascal Martin; Olivier Gonçalves; Philippe Besse; Alain Baccini

Biological data produced by high throughput technologies are becoming more and more abundant and are raising many statistical questions. This paper addresses one of them: the case in which gene expression data are jointly observed with other variables, with the purpose of highlighting significant relationships between gene expression and these other variables. One relevant statistical method to explore these relationships is Canonical Correlation Analysis (CCA). Unfortunately, in the context of postgenomic data, the number of variables (gene expressions) is usually greater than the number of units (samples) and CCA cannot be directly performed: a regularized version is required. We applied regularized CCA on data sets from two different studies and show that its interpretation evidences both previously validated relationships and new hypotheses. From the first data sets (nutrigenomic study), we generated interesting hypotheses on the transcription factor pathways potentially linking hepatic fatty acids and gene expression. From the second data sets (pharmacogenomic study on the NCI-60 cancer cell line panel), we identified new ABC transporter candidate substrates whose relevance is illustrated by the concomitant identification of several known substrates. In conclusion, the use of regularized CCA is likely to be relevant to a number and a variety of biological experiments involving the generation of high throughput data. We demonstrated here its ability to enhance the range of relevant conclusions that can be drawn from these relatively expensive experiments.
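The reason a regularized version is needed can be shown with a short sketch, assuming a ridge-type regularization in which a multiple of the identity is added to each sample covariance matrix before the canonical correlations are computed. The function name, the parameters `lam_x` and `lam_y`, and the simulated data are hypothetical. This is an illustration with NumPy, not the authors' implementation.

```python
import numpy as np

def regularized_cca_correlations(X, Y, lam_x, lam_y):
    """Canonical correlations after adding lam * I to each covariance matrix."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # With more variables than samples, Xc.T @ Xc is singular; the ridge
    # term makes it positive definite so it can be factorized.
    Sxx = Xc.T @ Xc / (n - 1) + lam_x * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + lam_y * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # K = Lx^{-1} Sxy Ly^{-T}; its singular values are the correlations.
    A = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, A.T).T
    return np.linalg.svd(K, compute_uv=False)

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 30))  # 10 samples, 30 variables: more variables than units
Y = rng.normal(size=(10, 5))
rho = regularized_cca_correlations(X, Y, lam_x=0.5, lam_y=0.5)
print(rho)
```

Without the ridge terms the Cholesky factorization of the 30 x 30 covariance matrix would fail, because its rank is at most 9 here. The regularization parameters would normally be chosen by cross-validation rather than fixed by hand.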

Collaboration


Dive into Philippe Besse's collaboration.

Top Co-Authors

Alain Baccini
Paul Sabatier University

Pascal Martin
Institut national de la recherche agronomique

Christèle Robert-Granié
Institut national de la recherche agronomique

Jean-Michel Loubes
Institut de Mathématiques de Toulouse

Béatrice Laurent
Institut de Mathématiques de Toulouse

Agnès Bonnet
Institut national de la recherche agronomique

Brendan Guillouet
Institut de Mathématiques de Toulouse