Publication


Featured research published by Florian Rohart.


PLOS Computational Biology | 2017

mixOmics: An R package for ‘omics feature selection and multiple data integration

Florian Rohart; Benoit Gautier; Amrit Singh; Kim-Anh Lê Cao

The advent of high throughput technologies has led to a wealth of publicly available ‘omics data coming from different sources, such as transcriptomics, proteomics, metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have been focusing on identifying small subsets of molecules (a ‘molecular signature’) to explain or predict biological conditions, but mainly for a single type of ‘omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension reduction and visualisation. By adopting a systems biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous ‘omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple ‘omics data or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analyses of ‘omics data available from the package.


Journal of Animal Science | 2012

Phenotypic prediction based on metabolomic data for growing pigs from three main European breeds.

Florian Rohart; Alain Paris; B. Laurent; Cécile Canlet; Jérôme Molina; Marie-José Mercat; Thierry Tribout; Nelly Muller; Nathalie Iannuccelli; Laurence Liaubet; Denis Milan; M. San Cristobal

Predicting phenotypes is a statistical and biotechnical challenge, both in medicine (predicting an illness) and animal breeding (predicting the economic value of the carcass from a young live animal). High-throughput fine phenotyping is possible using metabolomics, which describes the global metabolic status of an individual, and is the closest to the terminal phenotype. The purpose of this work was to quantify the prediction power of metabolomic profiles for commonly used production phenotypes from a single blood sample from growing pigs. Several statistical approaches were investigated and compared on the basis of cross validation: raw data vs. signal preprocessing (wavelet transformation), with a single-feature selection method. The best results in terms of prediction accuracy were obtained when data were preprocessed using wavelet transformations on the Daubechies basis. The phenotypes related to meat quality were not well predicted because the blood sample was taken some time before slaughter, and slaughter is known to have a strong influence on these traits. By contrast, phenotypes of potential economic interest (e.g., lean meat percentage and ADFI) were well predicted (R² = 0.7; P < 0.0001) using metabolomic data.
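
To make the wavelet preprocessing step concrete, here is a minimal sketch of one-level wavelet denoising in plain Python. It uses the Haar wavelet, the simplest member of the Daubechies family, and a hard threshold on the detail coefficients. The function names, the threshold value and the toy signal are illustrative assumptions; the abstract does not state which Daubechies wavelet, decomposition depth or thresholding rule the authors used.

```python
from math import sqrt

def haar_forward(signal):
    """One level of the orthonormal Haar transform (input length must be even)."""
    s = sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward and return the reconstructed signal."""
    s = sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def denoise(signal, threshold):
    """Zero out small detail coefficients (hard thresholding), then reconstruct."""
    approx, detail = haar_forward(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_inverse(approx, detail)

# Hypothetical toy spectrum: the small wiggle (4.0, 4.2) is averaged out,
# while the large jump (1.0, 5.0) survives the threshold.
smoothed = denoise([4.0, 4.2, 1.0, 5.0], threshold=0.5)
print(smoothed)
```

In practice a dedicated wavelet library with multi-level decompositions would replace this hand-written transform; the sketch only shows the transform, threshold, reconstruct pattern.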


bioRxiv | 2017

DIABLO - an integrative, multi-omics, multivariate method for multi-group classification

Amrit Singh; Benoit Gautier; Casey P. Shannon; Michael Vacher; Florian Rohart; Scott J. Tebbutt; Kim-Anh Lê Cao

Systems biology approaches, leveraging multi-omics measurements, are needed to capture the complexity of biological networks while identifying the key molecular drivers of disease mechanisms. We present DIABLO, a novel integrative method to identify multi-omics biomarker panels that can discriminate between multiple phenotypic groups. In the multi-omics analyses of simulated and real-world datasets, DIABLO resulted in superior biological enrichment compared to other integrative methods, and achieved comparable predictive performance with existing multi-step classification schemes. DIABLO is a versatile approach that will benefit a diverse range of research areas, where multiple high dimensional datasets are available for the same set of specimens. DIABLO is implemented along with tools for model selection, and validation, as well as graphical outputs to assist in the interpretation of these integrative analyses (http://mixomics.org/).

Rapid advances in technology have led to a wealth of large-scale molecular omics datasets. Integrating such data offers an unprecedented opportunity to assess molecular interactions at multiple functional levels and provide a more comprehensive understanding of the biological pathways involved in different disease subgroups. However, multiple omics data integration is a challenging task due to the heterogeneity in the different platforms used. There is a need to address the complex and correlated nature of different data-types, in order to identify a robust and reliable multi-omics signature that can predict a phenotype of interest. We introduce a novel multivariate dimension reduction method for multiple omics integration, classification and identification of a multi-omics molecular signature. DIABLO (Data Integration Analysis for Biomarker discovery using a Latent component method for Omics studies) models the correlation structure between omics datasets, resulting in an improved ability to associate biomarkers across multiple functional levels to phenotypes of interest. We demonstrate the capabilities of DIABLO using simulated data and studies of breast cancer and asthma, integrating up to four types of omics datasets to identify relevant biomarkers, while still retaining competitive classification and predictive performance compared to existing methods. Our statistical integrative framework can benefit a diverse range of research areas with varying types of study designs, as well as enabling module-based analyses. Importantly, graphical outputs of our method assist in the interpretation of such complex analyses and provide significant biological insights.


Hepatology | 2017

Human Hepatocellular Carcinomas With a Periportal Phenotype Have the Lowest Potential for Early Recurrence After Curative Resection

Romain Désert; Florian Rohart; Frédéric Canal; Marie Sicard; Mireille Desille; Stéphanie Renaud; Bruno Turlin; Pascale Bellaud; Christine Perret; Bruno Clément; Kim-Anh Lê Cao; Orlando Musso

Hepatocellular carcinomas (HCCs) exhibit a diversity of molecular phenotypes, raising major challenges in clinical management. HCCs detected by surveillance programs at an early stage are candidates for potentially curative therapies (local ablation, resection, or transplantation). In the long term, transplantation provides the lowest recurrence rates. Treatment allocation is based on tumor number, size, vascular invasion, performance status, functional liver reserve, and the prediction of early (<2 years) recurrence, which reflects the intrinsic aggressiveness of the tumor. Well‐differentiated, potentially low‐aggressiveness tumors form the heterogeneous molecular class of nonproliferative HCCs, characterized by an approximate 50% β‐catenin mutation rate. To define the clinical, pathological, and molecular features and the outcome of nonproliferative HCCs, we constructed a 1,133‐HCC transcriptomic metadata set and validated findings in a publicly available 210‐HCC RNA sequencing set. We show that nonproliferative HCCs preserve the zonation program that distributes metabolic functions along the portocentral axis in normal liver. More precisely, we identified two well‐differentiated, nonproliferation subclasses, namely periportal‐type (wild‐type β‐catenin) and perivenous‐type (mutant β‐catenin), which expressed negatively correlated gene networks. The new periportal‐type subclass represented 29% of all HCCs; expressed a hepatocyte nuclear factor 4A–driven gene network, which was down‐regulated in mouse hepatocyte nuclear factor 4A knockout mice; were early‐stage tumors by Barcelona Clinic Liver Cancer, Cancer of the Liver Italian Program, and tumor–node–metastasis staging systems; had no macrovascular invasion; and showed the lowest metastasis‐specific gene expression levels and TP53 mutation rates. Also, we identified an eight‐gene periportal‐type HCC signature, which was independently associated with the highest 2‐year recurrence‐free survival by multivariate analyses in two independent cohorts of 247 and 210 patients. Conclusion: Well‐differentiated HCCs display mutually exclusive periportal or perivenous zonation programs. Among all HCCs, periportal‐type tumors have the lowest intrinsic potential for early recurrence after curative resection. (Hepatology 2017;66:1502–1518).


PeerJ | 2016

A molecular classification of human mesenchymal stromal cells

Florian Rohart; Elizabeth Mason; Nicholas Matigian; Rowland Mosbergen; Othmar Korn; Tyrone Chen; Suzanne Butcher; Jatin Patel; Kerry Atkinson; Kiarash Khosrotehrani; Nicholas M. Fisk; Kim-Anh Lê Cao; Christine A. Wells

Mesenchymal stromal cells (MSC) are widely used for the study of mesenchymal tissue repair, and increasingly adopted for cell therapy, despite the lack of consensus on the identity of these cells. In part this is due to the lack of specificity of MSC markers. Distinguishing MSC from other stromal cells such as fibroblasts is particularly difficult using standard analysis of surface proteins, and there is an urgent need for improved classification approaches. Transcriptome profiling is commonly used to describe and compare different cell types; however, efforts to identify specific markers of rare cellular subsets may be confounded by the small sample sizes of most studies. Consequently, it is difficult to derive reproducible, and therefore useful markers. We addressed the question of MSC classification with a large integrative analysis of many public MSC datasets. We derived a sparse classifier (The Rohart MSC test) that accurately distinguished MSC from non-MSC samples with >97% accuracy on an internal training set of 635 samples from 41 studies derived on 10 different microarray platforms. The classifier was validated on an external test set of 1,291 samples from 65 studies derived on 15 different platforms, with >95% accuracy. The genes that contribute to the MSC classifier formed a protein-interaction network that included known MSC markers. Further evidence of the relevance of this new MSC panel came from the high number of Mendelian disorders associated with mutations in more than 65% of the network. These result in mesenchymal defects, particularly impacting on skeletal growth and function. The Rohart MSC test is a simple in silico test that accurately discriminates MSC from fibroblasts, other adult stem/progenitor cell types or differentiated stromal cells. It has been implemented in the www.stemformatics.org resource, to assist researchers wishing to benchmark their own MSC datasets or data from the public domain. 
The code is available from the CRAN repository and all data used to generate the MSC test is available to download via the Gene Expression Omnibus or the Stemformatics resource.


BMC Bioinformatics | 2017

MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms

Florian Rohart; Aida Eslami; Nicholas Matigian; Stéphanie Bougeard; Kim-Anh Lê Cao

Background: Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and technological platforms across transcriptomic studies produce unwanted systematic variation that strongly confounds the integrative analysis results. When studies aim to discriminate an outcome of interest, the common approach is a sequential two-step procedure; unwanted systematic variation removal techniques are applied prior to classification methods. Results: To limit the risk of overfitting and over-optimistic results of a two-step procedure, we developed a novel multivariate integration method, MINT, that simultaneously accounts for unwanted systematic variation and identifies predictive gene signatures with greater reproducibility and accuracy. In two biological examples on the classification of three human cell types and four subtypes of breast cancer, we combined high-dimensional microarray and RNA-seq data sets and MINT identified highly reproducible and relevant gene signatures predictive of a given phenotype. MINT led to superior classification and prediction accuracy compared to the existing sequential two-step procedures. Conclusions: MINT is a powerful approach and the first of its kind to solve the integrative classification framework in a single step by combining multiple independent studies. MINT is computationally fast as part of the mixOmics R CRAN package, available at http://www.mixOmics.org/mixMINT/ and http://cran.r-project.org/web/packages/mixOmics/.


Computational Statistics & Data Analysis | 2014

Selection of fixed effects in high dimensional linear mixed models using a multicycle ECM algorithm

Florian Rohart; Magali San Cristobal; Béatrice Laurent

Linear mixed models are especially useful when observations are grouped. In a high dimensional setting however, selecting the fixed effect coefficients in these models is mandatory as classical tools are not performing well. By considering the random effects as missing values in the linear mixed model framework, a $\ell_1$-penalization on the fixed effects coefficients of the resulting log-likelihood is proposed. The optimization problem is solved via a multicycle Expectation Conditional Maximization (ECM) algorithm which allows for the number of parameters $p$ to be larger than the total number of observations $n$ and does not require the inversion of the sample $n \times n$ covariance matrix. The proposed algorithm can be combined with any variable selection method developed for linear models. A variant of the proposed approach replaces the $\ell_1$-penalization with a multiple testing procedure for the variable selection aspect and is shown to greatly improve the False Discovery Rate. Both methods are implemented in the MMS R-package, and are shown to give very satisfying results in a high-dimensional simulated setting.
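
As a rough illustration of how an ℓ1 (lasso) penalty performs variable selection, the sketch below fits an ordinary linear model by cyclic coordinate descent with soft-thresholding. This is a deliberately simplified stand-in: it has no random effects and is not the multicycle ECM algorithm of the paper or the MMS package API. The function names, the penalty value and the toy design matrix are all assumptions made for the example.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta

# Hypothetical toy data: three orthogonal features, only the first drives y.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.0, 2.0, -2.0, -2.0]
beta = lasso_cd(X, y, lam=0.1)
print(beta)  # first coefficient is shrunk below 2, the other two are exactly zero
```

The penalty sets the coefficients of the two irrelevant features to exactly zero and shrinks the relevant one, which is the selection behaviour the abstract relies on for the fixed effects.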


Scientific Reports | 2016

Disease surveillance based on Internet-based linear models: an Australian case study of previously unmodeled infection diseases

Florian Rohart; Gabriel J. Milinovich; Simon M R Avril; Kim-Anh Lê Cao; Shilu Tong; Wenbiao Hu

Effective disease surveillance is critical to the functioning of health systems. Traditional approaches are, however, limited in their ability to deliver timely information. Internet-based surveillance systems are a promising approach that may circumvent many of the limitations of traditional health surveillance systems and provide more intelligence on cases of infection, including cases from those that do not use the healthcare system. Infectious disease surveillance systems built on Internet search metrics have been shown to produce accurate estimates of disease weeks before traditional systems and are an economically attractive approach to surveillance; they are, however, also prone to error under certain circumstances. This study sought to explore previously unmodeled diseases by investigating the link between Google Trends search metrics and Australian weekly notification data. We propose using four alternative disease modelling strategies based on linear models that studied the length of the training period used for model construction, determined the most appropriate lag for search metrics, used wavelet transformation for denoising data and enabled the identification of key search queries for each disease. Out of the twenty-four diseases assessed with Australian data, our nowcasting results highlighted promise for two diseases of international concern, Ross River virus and pneumococcal disease.
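
One of the steps named above, choosing the most appropriate lag for a search metric, can be sketched as follows. This minimal example scores each candidate lag by the Pearson correlation between the lagged search series and the weekly notifications and keeps the best one. The selection criterion, the function names and the toy series are assumptions for illustration; the abstract does not specify how the authors chose the lag.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def best_lag(search, cases, max_lag):
    """Return (lag, r) where search[t - lag] correlates best with cases[t]."""
    scores = {}
    for lag in range(max_lag + 1):
        x = search[: len(search) - lag]  # search metric, shifted back by `lag` weeks
        y = cases[lag:]                  # notifications aligned to that shift
        scores[lag] = pearson(x, y)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]

# Hypothetical weekly series: notifications echo the search signal two weeks later.
search = [3, 8, 2, 9, 4, 7, 1, 6, 5, 10, 2, 8]
cases = [0, 0] + search[:-2]
lag, r = best_lag(search, cases, max_lag=4)
print(lag, round(r, 3))
```

With real data the chosen lag would then feed a linear model of notifications on the lagged search metric, and the choice should be validated on held-out weeks rather than on the training period.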


Hepatology | 2017

Phenotypic diversity spanning the spectrum of hepatocyte differentiation impacts the outcome of patients with beta-catenin-mutated hepatocellular carcinomas.

Romain Désert; Christelle Reynes; Robert Sabatier; Damien Gregoire; Florian Rohart; Anne Corlu; Frédéric Canal; Marie Sicard; Mireille Desille; Stéphanie Renaud; Bruno Turlin; Laurent Sulpice; Damien Bergeat; Pascale Bellaud; Christine Perret; Bruno Clément; Kim-Anh Lê Cao; Orlando Musso

Background: Clinical and radiological features are used for prognostication in patients with hepatocellular carcinoma (HCC). The present study aimed to prospectively evaluate the impact of HCC treatment in patients with HCC bearing a transcriptomic signature (TS) (Gut 2016. doi: 10.1136/gutjnl-2014-308483) associated with aggressive tumour behaviour and worse survival. Methods: Candidates for HCC treatment were prospectively subjected to histological HCC evaluation, both for diagnosis and for performance of the TS (qRT/PCR). Physicians deciding and performing treatment were blinded to the presence of TS. Outcome results were matched with TS presence only after follow-up finished. Results: 237 patients were enrolled; 81% were male, median age 65 years; 39.7% of them were alive at the end of follow-up in March 2017. Overall median survival was 31 months; the 26.6% of patients bearing the TS had significantly worse survival (median survival: 12 vs. 42 months; p < 0.001). Eighty percent of the entire population underwent at least one treatment for HCC. The cohort with TS showed significantly lower survival regardless of whether patients had a therapeutic option (HCC with TS vs. no signature: median survival 20 vs. 48 months; p < 0.001) or underwent supportive therapy only (median survival: 5 vs. 13 months; p < 0.001). The presence of TS was always associated with worse survival, whether patients underwent surgical resection (33 vs. 68 months; p < 0.001), multiple loco-regional treatments (33 vs. 69 months; p < 0.001) or systemic drug therapy alone (8 vs. 19 months; p < 0.001). Twenty patients (11.8%) undergoing liver transplant (LT) had the best survival of the entire cohort (98 vs. 36 months; p < 0.001); however, the 3 patients with TS undergoing LT had significantly lower survival within the LT group (75±19 vs. 101±58 months, p < 0.001). One out of 3 (33.3%) HCCs with TS recurred vs. 1 out of 17 (5.8%) without TS. At Cox regression analysis, the presence of the transcriptomic signature (HR 2.636, 95% CI 1.676-4.145), no treatment vs. performance of any treatment (HR 0.361, 95% CI 0.217-0.601), and liver function (Child-Pugh score: HR 1.263, 95% CI 1.063-1.490) were independently related to worse survival. Conclusion: HCCs bearing the transcriptomic signature have an extremely aggressive clinical course that ultimately impacts survival despite the application of all available treatments for HCC. Liver transplant could be the only real therapeutic option, but this should be prospectively assessed because, in HCC with the transcriptomic signature, a high rate of recurrence is biologically extremely plausible.


Scientific Reports | 2016

Integrating Multi-omics Data to Dissect Mechanisms of DNA repair Dysregulation in Breast Cancer

Chao Liu; Florian Rohart; Peter T. Simpson; Kum Kum Khanna; Mark A. Ragan; Kim-Anh Lê Cao

DNA repair genes and pathways that are transcriptionally dysregulated in cancer provide the first line of evidence for the altered DNA repair status in tumours, and hence have been explored intensively as a source for biomarker discovery. The molecular mechanisms underlying DNA repair dysregulation, however, have not been systematically investigated in any cancer type. In this study, we performed a statistical analysis to dissect the roles of DNA copy number alteration (CNA), DNA methylation (DM) at gene promoter regions and the expression changes of transcription factors (TFs) in the differential expression of individual DNA repair genes in normal versus tumour breast samples. These gene-level results were summarised at pathway level to assess whether different DNA repair pathways are affected in distinct manners. Our results suggest that CNA and expression changes of TFs are major causes of DNA repair dysregulation in breast cancer, and that a subset of the identified TFs may exert global impacts on the dysregulation of multiple repair pathways. Our work hence provides novel insights into DNA repair dysregulation in breast cancer. These insights improve our understanding of the molecular basis of the DNA repair biomarkers identified thus far, and have potential to inform future biomarker discovery.

Collaboration


Top Co-Authors

Denis Milan, Institut national de la recherche agronomique
Alain Paris, Institut national de la recherche agronomique
Béatrice Laurent, Institut de Mathématiques de Toulouse
Laurence Liaubet, Institut national de la recherche agronomique
Cécile Canlet, Institut national de la recherche agronomique
Jérôme Molina, Institut national de la recherche agronomique
Magali San Cristobal, Institut national de la recherche agronomique
Thierry Tribout, Institut national de la recherche agronomique
Marie Pierre Sanchez, Institut national de la recherche agronomique