Publication


Featured research published by Kim-Anh Lê Cao.


Bioinformatics | 2009

integrOmics: an R package to unravel relationships between two omics datasets

Kim-Anh Lê Cao; Ignacio González; Sébastien Déjean

Motivation: With the availability of many types of ‘omics’ data, such as transcriptomics, proteomics or metabolomics, the integrative or joint analysis of multiple datasets from different technology platforms is becoming crucial to unravel the relationships between different biological functional levels. However, the development of such an analysis is a major computational and technical challenge as most approaches suffer from high data dimensionality. New methodologies need to be developed and validated. Results: integrOmics efficiently performs integrative analyses of two types of ‘omics’ variables that are measured on the same samples. It includes a regularized version of canonical correlation analysis to highlight correlations between two datasets, and a sparse version of partial least squares (PLS) regression that includes simultaneous variable selection in both datasets. Both approaches have previously been shown to be useful and have been successfully applied in various integrative studies. Availability: integrOmics is freely available from http://CRAN.R-project.org/ or from the companion web site (http://math.univ-toulouse.fr/biostat), which provides full documentation and tutorials. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


BMC Bioinformatics | 2011

Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems

Kim-Anh Lê Cao; Simon Boitard; Philippe Besse

Background: Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. Results: A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. Conclusions: sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.


BMC Bioinformatics | 2009

Sparse canonical methods for biological data integration: application to a cross-platform study

Kim-Anh Lê Cao; Pascal Martin; Christèle Robert-Granié; Philippe Besse

Background: In the context of systems biology, few sparse approaches have been proposed so far to integrate several data sets. It is however an important and fundamental issue that will be widely encountered in post genomic studies, when simultaneously analyzing transcriptomics, proteomics and metabolomics data using different platforms, so as to understand the mutual interactions between the different data sets. In this high dimensional setting, variable selection is crucial to give interpretable results. We focus on a sparse Partial Least Squares approach (sPLS) to handle two-block data sets, where the relationship between the two types of variables is known to be symmetric. Sparse PLS has been developed either for a regression or a canonical correlation framework and includes a built-in procedure to select variables while integrating data. To illustrate the canonical mode approach, we analyzed the NCI60 data sets, where two different platforms (cDNA and Affymetrix chips) were used to study the transcriptome of sixty cancer cell lines. Results: We compare the results obtained with two other sparse or related canonical correlation approaches: CCA with Elastic Net penalization (CCA-EN) and Co-Inertia Analysis (CIA). The latter does not include a built-in procedure for variable selection and requires a two-step analysis. We stress the lack of statistical criteria to evaluate canonical correlation methods, which makes biological interpretation absolutely necessary to compare the different gene selections. We also propose comprehensive graphical representations of both samples and variables to facilitate the interpretation of the results. Conclusion: sPLS and CCA-EN selected highly relevant genes and complementary findings from the two data sets, which enabled a detailed understanding of the molecular characteristics of several groups of cell lines. These two approaches were found to bring similar results, although they highlighted the same phenomena with a different priority. They outperformed CIA, which tended to select redundant information.
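
The abstract above mentions a built-in procedure to select variables but does not spell it out. One common way to obtain sparse loadings in Lasso-penalized methods is soft-thresholding, sketched here as an illustration only. This is not the authors' implementation, and the gene names, loading values and penalty are hypothetical.

```python
def soft_threshold(loadings, penalty):
    """Shrink each loading toward zero by `penalty`.

    Loadings whose absolute value is at most `penalty` become exactly 0,
    which is what removes the corresponding variable from the component.
    """
    shrunk = []
    for w in loadings:
        magnitude = abs(w) - penalty
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if w > 0 else -magnitude)
    return shrunk


def selected_variables(names, loadings, penalty):
    """Return the names of variables whose loading survives thresholding."""
    shrunk = soft_threshold(loadings, penalty)
    return [name for name, w in zip(names, shrunk) if w != 0.0]


# Hypothetical loading vector for four genes (illustrative values only).
genes = ["geneA", "geneB", "geneC", "geneD"]
loadings = [0.80, -0.05, 0.10, -0.60]

print(selected_variables(genes, loadings, penalty=0.2))
```

A larger penalty zeroes more loadings, so fewer variables are kept on the component; in practice the penalty would be tuned rather than fixed by hand.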


Proceedings of the National Academy of Sciences of the United States of America | 2012

Conservation and divergence in Toll-like receptor 4-regulated gene expression in primary human versus mouse macrophages

Kate Schroder; Katharine M. Irvine; Martin S. Taylor; Nilesh J. Bokil; Kim-Anh Lê Cao; Kelly-Anne Masterman; Larisa I. Labzin; Colin A. Semple; Ronan Kapetanovic; Lynsey Fairbairn; Altuna Akalin; Geoffrey J. Faulkner; John Kenneth Baillie; Milena Gongora; Carsten O. Daub; Hideya Kawaji; Geoffrey J. McLachlan; Nick Goldman; Sean M. Grimmond; Piero Carninci; Harukazu Suzuki; Yoshihide Hayashizaki; Boris Lenhard; David A. Hume; Matthew J. Sweet

Evolutionary change in gene expression is generally considered to be a major driver of phenotypic differences between species. We investigated innate immune diversification by analyzing interspecies differences in the transcriptional responses of primary human and mouse macrophages to the Toll-like receptor (TLR)–4 agonist lipopolysaccharide (LPS). By using a custom platform permitting cross-species interrogation coupled with deep sequencing of mRNA 5′ ends, we identified extensive divergence in LPS-regulated orthologous gene expression between humans and mice (24% of orthologues were identified as “divergently regulated”). We further demonstrate concordant regulation of human-specific LPS target genes in primary pig macrophages. Divergently regulated orthologues were enriched for genes encoding cellular “inputs” such as cell surface receptors (e.g., TLR6, IL-7Rα) and functional “outputs” such as inflammatory cytokines/chemokines (e.g., CCL20, CXCL13). Conversely, intracellular signaling components linking inputs to outputs were typically concordantly regulated. Functional consequences of divergent gene regulation were confirmed by showing LPS pretreatment boosts subsequent TLR6 responses in mouse but not human macrophages, in keeping with mouse-specific TLR6 induction. Divergently regulated genes were associated with a large dynamic range of gene expression, and specific promoter architectural features (TATA box enrichment, CpG island depletion). Surprisingly, regulatory divergence was also associated with enhanced interspecies promoter conservation. Thus, the genes controlled by complex, highly conserved promoters that facilitate dynamic regulation are also the most susceptible to evolutionary change.


Science Translational Medicine | 2015

Citrullinated peptide dendritic cell immunotherapy in HLA risk genotype-positive rheumatoid arthritis patients.

Helen Benham; Hendrik J. Nel; Soi Cheng Law; Ahmed M. Mehdi; Shayna Street; Nishta Ramnoruth; Helen Pahau; Bernett Lee; Jennifer Ng; Marion E. Brunck; Claire Hyde; Leendert A. Trouw; Nadine L. Dudek; Anthony W. Purcell; Brendan J. O'Sullivan; John Connolly; Sanjoy K. Paul; Kim-Anh Lê Cao; Ranjeny Thomas

Citrullinated peptide-exposed DCs induced immune regulatory effects in HLA risk genotype–positive RA patients.

Immunotherapy out of joint

Autoantibodies to anti–citrullinated peptides (ACPA) are found in most patients with rheumatoid arthritis (RA), especially those with HLA-DRB1 risk alleles. Benham et al. report a first-in-human phase 1 trial of a single injection of autologous dendritic cells modified with an NF-κB inhibitor that have been exposed to four citrullinated peptide antigens. They find that HLA risk genotype–positive RA patients had reduced numbers of effector T cells and decreased production of proinflammatory cytokines compared with untreated RA patient controls. The therapy was safe and did not induce disease flares. These data support larger studies of antigen-specific immunotherapy for RA.

In animals, immunomodulatory dendritic cells (DCs) exposed to autoantigen can suppress experimental arthritis in an antigen-specific manner. In rheumatoid arthritis (RA), disease-specific anti–citrullinated peptide autoantibodies (ACPA or anti-CCP) are found in the serum of about 70% of RA patients and are strongly associated with HLA-DRB1 risk alleles. This study aimed to explore the safety and biological and clinical effects of autologous DCs modified with a nuclear factor κB (NF-κB) inhibitor exposed to four citrullinated peptide antigens, designated “Rheumavax,” in a single-center, open-labeled, first-in-human phase 1 trial. Rheumavax was administered once intradermally at two progressive dose levels to 18 human leukocyte antigen (HLA) risk genotype–positive RA patients with citrullinated peptide–specific autoimmunity. Sixteen RA patients served as controls. Rheumavax was well tolerated: adverse events were grade 1 (of 4) severity. At 1 month after treatment, we observed a reduction in effector T cells and an increased ratio of regulatory to effector T cells; a reduction in serum interleukin-15 (IL-15), IL-29, CX3CL1, and CXCL11; and reduced T cell IL-6 responses to vimentin447–455–Cit450 relative to controls. Rheumavax did not induce disease flares in patients recruited with minimal disease activity, and DAS28 decreased within 1 month in Rheumavax-treated patients with active disease. This exploratory study demonstrates safety and biological activity of a single intradermal injection of autologous modified DCs exposed to citrullinated peptides, and provides rationale for further studies to assess clinical efficacy and antigen-specific effects of autoantigen immunomodulatory therapy in RA.


Biodata Mining | 2012

Visualising associations between paired ‘omics’ data sets

Ignacio González; Kim-Anh Lê Cao; Melissa J. Davis; Sébastien Déjean

Background: Each omics platform is now able to generate a large amount of data. Genomics, proteomics, metabolomics, interactomics are compiled at an ever increasing pace and now form a core part of the fundamental systems biology framework. Recently, several integrative approaches have been proposed to extract meaningful information. However, these approaches lack visualisation outputs to fully unravel the complex associations between different biological entities. Results: The multivariate statistical approaches ‘regularized Canonical Correlation Analysis’ and ‘sparse Partial Least Squares regression’ were recently developed to integrate two types of highly dimensional ‘omics’ data and to select relevant information. Using the results of these methods, we propose to revisit a few graphical outputs to better understand the relationships between two ‘omics’ data sets and to better visualise the correlation structure between the different biological entities. These graphical outputs include Correlation Circle plots, Relevance Networks and Clustered Image Maps. We demonstrate the usefulness of such graphical outputs on several biological data sets and further assess their biological relevance using gene ontology analysis. Conclusions: Such graphical outputs are undoubtedly useful to aid the interpretation of these promising integrative analysis tools and will certainly help in addressing fundamental biological questions and understanding systems as a whole. Availability: The graphical tools described in this paper are implemented in the freely available R package mixOmics and in its associated web application.


Nature Methods | 2015

Quantitative gene profiling of long noncoding RNAs with targeted RNA sequencing

Michael B. Clark; Tim R. Mercer; Giovanni Bussotti; Tommaso Leonardi; Katelin Haynes; Joanna Crawford; Marion E. Brunck; Kim-Anh Lê Cao; Gethin P. Thomas; Wendy Y. Chen; Ryan J. Taft; Lars K. Nielsen; Anton J. Enright; John S. Mattick; Marcel E. Dinger

We compared quantitative RT-PCR (qRT-PCR), RNA-seq and capture sequencing (CaptureSeq) in terms of their ability to assemble and quantify long noncoding RNAs and novel coding exons across 20 human tissues. CaptureSeq was superior for the detection and quantification of genes with low expression, showed little technical variation and accurately measured differential expression. This approach expands and refines previous annotations and simultaneously generates an expression atlas.


PLOS Computational Biology | 2017

mixOmics: An R package for ‘omics feature selection and multiple data integration

Florian Rohart; Benoit Gautier; Amrit Singh; Kim-Anh Lê Cao

The advent of high throughput technologies has led to a wealth of publicly available ‘omics data coming from different sources, such as transcriptomics, proteomics and metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have been focusing on identifying small subsets of molecules (a ‘molecular signature’) to explain or predict biological conditions, but mainly for a single type of ‘omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension reduction and visualisation. By adopting a systems biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous ‘omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple ‘omics data or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analyses of ‘omics data available from the package.


Nucleic Acids Research | 2014

A fine-scale dissection of the DNA double-strand break repair machinery and its implications for breast cancer therapy

Chao Liu; Sriganesh Srihari; Kim-Anh Lê Cao; Georgia Chenevix-Trench; Peter T. Simpson; Mark A. Ragan; Kum Kum Khanna

DNA-damage response machinery is crucial to maintain the genomic integrity of cells, by enabling effective repair of even highly lethal lesions such as DNA double-strand breaks (DSBs). Defects in specific genes acquired through mutations, copy-number alterations or epigenetic changes can alter the balance of these pathways, triggering cancerous potential in cells. Selective killing of cancer cells by sensitizing them to further DNA damage, especially by induction of DSBs, therefore requires careful modulation of DSB-repair pathways. Here, we review the latest knowledge on the two DSB-repair pathways, homologous recombination and non-homologous end joining in human, describing in detail the functions of their components and the key mechanisms contributing to the repair. Such an in-depth characterization of these pathways enables a more mechanistic understanding of how cells respond to therapies, and suggests molecules and processes that can be explored as potential therapeutic targets. One such avenue that has shown immense promise is via the exploitation of synthetic lethal relationships, for which the BRCA1–PARP1 relationship is particularly notable. Here, we describe how this relationship functions and the manner in which cancer cells acquire therapy resistance by restoring their DSB repair potential.


BMC Bioinformatics | 2012

A novel approach for biomarker selection and the integration of repeated measures experiments from two assays

Benoit Liquet; Kim-Anh Lê Cao; Hakim Hocini; Rodolphe Thiébaut

Background: High throughput ‘omics’ experiments are usually designed to compare changes observed between different conditions (or interventions) and to identify biomarkers capable of characterizing each condition. We consider the complex structure of repeated measurements from different assays where different conditions are applied on the same subjects. Results: We propose a two-step analysis combining a multilevel approach and a multivariate approach to reveal separately the effects of conditions within subjects from the biological variation between subjects. The approach is extended to two-factor designs and to the integration of two matched data sets. It allows internal variable selection to highlight genes able to discriminate the net condition effect within subjects. A simulation study was performed to demonstrate the good performance of the multilevel multivariate approach compared to a classical multivariate method. The multilevel multivariate approach outperformed the classical multivariate approach with respect to the classification error rate and the selection of relevant genes. The approach was applied to an HIV-vaccine trial evaluating the response with gene expression and cytokine secretion. The discriminant multilevel analysis selected a relevant subset of genes while the integrative multilevel analysis highlighted clusters of genes and cytokines that were highly correlated across the samples. Conclusions: Our combined multilevel multivariate approach may help in finding signatures of vaccine effect and allows for a better understanding of immunological mechanisms activated by the intervention. The integrative analysis revealed clusters of genes that were associated with cytokine secretion. These clusters can be seen as gene signatures to predict future cytokine response. The approach is implemented in the R package mixOmics (http://cran.r-project.org/) with associated tutorials to perform the analysis.
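
The abstract above describes separating condition effects within subjects from the variation between subjects. For a single repeated factor, a simple way to do this is to subtract each subject's mean profile from that subject's measurements. The sketch below assumes this centering and is not taken from the paper or from mixOmics. The function name, subject labels and values are hypothetical.

```python
from collections import defaultdict


def within_subject_deviation(rows, subjects):
    """Remove between-subject variation by centering each subject's rows.

    rows     : list of measurement vectors (one list of floats per sample)
    subjects : list of subject identifiers, one per row
    Returns a new list of rows, each minus its subject's mean vector.
    """
    rows_by_subject = defaultdict(list)
    for row, subject in zip(rows, subjects):
        rows_by_subject[subject].append(row)

    # Per-subject mean of every variable (column-wise average).
    subject_means = {
        subject: [sum(column) / len(group) for column in zip(*group)]
        for subject, group in rows_by_subject.items()
    }

    return [
        [value - mean for value, mean in zip(row, subject_means[subject])]
        for row, subject in zip(rows, subjects)
    ]


# Two hypothetical subjects, each measured under two conditions, two variables.
subjects = ["s1", "s1", "s2", "s2"]
rows = [[10.0, 1.0], [12.0, 3.0], [20.0, 5.0], [24.0, 5.0]]

for deviation in within_subject_deviation(rows, subjects):
    print(deviation)
```

After centering, the large baseline difference between the two subjects is gone and only the change across conditions remains. The centered matrix could then be passed to a multivariate method as the second step.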

Collaboration


Dive into Kim-Anh Lê Cao's collaborations.

Top Co-Authors

Florian Rohart
University of Queensland

Benoit Gautier
University of Queensland

Alok K. Shah
University of Queensland

David C. Whiteman
QIMR Berghofer Medical Research Institute

Mikael Bodén
University of Queensland

Othmar Korn
University of Queensland

Christèle Robert-Granié
Institut national de la recherche agronomique