Guillemette Marot
Institut national de la recherche agronomique
Publication
Featured research published by Guillemette Marot.
Briefings in Bioinformatics | 2013
Marie-Agnès Dillies; Andrea Rau; Julie Aubert; Christelle Hennequet-Antier; Marine Jeanmougin; Nicolas Servant; Céline Keime; Guillemette Marot; David Castel; Jordi Estellé; Gregory Guernec; Bernd Jagla; Luc Jouneau; Denis Laloë; Caroline Le Gall; Brigitte Schaëffer; Stéphane Le Crom; Mickael Guedj; Florence Jaffrézic
During the last 3 years, a number of approaches for the normalization of RNA sequencing data have emerged in the literature, differing both in the type of bias adjustment and in the statistical strategy adopted. However, as data continue to accumulate, there has been no clear consensus on the appropriate normalization method to be used or the impact of a chosen method on the downstream analysis. In this work, we focus on a comprehensive comparison of seven recently proposed normalization methods for the differential analysis of RNA-seq data, with an emphasis on the use of varied real and simulated datasets involving different species and experimental designs to represent data characteristics commonly observed in practice. Based on this comparison study, we propose practical recommendations on the appropriate normalization method to be used and its impact on the differential analysis of RNA-seq data.
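As a rough illustration of what a normalization step does before differential analysis, the sketch below rescales raw counts so that every sample has the same library size (total-count scaling). The abstract above does not name the seven methods it compares, so this is only a simple, well-known example of the general idea and not the recommended method; the function name and the data layout (one list of per-gene counts per sample) are assumptions made for the example.

```python
def total_count_normalize(counts):
    """Scale each sample so that all samples share the mean library size.

    counts: list of samples, each a list of raw per-gene read counts
            (hypothetical layout chosen for this sketch).
    Returns a new list of samples with rescaled (float) counts.
    """
    library_sizes = [sum(sample) for sample in counts]
    if any(size == 0 for size in library_sizes):
        raise ValueError("every sample needs at least one read")
    mean_size = sum(library_sizes) / len(library_sizes)
    # Multiply each count by (mean library size / own library size).
    return [
        [count * mean_size / size for count in sample]
        for sample, size in zip(counts, library_sizes)
    ]


if __name__ == "__main__":
    raw = [[10, 30], [20, 60]]  # two samples, two genes; second sample sequenced twice as deep
    print(total_count_normalize(raw))
```

After scaling, the two samples in the usage example have identical profiles, which reflects that they differed only in sequencing depth.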
Bioinformatics | 2009
Guillemette Marot; Jean-Louis Foulley; Claus-Dieter Mayer; Florence Jaffrezic
Motivation: With the proliferation of microarray experiments and their availability in the public domain, the use of meta-analysis methods to combine results from different studies is increasing. In microarray experiments, where the sample size is often limited, meta-analysis offers the possibility to considerably increase the statistical power and give more accurate results. Results: A moderated effect size combination method was proposed and compared with other meta-analysis approaches. All methods were applied to real publicly available datasets on prostate cancer, and were compared in an extensive simulation study for various amounts of inter-study variability. Although the proposed moderated effect size combination improved on existing effect size approaches, the P-value combination was found to provide better sensitivity and better gene ranking than the other meta-analysis methods, while effect size methods were more conservative. Availability: An R package, metaMA, is available on CRAN.
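To make the idea of P-value combination concrete, here is a minimal sketch of one common approach, the weighted inverse normal (Stouffer-type) method, applied to a single gene. Whether this matches the exact procedure evaluated in the paper or implemented in metaMA is an assumption; the function name, the one-sided P-value convention and the sample-size weights are choices made for this example.

```python
from math import sqrt
from statistics import NormalDist


def combine_pvalues_inverse_normal(pvalues, sample_sizes):
    """Combine one-sided p-values for one gene across independent studies.

    Each p-value is turned into a z-score, the z-scores are averaged with
    weights sqrt(n_study / n_total), and the result is converted back to a
    p-value. Names and weighting scheme are illustrative assumptions.
    """
    if len(pvalues) != len(sample_sizes) or not pvalues:
        raise ValueError("need one sample size per p-value")
    if any(not 0.0 < p < 1.0 for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if any(n <= 0 for n in sample_sizes):
        raise ValueError("sample sizes must be positive")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    total = sum(sample_sizes)
    # Squared weights sum to 1, so the combined score is standard normal
    # under the null hypothesis when the studies are independent.
    weights = [sqrt(n / total) for n in sample_sizes]
    z_combined = sum(
        w * std_normal.inv_cdf(1.0 - p) for w, p in zip(weights, pvalues)
    )
    return 1.0 - std_normal.cdf(z_combined)


if __name__ == "__main__":
    # Two studies of equal size, each with modest evidence for the same gene.
    print(combine_pvalues_inverse_normal([0.05, 0.05], [10, 10]))
```

In the usage example the combined P-value is smaller than either input, which is the gain in power that motivates combining small studies.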
BMC Genomics | 2008
Benoit Guyonnet; Guillemette Marot; Jean-Louis Dacheux; Marie-José Mercat; Sandrine Schwob; Florence Jaffrézic; Jean-Luc Gatti
Background: Mammalian gamete production takes place in the testis, but when spermatozoa exit this organ they are immotile and infertile, although they have acquired a specialized and distinct morphology. Only after their transit through the epididymis do sperm gain motility and fertility. The epididymis is a crescent-shaped organ adjacent to the testis that can be divided into three gross morphological regions: head (caput), body (corpus) and tail (cauda). It contains a single long convoluted tubule, connected to the testis via the efferent ducts and joining the vas deferens in its caudal part. Results: In this study, the testis, the efferent ducts (vas efferens, VE), nine distinct successive epididymal segments and the deferent duct (vas deferens, VD) of four adult boars of known fertility were isolated and their mRNA extracted. The gene expression of each of these samples was analyzed using a pig generic 9 K nylon microarray (AGENAE program; GEO accession number: GPL3729) spotted with 8931 clones derived from normalized cDNA banks from different pig tissues including testis and epididymis. Differentially expressed transcripts were obtained with moderated t-tests and F-tests, and two data clustering algorithms, based either on partitioning around medoids (top-down PAM) or on hierarchical clustering (bottom-up HCL), were combined for class discovery and gene expression analysis. Tissue clustering defined seven transcriptomic units: testis, vas efferens and five epididymal transcriptomic units. Transcripts, by contrast, formed only four clusters related to the tissues. We then used a specific statistical method to identify genes specifically over-expressed (markers) in testis, VE or in each of the five transcriptomic units of the epididymis (including VD). The specific regional expression of some of these genes was further validated by PCR and Q-PCR. We also searched for specific pathways and functions using available gene ontology information. Conclusion: This study described for the first time the complete transcriptomes of the testis, the epididymis, the vas efferens and the vas deferens in the same species. It identified new genes, or genes not previously reported as over-expressed, in these boar tissues, as well as new control mechanisms. It highlights and fills the gap between studies done in rodents and humans, and provides tools that will be useful for further studies on the biochemical processes responsible for the formation and maintenance of the epididymal regionalization and the development of fertile spermatozoa.
Genetics Research | 2007
Florence Jaffrézic; Guillemette Marot; Séverine Degrelle; Isabelle Hue; Jean-Louis Foulley
The importance of variance modelling is now widely recognized for the analysis of microarray data. In particular, the power and accuracy of statistical tests for differential gene expression are highly dependent on variance modelling. The aim of this paper is to use a structural model on the variances, which includes a condition effect and a random gene effect, and to propose a simple estimation procedure for these parameters by working on the empirical variances. The proposed variance model was compared with various methods on both real and simulated data. It proved to be more powerful than the gene-by-gene analysis and more robust with respect to the number of false positives than the homogeneous variance model. It performed well compared with recently proposed approaches such as SAM and VarMixt even for a small number of replicates, and performed similarly to Limma. The main advantage of the structural model is that, thanks to the use of a linear mixed model on the logarithm of the variances, various factors of variation can easily be incorporated in the model, which is not the case for previously proposed empirical Bayes methods. It is also very fast to compute and is suited to the comparison of more than two conditions.
Genetics Selection Evolution | 2007
Florence Jaffrézic; Dirk-Jan de Koning; Paul J. Boettcher; Agnès Bonnet; Bart Buitenhuis; R. Closset; Sébastien Déjean; Céline Delmas; Johanne Detilleux; Peter Dovč; Mylène Duval; Jean-Louis Foulley; Jakob Hedegaard; Henrik Hornshøj; Ina Hulsegge; Luc Janss; Kirsty Jensen; Li Jiang; Miha Lavric; Kim-Anh Lê Cao; Mogens Sandø Lund; Roberto Malinverni; Guillemette Marot; Haisheng Nie; Wolfram Petzl; M.H. Pool; Christèle Robert-Granié; Magali San Cristobal; Evert M. van Schothorst; Hans-Joachim Schuberth
A large variety of methods has been proposed in the literature for microarray data analysis. The aim of this paper was to present techniques used by the EADGENE (European Animal Disease Genomics Network of Excellence) WP1.4 participants for data quality control, normalisation and statistical methods for the detection of differentially expressed genes in order to provide some more general data analysis guidelines. All the workshop participants were given a real data set obtained in an EADGENE funded microarray study looking at the gene expression changes following artificial infection with two different mastitis causing bacteria: Escherichia coli and Staphylococcus aureus. It was reassuring to see that most of the teams found the same main biological results. In fact, most of the differentially expressed genes were found for infection by E. coli between uninfected and 24 h challenged udder quarters. Very little transcriptional variation was observed for the bacteria S. aureus. Lists of differentially expressed genes found by the different research teams were, however, quite dependent on the method used, especially concerning the data quality control step. These analyses also emphasised a biological problem of cross-talk between infected and uninfected quarters which will have to be dealt with for further microarray studies.
Statistical Applications in Genetics and Molecular Biology | 2009
Guillemette Marot; Claus-Dieter Mayer
Motivation: Transcriptomic studies using microarray technology have become a standard tool in life sciences in the last decade. Nevertheless the cost of these experiments remains high and forces scientists to work with small sample sizes at the expense of statistical power. In many cases, little or no prior knowledge on the underlying variability is available, which would allow an accurate estimation of the number of samples (microarrays) required to answer a particular biological question of interest. We investigate sequential methods, also called group sequential or adaptive designs in the context of clinical trials, for microarray analysis. Through interim analyses at different stages of the experiment and application of a stopping rule a decision can be made as to whether more samples should be studied or whether the experiment has yielded enough information already. Results: The high dimensionality of microarray data facilitates the sequential approach. Since thousands of genes simultaneously contribute to the stopping decision, the marginal distribution of any single gene is nearly independent of the global stopping rule. For this reason, the interim analysis does not seriously bias the final p-values. We propose a meta-analysis approach to combining the results of the interim analyses at different stages. We consider stopping rules that are either based on the estimated number of true positives or on a sensitivity estimate and particularly discuss the difficulty of estimating the latter. We study this sequential method in an extensive simulation study and also apply it to several real data sets. The results show that applying sequential methods can reduce the number of microarrays without substantial loss of power. An R-package SequentialMA implementing the approach is available from the authors.
Genetics Selection Evolution | 2007
Dirk-Jan de Koning; Florence Jaffrézic; Mogens Sandø Lund; Michael Watson; C.E. Channing; Ina Hulsegge; M.H. Pool; Bart Buitenhuis; Jakob Hedegaard; Henrik Hornshøj; Li Jiang; Peter Sørensen; Guillemette Marot; Céline Delmas; Kim-Anh Lê Cao; Magali San Cristobal; Michael Denis Baron; Roberto Malinverni; Alessandra Stella; Ronald M. Brunner; Hans-Martin Seyfert; Kirsty Jensen; Daphné Mouzaki; David Waddington; Ángeles Jiménez-Marín; Mónica Pérez-Alegre; Eva Pérez-Reinado; R. Closset; Johanne Detilleux; Peter Dovč
Microarray analyses have become an important tool in animal genomics. While their use is becoming widespread, there is still a lot of ongoing research regarding the analysis of microarray data. In the context of a European Network of Excellence, 31 researchers representing 14 research groups from 10 countries performed and discussed the statistical analyses of real and simulated 2-colour microarray data that were distributed among participants. The real data consisted of 48 microarrays from a disease challenge experiment in dairy cattle, while the simulated data consisted of 10 microarrays from a direct comparison of two treatments (dye-balanced). While there was broader agreement with regards to methods of microarray normalisation and significance testing, there were major differences with regards to quality control. The quality control approaches varied from none, through using statistical weights, to omitting a large number of spots or omitting entire slides. Surprisingly, these very different approaches gave quite similar results when applied to the simulated data, although not all participating groups analysed both real and simulated data. The workshop was very successful in facilitating interaction between scientists with a diverse background but a common interest in microarray analyses.
Computational Statistics & Data Analysis | 2009
Guillemette Marot; Jean-Louis Foulley; Florence Jaffrézic
Time-course microarray studies require a particular modelling of covariance matrices when measures are repeated on the same individuals. Taking into account the within-subject correlation in the test statistics for differential gene expression, however, requires a large number of parameters when a gene-specific approach is used, which often results in a lack of power due to the small number of individuals usually considered in microarray experiments. Shrinkage approaches can improve this detection power in differential gene expression studies by reducing the number of parameters, while offering a good flexibility and a small rate of false positives. A natural extension of the shrinkage approach based on a structural mixed model to variance-covariance matrices is proposed. The structural model was used in three configurations to shrink (i) the eigenvalues in an eigenvalue/eigenvector decomposition, (ii) the innovation variances in a Cholesky decomposition, (iii) both the variances and correlation parameters of a gene-by-gene covariance matrix using a Fisher transformation. The proposed methods were applied both to a publicly available data set and to simulated data. They were found to perform well, compared to previously proposed empirical Bayesian approaches, and outperformed the gene-specific or common-covariance methods in many cases.
Reproduction in Domestic Animals | 2013
Damien Valour; Isabelle Hue; Séverine Degrelle; Sébastien Déjean; Guillemette Marot; Olivier Dubois; Guy Germain; P. Humblot; Andrew Ponter; Gilles Charpigny; Bénédicte Grimard
Genetics Selection Evolution | 2007
Michael Watson; Mónica Pérez-Alegre; Michael Denis Baron; Céline Delmas; Peter Dovč; Mylène Duval; Jean-Louis Foulley; Juan José Garrido-Pavón; Ina Hulsegge; Florence Jaffrézic; Ángeles Jiménez-Marín; Miha Lavric; Kim-Anh Lê Cao; Guillemette Marot; Daphné Mouzaki; M.H. Pool; Christèle Robert-Granié; Magali San Cristobal; Gwenola Tosser-Klopp; David Waddington; Dirk-Jan de Koning