Publication


Featured research published by Jean-Michel Marin.


Bioinformatics | 2014

DIYABC v2.0: a software to make approximate Bayesian computation inferences about population history using single nucleotide polymorphism, DNA sequence and microsatellite data

Jean-Marie Cornuet; Pierre Pudlo; Julien Veyssier; Alexandre Dehne-Garcia; Mathieu Gautier; Raphaël Leblois; Jean-Michel Marin; Arnaud Estoup

MOTIVATION: DIYABC is a software package for a comprehensive analysis of population history using approximate Bayesian computation on DNA polymorphism data. Version 2.0 implements a number of new features and analytical methods. It allows (i) the analysis of single nucleotide polymorphism data at a large number of loci, in addition to microsatellite and DNA sequence data, (ii) efficient Bayesian model choice using linear discriminant analysis on summary statistics and (iii) the serial launching of multiple post-processing analyses. DIYABC v2.0 also includes a user-friendly graphical interface with various new options. It can be run on three operating systems: GNU/Linux, Microsoft Windows and Apple OS X. AVAILABILITY: Freely available to academic users, with a detailed notice document and example projects, at http://www1.montpellier.inra.fr/CBGP/diyabc CONTACT: estoup@supagro.inra.fr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Statistics and Computing | 2012

Approximate Bayesian Computational methods

Jean-Michel Marin; Pierre Pudlo; Christian P. Robert; Robin J. Ryder

Approximate Bayesian Computation (ABC) methods, also known as likelihood-free techniques, have appeared in the past ten years as the most satisfactory approach to intractable likelihood problems, first in genetics then in a broader spectrum of applications. However, these methods suffer to some degree from calibration difficulties that make them rather volatile in their implementation and thus render them suspicious to the users of more traditional Monte Carlo methods. In this survey, we study the various improvements and extensions brought to the original ABC algorithm in recent years.
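For readers unfamiliar with the basic algorithm that the survey builds on, the following is a minimal sketch of ABC rejection sampling, not code from the paper. The Gaussian toy model, the uniform prior, the sample-mean summary statistic and the tolerance value are all illustrative assumptions; the tolerance is exactly the kind of calibration choice the abstract refers to.

```python
import random
import statistics


def abc_rejection(observed, prior_sample, simulate, summary, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lies within `tolerance` of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                     # draw a parameter from the prior
        pseudo = simulate(theta, len(observed), rng)  # simulate a pseudo-data set
        if abs(summary(pseudo) - s_obs) <= tolerance:
            accepted.append(theta)                    # likelihood is never evaluated
    return accepted


# Hypothetical toy problem: unknown mean of a unit-variance Gaussian.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
posterior_draws = abc_rejection(
    observed,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    simulate=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    summary=statistics.fmean,
    tolerance=0.1,
    n_draws=20000,
    rng=rng,
)
abc_mean = statistics.fmean(posterior_draws)
```

A smaller tolerance brings the accepted draws closer to the exact posterior but lowers the acceptance rate, which is the trade-off behind the calibration difficulties mentioned above.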


Handbook of Statistics | 2005

Bayesian modelling and inference on mixtures of distributions

Jean-Michel Marin; Kerrie Mengersen; Christian P. Robert

Publisher Summary: Mixture distributions comprise a finite or infinite number of components, possibly of different distributional types, that can describe different features of data. The Bayesian paradigm allows for probability statements to be made directly about the unknown parameters, prior or expert opinion to be included in the analysis, and hierarchical descriptions of both local-scale and global features of the model. This chapter aims to introduce the prior modeling, estimation, and evaluation of mixture distributions in a Bayesian paradigm. The chapter shows that mixture distributions provide a flexible, parametric framework for statistical modeling and analysis. Focus is on the methods rather than advanced examples, in the hope that an understanding of the practical aspects of such modeling can be carried into many disciplines. It also points out the fundamental difficulty in doing inference with such objects, along with a discussion about prior modeling, which is more restrictive than usual, and the construction of estimators, which is also more involved than the standard posterior mean solution. Finally, this chapter gives some pointers to related models and problems such as mixtures of regressions and hidden Markov models, as well as Dirichlet priors.


Nature Structural & Molecular Biology | 2012

Unraveling cell type-specific and reprogrammable human replication origin signatures associated with G-quadruplex consensus motifs.

Emilie Besnard; Amélie Babled; Laure Lapasset; Ollivier Milhavet; Hugues Parrinello; Christelle Le Dantec; Jean-Michel Marin; Jean-Marc Lemaitre

DNA replication is highly regulated, ensuring faithful inheritance of genetic information through each cell cycle. In metazoans, this process is initiated at many thousands of DNA replication origins whose cell type–specific distribution and usage are poorly understood. We exhaustively mapped the genome-wide location of replication origins in human cells using deep sequencing of short nascent strands and identified ten times more origin positions than we expected; most of these positions were conserved in four different human cell lines. Furthermore, we identified a consensus G-quadruplex–forming DNA motif that can predict the position of DNA replication origins in human cells, accounting for their distribution, usage efficiency and timing. Finally, we discovered a cell type–specific reprogrammable signature of cell identity that was revealed by specific efficiencies of conserved origin positions and not by the selection of cell type–specific subsets of origins.


Statistics and Computing | 2008

Adaptive importance sampling in general mixture classes

Olivier Cappé; Randal Douc; Arnaud Guillin; Jean-Michel Marin; Christian P. Robert

In this paper, we propose an adaptive algorithm that iteratively updates both the weights and component parameters of a mixture importance sampling density so as to optimise the performance of importance sampling, as measured by an entropy criterion. The method, called M-PMC, is shown to be applicable to a wide class of importance sampling densities, which includes in particular mixtures of multivariate Student t distributions. The performance of the proposed scheme is studied on both artificial and real examples, highlighting in particular the benefit of a novel Rao-Blackwellisation device which can be easily incorporated in the updating scheme.
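The adaptive scheme described in this abstract can be illustrated with a simplified one-dimensional sketch. It is written in the spirit of the method, not taken from the paper: the Gaussian components, the bimodal target, the variance floor and the exact form of the update (importance-weighted responsibilities used to refresh weights, means and standard deviations) are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def target_pdf(x):
    # Hypothetical bimodal target with known mean 0.3 * (-2) + 0.7 * 3 = 1.5.
    return 0.3 * normal_pdf(x, -2.0, 0.5) + 0.7 * normal_pdf(x, 3.0, 1.0)


def adaptive_mixture_is(target, alphas, means, sds, n_samples, n_iters, rng):
    """Iteratively adapt a Gaussian mixture proposal; return it with the last estimate of E[x]."""
    estimate = None
    for _ in range(n_iters):
        xs, ws, resp = [], [], []
        for _ in range(n_samples):
            d = rng.choices(range(len(alphas)), weights=alphas)[0]
            x = rng.gauss(means[d], sds[d])
            parts = [a * normal_pdf(x, m, s) for a, m, s in zip(alphas, means, sds)]
            q = sum(parts)                       # mixture proposal density at x
            xs.append(x)
            ws.append(target(x) / q)             # unnormalised importance weight
            resp.append([p / q for p in parts])  # component responsibilities
        total = sum(ws)
        ws = [w / total for w in ws]
        estimate = sum(w * x for w, x in zip(ws, xs))
        new_alphas, new_means, new_sds = [], [], []
        for d in range(len(alphas)):
            mass = max(sum(w * r[d] for w, r in zip(ws, resp)), 1e-12)
            mean = sum(w * r[d] * x for w, r, x in zip(ws, resp, xs)) / mass
            var = sum(w * r[d] * (x - mean) ** 2 for w, r, x in zip(ws, resp, xs)) / mass
            new_alphas.append(mass)
            new_means.append(mean)
            new_sds.append(math.sqrt(max(var, 1e-6)))  # floor avoids a collapsed component
        norm = sum(new_alphas)
        alphas = [a / norm for a in new_alphas]
        means, sds = new_means, new_sds
    return alphas, means, sds, estimate


rng = random.Random(1)
alphas, means, sds, estimate = adaptive_mixture_is(
    target_pdf, [0.5, 0.5], [-1.0, 1.0], [3.0, 3.0], n_samples=2000, n_iters=10, rng=rng
)
```

Using the responsibilities of every component for each draw, rather than only the index of the component that generated it, is the Rao-Blackwellisation idea the abstract alludes to.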


Annals of Statistics | 2007

Convergence of Adaptive Sampling Schemes

Randal Douc; Arnaud Guillin; Jean-Michel Marin; Christian P. Robert

In the design of efficient simulation algorithms, one is often beset with a poor choice of proposal distributions. Although the performance of a given simulation kernel can clarify a posteriori how adequate this kernel is for the problem at hand, a permanent on-line modification of kernels causes concerns about the validity of the resulting algorithm. While the issue is most often intractable for MCMC algorithms, the equivalent version for importance sampling algorithms can be validated quite precisely. We derive sufficient convergence conditions for adaptive mixtures of population Monte Carlo algorithms and show that Rao-Blackwellized versions asymptotically achieve an optimum in terms of a Kullback divergence criterion, while more rudimentary versions do not benefit from repeated updating.


Bayesian Analysis | 2009

ABC likelihood-free methods for model choice in Gibbs random fields

Aude Grelaud; Christian P. Robert; Jean-Michel Marin; François Rodolphe; Jean-François Taly

Gibbs random fields are polymorphous statistical models that can be used to analyse different types of dependence, in particular for spatially correlated data. However, when those models are faced with the challenge of selecting a dependence structure from many, the use of standard model choice methods is hampered by the unavailability of the normalising constant in the Gibbs likelihood. In particular, from a Bayesian perspective, the computation of the posterior probabilities of the models under competition requires special likelihood-free simulation techniques like the Approximate Bayesian Computation (ABC) algorithm that is intensively used in population genetics. We show in this paper how to implement an ABC algorithm geared towards model choice in the general setting of Gibbs random fields, demonstrating in particular that there exists a sufficient statistic across models. The accuracy of the approximation to the posterior probabilities can be further improved by importance sampling on the distribution of the models. The practical aspects of the method are detailed through two applications: the test of an iid Bernoulli model versus a first-order Markov chain, and the choice of a folding structure for a protein of Thermotoga maritima involved in signal transduction processes.
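The first application mentioned here, an iid Bernoulli model versus a first-order Markov chain, lends itself to a small sketch of ABC model choice. This is an illustration under stated assumptions, not the paper's implementation: the parametrisation of both models, the uniform priors, the tolerance and the hard-coded observed sequence are hypothetical. In this toy parametrisation the pair (number of ones, number of repeated neighbours) is sufficient for both models, which mirrors the cross-model sufficiency the abstract highlights.

```python
import random


def summaries(seq):
    """Number of ones and number of positions equal to their predecessor."""
    ones = sum(seq)
    repeats = sum(1 for a, b in zip(seq, seq[1:]) if a == b)
    return ones, repeats


def simulate_iid(p, n, rng):
    return [1 if rng.random() < p else 0 for _ in range(n)]


def simulate_markov(stay, n, rng):
    seq = [rng.randrange(2)]  # uniform initial state (an assumption)
    for _ in range(n - 1):
        seq.append(seq[-1] if rng.random() < stay else 1 - seq[-1])
    return seq


def abc_model_choice(observed, tolerance, n_draws, rng):
    """Return approximate posterior probabilities [P(iid), P(Markov)], or None if nothing is accepted."""
    s_obs = summaries(observed)
    accepted = [0, 0]
    for _ in range(n_draws):
        model = rng.randrange(2)  # uniform prior over the two models
        theta = rng.random()      # uniform prior on the single parameter
        simulate = simulate_markov if model == 1 else simulate_iid
        s_sim = summaries(simulate(theta, len(observed), rng))
        if abs(s_sim[0] - s_obs[0]) + abs(s_sim[1] - s_obs[1]) <= tolerance:
            accepted[model] += 1
    total = sum(accepted)
    return [a / total for a in accepted] if total else None


# Hypothetical observed sequence with long runs: 50 ones, 96 repeated neighbours.
observed = ([0] * 25 + [1] * 25) * 2
probs = abc_model_choice(observed, tolerance=10, n_draws=20000, rng=random.Random(2))
```

The model probabilities are simply the acceptance frequencies of each model index, so their accuracy depends on the tolerance and on how many simulations are accepted.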


Molecular Ecology Resources | 2012

Estimation of demo-genetic model probabilities with Approximate Bayesian Computation using linear discriminant analysis on summary statistics

Arnaud Estoup; Eric Lombaert; Jean-Michel Marin; Thomas Guillemaud; Pierre Pudlo; Christian P. Robert; Jean-Marie Cornuet

Comparison of demo-genetic models using Approximate Bayesian Computation (ABC) is an active research field. Although large numbers of populations and models (i.e. scenarios) can be analysed with ABC using molecular data obtained from various marker types, methodological and computational issues arise when these numbers become too large. Moreover, Robert et al. (Proceedings of the National Academy of Sciences of the United States of America, 2011, 108, 15112) have shown that the conclusions drawn on ABC model comparison cannot be trusted per se and require additional simulation analyses. Monte Carlo inferential techniques to empirically evaluate confidence in scenario choice are very time-consuming, however, when the numbers of summary statistics (Ss) and scenarios are large. We here describe a methodological innovation for efficient computation of ABC scenario probabilities using linear discriminant analysis (LDA) on Ss before computing logistic regression. We used simulated pseudo-observed data sets (pods) to assess the main features of the method (precision and computation time) in comparison with traditional probability estimation using raw (i.e. not LDA-transformed) Ss. We also illustrate the method on real microsatellite data sets produced to make inferences about the invasion routes of the coccinellid Harmonia axyridis. We found that scenario probabilities computed from LDA-transformed and raw Ss were strongly correlated. Type I and II errors were similar for both methods. The faster probability computation that we observed (speed gain around a factor of 100 for LDA-transformed Ss) substantially increases the ability of ABC practitioners to analyse large numbers of pods and hence provides a manageable way to empirically evaluate the power available to discriminate among a large set of complex scenarios.


Bioinformatics | 2016

Reliable ABC model choice via random forests

Pierre Pudlo; Jean-Michel Marin; Arnaud Estoup; Jean-Marie Cornuet; Mathieu Gautier; Christian P. Robert

MOTIVATION: Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. RESULTS: We propose a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with RF and postponing the approximation of the posterior probability of the selected model for a second stage also relying on RF. Compared with earlier implementations of ABC model choice, the ABC RF approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least 50) and (iv) it includes an approximation of the posterior probability of the selected model. The call to RF will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. AVAILABILITY AND IMPLEMENTATION: The proposed methodology is implemented in the R package abcrf available on CRAN. CONTACT: jean-michel.marin@umontpellier.fr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Electronic Journal of Statistics | 2009

Online data processing: comparison of Bayesian regularized particle filters

Roberto Casarin; Jean-Michel Marin

The aim of this paper is to compare three regularized particle filters in an online data processing context. We carry out the comparison in terms of hidden state filtering and parameter estimation, considering a Bayesian paradigm and a univariate stochastic volatility model. We discuss the use of an improper prior distribution in the initialization of the filtering procedure and show that the Regularized Auxiliary Particle Filter (R-APF) outperforms the Regularized Sequential Importance Sampling (R-SIS) and the Regularized Sampling Importance Resampling (R-SIR).
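To give a feel for what a regularized SIR filter does on a univariate stochastic volatility model, here is a minimal sketch rather than the paper's algorithm. It assumes the common form x_t = phi * x_{t-1} + sigma * eta_t, y_t = exp(x_t / 2) * eps_t with known parameters (the paper also estimates them), a bootstrap proposal, and a Gaussian kernel with a rule-of-thumb bandwidth for the regularization step; all of these are illustrative choices.

```python
import math
import random


def regularised_sir(ys, phi, sigma, n_particles, rng):
    """Filtered means of the log-volatility x_t, with kernel jitter after each resampling."""
    stat_sd = sigma / math.sqrt(1.0 - phi * phi)  # stationary sd, requires |phi| < 1
    particles = [rng.gauss(0.0, stat_sd) for _ in range(n_particles)]
    bandwidth_factor = (4.0 / (3.0 * n_particles)) ** 0.2  # rule-of-thumb, Gaussian kernel
    means = []
    for y in ys:
        # Propagate through the state equation (bootstrap proposal).
        particles = [phi * x + sigma * rng.gauss(0.0, 1.0) for x in particles]
        # Log-weights from the observation density N(0, exp(x)), up to a constant.
        logw = [-0.5 * x - 0.5 * y * y * math.exp(-x) for x in particles]
        top = max(logw)
        w = [math.exp(lw - top) for lw in logw]  # subtract the max for numerical stability
        total = sum(w)
        w = [wi / total for wi in w]
        mean = sum(wi * x for wi, x in zip(w, particles))
        var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, particles))
        means.append(mean)
        # Resample, then regularise by adding kernel noise scaled to the particle spread.
        h = bandwidth_factor * math.sqrt(var)
        resampled = rng.choices(particles, weights=w, k=n_particles)
        particles = [x + h * rng.gauss(0.0, 1.0) for x in resampled]
    return means


# Hypothetical usage on data simulated from the same model.
rng = random.Random(3)
phi, sigma = 0.95, 0.3
state, ys = 0.0, []
for _ in range(200):
    state = phi * state + sigma * rng.gauss(0.0, 1.0)
    ys.append(math.exp(state / 2.0) * rng.gauss(0.0, 1.0))
filtered = regularised_sir(ys, phi, sigma, n_particles=500, rng=rng)
```

The jitter step replaces the discrete resampled cloud by draws from a smooth kernel estimate, which limits particle impoverishment at the price of a small bias.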

Collaboration


Top Co-Authors

Pierre Pudlo, University of Montpellier

Jean-Marie Cornuet, Institut national de la recherche agronomique

Arnaud Guillin, Blaise Pascal University

Arnaud Estoup, Institut national de la recherche agronomique

Clotilde Napp, Paris Dauphine University

Elyès Jouini, Paris Dauphine University

Lionel Cucala, University of Montpellier

Nicolas Chopin, Paris Dauphine University