Mario dos Reis
Queen Mary University of London
Publications
Featured research published by Mario dos Reis.
Proceedings of the Royal Society B: Biological Sciences, 279 (1742), pp. 3491-3500 | 2012
Mario dos Reis; Jun Inoue; Masami Hasegawa; Robert J. Asher; Philip C. J. Donoghue; Ziheng Yang
The fossil record suggests a rapid radiation of placental mammals following the Cretaceous–Paleogene (K–Pg) mass extinction 65 million years ago (Ma); nevertheless, molecular time estimates, while highly variable, are generally much older. Early molecular studies suffer from inadequate dating methods, reliance on the molecular clock, and simplistic and over-confident interpretations of the fossil record. More recent studies have used Bayesian dating methods that circumvent those issues, but the use of limited data has led to large estimation uncertainties, precluding a decisive conclusion on the timing of mammalian diversifications. Here we use a powerful Bayesian method to analyse 36 nuclear genomes and 274 mitochondrial genomes (20.6 million base pairs), combined with robust but flexible fossil calibrations. Our posterior time estimates suggest that marsupials diverged from eutherians 168–178 Ma, and crown Marsupialia diverged 64–84 Ma. Placentalia diverged 88–90 Ma, and present-day placental orders (except Primates and Xenarthra) originated in a ∼20 Myr window (45–65 Ma) after the K–Pg extinction. Therefore we reject a pre K–Pg model of placental ordinal diversification. We suggest other infamous instances of mismatch between molecular and palaeontological divergence time estimates will be resolved with this same approach.
Cell | 2003
Guojun Sheng; Mario dos Reis; Claudio D. Stern
Gastrulation generates mesoderm and endoderm from embryonic epiblast; soon after, the neural plate is established within the epiblast. Both events require FGF signaling. We describe a zinc finger transcriptional activator, Churchill (ChCh), which acts as a switch between different roles of FGF. FGF induces ChCh slowly; this activates Smad-interacting-protein-1 (Sip1), which blocks further induction of the mesoderm markers brachyury and Tbx6L by FGF. ChCh is first expressed as cells stop migrating through the primitive streak, and we show that it regulates cell ingression. We propose a simple mechanism by which FGF sensitizes cells to BMP signals. These results reveal that neural induction requires cessation of mesoderm formation at the midline in addition to the decision between epidermis and neural plate.
Molecular Biology and Evolution | 2011
Ziheng Yang; Mario dos Reis
The branch-site test is a likelihood ratio test to detect positive selection along prespecified lineages on a phylogeny that affects only a subset of codons in a protein-coding gene, with positive selection indicated by accelerated nonsynonymous substitutions (with ω = d(N)/d(S) > 1). This test may have more power than earlier methods, which average nucleotide substitution rates over sites in the protein and/or over branches on the tree. However, a few recent studies questioned the statistical basis of the test and claimed that the test generated too many false positives. In this paper, we examine the null distribution of the test and conduct a computer simulation to examine the false-positive rate and the power of the test. The results suggest that the asymptotic theory is reliable for typical data sets, and indeed in our simulations, the large-sample null distribution was reliable with as few as 20-50 codons in the alignment. We examined the impact of sequence length, the strength of positive selection, and the proportion of sites under positive selection on the power of the branch-site test. We found that the test was far more powerful in detecting episodic positive selection than branch-based tests, which average substitution rates over all codons in the gene and thus miss the signal when most codons are under strong selective constraint. Recent claims of statistical problems with the branch-site test are due to misinterpretations of simulation results. Our results, as well as previous simulation studies that have demonstrated the robustness of the test, suggest that the branch-site test may be a useful tool for detecting episodic positive selection and for generating biological hypotheses for mutation studies and functional analyses. The test is sensitive to sequence and alignment errors and caution should be exercised concerning its use when data quality is in doubt.
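As a rough illustration of the likelihood ratio test described in this abstract, the sketch below turns two maximized log-likelihoods into a test statistic and p-values. It assumes the large-sample null distribution is a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom, which the abstract does not spell out; the plain chi-square p-value is returned as well because it is the more conservative choice. The function names and the example log-likelihood values are hypothetical.

```python
import math
from typing import Tuple


def chi2_1_sf(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x <= 0.0:
        return 1.0
    # If Z is standard normal, P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def branch_site_lrt(lnl_null: float, lnl_alt: float) -> Tuple[float, float, float]:
    """Return (statistic, p_mixture, p_chi2_1) for a nested model comparison.

    lnl_null: maximized log-likelihood with omega fixed at 1 on the foreground branch.
    lnl_alt:  maximized log-likelihood with omega free to exceed 1.
    The mixture null is an ASSUMPTION of this sketch, not taken from the abstract.
    """
    # Numerical optimizers can return a slightly negative difference; clamp at zero.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_chi2 = chi2_1_sf(stat)
    # Under the 50:50 mixture, half of the null mass sits exactly at zero.
    p_mix = 0.5 * p_chi2 if stat > 0.0 else 1.0
    return stat, p_mix, p_chi2


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p_mix, p_chi2 = branch_site_lrt(-1000.0, -998.0)
    print(f"2*delta_lnL = {stat:.3f}, p (mixture) = {p_mix:.4f}, p (chi2_1) = {p_chi2:.4f}")
```

In practice the two log-likelihoods would come from fitting the null and alternative branch-site models to the same alignment and tree.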
Current Biology | 2015
Mario dos Reis; Yuttapong Thawornwattana; Konstantinos Angelis; Maximilian J. Telford; Philip C. J. Donoghue; Ziheng Yang
The timing of divergences among metazoan lineages is integral to understanding the processes of animal evolution, placing the biological events of species divergences into the correct geological timeframe. Recent fossil discoveries and molecular clock dating studies have suggested a divergence of bilaterian phyla >100 million years before the Cambrian, when the first definite crown-bilaterian fossils occur. Most previous molecular clock dating studies, however, have suffered from limited data and biases in methodologies, and virtually all have failed to acknowledge the large uncertainties associated with the fossil record of early animals, leading to inconsistent estimates among studies. Here we use an unprecedented amount of molecular data, combined with four fossil calibration strategies (reflecting disparate and controversial interpretations of the metazoan fossil record) to obtain Bayesian estimates of metazoan divergence times. Our results indicate that the uncertain nature of ancient fossils and violations of the molecular clock impose a limit on the precision that can be achieved in estimates of ancient molecular timescales. For example, although we can assert that crown Metazoa originated during the Cryogenian (with most crown-bilaterian phyla diversifying during the Ediacaran), it is not possible with current data to pinpoint the divergence events with sufficient accuracy to test for correlations between geological and biological events in the history of animals. Although a Cryogenian origin of crown Metazoa agrees with current geological interpretations, the divergence dates of the bilaterians remain controversial. Thus, attempts to build evolutionary narratives of early animal evolution based on molecular clock timescales appear to be premature.
PLOS Computational Biology | 2009
Asif U. Tamuri; Mario dos Reis; Alan J. Hay; Richard A. Goldstein
The natural reservoir of Influenza A is waterfowl. Normally, waterfowl viruses are not adapted to infect and spread in the human population. Sometimes, through reassortment or through whole host shift events, genetic material from waterfowl viruses is introduced into the human population causing worldwide pandemics. Identifying which mutations allow viruses from avian origin to spread successfully in the human population is of great importance in predicting and controlling influenza pandemics. Here we describe a novel approach to identify such mutations. We use a sitewise non-homogeneous phylogenetic model that explicitly takes into account differences in the equilibrium frequencies of amino acids in different hosts and locations. We identify 172 amino acid sites with strong support and 518 sites with moderate support of different selection constraints in human and avian viruses. The sites that we identify provide an invaluable resource to experimental virologists studying adaptation of avian flu viruses to the human host. Identification of the sequence changes necessary for host shifts would help us predict the pandemic potential of various strains. The method is of broad applicability to investigating changes in selective constraints when the timing of the changes is known.
Molecular Biology and Evolution | 2011
Mario dos Reis; Ziheng Yang
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
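The idea of replacing the exact likelihood with a Taylor expansion around its peak can be sketched on a toy problem. The example below is not the authors' implementation: it assumes a single pairwise distance under the JC69 model, uses a numerical second derivative, and shows only the untransformed and logarithm-transformed expansions (the arcsine transform mentioned in the abstract is not reproduced because its exact form is not given there). All function names and the example counts are hypothetical.

```python
import math


def jc69_lnl(t: float, n: int, x: int) -> float:
    """JC69 log-likelihood (up to an additive constant) of distance t,
    for two aligned sequences of n sites that differ at x sites."""
    p = 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))  # probability that a site differs
    return x * math.log(p) + (n - x) * math.log(1.0 - p)


def jc69_mle(n: int, x: int) -> float:
    """Closed-form maximum likelihood distance; requires x / n < 0.75."""
    return -0.75 * math.log(1.0 - 4.0 * x / (3.0 * n))


def second_derivative(f, t: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


def make_approximations(n: int, x: int):
    """Build second-order Taylor approximations of the log-likelihood at the MLE."""
    f = lambda t: jc69_lnl(t, n, x)
    t_hat = jc69_mle(n, x)
    l_hat = f(t_hat)
    hess_t = second_derivative(f, t_hat)
    # With u = log(t) the gradient is still zero at the peak, so the curvature
    # in u is the curvature in t multiplied by (dt/du)^2 = t_hat^2.
    hess_u = hess_t * t_hat * t_hat

    def taylor_plain(t: float) -> float:
        return l_hat + 0.5 * hess_t * (t - t_hat) ** 2

    def taylor_log(t: float) -> float:
        return l_hat + 0.5 * hess_u * (math.log(t) - math.log(t_hat)) ** 2

    return t_hat, f, taylor_plain, taylor_log


if __name__ == "__main__":
    # Illustrative counts only: 100 differences in 1000 sites.
    t_hat, exact, plain, logged = make_approximations(1000, 100)
    for scale in (0.5, 0.9, 1.0, 1.1, 2.0):
        t = scale * t_hat
        print(f"t={t:.4f} exact={exact(t):.3f} plain={plain(t):.3f} log={logged(t):.3f}")
```

Running the script prints the exact and approximate log-likelihoods at several distances, which makes it easy to see how each approximation behaves as the proposed value moves away from the peak.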
Nature Reviews Genetics | 2016
Mario dos Reis; Philip C. J. Donoghue; Ziheng Yang
Five decades have passed since the proposal of the molecular clock hypothesis, which states that the rate of evolution at the molecular level is constant through time and among species. This hypothesis has become a powerful tool in evolutionary biology, making it possible to use molecular sequences to estimate the geological ages of species divergence events. With recent advances in Bayesian clock dating methodology and the explosive accumulation of genetic sequence data, molecular clock dating has found widespread applications, from tracking virus pandemics and studying the macroevolutionary process of speciation and extinction to estimating a timescale for life on Earth.
Molecular Biology and Evolution | 2009
Mario dos Reis; Lorenz Wernisch
Natural selection on codon usage is a pervasive force that acts on a large variety of prokaryotic and eukaryotic genomes. Despite this, obtaining reliable estimates of selection on codon usage has proved complicated, perhaps due to the fact that the selection coefficients involved are very small. In this work, a population genetics model is used to measure the strength of selected codon usage bias, S, in 10 eukaryotic genomes. It is shown that the strength of selection is closely linked to expression and that reliable estimates of selection coefficients can only be obtained for genes with very similar expression levels. We compare the strength of selected codon usage for orthologous genes across all 10 genomes classified according to expression categories. Fungi genomes present the largest S values (2.24–2.56), whereas multicellular invertebrate and plant genomes present more moderate values (0.61–1.91). The large mammalian genomes (human and mouse) show low S values (0.22–0.51) for the most highly expressed genes. This might not be evidence for selection in these organisms as the technique used here to estimate S does not properly account for nucleotide composition heterogeneity along such genomes. The relationship between estimated S values and empirical estimates of population size is presented here for the first time. It is shown, as theoretically expected, that population size has an important role in the operativity of translational selection.
Genome Biology and Evolution | 2016
James E. Tarver; Mario dos Reis; Siavash Mirarab; Raymond J. Moran; Sean Parker; Joseph E. O’Reilly; Benjamin L. King; Mary J. O’Connell; Robert J. Asher; Tandy J. Warnow; Kevin J. Peterson; Philip C. J. Donoghue; Davide Pisani
Placental mammals comprise three principal clades: Afrotheria (e.g., elephants and tenrecs), Xenarthra (e.g., armadillos and sloths), and Boreoeutheria (all other placental mammals), the relationships among which are the subject of controversy and a touchstone for debate on the limits of phylogenetic inference. Previous analyses have found support for all three hypotheses, leading some to conclude that this phylogenetic problem might be impossible to resolve due to the compounded effects of incomplete lineage sorting (ILS) and a rapid radiation. Here we show, using a genome scale nucleotide data set, microRNAs, and the reanalysis of the three largest previously published amino acid data sets, that the root of Placentalia lies between Atlantogenata and Boreoeutheria. Although we found evidence for ILS in early placental evolution, we are able to reject previous conclusions that the placental root is a hard polytomy that cannot be resolved. Reanalyses of previous data sets recover Atlantogenata + Boreoeutheria and show that contradictory results are a consequence of poorly fitting evolutionary models; instead, when the evolutionary process is better-modeled, all data sets converge on Atlantogenata. Our Bayesian molecular clock analysis estimates that marsupials diverged from placentals 157–170 Ma, crown Placentalia diverged 86–100 Ma, and crown Atlantogenata diverged 84–97 Ma. Our results are compatible with placental diversification being driven by dispersal rather than vicariance mechanisms, postdating early phases in the protracted opening of the Atlantic Ocean.
Journal of Systematics and Evolution | 2012
Mario dos Reis; Ziheng Yang
Divergence time estimation using molecular sequence data relying on uncertain fossil calibrations is an unconventional statistical estimation problem. As the sequence data provide information about the distances only, estimation of absolute times and rates has to rely on information in the prior, so that the model is only semi‐identifiable. In this paper, we use a combination of mathematical analysis, computer simulation, and real data analysis to examine the uncertainty in posterior time estimates when the amount of sequence data increases. The analysis extends the infinite‐sites theory of Yang and Rannala, which predicts the posterior distribution of divergence times and rate when the amount of data approaches infinity. We found that the posterior credibility interval in general decreases and reaches a non‐zero limit when the data size increases. However, for the node with the most precise fossil calibration (as measured by the interval width divided by the mid value), sequence data do not really make the time estimate any more precise. We propose a finite‐sites theory which predicts that the square of the posterior interval width approaches its infinite‐data limit at the rate 1/n, where n is the sequence length. We suggest a procedure to partition the uncertainty of posterior time estimates into that due to uncertainties in fossil calibrations and that due to sampling errors in the sequence data. We evaluate the impact of conflicting fossil calibrations on posterior time estimation and point out that narrow credibility intervals or overly precise time estimates can be produced by conflicting or erroneous fossil calibrations.
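The finite-sites prediction in this abstract, that the squared posterior interval width approaches its infinite-data limit at rate 1/n, can be written as w_n^2 = w_inf^2 + c/n. The sketch below fits that relationship by least squares to extrapolate the limiting width. The linear form, the symbol names (w_inf, c), and the function name are assumptions made for illustration, and the input values are synthetic numbers generated from the formula itself, not results from the paper.

```python
import math
from typing import Sequence, Tuple


def fit_finite_sites(ns: Sequence[float], widths: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of w^2 = w_inf^2 + c / n.

    ns:     sequence lengths (at least two distinct values).
    widths: posterior interval widths observed at those lengths.
    Returns (w_inf, c): the extrapolated infinite-data width and the slope.
    """
    if len(ns) != len(widths) or len(ns) < 2:
        raise ValueError("need at least two (n, width) pairs of equal length")
    zs = [1.0 / n for n in ns]        # predictor: 1 / n
    ys = [w * w for w in widths]      # response: squared interval width
    z_bar = sum(zs) / len(zs)
    y_bar = sum(ys) / len(ys)
    szz = sum((z - z_bar) ** 2 for z in zs)
    if szz == 0.0:
        raise ValueError("sequence lengths must not all be equal")
    c = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys)) / szz
    w_inf_sq = y_bar - c * z_bar      # intercept: squared width as n -> infinity
    # Sampling noise can push the intercept slightly below zero; clamp before the root.
    return math.sqrt(max(w_inf_sq, 0.0)), c


if __name__ == "__main__":
    # Synthetic widths generated from the assumed formula, for illustration only.
    lengths = [1_000, 10_000, 100_000]
    synthetic = [math.sqrt(0.04 + 50.0 / n) for n in lengths]
    w_inf, c = fit_finite_sites(lengths, synthetic)
    print(f"extrapolated limiting width = {w_inf:.4f}, slope c = {c:.2f}")
```

Under this reading, the intercept reflects uncertainty that remains however much sequence data are added (attributed in the abstract to the fossil calibrations), while the c/n term reflects sampling error in the sequence data.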