Publication


Featured research published by Miguel Arenas.


Infection, Genetics and Evolution | 2015

Recombination in viruses: Mechanisms, methods of study, and evolutionary consequences

Marcos Pérez-Losada; Miguel Arenas; Juan Carlos Galán; Ferran Palero; Fernando González-Candelas

Recombination is a pervasive process generating diversity in most viruses. It joins variants that arise independently within the same molecule, creating new opportunities for viruses to overcome selective pressures and to adapt to new environments and hosts. Consequently, the analysis of viral recombination attracts the interest of clinicians, epidemiologists, molecular biologists and evolutionary biologists. In this review we present an overview of three major areas related to viral recombination: (i) the molecular mechanisms that underlie recombination in model viruses, including DNA-viruses (Herpesvirus) and RNA-viruses (Human Influenza Virus and Human Immunodeficiency Virus), (ii) the analytical procedures to detect recombination in viral sequences and to determine the recombination breakpoints, along with the conceptual and methodological tools currently used and a brief overview of the impact of new sequencing technologies on the detection of recombination, and (iii) the major areas in the evolutionary analysis of viral populations on which recombination has an impact. These include the evaluation of selective pressures acting on viral populations, the application of evolutionary reconstructions in the characterization of centralized genes for vaccine design, and the evaluation of linkage disequilibrium and population structure.


Molecular Biology and Evolution | 2014

Simulation of Genome-Wide Evolution under Heterogeneous Substitution Models and Complex Multispecies Coalescent Histories

Miguel Arenas; David Posada

Genomic evolution can be highly heterogeneous. Here, we introduce a new framework to simulate genome-wide sequence evolution under a variety of substitution models that may change along the genome and the phylogeny, following complex multispecies coalescent histories that can include recombination, demographics, longitudinal sampling, population subdivision/species history, and migration. A key aspect of our simulation strategy is that the heterogeneity of the whole evolutionary process can be parameterized according to statistical prior distributions specified by the user. We used this framework to carry out a study of the impact of variable codon frequencies across genomic regions on the estimation of the genome-wide nonsynonymous/synonymous ratio. We found that both variable codon frequencies across genes and rate variation among sites and regions can lead to severe underestimation of the global dN/dS values. The program SGWE—Simulation of Genome-Wide Evolution—is freely available from http://code.google.com/p/sgwe-project/, including extensive documentation and detailed examples.


Heredity | 2014

Coestimation of recombination, substitution and molecular adaptation rates by approximate Bayesian computation

Joao S. Lopes; Miguel Arenas; David Posada; Mark A. Beaumont

The estimation of parameters in molecular evolution may be biased when some processes are not considered. For example, the estimation of selection at the molecular level using codon-substitution models can have an upward bias when recombination is ignored. Here we address the joint estimation of recombination, molecular adaptation and substitution rates from coding sequences using approximate Bayesian computation (ABC). We describe the implementation of a regression-based strategy for choosing subsets of summary statistics for coding data, and show that this approach can accurately infer recombination allowing for intracodon recombination breakpoints, molecular adaptation and codon substitution rates. We demonstrate that our ABC approach can outperform other analytical methods under a variety of evolutionary scenarios. We also show that although the choice of the codon-substitution model is important, our inferences are robust to a moderate degree of model misspecification. In addition, we demonstrate that our approach can accurately choose the evolutionary model that best fits the data, providing an alternative for when the use of full-likelihood methods is impracticable. Finally, we applied our ABC method to co-estimate recombination, substitution and molecular adaptation rates from 24 published human immunodeficiency virus 1 coding data sets.
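
As a rough illustration of the general idea behind approximate Bayesian computation, the sketch below runs a plain rejection-ABC loop on a toy problem. It is not the method of the paper: the simulator, the single summary statistic (a count of differing sites), the uniform prior, the tolerance, and all function names are assumptions chosen for brevity, and the regression-based selection of summary statistics described in the abstract is not reproduced.

```python
import random


def simulate_differences(p, n_sites, rng):
    """Toy simulator (assumption): count sites that differ, each with probability p."""
    return sum(1 for _ in range(n_sites) if rng.random() < p)


def abc_rejection(observed, n_sites, n_draws=5000, tolerance=2, seed=1):
    """Rejection ABC: keep prior draws whose simulated summary
    statistic lies within `tolerance` of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw the parameter from a Uniform(0, 1) prior
        simulated = simulate_differences(p, n_sites, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


# Hypothetical data: 30 differing sites observed out of 100.
posterior = abc_rejection(observed=30, n_sites=100)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean ~ {posterior_mean:.2f}")
```

The accepted draws approximate the posterior distribution of the parameter; a smaller tolerance gives a closer approximation at the cost of fewer accepted draws.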


Molecular Biology and Evolution | 2015

Maximum likelihood phylogenetic inference with selection on protein folding stability

Miguel Arenas; Agustín Sánchez-Cobos; Ugo Bastolla

Despite intense work, incorporating constraints on protein native structures into the mathematical models of molecular evolution remains difficult, because most models and programs assume that protein sites evolve independently, whereas protein stability is maintained by interactions between sites. Here, we address this problem by developing a new mean-field substitution model that generates independent site-specific amino acid distributions with constraints on the stability of the native state against both unfolding and misfolding. The model depends on a background distribution of amino acids and one selection parameter that we fix maximizing the likelihood of the observed protein sequence. The analytic solution of the model shows that the main determinant of the site-specific distributions is the number of native contacts of the site and that the most variable sites are those with an intermediate number of native contacts. The mean-field models obtained, taking into account misfolded conformations, yield larger likelihood than models that only consider the native state, because their average hydrophobicity is more realistic, and they produce on the average stable sequences for most proteins. We evaluated the mean-field model with respect to empirical substitution models on 12 test data sets of different protein families. In all cases, the observed site-specific sequence profiles presented smaller Kullback-Leibler divergence from the mean-field distributions than from the empirical substitution model. Next, we obtained substitution rates combining the mean-field frequencies with an empirical substitution model. The resulting mean-field substitution model assigns larger likelihood than the empirical model to all studied families when we consider sequences with identity larger than 0.35, plausibly a condition that enforces conservation of the native structure across the family. 
We found that the mean-field model performs better than other structurally constrained models with similar or higher complexity. With respect to the much more complex model recently developed by Bordner and Mittelmann, which takes into account pairwise terms in the amino acid distributions and also optimizes the exchangeability matrix, our model performed worse for data with small sequence divergence but better for data with larger sequence divergence. The mean-field model has been implemented into the computer program Prot_Evol that is freely available at http://ub.cbm.uam.es/software/Prot_Evol.php.
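
The comparison of site-specific profiles by Kullback-Leibler divergence can be illustrated with a minimal sketch. The three-letter alphabet and all frequencies below are invented for illustration and are not taken from the paper or from Prot_Evol; the function name is likewise hypothetical.

```python
import math


def kl_divergence(observed, model):
    """Kullback-Leibler divergence D(observed || model) in nats.

    Both arguments map symbols to probabilities. Symbols with zero
    observed frequency contribute nothing to the sum.
    """
    return sum(
        p * math.log(p / model[symbol])
        for symbol, p in observed.items()
        if p > 0
    )


# Hypothetical site profile over a toy three-letter alphabet.
observed_profile = {"A": 0.7, "L": 0.2, "K": 0.1}
site_specific_model = {"A": 0.6, "L": 0.3, "K": 0.1}  # assumed site-specific frequencies
global_model = {"A": 0.3, "L": 0.3, "K": 0.4}         # assumed site-independent frequencies

d_site = kl_divergence(observed_profile, site_specific_model)
d_global = kl_divergence(observed_profile, global_model)
print(f"site-specific: {d_site:.3f} nats, global: {d_global:.3f} nats")
```

A smaller divergence means the model distribution is closer to the observed profile at that site; in this toy case the site-specific frequencies fit better than the global ones.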


Frontiers in Genetics | 2013

The importance and application of the ancestral recombination graph

Miguel Arenas

Recombination is one of the most important evolutionary forces: it increases genetic diversity and promotes adaptation through the exchange of genetic material, which shuffles existing mutations. Knowledge about recombination is, for example, fundamental to understand genome structure (Reich et al., 2001), phenotypic diversity (Zhang et al., 2002), and diverse genetic diseases (Daly et al., 2001). Indeed, recombination should be considered to properly study molecular evolution and perform phylogenetic inferences (e.g., Schierup and Hein, 2000; Anisimova et al., 2003; Arenas and Posada, 2010c). The evolutionary history of recombination is commonly represented by the ancestral recombination graph (ARG) (Griffiths and Marjoram, 1997); an illustrative example is shown in Figure 1. Counterintuitively, ARGs have not been widely used, perhaps because of the difficulty of inferring explicit ARGs and the complexity of the ARG representation. The aim of this general commentary is to describe the importance and application of the ARG.

Figure 1. Illustrative example of an ARG. RE indicates recombination events. Numbers in nodes indicate intervals of ancestral material. Note that each recombinant fragment (1–2, 3–6, and 7–9) has its own most recent common ancestor (MRCA), ...


Frontiers in Genetics | 2013

Computer Programs and Methodologies for the Simulation of DNA Sequence Data with Recombination

Miguel Arenas

Computer simulations are useful in evolutionary biology for hypothesis testing, to verify analytical methods, to analyze interactions among evolutionary processes, and to estimate evolutionary parameters. In particular, the simulation of DNA sequences with recombination may help in understanding the role of recombination in diverse evolutionary questions, such as the genome structure. Consequently, plenty of computer simulators have been developed to simulate DNA sequence data with recombination. However, the choice of an appropriate tool, among all currently available simulators, is critical if recombination simulations are to be biologically meaningful. This review provides a practical survival guide to commonly used computer programs and methodologies for the simulation of coding and non-coding DNA sequences with recombination. It may help in the correct design of computer simulation experiments of recombination. In addition, the study includes a review of simulation studies investigating the impact of ignoring recombination when performing various evolutionary analyses, such as phylogenetic tree and ancestral sequence reconstructions. Alternative analytical methodologies accounting for recombination are also reviewed.


Molecular Biology and Evolution | 2015

CodABC: A Computational Framework to Coestimate Recombination, Substitution, and Molecular Adaptation Rates by Approximate Bayesian Computation

Miguel Arenas; Joao S. Lopes; Mark A. Beaumont; David Posada

The estimation of substitution and recombination rates can provide important insights into the molecular evolution of protein-coding sequences. Here, we present a new computational framework, called “CodABC,” to jointly estimate recombination, substitution and synonymous and nonsynonymous rates from coding data. CodABC uses approximate Bayesian computation with and without regression adjustment and implements a variety of codon models, intracodon recombination, and longitudinal sampling. CodABC can provide accurate joint parameter estimates from recombining coding sequences, often outperforming maximum-likelihood methods based on more approximate models. In addition, CodABC allows for the inclusion of several nuisance parameters such as those representing codon frequencies, transition matrices, heterogeneity across sites or invariable sites. CodABC is freely available from http://code.google.com/p/codabc/, includes a GUI, extensive documentation and ready-to-use examples, and can run in parallel on multicore machines.


Molecular Phylogenetics and Evolution | 2016

Influence of mutation and recombination on HIV-1 in vitro fitness recovery.

Miguel Arenas; Ramon Lorenzo-Redondo; Cecilio López-Galíndez

Understanding the evolutionary processes underlying HIV-1 fitness recovery is fundamental for HIV-1 pathogenesis, antiretroviral treatment and vaccine design. It is known that HIV-1 can present very high mutation and recombination rates; however, the specific contributions of these evolutionary forces to in vitro viral fitness recovery have not been quantified simultaneously. To this end, we analyzed substitution, recombination and molecular adaptation rates in a variety of HIV-1 biological clones derived from a viral isolate after severe population bottlenecks and a number of large population cell culture passages. These clones presented an overall but uneven fitness gain, with a mean of 3-fold, relative to the initial passage values. We found a significant relationship between the fitness increase and the appearance and fixation of mutations. In addition, these fixed mutations presented molecular signatures of positive selection through the accumulation of non-synonymous substitutions. Interestingly, viral recombination correlated with fitness recovery in most of the studied viral quasispecies. The genetic diversity generated by these evolutionary processes was positively correlated with the viral fitness. We conclude that HIV-1 fitness recovery can be derived from the genetic heterogeneity generated through both mutation and recombination, and under diversifying molecular adaptation. The findings also suggest nonrandom evolutionary pathways for in vitro fitness recovery.


Current Genomics | 2014

Spatial and Temporal Simulation of Human Evolution. Methods, Frameworks and Applications

Macarena Benguigui; Miguel Arenas

Analyses of human evolution are fundamental to understand the current gradients of human diversity. In this regard, genetic samples collected from current populations together with archaeological data are the most important resources to study human evolution. However, they are often insufficient to properly evaluate a variety of evolutionary scenarios, leading to continuous debates and discussions. A commonly applied strategy is to use computer simulations based on evolutionary models that are as realistic as possible, in order to evaluate alternative evolutionary scenarios through statistical correlations with the real data. Computer simulations can also be applied to estimate evolutionary parameters or to study the role of each parameter in the evolutionary process. Here we review the most commonly used methods and evolutionary frameworks to perform realistic spatially explicit computer simulations of human evolution. Although we focus on human evolution, most of the methods and software we describe can also be used to study other species. We also describe the importance of considering spatially explicit models to better mimic human evolutionary scenarios based on a variety of phenomena such as range expansions, range shifts, range contractions, sex-biased dispersal, long-distance dispersal or admixtures of populations. We finally discuss future implementations to improve current spatially explicit simulations and their derived applications in human evolution.


Archive | 2019

The Influence of Protein Stability on Sequence Evolution: Applications to Phylogenetic Inference

Ugo Bastolla; Miguel Arenas

Phylogenetic inference from protein data is traditionally based on empirical substitution models of evolution that assume that protein sites evolve independently of each other and under the same substitution process. However, it is well known that the structural properties of a protein site in the native state affect its evolution, in particular the sequence entropy and the substitution rate. Starting from the seminal proposal by Halpern and Bruno, where structural properties are incorporated in the evolutionary model through site-specific amino acid frequencies, several models have been developed to tackle the influence of protein structure on sequence evolution. Here we describe stability-constrained substitution (SCS) models that explicitly consider the stability of the native state against both unfolded and misfolded states. One of them, the mean-field model, provides an independent sites approximation that can be readily incorporated in maximum likelihood methods of phylogenetic inference, including ancestral sequence reconstruction. Next, we describe its validation with simulated and real proteins and its limitations and advantages with respect to empirical models that lack site specificity. We finally provide guidelines and recommendations to analyze protein data accounting for stability constraints, including computer simulations and inferences of protein evolution based on maximum likelihood. Some practical examples are included to illustrate these procedures.

Collaboration


Dive into Miguel Arenas's collaborations.

Top Co-Authors

Angel Conde
Spanish National Research Council

E. Matykina
Spanish National Research Council

Juan J. de Damborenea
Spanish National Research Council

Ugo Bastolla
Spanish National Research Council

Joao S. Lopes
Instituto Gulbenkian de Ciência

Agustín Sánchez-Cobos
Spanish National Research Council

C. Domingo
Spanish National Research Council

David Martín y Marero
Autonomous University of Madrid