Publication


Featured research published by David Posada.


Bioinformatics | 1998

MODELTEST: testing the model of DNA substitution.

David Posada; Keith A. Crandall

SUMMARY: The program MODELTEST uses log likelihood scores to establish the model of DNA evolution that best fits the data. AVAILABILITY: The MODELTEST package, including the source code and some documentation, is available at http://bioag.byu.edu/zoology/crandall_lab/modeltest.html.


Nature Methods | 2012

jModelTest 2: more models, new heuristics and parallel computing

Diego Darriba; Guillermo L. Taboada; Ramón Doallo; David Posada

Supplementary information:
Supplementary Table 1. New features in jModelTest 2
Supplementary Table 2. Model selection accuracy
Supplementary Table 3. Mean square errors for model averaged estimates
Supplementary Note 1. Hill-climbing hierarchical clustering algorithm
Supplementary Note 2. Heuristic filtering
Supplementary Note 3. Simulations from prior distributions
Supplementary Note 4. Speed-up benchmark on real and simulated datasets


Systematic Biology | 2004

Model selection and model averaging in phylogenetics: Advantages of Akaike Information Criterion and Bayesian approaches over likelihood ratio tests

David Posada; Thomas R. Buckley

Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
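As a rough illustration of the AIC-based model averaging mentioned in this abstract, the sketch below computes AIC scores, Akaike weights, and a weighted parameter average. It is not code from jModelTest or MODELTEST. The log-likelihoods, parameter counts, and parameter estimates are made-up values, and the function names are hypothetical.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = -2 lnL + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_params

def akaike_weights(aic_scores):
    """Turn a dict of AIC scores into Akaike weights that sum to 1."""
    best = min(aic_scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in aic_scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

def model_averaged(estimates, weights):
    """Weighted average of a parameter over the models that include it."""
    w_sum = sum(weights[m] for m in estimates)
    return sum(weights[m] * v for m, v in estimates.items()) / w_sum

# Illustrative (invented) log-likelihoods and free-parameter counts.
candidates = {"JC": (-2100.0, 0), "HKY": (-2050.0, 4), "GTR": (-2048.0, 8)}
scores = {m: aic(lnl, k) for m, (lnl, k) in candidates.items()}
weights = akaike_weights(scores)
best_model = min(scores, key=scores.get)

# Invented estimates of a parameter shared by two of the models.
shape_estimates = {"HKY": 0.50, "GTR": 0.60}
averaged_shape = model_averaged(shape_estimates, weights)

print(best_model, {m: round(w, 4) for m, w in weights.items()})
print(round(averaged_shape, 4))
```

The weights express model selection uncertainty: a model with a small AIC difference from the best one still contributes noticeably to the averaged estimate.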


Bioinformatics | 2005

ProtTest: selection of best-fit models of protein evolution

Federico Abascal; Rafael Zardoya; David Posada

SUMMARY: Using an appropriate model of amino acid replacement is very important for the study of protein evolution and phylogenetic inference. We have built a tool for the selection of the best-fit model of evolution, among a set of candidate models, for a given protein sequence alignment. AVAILABILITY: ProtTest is available under the GNU license from http://darwin.uvigo.es


Trends in Ecology and Evolution | 2001

Intraspecific gene genealogies: trees grafting into networks

David Posada; Keith A. Crandall

Intraspecific gene evolution cannot always be represented by a bifurcating tree. Rather, population genealogies are often multifurcated, descendant genes coexist with persistent ancestors and recombination events produce reticulate relationships. Whereas traditional phylogenetic methods assume bifurcating trees, several networking approaches have recently been developed to estimate intraspecific genealogies that take into account these population-level phenomena.


Bioinformatics | 2011

ProtTest 3

Diego Darriba; Guillermo L. Taboada; Ramón Doallo; David Posada

SUMMARY: We have implemented a high-performance computing (HPC) version of ProtTest that can be executed in parallel in multicore desktops and clusters. This version, called ProtTest 3, includes new features and extended capabilities. AVAILABILITY: ProtTest 3 source code and binaries are freely available under GNU license for download from http://darwin.uvigo.es/software/prottest3, linked to a Mercurial repository at Bitbucket (https://bitbucket.org/). CONTACT: [email protected] SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Proceedings of the National Academy of Sciences of the United States of America | 2001

Evaluation of methods for detecting recombination from DNA sequences: Computer simulations

David Posada; Keith A. Crandall

Recombination is a key evolutionary process that shapes the architecture of genomes and the genetic structure of populations. Although many statistical methods are available for the detection of recombination from DNA sequences, their absolute and relative performance is still unknown. Here we evaluated the performance of 14 different recombination detection algorithms. We used the coalescent with recombination to simulate DNA sequences with different levels of recombination, genetic diversity, and rate variation among sites. Recombination detection methods were applied to these data sets, and whether they detected or not recombination was recorded. Different recombination methods showed distinct performance depending on the amount of recombination, genetic diversity, and rate variation among sites. The model of nucleotide substitution under which the data were generated did not seem to have a significant effect. Most methods increase power with more sequence divergence. In general, recombination detection methods seem to capture the presence of recombination, but they are not very powerful. Methods that use substitution patterns or incompatibility among sites were more powerful than methods based on phylogenetic incongruence. Most methods do not seem to infer more false positives than expected by chance. Especially depending on the amount of diversity in the data, different methods could be used to attain maximum power while minimizing false positives. Results shown here will provide some guidance in the selection of the most appropriate method/s for the analysis of the particular data at hand.


Molecular Ecology | 2000

GeoDis: a program for the cladistic nested analysis of the geographical distribution of genetic haplotypes

David Posada; Keith A. Crandall; Alan R. Templeton

The central focus of population genetics is the study of the distribution of the genetic variation within and among populations. This endeavour has often been accomplished by the use of genealogies upon which geographical information is incorporated in the search of association among genetic variation and geographical distribution (see Avise 1998). However, a particular population genetic structure can be the result of distinct processes acting in different points through time and space and may reflect historical rather than ongoing population level processes (Gerber & Templeton 1996). Templeton (1993) and Templeton et al. (1995) describe a methodology (cladistic nested analysis) in which population structure can be separated from population history when it is assessed through rigorous and objective statistical tests upon an estimated nested cladogram (see Templeton et al. 1992). GeoDis is a computer program that implements the cladistic nested analysis. The simplest test for geographical association is to treat sample locations as categorical variables. An exact permutational contingency test is performed for any clade at each nesting level. A chi-square statistic is calculated from the contingency tables in which rows are genetic clades and columns are geographical locations (see also software Chiperm, available at http://bioag.byu.edu/zoology/crandall_lab/programs.htm). A more elaborate analysis can also be carried out by using information on geographical distances. Using the geographical coordinates of each population, two main statistics are calculated: the clade distance (Dc), which measures the geographical spread of a clade, and the nested clade distance (Dn), which measures how a clade is geographically distributed relative to other clades in the same higher-level nesting category.
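The two distance statistics can be sketched as follows. This is an illustrative reading, not GeoDis code. It assumes that Dc is the mean great-circle distance of a clade's individuals from that clade's own geographical center, that Dn is their mean distance from the center of the nesting clade, and that a center is the count-weighted mean of latitude and longitude. All function names, coordinates, and counts are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(a, b):
    """Haversine distance between two (lat, lon) points given in decimal degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def center(counts, coords):
    """Count-weighted mean latitude and longitude (a simplifying assumption)."""
    n = sum(counts.values())
    lat = sum(c * coords[loc][0] for loc, c in counts.items()) / n
    lon = sum(c * coords[loc][1] for loc, c in counts.items()) / n
    return (lat, lon)

def mean_distance_to(point, counts, coords):
    """Mean distance of the sampled individuals to a reference point."""
    n = sum(counts.values())
    return sum(c * great_circle_km(coords[loc], point)
               for loc, c in counts.items()) / n

def clade_distances(focal, nesting, coords):
    """Return (Dc, Dn) for a focal clade inside its nesting clade."""
    dc = mean_distance_to(center(focal, coords), focal, coords)
    dn = mean_distance_to(center(nesting, coords), focal, coords)
    return dc, dn

# Invented sampling locations on the equator and invented haplotype counts.
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (0.0, 10.0)}
focal = {"A": 5}                       # focal clade found at one site only
nesting = {"A": 5, "B": 5, "C": 10}    # nesting clade includes the focal clade
dc, dn = clade_distances(focal, nesting, coords)
print(round(dc, 2), round(dn, 2))
```

In this toy case the focal clade is restricted to a single site, so its spread is zero while its distance from the center of the nesting clade is large.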
In the case of riparian or coastal species, or in the case of species with constrained dispersal routes, a matrix of pairwise distances among the different locations better describes their geographical distribution. The analogue statistics (Dcl and Dnl) are calculated as the average pairwise distances between members of the same focal clade and the average pairwise distances between members of the focal clade with all members of the nesting clade (including the focal clade). An interior-tip statistic (I-T) is also estimated within each nested category as the average interior distance minus the average tip distance. For the calculation of these averages, each clade distance is weighted by the number of copies in that focal clade relative to the total number of copies in the nesting clade. This tip vs. interior contrast corresponds to a young vs. old contrast and, to a lesser extent, rare vs. common (Crandall & Templeton 1993). If the haplotype tree is rooted, say by an out-group, the user can also specify which haplotype is the oldest by designating it as the ‘interior’, and regarding the younger haplotypes all as ‘tips’. When root probabilities or out-group weights for the cladogram are specified (Castelloe & Templeton 1994), the correlation of both distance measures with out-group weights within each nested category is also estimated. The significance of these statistics is estimated through a Monte Carlo procedure. Null distributions are constructed by randomizing the contingency data table for each clade and nesting level and estimating again the test statistics for each randomized data set. Matrix randomization is accomplished by using the algorithm of Roff & Bentzen (1989), which preserves the marginals of the table (clade frequencies and sample sizes), while permuting the individual cells. A minimum of 1000 random permutations is recommended to make statistical inference at the 5% level of significance (Edgington 1986).
The output of GeoDis consists of the calculated statistics and their associated permutational P-values. Templeton (1998) provides a key for the interpretation of these results in biological terms. GeoDis has been written both in C and Java and includes new features, such as weighted I-T statistics and the possibility of using user-defined distances. A previous version of the program written in VAX/VMS Basic exists (AR Templeton). The C program prompts the user for all the options needed to run the program. The Java program provides an interface where the user selects the input and output files, the number of permutations, the possibility of using out-group weights, decimal degrees, and/or user-defined distances. The input file consists of the population information plus the description of the nested cladogram. Details are given in the program documentation. The GeoDis package, containing executables for Macintosh, PC, and Unix machines, documentation, and source code in Java and C, is available for free from http://bioag.byu.edu/zoology/crandall_lab/programs.htm.
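The permutational contingency test described in this abstract can be sketched as below. This is a minimal illustration, not the GeoDis implementation. It assumes that the table is randomized by shuffling location labels among individuals, which keeps both row and column totals fixed, and that the P-value is the fraction of permuted tables whose chi-square is at least as large as the observed one. Function names and the example tables are hypothetical.

```python
import random

def chi_square(table):
    """Pearson chi-square statistic for a clades-by-locations count table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                stat += (obs - expected) ** 2 / expected
    return stat

def permutation_p_value(table, n_perm=1000, seed=0):
    """Return (observed chi-square, permutational P-value) with fixed margins."""
    rng = random.Random(seed)
    # Expand the table into one (clade, location) label pair per individual.
    rows = [i for i, row in enumerate(table) for c in row for _ in range(c)]
    cols = [j for row in table for j, c in enumerate(row) for _ in range(c)]
    observed = chi_square(table)
    n_rows, n_cols = len(table), len(table[0])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(cols)  # reassign locations; clade labels stay in place
        perm = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(rows, cols):
            perm[i][j] += 1
        if chi_square(perm) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_perm

# Invented tables: rows are clades, columns are sampling locations.
structured = [[10, 0], [0, 10]]   # each clade confined to one location
uniform = [[5, 5], [5, 5]]        # no geographical association
stat_s, p_s = permutation_p_value(structured)
stat_u, p_u = permutation_p_value(uniform)
print(stat_s, p_s, stat_u, p_u)
```

With 1000 permutations the smallest nonzero P-value this sketch can report is 0.001, which is one practical reason for the recommended minimum number of permutations.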


Systematic Biology | 2001

Selecting the Best-Fit Model of Nucleotide Substitution

David Posada; Keith A. Crandall

Despite the relevant role of models of nucleotide substitution in phylogenetics, choosing among different models remains a problem. Several statistical methods for selecting the model that best fits the data at hand have been proposed, but their absolute and relative performance has not yet been characterized. In this study, we compare under various conditions the performance of different hierarchical and dynamic likelihood ratio tests, and of Akaike and Bayesian information methods, for selecting best-fit models of nucleotide substitution. We specifically examine the role of the topology used to estimate the likelihood of the different models and the importance of the order in which hypotheses are tested. We do this by simulating DNA sequences under a known model of nucleotide substitution and recording how often this true model is recovered by the different methods. Our results suggest that model selection is reasonably accurate and indicate that some likelihood ratio test methods perform overall better than the Akaike or Bayesian information criteria. The tree used to estimate the likelihood scores does not influence model selection unless it is a randomly chosen tree. The order in which hypotheses are tested, and the complexity of the initial model in the sequence of tests, influence model selection in some cases. Model fitting in phylogenetics has been suggested for many years, yet many authors still arbitrarily choose their models, often using the default models implemented in standard computer programs for phylogenetic estimation. We show here that a best-fit model can be readily identified. Consequently, given the relevance of models, model fitting should be routine in any phylogenetic analysis that uses models of evolution.
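A simplified sketch of a hierarchical likelihood ratio test is given below. It is not the procedure implemented in MODELTEST. It assumes a single chain of nested models ordered from simplest to most complex, a chi-square approximation for the test statistic, and invented log-likelihoods. The function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:
        q, nu = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    else:
        q, nu = math.exp(-half), 2              # closed form for df = 2
    while nu < df:
        # Recurrence: Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) e^(-x/2) / Gamma(nu/2 + 1)
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q

def lrt(lnl_null, lnl_alt, df):
    """Likelihood ratio statistic 2(lnL1 - lnL0) and its chi-square P-value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

def hierarchical_lrt(models, alpha=0.05):
    """Walk up a chain of nested models while the simpler one is rejected."""
    current = models[0]
    for candidate in models[1:]:
        df = candidate[2] - current[2]
        _, p = lrt(current[1], candidate[1], df)
        if p >= alpha:
            break           # simpler model not rejected: stop here
        current = candidate
    return current[0]

# (name, invented log-likelihood, number of free substitution parameters)
chain = [("JC", -2100.0, 0), ("K80", -2070.0, 1),
         ("HKY", -2050.0, 4), ("GTR", -2048.0, 8)]
selected = hierarchical_lrt(chain)
print(selected)
```

Because the outcome depends on where the chain starts and on the order of the comparisons, a different ordering of hypotheses can lead to a different selected model, which is one of the issues the study examines.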


Bioinformatics | 2006

GARD: a genetic algorithm for recombination detection

Sergei L. Kosakovsky Pond; David Posada; Mike B. Gravenor; Christopher H. Woelk; Simon D. W. Frost

MOTIVATION: Phylogenetic and evolutionary inference can be severely misled if recombination is not accounted for, hence screening for it should be an essential component of nearly every comparative study. The evolution of recombinant sequences cannot be properly explained by a single phylogenetic tree, but several phylogenies may be used to correctly model the evolution of non-recombinant fragments. RESULTS: We developed a likelihood-based model selection procedure that uses a genetic algorithm to search multiple sequence alignments for evidence of recombination breakpoints and identify putative recombinant sequences. GARD is an extensible and intuitive method that can be run efficiently in parallel. Extensive simulation studies show that the method nearly always outperforms other available tools, both in terms of power and accuracy, and that the use of GARD to screen sequences for recombination ensures good statistical properties for methods aimed at detecting positive selection. AVAILABILITY: Freely available at http://www.datamonkey.org/GARD/

Collaboration


Top Co-Authors

Keith A. Crandall
George Washington University

Miguel Arenas
Spanish National Research Council

Diego Darriba
Heidelberg Institute for Theoretical Studies

Rafael Zardoya
Spanish National Research Council

Antonio Figueras
Spanish National Research Council

Federico Abascal
Spanish National Research Council