Publication


Featured research published by Pavlos Pavlidis.


Bioinformatics | 2013

A general species delimitation method with applications to phylogenetic placements

Jiajie Zhang; Paschalia Kapli; Pavlos Pavlidis; Alexandros Stamatakis

Motivation: Sequence-based methods to delimit species are central to DNA taxonomy, microbial community surveys and DNA metabarcoding studies. Current approaches either rely on simple sequence similarity thresholds (OTU-picking) or on complex and compute-intensive evolutionary models. The OTU-picking methods scale well on large datasets, but the results are highly sensitive to the similarity threshold. Coalescent-based species delimitation approaches often rely on Bayesian statistics and Markov Chain Monte Carlo sampling, and can therefore only be applied to small datasets. Results: We introduce the Poisson tree processes (PTP) model to infer putative species boundaries on a given phylogenetic input tree. We also integrate PTP with our evolutionary placement algorithm (EPA-PTP) to count the number of species in phylogenetic placements. We compare our approaches with popular OTU-picking methods and the General Mixed Yule Coalescent (GMYC) model. For de novo species delimitation, the stand-alone PTP model generally outperforms GMYC as well as OTU-picking methods when evolutionary distances between species are small. PTP requires neither an ultrametric input tree nor a sequence similarity threshold as input. In the open reference species delimitation approach, EPA-PTP yields more accurate results than de novo species delimitation methods. Finally, EPA-PTP scales on large datasets because it relies on the parallel implementations of the EPA and RAxML, thereby allowing species to be delimited in high-throughput sequencing data. Availability and implementation: The code is freely available at www.exelixis-lab.org/software.html. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Genetics | 2010

Searching for footprints of positive selection in whole-genome SNP data from nonequilibrium populations

Pavlos Pavlidis; Jeffrey D. Jensen; Wolfgang Stephan

A major goal of population genomics is to reconstruct the history of natural populations and to infer the neutral and selective scenarios that can explain the present-day polymorphism patterns. However, the separation between neutral and selective hypotheses has proven hard, mainly because both may predict similar patterns in the genome. This study focuses on the development of methods that can be used to distinguish neutral from selective hypotheses in equilibrium and nonequilibrium populations. These methods utilize a combination of statistics on the basis of the site frequency spectrum (SFS) and linkage disequilibrium (LD). We investigate the patterns of genetic variation along recombining chromosomes using a multitude of comparisons between neutral and selective hypotheses, such as selection or neutrality in equilibrium and nonequilibrium populations and recurrent selection models. We perform hypothesis testing using the classical P-value approach, but we also introduce methods from the machine-learning field. We demonstrate that the combination of SFS- and LD-based statistics increases the power to detect recent positive selection in populations that have experienced past demographic changes.
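The two families of statistics named in this abstract can be illustrated with a minimal sketch. It assumes haplotypes are given as equal-length lists of 0/1 values (0 = ancestral, 1 = derived allele); the function names are hypothetical and are not taken from the paper or its software. It computes the unfolded site frequency spectrum and the pairwise LD measure r² between two sites.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: sfs[k] counts the sites where the derived allele occurs k times."""
    n = len(haplotypes)
    num_sites = len(haplotypes[0])
    sfs = [0] * (n + 1)
    for site in range(num_sites):
        derived_count = sum(h[site] for h in haplotypes)
        sfs[derived_count] += 1
    return sfs


def r_squared(haplotypes, i, j):
    """Squared correlation (r^2) between the alleles at sites i and j."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        # At least one site is monomorphic, so r^2 is undefined; report 0.0 here.
        return 0.0
    d = p_ij - p_i * p_j  # coefficient of linkage disequilibrium D
    return d * d / denom


if __name__ == "__main__":
    sample = [
        [1, 0, 1],
        [1, 0, 0],
        [0, 1, 0],
        [0, 1, 0],
    ]
    print(site_frequency_spectrum(sample))
    print(r_squared(sample, 0, 1))
```

A sweep scan in the spirit of the abstract would evaluate summaries like these in windows along the chromosome and combine them; how the study combines them is not specified here, so that step is left out.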


Molecular Biology and Evolution | 2013

SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes

Pavlos Pavlidis; Daniel Živković; Alexandros Stamatakis; Nikolaos Alachiotis

The advent of modern DNA sequencing technology is the driving force in obtaining complete intra-specific genomes that can be used to detect loci that have been subject to positive selection in the recent past. Based on selective sweep theory, beneficial loci can be detected by examining the single nucleotide polymorphism patterns in intraspecific genome alignments. In the last decade, a plethora of algorithms for identifying selective sweeps have been developed. However, the majority of these algorithms have not been designed for analyzing whole-genome data. We present SweeD (Sweep Detector), an open-source tool for the rapid detection of selective sweeps in whole genomes. It analyzes site frequency spectra and represents a substantial extension of the widely used SweepFinder program. The sequential version of SweeD is up to 22 times faster than SweepFinder and, more importantly, is able to analyze thousands of sequences. We also provide a parallel implementation of SweeD for multi-core processors. Furthermore, we implemented a checkpointing mechanism that allows SweeD to be deployed on cluster systems with queue execution time restrictions, as well as to resume long-running analyses after processor failures. In addition, the user can specify various demographic models via the command line to calculate their theoretically expected site frequency spectra. Therefore, in contrast to SweepFinder, the neutral site frequencies can optionally be calculated directly from a given demographic model. We show that an increase in sample size results in more precise detection of positive selection. Thus, the ability to analyze substantially larger sample sizes by using SweeD leads to more accurate sweep detection. We validate SweeD via simulations and by scanning the first chromosome from the human 1000 Genomes Project for selective sweeps. We compare SweeD results with results from a linkage-disequilibrium-based approach and identify common outliers.


Nature | 2010

Evolution of self-compatibility in Arabidopsis by a mutation in the male specificity gene

Takashi Tsuchimatsu; Keita Suwabe; Rie Shimizu-Inatsugi; Sachiyo Isokawa; Pavlos Pavlidis; Thomas Städler; Go Suzuki; Seiji Takayama; Masao Watanabe; Kentaro K. Shimizu

Ever since Darwin’s pioneering research, the evolution of self-fertilisation (selfing) has been regarded as one of the most prevalent evolutionary transitions in flowering plants. A major mechanism to prevent selfing is the self-incompatibility (SI) recognition system, which consists of male and female specificity genes at the S-locus and SI modifier genes. Under conditions that favour selfing, mutations disabling the male recognition component are predicted to enjoy a relative advantage over those disabling the female component, because male mutations would increase through both pollen and seeds whereas female mutations would increase only through seeds. Despite many studies on the genetic basis of loss of SI in the predominantly selfing plant Arabidopsis thaliana, it remains unknown whether selfing arose through mutations in the female specificity gene (S-receptor kinase, SRK), male specificity gene (S-locus cysteine-rich protein, SCR; also known as S-locus protein 11, SP11) or modifier genes, and whether any of them rose to high frequency across large geographic regions. Here we report that a disruptive 213-base-pair (bp) inversion in the SCR gene (or its derivative haplotypes with deletions encompassing the entire SCR-A and a large portion of SRK-A) is found in 95% of European accessions, which contrasts with the genome-wide pattern of polymorphism in European A. thaliana. Importantly, interspecific crossings using Arabidopsis halleri as a pollen donor reveal that some A. thaliana accessions, including Wei-1, retain the female SI reaction, suggesting that all female components including SRK are still functional. Moreover, when the 213-bp inversion in SCR was inverted and expressed in transgenic Wei-1 plants, the functional SCR restored the SI reaction. The inversion within SCR is the first mutation disrupting SI shown to be nearly fixed in geographically wide samples, and its prevalence is consistent with theoretical predictions regarding the evolutionary advantage of mutations in male components.


Molecular Ecology | 2008

A population genomic approach to map recent positive selection in model species

Pavlos Pavlidis; Stephan Hutter; Wolfgang Stephan

Based on nearly complete genome sequences from a variety of organisms, data on naturally occurring genetic variation on the scale of hundreds of loci to entire genomes have been collected in recent years. In parallel, new statistical tests have been developed to infer evidence of recent positive selection from these data and to localize the target regions of selection in the genome. These methods have now been successfully applied to Drosophila melanogaster, humans, mice and a few plant species. In genomic regions of normal recombination rates, the targets of positive selection have been mapped down to the level of individual genes.


Bioinformatics | 2012

OmegaPlus: a scalable tool for rapid detection of selective sweeps in whole-genome datasets

Nikolaos Alachiotis; Alexandros Stamatakis; Pavlos Pavlidis

Recent advances in sequencing technologies have led to the rapid accumulation of molecular sequence data. Analyzing whole-genome data (as obtained from next-generation sequencers) from intra-species samples allows signatures of positive selection to be detected along the genome and, therefore, potentially advantageous genes to be identified in the course of the evolution of a population. We introduce OmegaPlus, an open-source tool for rapid detection of selective sweeps in whole-genome data based on linkage disequilibrium. The tool is up to two orders of magnitude faster than existing programs for this purpose and also exhibits up to two orders of magnitude smaller memory requirements. Availability: OmegaPlus is available under GNU GPL at http://www.exelixis-lab.org/software.html.


Molecular Ecology Resources | 2010

msABC: a modification of Hudson's ms to facilitate multi-locus ABC analysis.

Pavlos Pavlidis; Stefan Laurent; Wolfgang Stephan

With the availability of whole-genome sequence data, biologists are able to test hypotheses regarding the demography of populations. Furthermore, the advancement of the Approximate Bayesian Computation (ABC) methodology allows demographic inference to be performed in a simple framework using summary statistics. We present here msABC, a coalescent-based software that facilitates the simulation of multi-locus data, suitable for an ABC analysis. msABC is based on Hudson’s ms algorithm, which is used extensively for simulating neutral demographic histories of populations. The flexibility of the original algorithm has been extended so that sample size may vary among loci, missing data can be incorporated in simulations and calculations, and a multitude of summary statistics for single or multiple populations is generated. The source code of msABC is available at http://bio.lmu.de/~pavlidis/msabc or upon request from the authors.
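The general idea of an ABC analysis on multi-locus summary statistics can be sketched with plain rejection sampling. The code below is an illustration under stated assumptions, not msABC itself and not its interface: a toy simulator (hypothetical names throughout) draws the number of segregating sites per locus under the standard neutral coalescent without recombination, the prior on the mutation parameter theta is assumed uniform, and the single summary statistic is the mean number of segregating sites across loci.

```python
import math
import random


def total_tree_length(rng, n):
    """Total branch length of a neutral coalescent tree for n samples (time in units of 2N generations)."""
    length = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0          # coalescence rate while k lineages remain
        length += k * rng.expovariate(rate)
    return length


def poisson(rng, lam):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def mean_segregating_sites(rng, theta, n, loci):
    """Toy stand-in for a simulator: mean number of segregating sites over independent loci."""
    total = 0
    for _ in range(loci):
        total += poisson(rng, theta / 2.0 * total_tree_length(rng, n))
    return total / loci


def abc_rejection(observed, n, loci, theta_max, draws, keep, seed=0):
    """Keep the `keep` prior draws whose simulated summary is closest to the observed one."""
    rng = random.Random(seed)
    sims = []
    for _ in range(draws):
        theta = rng.uniform(0.0, theta_max)   # assumed uniform prior on theta
        stat = mean_segregating_sites(rng, theta, n, loci)
        sims.append((abs(stat - observed), theta))
    sims.sort()
    return [theta for _, theta in sims[:keep]]


if __name__ == "__main__":
    rng = random.Random(1)
    observed = mean_segregating_sites(rng, theta=5.0, n=10, loci=10)
    posterior = abc_rejection(observed, n=10, loci=10, theta_max=10.0, draws=1000, keep=50)
    print(sum(posterior) / len(posterior))    # rough posterior mean of theta
```

In a real analysis the toy simulator would be replaced by simulated summary statistics from a coalescent program, and several statistics would typically be combined in the distance; the rejection step itself stays the same.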


Bioinformatics | 2017

Multi-rate Poisson tree processes for single-locus species delimitation under maximum likelihood and Markov chain Monte Carlo.

Paschalia Kapli; Sarah Lutteropp; Jiajie Zhang; Kassian Kobert; Pavlos Pavlidis; Alexandros Stamatakis; Tomas Flouri

Motivation: In recent years, molecular species delimitation has become a routine approach for quantifying and classifying biodiversity. Barcoding methods are of particular importance in large-scale surveys as they promote fast species discovery and biodiversity estimates. Among those, distance-based methods are the most common choice as they scale well with large datasets; however, they are sensitive to similarity threshold parameters and they ignore evolutionary relationships. The recently introduced “Poisson Tree Processes” (PTP) method is a phylogeny-aware approach that does not rely on such thresholds. Yet, two weaknesses of PTP impact its accuracy and practicality when applied to large datasets: it does not account for divergent intraspecific variation and is slow for a large number of sequences. Results: We introduce the multi-rate PTP (mPTP), an improved method that alleviates the theoretical and technical shortcomings of PTP. It incorporates different levels of intraspecific genetic diversity deriving from differences in either the evolutionary history or sampling of each species. Results on empirical data suggest that mPTP is superior to PTP and popular distance-based methods as it consistently yields more accurate delimitations with respect to the taxonomy (i.e., identifies more taxonomic species, infers species numbers closer to the taxonomy). Moreover, mPTP does not require any similarity threshold as input. The novel dynamic programming algorithm attains a speedup of at least five orders of magnitude compared to PTP, allowing it to delimit species in large (meta-)barcoding data. In addition, Markov Chain Monte Carlo sampling provides a comprehensive evaluation of the inferred delimitation in just a few seconds for millions of steps, independently of tree size. Availability and Implementation: mPTP is implemented in C and is available for download at http://github.com/Pas-Kapli/mptp under the GNU Affero 3 license. A web service is available at http://mptp.h-its.org. Contact: paschalia.kapli@h-its.org or alexandros.stamatakis@h-its.org or tomas.flouri@h-its.org Supplementary information: Supplementary data are available at Bioinformatics online.


Proceedings of the National Academy of Sciences of the United States of America | 2013

Primate genome architecture influences structural variation mechanisms and functional consequences

Omer Gokcumen; Verena Tischler; Jelena Tica; Qihui Zhu; Rebecca Iskow; Eunjung Lee; Markus Hsi-Yang Fritz; Amy Langdon; Adrian M. Stütz; Pavlos Pavlidis; Vladimir Benes; Ryan E. Mills; Peter J. Park; Charles Lee; Jan O. Korbel

Significance: Genomic structural variants (SVs) significantly contribute to human genetic variation and have been linked with numerous diseases. Compared with humans, the characterization of SVs occurring within and across nonhuman primates has lagged. We generated comprehensive massively parallel DNA sequencing-based SV maps in three nonhuman primate species and show that the rates of different SV formation mechanisms, such as nonallelic homologous recombination and Alu retrotransposition, vary significantly between the great apes and the rhesus macaque, leading to markedly different SV landscapes in these species. Linking gene expression data with species-specific gene duplications, we describe several instances where gene duplicates seem to lead to evolutionary innovation through the gain of gene expression in new tissues.

Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.


Genetics | 2012

Selective Sweeps in Multilocus Models of Quantitative Traits

Pavlos Pavlidis; Dirk Metzler; Wolfgang Stephan

We study the trajectory of an allele that affects a polygenic trait selected toward a phenotypic optimum. Furthermore, conditioning on this trajectory we analyze the effect of the selected mutation on linked neutral variation. We examine the well-characterized two-locus two-allele model but we also provide results for diallelic models with up to eight loci. First, when the optimum phenotype is that of the double heterozygote in a two-locus model, and there is no dominance or epistasis of effects on the trait, the trajectories of selected mutations rarely reach fixation; instead, a polymorphic equilibrium at both loci is approached. Whether a polymorphic equilibrium is reached (rather than fixation at both loci) depends on the intensity of selection and the relative distances to the optimum of the homozygotes at each locus. Furthermore, if both loci have similar effects on the trait, fixation of an allele at a given locus is less likely when it starts at low frequency and the other locus is polymorphic (with alleles at intermediate frequencies). Weaker selection increases the probability of fixation of the studied allele, as the polymorphic equilibrium is less stable in this case. When we do not require the double heterozygote to be at the optimum we find that the polymorphic equilibrium is more difficult to reach, and fixation becomes more likely. Second, increasing the number of loci decreases the probability of fixation, because adaptation to the optimum is possible by various combinations of alleles. Summaries of the genealogy (height, total length, and imbalance) and of sequence polymorphism (number of polymorphisms, frequency spectrum, and haplotype structure) next to a selected locus depend on the frequency that the selected mutation approaches at equilibrium. We conclude that multilocus response to selection may in some cases prevent selective sweeps from being completed, as described in previous studies, but that conditions causing this to happen strongly depend on the genetic architecture of the trait, and that fixation of selected mutations is likely in many instances.

Collaboration


Top co-authors of Pavlos Pavlidis.

Alexandros Stamatakis, Heidelberg Institute for Theoretical Studies

Omer Gokcumen, Brigham and Women's Hospital

Nikolaos Alachiotis, Technical University of Crete

Stefan Laurent, École Polytechnique Fédérale de Lausanne

Duo Xu, University at Buffalo

Colin Flanagan, State University of New York System