David D. Pollock | Researchain

Publication

Featured research published by David D. Pollock.

Nature | 2010

The genome of a songbird.

Wesley C. Warren; David F. Clayton; Hans Ellegren; Arthur P. Arnold; LaDeana W. Hillier; Axel Künstner; Steve Searle; Simon White; Albert J. Vilella; Susan Fairley; Andreas Heger; Lesheng Kong; Chris P. Ponting; Erich D. Jarvis; Claudio V. Mello; Patrick Minx; Peter V. Lovell; Tarciso Velho; Margaret Ferris; Christopher N. Balakrishnan; Saurabh Sinha; Charles Blatti; Sarah E. London; Yun Li; Ya-Chi Lin; Julia M. George; Jonathan V. Sweedler; Bruce R. Southey; Preethi H. Gunaratne; M. G. Watson

The zebra finch is an important model organism in several fields with unique relevance to human neuroscience. Like other songbirds, the zebra finch communicates through learned vocalizations, an ability otherwise documented only in humans and a few other animals and lacking in the chicken—the only bird with a sequenced genome until now. Here we present a structural, functional and comparative analysis of the genome sequence of the zebra finch (Taeniopygia guttata), which is a songbird belonging to the large avian order Passeriformes. We find that the overall structures of the genomes are similar in zebra finch and chicken, but they differ in many intrachromosomal rearrangements, lineage-specific gene family expansions, the number of long-terminal-repeat-based retrotransposons, and mechanisms of sex chromosome dosage compensation. We show that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets. We also show evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience. These results indicate an active involvement of the genome in neural processes underlying vocal communication and identify potential genetic substrates for the evolution and regulation of this behaviour.

PLOS Genetics | 2011

Repetitive Elements May Comprise Over Two-Thirds of the Human Genome

A. P. Jason de Koning; Wanjun Gu; Todd A. Castoe; Mark A. Batzer; David D. Pollock

Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed.

Nature | 2011

The genome of the green anole lizard and a comparative analysis with birds and mammals

Jessica Alföldi; Federica Di Palma; Manfred Grabherr; Christina Williams; Lesheng Kong; Evan Mauceli; Pamela Russell; Craig B. Lowe; Richard E. Glor; Jacob D. Jaffe; David A. Ray; Stéphane Boissinot; Andrew M. Shedlock; Todd A. Castoe; John K. Colbourne; Matthew K. Fujita; Ricardo Moreno; Boudewijn ten Hallers; David Haussler; Andreas Heger; David I. Heiman; Daniel E. Janes; Jeremy Johnson; Pieter J. de Jong; Maxim Koriabine; Marcia Lara; Peter Novick; Chris L. Organ; Sally E. Peach; Steven Poe

The evolution of the amniotic egg was one of the great evolutionary innovations in the history of life, freeing vertebrates from an obligatory connection to water and thus permitting the conquest of terrestrial environments. Among amniotes, genome sequences are available for mammals and birds, but not for non-avian reptiles. Here we report the genome sequence of the North American green anole lizard, Anolis carolinensis. We find that A. carolinensis microchromosomes are highly syntenic with chicken microchromosomes, yet do not exhibit the high GC and low repeat content that are characteristic of avian microchromosomes. Also, A. carolinensis mobile elements are very young and diverse—more so than in any other sequenced amniote genome. The GC content of this lizard genome is also unusual in its homogeneity, unlike the regionally variable GC content found in mammals and birds. We describe and assign sequence to the previously unknown A. carolinensis X chromosome. Comparative gene analysis shows that amniote egg proteins have evolved significantly more rapidly than other proteins. An anole phylogeny resolves basal branches to illuminate the history of their repeated adaptive radiations.

Systematic Biology | 2002

Increased Taxon Sampling Is Advantageous for Phylogenetic Inference

David D. Pollock; Derrick J. Zwickl; Jimmy A. McGuire; David M. Hillis

Until recently, it was believed that complex phylogenies might be extremely difficult to reconstruct due to the phenomenal rate of increase in the number of possible phylogenies as the number of taxa increases. However, Hillis (1996) showed through simulation that, for at least one complex phylogeny of angiosperms with 228 taxa, reconstruction was far more accurate than expected, even with relatively modest amounts of DNA sequence data. This led to a flurry of papers on the subject of taxon sampling and phylogenetic reconstruction, with focus quickly shifting from the question of whether complex phylogenies can be reconstructed to whether and how much an existing phylogeny can be improved through increased taxon sampling (Hillis, 1998; Kim, 1998; Poe, 1998; Poe and Swofford, 1999; Pollock and Bruno, 2000; Rannala et al., 1998; Yang, 1998). Although a statistician might intuitively believe that it is generally better (or at least no worse) to increase the amount of data to resolve a question in statistical inference, the benefits of taxon addition for phylogenetic inference remain controversial. Some researchers have argued that taxon addition can decrease accuracy (Kim, 1996, 1998), while others believe that increased sampling improves accuracy (Graybeal, 1998; Hillis, 1996, 1998; Murphy et al., 2001; Poe, 1998; Pollock and Bruno, 2000; Pollock et al., 2000; Soltis et al., 1999). The reasons that different papers come to apparently contradictory conclusions deserve careful consideration. An often cited factor affecting the benefits of taxon addition is the phenomenon of long-branch attraction (LBA). Some phylogenetic methods have a bias toward preferential clustering of long branches, leading to erroneous results when those long branches do not actually represent a monophyletic assemblage (Felsenstein, 1978; Hendy and Penny, 1989).
This phenomenon has been cited in favor of increased taxon sampling, since sampling can be designed to break up long branches (Hillis, 1998). However, increased sampling has also been implicated as a potential cause of LBA because addition of a new long branch may wrongly attract a pre-existing long branch that had previously been inferred correctly (Poe and Swofford, 1999; Rannala et al., 1998). LBA may also explain some simulations that have found problems in phylogeny estimation when sampling outside the taxonomic group of interest (but see Pollock and Bruno [2000] for an alternative explanation). Outside sampling in these simulations tended to add long branches, which tended to attract the longest unbroken branch in the group of interest (Hillis, 1998; Rannala et al., 1998). The degree to which LBA is a problem depends greatly on the method of analysis, and LBA is much less of a problem for maximum likelihood (ML) than for parsimony or distance methods (Bruno and Halpern, 1999). A recent paper on the subject of taxon addition (Rosenberg and Kumar, 2001) concludes that increased taxon sampling is of little benefit to phylogenetic inference when compared to increasing sequence length. We disagree with their interpretation and believe that their data support the importance of increased taxon sampling. In addition, some of their data were simulated under extreme conditions (i.e., substitution rates that were very high or low, or sequences that were unreasonably short). Large error values and nonlinear relationships at these extremes make it difficult to interpret effects for the majority of the range, and averaging across the entire range is inappropriate. Moreover, we do not believe that Rosenberg and Kumar (2001) used the most appropriate metric to measure the relative effect of taxon addition. Our reanalysis of their simulated data indicates that increased taxon sampling is highly beneficial for phylogenetic inference.

Systematic Biology | 2003

Is sparse taxon sampling a problem for phylogenetic inference?

David M. Hillis; David D. Pollock; Jimmy A. McGuire; Derrick J. Zwickl

Rosenberg and Kumar (2001) addressed the importance of taxon sampling in phylogenetic analysis and concluded that phylogenetic error is “largely independent of taxon sample size” (2001:10756) and that their “results do not provide evidence in favor of adding taxa to problematic phylogenies” (2001:10756). In response to these conclusions, Zwickl and Hillis (2002) and Pollock et al. (2002) conducted additional simulations and reanalyzed the data presented by Rosenberg and Kumar (2001). Zwickl and Hillis and Pollock et al. showed that these conclusions of Rosenberg and Kumar could not be supported either by analyses of their original data or by new simulations that corrected a number of deficiencies in Rosenberg and Kumar’s original experimental design. Both Zwickl and Hillis and Pollock et al. found that increased taxon sampling resulted in greatly reduced phylogenetic estimation error, and Pollock et al. showed that the benefits of increased taxon sampling were similar to adding an equivalent amount of sequence length for the same taxa (in the ranges simulated by Rosenberg and Kumar). In their response, Rosenberg and Kumar (2002) focused on a slightly different conclusion from that in their original paper, which was that “longer sequences, rather than extensive sampling, will better improve the accuracy of phylogenetic inference” (2001:10751). In 2001, Rosenberg and Kumar argued that the beneficial effect of increasing taxa was 10-fold lower than the beneficial effect of increasing sequence length and that the effects of increased taxon sampling for the same genes were negligible (“largely independently” of phylogenetic error). Rosenberg and Kumar (2002) have now concluded that the beneficial effect of increasing taxon sample size is not small, but they suggested that the benefit comes simply from the overall increase in size of the data matrix (the total number of characters × taxa). 
Furthermore, they maintained that there is a greater benefit to increasing the total sequence length for few taxa than can be obtained by increasing taxon sampling for the same genes. Here, we discuss the two sets of conclusions reached by Rosenberg and Kumar (2001, 2002).

Proceedings of the National Academy of Sciences of the United States of America | 2013

The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system

Freek J. Vonk; Nicholas R. Casewell; Christiaan V. Henkel; Alysha Heimberg; Hans J. Jansen; Ryan J.R. McCleary; Harald Kerkkamp; Rutger A. Vos; Isabel Guerreiro; Juan J. Calvete; Wolfgang Wüster; Anthony E. Woods; Jessica M. Logan; Robert A. Harrison; Todd A. Castoe; A. P. Jason de Koning; David D. Pollock; Mark Yandell; Diego Calderon; Camila Renjifo; Rachel B. Currier; David Salgado; Davinia Pla; Libia Sanz; Asad S. Hyder; José M. C. Ribeiro; Jan W. Arntzen; Guido van den Thillart; Marten Boetzer; Walter Pirovano

Significance: Snake venoms are toxic protein cocktails used for prey capture. To investigate the evolution of these complex biological weapon systems, we sequenced the genome of a venomous snake, the king cobra, and assessed the composition of venom gland expressed genes, small RNAs, and secreted venom proteins. We show that regulatory components of the venom secretory system may have evolved from a pancreatic origin and that venom toxin genes were co-opted by distinct genomic mechanisms. After co-option, toxin genes important for prey capture have massively expanded by gene duplication and evolved under positive selection, resulting in protein neofunctionalization. This diverse and dramatic venom-related genomic response seemingly occurs in response to a coevolutionary arms race between venomous snakes and their prey.

Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions.
In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.

Fungal Biology | 2005

The Beetle Gut: A Hyperdiverse Source of Novel Yeasts

Sung-Oui Suh; Joseph V. McHugh; David D. Pollock; Meredith Blackwell

We isolated over 650 yeasts over a three-year period from the gut of a variety of beetles and characterized them on the basis of LSU rDNA sequences and morphological and metabolic traits. Of these, at least 200 were undescribed taxa, a number equivalent to almost 30% of all currently recognized yeast species. A Bayesian analysis of species discovery rates predicts further sampling of previously sampled habitats could easily produce another 100 species. The sampled habitat is, thereby, estimated to contain well over half as many more species as are currently known worldwide. The beetle gut yeasts occur in 45 independent lineages scattered across the yeast phylogenetic tree, often in clusters. The distribution suggests that some of the yeasts diversified by a process of horizontal transmission in the habitats and subsequent specialization in association with insect hosts. Evidence of specialization comes from consistent associations over time and broad geographical ranges of certain yeast and beetle species. The discovery of high yeast diversity in a previously unexplored habitat is a first step toward investigating the basis of the interactions and their impact in relation to ecology and evolution.

Proceedings of the National Academy of Sciences of the United States of America | 2009

Evidence for an ancient adaptive episode of convergent molecular evolution

Todd A. Castoe; A. P. Jason de Koning; Hyunmin Kim; Wanjun Gu; Brice P. Noonan; Gavin J. P. Naylor; Zhi J. Jiang; Christopher L. Parkinson; David D. Pollock

Documented cases of convergent molecular evolution due to selection are fairly unusual, and examples to date have involved only a few amino acid positions. However, because convergence mimics shared ancestry and is not accommodated by current phylogenetic methods, it can strongly mislead phylogenetic inference when it does occur. Here, we present a case of extensive convergent molecular evolution between snake and agamid lizard mitochondrial genomes that overcomes an otherwise strong phylogenetic signal. Evidence from morphology, nuclear genes, and most sites in the mitochondrial genome support one phylogenetic tree, but a subset of mostly amino acid-altering substitutions (primarily at the first and second codon positions) across multiple mitochondrial genes strongly supports a radically different phylogeny. The relevant sites generally evolved slowly but converged between ancient lineages of snakes and agamids. We estimate that ≈44 of 113 predicted convergent changes distributed across all 13 mitochondrial protein-coding genes are expected to have arisen from nonneutral causes—a remarkably large number. Combined with strong previous evidence for adaptive evolution in snake mitochondrial proteins, it is likely that much of this convergent evolution was driven by adaptation. These results indicate that nonneutral convergent molecular evolution in mitochondria can occur at a scale and intensity far beyond what has been documented previously, and they highlight the vulnerability of standard phylogenetic methods to the presence of nonneutral convergent sequence evolution.

PLOS ONE | 2012

Rapid microsatellite identification from Illumina paired-end genomic sequencing in two birds and a snake.

Todd A. Castoe; Alexander W. Poole; A. P. Jason de Koning; Kenneth L. Jones; Diana F. Tomback; Sara J. Oyler-McCance; Jennifer A. Fike; Stacey L. Lance; Jeffrey W. Streicher; Eric N. Smith; David D. Pollock

Identification of microsatellites, or simple sequence repeats (SSRs), can be a time-consuming and costly investment requiring enrichment, cloning, and sequencing of candidate loci. Recently, however, high throughput sequencing (with or without prior enrichment for specific SSR loci) has been utilized to identify SSR loci. The direct “Seq-to-SSR” approach has an advantage over enrichment-based strategies in that it does not require a priori selection of particular motifs, or prior knowledge of genomic SSR content. It has been more expensive per SSR locus recovered, however, particularly for genomes with few SSR loci, such as bird genomes. The longer but relatively more expensive 454 reads have been preferred over less expensive Illumina reads. Here, we use Illumina paired-end sequence data to identify potentially amplifiable SSR loci (PALs) from a snake (the Burmese python, Python molurus bivittatus), and directly compare these results to those from 454 data. We also compare the python results to results from Illumina sequencing of two bird genomes (Gunnison Sage-grouse, Centrocercus minimus, and Clark's Nutcracker, Nucifraga columbiana), which have considerably fewer SSRs than the python. We show that direct Illumina Seq-to-SSR can identify and characterize thousands of potentially amplifiable SSR loci for as little as $10 per sample, a fraction of the cost of 454 sequencing. Given that Illumina Seq-to-SSR is effective, inexpensive, and reliable even for species such as birds that have few SSR loci, it seems that there are now few situations for which prior hybridization is justifiable.

Molecular Ecology Resources | 2010

Rapid identification of thousands of copperhead snake (Agkistrodon contortrix) microsatellite loci from modest amounts of 454 shotgun genome sequence

Todd A. Castoe; Alexander W. Poole; Wanjun Gu; A. P. Jason de Koning; Juan M. Daza; Eric N. Smith; David D. Pollock
