A. P. Jason de Koning

Publication

Featured research published by A. P. Jason de Koning.

PLOS Genetics | 2011

Repetitive Elements May Comprise Over Two-Thirds of the Human Genome

A. P. Jason de Koning; Wanjun Gu; Todd A. Castoe; Mark A. Batzer; David D. Pollock

Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed.
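The abstract above describes P-clouds only as a search for clusters of high-abundance oligonucleotides that are related in sequence space. The following is a toy sketch of that general idea and not the published P-clouds implementation: the function names, the two thresholds (`core_min`, `outer_min`), and the choice of a one-substitution neighborhood are all assumptions made for illustration.

```python
from collections import Counter


def count_oligos(seq, k):
    """Count every overlapping oligo (k-mer) of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def neighbors(oligo, alphabet="ACGT"):
    """Yield every oligo exactly one substitution away from `oligo`."""
    for i, base in enumerate(oligo):
        for alt in alphabet:
            if alt != base:
                yield oligo[:i] + alt + oligo[i + 1:]


def build_clouds(counts, core_min, outer_min):
    """Group abundant oligos into clouds (hypothetical simplification).

    Each cloud is seeded by an unassigned oligo seen at least `core_min`
    times, then extended with one-substitution neighbors seen at least
    `outer_min` times.
    """
    assigned = {}  # oligo -> index of the cloud it belongs to
    clouds = []
    for oligo, n in counts.most_common():
        if n < core_min:
            break  # remaining oligos are too rare to seed a cloud
        if oligo in assigned:
            continue
        cloud = {oligo}
        assigned[oligo] = len(clouds)
        for nb in neighbors(oligo):
            if nb not in assigned and counts.get(nb, 0) >= outer_min:
                cloud.add(nb)
                assigned[nb] = len(clouds)
        clouds.append(cloud)
    return clouds, assigned


def mask(seq, k, assigned):
    """Mark each position of seq covered by at least one cloud oligo."""
    covered = [False] * len(seq)
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in assigned:
            for j in range(i, i + k):
                covered[j] = True
    return covered


if __name__ == "__main__":
    genome = "ACGT" * 5 + "GGCCTTAA"  # made-up sequence, not real data
    counts = count_oligos(genome, 4)
    clouds, assigned = build_clouds(counts, core_min=4, outer_min=2)
    covered = mask(genome, 4, assigned)
    print(len(clouds), "clouds;", sum(covered), "of", len(genome), "bp masked")
```

In this sketch the repeated `ACGT` run is annotated as repetitive while the unique tail is left unmasked. A real analysis would also need the probabilistic treatment of short fragments that the abstract mentions, which is not attempted here.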

Proceedings of the National Academy of Sciences of the United States of America | 2013

The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system

Freek J. Vonk; Nicholas R. Casewell; Christiaan V. Henkel; Alysha Heimberg; Hans J. Jansen; Ryan J.R. McCleary; Harald Kerkkamp; Rutger A. Vos; Isabel Guerreiro; Juan J. Calvete; Wolfgang Wüster; Anthony E. Woods; Jessica M. Logan; Robert A. Harrison; Todd A. Castoe; A. P. Jason de Koning; David D. Pollock; Mark Yandell; Diego Calderon; Camila Renjifo; Rachel B. Currier; David Salgado; Davinia Pla; Libia Sanz; Asad S. Hyder; José M. C. Ribeiro; Jan W. Arntzen; Guido van den Thillart; Marten Boetzer; Walter Pirovano

Significance: Snake venoms are toxic protein cocktails used for prey capture. To investigate the evolution of these complex biological weapon systems, we sequenced the genome of a venomous snake, the king cobra, and assessed the composition of venom gland expressed genes, small RNAs, and secreted venom proteins. We show that regulatory components of the venom secretory system may have evolved from a pancreatic origin and that venom toxin genes were co-opted by distinct genomic mechanisms. After co-option, toxin genes important for prey capture have massively expanded by gene duplication and evolved under positive selection, resulting in protein neofunctionalization. This diverse and dramatic venom-related genomic response seemingly occurs in response to a coevolutionary arms race between venomous snakes and their prey.

Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.

Proceedings of the National Academy of Sciences of the United States of America | 2009

Evidence for an ancient adaptive episode of convergent molecular evolution

Todd A. Castoe; A. P. Jason de Koning; Hyunmin Kim; Wanjun Gu; Brice P. Noonan; Gavin J. P. Naylor; Zhi J. Jiang; Christopher L. Parkinson; David D. Pollock

Documented cases of convergent molecular evolution due to selection are fairly unusual, and examples to date have involved only a few amino acid positions. However, because convergence mimics shared ancestry and is not accommodated by current phylogenetic methods, it can strongly mislead phylogenetic inference when it does occur. Here, we present a case of extensive convergent molecular evolution between snake and agamid lizard mitochondrial genomes that overcomes an otherwise strong phylogenetic signal. Evidence from morphology, nuclear genes, and most sites in the mitochondrial genome support one phylogenetic tree, but a subset of mostly amino acid-altering substitutions (primarily at the first and second codon positions) across multiple mitochondrial genes strongly supports a radically different phylogeny. The relevant sites generally evolved slowly but converged between ancient lineages of snakes and agamids. We estimate that ≈44 of 113 predicted convergent changes distributed across all 13 mitochondrial protein-coding genes are expected to have arisen from nonneutral causes—a remarkably large number. Combined with strong previous evidence for adaptive evolution in snake mitochondrial proteins, it is likely that much of this convergent evolution was driven by adaptation. These results indicate that nonneutral convergent molecular evolution in mitochondria can occur at a scale and intensity far beyond what has been documented previously, and they highlight the vulnerability of standard phylogenetic methods to the presence of nonneutral convergent sequence evolution.

PLOS ONE | 2012

Rapid microsatellite identification from Illumina paired-end genomic sequencing in two birds and a snake

Todd A. Castoe; Alexander W. Poole; A. P. Jason de Koning; Kenneth L. Jones; Diana F. Tomback; Sara J. Oyler-McCance; Jennifer A. Fike; Stacey L. Lance; Jeffrey W. Streicher; Eric N. Smith; David D. Pollock

Identification of microsatellites, or simple sequence repeats (SSRs), can be a time-consuming and costly investment requiring enrichment, cloning, and sequencing of candidate loci. Recently, however, high throughput sequencing (with or without prior enrichment for specific SSR loci) has been utilized to identify SSR loci. The direct “Seq-to-SSR” approach has an advantage over enrichment-based strategies in that it does not require a priori selection of particular motifs, or prior knowledge of genomic SSR content. It has been more expensive per SSR locus recovered, however, particularly for genomes with few SSR loci, such as bird genomes. The longer but relatively more expensive 454 reads have been preferred over less expensive Illumina reads. Here, we use Illumina paired-end sequence data to identify potentially amplifiable SSR loci (PALs) from a snake (the Burmese python, Python molurus bivittatus), and directly compare these results to those from 454 data. We also compare the python results to results from Illumina sequencing of two bird genomes (Gunnison Sage-grouse, Centrocercus minimus, and Clark's Nutcracker, Nucifraga columbiana), which have considerably fewer SSRs than the python. We show that direct Illumina Seq-to-SSR can identify and characterize thousands of potentially amplifiable SSR loci for as little as $10 per sample – a fraction of the cost of 454 sequencing. Given that Illumina Seq-to-SSR is effective, inexpensive, and reliable even for species such as birds that have few SSR loci, it seems that there are now few situations for which prior hybridization is justifiable.

Molecular Ecology Resources | 2010

Rapid identification of thousands of copperhead snake (Agkistrodon contortrix) microsatellite loci from modest amounts of 454 shotgun genome sequence

Todd A. Castoe; Alexander W. Poole; Wanjun Gu; A. P. Jason de Koning; Juan M. Daza; Eric N. Smith; David D. Pollock

Optimal integration of next‐generation sequencing into mainstream research requires re‐evaluation of how problems can be reasonably overcome and what questions can be asked. One potential application is the rapid acquisition of genomic information to identify microsatellite loci for evolutionary, population genetic and chromosome linkage mapping research on non‐model and not previously sequenced organisms. Here, we report on results using high‐throughput sequencing to obtain a large number of microsatellite loci from the venomous snake Agkistrodon contortrix, the copperhead. We used the 454 Genome Sequencer FLX next‐generation sequencing platform to sample randomly ∼27 Mbp (128 773 reads) of the copperhead genome, thus sampling about 2% of the genome of this species. We identified microsatellite loci in 11.3% of all reads obtained, with 14 612 microsatellite loci identified in total, 4564 of which had flanking sequences suitable for polymerase chain reaction primer design. The random sequencing‐based approach to identify microsatellites was rapid, cost‐effective and identified thousands of useful microsatellite loci in a previously unstudied species.

Proceedings of the National Academy of Sciences of the United States of America | 2013

The Burmese python genome reveals the molecular basis for extreme adaptation in snakes

Todd A. Castoe; A. P. Jason de Koning; Kathryn T. Hall; Daren C. Card; Drew R. Schield; Matthew K. Fujita; Robert P. Ruggiero; Jack F. Degner; Juan M. Daza; Wanjun Gu; Jacobo Reyes-Velasco; Kyle J. Shaney; Jill M. Castoe; Samuel E. Fox; Alex W. Poole; Daniel Polanco; Jason Dobry; Michael W. Vandewege; Qing Li; Ryan K. Schott; Aurélie Kapusta; Patrick Minx; Cédric Feschotte; Peter Uetz; David A. Ray; Federico G. Hoffmann; Robert Bogden; Eric N. Smith; Belinda S. W. Chang; Freek J. Vonk

Significance: The molecular basis of morphological and physiological adaptations in snakes is largely unknown. Here, we study these phenotypes using the genome of the Burmese python (Python molurus bivittatus), a model for extreme phenotypic plasticity and metabolic adaptation. We discovered massive rapid changes in gene expression that coordinate major changes in organ size and function after feeding. Many significantly responsive genes are associated with metabolism, development, and mammalian diseases. A striking number of genes experienced positive selection in ancestral snakes. Such genes were related to metabolism, development, lungs, eyes, heart, kidney, and skeletal structure—all highly modified features in snakes. Snake phenotypic novelty seems to be driven by the system-wide coordination of protein adaptation, gene expression, and changes in genome structure.

Snakes possess many extreme morphological and physiological adaptations. Identification of the molecular basis of these traits can provide novel understanding for vertebrate biology and medicine. Here, we study snake biology using the genome sequence of the Burmese python (Python molurus bivittatus), a model of extreme physiological and metabolic adaptation. We compare the python and king cobra genomes along with genomic samples from other snakes and perform transcriptome analysis to gain insights into the extreme phenotypes of the python. We discovered rapid and massive transcriptional responses in multiple organ systems that occur on feeding and coordinate major changes in organ size and function. Intriguingly, the homologs of these genes in humans are associated with metabolism, development, and pathology. We also found that many snake metabolic genes have undergone positive selection, which together with the rapid evolution of mitochondrial proteins, provides evidence for extensive adaptive redesign of snake metabolic pathways. Additional evidence for molecular adaptation and gene family expansions and contractions is associated with major physiological and phenotypic adaptations in snakes; genes involved are related to cell cycle, development, lungs, eyes, heart, intestine, and skeletal structure, including GRB2-associated binding protein 1, SSH, WNT16, and bone morphogenetic protein 7. Finally, changes in repetitive DNA content, guanine-cytosine isochore structure, and nucleotide substitution rates indicate major shifts in the structure and evolution of snake genomes compared with other amniotes. Phenotypic and physiological novelty in snakes seems to be driven by system-wide coordination of protein adaptation, gene expression, and changes in the structure of the genome.

Genome Biology | 2012

Sequencing three crocodilian genomes to illuminate the evolution of archosaurs and amniotes

John St. John; Edward L. Braun; Sally R. Isberg; Lee G. Miles; Amanda Yoon-Yee Chong; Jaime Gongora; Pauline Dalzell; C. Moran; Bertrand Bed'hom; Arhat Abzhanov; Shane C. Burgess; Amanda M. Cooksey; Todd A. Castoe; Nicholas G. Crawford; Llewellyn D. Densmore; Jennifer C. Drew; Scott V. Edwards; Brant C. Faircloth; Matthew K. Fujita; Matthew J. Greenwold; Federico G. Hoffmann; Jonathan M. Howard; Taisen Iguchi; Daniel E. Janes; Shahid Yar Khan; Satomi Kohno; A. P. Jason de Koning; Stacey L. Lance; Fiona M. McCarthy; John E. McCormack

The International Crocodilian Genomes Working Group (ICGWG) will sequence and assemble the American alligator (Alligator mississippiensis), saltwater crocodile (Crocodylus porosus) and Indian gharial (Gavialis gangeticus) genomes. The status of these projects and our planned analyses are described.

Protein Science | 2012

The interface of protein structure, protein biophysics, and molecular evolution

David A. Liberles; Sarah A. Teichmann; Ivet Bahar; Ugo Bastolla; Jesse D. Bloom; Erich Bornberg-Bauer; Lucy J. Colwell; A. P. Jason de Koning; Nikolay V. Dokholyan; Julian J. Echave; Arne Elofsson; Dietlind L. Gerloff; Richard A. Goldstein; Johan A. Grahnen; Mark T. Holder; Clemens Lakner; Nicholas Lartillot; Simon C. Lovell; Gavin J. P. Naylor; Tina Perica; David D. Pollock; Tal Pupko; Lynne Regan; Andrew J. Roger; Nimrod D. Rubinstein; Eugene I. Shakhnovich; Kimmen Sjölander; Shamil R. Sunyaev; Ashley I. Teufel; Jeffrey L. Thorne

The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state‐of‐the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high‐throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction.

Genome Biology and Evolution | 2011

Discovery of Highly Divergent Repeat Landscapes in Snake Genomes Using High-Throughput Sequencing

Todd A. Castoe; Kathryn T. Hall; Marcel L. Guibotsy Mboulas; Wanjun Gu; A. P. Jason de Koning; Samuel E. Fox; Alexander W. Poole; Vijetha Vemulapalli; Juan M. Daza; Todd C. Mockler; Eric N. Smith; Cédric Feschotte; David D. Pollock

We conducted a comprehensive assessment of genomic repeat content in two snake genomes, the venomous copperhead (Agkistrodon contortrix) and the Burmese python (Python molurus bivittatus). These two genomes are both relatively small (∼1.4 Gb) but have surprisingly extensive differences in the abundance and expansion histories of their repeat elements. In the python, the readily identifiable repeat element content is low (21%), similar to bird genomes, whereas that of the copperhead is higher (45%), similar to mammalian genomes. The copperhead's greater repeat content arises from the recent expansion of many different microsatellites and transposable element (TE) families, and the copperhead had 23-fold greater levels of TE-related transcripts than the python. This suggests the possibility that greater TE activity in the copperhead is ongoing. Expansion of CR1 LINEs in the copperhead genome has resulted in TE-mediated microsatellite expansion (“microsatellite seeding”) at a scale several orders of magnitude greater than previously observed in vertebrates. Snakes also appear to be prone to horizontal transfer of TEs, particularly in the copperhead lineage. The reason that the copperhead has such a small genome in the face of so much recent expansion of repeat elements remains an open question, although selective pressure related to extreme metabolic performance is an obvious candidate. TE activity can affect gene regulation as well as rates of recombination and gene duplication, and it is therefore possible that TE activity played a role in the evolution of major adaptations in snakes; some evidence suggests this may include the evolution of venom repertoires.

Genome Biology | 2011

Sequencing the genome of the Burmese python (Python molurus bivittatus) as a model for studying extreme adaptations in snakes

Todd A. Castoe; A. P. Jason de Koning; Kathryn T. Hall; Ken Daigoro Yokoyama; Wanjun Gu; Eric N. Smith; Cédric Feschotte; Peter Uetz; David A. Ray; Jason Dobry; Robert Bogden; Stephen P. Mackessy; Anne M. Bronikowski; Wesley C. Warren; Stephen M. Secor; David D. Pollock
