Publication


Featured research published by Paul Ryvkin.


PLOS Genetics | 2010

Genome-Wide Double-Stranded RNA Sequencing Reveals the Functional Significance of Base-Paired RNAs in Arabidopsis

Qi Zheng; Paul Ryvkin; Fan Li; Isabelle Dragomir; Otto Valladares; Jamie Yang; Kajia Cao; Li-San Wang; Brian D. Gregory

The functional structure of all biologically active molecules is dependent on intra- and inter-molecular interactions. This is especially evident for RNA molecules whose functionality, maturation, and regulation require formation of correct secondary structure through encoded base-pairing interactions. Unfortunately, intra- and inter-molecular base-pairing information is lacking for most RNAs. Here, we marry classical nuclease-based structure mapping techniques with high-throughput sequencing technology to interrogate all base-paired RNA in Arabidopsis thaliana and identify ∼200 new small (sm)RNA–producing substrates of RNA–DEPENDENT RNA POLYMERASE6. Our comprehensive analysis of paired RNAs reveals conserved functionality within introns and both 5′ and 3′ untranslated regions (UTRs) of mRNAs, as well as a novel population of functional RNAs, many of which are the precursors of smRNAs. Finally, we identify intra-molecular base-pairing interactions to produce a genome-wide collection of RNA secondary structure models. Although our methodology reveals the pairing status of RNA molecules in the absence of cellular proteins, previous studies have demonstrated that structural information obtained for RNAs in solution accurately reflects their structure in ribonucleoprotein complexes. Furthermore, our identification of RNA–DEPENDENT RNA POLYMERASE6 substrates and conserved functional RNA domains within introns and both 5′ and 3′ untranslated regions (UTRs) of mRNAs using this approach strongly suggests that RNA molecules are correctly folded into their secondary structure in solution. Overall, our findings highlight the importance of base-paired RNAs in eukaryotes and present an approach that should be widely applicable for the analysis of this key structural feature of RNA.


Nucleic Acids Research | 2010

Altered gene expression in the Werner and Bloom syndromes is associated with sequences having G-quadruplex forming potential

Jay E. Johnson; Kajia Cao; Paul Ryvkin; Li-San Wang; F. Brad Johnson

The human Werner and Bloom syndromes (WS and BS) are caused by deficiencies in the WRN and BLM RecQ helicases, respectively. WRN, BLM and their Saccharomyces cerevisiae homologue Sgs1, are particularly active in vitro in unwinding G-quadruplex DNA (G4-DNA), a family of non-canonical nucleic acid structures formed by certain G-rich sequences. Recently, mRNA levels from loci containing potential G-quadruplex-forming sequences (PQS) were found to be preferentially altered in sgs1Δ mutants, suggesting that G4-DNA targeting by Sgs1 directly affects gene expression. Here, we extend these findings to human cells. Using microarrays to measure mRNAs obtained from human fibroblasts deficient for various RecQ family helicases, we observe significant associations between loci that are upregulated in WS or BS cells and loci that have PQS. No such PQS associations were observed for control expression datasets, however. Furthermore, upregulated genes in WS and BS showed no or dramatically reduced associations with sequences similar to PQS but that have considerably reduced potential to form intramolecular G4-DNA. These findings indicate that, like Sgs1, WRN and BLM can regulate transcription globally by targeting G4-DNA.
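As a rough illustration of what scanning a locus for PQS can look like, the sketch below uses a widely used motif definition (four runs of at least three guanines separated by loops of 1 to 7 nucleotides). This definition, the function name `find_pqs`, and the example sequence are assumptions made for illustration; the abstract does not state which motif rules the authors applied.

```python
import re

# Assumed motif (not taken from the abstract): four G-runs of >= 3 guanines
# separated by three loops of 1-7 nucleotides. The lookahead lets the scan
# report candidates that start at every position, including overlapping ones.
PQS_PATTERN = re.compile(
    r"(?=(G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}))"
)


def find_pqs(sequence):
    """Return (start, motif) pairs for candidate PQS on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PQS_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not real genomic data.
    for start, motif in find_pqs("TTGGGAGGGTGGGAGGGTT"):
        print(start, motif)
```

Only the given strand is scanned here; a fuller analysis would also scan the reverse complement.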


RNA | 2013

HAMR: high-throughput annotation of modified ribonucleotides

Paul Ryvkin; Yuk Yee Leung; Ian M. Silverman; Micah Childress; Otto Valladares; Isabelle Dragomir; Brian D. Gregory; Li-San Wang

RNA is often altered post-transcriptionally by the covalent modification of particular nucleotides; these modifications are known to modulate the structure and activity of their host RNAs. The recent discovery that an RNA methyl-6 adenosine demethylase (FTO) is a risk gene in obesity has brought to light the significance of RNA modifications to human biology. These noncanonical nucleotides, when converted to cDNA in the course of RNA sequencing, can produce sequence patterns that are distinguishable from simple base-calling errors. To determine whether these modifications can be detected in RNA sequencing data, we developed a method that can not only locate these modifications transcriptome-wide with single nucleotide resolution, but can also differentiate between different classes of modifications. Using small RNA-seq data we were able to detect 92% of all known human tRNA modification sites that are predicted to affect RT activity. We also found that different modifications produce distinct patterns of cDNA sequence, allowing us to differentiate between two classes of adenosine and two classes of guanine modifications with 98% and 79% accuracy, respectively. To show the robustness of this method to sample preparation and sequencing methods, as well as to organismal diversity, we applied it to a publicly available yeast data set and achieved similar levels of accuracy. We also experimentally validated two novel and one known 3-methylcytosine (3mC) sites predicted by HAMR in human tRNAs. Researchers can now use our method to identify and characterize RNA modifications using only RNA-seq data, both retrospectively and when asking questions specifically about modified RNA.


Nucleic Acids Research | 2013

CoRAL: predicting non-coding RNAs from small RNA-sequencing data

Yuk Yee Leung; Paul Ryvkin; Lyle H. Ungar; Brian D. Gregory; Li-San Wang

The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs, except their astounding diversities. Traditional RNA function prediction methods rely on sequence or alignment information, which are limited in their abilities to classify the various collections of non-coding RNAs (ncRNAs). To address this, we developed Classification of RNAs by Analysis of Length (CoRAL), a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length and cleavage specificity to distinguish between different ncRNA populations. We evaluated CoRAL using genome-wide small RNA sequencing data sets from four human tissue types and were able to classify six different types of RNAs with ∼80% cross-validation accuracy. Analysis by CoRAL revealed that microRNAs, small nucleolar and transposon-derived RNAs are highly discernible and consistent across all human tissue types assessed, whereas long intergenic ncRNAs, small cytoplasmic RNAs and small nuclear RNAs show less consistent patterns. The ability to reliably annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using small RNA sequencing data in less well-characterized organisms.
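To make the idea of a fragment-length feature concrete, here is a minimal sketch that turns the read lengths observed at one locus into a normalized length profile. The function name `length_profile`, the 18 to 30 nt window, and the example values are hypothetical; this is not CoRAL's actual feature set or code.

```python
from collections import Counter


def length_profile(read_lengths, min_len=18, max_len=30):
    """Fraction of reads at each length in [min_len, max_len] for one locus.

    Reads outside the window are ignored. The window bounds are an
    illustrative assumption, not values taken from the CoRAL paper.
    """
    counts = Counter(n for n in read_lengths if min_len <= n <= max_len)
    total = sum(counts.values())
    if total == 0:
        # No usable reads: return an all-zero profile of the right size.
        return [0.0] * (max_len - min_len + 1)
    return [counts[n] / total for n in range(min_len, max_len + 1)]


if __name__ == "__main__":
    # Hypothetical read lengths for a single locus.
    print(length_profile([21, 21, 22, 40], min_len=20, max_len=23))
```

A vector like this, computed per locus, could serve as one group of inputs to a standard classifier alongside other features.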


Journal of Computational Biology | 2009

Duplication mechanism and disruptions in flanking regions determine the fate of mammalian gene duplicates

Jin Jun; Paul Ryvkin; Edward Hemphill; Craig E. Nelson

Here we identify duplicated genes in five mammalian genomes and classify these duplicates based on the mechanisms by which they were generated. Retrotransposition accounts for at least half of all predicted duplicate genes in these genomes, with tandem and interspersed DNA-mediated duplicates comprising the other half. Estimation of the evolutionary rates in each class revealed greater rate asymmetry between retrotransposed and interspersed DNA duplicate pairs than between tandem duplicates, suggesting that retrotransposed and interspersed DNA duplicates are diverging more quickly. In an attempt to understand the basis of this asymmetry, we identified disruption of flanking DNA as an indicator of new duplicate fate: loss of local synteny accelerates the asymmetry of divergence of interspersed DNA duplicates. We also show that intact retrogenes are enriched in intergenic regions and indel-purified regions of the human genome. Moreover, intact retrogenes closest to annotated genes show the greatest levels of purifying selective pressure. Together, these findings suggest that the differential evolution of duplicate genes may be significantly influenced by changes in local genome architecture.


Methods | 2012

Computational detection and analysis of sequences with duplex-derived interstrand G-quadruplex forming potential

Kajia Cao; Paul Ryvkin; F. Brad Johnson

Bioinformatic approaches to the identification of genomic sequences having G-quadruplex forming potential (QFP) have enabled important tests of the structure of these sequences in vitro and of their behavior under conditions where the formation or function of G-quadruplexes is modulated in vivo. Several similar approaches to identifying intramolecular QFP (i.e. forming among G-runs on one strand of DNA) have been developed previously, but none appears to perfectly predict G-quadruplex formation. Here we describe a new approach, which complements and differs from prior approaches in that it identifies motifs containing G-runs on both strands of duplex DNA that could contribute to G-quadruplex structures. We call these motifs duplex-derived interstrand QFP (ddiQFP), and illustrate their potential applications by describing their genomic distribution and an example of their correspondence to loci targeted by a G-quadruplex-unwinding DNA helicase in yeast.


PLOS ONE | 2015

Transcriptomic Changes Due to Cytoplasmic TDP-43 Expression Reveal Dysregulation of Histone Transcripts and Nuclear Chromatin

Alexandre Amlie-Wolf; Paul Ryvkin; Rui Tong; Isabelle Dragomir; EunRan Suh; Yan Xu; Vivianna M. Van Deerlin; Brian D. Gregory; Linda K. Kwong; John Q. Trojanowski; V. M.-Y. Lee; Li-San Wang; Edward B. Lee

TAR DNA-binding protein 43 (TDP-43) is normally a nuclear RNA-binding protein that exhibits a range of functions including regulation of alternative splicing, RNA trafficking, and RNA stability. However, in amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration with TDP-43 inclusions (FTLD-TDP), TDP-43 is abnormally phosphorylated, ubiquitinated, and cleaved, and is mislocalized to the cytoplasm where it forms distinctive aggregates. We previously developed a mouse model expressing human TDP-43 with a mutation in its nuclear localization signal (ΔNLS-hTDP-43) so that the protein preferentially localizes to the cytoplasm. These mice did not exhibit a significant number of cytoplasmic aggregates, but did display dramatic changes in gene expression as measured by microarray, suggesting that cytoplasmic TDP-43 may be associated with a toxic gain-of-function. Here, we analyze new RNA-sequencing data from the ΔNLS-hTDP-43 mouse model, together with published RNA-sequencing data obtained previously from TDP-43 antisense oligonucleotide (ASO) knockdown mice to investigate further the dysregulation of gene expression in the ΔNLS model. This analysis reveals that the transcriptomic effects of the overexpression of the ΔNLS-hTDP-43 transgene are likely due to a gain of cytoplasmic function. Moreover, cytoplasmic TDP-43 expression alters transcripts that regulate chromatin assembly, the nucleolus, lysosomal function, and histone 3′ untranslated region (UTR) processing. These transcriptomic alterations correlate with observed histologic abnormalities in heterochromatin structure and nuclear size in transgenic mouse and human brains.


Nucleic Acids Research | 2012

SAVoR: a server for sequencing annotation and visualization of RNA structures

Fan Li; Paul Ryvkin; Daniel Micah Childress; Otto Valladares; Brian D. Gregory; Li-San Wang

RNA secondary structure is required for the proper regulation of the cellular transcriptome. This is because the functionality, processing, localization and stability of RNAs are all dependent on the folding of these molecules into intricate structures through specific base pairing interactions encoded in their primary nucleotide sequences. Thus, as the number of RNA sequencing (RNA-seq) data sets and the variety of protocols for this technology grow rapidly, it is becoming increasingly pertinent to develop tools that can analyze and visualize this sequence data in the context of RNA secondary structure. Here, we present Sequencing Annotation and Visualization of RNA structures (SAVoR), a web server, which seamlessly links RNA structure predictions with sequencing data and genomic annotations to produce highly informative and annotated models of RNA secondary structure. SAVoR accepts read alignment data from RNA-seq experiments and computes a series of per-base values such as read abundance and sequence variant frequency. These values can then be visualized on a customizable secondary structure model. SAVoR is freely available at http://tesla.pcbi.upenn.edu/savor.
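The per-base read abundance mentioned above can be sketched as a coverage calculation over a single transcript. The function name `per_base_coverage`, the half-open `(start, end)` interval convention, and the example alignments are assumptions for illustration; this is not SAVoR's implementation or input format.

```python
def per_base_coverage(length, alignments):
    """Count how many aligned reads cover each base of a transcript.

    `alignments` is an iterable of (start, end) pairs in 0-based, half-open
    transcript coordinates (an assumed convention). A difference array keeps
    the work proportional to the number of reads plus the transcript length.
    """
    diff = [0] * (length + 1)
    for start, end in alignments:
        # Clip reads that extend past the transcript boundaries.
        start = max(start, 0)
        end = min(end, length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    coverage = []
    running = 0
    for delta in diff[:length]:
        running += delta
        coverage.append(running)
    return coverage


if __name__ == "__main__":
    # Hypothetical alignments on a 6-nt transcript.
    print(per_base_coverage(6, [(0, 3), (2, 5)]))
```

Values like these could then be mapped onto the positions of a secondary structure model as a color scale.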


Methods | 2014

Using machine learning and high-throughput RNA sequencing to classify the precursors of small non-coding RNAs

Paul Ryvkin; Yuk Yee Leung; Lyle H. Ungar; Brian D. Gregory; Li-San Wang

Recent advances in high-throughput sequencing allow researchers to examine the transcriptome in more detail than ever before. Using a method known as high-throughput small RNA-sequencing, we can now profile the expression of small regulatory RNAs such as microRNAs and small interfering RNAs (siRNAs) with a great deal of sensitivity. However, there are many other types of small RNAs (<50 nt) present in the cell, including fragments derived from snoRNAs (small nucleolar RNAs), snRNAs (small nuclear RNAs), scRNAs (small cytoplasmic RNAs), tRNAs (transfer RNAs), and transposon-derived RNAs. Here, we present a user's guide for CoRAL (Classification of RNAs by Analysis of Length), a computational method for discriminating between different classes of RNA using high-throughput small RNA-sequencing data. Not only can CoRAL distinguish between RNA classes with high accuracy, but it also uses features that are relevant to small RNA biogenesis pathways. By doing so, CoRAL can give biologists a glimpse into the characteristics of different RNA processing pathways and how these might differ between tissue types, biological conditions, or even different species. CoRAL is available at http://wanglab.pcbi.upenn.edu/coral/.


Journal of Computational Biology | 2009

The Birth of New Genes by RNA- and DNA-Mediated Duplication during Mammalian Evolution

Jin Jun; Paul Ryvkin; Edward Hemphill; Ion I. Mandoiu; Craig E. Nelson

Gene duplication has long been recognized as a major force in genome evolution and has recently been recognized as an important source of individual variation. For many years, the origin of functional gene duplicates was assumed to be whole or partial genome duplication events, but recently retrotransposition has also been shown to contribute new functional protein coding genes and siRNAs. In this study, we utilize pseudogenes to recreate more complete gene family histories, and compare the rates of RNA- and DNA-mediated duplication and new functional gene formation in five mammalian genomes. We find that RNA-mediated duplication occurs at a much higher and more variable rate than DNA-mediated duplication, and gives rise to many more duplicated sequences over time. We show that, while the chance of RNA-mediated duplicates becoming functional is much lower than that of their DNA-mediated counterparts, the higher rate of retrotransposition leads to nearly equal contributions of new genes by each mechanism. We also find that functional RNA-mediated duplicates are closer to neighboring genes than non-functional RNA-mediated copies, consistent with co-option of regulatory elements at the site of insertion. Overall, new genes derived from DNA- and RNA-mediated duplication mechanisms are under similar levels of purifying selective pressure, but have broadly different functions. RNA-mediated duplication gives rise to a diversity of genes but is dominated by the highly expressed genes of RNA metabolic pathways. DNA-mediated duplication can copy regulatory material along with the protein coding region of the gene and often gives rise to classes of genes whose functions are dependent on complex regulatory information. This mechanistic difference may in part explain why we find that mammalian protein families tend to evolve by either one mechanism or the other, but rarely by both.
Supplementary Material has been provided (see online Supplementary Material at www.liebertonline.com).

Collaboration


Dive into Paul Ryvkin's collaborations.

Top Co-Authors

Li-San Wang
University of Pennsylvania

Brian D. Gregory
University of Pennsylvania

Isabelle Dragomir
University of Pennsylvania

Yuk Yee Leung
University of Pennsylvania

Craig E. Nelson
University of Connecticut

Edward Hemphill
University of Connecticut

F. Brad Johnson
University of Pennsylvania

Jin Jun
University of Connecticut

Kajia Cao
University of Pennsylvania

Otto Valladares
University of Pennsylvania