Publication


Featured research published by Natalia Volfovsky.


Genome Biology | 2002

Full-length messenger RNA sequences greatly improve genome annotation

Brian J. Haas; Natalia Volfovsky; Christopher D. Town; Maxim Troukhan; Nickolai Alexandrov; Kenneth A. Feldmann; Richard Flavell; Owen White

Background: Annotation of eukaryotic genomes is a complex endeavor that requires the integration of evidence from multiple, often contradictory, sources. With the ever-increasing amount of genome sequence data now available, methods for accurate identification of large numbers of genes have become urgently needed. In an effort to create a set of very high-quality gene models, we used the sequence of 5,000 full-length gene transcripts from Arabidopsis to re-annotate its genome. We have mapped these transcripts to their exact chromosomal locations and, using alignment programs, have created gene models that provide a reference set for this organism. Results: Approximately 35% of the transcripts indicated that previously annotated genes needed modification, and 5% of the transcripts represented newly discovered genes. We also discovered that multiple transcription initiation sites appear to be much more common than previously known, and we report numerous cases of alternative mRNA splicing. We include a comparison of different alignment software and an analysis of how the transcript data improved the previously published annotation. Conclusions: Our results demonstrate that sequencing of large numbers of full-length transcripts followed by computational mapping greatly improves identification of the complete exon structures of eukaryotic genes. In addition, we are able to find numerous introns in the untranslated regions of the genes.


Molecular Cancer Research | 2008

The identification of microRNAs in a genomically unstable region of human chromosome 8q24.

Konrad Huppi; Natalia Volfovsky; Timothy Runfola; Tamara Jones; Mark Mackiewicz; Scott E. Martin; J. Frederic Mushinski; Robert M. Stephens; Natasha J. Caplen

The PVT1 locus is identified as a cluster of T(2;8) and T(8;22) “variant” MYC-activating chromosomal translocation breakpoints extending 400 kb downstream of MYC in a subset (≈20%) of Burkitt's lymphoma (vBL). Recent reports that microRNAs (miRNA) may be associated with fragile sites and cancer-associated genomic regions prompted us to investigate whether the PVT1 region on chromosome 8q24 may contain miRNAs. Computational analysis of the genomic sequence covering the PVT1 locus and experimental verification identified seven miRNAs. One miRNA, hsa-miR-1204, resides within a previously described PVT1 exon (1b) that is often fused to the immunoglobulin light chain constant region in vBLs and is present in high copy number in MYC/PVT1–amplified tumors. Like its human counterpart, mouse mmu-miR-1204 represents the closest miRNA to Myc (∼50 kb) and is found only 1 to 2 kb downstream of a cluster of retroviral integration sites. Another miRNA, mmu-miR-1206, is close to a cluster of variant translocation breakpoints associated with mouse plasmacytoma and exon 1 of mouse Pvt1. Virtually all the miRNA precursor transcripts are expressed at higher levels in late-stage B cells (including plasmacytoma and vBL cell lines) compared with immature B cells, suggesting possible roles in lymphoid development and/or lymphoma. In addition, lentiviral vector-mediated overexpression of the miR-1204 precursor (human and mouse) in a mouse pre–B-cell line increased expression of Myc. High levels of expression of the hsa-miR-1204 precursor are also seen in several epithelial cancer cell lines with MYC/PVT1 coamplification, suggesting a potentially broad role for these miRNAs in tumorigenesis. (Mol Cancer Res 2008;6(2):212–21)


Retrovirology | 2013

Analysis of 454 sequencing error rate, error sources, and artifact recombination for detection of low-frequency drug resistance mutations in HIV-1 DNA.

Wei Shao; Valerie F. Boltz; Jonathan Spindler; Mary Kearney; Frank Maldarelli; John W. Mellors; Claudia Stewart; Natalia Volfovsky; Alexander Levitsky; Robert M. Stephens; John M. Coffin

Background: 454 sequencing technology is a promising approach for characterizing HIV-1 populations and for identifying low frequency mutations. The utility of 454 technology for determining allele frequencies and linkage associations in HIV infected individuals has not been extensively investigated. We evaluated the performance of 454 sequencing for characterizing HIV populations with defined allele frequencies. Results: We constructed two HIV-1 RT clones. Clone A was a wild type sequence. Clone B was identical to clone A except it contained 13 introduced drug resistant mutations. The clones were mixed at ratios ranging from 1% to 50% and were amplified by standard PCR conditions and by PCR conditions aimed at reducing PCR-based recombination. The products were sequenced using 454 pyrosequencing. Sequence analysis from standard PCR amplification revealed that 14% of all sequencing reads from a sample with a 50:50 mixture of wild type and mutant DNA were recombinants. The majority of the recombinants were the result of a single crossover event which can happen during PCR when the DNA polymerase terminates synthesis prematurely. The incompletely extended template then competes for primer sites in subsequent rounds of PCR. Although less often, a spectrum of other distinct crossover patterns was also detected. In addition, we observed point mutation errors ranging from 0.01% to 1.0% per base as well as indel (insertion and deletion) errors ranging from 0.02% to nearly 50%. The point errors (single nucleotide substitution errors) were mainly introduced during PCR while indels were the result of pyrosequencing. We then used new PCR conditions designed to reduce PCR-based recombination. Using these new conditions, the frequency of recombination was reduced 27-fold. The new conditions had no effect on point mutation errors. We found that 454 pyrosequencing was capable of identifying minority HIV-1 mutations at frequencies down to 0.1% at some nucleotide positions. Conclusion: Standard PCR amplification results in a high frequency of PCR-introduced recombination precluding its use for linkage analysis of HIV populations using 454 pyrosequencing. We designed a new PCR protocol that resulted in a much lower recombination frequency and provided a powerful technique for linkage analysis and haplotype determination in HIV-1 populations. Our analyses of 454 sequencing results also demonstrated that at some specific HIV-1 drug resistant sites, mutations can reliably be detected at frequencies down to 0.1%.


Nature Structural & Molecular Biology | 2011

Impact of chromatin structure on sequence variability in the human genome

Michael Y. Tolstorukov; Natalia Volfovsky; Robert M. Stephens; Peter J. Park

DNA sequence variations in individual genomes give rise to different phenotypes within the same species. One mechanism in this process is the alteration of chromatin structure due to sequence variation that influences gene regulation. We composed a high-confidence collection of human single-nucleotide polymorphisms and indels based on analysis of publicly available sequencing data and investigated whether the DNA loci associated with stable nucleosome positions are protected against mutations. We addressed how the sequence variation reflects the occupancy profiles of nucleosomes bearing different epigenetic modifications on genome scale. We found that indels are depleted around nucleosome positions of all considered types, whereas single-nucleotide polymorphisms are enriched around the positions of bulk nucleosomes but depleted around the positions of epigenetically modified nucleosomes. These findings indicate an increased level of conservation for the sequences associated with epigenetically modified nucleosomes, highlighting complex organization of the human chromatin.


Nature Neuroscience | 2016

Genome-wide prediction and functional characterization of the genetic basis of autism spectrum disorder

Arjun Krishnan; Ran Zhang; Victoria Yao; Chandra L. Theesfeld; Aaron K. Wong; Alicja Tadych; Natalia Volfovsky; Alan Packer; Alex E. Lash; Olga G. Troyanskaya

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with a strong genetic basis. Yet, only a small fraction of potentially causal genes—about 65 genes out of an estimated several hundred—are known with strong genetic evidence from sequencing studies. We developed a complementary machine-learning approach based on a human brain-specific gene network to present a genome-wide prediction of autism risk genes, including hundreds of candidates for which there is minimal or no prior genetic evidence. Our approach was validated in a large independent case–control sequencing study. Leveraging these genome-wide predictions and the brain-specific network, we demonstrated that the large set of ASD genes converges on a smaller number of key pathways and developmental stages of the brain. Finally, we identified likely pathogenic genes within frequent autism-associated copy-number variants and proposed genes and pathways that are likely mediators of ASD across multiple copy-number variants. All predictions and functional insights are available at http://asd.princeton.edu.


Nucleic Acids Research | 2012

Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools

Regina Z. Cer; Duncan E. Donohue; Uma Mudunuri; Nuri A. Temiz; Michael A. Loss; Nathan J. Starner; Goran N. Halusa; Natalia Volfovsky; Ming Yi; Brian T. Luke; Albino Bacolla; Jack R. Collins; Robert M. Stephens

The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.


Nature Communications | 2013

Gene network reconstruction reveals cell cycle and antiviral genes as major drivers of cervical cancer

Karina L. Mine; Natalia Shulzhenko; Anatoly Yambartsev; Mark Rochman; Gerdine F. Sanson; Malin Lando; Sudhir Varma; Jeff Skinner; Natalia Volfovsky; Tao Deng; Sylvia Michelina Fernandes Brenna; Carmen R.N. Carvalho; Julisa Chamorro Lascasas Ribalta; Michael Bustin; Polly Matzinger; Ismael D.C.G. Silva; Heidi Lyng; Maria Gerbase-DeLima; Andrey Morgun

Although human papillomavirus (HPV) was identified as an etiological factor in cervical cancer, the key human gene drivers of this disease remain unknown. Here we apply an unbiased approach integrating gene expression and chromosomal aberration data. In an independent group of patients, we reconstruct and validate a gene regulatory meta-network, and identify cell cycle and antiviral genes that constitute two major sub-networks up-regulated in tumour samples. These genes are located within the same regions as chromosomal amplifications, most frequently on 3q. We propose a model in which selected chromosomal gains drive activation of antiviral genes contributing to episomal virus elimination, which synergizes with cell cycle dysregulation. These findings may help to explain the paradox of episomal HPV decline in women with invasive cancer who were previously unable to clear the virus.


Molecular Cancer | 2013

Transcription signatures encoded by ultraconserved genomic regions in human prostate cancer

Robert S. Hudson; Ming Yi; Natalia Volfovsky; Robyn L. Prueitt; Dominic Esposito; Stefano Volinia; Chang Gong Liu; Aaron J. Schetter; Katrien Van Roosbroeck; Robert M. Stephens; George A. Calin; Carlo M. Croce; Stefan Ambs

Background: Ultraconserved regions (UCR) are genomic segments of more than 200 base pairs that are evolutionarily conserved among mammalian species. They are thought to have functions as transcriptional enhancers and regulators of alternative splicing. Recently, it was shown that numerous RNAs are transcribed from these regions. These UCR-encoded transcripts (ucRNAs) were found to be expressed in a tissue- and disease-specific manner and may interfere with the function of other RNAs through RNA:RNA interactions. We hypothesized that ucRNAs have unidentified roles in the pathogenesis of human prostate cancer. In a pilot study, we examined ucRNA expression profiles in human prostate tumors. Methods: Using a custom microarray with 962 probesets representing sense and antisense sequences for the 481 human UCRs, we examined ucRNA expression in resected, fresh-frozen human prostate tissues (57 tumors, 7 non-cancerous prostate tissues) and in cultured prostate cancer cells treated with either epigenetic drugs (the hypomethylating agent, 5-Aza 2′deoxycytidine, and the histone deacetylase inhibitor, trichostatin A) or a synthetic androgen, R1881. Expression of selected ucRNAs was also assessed by qRT-PCR and NanoString®-based assays. Because ucRNAs may function as RNAs that target protein-coding genes through direct and inhibitory RNA:RNA interactions, computational analyses were applied to identify candidate ucRNA:mRNA binding pairs. Results: We observed altered ucRNA expression in prostate cancer (e.g., uc.106+, uc.477+, uc.363+A, uc.454+A) and found that these ucRNAs were associated with cancer development, Gleason score, and extraprostatic extension after controlling for false discovery (false discovery rate < 5% for many of the transcripts). We also identified several ucRNAs that were responsive to treatment with either epigenetic drugs or androgen (R1881). For example, experiments with LNCaP human prostate cancer cells showed that uc.287+ is induced by R1881 (P < 0.05) whereas uc.283+A was up-regulated following treatment with combined 5-Aza 2′deoxycytidine and trichostatin A (P < 0.05). Additional computational analyses predicted RNA loop-loop interactions of 302 different sense and antisense ucRNAs with 1058 different mRNAs, inferring possible functions of ucRNAs via direct interactions with mRNAs. Conclusions: This first study of ucRNA expression in human prostate cancer indicates an altered transcript expression in the disease.


The Journal of Infectious Diseases | 2010

Kaposi Sarcoma (KS)-Associated Herpesvirus MicroRNA Sequence Analysis and KS Risk in a European AIDS-KS Case Control Study

Vickie Marshall; Elisa Martró; Nazzarena Labo; Alex Ray; Dian Wang; Georginia Mbisa; Rachel Bagni; Natalia Volfovsky; Jordi Casabona; Denise Whitby

BACKGROUND We recently identified polymorphisms in Kaposi sarcoma-associated herpesvirus (KSHV)-encoded microRNA (miRNA) sequences from clinical subjects. Here, we examine whether any of these may contribute to KS risk in a European AIDS-KS case-control study. METHODS KSHV load in peripheral blood was determined by real-time quantitative polymerase chain reaction. Samples that had detectable viral loads were used to amplify the 2.8-kb miRNA encoding region plus a 646-bp fragment of the K12/T0.7 gene. Additionally, we characterized an 840-bp fragment of the K1 gene to determine KSHV subtypes. RESULTS KSHV DNA was detected in peripheral blood mononuclear cells of 49.6% of case patients and 6.8% of controls, and viral loads tended to be higher in case patients. Sequences from the miRNA-encoding regions were conserved overall, but distinct polymorphisms were detected, some of which occurred in primary miRNAs, pre-miRNAs, or mature miRNAs. CONCLUSIONS Patients with KS were more likely to have detectable viral loads than were controls without disease. Despite high conservation in KSHV miRNA-encoded sequences, polymorphisms were observed, including some that have been reported elsewhere. Some polymorphisms could affect mature miRNA processing and appear to be associated with KS risk.


PLOS Genetics | 2013

Guanine Holes Are Prominent Targets for Mutation in Cancer and Inherited Disease

Albino Bacolla; Nuri A. Temiz; Ming Yi; Joseph Ivanic; Regina Z. Cer; Duncan E. Donohue; Edward V. Ball; Uma Mudunuri; Guliang Wang; Aklank Jain; Natalia Volfovsky; Brian T. Luke; Robert M. Stephens; David Neil Cooper; Jack R. Collins; Karen M. Vasquez

Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G•C bp in the context of all 64 5′-NGNN-3′ motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.

Collaboration

Top Co-Authors

Ming Yi
Science Applications International Corporation

Albino Bacolla
University of Texas at Austin

Alex E. Lash
Memorial Sloan Kettering Cancer Center

Duncan E. Donohue
Science Applications International Corporation

Jack R. Collins
Science Applications International Corporation

Regina Z. Cer
Science Applications International Corporation

Uma Mudunuri
Science Applications International Corporation