Yana Valasatava
University of California, San Diego
Publication
Featured research published by Yana Valasatava.
Nucleic Acids Research | 2017
Peter W. Rose; Andreas Prlić; Ali Altunkaya; Chunxiao Bi; Anthony R. Bradley; Cole Christie; Luigi Di Costanzo; Jose M. Duarte; Shuchismita Dutta; Zukang Feng; Rachel Kramer Green; David S. Goodsell; Brian P. Hudson; Tara Kalro; Robert Lowe; Ezra Peisach; Christopher Randle; Alexander S. Rose; Chenghua Shao; Yi-Ping Tao; Yana Valasatava; Maria Voigt; John D. Westbrook; Jesse Woo; Huangwang Yang; Jasmine Young; Christine Zardecki; Helen M. Berman; Stephen K. Burley
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, http://rcsb.org), the US data center for the global PDB archive, makes PDB data freely available to all users, from structural biologists to computational biologists and beyond. New tools and resources have been added to the RCSB PDB web portal in support of a ‘Structural View of Biology.’ Recent developments have improved the User experience, including the high-speed NGL Viewer that provides 3D molecular visualization in any web browser, improved support for data file download and enhanced organization of website pages for query, reporting and individual structure exploration. Structure validation information is now visible for all archival entries. PDB data have been integrated with external biological resources, including chromosomal position within the human genome; protein modifications; and metabolic pathways. PDB-101 educational materials have been reorganized into a searchable website and expanded to include new features such as the Geis Digital Archive.
Proceedings of the 21st International Conference on Web3D Technology | 2016
Alexander S. Rose; Anthony R. Bradley; Yana Valasatava; Jose M. Duarte; Andreas Prlić; Peter W. Rose
The interactive visualization of very large macromolecular complexes on the web is becoming a challenging problem as experimental techniques advance at an unprecedented rate and deliver structures of increasing size. We have tackled this problem by introducing the binary and compressed Macromolecular Transmission Format (MMTF) to reduce network transfer and parsing time, and by developing NGL, a highly memory-efficient and scalable WebGL-based viewer. MMTF offers over 75% compression over the standard mmCIF format, is over an order of magnitude faster to parse, and contains additional information (e.g., bond information). NGL renders molecular complexes with millions of atoms interactively on desktop computers and smartphones alike, making it a tool of choice for web-based molecular visualization in research and education.
PLOS ONE | 2017
Yana Valasatava; Anthony R. Bradley; Alexander S. Rose; Jose M. Duarte; Andreas Prlić; Peter W. Rose
The size and complexity of 3D macromolecular structures available in the Protein Data Bank is constantly growing. Current tools and file formats have reached limits of scalability. New compression approaches are required to support the visualization of large molecular complexes and enable new and scalable means for data analysis. We evaluated a series of compression techniques for coordinates of 3D macromolecular structures and identified the best performing approaches. By balancing compression efficiency in terms of the decompression speed and compression ratio, and code complexity, our results provide the foundation for a novel standard to represent macromolecular coordinates in a compact and useful file format.
PLOS Computational Biology | 2017
Anthony R. Bradley; Alexander S. Rose; Antonin Pavelka; Yana Valasatava; Jose M. Duarte; Andreas Prlić; Peter W. Rose
Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format, MMTF, as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, keep the whole PDB archive in memory or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in MMTF file format through web services and data that are updated on a weekly basis.
Genome Medicine | 2017
Gustavo Glusman; Peter W. Rose; Andreas Prlić; Jennifer Dougherty; Jose M. Duarte; Andrew S. Hoffman; Geoffrey J. Barton; Emøke Bendixen; Timothy Bergquist; Christian Bock; Elizabeth Brunk; Marija Buljan; Stephen K. Burley; Binghuang Cai; Hannah Carter; Jian Jiong Gao; Adam Godzik; Michael Heuer; Michael A. Hicks; Thomas Hrabe; Rachel Karchin; Julia Koehler Leman; Lydie Lane; David L. Masica; Sean D. Mooney; John Moult; Gilbert S. Omenn; Frances M. G. Pearl; Vikas Pejaver; Sheila Reynolds
The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.
Nucleic Acids Research | 2018
Stephen K. Burley; Helen M. Berman; Charmi Bhikadiya; Chunxiao Bi; Li Chen; Luigi Di Costanzo; Cole Christie; Ken Dalenberg; Jose M. Duarte; Shuchismita Dutta; Zukang Feng; Sutapa Ghosh; David S. Goodsell; Rachel Kramer Green; Vladimir Guranovic; Dmytro Guzenko; Brian P. Hudson; Tara Kalro; Yuhe Liang; Robert Lowe; Harry Namkoong; Ezra Peisach; Irina Periskova; Andreas Prlić; Chris Randle; Alexander S. Rose; Peter W. Rose; Raul Sala; Monica Sekharan; Chenghua Shao
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, rcsb.org), the US data center for the global PDB archive, serves thousands of Data Depositors in the Americas and Oceania and makes 3D macromolecular structure data available at no charge and without usage restrictions to more than 1 million rcsb.org Users worldwide and 600 000 pdb101.rcsb.org education-focused Users around the globe. PDB Data Depositors include structural biologists using macromolecular crystallography, nuclear magnetic resonance spectroscopy and 3D electron microscopy. PDB Data Consumers include researchers, educators and students studying Fundamental Biology, Biomedicine, Biotechnology and Energy. Recent reorganization of RCSB PDB activities into four integrated, interdependent services is described in detail, together with tools and resources added over the past 2 years to RCSB PDB web portals in support of a ‘Structural View of Biology.’
Nucleic Acids Research | 2018
Stephen K. Burley; Helen M. Berman; Charmi Bhikadiya; Chunxiao Bi; Li Chen; Luigi Di Costanzo; Cole Christie; Jose M. Duarte; Shuchismita Dutta; Zukang Feng; Sutapa Ghosh; David S. Goodsell; Rachel Kramer Green; Vladimir Guranovic; Dmytro Guzenko; Brian P. Hudson; Yuhe Liang; Robert Lowe; Ezra Peisach; Irina Periskova; Chris Randle; Alexander S. Rose; Monica Sekharan; Chenghua Shao; Yi-Ping Tao; Yana Valasatava; Maria Voigt; John D. Westbrook; Jasmine Young; Christine Zardecki
The Protein Data Bank (PDB) is the single global archive of experimentally determined three-dimensional (3D) structure data of biological macromolecules. Since 2003, the PDB has been managed by the Worldwide Protein Data Bank (wwPDB; wwpdb.org), an international consortium that collaboratively oversees deposition, validation, biocuration, and open access dissemination of 3D macromolecular structure data. The PDB Core Archive houses 3D atomic coordinates of more than 144 000 structural models of proteins, DNA/RNA, and their complexes with metals and small molecules and related experimental data and metadata. Structure and experimental data/metadata are also stored in the PDB Core Archive using the readily extensible wwPDB PDBx/mmCIF master data format, which will continue to evolve as data/metadata from new experimental techniques and structure determination methods are incorporated by the wwPDB. Impacts of the recently developed universal wwPDB OneDep deposition/validation/biocuration system and various methods-specific wwPDB Validation Task Forces on improving the quality of structures and data housed in the PDB Core Archive are described together with current challenges and future plans.
Bioinformatics | 2018
Alexander S. Rose; Anthony R. Bradley; Yana Valasatava; Jose M. Duarte; Andreas Prlić; Peter W. Rose
Motivation: The interactive visualization of very large macromolecular complexes on the web is becoming a challenging problem as experimental techniques advance at an unprecedented rate and deliver structures of increasing size. Results: We have tackled this problem by developing highly memory-efficient and scalable extensions for the NGL WebGL-based molecular viewer and by using the Macromolecular Transmission Format (MMTF), a binary and compressed data format. These enable NGL to download and render molecular complexes with millions of atoms interactively on desktop computers and smartphones alike, making it a tool of choice for web-based molecular visualization in research and education. Availability and implementation: The source code is freely available under the MIT license at github.com/arose/ngl and distributed on NPM (npmjs.com/package/ngl). MMTF-JavaScript encoders and decoders are available at github.com/rcsb/mmtf-javascript.
F1000Research | 2017
Peter W. Rose; Anthony R. Bradley; Jose M. Duarte; Antonin Pavelka; Andreas Prlić; Alexander S. Rose; Yana Valasatava; Yue Yue
F1000Research | 2016
Peter W. Rose; Yana Valasatava; Anthony R. Bradley; Alexander S. Rose; Jose M. Duarte; Andreas Prlić