Dong Si
University of Washington
Publication
Featured research published by Dong Si.
Biopolymers | 2012
Dong Si; Shuiwang Ji; Kamal Al Nasr; Jing He
The accuracy of the secondary structure element (SSE) identification from volumetric protein density maps is critical for de-novo backbone structure derivation in electron cryo-microscopy (cryoEM). It is still challenging to detect the SSE automatically and accurately from the density maps at medium resolutions (∼5-10 Å). We present a machine learning approach, SSELearner, to automatically identify helices and β-sheets by using the knowledge from existing volumetric maps in the Electron Microscopy Data Bank. We tested our approach using 10 simulated density maps. The averaged specificity and sensitivity for the helix detection are 94.9% and 95.8%, respectively, and those for the β-sheet detection are 86.7% and 96.4%, respectively. We have developed a secondary structure annotator, SSID, to predict the helices and β-strands from the backbone Cα trace. With the help of SSID, we tested our SSELearner using 13 experimentally derived cryo-EM density maps. The machine learning approach shows the specificity and sensitivity of 91.8% and 74.5%, respectively, for the helix detection and 85.2% and 86.5% respectively for the β-sheet detection in cryoEM maps of Electron Microscopy Data Bank. The reduced detection accuracy reveals the challenges in SSE detection when the cryoEM maps are used instead of the simulated maps. Our results suggest that it is effective to use one cryoEM map for learning to detect the SSE in another cryoEM map of similar quality.
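The specificity and sensitivity figures quoted above follow the usual confusion-matrix definitions. A minimal sketch of that arithmetic is given below; the function name and the toy labels are hypothetical, and the abstract does not state whether the authors count voxels or residues, so the unit of counting here is an assumption.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for two equal-length boolean label sequences."""
    if len(predicted) != len(actual):
        raise ValueError("label sequences must have the same length")
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if a and p:
            tp += 1          # true positive: helix labelled as helix
        elif a and not p:
            fn += 1          # false negative: helix missed
        elif not a and p:
            fp += 1          # false positive: non-helix labelled as helix
        else:
            tn += 1          # true negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference labels need at least one positive and one negative")
    sensitivity = tp / (tp + fn)   # fraction of true helix units that were detected
    specificity = tn / (tn + fp)   # fraction of non-helix units correctly rejected
    return sensitivity, specificity


# Hypothetical toy labels: True marks a unit (voxel or residue) belonging to a helix.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
sens, spec = sensitivity_specificity(predicted, actual)
```

The same function applies to β-sheet detection by changing what a True label stands for.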
international conference on bioinformatics | 2013
Dong Si; Jing He
Secondary structure element (SSE) identification from volumetric protein density maps is critical for de-novo backbone structure derivation in electron cryo-microscopy (cryoEM). Although multiple methods have been developed to detect SSE from the density maps, accurate detection needs either user intervention or careful adjustment of various parameters. It is still challenging to detect the SSE automatically and accurately from cryoEM density maps at medium resolutions (~5-10Å). A detected β-sheet can be represented either by the voxels of the β-sheet density or by many piecewise polygons that compose a rough surface. However, neither of these is effective in capturing the global surface feature of the β-sheet. We present an effective single-parameter approach, SSEtracer, to automatically identify helices and β-sheets from the cryoEM three-dimensional (3D) maps at medium resolutions. More importantly, we present a simple mathematical model to represent the β-sheet density. It was tested using eleven cryoEM β-sheets detected by SSEtracer. The RMSE between the density and the model is 1.88Å. The mathematical model can be used for β-strand detection from medium-resolution density maps.
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine | 2012
Kamal Al Nasr; Lin Chen; Dong Si; Desh Ranjan; Mohammad Zubair; Jing He
Cryo-electron microscopy (cryoEM) is an advanced imaging technique that produces volume maps at different resolutions. This technique is capable of visualizing large molecular complexes such as viruses and ribosomes. At medium resolutions, such as 5 to 10Å, the location and orientation of the secondary structure elements (SSEs) can be computationally identified. However, there is no registration between the detected SSEs and the protein sequence, and therefore it is challenging to derive the atomic structure from such volume data. We present, in this paper, the preliminary results of the full-atom protein chains using our de novo modeling framework. The framework has multiple components including the ranking of topologies, the construction of helices and loops along the density traces, and the energy evaluation of the structure. A test containing thirteen simulated density maps and two experimentally derived density maps shows that the true topology was ranked among the top 35 of the huge topological space. The best atomic model of the true topology was ranked within the top 40 for twelve of the fifteen proteins tested. The average backbone RMSD100 of these models is about 4Å for the fifteen proteins.
bioinformatics and biomedicine | 2016
Dong Si; Tao Zeng; Shuiwang Ji; Jing He
The detection of secondary structure of proteins using three-dimensional (3D) cryo-electron microscopy (cryo-EM) images is still a challenging task when the spatial resolution of cryo-EM images is at medium level (5–10Å). Prior research focused on the usage of local features that may not capture the global information of image objects. In this study, we propose to use deep learning methods to extract highly representative global features and then automatically detect secondary structures of proteins. In particular, we build a convolutional neural network (CNN) classifier that predicts the probability of each label for every individual voxel in a 3D cryo-EM image with respect to the secondary structure elements of proteins, such as α-helix, β-sheet and background. To effectively incorporate the 3D spatial information in protein structures, we propose to perform 3D convolutions in the convolutional layers of CNNs. We show that the proposed CNN classifier can outperform an existing SVM method on identifying the secondary structure elements of proteins from 3D cryo-EM medium-resolution images.
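To make the idea of a 3D convolution followed by a per-voxel class probability concrete, here is a minimal pure-Python sketch. It is not the authors' network: the kernel values, the three-class score vector, and the function names are all hypothetical, and a real CNN would stack many learned kernels with nonlinearities between them.

```python
import math


def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a 3D volume (stride 1, no padding).

    As in most CNN libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(depth - kd + 1):
        plane = []
        for y in range(height - kh + 1):
            row = []
            for x in range(width - kw + 1):
                acc = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 3x3x3 density patch and 2x2x2 kernel, both filled with ones.
volume = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
features = conv3d_valid(volume, kernel)

# Hypothetical raw scores for one voxel: (alpha-helix, beta-sheet, background).
probs = softmax([2.0, 1.0, 0.0])
```

Because the kernel spans all three axes, each output value summarizes a small cube of density rather than a 2D slice, which is the property the abstract relies on.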
international conference of the ieee engineering in medicine and biology society | 2014
Dong Si; Jing He
The electron cryo-microscopy (cryo-EM) technique produces 3-dimensional (3D) density images of proteins. When the resolution of the images is not high enough to resolve the molecular details, it is challenging for image processing methods to enhance the molecular features. A β-barrel is a structural feature formed by multiple β-strands arranged in a barrel shape. There is no existing method to derive β-strands from the 3D image of a β-barrel at medium resolutions. We propose a new method, StrandRoller, to generate a small set of possible β-traces from the density images at medium resolutions of 5-10Å. StrandRoller has been tested using eleven β-barrel images simulated to 10Å resolution and one image isolated from an experimentally derived cryo-EM density image at 6.7Å resolution. StrandRoller was able to detect 81.84% of the β-strands with an overall 1.5Å 2-way distance between the detected and the observed β-traces, if the best of fifteen detections is considered. Our results suggest that it is possible to derive a small set of possible β-traces from the β-barrel cryo-EM image at medium resolutions even when no separation of the β-strands is visible in the images.
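The abstract reports a "2-way distance" between detected and observed β-traces without defining it. One plausible reading, sketched below as an assumption rather than the authors' definition, is the average of the mean nearest-point distances measured in both directions between two sampled traces. The function names and coordinates are hypothetical.

```python
import math


def _mean_nearest(source, target):
    """Mean distance from each point in source to its nearest point in target."""
    return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)


def two_way_distance(trace_a, trace_b):
    """Symmetric distance: average of the two directed mean nearest-point distances.

    Assumed definition for illustration only.
    """
    return 0.5 * (_mean_nearest(trace_a, trace_b) + _mean_nearest(trace_b, trace_a))


# Hypothetical traces sampled as 3D points (in Å), offset by 1 Å along y.
detected = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
observed = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
distance = two_way_distance(detected, observed)
```

Measuring in both directions penalizes a detected trace that covers only part of the observed one, which a one-directional distance would not.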
BioMed Research International | 2017
Dong Si; Jing He
Cryo-electron microscopy (cryo-EM) has produced density maps of various resolutions. Although α-helices can be detected from density maps at 5–8 Å resolutions, β-strands are challenging to detect in such density maps due to the close spacing of β-strands. The variety of shapes of β-sheets adds to the complexity of β-strand detection from density maps. We propose a new approach to model traces of β-strands for β-barrel density regions that are extracted from cryo-EM density maps. In the test containing eight β-barrels extracted from experimental cryo-EM density maps at 5.5 Å–8.25 Å resolution, StrandRoller detected about 74.26% of the amino acids in the β-strands with an overall 2.05 Å 2-way distance between the detected β-traces and the observed ones, if the best of the fifteen detection cases is considered.
BMC Structural Biology | 2013
Andrew McKnight; Dong Si; Kamal Al Nasr; Andrey N. Chernikov; Nikos Chrisochoides; Jing He
Background: De novo protein modeling approaches utilize 3-dimensional (3D) images derived from electron cryomicroscopy (CryoEM) experiments. The skeleton connecting two secondary structures such as α-helices represents the loop in the 3D image. The accuracy of the skeleton and of the detected secondary structures is critical in de novo modeling. It is important to measure the length along the skeleton accurately since the length can be used as a constraint in modeling the protein. Results: We have developed a novel computational geometric approach to derive a simplified curve in order to estimate the loop length along the skeleton. The method was tested using fifty simulated density images of helix-loop-helix segments of atomic structures and eighteen experimentally derived density data from the Electron Microscopy Data Bank (EMDB). The test using simulated density maps shows that it is possible to estimate within 0.5Å of the expected length for 48 of the 50 cases. The experiments, involving eighteen experimentally derived CryoEM images, show that twelve cases have error within 2Å. Conclusions: The tests using both simulated and experimentally derived images show that it is possible for our proposed method to estimate the loop length along the skeleton if the secondary structure elements, such as α-helices, can be detected accurately, and there is a continuous skeleton linking the α-helices.
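The abstract does not say how the simplified curve is derived. As an illustrative stand-in only, the sketch below uses the Ramer-Douglas-Peucker algorithm to simplify a jagged skeleton polyline and then sums the segment lengths. The function names, the tolerance, and the coordinates are hypothetical.

```python
import math


def polyline_length(points):
    """Sum of straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in 3D."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        return math.dist(p, a)  # degenerate segment: a and b coincide
    t = sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))   # clamp the projection onto the segment
    closest = tuple(ai + t * c for ai, c in zip(a, ab))
    return math.dist(p, closest)


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: drop points closer than tolerance to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right    # avoid duplicating the shared split point


# Hypothetical zigzag skeleton (in Å) that wobbles around a straight 4 Å path.
skeleton = [(0.0, 0.0, 0.0), (1.0, 0.2, 0.0), (2.0, -0.2, 0.0),
            (3.0, 0.2, 0.0), (4.0, 0.0, 0.0)]
simplified = simplify(skeleton, tolerance=0.5)
raw_length = polyline_length(skeleton)
smooth_length = polyline_length(simplified)
```

In this toy case the raw zigzag overestimates the path length, while the simplified curve recovers the underlying 4 Å span; how closely any particular simplification matches true loop lengths depends on the tolerance chosen.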
international conference on bioinformatics | 2016
Dong Si
Cryo-electron microscopy (cryo-EM) has become central to the study of large-scale molecular interactions and has produced three-dimensional (3D) density maps at various resolutions. Secondary structure element (SSE) identification from volumetric protein density maps is critical for de novo backbone structure derivation in cryo-EM. Multiple methods have been developed to detect helices and β-sheets from density maps at medium resolutions (~5-10Å). The β-barrel, a special β-sheet structure, has been found in many proteins. However, there are currently no methods that aim to automatically and accurately extract the β-barrel as a complete chunk of density from the cryo-EM map of a β-barrel protein. We present an effective approach, BarrelMiner, to automatically extract the β-barrel region from cryo-EM density maps based on the conspicuous cylindrical shape of the β-barrel. BarrelMiner has been tested using ten simulated density maps at 10Å resolution and seven experimental cryo-EM maps between 5.5Å and 8.25Å resolution. The result suggests that BarrelMiner can be used for the detection of β-barrels from medium-resolution cryo-EM density maps of β-barrel proteins.
international conference on bioinformatics | 2014
Dong Si; Jing He
A β-sheet is composed of multiple β-strands that are stabilized by inter-strand hydrogen bonds. It has been discovered that a β-sheet has a right-handed twist. We have developed a geometrical method to investigate the relationship between the twist of a β-sheet and the orientations of β-strand traces. The results from forty-one β-sheets suggest that the set of β-traces with orientations similar to those in the observed set of β-traces has a near-maximum twist angle (AMT) among sets with various orientations.
international symposium on bioinformatics research and applications | 2017
Albert Ng; Dong Si
Cryo-electron microscopy (cryo-EM) is a technique that produces three-dimensional density maps of large protein complexes and enables the study of the interactions and structures of those molecules. Identifying the secondary structures (α-helices and β-sheets) located in proteins using density maps is vital in identifying and matching the backbone of the protein with the cryo-EM density map. The β-barrel is a unique β-sheet structure commonly found in proteins such as membrane proteins and lipocalins. We present a new approach utilizing a genetic algorithm and ray tracing to automatically identify and extract β-barrels from cryo-EM density maps. This approach was tested using ten simulated density maps at 9 Å resolution and six experimental density maps at various resolutions. The results suggest that our approach is capable of performing automatic detection and extraction of the β-barrels from medium-resolution cryo-EM density maps.