Publication


Featured research published by Jianjun Zhou.


Nucleic Acids Research | 2008

CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data

David S. Wishart; David Arndt; Mark V. Berjanskii; Peter Tang; Jianjun Zhou; Guohui Lin

CS23D (chemical shift to 3D structure) is a web server for rapidly generating accurate 3D protein structures using only assigned nuclear magnetic resonance (NMR) chemical shifts and sequence data as input. Unlike conventional NMR methods, CS23D requires no NOE and/or J-coupling data to perform its calculations. CS23D accepts chemical shift files in either SHIFTY or BMRB formats, and produces a set of PDB coordinates for the protein in about 10–15 min. CS23D uses a pipeline of several preexisting programs or servers to calculate the actual protein structure. Depending on the sequence similarity (or lack thereof) CS23D uses either (i) maximal subfragment assembly (a form of homology modeling), (ii) chemical shift threading or (iii) shift-aided de novo structure prediction (via Rosetta) followed by chemical shift refinement to generate and/or refine protein coordinates. Tests conducted on more than 100 proteins from the BioMagResBank indicate that CS23D converges (i.e. finds a solution) for >95% of protein queries. These chemical shift generated structures were found to be within 0.2–2.8 Å RMSD of the NMR structure generated using conventional NOE-based NMR methods or conventional X-ray methods. The performance of CS23D is dependent on the completeness of the chemical shift assignments and the similarity of the query protein to known 3D folds. CS23D is accessible at http://www.cs23d.ca.
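The RMSD figures quoted above compare a generated structure with a reference structure. As a minimal sketch of how such a value is computed, assuming the two coordinate sets are already superimposed and list the same atoms in the same order (the function name `rmsd` is illustrative and not part of CS23D):

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: both structures are already optimally superimposed and
    contain the same atoms in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))
```

In practice a superposition step (for example the Kabsch algorithm) would precede this calculation; it is omitted here to keep the sketch short.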


Nucleic Acids Research | 2006

cisRED: a database system for genome-scale computational discovery of regulatory elements.

Gordon Robertson; Misha Bilenky; Keven Lin; An He; W. Yuen; M. Dagpinar; Richard Varhol; Kevin Teague; Obi L. Griffith; Xuekui Zhang; Yinghong Pan; Maik Hassel; Monica C. Sleumer; Wenying Pan; Erin Pleasance; M. Chuang; H. Hao; Yvonne Y. Li; Neil A. Robertson; Christopher D. Fjell; Bernard Li; Stephen B. Montgomery; Tamara Astakhova; Jianjun Zhou; Jörg Sander; Asim Siddiqui; Steven J.M. Jones

We describe cisRED, a database for conserved regulatory elements that are identified and ranked by a genome-scale computational system. The database and high-throughput predictive pipeline are designed to address diverse target genomes in the context of rapidly evolving data resources and tools. Motifs are predicted in promoter regions using multiple discovery methods applied to sequence sets that include corresponding sequence regions from vertebrates. We estimate motif significance by applying discovery and post-processing methods to randomized sequence sets that are adaptively derived from target sequence sets, retain motifs with p-values below a threshold and identify groups of similar motifs and co-occurring motif patterns. The database offers information on atomic motifs, motif groups and patterns. It is web-accessible, and can be queried directly, downloaded or installed locally.
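The idea of scoring a motif against randomized sequence sets can be illustrated with a simple empirical p-value. The sketch below is not the cisRED pipeline: it swaps in plain per-sequence shuffling for the adaptively derived randomized sets described above, and the function names and the default of 200 shuffles are assumptions chosen for illustration.

```python
import random


def count_motif(seq, motif):
    """Count (possibly overlapping) exact occurrences of motif in seq."""
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def empirical_p_value(sequences, motif, n_shuffles=200, seed=0):
    """Fraction of shuffled sequence sets with at least as many motif hits
    as the real set. Shuffling keeps each sequence's base composition."""
    rng = random.Random(seed)
    observed = sum(count_motif(s, motif) for s in sequences)
    at_least_as_many = 0
    for _ in range(n_shuffles):
        shuffled_total = 0
        for s in sequences:
            letters = list(s)
            rng.shuffle(letters)
            shuffled_total += count_motif("".join(letters), motif)
        if shuffled_total >= observed:
            at_least_as_many += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_many + 1) / (n_shuffles + 1)
```

Motifs whose estimated p-value falls below a chosen threshold would then be retained, mirroring the filtering step described in the abstract.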


Analytical Chemistry | 2013

MyCompoundID: using an evidence-based metabolome library for metabolite identification.

Liang Li; Ronghong Li; Jianjun Zhou; Azeret Zuniga; Avalyn Stanislaus; Yiman Wu; Tao Huan; Jiamin Zheng; Yi Shi; David S. Wishart; Guohui Lin

Identification of unknown metabolites is a major challenge in metabolomics. Without the identities of the metabolites, the metabolome data generated from a biological sample cannot be readily linked with the proteomic and genomic information for studies in systems biology and medicine. We have developed a web-based metabolite identification tool ( http://www.mycompoundid.org ) that allows searching and interpreting mass spectrometry (MS) data against a newly constructed metabolome library composed of 8,021 known human endogenous metabolites and their predicted metabolic products (375,809 compounds from one metabolic reaction and 10,583,901 from two reactions). As an example, in the analysis of a simple extract of human urine or plasma and the whole human urine by liquid chromatography-mass spectrometry and MS/MS, we are able to identify at least two times more metabolites in these samples than by using a standard human metabolome library. In addition, it is shown that the evidence-based metabolome library (EML) provides a much superior performance in identifying putative metabolites from a human urine sample, compared to the use of the ChemPub and KEGG libraries.
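Searching MS data against a library of known metabolites and their predicted one-reaction products amounts to matching a measured mass against base masses plus reaction mass shifts. The sketch below is a toy illustration under stated assumptions: the two-entry library, the two reactions, the 10 ppm tolerance and all names are hypothetical and are not taken from MyCompoundID. The masses are neutral monoisotopic masses.

```python
# Hypothetical toy library: neutral monoisotopic masses in Da.
LIBRARY = {
    "glucose": 180.06339,     # C6H12O6
    "creatinine": 113.05891,  # C4H7N3O
}

# Hypothetical one-step reactions expressed as mass shifts in Da.
REACTIONS = {
    "oxidation (+O)": 15.99491,
    "methylation (+CH2)": 14.01565,
}


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def search_library(measured_mass, library, reactions, tol_ppm=10.0):
    """Return (metabolite, reaction) pairs whose mass matches within tol_ppm."""
    hits = []
    for name, base_mass in library.items():
        candidates = [("none", 0.0)] + list(reactions.items())
        for reaction, shift in candidates:
            if abs(ppm_error(measured_mass, base_mass + shift)) <= tol_ppm:
                hits.append((name, reaction))
    return hits
```

For example, `search_library(196.0583, LIBRARY, REACTIONS)` matches glucose plus one oxygen. A mass match alone gives only a putative identification; MS/MS interpretation, as mentioned in the abstract, would be needed to support it.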


Nucleic Acids Research | 2009

GeNMR: a web server for rapid NMR-based protein structure determination

Mark V. Berjanskii; Peter Tang; Jack Liang; Joseph A. Cruz; Jianjun Zhou; You Zhou; Edward Bassett; Cam Macdonell; Paul Lu; Guohui Lin; David S. Wishart

GeNMR (GEnerate NMR structures) is a web server for rapidly generating accurate 3D protein structures using sequence data, NOE-based distance restraints and/or NMR chemical shifts as input. GeNMR accepts distance restraints in XPLOR or CYANA format as well as chemical shift files in either SHIFTY or BMRB formats. The web server produces an ensemble of PDB coordinates for the protein within 15–25 min, depending on model complexity and completeness of experimental restraints. GeNMR uses a pipeline of several pre-existing programs and servers to calculate the actual protein structure. In particular, GeNMR combines genetic algorithms for structure optimization along with homology modeling, chemical shift threading, torsion angle and distance predictions from chemical shifts/NOEs as well as ROSETTA-based structure generation and simulated annealing with XPLOR-NIH to generate and/or refine protein coordinates. GeNMR greatly simplifies the task of protein structure determination as users do not have to install or become familiar with complex stand-alone programs or obscure format conversion utilities. Tests conducted on a sample of 90 proteins from the BioMagResBank indicate that GeNMR produces high-quality models for all protein queries, regardless of the type of NMR input data. GeNMR was developed to facilitate rapid, user-friendly determination of protein structures via NMR spectroscopy. GeNMR is accessible at http://www.genmr.ca.


Nucleic Acids Research | 2010

PROSESS: a protein structure evaluation suite and server

Mark V. Berjanskii; Yongjie Liang; Jianjun Zhou; Peter Tang; Paul Stothard; You Zhou; Joseph A. Cruz; Cam Macdonell; Guohui Lin; Paul Lu; David S. Wishart

PROSESS (PROtein Structure Evaluation Suite and Server) is a web server designed to evaluate and validate protein structures generated by X-ray crystallography, NMR spectroscopy or computational modeling. While many structure evaluation packages have been developed over the past 20 years, PROSESS is unique in its comprehensiveness, its capacity to evaluate X-ray, NMR and predicted structures as well as its ability to evaluate a variety of experimental NMR data. PROSESS integrates a variety of previously developed, well-known and thoroughly tested methods to evaluate both global and residue-specific: (i) covalent and geometric quality; (ii) non-bonded/packing quality; (iii) torsion angle quality; (iv) chemical shift quality and (v) NOE quality. In particular, PROSESS uses VADAR for coordinate, packing, H-bond, secondary structure and geometric analysis, GeNMR for calculating folding, threading and solvent energetics, ShiftX for calculating chemical shift correlations, RCI for correlating structure mobility to chemical shift and PREDITOR for calculating torsion angle-chemical shifts agreement. PROSESS also incorporates several other programs including MolProbity to assess atomic clashes, Xplor-NIH to identify and quantify NOE restraint violations and NAMD to assess structure energetics. PROSESS produces detailed tables, explanations, structural images and graphs that summarize the results and compare them to values observed in high-quality or high-resolution protein structures. Using a simplified red–amber–green coloring scheme PROSESS also alerts users about both general and residue-specific structural problems. PROSESS is intended to serve as a tool that can be used by structural biologists as well as database curators to assess and validate newly determined protein structures. PROSESS is freely available at http://www.prosess.ca.


Journal of Biomolecular NMR | 2012

Resolution-by-proxy: a simple measure for assessing and comparing the overall quality of NMR protein structures

Mark V. Berjanskii; Jianjun Zhou; Yongjie Liang; Guohui Lin; David S. Wishart

In protein X-ray crystallography, resolution is often used as a good indicator of structural quality. Diffraction resolution of protein crystals correlates well with the number of X-ray observables that are used in structure generation and, therefore, with protein coordinate errors. In protein NMR, there is no parameter identical to X-ray resolution. Instead, resolution is often used as a synonym of NMR model quality. Resolution of NMR structures is often deduced from ensemble precision, torsion angle normality and number of distance restraints per residue. The lack of common techniques to assess the resolution of X-ray and NMR structures complicates the comparison of structures solved by these two methods. This problem is sometimes approached by calculating “equivalent resolution” from structure quality metrics. However, existing protocols do not offer a comprehensive assessment of protein structure as they calculate equivalent resolution from a relatively small number (<5) of protein parameters. Here, we report the development of a protocol that calculates equivalent resolution from 25 measurable protein features. This new method offers better performance (correlation coefficient of 0.92, mean absolute error of 0.28 Å) than existing predictors of equivalent resolution. Because the method uses coordinate data as a proxy for X-ray diffraction data, we call this measure “Resolution-by-Proxy” or ResProx. We demonstrate that ResProx can be used to identify under-restrained, poorly refined or inaccurate NMR structures, and can discover structural defects that the other equivalent resolution methods cannot detect. The ResProx web server is available at http://www.resprox.ca.
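The two performance figures quoted above, a correlation coefficient and a mean absolute error, compare predicted equivalent resolution with measured X-ray resolution. A minimal sketch of both metrics follows, assuming the correlation is a Pearson coefficient (the abstract does not say which kind); the function names are illustrative and not part of ResProx.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(predicted, observed):
    """Average absolute difference, in the same units as the inputs (here Å)."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)
```

A high correlation with a low mean absolute error indicates that the predictions track the measured resolution and are also close to it in absolute terms, which is why both are usually reported together.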


Very Large Data Bases | 2003

Data bubbles for non-vector data: speeding-up hierarchical clustering in arbitrary metric spaces

Jianjun Zhou; Jörg Sander

To speed up clustering algorithms, data summarization methods have been proposed, which first summarize the data set by computing suitable representative objects. Then, a clustering algorithm is applied to these representatives only, and a clustering structure for the whole data set is derived, based on the result for the representatives. Most previous methods are, however, limited in their application domain. They are in general based on sufficient statistics such as the linear sum of a set of points, which assumes that the data is from a vector space. On the other hand, in many important applications, the data is from a metric non-vector space, and only distances between objects can be exploited to construct effective data summarizations. In this paper, we develop a new data summarization method based only on distance information that can be applied directly to non-vector data. An extensive performance evaluation shows that our method is very effective in finding the hierarchical clustering structure of non-vector data using only a very small number of data summarizations, thus resulting in a large reduction of runtime while trading only very little clustering quality.
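The general idea of summarizing non-vector data using distances alone can be sketched as follows. This is not the paper's data bubble construction: it is a simplified stand-in that picks random representatives, assigns every object to its nearest representative, and records only the member count and the largest member distance. Edit distance on strings serves as the example metric, and all names are illustrative.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance between two strings (a metric on strings)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def summarize(objects, k, distance, seed=0):
    """Summarize objects by k random representatives, using distances only."""
    rng = random.Random(seed)
    rep_indices = rng.sample(range(len(objects)), k)
    summaries = [{"representative": objects[i], "members": [], "extent": 0}
                 for i in rep_indices]
    for obj in objects:
        # Assign each object to its nearest representative.
        nearest = min(summaries, key=lambda s: distance(obj, s["representative"]))
        nearest["members"].append(obj)
        nearest["extent"] = max(nearest["extent"],
                                distance(obj, nearest["representative"]))
    return summaries
```

A clustering algorithm would then run on the `k` summaries instead of all objects, which is the source of the runtime reduction. No coordinates, sums or means are computed, so the same code works for any objects with a distance function.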


Journal of Proteome Research | 2014

Development of isotope labeling liquid chromatography mass spectrometry for mouse urine metabolomics: quantitative metabolomic study of transgenic mice related to Alzheimer's disease.

Jun Peng; Kevin Guo; Jianguo Xia; Jianjun Zhou; Jing Yang; David Westaway; David S. Wishart; Liang Li

Because of a limited volume of urine that can be collected from a mouse, it is very difficult to apply the common strategy of using multiple analytical techniques to analyze the metabolites to increase the metabolome coverage for mouse urine metabolomics. We report an enabling method based on differential isotope labeling liquid chromatography mass spectrometry (LC-MS) for relative quantification of over 950 putative metabolites using 20 μL of urine as the starting material. The workflow involves aliquoting 10 μL of an individual urine sample for ¹²C-dansylation labeling that targets amines and phenols. Another 10 μL aliquot was taken from each sample to generate a pooled sample that was subjected to ¹³C-dansylation labeling. The ¹²C-labeled individual sample was mixed with an equal volume of the ¹³C-labeled pooled sample. The mixture was then analyzed by LC-MS to generate information on metabolite concentration differences among different individual samples. The interday repeatability for the LC-MS runs was assessed, and the median relative standard deviation over 4 days was 5.0%. This workflow was then applied to a metabolomic biomarker discovery study using urine samples obtained from the TgCRND8 mouse model of early onset familial Alzheimer's disease (FAD) throughout the course of their pathological deposition of beta amyloid (Aβ). It was shown that there was a distinct metabolomic separation between the AD prone mice and the wild type (control) group. As early as 15-17 weeks of age (presymptomatic), metabolomic differences were observed between the two groups, and after the age of 25 weeks the metabolomic alterations became more pronounced. The metabolomic changes at different ages corroborated well with the phenotype changes in this transgenic mouse model. Several useful candidate biomarkers including methionine, desaminotyrosine, taurine, N1-acetylspermidine, and 5-hydroxyindoleacetic acid were identified. Some of them were found in previous metabolomics studies in human cerebrospinal fluid or blood samples. This work illustrates the utility of this isotope labeling LC-MS method for biomarker discovery using mouse urine metabolomics.
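The repeatability figure above is a median relative standard deviation (RSD) across repeated LC-MS runs. As a minimal sketch, assuming each metabolite has one ¹²C/¹³C peak-intensity ratio per day and that the RSD is the sample standard deviation divided by the mean (the data layout and function names are hypothetical):

```python
import statistics


def relative_std_dev(values):
    """Sample standard deviation as a percentage of the mean.
    Assumes at least two values and a non-zero mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def median_rsd(ratios_by_metabolite):
    """Median of the per-metabolite RSDs.

    ratios_by_metabolite maps a metabolite name to its light/heavy peak
    ratios measured on different days.
    """
    return statistics.median(
        relative_std_dev(ratios) for ratios in ratios_by_metabolite.values()
    )
```

Because every individual sample is measured against the same ¹³C-labeled pooled sample, the ratio for a metabolite can be compared directly across samples and days, which is what makes a single RSD per metabolite meaningful.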


BMC Bioinformatics | 2013

An improved method to detect correct protein folds using partial clustering

Jianjun Zhou; David S. Wishart

Background: Structure-based clustering is commonly used to identify correct protein folds among candidate folds (also called decoys) generated by protein structure prediction programs. However, traditional clustering methods exhibit a poor runtime performance on large decoy sets. We hypothesized that a more efficient “partial” clustering approach in combination with an improved scoring scheme could significantly improve both the speed and performance of existing candidate selection methods.

Results: We propose a new scheme that performs rapid but incomplete clustering on protein decoys. Our method detects structurally similar decoys (measured using either Cα RMSD or GDT-TS score) and extracts representatives from them without assigning every decoy to a cluster. We integrated our new clustering strategy with several different scoring functions to assess both the performance and speed in identifying correct or near-correct folds. Experimental results on 35 Rosetta decoy sets and 40 I-TASSER decoy sets show that our method can improve the correct fold detection rate as assessed by two different quality criteria. This improvement is significantly better than two recently published clustering methods, Durandal and Calibur-lite. Speed and efficiency testing shows that our method can handle much larger decoy sets and is up to 22 times faster than Durandal and Calibur-lite.

Conclusions: The new method, named HS-Forest, avoids the computationally expensive task of clustering every decoy, yet still allows superior correct-fold selection. Its improved speed, efficiency and decoy-selection performance should enable structure prediction researchers to work with larger decoy sets and significantly improve their ab initio structure prediction performance.
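For context, the conventional clustering-based selection that the abstract contrasts itself with can be sketched as choosing the decoy with the most structural neighbours. The code below is that baseline, not HS-Forest: it assumes a precomputed, symmetric pairwise distance matrix (for example Cα RMSD values), and the function name and threshold are illustrative.

```python
def most_supported_decoy(dist, threshold):
    """Return (index, neighbour_count) of the decoy with the most neighbours.

    dist is a square matrix of pairwise structural distances, and a
    neighbour is any other decoy within `threshold`. Ties go to the
    lowest index.
    """
    best_index, best_count = None, -1
    for i, row in enumerate(dist):
        count = sum(1 for j, d in enumerate(row) if j != i and d <= threshold)
        if count > best_count:
            best_index, best_count = i, count
    return best_index, best_count
```

This baseline needs all pairwise distances, so its cost grows quadratically with the number of decoys. That cost is what a partial clustering approach, which avoids assigning every decoy to a cluster, aims to reduce.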


Nucleic Acids Research | 2007

PPT-DB: the protein property prediction and testing database

David S. Wishart; David Arndt; Mark V. Berjanskii; Anchi Guo; Yi Shi; Savita Shrivastava; Jianjun Zhou; You Zhou; Guohui Lin

The protein property prediction and testing database (PPT-DB) is a database housing nearly 30 carefully curated databases, each of which contains commonly predicted protein property information. These properties include both structural (i.e. secondary structure, contact order, disulfide pairing) and dynamic (i.e. order parameters, B-factors, folding rates) features that have been measured, derived or tabulated from a variety of sources. PPT-DB is designed to serve two purposes. First it is intended to serve as a centralized, up-to-date, freely downloadable and easily queried repository of predictable or ‘derived’ protein property data. In this role, PPT-DB can serve as a one-stop, fully standardized repository for developers to obtain the required training, testing and validation data needed for almost any kind of protein property prediction program they may wish to create. The second role that PPT-DB can play is as a tool for homology-based protein property prediction. Users may query PPT-DB with a sequence of interest and have a specific property predicted using a sequence similarity search against PPT-DB's extensive collection of proteins with known properties. PPT-DB exploits the well-known fact that protein structure and dynamic properties are highly conserved between homologous proteins. Predictions derived from PPT-DB's similarity searches are typically 85–95% correct (for categorical predictions, such as secondary structure) or exhibit correlations of >0.80 (for numeric predictions, such as accessible surface area). This performance is 10–20% better than what is typically obtained from standard ‘ab initio’ predictions. PPT-DB, its prediction utilities and all of its contents are available at http://www.pptdb.ca.
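Homology-based property prediction, as described above, transfers a known property from the most similar database sequence to the query. The sketch below illustrates only that transfer step. It is not PPT-DB's search: Python's `difflib.SequenceMatcher` ratio is used as a crude stand-in for a proper sequence alignment score, and the tiny database and its labels are hypothetical.

```python
from difflib import SequenceMatcher


def predict_property(query, database):
    """Return (property, similarity) from the database entry most similar to query.

    database is a list of (sequence, property) pairs. Similarity is
    SequenceMatcher.ratio(), a rough 0..1 score used here only as a
    stand-in for a real alignment-based similarity search.
    """
    best_property, best_score = None, -1.0
    for sequence, known_property in database:
        score = SequenceMatcher(None, query, sequence).ratio()
        if score > best_score:
            best_property, best_score = known_property, score
    return best_property, best_score
```

A practical version would reject matches below a similarity cutoff, since the transfer is justified only when the query and the database protein are plausibly homologous.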

Collaboration


Dive into Jianjun Zhou's collaboration.

Top Co-Authors

Yi Shi, University of Alberta

Liang Li, Huazhong University of Science and Technology

An He, University of British Columbia