Publication


Featured research published by Yuhua Yao.


Proteins | 2008

Analysis of similarity/dissimilarity of protein sequences

Yuhua Yao; Qi Dai; Chun Li; Ping-an He; Xuying Nan; Yaozhou Zhang

On the basis of a selected pair of physicochemical properties of amino acids, we introduce a dynamic 2D graphical representation of protein sequences. Then, we introduce and compare two numerical characterizations of protein graphs as descriptors to analyze the nine ND5 proteins. The approach is simple, convenient, and fast. Proteins 2008.


Journal of Computational Chemistry | 2009

Similarity/dissimilarity studies of protein sequences based on a new 2D graphical representation

Yuhua Yao; Qi Dai; Ling Li; Xuying Nan; Ping-an He; Yaozhou Zhang

A two-dimensional (2D) graphical representation of protein sequences based on six physicochemical properties of amino acids is outlined. The numerical characterization of protein graphs is given as descriptors of protein sequences. It is useful not only for the comparative study of proteins but also for encoding innate information about the structure of proteins. The coefficient of determination is proposed as a new similarity/dissimilarity measure. Finally, a simple example highlights the behavior of the new similarity/dissimilarity measure on protein sequences taken from the ND6 (NADH dehydrogenase subunit 6) proteins of eight different species. The results demonstrate that the approach is convenient, fast, and efficient.
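As a rough illustration of using a coefficient of determination as a similarity score, the sketch below computes the squared Pearson correlation between two numeric descriptor vectors. The function name `r_squared` and the toy vectors are hypothetical, and the abstract does not say how the descriptors are built or exactly how the coefficient is defined, so this is only one assumed reading of the measure.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant vector has no defined correlation")
    return (sxy * sxy) / (sxx * syy)

# Hypothetical descriptor vectors for three sequences (toy numbers only).
desc_a = [1.0, 2.0, 3.0, 4.0]
desc_b = [2.1, 3.9, 6.2, 8.0]   # nearly a scaled copy of desc_a
desc_c = [4.0, 1.0, 3.5, 2.0]   # unrelated pattern

print(r_squared(desc_a, desc_b))  # close to 1: similar descriptors
print(r_squared(desc_a, desc_c))  # much smaller: dissimilar descriptors
```

Under this reading, values near 1 indicate similar sequences and values near 0 indicate dissimilar ones.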


Journal of Computational Chemistry | 2010

The graphical representation of protein sequences based on the physicochemical properties and its applications

Ping-an He; Yan-Ping Zhang; Yuhua Yao; Yi-Fa Tang; Xuying Nan

Based on the chaos game representation, a 2D graphical representation of protein sequences is introduced in which the 20 amino acids are arranged in a cyclic order according to their physicochemical properties. The Euclidean distances between corresponding amino acids in the 2D graphical representations are computed to find matching (or conserved) fragments of amino acids between two proteins. In addition, the cumulative distance of the 2D graphical representations is defined to compare the similarity of proteins. An examination of the similarity among sequences of the ND5 proteins of nine species shows the utility of our approach.
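A minimal sketch of a chaos game representation with 20 vertices may help make the idea concrete. It assumes the vertices sit evenly on a unit circle, uses an alphabetical residue order as a placeholder (not the physicochemical order from the paper), and assumes a contraction ratio of 0.5. The names `cgr_points` and `pointwise_distances` and the two toy sequences are hypothetical.

```python
import math

# Placeholder cyclic order: alphabetical one-letter codes, NOT the
# physicochemical ordering described in the abstract.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VERTEX = {
    aa: (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i, aa in enumerate(AMINO_ACIDS)
}

def cgr_points(seq, ratio=0.5):
    """Return the CGR trajectory: each point moves `ratio` of the way
    from the previous point toward the vertex of the current residue."""
    x, y = 0.0, 0.0  # start at the centre of the circle
    points = []
    for aa in seq:
        vx, vy = VERTEX[aa]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points

def pointwise_distances(seq_a, seq_b):
    """Euclidean distance between corresponding CGR points (common length only)."""
    return [math.dist(p, q) for p, q in zip(cgr_points(seq_a), cgr_points(seq_b))]

# Toy sequences that differ at positions 4 and 5 only.
a = "MTMHTTMTTL"
b = "MTMYATMTTL"
d = pointwise_distances(a, b)
print([round(v, 3) for v in d])
print("cumulative distance:", round(sum(d), 3))
```

Runs of near-zero distances mark fragments where the two curves coincide, and the sum gives one possible cumulative distance between the two curves.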


Journal of Theoretical Biology | 2012

A 3D graphical representation of protein sequences based on the Gray code.

Ping-an He; Dan Li; Yanping Zhang; Xin Wang; Yuhua Yao

Based on the order of the 6-bit binary Gray code, a cyclic order of the 20 amino acids is introduced. A novel 3D graphical representation of protein sequences is proposed, following the chaos game representation (CGR) of DNA sequences. Furthermore, a mathematical descriptor is suggested to characterize the graphical representation curve. The efficiency of our approach is illustrated by comparing similarities/dissimilarities among sequences of the ND5 proteins of nine different species. With correlation and significance analysis, comparing both our results and the results of other graphical representations with the ClustalW results shows the utility of our approach.
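For readers unfamiliar with Gray codes, the sketch below generates the standard reflected 6-bit binary Gray code, in which consecutive code words differ in exactly one bit and the last word wraps around to the first. How the paper maps these 64 code words onto the 20 amino acids is not stated in the abstract and is not attempted here. The function name `gray_code` is hypothetical.

```python
def gray_code(bits):
    """Return the reflected binary Gray code sequence for the given bit width."""
    return [n ^ (n >> 1) for n in range(1 << bits)]

codes = gray_code(6)
for n in codes[:4]:
    print(format(n, "06b"))
# prints 000000, 000001, 000011, 000010 (one per line)
```

The single-bit-change property, including the wrap-around from the last code word to the first, is what makes the sequence a natural source for a cyclic ordering.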


Journal of Theoretical Biology | 2014

A protein structural classes prediction method based on PSI-BLAST profile

Shuyan Ding; Shoujiang Yan; Shuhua Qi; Yan Li; Yuhua Yao

Knowledge of protein structural classes plays an important role in understanding protein folding patterns. Prediction of protein structural class based solely on sequence data remains a challenging problem. In this study, we extract long-range correlation information and linear correlation information from the position-specific score matrix (PSSM). A total of 3600 features are extracted, and 278 features are then selected by a filter feature selection method based on the 1189 dataset. To verify the performance of our method (named LCC-PSSM), jackknife tests are performed on three widely used low-similarity benchmark datasets. Comparison of our results with existing methods shows that our method provides favorable performance for protein structural class prediction. A stand-alone version of the proposed method (LCC-PSSM) is written in MATLAB and can be downloaded from http://bioinfo.zstu.edu.cn/LCC-PSSM/.
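The jackknife test mentioned above is a leave-one-out evaluation: each sample is predicted by a model built from all the other samples. The sketch below shows that loop with a 1-nearest-neighbour classifier, which is an assumption chosen for brevity and not the classifier used by LCC-PSSM. The feature vectors, the helper names, and the toy data are hypothetical.

```python
def nearest_label(train, query):
    """Label of the training vector closest to query (squared Euclidean distance)."""
    best = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))
    return best[1]

def jackknife_accuracy(samples):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(rest, features) == label:
            correct += 1
    return correct / len(samples)

# Toy two-feature samples with made-up values.
toy = [
    ([0.0, 0.1], "all-alpha"),
    ([0.1, 0.0], "all-alpha"),
    ([1.0, 0.9], "all-beta"),
    ([0.9, 1.0], "all-beta"),
]
print(jackknife_accuracy(toy))
```

Because every sample is held out exactly once, the result does not depend on a random split, which is one reason the test is popular for small benchmark datasets.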


Journal of Computational Chemistry | 2008

Analysis of similarity/dissimilarity of DNA sequences based on a class of 2D graphical representation

Yuhua Yao; Qi Dai; Xuying Nan; Ping-an He; Zuoming Nie; Songping Zhou; Yaozhou Zhang

On the basis of a class of 2D graphical representations of DNA sequences, sensitivity analysis has been performed, showing the high capability of the proposed representations to take into account small modifications of the DNA sequences. The sensitivity analysis also indicates that the absolute differences of the leading eigenvalues of the L/L matrices associated with DNA increase with the number of base mutations. In addition, we conclude that the similarity analysis method based on correlation angles eliminates the effects of DNA sequence length better than the method using Euclidean distances. As an application, similarities/dissimilarities among the coding sequences of the first exon of the β‐globin gene of different species have been examined by our method, and the reasonable results verify its validity.
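The contrast between correlation angles and Euclidean distances can be seen in a toy case. The sketch below assumes, purely for illustration, that a longer sequence yields a descriptor vector that is roughly a scaled copy of a shorter one. The vectors and the function names `euclidean` and `angle` are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def angle(u, v):
    """Angle in radians between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (nu * nv)))  # clamp rounding error
    return math.acos(cos)

short_desc = [3.0, 1.0, 2.0]   # toy descriptor of a short sequence
long_desc = [6.0, 2.0, 4.0]    # same direction, twice the magnitude

print(euclidean(short_desc, long_desc))  # clearly nonzero
print(angle(short_desc, long_desc))      # approximately zero
```

The angle depends only on the direction of the vectors, so a pure change in magnitude leaves it essentially unchanged, while the Euclidean distance grows with the scale difference.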


Journal of Theoretical Biology | 2014

A novel descriptor of protein sequences and its application

Yuhua Yao; Shoujiang Yan; Jianning Han; Qi Dai; Ping-an He

In this paper, a dynamic 3D graphical representation of protein sequences is introduced based on three physicochemical properties of amino acids. The coordinates of the graph have direct biological significance, which could reflect the innate structure of the proteins. The principal moments of inertia and the ranges of the axis coordinates are extracted as a novel mixed descriptor for the comparison of protein primary sequences. The Euclidean distance between the normalized descriptor vectors, which avoids the influence of differences in the lengths of the protein sequences under consideration, is employed as a quantitative measure of protein similarity. Finally, we use the nine ND5 (NADH dehydrogenase subunit 5) proteins as an example to illustrate the effectiveness of our approach.


BMC Bioinformatics | 2013

Comparison study on statistical features of predicted secondary structures for protein structural class prediction: From content to position

Qi Dai; Yan Li; Xiaoqing Liu; Yuhua Yao; Yunjie Cao; Ping-an He

Background: Many content-based statistical features of secondary structural elements (CBF-PSSEs) have been proposed and have achieved promising results in protein structural class prediction, but until now the position distribution of the successive occurrences of an element in predicted secondary structure sequences hasn’t been used. It is necessary to extract appropriate position-based features of the secondary structural elements for the prediction task. Results: We proposed position-based features of predicted secondary structural elements (PBF-PSSEs) and assessed their intrinsic ability relative to the available CBF-PSSEs, which not only offers a systematic and quantitative experimental assessment of these statistical features, but also naturally complements the available comparison of the CBF-PSSEs. We also analyzed the performance of the CBF-PSSEs combined with the PBF-PSSE and further constructed a new combined feature set, PBF11CBF-PSSE. Based on these experiments, novel and valuable guidelines for the use of PBF-PSSEs and CBF-PSSEs were obtained. Conclusions: PBF-PSSEs and CBF-PSSEs have a compelling impact on protein structural class prediction. When combined with the PBF-PSSE, most of the CBF-PSSEs show a large improvement in prediction accuracy, so the PBF-PSSEs and the CBF-PSSEs need to work together to make significant and complementary contributions to protein structural class prediction. In addition, the proposed PBF-PSSE’s performance is extremely sensitive to the choice of parameter k. In summary, our quantitative analysis verifies that exploring the position information of predicted secondary structural elements is a promising way to improve protein structural class prediction.


Journal of Theoretical Biology | 2011

Numerical characteristics of word frequencies and their application to dissimilarity measure for sequence comparison.

Qi Dai; Xiaoqing Liu; Yuhua Yao; Fukun Zhao

Sequence comparison is one of the major tasks in bioinformatics, which can be used to study structural and functional conservation, as well as evolutionary relations among the sequences. Numerous dissimilarity measures achieve promising results in sequence comparison, but challenges remain. This paper studied numerical characteristics of word frequencies and proposed a novel dissimilarity measure for sequence comparison. Instead of using the word frequencies directly, the proposed measure considers both the word frequencies and the overlapping structures of words. To verify the effectiveness of the proposed measure, we tested it with two experiments and further compared it with alignment-based and alignment-free measures. The results demonstrate that the proposed measure, by extracting more information on the overlapping structures of the words, improves the efficiency of sequence comparison.
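For context, the sketch below shows the kind of plain word-frequency comparison that the proposed measure is described as going beyond: it counts overlapping words of length k and takes the Euclidean distance between the two frequency vectors. This is an assumed baseline, not the measure proposed in the paper, and it ignores the overlapping structure of words. The function names and toy sequences are hypothetical.

```python
from collections import Counter
import math

def word_frequencies(seq, k):
    """Relative frequencies of all overlapping words of length k in seq."""
    n = len(seq) - k + 1
    if n <= 0:
        raise ValueError("sequence shorter than word length")
    counts = Counter(seq[i:i + k] for i in range(n))
    return {w: c / n for w, c in counts.items()}

def frequency_distance(seq_a, seq_b, k=2):
    """Euclidean distance between the k-word frequency vectors of two sequences."""
    fa, fb = word_frequencies(seq_a, k), word_frequencies(seq_b, k)
    words = set(fa) | set(fb)
    return math.sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))

# Toy DNA strings.
s1 = "ATGGTGCACCTGACT"
s2 = "ATGGTGCATCTGACT"   # one substitution relative to s1
s3 = "CCCCCCGGGGGGAAA"

print(frequency_distance(s1, s2))
print(frequency_distance(s1, s3))
```

No alignment is needed, which is the main appeal of alignment-free measures of this kind.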


Gene | 2015

Prediction of protein structural classes for low-similarity sequences using reduced PSSM and position-based secondary structural features.

Junru Wang; Cong Wang; Jiajia Cao; Xiaoqing Liu; Yuhua Yao; Qi Dai

Many efficient methods have been proposed to advance protein structural class prediction, but challenges remain for low-similarity sequences, where additional insight or technology is needed. In this work, we designed a new prediction method for low-similarity datasets using reduced PSSM and position-based secondary structural features. We evaluated the proposed method with four experiments and compared it with the available competing prediction methods. The results indicate that the proposed method achieved the best performance among the evaluated methods, with overall accuracy 3-5% higher than the existing best-performing method. This paper also found that reduced alphabets of size 13 simplify PSSM structures efficiently while preserving their maximal information. This understanding can be used to design more powerful prediction methods for protein structural class.

Collaboration


Top Co-Authors

Qi Dai, Zhejiang Sci-Tech University
Ping-an He, Zhejiang Sci-Tech University
Xiaoqing Liu, Hangzhou Dianzi University
Xuying Nan, Zhejiang Sci-Tech University
Huimin Xu, Zhejiang Sci-Tech University
Yaozhou Zhang, Zhejiang Sci-Tech University
Cong Wang, Zhejiang Sci-Tech University
Fen Kong, Zhejiang Sci-Tech University
Shoujiang Yan, Zhejiang Sci-Tech University
Zhuoxing Shi, Zhejiang Sci-Tech University