Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Peiying Ruan is active.

Publication


Featured researches published by Peiying Ruan.


BMC Bioinformatics | 2014

Feature weight estimation for gene selection: a local hyperlinear learning approach

Hongmin Cai; Peiying Ruan; Michael K. Ng; Tatsuya Akutsu

BackgroundModeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments, nevertheless,it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes, buried in high-dimensional irrelevant noises. RELIEF is a popular and widely used approach for feature selection owing to its low computational cost and high accuracy. However, RELIEF based methods suffer from instability, especially in the presence of noisy and/or high-dimensional outliers.ResultsWe propose an innovative feature weighting algorithm, called LHR, to select informative genes from highly noisy data. LHR is based on RELIEF for feature weighting using classical margin maximization. The key idea of LHR is to estimate the feature weights through local approximation rather than global measurement, which is typically used in existing methods. The weights obtained by our method are very robust in terms of degradation of noisy features, even those with vast dimensions. To demonstrate the performance of our method, extensive experiments involving classification tests have been carried out on both synthetic and real microarray benchmark datasets by combining the proposed technique with standard classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), linear discriminant analysis (LDA) and naive Bayes (NB).ConclusionExperiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed feature selection method combined with supervised learning in three aspects: 1) high classification accuracy, 2) excellent robustness to noise and 3) good stability using to various classification algorithms.


BMC Bioinformatics | 2014

Prediction of Heterotrimeric Protein Complexes by Two-phase Learning Using Neighboring Kernels

Peiying Ruan; Morihiro Hayashida; Osamu Maruyama; Tatsuya Akutsu

BackgroundProtein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find protein complexes with size more than three. It, however, is known that protein complexes with smaller sizes occupy a large part of whole complexes for several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for prediction of heterodimeric protein complexes, which outperforms existing methods.ResultsWe propose methods for prediction of heterotrimeric protein complexes by extending techniques in the previous work on the basis of the idea that most heterotrimeric protein complexes are not likely to share the same protein with each other. We make use of the discriminant function in support vector machines (SVMs), and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform 10-fold cross-validation computational experiments. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes.ConclusionsWe propose two-phase prediction methods with the extended features, the domain composition kernel, SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel using SVM as the second classifier is particularly useful for prediction of heterotrimeric protein complexes.


PLOS ONE | 2013

Prediction of Heterodimeric Protein Complexes from Weighted Protein-Protein Interaction Networks Using Novel Features and Kernel Functions

Peiying Ruan; Morihiro Hayashida; Osamu Maruyama; Tatsuya Akutsu

Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many prediction methods for protein complexes from protein-protein interactions have been developed such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt with only complexes with size of more than three because the methods often are based on some density of subgraphs. However, heterodimeric protein complexes that consist of two distinct proteins occupy a large part according to several comprehensive databases of known complexes. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. These results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.


Integrated Computer-aided Engineering | 2012

A quadsection algorithm for grammar-based image compression

Morihiro Hayashida; Peiying Ruan; Tatsuya Akutsu

Grammar-based compression is to find a small grammar that generates a given data and has been well-studied in text compression. In this paper, we apply this methodology to compression of rectangular image data. We first define a context-free rectangular image grammar CFRIG by extending the context-free grammar. Then we propose a quadsection type algorithm by extending a bisection type algorithm for grammar-based compression of text data. We show that our proposed algorithm approximates in polynomial time the smallest CFRIG within a factor of On^{4/3}, where an input image data is of size On × On. Furthermore, we practically improve the proposed algorithm by adding several rules concerning sub-images. We also present results on computational experiments on the proposed algorithms.


Methods | 2014

Proteome compression via protein domain compositions.

Morihiro Hayashida; Peiying Ruan; Tatsuya Akutsu

In this paper, we study domain compositions of proteins via compression of whole proteins in an organism for the sake of obtaining the entropy that the individual contains. We suppose that a protein is a multiset of domains. Since gene duplication and fusion have occurred through evolutionary processes, the same domains and the same compositions of domains appear in multiple proteins, which enables us to compress a proteome by using references to proteins for duplicated and fused proteins. Such a network with references to at most two proteins is modeled as a directed hypergraph. We propose a heuristic approach by combining the Edmonds algorithm and an integer linear programming, and apply our procedure to 14 proteomes of Dictyostelium discoideum, Escherichia coli, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Caenorhabditis elegans, Drosophila melanogaster, Arabidopsis thaliana, Oryza sativa, Danio rerio, Xenopus laevis, Gallus gallus, Mus musculus, Pan troglodytes, and Homo sapiens. The compressed size using both of duplication and fusion was smaller than that using only duplication, which suggests the importance of fusion events in evolution of a proteome.


BMC Bioinformatics | 2018

Improving prediction of heterodimeric protein complexes using combination with pairwise kernel

Peiying Ruan; Morihiro Hayashida; Tatsuya Akutsu; Jean-Philippe Vert

BackgroundSince many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) network. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is however an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers.ResultsIn this paper, we use three promising kernel functions, Min kernel and two pairwise kernels, which are Metric Learning Pairwise Kernel (MLPK) and Tensor Product Pairwise Kernel (TPPK). We also consider the normalization forms of Min kernel. Then, we combine Min kernel or its normalization form and one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of normalized-Min-kernel and MLPK leads to the best F-measure and improved the performance of our previous work, which had been the best existing method so far.ConclusionsWe propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on informations extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.


international conference on future generation information technology | 2010

A Quadsection Algorithm for Grammar-Based Image Compression

Morihiro Hayashida; Peiying Ruan; Tatsuya Akutsu

Grammar-based compression is to find a small grammar that generates a given data and has been well-studied in text compression. In this paper, we apply this methodology to compression of rectangular image data. We first define a context-free rectangular image grammar (CFRIG) by extending the context-free grammar. Then we propose a quadsection type algorithm by extending a bisection type algorithm for grammar-based compression of text data. We show that our proposed algorithm approximates in polynomial time the smallest CFRIG within a factor of O(n 4/3), where an input image data is of size O(n) ×O(n). We also present results on computational experiments on the proposed algorithm.


IEICE technical report. Speech | 2014

Prediction of Heterotrimeric Protein Complexes by Two-phase Learning Using Neighboring Kernels (ニューロコンピューティング)

Peiying Ruan; Morihiro Hayashida; Osamu Maruyama


IEICE technical report. Speech | 2014

Prediction of Heterotrimeric Protein Complexes by Two-phase Learning Using Neighboring Kernels (情報論的学習理論と機械学習)

Peiying Ruan; Morihiro Hayashida; Osamu Maruyama; Tatsuya Akutsu


情報処理学会研究報告 | 2010

A Quadsection Algorithm for Grammar-Based Image Compression (数理モデル化と問題解決(MPS) Vol.2010-MPS-78)

Morihiro Hayashida; Peiying Ruan; Tatsuya Akutsu

Collaboration


Dive into the Peiying Ruan's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Hongmin Cai

South China University of Technology

View shared research outputs
Top Co-Authors

Avatar

Michael K. Ng

Hong Kong Baptist University

View shared research outputs
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge