Wooyoung Kim | Researchain

Publication

Featured research published by Wooyoung Kim.

BMC Systems Biology | 2011

Biological network motif detection and evaluation

Wooyoung Kim; Min Li; Jianxin Wang; Yi Pan

Background: Molecular-level biological data can be assembled into system-level data in the form of biological networks. Network motifs are defined as over-represented small connected subgraphs in networks, and they have been used in many biological applications. Since network motif discovery involves computationally challenging processes, previous algorithms have focused on computational efficiency. However, we believe that the biological quality of network motifs is also very important. Results: In this paper, we define biological network motifs as biologically significant subgraphs and distinguish traditional network motifs as structural network motifs. We develop five algorithms, namely EDGE GO-BNM, EDGE BETWEENNESS-BNM, NMF-BNM, NMFGO-BNM and VOLTAGE-BNM, for efficient detection of biological network motifs, and introduce several evaluation measures, including motifs included in complex, motifs included in functional module, and GO term clustering score. Experimental results show that EDGE GO-BNM and EDGE BETWEENNESS-BNM perform better than existing algorithms, and that all of our algorithms are also applicable to finding structural network motifs. Conclusion: We provide new approaches to finding network motifs in biological networks. Our algorithms efficiently detect biological network motifs and further improve existing algorithms to find high-quality structural network motifs, which would be impossible using existing algorithms. The performance of the algorithms is compared using our new evaluation measures in biological contexts. We believe that our work provides some guidelines for network motif research on biological networks.
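The notion of an "over-represented" subgraph in the abstract above can be illustrated with a small sketch. It is not any of the five BNM algorithms from the paper. It assumes a simple undirected graph given as a list of edges, uses the triangle as the only candidate pattern, and compares its count against degree-preserving randomized graphs through a z-score. The function names (`count_triangles`, `rewire`, `triangle_z_score`) and all parameters are hypothetical.

```python
import random
from statistics import mean, pstdev


def normalize(u, v):
    """Store an undirected edge with its endpoints in sorted order."""
    return (u, v) if u <= v else (v, u)


def count_triangles(edges):
    """Count triangles in a simple undirected graph (no duplicate edges)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is found once per edge, hence the division by 3.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def rewire(edges, swaps, rng):
    """Randomize a graph with double edge swaps, keeping every node degree."""
    edge_list = [normalize(u, v) for u, v in edges]
    edge_set = set(edge_list)
    for _ in range(swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # the two edges share a node; skip this attempt
        new1, new2 = normalize(a, d), normalize(c, b)
        if new1 in edge_set or new2 in edge_set:
            continue  # the swap would create a duplicate edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


def triangle_z_score(edges, n_random=100, swaps_per_graph=50, seed=0):
    """Return (observed count, mean random count, z-score or None)."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    random_counts = [
        count_triangles(rewire(edges, swaps_per_graph, rng))
        for _ in range(n_random)
    ]
    mu, sigma = mean(random_counts), pstdev(random_counts)
    z = (observed - mu) / sigma if sigma > 0 else None
    return observed, mu, z
```

Calling `triangle_z_score(edges)` on an edge list returns the observed triangle count, the mean count over the randomized graphs, and the z-score; a large positive z-score is what a structural definition would treat as over-representation. Whether such a pattern is also biologically significant is a separate question, which is the distinction the abstract draws.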

Tsinghua Science & Technology | 2012

Prediction of essential proteins using topological properties in GO-pruned PPI network based on machine learning methods

Wooyoung Kim

The prediction of essential proteins, the minimal set required for a living cell to support cellular life, is an important task for understanding the cellular processes of an organism. Rapid progress in high-throughput technologies and the production of large amounts of data enable the discovery of essential proteins at the system level by analyzing Protein-Protein Interaction (PPI) networks, replacing biological or chemical experiments. Furthermore, additional gene-level annotation information, such as Gene Ontology (GO) terms, helps to detect essential proteins with higher accuracy. Various centrality algorithms have been used to determine essential proteins in a PPI network, and recently motif centrality GO, which is based on network motifs and GO terms, has performed best in detecting essential proteins in a baker's yeast (Saccharomyces cerevisiae) PPI network compared to other centrality algorithms. However, each centrality algorithm contributes to the detection of essential proteins with different properties, which makes integrating them a logical next step. In this paper, we construct a new feature space, named CENT-ING-GO, consisting of various centrality measures and GO terms, and provide a computational approach to predict essential proteins with various machine learning techniques. The experimental results show that the CENT-ING-GO feature space improves performance over the INT-GO feature space in previous work by Acencio and Lemke in 2009. We also demonstrate that pruning a PPI network with informative GO terms can further improve prediction performance.

bioinformatics and biomedicine | 2011

Essential Protein Discovery Based on Network Motif and Gene Ontology

Wooyoung Kim; Min Li; Jianxin Wang; Yi Pan

Essential proteins are indispensable to support cellular life and constitute a minimal set required for a living cell. Rapid progress in high-throughput technologies and large amounts of data make it possible to discover essential proteins at the system level by analyzing protein-protein interaction networks. A number of centrality algorithms have been proposed to detect essential proteins, but they focus only on network structure. In this paper, we develop a new centrality algorithm, named MCGO, which uses network motifs as a centrality measure in the graph pruned by EDGE GO. The EDGE GO algorithm utilizes Gene Ontology (GO) to trim uninformative edges from the network. We compare the performance of our algorithm with DC (degree centrality) and SoECC (sum of edge clustering coefficient) using various evaluation measures. Experimental results on a yeast protein-protein interaction network downloaded from the DIP database show that MCGO performs significantly better than DC and SoECC. We also show that DC and SoECC improve greatly when EDGE GO is applied to them.
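The two baseline measures named in the abstract, DC and SoECC, can be sketched on a toy graph. This does not implement MCGO or EDGE GO. The edge clustering coefficient used here is an assumption: the number of triangles containing an edge divided by min(deg(u) - 1, deg(v) - 1), taken as 0 when that denominator is 0. The abstract itself does not spell out the formula, and the function names and the toy protein labels are hypothetical.

```python
def build_adjacency(edges):
    """Build an adjacency map for a simple undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """DC: the number of interaction partners of each node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles on (u, v) over the maximum possible."""
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_triangles <= 0:
        return 0.0  # an endpoint of degree 1 cannot close any triangle
    return len(adj[u] & adj[v]) / max_triangles


def sum_of_ecc(adj):
    """SoECC: sum of edge clustering coefficients over a node's edges."""
    return {
        u: sum(edge_clustering_coefficient(adj, u, v) for v in neighbors)
        for u, neighbors in adj.items()
    }


# Toy network: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
toy_adj = build_adjacency(toy_edges)
dc = degree_centrality(toy_adj)
soecc = sum_of_ecc(toy_adj)
```

Ranking nodes by either dictionary gives a candidate list of essential proteins; both measures use topology alone, which is the limitation the abstract points to before introducing GO-based pruning.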

Expert Systems With Applications | 2011

Sparse nonnegative matrix factorization for protein sequence motif discovery

Wooyoung Kim; Bernard Chen; Jingu Kim; Yi Pan; Haesun Park

The problem of discovering motifs from protein sequences is a critical and challenging task in the field of bioinformatics. The task involves clustering relatively similar protein segments from a huge collection of protein sequences and culling high-quality motifs from a set of clusters. A granular computing strategy combined with the K-means clustering algorithm was previously proposed for the task, but this strategy requires a manual selection of biologically meaningful clusters to be used as an initial condition. This manipulated clustering method is undisciplined as well as computationally expensive. In this paper, we utilize sparse non-negative matrix factorization (SNMF) to cluster a large protein data set. We show how to combine this method with the Fuzzy C-means algorithm and incorporate bio-statistics information to increase the number of clusters whose structural similarity is high. Our experimental results show that an SNMF approach provides better protein groupings in terms of similarities in secondary structures while maintaining similarities in protein primary sequences.
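How a non-negative factorization yields clusters can be shown with a minimal sketch. It uses plain NMF with the standard multiplicative updates for the squared Frobenius error, not the sparse variant (SNMF) from the paper, and it omits the Fuzzy C-means and bio-statistics steps. The input is assumed to be a small non-negative matrix with one row per item; all names and parameters are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative n x m matrix V into W (n x rank), H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def assign_clusters(w):
    """Assign each row to the factor with the largest coefficient in W."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]
```

With `w, h = nmf(v, rank=2)`, the call `assign_clusters(w)` gives one cluster index per row of `v`. A sparse variant would add a penalty that pushes most coefficients toward zero, which tends to make these assignments sharper; that penalty is not part of this sketch.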

high performance computing and communications | 2015

Agent and Spatial Based Parallelization of Biological Network Motif Search

Matthew Kipps; Wooyoung Kim; Munehiro Fukuda

Most graph algorithms are challenging to parallelize, in particular when executing fine-grain computation at each graph node in parallel, from both programmability and performance viewpoints. To bridge the semantic gap between the original sequential algorithms and their corresponding parallelized programs, we have been developing MASS: a parallel library for multi-agent spatial simulation. The library allows software agents to crawl a distributed array, e.g., a graph mapped over a cluster system. To demonstrate the MASS library's fitness for graph parallelization, we have focused on biological network motif search. This paper compares three parallelization approaches, namely MASS agent-based, MASS array-based, and conventional MPI parallelization, and discusses the MASS library's applicability to graph algorithms.

international conference on machine learning and applications | 2008

Detection of Unnatural Movement Using Epitomic Analysis

Wooyoung Kim; James M. Rehg

Epitomic analysis, a recent statistical approach to form a generative model, has been applied to image, video and audio processing applications. We apply the epitomic analysis to motion capture data and define it as a motion epitome, a probabilistic model representing a finite set of primitive movements which retain various lengths of local dynamics. We review the generation, inference and learning procedures of an epitome, adapt them for motion capture data and utilize the epitomic analysis to detect unnatural movements given only positive (natural) training data. We introduce a multi-resolution of motion epitomes as well as a full body and an ensemble of epitomes, then present experimental results and compare the performance with other conventional classification methods, including Hidden Markov Models and Switching Linear Dynamic Systems.

international symposium on bioinformatics research and applications | 2017

NemoLib: A Java Library for Efficient Network Motif Detection

Andrew Andersen; Wooyoung Kim

A network motif is defined as an overabundant subgraph pattern in a network and has been applied to various biological and medical problems. Various network motif detection algorithms and tools are currently available. However, most existing software programs are outdated, incompatible with modern operating systems, or do not provide sufficient operating instructions. Furthermore, most tools provide limited information regarding network motifs, which necessitates post-processing programs before the results can be applied to real problems. Consequently, the lack of usability brings a certain amount of skepticism about the relevance of network motifs in investigating real biological problems. Therefore, this paper introduces NemoLib (network motif library) as a general-purpose tool for the detection and analysis of network motifs. NemoLib is a highly programmable Java library designed for extensibility.

BMC Bioinformatics | 2017

NemoProfile as an efficient approach to network motif analysis with instance collection

Wooyoung Kim; Lynnette Haukap

Background: A network motif is defined as a statistically significant and recurring subgraph pattern within a network. Most existing instance collection methods are not feasible because of high memory usage and because they provide limited network motif information. They rely on a two-step process in which network motifs must be identified before instances are collected. Because obtaining motif instances is impractical, the significance of their contribution to problem solving is debated within the field of biology. Results: This paper presents NemoProfile, an efficient new network motif data model. NemoProfile simplifies instance collection by resolving memory overhead issues and is seamlessly generated, thus eliminating the need for costly two-step processing. Additionally, a case study was conducted to demonstrate the application of network motifs to existing problems in the field of biology. Conclusion: NemoProfile comprises network motifs and their instances, thereby facilitating the use of network motifs in real biological problems.

ieee international conference on cloud computing technology and science | 2016

MASS-Based NemoProfile Construction for an Efficient Network Motif Search

Andrew Andersen; Wooyoung Kim; Munehiro Fukuda

A network motif is a frequent and unique subgraph pattern defined in a network, and it has been applied to various biological and medical problems. However, finding network motifs is a computationally intensive task, as it involves heavily resource-demanding steps. We have previously suggested a couple of parallelization efforts to alleviate the computational intensity, including MASS (Multi-Agent Spatial Simulation)-based and MapReduce-based methods. The MASS library allows software agents to crawl a distributed array over a cluster system to find network motifs. Although the MASS library can serve as a promising tool to parallelize graph algorithms intuitively with a moderate performance penalty, current output formats, which fail to provide instances of network motifs, restrict extensible applications to real-world problems. Therefore, this paper introduces a MASS-based parallelization strategy to construct a new network motif representation called NemoProfile (Network Motif Profile). A sequential implementation has shown that NemoProfile is a compact representation that can be generated almost effortlessly during the sequential version of the motif-finding process. In this paper, we also show its effectiveness in various parallel implementations, including MASS-Agent, MASS-Place, and MPI-based methods, by providing various testing results.

Tsinghua Science & Technology | 2013