Chihyun Park | Researchain

Publication

Featured research published by Chihyun Park.

Intelligent Systems in Molecular Biology | 2011

Integrative gene network construction for predicting a set of complementary prostate cancer genes

Jaegyoon Ahn; Youngmi Yoon; Chihyun Park; Eunji Shin; Sanghyun Park

Motivation: Diagnosis and prognosis of cancer, and understanding oncogenesis within the context of biological pathways, are among the most important research areas in bioinformatics. Recently, there have been several attempts to integrate interactome and transcriptome data to identify subnetworks that provide limited interpretations of known and candidate cancer genes, as well as increase classification accuracy. However, these studies provide little information about the detailed roles of identified cancer genes. Results: To provide more information to the network, we constructed the network by incorporating genetic interactions and manually curated gene regulations into the protein interaction network. To make our newly constructed network cancer specific, we identified edges where two genes show different expression patterns between cancer and normal phenotypes. We showed that the integration of various datasets increased classification accuracy, which suggests that our network is more complete than a network based solely on protein interactions. We also showed that our network contains significantly more known cancer-related genes than other feature selection algorithms. Through observations of some examples of cancer-specific subnetworks, we were able to predict more detailed and interpretable roles of oncogenes and other cancer candidate genes in prostate cancer cells. Availability: http://embio.yonsei.ac.kr/~Ahn/tc.php. Contact: [email protected]
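The edge-selection step described in this abstract (keeping edges whose two genes behave differently in cancer and normal samples) can be illustrated with a minimal sketch. It assumes that "different expression patterns" is measured as the change in Pearson correlation between the two phenotypes; the abstract does not state the actual criterion, and the gene names, expression values, and the `min_diff` threshold below are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def cancer_specific_edges(edges, cancer, normal, min_diff=0.5):
    """Keep edges whose co-expression differs between the two phenotypes.

    Assumption: the difference is the absolute change in Pearson correlation.
    """
    kept = []
    for a, b in edges:
        diff = abs(pearson(cancer[a], cancer[b]) - pearson(normal[a], normal[b]))
        if diff >= min_diff:
            kept.append((a, b, round(diff, 3)))
    return kept

# Toy data: hypothetical gene names and expression values.
cancer = {"G1": [1, 2, 3, 4], "G2": [4, 3, 2, 1], "G3": [1, 2, 3, 5]}
normal = {"G1": [1, 2, 3, 4], "G2": [1, 2, 3, 4], "G3": [1, 2, 3, 4]}
edges = [("G1", "G2"), ("G1", "G3")]
result = cancer_specific_edges(edges, cancer, normal)
```

In this toy input only the G1-G2 edge survives, because its correlation flips sign between the two phenotypes while G1-G3 stays almost perfectly correlated in both.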

PLOS ONE | 2014

Integrative gene network construction to analyze cancer recurrence using semi-supervised learning.

Chihyun Park; Jaegyoon Ahn; Hyunjin Kim; Sanghyun Park

Background: The prognosis of cancer recurrence is an important research area in bioinformatics and is challenging due to the small sample sizes compared to the vast number of genes. There have been several attempts to predict cancer recurrence. Most studies employed a supervised approach, which uses only a few labeled samples. Semi-supervised learning can be a great alternative to solve this problem. There have been few attempts based on manifold assumptions to reveal the detailed roles of identified cancer genes in recurrence. Results: In order to predict cancer recurrence, we proposed a novel semi-supervised learning algorithm based on a graph regularization approach. We transformed the gene expression data into a graph structure for semi-supervised learning and integrated protein interaction data with the gene expression data to select functionally related gene pairs. Then, we predicted the recurrence of cancer by applying a regularization approach to the constructed graph containing both labeled and unlabeled nodes. Conclusions: The average improvement rate of accuracy for three different cancer datasets was 24.9% compared to existing supervised and semi-supervised methods. We performed functional enrichment on the gene networks used for learning and found that those gene networks are significantly associated with cancer-recurrence-related biological functions. Our algorithm was implemented in standard C++ with the STL and is available for Linux and MS Windows. The executable program is freely available at: http://embio.yonsei.ac.kr/~Park/ssl.php.
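As an illustration of graph-regularized semi-supervised learning on a graph with labeled and unlabeled nodes, the sketch below uses a standard label-spreading iteration, f <- alpha * S f + (1 - alpha) * y, where S is the symmetrically normalized weight matrix. This is a generic textbook formulation swapped in for illustration, not the authors' algorithm; the toy graph, the `alpha` value, and the function name are assumptions.

```python
def propagate_labels(weights, labels, alpha=0.8, iterations=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y on a weighted graph.

    weights: symmetric matrix (list of lists) of non-negative edge weights
    labels:  +1 / -1 for labeled nodes, 0 for unlabeled nodes
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}
    s = [[weights[i][j] / (degree[i] * degree[j]) ** 0.5
          if degree[i] and degree[j] else 0.0
          for j in range(n)] for i in range(n)]
    f = [float(y) for y in labels]
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n))
             + (1 - alpha) * labels[i] for i in range(n)]
    return f

# Toy graph: nodes 0-2 form one cluster, nodes 3-5 another, weak link 2-3.
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
y = [1, 0, 0, 0, 0, -1]   # one labeled sample per class, rest unlabeled
scores = propagate_labels(W, y)
predicted = [1 if v > 0 else -1 for v in scores]
```

Each unlabeled node ends up with the sign of the labeled node in its own cluster, since very little label mass crosses the weak 2-3 edge.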

PLOS ONE | 2011

A multi-sample based method for identifying common CNVs in normal human genomic structure using high-resolution aCGH data

Chihyun Park; Jaegyoon Ahn; Youngmi Yoon; Sanghyun Park

Background: It is difficult to identify copy number variations (CNVs) in normal human genomic data due to noise and non-linear relationships between different genomic regions and signal intensity. A high-resolution array comparative genomic hybridization (aCGH) dataset containing 42 million probes, which is very large compared to previous arrays, was recently published. Most existing CNV detection algorithms do not work well because of noise associated with the large amount of input data and because most of the current methods were not designed to analyze normal human samples. Normal human genome analysis often requires a joint approach across multiple samples. However, the majority of existing methods can only identify CNVs from a single sample. Methodology and Principal Findings: We developed a multi-sample-based genomic variations detector (MGVD) that uses segmentation to identify common breakpoints across multiple samples and a k-means-based clustering strategy. Unlike previous methods, MGVD simultaneously considers multiple samples with different genomic intensities and identifies CNVs and CNV zones (CNVZs); a CNVZ is a more precise measure of the location of a genomic variant than the CNV region (CNVR). Conclusions and Significance: We designed a specialized algorithm to detect common CNVs from extremely high-resolution multi-sample aCGH data. MGVD showed high sensitivity and a low false discovery rate for a simulated data set, and outperformed most current methods when real, high-resolution HapMap datasets were analyzed. MGVD also had the fastest runtime of the algorithms evaluated when actual, high-resolution aCGH data were analyzed. The CNVZs identified by MGVD can be used in association studies for revealing relationships between phenotypes and genomic aberrations. Our algorithm was implemented in standard C++ with the STL and is available for Linux and MS Windows. It is freely available at: http://embio.yonsei.ac.kr/~Park/mgvd.php.
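To make the k-means-based clustering idea concrete, here is a minimal sketch that clusters segment mean intensities into three copy-number states. It is not MGVD itself: the abstract does not specify how segments are represented or how clusters are initialized, so the pooled log2-ratio values, the three fixed starting centers, and the state names below are assumptions.

```python
def kmeans_1d(values, centers, iterations=50):
    """Plain Lloyd's k-means on scalar values with given initial centers."""
    centers = list(centers)
    assign = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest center for every value.
        assign = [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
                  for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [v for v, a in zip(values, assign) if a == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return assign, centers

# Hypothetical segment mean log2 ratios pooled from several samples.
segment_means = [-0.9, -1.1, 0.05, -0.02, 0.0, 0.1, 0.8, 1.0]
labels, centers = kmeans_1d(segment_means, centers=[-1.0, 0.0, 1.0])
states = [("loss", "neutral", "gain")[k] for k in labels]
```

Starting the centers near the expected loss, neutral, and gain levels keeps the cluster indices aligned with the state names, which a random initialization would not guarantee.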

Computers in Biology and Medicine | 2013

ICP: A novel approach to predict prognosis of prostate cancer with inner-class clustering of gene expression data

Hyunjin Kim; Jaegyoon Ahn; Chihyun Park; Youngmi Yoon; Sanghyun Park

Prostate cancer has heterogeneous characteristics. For that reason, even if tumors appear histologically similar to each other, there are many cases in which they are actually different, based on their gene expression levels. A single tumor may have multiple expression levels with both high-risk cancer genes and low-risk cancer genes. We can produce more useful models for stratifying prostate cancers into high-risk cancer and low-risk cancer categories by considering the range in each class through inner-class clustering. In this paper, we attempt to classify cancers into high-risk (aggressive) prostate cancer and low-risk (non-aggressive) prostate cancer using ICP (Inner-class Clustering and Prediction). Our model classified more efficiently than the models of the algorithms used for comparison. After discovering a number of genes linked to prostate cancer from the gene pairs used in our classification, we discovered that the proposed method can be used to find new unknown genes and gene pairs which distinguish between high-risk cancer and low-risk cancer.

Bioinformatics and Bioengineering | 2009

A Computational Approach to Detect CNVs Using High-throughput Sequencing

Myungjin Moon; Jaegyoon Ahn; Youngmi Yoon; Chihyun Park; Sanghyun Park; Jee-Hee Yoon

Copy-number variations (CNVs) can be defined as gains or losses of more than 1 kb of genomic DNA among phenotypically normal individuals. CNVs detected by microarray-based approaches are limited to medium or large ones because of the low resolution of microarrays. Here we propose a novel approach that detects CNVs by aligning the short reads obtained from a high-throughput sequencer to the previously assembled human genome sequence and analyzing the distribution of the aligned reads. Application of our algorithm demonstrates the feasibility of detecting CNVs of arbitrary length, including short ones that microarray-based algorithms cannot detect. In addition, the false positive and false negative rates of the results were relatively low compared to those of microarray-based algorithms.
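The idea of analyzing the distribution of aligned reads can be sketched as a simple read-depth scan: count read start positions in fixed windows and flag windows whose depth departs from the median. This is a generic illustration, not the method of the paper; the window size, the `gain` and `loss` ratio thresholds, and the toy read positions are assumptions.

```python
from statistics import median

def window_counts(read_starts, genome_length, window):
    """Count aligned read start positions per fixed-size window."""
    counts = [0] * ((genome_length + window - 1) // window)
    for pos in read_starts:
        counts[pos // window] += 1
    return counts

def call_cnv_windows(counts, gain=1.5, loss=0.5):
    """Label each window by its depth ratio to the median window depth."""
    base = median(counts)
    calls = []
    for c in counts:
        ratio = c / base if base else 0.0
        if ratio >= gain:
            calls.append("gain")
        elif ratio <= loss:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls

# Toy input: 5 windows of 100 bp; window 1 is depleted, window 3 is enriched.
reads = []
for w, depth in enumerate([10, 3, 10, 20, 10]):
    reads += [w * 100 + (i * 7) % 100 for i in range(depth)]
counts = window_counts(reads, 500, 100)
calls = call_cnv_windows(counts)
```

Because the window size is a free parameter, smaller windows can in principle expose shorter variants than array probes allow, at the cost of noisier counts per window.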

ACM Symposium on Applied Computing | 2009

A novel approach to detect copy number variation using segmentation and genetic algorithm

Chihyun Park; Youngmi Yoon; Jaegyoon Ahn; Myungjin Moon; Sanghyun Park

Among the many forms of genomic variation, copy-number variations (CNVs) can be defined as gains or losses of several kilobases to hundreds of kilobases of genomic DNA. Since many CNVs include genes that result in differential levels of gene expression, CNVs may account for a significant proportion of normal phenotypic variation. One study reported that a large portion of the overlapping, currently known common human CNVs were smaller in its dataset. However, previous experimental studies, performed primarily with a-CGH techniques, have been limited to detecting large CNVs. Efficient algorithms for finding small CNVs are therefore essential. In this paper, we propose a novel approach for finding small CNVs in a-CGH data, based on a sequential 2-dimensional clustering method. The proposed algorithm is robust to some level of noise and, regardless of probe size, can find CNVs consisting of a small number of probes.

Expert Systems With Applications | 2017

Systematic identification of differential gene network to elucidate Alzheimer's disease

Chihyun Park; Youngmi Yoon; Min Oh; Seok Jong Yu; Jaegyoon Ahn

We focus on revealing the mechanism of Alzheimer's disease (AD) by network analysis. We present a novel method to construct a gene network by integrating omics data. The gene network, maintaining the specificity of AD, was statistically optimized. Potential genes and modules that can elucidate a mechanism of AD were identified. We demonstrated that an epigenetic factor and a ribosomal process are associated with AD.

Alzheimer's disease (AD) is a genetically complex neurodegenerative disease and its pathological mechanism has not been fully discovered. The mechanism of AD can be inferred by elucidating how molecular entities interact at the pathway level and how some pathways collectively influence the occurrence of the disease. Such an analysis is considerably complex and cannot be manually performed by experts. It can be solved by integrating huge heterogeneous datasets and systematically building an intelligent system that models the molecular network and analyzes causality. Here, we present a novel method to construct an optimized AD-specific differential gene network by integrating a high-confidence interactome and gene expression data. In order to consider an epigenetic factor, we identified differentially methylated genes in AD and projected the results onto the network for mechanism analysis. Through diverse topological analyses and functional enrichment tests, we experimentally demonstrated that several potential genes and subnetworks were significantly related to AD and could be used to elucidate the molecular mechanism. Taking the experimental results and literature studies together, we newly discovered that ribosomal process-related genes and DNA methylation might play an important role in AD. The proposed system is applicable not only to AD but also to various complex genetic disease models that require new network-based molecular mechanism analysis.

BMC Bioinformatics | 2017

Drug voyager: a computational platform for exploring unintended drug action

Min Oh; Jaegyoon Ahn; Taekeon Lee; Giup Jang; Chihyun Park; Youngmi Yoon

Background: The dominant paradigm in understanding drug action focuses on the intended therapeutic effects and frequent adverse reactions. However, this approach may limit opportunities to grasp unintended drug actions, which can open up channels to repurpose existing drugs and identify rare adverse drug reactions. Advances in systems biology can be exploited to comprehensively understand pharmacodynamic actions, although proper frameworks to represent drug actions are still lacking. Results: We suggest a novel platform to construct a drug-specific pathway in which a molecular-level mechanism of action is formulated based on pharmacologic, pharmacogenomic, transcriptomic, and phenotypic data related to drug response (http://databio.gachon.ac.kr/tools/). In this platform, an adoption of three conceptual levels imitating drug perturbation allows these pathways to be realistically rendered in comparison to those of other models. Furthermore, we propose a new method that exploits functional features of the drug-specific pathways to predict new indications as well as adverse reactions. For therapeutic uses, our predictions significantly overlapped with clinical trials and an up-to-date drug-disease association database. Also, our method outperforms existing methods with regard to classification of active compounds for cancers. For adverse reactions, our predictions were significantly enriched in an independent database derived from the Food and Drug Administration (FDA) Adverse Event Reporting System and meaningfully cover an Adverse Reaction Database provided by Health Canada. Lastly, we discuss several predictions for both therapeutic indications and side-effects through the published literature. Conclusions: Our study addresses how we can computationally represent drug-signaling pathways to understand unintended drug actions and to facilitate drug discovery and screening.

Bioinformatics | 2012

Identification of functional CNV region networks using a CNV-gene mapping algorithm in a genome-wide scale

Chihyun Park; Jaegyoon Ahn; Youngmi Yoon; Sanghyun Park

Motivation: Identifying functional relationships between copy number variation regions (CNVRs) and genes is an essential step in understanding the impact of genotypic variations on phenotype. There have been many related works, but only a few have addressed normal populations. Results: To analyze the functions of genome-wide CNVRs, we applied a novel correlation measure called Correlation based on Sample Set (CSS) to paired Whole Genome TilePath array and messenger RNA (mRNA) microarray data from 210 HapMap individuals with normal phenotypes and calculated the confident CNVR-gene relationships. Two CNVR nodes form an edge if they regulate a common set of genes, allowing the construction of a global CNVR network. We performed functional enrichment on the common genes that were trans-regulated from CNVRs clustered together in our CNVR network. As a result, we observed that most of the CNVR clusters in our CNVR network were reported to be involved in some biological processes or cellular functions, while most CNVR clusters from randomly constructed CNVR networks showed no evidence of functional enrichment. These results imply that CSS is capable of finding related CNVR-gene pairs and CNVR networks that have functional significance. Availability: http://embio.yonsei.ac.kr/~Park/cnv_net.php. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

PLOS ONE | 2018

Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles

Chihyun Park; Jung Rim Kim; Jeongwoo Kim; Sang-Hyun Park

The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs of genes must be obtained. However, when gene expression data are heterogeneous with high levels of noise for samples assigned to the same condition, it is difficult to accurately determine whether a gene pair represents a significant gene–gene interaction (GGI). In order to solve this problem, we proposed a random forest-based method to classify significant GGIs from gene expression data. To train the model, we defined novel feature sets and utilised various high-confidence interactome datasets to deduce the correct answer set from known disease-specific genes. Using Alzheimer’s disease data, the proposed method showed remarkable accuracy, and the GGIs established in the analysis can be used to build a meaningful genetic network that can explain the mechanisms underlying Alzheimer’s disease.
