Youngmi Yoon
Gachon University
Publication
Featured research published by Youngmi Yoon.
International Conference of the IEEE Engineering in Medicine and Biology Society | 2004
H.M. Seong; J.S. Lee; Tae-Min Shin; W.S. Kim; Youngmi Yoon
Conventional power spectrum methods based on the fast Fourier transform (FFT) or an autoregressive (AR) model are not appropriate for analyzing biomedical signals whose spectral characteristics change rapidly. Time-frequency analysis, by contrast, is better suited to a time-varying spectrum. In this study, we investigated the spectral components of heart rate variability (HRV) in the time-frequency domain. We then propose a method that extracts the frequency components of HRV from the instantaneous frequency obtained from the time-frequency distribution. The subjects were 17 healthy young men. A coin-stacking task was used to induce mental stress. The subjects' emotional stress produced an increase in sympathetic activity, and this sympathetic activation was responsible for a significant increase in the LF/HF ratio. The subjects were divided into two groups according to task ability. Subjects with higher mental stress showed lower task ability.
Intelligent Systems in Molecular Biology | 2011
Jaegyoon Ahn; Youngmi Yoon; Chihyun Park; Eunji Shin; Sanghyun Park
MOTIVATION: Diagnosis and prognosis of cancer and understanding oncogenesis within the context of biological pathways are among the most important research areas in bioinformatics. Recently, there have been several attempts to integrate interactome and transcriptome data to identify subnetworks that provide limited interpretations of known and candidate cancer genes, as well as increase classification accuracy. However, these studies provide little information about the detailed roles of the identified cancer genes. RESULTS: To add more information to the network, we constructed it by incorporating genetic interactions and manually curated gene regulations into the protein interaction network. To make our newly constructed network cancer specific, we identified edges where two genes show different expression patterns between cancer and normal phenotypes. We showed that integrating various datasets increased classification accuracy, which suggests that our network is more complete than a network based solely on protein interactions. We also showed that our network contains significantly more known cancer-related genes than the gene sets selected by other feature selection algorithms. By examining several cancer-specific subnetworks, we were able to predict more detailed and interpretable roles of oncogenes and other cancer candidate genes in prostate cancer cells. AVAILABILITY: http://embio.yonsei.ac.kr/~Ahn/tc.php. CONTACT: [email protected]
Information Sciences | 2008
Youngmi Yoon; Jong-Chan Lee; Sanghyun Park; Sangjay Bien; Hyun Cheol Chung; Sun Young Rha
The ability to provide thousands of gene expression values simultaneously makes microarray data very useful for phenotype classification. A major constraint in phenotype classification is that the number of genes greatly exceeds the number of samples. We overcame this constraint in two ways: we increased the number of samples by integrating independently generated microarrays that had been designed with the same biological objectives, and we reduced the number of genes involved in the classification by selecting a small set of informative genes. This allowed us to make maximal use of the abundant microarray data being stockpiled by thousands of different research groups while improving classification accuracy. Our goal is to implement a feature (gene) selection method that is applicable to integrated microarrays and to build a highly accurate classifier that permits straightforward biological interpretation. In this paper, we propose a two-stage approach. First, we directly integrated individual microarrays by transforming each expression value into a rank value within its sample, and identified informative genes by calculating the number of swaps needed to reach a perfectly split sequence. Second, we built a classifier, a parameter-free ensemble method, using only the pre-selected informative genes. Using this classifier, derived from large, integrated microarray sample datasets, we achieved high accuracy, sensitivity, and specificity in the classification of an independent test dataset.
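The first stage described above (rank transformation within a sample, then counting swaps to a perfectly split sequence) can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function names, the binary 0/1 class labels, the use of adjacent swaps, and the tie handling (ties keep sample order) are all assumptions.

```python
from typing import List, Sequence


def rank_within_sample(values: Sequence[float]) -> List[int]:
    """Replace each expression value by its rank (1 = lowest) within one sample."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def swaps_to_split(labels: Sequence[int]) -> int:
    """Minimum number of adjacent swaps that turns a 0/1 label sequence
    into a perfect split (all of one class first), in either direction."""
    ones_seen = 0
    ones_before_zeros = 0  # pairs (1 before 0) = swaps needed to reach 0...01...1
    for lab in labels:
        if lab == 1:
            ones_seen += 1
        else:
            ones_before_zeros += ones_seen
    n1 = ones_seen
    n0 = len(labels) - n1
    zeros_before_ones = n0 * n1 - ones_before_zeros  # swaps to reach 1...10...0
    return min(ones_before_zeros, zeros_before_ones)


def gene_swap_score(gene_ranks: Sequence[int], labels: Sequence[int]) -> int:
    """Order samples by one gene's rank value and count the swaps needed
    to separate the two classes; a lower score suggests a more informative gene."""
    order = sorted(range(len(labels)), key=lambda s: gene_ranks[s])
    return swaps_to_split([labels[s] for s in order])


if __name__ == "__main__":
    # One gene's rank in four samples; classes separate perfectly.
    print(gene_swap_score([3, 4, 1, 2], [0, 0, 1, 1]))
```

Because only within-sample ranks are used, samples from different arrays can be pooled without rescaling their raw intensities, which is the point of the direct integration step.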
Journal of Biomedical Informatics | 2015
Jeongwoo Kim; Hyunjin Kim; Youngmi Yoon; Sanghyun Park
Since the genome project in the 1990s, a number of studies associated with genes have been conducted, and researchers have confirmed that genes are involved in disease. For this reason, identifying the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, we first construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of edges in the gene network by means of Z-scoring. The weights contain two values: the frequency and the Google search results. The frequency value is extracted from literature data, and the Google search result is obtained using Google. We assign a score to each gene through a network analysis. We assume that genes with a large number of links and high Google search result and frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets. The answer sets comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods.
International Conference of the IEEE Engineering in Medicine and Biology Society | 2002
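As a rough illustration of the pipeline in this abstract, the sketch below counts gene co-occurrences in abstracts, Z-scores the edge values, and sums edge weights per gene. It is not the LGscore implementation: the function names, the simple token matching, the way the two Z-scores are added, and the per-gene sum are assumptions, and the search-hit counts are passed in as plain numbers rather than fetched from any search service.

```python
import re
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev
from typing import Dict, Iterable, Sequence, Tuple

Edge = Tuple[str, str]


def cooccurrence_counts(abstracts: Iterable[str], genes: Sequence[str]) -> Counter:
    """Count, for each gene pair, the abstracts in which both symbols appear."""
    counts: Counter = Counter()
    for text in abstracts:
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        present = sorted(g for g in genes if g in tokens)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


def z_scores(values: Dict[Edge, float]) -> Dict[Edge, float]:
    """Standardize edge values to zero mean and unit (population) deviation."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return {k: 0.0 for k in values}
    return {k: (v - mu) / sigma for k, v in values.items()}


def gene_scores(freq: Dict[Edge, float], hits: Dict[Edge, float]) -> Dict[str, float]:
    """Assumed scoring: edge weight = z(frequency) + z(search hits);
    a gene's score is the sum of the weights of its edges."""
    zf, zh = z_scores(freq), z_scores(hits)
    scores: Dict[str, float] = {}
    for edge in freq:
        weight = zf[edge] + zh.get(edge, 0.0)
        for gene in edge:
            scores[gene] = scores.get(gene, 0.0) + weight
    return scores


if __name__ == "__main__":
    docs = ["APP and PSEN1 mutations.", "APP interacts with APOE.", "PSEN1 cleaves APP."]
    freq = cooccurrence_counts(docs, ["APP", "PSEN1", "APOE"])
    hits = {("APOE", "APP"): 100.0, ("APP", "PSEN1"): 300.0}  # made-up hit counts
    print(gene_scores(dict(freq), hits))
```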
A.R. Sul; Joong-Wook Shin; ChungK Lee; Youngmi Yoon; Jose C. Principe
This paper evaluates stress reactivity and recovery using biosignals and fuzzy theory. We induced mental stress by means of a coin-stacking task. During the experiment, four kinds of biosignals were acquired: frontalis EMG, ECG, peripheral skin temperature, and skin conductance level. The degree of stress was then assessed by jointly analyzing those signals using fuzzy inference. From the fuzzy inference result, we derived the parameters (Amount of Physiological Change/Amount of Imposed Stress) and (Time to 25% Recovery), which represent reactivity and recovery, respectively. For each subject, we plotted the reactivity parameter on the abscissa and the recovery parameter on the ordinate. The distance from the origin to the coordinate ((Amount of Physiological Change/Amount of Imposed Stress), (Time to 25% Recovery)) was introduced as a stress index. Low reactivity combined with fast recovery indicates effective coping with stress. Therefore, a small value of the stress index proposed in this research indicates good health.
Systems, Man and Cybernetics | 2010
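The stress index described above is the distance from the origin to a point in the reactivity-recovery plane, which can be written as a short sketch. The function names and the example numbers are illustrative assumptions; the abstract does not say whether the two axes are rescaled before the distance is taken, so none is applied here.

```python
import math


def reactivity(physiological_change: float, imposed_stress: float) -> float:
    """Reactivity parameter: amount of physiological change per unit of imposed stress."""
    return physiological_change / imposed_stress


def stress_index(reactivity_value: float, time_to_25pct_recovery: float) -> float:
    """Euclidean distance from the origin to (reactivity, recovery time)."""
    return math.hypot(reactivity_value, time_to_25pct_recovery)


if __name__ == "__main__":
    r = reactivity(physiological_change=6.0, imposed_stress=2.0)  # made-up values
    print(stress_index(r, time_to_25pct_recovery=4.0))
```

A subject with low reactivity and a short recovery time lies close to the origin and so gets a small index.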
Youngmi Yoon; Sangjay Bien; Sanghyun Park
Microarray experiments generate quantitative expression measurements for thousands of genes simultaneously, which is useful for phenotype classification of many diseases. Our proposed phenotype classifier is an ensemble method with k-top-scoring decision rules. Each rule involves a number of genes, a rank comparison relation among them, and a class label. Current classifiers, which are also ensemble methods, consist of k-top-scoring decision rules. Some of these classifiers fix the number of genes in each rule as a triple or a pair. In this paper, we generalize the number of genes involved in each rule, which ranges from 2 to N. Generalizing the number of genes increases the robustness and the reliability of the classifier for the class prediction of an independent sample. Our algorithm saves resources by combining shorter rules in order to build a longer rule. It converges rapidly toward its high-scoring rule list by implementing several heuristics. The parameter k is determined by applying leave-one-out cross validation to the training dataset.
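A decision rule of the kind described above (a gene list, a rank comparison relation, and a class label) might be sketched as follows. The score used here (difference in the fraction of each class for which the ordering holds) and the majority vote are borrowed from the general top-scoring-pair idea and are assumptions, as are all names; the rule-combining heuristics and the cross-validated choice of k are not shown.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Dict[str, float]
Rule = Tuple[Tuple[str, ...], int]  # (ordered genes, class label voted when the order holds)


def rule_holds(sample: Sample, genes: Sequence[str]) -> bool:
    """True if expression increases strictly along the gene list: g1 < g2 < ... < gn."""
    return all(sample[a] < sample[b] for a, b in zip(genes, genes[1:]))


def rule_score(samples: List[Sample], labels: List[int], genes: Sequence[str]) -> float:
    """Fraction of class-1 samples satisfying the ordering minus the same
    fraction for class 0 (assumes both classes are present)."""
    hit = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    for sample, label in zip(samples, labels):
        total[label] += 1
        hit[label] += int(rule_holds(sample, genes))
    return hit[1] / total[1] - hit[0] / total[0]


def predict(sample: Sample, rules: List[Rule]) -> int:
    """Majority vote: a rule votes for its label when its ordering holds,
    and for the other class otherwise."""
    votes_for_1 = sum(
        label if rule_holds(sample, genes) else 1 - label for genes, label in rules
    )
    return 1 if 2 * votes_for_1 > len(rules) else 0


if __name__ == "__main__":
    train = [
        {"A": 1, "B": 2, "C": 3},
        {"A": 1, "B": 3, "C": 4},
        {"A": 3, "B": 2, "C": 1},
        {"A": 4, "B": 1, "C": 2},
    ]
    y = [1, 1, 0, 0]
    print(rule_score(train, y, ("A", "B", "C")))
```

Because a rule depends only on the relative order of expression values within one sample, it is unaffected by any monotone rescaling of that sample.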
Information Sciences | 2011
Jaegyoon Ahn; Youngmi Yoon; Sanghyun Park
Biclusters are subsets of genes that exhibit similar behavior over a set of conditions. A biclustering algorithm is a useful tool for uncovering groups of genes involved in the same cellular processes and groups of conditions under which these processes take place. In this paper, we propose a polynomial time algorithm to identify functionally highly correlated biclusters. Our algorithm identifies (1) gene sets that simultaneously exhibit additive, multiplicative, and combined patterns and allow high levels of noise, (2) multiple, possibly overlapped, and diverse gene sets, (3) biclusters that simultaneously exhibit negatively and positively correlated gene sets, and (4) gene sets for which the functional association is very high. We validate the level of functional association in our method by using the GO database, protein-protein interactions and KEGG pathways.
PLOS ONE | 2014
Min Oh; Jaegyoon Ahn; Youngmi Yoon
The growing number and variety of genetic network datasets increase the feasibility of understanding how drugs and diseases are associated at the molecular level. Properly selected features of the network representations of existing drug-disease associations can be used to infer novel indications of existing drugs. To find new drug-disease associations, we generated an integrative genetic network using combinations of interactions, including protein-protein interactions and gene regulatory network datasets. Within this network, the network adjacency of drug-drug and disease-disease pairs was quantified using a scored path between their target sets. Furthermore, the common topological module of drugs or diseases was extracted, and the distance between a topological drug-module and a disease (or a disease-module and a drug) was thereby quantified. These quantified scores were used as features for the prediction of novel drug-disease associations. Our classifiers using Random Forest, Multilayer Perceptron and C4.5 showed high specificity and sensitivity (AUC scores of 0.855, 0.828 and 0.797, respectively) in predicting novel drug indications, and performed better than other methods with limited drug and disease properties. Our predictions and current clinical trials overlap significantly across the different phases of drug development. We also identified and visualized the topological modules of predicted drug indications for certain types of cancers, and for Alzheimer’s disease. Within the network, those modules show potential pathways that illustrate the mechanisms of new drug indications, including propranolol as a potential anticancer agent and telmisartan as a treatment for Alzheimer’s disease.
International Conference of the IEEE Engineering in Medicine and Biology Society | 2000
Joong-Wook Shin; D.Y. Cha; K.J. Lee; Youngmi Yoon
We developed a patient monitoring system that uses fuzzy information. Following the trend from client-server systems to Web-based systems, we extended this fuzzy patient monitoring system with a Web-based monitoring system that is updated in near real time, about every 10 seconds. The system runs in Microsoft Internet Explorer 5.0 because it relies on an FTP ActiveX control, which transfers the updated file of patient monitoring information and alarms according to values preset on a server. The system offers not only real-time monitoring information but also access to a patient's ECG waveform and previously recorded data.
PLOS ONE | 2011
Chihyun Park; Jaegyoon Ahn; Youngmi Yoon; Sanghyun Park
Background: It is difficult to identify copy number variations (CNV) in normal human genomic data due to noise and non-linear relationships between different genomic regions and signal intensity. A high-resolution array comparative genomic hybridization (aCGH) containing 42 million probes, which is very large compared to previous arrays, was recently published. Most existing CNV detection algorithms do not work well because of noise associated with the large amount of input data and because most of the current methods were not designed to analyze normal human samples. Normal human genome analysis often requires a joint approach across multiple samples. However, the majority of existing methods can only identify CNVs from a single sample. Methodology and Principal Findings: We developed a multi-sample-based genomic variations detector (MGVD) that uses segmentation to identify common breakpoints across multiple samples and a k-means-based clustering strategy. Unlike previous methods, MGVD simultaneously considers multiple samples with different genomic intensities and identifies CNVs and CNV zones (CNVZs); CNVZ is a more precise measure of the location of a genomic variant than the CNV region (CNVR). Conclusions and Significance: We designed a specialized algorithm to detect common CNVs from extremely high-resolution multi-sample aCGH data. MGVD showed high sensitivity and a low false discovery rate for a simulated data set, and outperformed most current methods when real, high-resolution HapMap datasets were analyzed. MGVD also had the fastest runtime of the algorithms evaluated when actual, high-resolution aCGH data were analyzed. The CNVZs identified by MGVD can be used in association studies for revealing relationships between phenotypes and genomic aberrations. Our algorithm was developed in standard C++ with the STL library and is available for Linux and MS Windows. It is freely available at: http://embio.yonsei.ac.kr/~Park/mgvd.php.