Publication


Featured research published by Hye-Young Jung.


Fuzzy Sets and Systems | 2015

Fuzzy linear regression using rank transform method

Hye-Young Jung; Jin Hee Yoon; Seung Hoe Choi

In regression analysis, the rank transform (RT) method is known to be neither dependent on the shape of the error distribution nor sensitive to outliers. In this paper, we construct a so-called α-level fuzzy regression model based on the resolution identity theorem and apply the RT method to this model. Fuzzy regression models with crisp input/fuzzy output and fuzzy input/fuzzy output are investigated to show the effectiveness of the proposed method. To compare its effectiveness with that of existing methods, we introduce a new performance measure. In addition, we propose a method to obtain a predicted output for a specific target value and show that our model is more robust than other methods when the data contain outliers.
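As a rough illustration of the rank transform idea on ordinary (crisp) data, the sketch below replaces x and y by their ranks and fits a least squares line to the ranks. It is not the α-level fuzzy model of the paper; the function names and the toy data are assumptions made for illustration only.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def ols_line(x, y):
    """Ordinary least squares fit; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rank_transform_fit(x, y):
    """Least squares fit on the ranks of x and y (crisp data only)."""
    return ols_line(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): the last response is replaced by an outlier.
x = [1, 2, 3, 4, 5, 6]
y_clean = [2.1, 3.9, 6.2, 8.1, 9.8, 12.0]
y_outlier = [2.1, 3.9, 6.2, 8.1, 9.8, 120.0]

print("OLS slope, clean vs outlier:", ols_line(x, y_clean)[0], ols_line(x, y_outlier)[0])
print("Rank fit, clean vs outlier:", rank_transform_fit(x, y_clean), rank_transform_fit(x, y_outlier))
```

In this toy case the outlier keeps the same rank as the value it replaces, so the fit on ranks is unchanged while the ordinary least squares slope moves considerably.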


Soft Computing | 2014

Likelihood inference based on fuzzy data in regression model

Hye-Young Jung; Woo-Joo Lee; Jin Hee Yoon; Seung Hoe Choi

In regression analysis, as in other statistical inference problems, imprecise data may be encountered. In this paper, we focus on statistical inference in a fuzzy regression model on the basis of the information supplied by the available fuzzy data. To this end, we consider the maximum likelihood estimates of linear regression parameters based on fuzzy data for a variety of membership functions. A numerical example of estimating the regression parameters is given to illustrate the proposed maximum likelihood estimation.


Computational Biology and Chemistry | 2016

A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction

Hye-Young Jung; Sangseob Leem; Sungyoung Lee; Taesung Park

BACKGROUND: Gene-gene interaction (GGI) analysis is one of the most popular approaches for finding the missing heritability of common complex traits in genetic association studies. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. In order to identify the best interaction model associated with disease susceptibility, MDR compares all possible genotype combinations in terms of their predictability of disease status from a simple binary high (H) and low (L) risk classification. However, this simple binary classification does not reflect the uncertainty of the H/L classification. METHODS: We regard classifying H/L as equivalent to defining the degree of membership in the two risk groups, H and L. By adopting fuzzy set theory, we propose Fuzzy MDR, which takes into account the uncertainty of the H/L classification. Fuzzy MDR allows partial membership of H/L through a membership function that transforms the degree of uncertainty into a [0,1] scale. The best genotype combination is selected as the one that maximizes a new fuzzy set-based accuracy measure. RESULTS: Two simulation studies are conducted to compare the power of the proposed Fuzzy MDR with that of MDR. Our results show that Fuzzy MDR has higher power than MDR. We illustrate the proposed Fuzzy MDR by analysing the bipolar disorder (BD) trait in the WTCCC dataset to detect GGIs associated with BD. CONCLUSIONS: We propose a novel Fuzzy MDR method to detect gene-gene interaction by taking into account the uncertainty of the H/L classification and show that it has higher power than MDR. Fuzzy MDR can be easily extended to handle continuous phenotypes as well. The program written in R for the proposed Fuzzy MDR is available at https://statgen.snu.ac.kr/software/FuzzyMDR.
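The contrast between a binary H/L label and a degree of membership can be sketched as follows. The abstract does not state which membership function Fuzzy MDR uses, so the case proportion used here is a hypothetical stand-in, and the function names, the threshold of 1.0, and the cell counts are likewise assumptions.

```python
def crisp_risk_label(cases, controls, threshold=1.0):
    """MDR-style binary label: 'H' if the case/control ratio reaches the threshold."""
    if controls == 0:
        return "H" if cases > 0 else "L"
    return "H" if cases / controls >= threshold else "L"


def fuzzy_high_membership(cases, controls):
    """Hypothetical membership in the high-risk group: the case proportion.

    Empty cells carry no information and are assigned 0.5 (an assumption).
    """
    total = cases + controls
    return 0.5 if total == 0 else cases / total


# Hypothetical (cases, controls) counts for three two-locus genotype cells.
cells = {
    ("AA", "BB"): (30, 10),
    ("Aa", "BB"): (21, 20),
    ("aa", "bb"): (5, 25),
}

for genotype, (cases, controls) in cells.items():
    mu_high = fuzzy_high_membership(cases, controls)
    print(genotype, crisp_risk_label(cases, controls),
          "mu_H=%.3f" % mu_high, "mu_L=%.3f" % (1 - mu_high))
```

The first two cells both receive the crisp label H, although the second is nearly balanced between cases and controls; a membership value on [0,1] keeps that distinction.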


IEEE International Conference on Fuzzy Systems | 2015

An application of F-transform to a regression model based on Theil's method

Jin Hee Yoon; Hye-Young Jung; Seung Hoe Choi; Woo-Joo Lee

Regression analysis is a method for explaining the statistical relationship between explanatory variables and response variables through a regression model. This paper proposes a new regression analysis that applies Theil's method based on the F-transform. The main advantage of Theil's method in regression is its robustness, meaning that it is not sensitive to outliers. The proposed method uses the median of the rates of increment computed from all possible pairs of F-transformed data to estimate the coefficients of the fuzzy regression model. An example is given to show that the proposed regression analysis applying Theil's method based on the F-transform is more robust than least squares estimation (LSE) and even more robust than the original Theil's method.
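For reference, the original Theil's method on crisp data (often called the Theil-Sen estimator) can be sketched in a few lines: the slope is the median of the slopes over all pairs of points. This sketch does not include the F-transform step or the fuzzy coefficients of the paper; the intercept rule (median of the residual offsets) is one common choice, and the toy data are assumptions.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Return (slope, intercept) using the median of all pairwise slopes."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs with identical x to avoid division by zero
    ]
    slope = median(slopes)
    # One common intercept choice: the median of y_i - slope * x_i.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Toy data (hypothetical): points on y = 2x + 1, with the last one corrupted.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 100]
print(theil_sen(x, y))
```

Six of the ten pairwise slopes here come from uncorrupted pairs, so the median is not affected by the single outlier.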


Soft Computing | 2014

Optimal properties of a fuzzy least estimator based on new operations

Jin Hee Yoon; Hye-Young Jung; Woo-Joo Lee; Seung Hoe Choi

This paper deals with optimal properties of fuzzy least squares estimators for the fuzzy linear regression model with fuzzy input-output data that has an error structure. In our previous study, fuzzy least squares estimators of the regression parameters were proposed using new operations and a suitable metric, and the estimators were shown to be fuzzy-type linear estimators. We propose expectations and variances using the algebraic properties of triangular fuzzy matrices, and show that the estimators have optimal properties such as being the BLUE (best linear unbiased estimator). A simple computational example is given to confirm these properties.


Soft Computing | 2018

Fuzzy heaping mechanism for heaped count data with imprecision

Hye-Young Jung; Heawon Choi; Taesung Park

In genetic association studies, the traits of interest may sometimes be collected from the reported data. Since subjects report exact responses and/or rounded responses, the histogram of data frequently exhibits spikes at particular values. This phenomenon, known as heaping, can cause difficulties in performing the association test via standard modeling approaches. Recently, several models have been proposed to identify the true unobservable underlying distribution from heaped data. However, all of these methods depend on probabilistic assumptions regarding the heaping mechanism. Unfortunately, probabilistic models cannot represent heaped data effectively, because heaping can be caused by imprecisely reported values. This type of imprecision is different from probabilistic uncertainty, which is described well by a probabilistic model. In this paper, we propose a fuzzy heaping model to identify genetic variants for the heaped count data. Our fuzzy model uses a mixture of likelihood functions for precisely and imprecisely reported data, treating heaped data as imprecise data represented by fuzzy sets. Moreover, since reported count data may include excess zeros, as well as heaped data, we extend our fuzzy heaping model to handle excess zeros. Through simulation studies, we show that the proposed fuzzy heaping model controls type I errors effectively and has great power to identify causal variants. We illustrate the proposed fuzzy heaping model through a study of the identification of genetic variants associated with the number of cigarettes smoked per day.


BMC Medical Genomics | 2018

Fuzzy set-based generalized multifactor dimensionality reduction analysis of gene-gene interactions

Hye-Young Jung; Sangseob Leem; Taesung Park

Background: Gene-gene interactions (GGIs) are a known cause of missing heritability. Multifactor dimensionality reduction (MDR) is one of the most commonly used methods for GGI detection. The generalized multifactor dimensionality reduction (GMDR) method is an extension of the MDR method that is applicable to various types of traits and allows covariate adjustments. Our previous Fuzzy MDR (FMDR) is another extension, designed to overcome the simple binary classification. FMDR uses continuous membership values instead of the binary membership values 0 and 1, which improves the power to detect causal SNPs and yields more intuitive interpretations in real data analysis. Here, we propose the fuzzy generalized multifactor dimensionality reduction (FGMDR) method, which combines fuzzy set-based analysis with the GMDR method, to detect GGIs associated with diseases using fuzzy set theory. Results: Through simulation studies for different types of traits, the proposed FGMDR showed a higher detection ratio of causal SNPs compared to GMDR. We then applied FGMDR to two real data sets: Crohn’s disease (CD) data from the Wellcome Trust Case Control Consortium (WTCCC) with a binary phenotype, and the Homeostasis Model Assessment of Insulin Resistance (HOMA-IR) data from a Korean population with a continuous phenotype. The interactions derived by our method include previously reported interactions associated with the phenotypes. Conclusions: The proposed FGMDR performs well for GGI detection with covariate adjustments. The program written in R for FGMDR is available at http://statgen.snu.ac.kr/software/FGMDR.


International Journal of Fuzzy Systems | 2017

A Novel Forecasting Method Based on F-Transform and Fuzzy Time Series

Woo-Joo Lee; Hye-Young Jung; Jin Hee Yoon; Seung Hoe Choi

The main goals of time series analysis are to establish a forecasting model based on past observations and to reduce the forecasting error. To achieve these goals, the present paper proposes a new forecasting algorithm based on the fuzzy transform (F-transform) and fuzzy logical relationships. First, the F-transform is performed based on a partitioning of the universe, and the fuzzy logical relationships are then employed to forecast. Two experimental applications are used to illustrate and verify the proposed algorithm. Accuracy is evaluated on the basis of the average forecasting error percentage and the index of agreement in order to compare the proposed algorithm with other existing methods.
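The first step mentioned above, the F-transform over a partition of the universe, can be sketched with the standard discrete (direct) F-transform: each component is a weighted average of the data, with weights given by one basic function of a fuzzy partition. The uniform triangular partition, the function names, and the sample data are assumptions, and the fuzzy logical relationship step of the forecasting algorithm is not shown.

```python
def triangular_partition(a, b, n):
    """Uniform fuzzy partition of [a, b] with n >= 2 triangular basic functions."""
    h = (b - a) / (n - 1)
    nodes = [a + k * h for k in range(n)]

    def make_basic_function(k):
        # Triangle centred at nodes[k] with half-width h, clipped at zero.
        return lambda x: max(0.0, 1.0 - abs(x - nodes[k]) / h)

    return nodes, [make_basic_function(k) for k in range(n)]


def f_transform(xs, fs, basis):
    """Direct discrete F-transform: one weighted average per basic function.

    Assumes every basic function covers at least one sample point in xs,
    otherwise the denominator would be zero.
    """
    components = []
    for basic_function in basis:
        weights = [basic_function(x) for x in xs]
        weighted_sum = sum(w * f for w, f in zip(weights, fs))
        components.append(weighted_sum / sum(weights))
    return components


# Hypothetical samples of f(x) = x on [0, 1].
xs = [i / 10 for i in range(11)]
fs = list(xs)
nodes, basis = triangular_partition(0.0, 1.0, 3)
print(nodes, f_transform(xs, fs, basis))
```

Each component summarizes the data in the neighbourhood of one node, so a series of many observations is reduced to a short vector.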


Data Mining in Bioinformatics | 2016

Statistical analysis for aggregated count data in genetic association studies

Haewon Choi; Hye-Young Jung; Taesung Park

In smoking behaviour studies, cigarette counts per day (CPD) are aggregated into values such as 0, one pack, two packs, etc. Analysis of such count data is a challenge, owing to its reporting bias and the difficulty of estimating its appropriate distribution. In this study, we set out to identify genetic variants, such as single nucleotide polymorphisms (SNPs), that correlate with aggregated count data such as CPD. We first reviewed the existing approaches, in which the aggregated count data is the dependent variable and the SNP is an ordinal independent variable. We then considered a calibration model in which the SNP is the ordinal dependent variable and the aggregated count data is the independent variable. This calibration modelling approach is robust to the distributional assumptions made about the count data. We applied our robust calibration modelling approach to CPD data from the Korean Association Resource project, comprising 4183 male samples. Through simulation studies, we compared the performance of the proposed method with that of competing approaches.


Bioinformatics and Biomedicine | 2015

Genetic association tests for cigarettes per day

Haewon Choi; Hye-Young Jung; Taesung Park

Cigarettes per day (CPD) is one of the most commonly used phenotypes in nicotine dependence (ND) studies. For example, a genetic association study of ND focuses on identifying single nucleotide polymorphisms (SNPs) that correlate with CPD. However, analysis of CPD data is always challenging, since the CPD distribution is difficult to estimate because of spikes at specific values such as 10, 20, and so forth. Thus, standard maximum likelihood estimation is not appropriate. In this study, we focus on genetic association tests for identifying SNPs that correlate with CPD. We first reviewed previously proposed approaches applicable to CPD data, in which the CPD data is the dependent variable and the SNP is an ordinal independent variable. We then considered a calibration model in which the SNP is the ordinal dependent variable and the CPD is the independent variable. Unlike a standard modeling approach, this calibration modeling approach is sufficiently robust to the distributional assumptions made about CPD data. We applied our robust calibration modeling approach to CPD data from the Korean Association Resource project, comprising 4,183 male samples.

Collaboration


Dive into Hye-Young Jung's collaborations.

Top Co-Authors

Seung Hoe Choi
Korea Aerospace University

Taesung Park
Seoul National University

Haewon Choi
Seoul National University

Sangseob Leem
Seoul National University

Seung-Hoe Choi
Korea Aerospace University

Heawon Choi
Seoul National University

Sungyoung Lee
Seoul National University