Publication


Featured research published by Whasoo Bae.


Journal of Nonparametric Statistics | 2005

Non-parametric hazard function estimation using the Kaplan–Meier estimator

Choongrak Kim; Whasoo Bae; Hye-Mi Choi; Byeong U. Park

Estimation of the hazard function when the data are censored is an important problem in medical research. In this article, we propose a simple non-parametric estimator of the hazard function. Its asymptotic properties are derived, and numerical comparisons with other existing estimators are made. The proposed estimator is shown to be at least as good as the other estimators from both the theoretical and the numerical points of view.
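The abstract does not spell out the proposed hazard estimator, so the sketch below only illustrates the Kaplan–Meier estimator it builds on, under right censoring. The function name `kaplan_meier` and the toy data are assumptions made for illustration; this is not the authors' estimator.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : observed times (event or censoring time)
    events : 1 if the event was observed, 0 if the observation was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical toy data: six subjects, two of them censored (event = 0)
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
km = kaplan_meier(times, events)
for t, s in km:
    print(t, round(s, 4))
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is what distinguishes this estimator from the plain empirical survival function.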


Korean Journal of Applied Statistics | 2008

Developing the Index of Foodborne Disease Occurrence

Kook-Yeol Choi; Byung-Soo Kim; Whasoo Bae; Woo-Seok Jung; Young-Joon Cho

As the eating-out industry grows rapidly and most schools and firms serve meals, foodborne disease has occurred with increasing frequency, and much research and many policies have been devoted to preventing it. In Korea, a foodborne disease index for prevention was developed using the bacterial growth rate as a function of temperature to indicate the danger level of foodborne disease, but a gap between actual occurrences and the predicted danger level has been pointed out. This study aims to develop an index of foodborne disease occurrence based on a log-linear model, using foodborne disease occurrence data and meteorological data for the last three years. A comparison between the new index and the existing index showed that the new index better explains foodborne disease occurrence.


Communications in Statistics - Theory and Methods | 2005

Smoothing Techniques for the Bivariate Kaplan–Meier Estimator

Whasoo Bae; Hye-Mi Choi; Byeong U. Park; Choongrak Kim

Bivariate survival time data arise quite often in medical research, and many estimators for the bivariate survival function have been suggested. While there are many smooth versions of the univariate Kaplan–Meier estimator, smooth versions of the bivariate Kaplan–Meier estimator have not yet been discussed. In this article, we suggest two smoothing techniques, kernel smoothing and Bezier surface smoothing, for the bivariate survival function estimator, especially for the estimator suggested by Lin and Ying (1993). Asymptotic results for both estimators are also derived. In the simulation studies, Bezier surface smoothing turned out to be very efficient compared to the bivariate Kaplan–Meier estimator and the kernel smoothing estimator. An illustrative example based on a real data set is also given.


OR Spectrum | 2008

A simple segmentation method for DNA microarray spots by kernel density estimation

Whasoo Bae; Choongrak Kim

DNA microarray analysis is one of the most important areas in biomedical research. Accurate analysis of microarray data requires accurate segmentation, that is, classification of pixels as foreground or background. In this paper we suggest a kernel density estimation approach for the segmentation of a microarray spot. We estimate the density of the n pixel intensities in a given target area by kernel density estimation, and the resulting estimate is bimodal for an appropriate choice of the smoothing parameter. We suggest using the two modes of the kernel density estimate as estimates of the foreground intensity (the mode with the larger value) and the background intensity (the mode with the smaller value), respectively. The proposed segmentation method is simple to use, robust to the shape of the spot, and very accurate.
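A minimal sketch of the idea in this abstract: estimate the density of pixel intensities with a Gaussian kernel and read off the two modes. The kernel choice, the hand-picked bandwidth `h = 10`, the grid search, and the pixel values are all assumptions for illustration; the paper's own bandwidth rule is not given here.

```python
import math

def gaussian_kde(data, h):
    """Return a function x -> kernel density estimate (Gaussian kernel, bandwidth h)."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    def f(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
    return f

def find_modes(f, lo, hi, steps):
    """Locate local maxima of f on an evenly spaced grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [f(x) for x in xs]
    return [xs[i] for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1]]

# Hypothetical pixel intensities from one target area (0-255 scale)
background = [40, 45, 48, 50, 50, 52, 55, 60]
foreground = [190, 195, 198, 200, 200, 202, 205, 210]
pixels = background + foreground

density = gaussian_kde(pixels, h=10.0)            # bandwidth chosen by hand
modes = find_modes(density, 0.0, 255.0, steps=510)
bg_estimate, fg_estimate = min(modes), max(modes)  # smaller mode = background
print(bg_estimate, fg_estimate)
```

With too small a bandwidth the estimate would show more than two modes, and with too large a bandwidth only one, which is why the abstract stresses an appropriate choice of the smoothing parameter.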


Communications for Statistical Applications and Methods | 2005

A Study on K-Means Clustering

Whasoo Bae; Se-Won Roh

This paper studies K-means clustering, focusing on initialization, which affects the clustering results. Four methods (the MA method, the KA method, the Max-Min method, and the Space Partition method) were compared. The results show some differences among these methods: in particular, the MA method sometimes leads to incorrect clustering because of inappropriate initialization, depending on the type of data, and the Max-Min method is more effective than the other methods, especially when the data size is large.
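The abstract does not define the four initialization methods. The sketch below assumes that "Max-Min" refers to farthest-point seeding (each new center is the point farthest from its nearest already-chosen center), which may differ from the paper's definition, and pairs it with standard Lloyd iterations. All function names and the data are hypothetical.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def maxmin_init(points, k):
    """Farthest-point seeding (assumed reading of 'Max-Min'): start from
    points[0], then repeatedly add the point whose distance to its nearest
    chosen center is largest."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(nxt)
    return centers

def kmeans(points, centers, iters=100):
    """Lloyd's algorithm: alternate assignment and mean update until stable."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sqdist(p, centers[i]))
            clusters[j].append(p)
        # Keep the old center if a cluster happens to be empty
        new = [tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
               for i, c in enumerate(clusters)]
        if new == centers:
            break
        centers = new
    return centers, clusters

# Hypothetical data: three well-separated groups of three points each
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (0, 20), (1, 20), (0, 21)]
centers, clusters = kmeans(points, maxmin_init(points, 3))
print(centers)
```

Because the seeds are spread out by construction, each group receives one initial center here; a seeding that placed two initial centers in the same group could converge to a different partition, which is the sensitivity the paper examines.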


Communications for Statistical Applications and Methods | 2014

Nonparametric Estimation of Distribution Function using Bezier Curve

Whasoo Bae; Ryeongah Kim; Choongrak Kim

Abstract: In this paper we suggest an efficient method to estimate the distribution function using the Bezier curve, and compare it with existing methods by simulation studies. In addition, we suggest a robust version of the cross-validation criterion to estimate the number of Bezier points, and show that the proposed method is better than the existing methods based on simulation studies.

Keywords: Bezier points, cross validation, mean integrated square error, smoothing techniques.

1. Introduction

Nonparametric methods are often used to estimate the density function of a distribution function because parametric methods are unrealistic and too restrictive to satisfy a prespecified distribution. For the nonparametric estimation of the density function, the best references are Silverman (1986), Eubank (1988), Loader (1999) and Wasserman (2006). For nonparametric density estimation, kernel-type smoothing is widely used. Bezier curve (Bezier, 1977) smoothing (regarded as one of the kernel-type approaches) is another nonparametric method to estimate the density function and the regression function. In computational graphics (especially for computer-aided geometric design) Bezier curve smoothing is popular; however, it is rarely used in statistics. Kim (1996) applied a Bezier curve to density estimation for the first time in statistics and Kim


Communications for Statistical Applications and Methods | 2014

The General Linear Test in the Ridge Regression

Whasoo Bae; Minji Kim; Choongrak Kim

We derive a test statistic for the general linear test in the ridge regression model. The exact distribution of the test statistic is too difficult to derive; therefore, we suggest an approximate reference distribution. We use numerical studies to verify that the suggested distribution for the test statistic is appropriate. An asymptotic result for the test statistic is also considered.

1. Introduction

In regression analysis, the ridge regression model (Hoerl and Kennard, 1970) is a good alternative to the classical linear model when covariates are highly correlated. The ridge regression model has been studied by many researchers and has many good properties. Among them, there exists a ridge transformation parameter with which the ridge regression estimator has a smaller mean squared error than the classical linear regression estimator, and the ridge regression estimator is a Bayes estimator when the prior for the regression coefficients is Gaussian under the squared error loss function. For other properties of the ridge regression estimator, see Seber and Lee (2003) and Kim and Kang (2010). However, relatively few studies have addressed the testing problem for the ridge regression coefficients. The general linear test in the classical linear model is often used. One example is the Cobb-Douglas production function (Chipman and Rao, 1964) in the field of econometrics; however, the general linear test problem in ridge regression has not been studied. Obenchain (1977) studied interval estimation of the general linear combination of regression coefficients in ridge regression using the singular value decomposition of the design matrix, and Hoerl and Kennard (1990) proposed a degrees of freedom in the analysis of variance model using ridge regression. We study the general linear test problem in the ridge regression model and derive a test statistic for the general linear test, together with an approximate reference distribution. As far as we know, this problem has not been considered so far, since the test statistic under the general linear restriction is quite tedious and the corresponding degrees of freedom for the test statistic are quite different from those in the classical linear model. Note that the hat matrix in ridge regression is not idempotent, so the degrees of freedom are no longer an integer but a real number; therefore, the traditional F-distribution of the classical linear model is not exactly suitable. This paper is organized as follows. In Section 2, the ridge regression is defined and reviews on the general linear test in the classical linear model are given. Derivation of test statistic in the ridge regression model and an appropriate definition for the degrees of freedom is suggested, and an
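The remark about the hat matrix can be checked numerically. The sketch below assumes the usual ridge hat matrix H = X (X'X + kI)^{-1} X' and takes trace(H) as the effective degrees of freedom, a common convention that may not match the definition proposed in the paper. The two-column design matrix and the value k = 1 are hypothetical.

```python
def ridge_hat_matrix(X, k):
    """H = X (X'X + kI)^{-1} X' for an n x 2 design matrix X (list of rows)."""
    a = sum(r[0] * r[0] for r in X) + k
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + k
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]  # closed-form 2x2 inverse
    # XA = X @ inv, then H = XA @ X'
    XA = [[r[0] * inv[0][0] + r[1] * inv[1][0],
           r[0] * inv[0][1] + r[1] * inv[1][1]] for r in X]
    return [[u[0] * v[0] + u[1] * v[1] for v in X] for u in XA]

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def idempotency_gap(H):
    """Largest absolute entry of H @ H - H (zero for an idempotent matrix)."""
    HH = matmul(H, H)
    return max(abs(HH[i][j] - H[i][j]) for i in range(len(H)) for j in range(len(H)))

# Hypothetical design with two highly correlated columns
X = [[1.0, 0.9], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]]
H_ols = ridge_hat_matrix(X, 0.0)    # k = 0 reduces to ordinary least squares
H_ridge = ridge_hat_matrix(X, 1.0)  # k = 1 is an arbitrary ridge parameter

df_ols, df_ridge = trace(H_ols), trace(H_ridge)
print(df_ols, idempotency_gap(H_ols))
print(df_ridge, idempotency_gap(H_ridge))
```

For k = 0 the trace equals the number of columns and H is a projection; for k > 0 the trace falls strictly between 0 and that number and is generally not an integer, which is why an F reference distribution with integer degrees of freedom no longer applies exactly.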


Korean Journal of Applied Statistics | 2010

Computing the Repurchase Index Based on Statistical Modeling

Whasoo Bae; Woo-Seok Jung; Young-Bae Lee

This paper computes the repurchase index based on statistical modeling. Using the transaction records of a certain product, the repurchase index is obtained by fitting a Poisson regression model. Customers are classified into five groups based on the index, which indicates their propensity to repurchase.


Communications for Statistical Applications and Methods | 2007

Comparison Study of Multi-class Classification Methods

Whasoo Bae; Gab-Dong Jeon; Kyungha Seok

ECOC (Error Correcting Output Coding) is a multi-class classification method known to have a low classification error rate. This paper aims to identify an effective multi-class classification method (1) by comparing various encoding and decoding methods within ECOC and (2) by comparing ECOC with direct classification. Both SVM (Support Vector Machine) and logistic regression models were used as binary classifiers in the comparison.
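To make the encoding and decoding steps concrete, here is a small sketch of ECOC with Hamming-distance decoding. The code matrix is a hypothetical example and not one of the encodings compared in the paper. In practice each bit would come from a trained binary classifier such as an SVM or a logistic regression model; here the bits are supplied directly.

```python
def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def ecoc_decode(code_matrix, bits):
    """Return the class whose codeword is closest (in Hamming distance) to bits."""
    return min(code_matrix, key=lambda cls: hamming(code_matrix[cls], bits))

# Hypothetical code matrix: 4 classes, 7 binary problems (one per column).
# Every pair of codewords differs in 4 positions, so one wrong bit is corrected.
CODE = {
    "A": (0, 0, 0, 0, 0, 0, 0),
    "B": (1, 1, 1, 1, 0, 0, 0),
    "C": (1, 1, 0, 0, 1, 1, 0),
    "D": (1, 0, 1, 0, 1, 0, 1),
}

# Outputs of the 7 binary classifiers for an item of class C,
# with the fifth classifier making a mistake (1 -> 0)
predicted_bits = (1, 1, 0, 0, 0, 1, 0)
label = ecoc_decode(CODE, predicted_bits)
print(label)
```

The error-correcting property comes from the separation between codewords: with a minimum pairwise distance of 4, any single misclassified bit still leaves the true class strictly closest.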


Computational Statistics | 2005

Case influence diagnostics in the Kaplan-Meier estimator and the log-rank test

Choongrak Kim; Whasoo Bae

One or a few observations can be highly influential on the Kaplan-Meier estimator, and consequently on the log-rank test statistic used to compare two survival functions. In this paper we derive case influence diagnostics for the Kaplan-Meier estimator and the log-rank test. We note that diagnostics in this context are quite different from those in the regression context, where observations are usually assumed to be independent. Simulation studies are carried out to provide guidelines for identifying influential observations that deserve special attention. Illustrative examples are also given.

Collaboration


Dive into Whasoo Bae's collaborations.

Top Co-Authors

Choongrak Kim

Pusan National University

Byeong U. Park

Seoul National University

Hojin Yang

Pusan National University

Jungsu Lee

Pusan National University

Soonyoung Hwang

Pusan National University

Soyoung Noh

Pusan National University
