Joong-Ho Won
Seoul National University
Publication
Featured research published by Joong-Ho Won.
Journal of Computational and Graphical Statistics | 2015
Donghyeon Yu; Joong-Ho Won; Tae Hoon Lee; Johan Lim; Sungroh Yoon
In this article, we propose a majorization–minimization (MM) algorithm for high-dimensional fused lasso regression (FLR) suitable for parallelization using graphics processing units (GPUs). The MM algorithm is stable and flexible as it can solve the FLR problems with various types of design matrices and penalty structures within a few tens of iterations. We also show that the convergence of the proposed algorithm is guaranteed. We conduct numerical studies to compare our algorithm with other existing algorithms, demonstrating that the proposed MM algorithm is competitive in many settings including the two-dimensional FLR with arbitrary design matrices. The merit of GPU parallelization is also exhibited. Supplementary materials are available online.
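For orientation, the fused lasso objective and the majorization step behind an MM algorithm can be written out as below. This is the standard textbook form, not a formula quoted from the article: the symbols y, X, \beta, \lambda_1, \lambda_2 and the edge set E (pairs of coefficients whose differences are penalized) are assumptions introduced here, and the surrogate the authors actually use may differ, for example by adding a small perturbation to keep the denominator away from zero.

```latex
% Assumed standard form of the FLR objective (not quoted from the article)
\min_{\beta \in \mathbb{R}^p}\;
  \frac{1}{2}\lVert y - X\beta \rVert_2^2
  + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
  + \lambda_2 \sum_{(j,k) \in E} \lvert \beta_j - \beta_k \rvert

% Quadratic majorizer of the absolute value at a nonzero current value c
\lvert x \rvert \;\le\; \frac{x^2}{2\lvert c \rvert} + \frac{\lvert c \rvert}{2},
\qquad \text{with equality when } \lvert x \rvert = \lvert c \rvert
```

Replacing each absolute value by its quadratic majorizer at the current iterate turns every MM step into a smooth quadratic problem, that is, a linear system in \beta, which is the kind of computation that tends to parallelize well.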
IEEE Transactions on Visualization and Computer Graphics | 2013
Joong-Ho Won; Yongkweon Jeon; Jarrett Rosenberg; Sungroh Yoon; Geoffrey D. Rubin; Sandy Napel
Direct projection of 3D branching structures, such as networks of cables, blood vessels, or neurons onto a 2D image creates the illusion of intersecting structural parts and creates challenges for understanding and communication. We present a method for visualizing such structures, and demonstrate its utility in visualizing the abdominal aorta and its branches, whose tomographic images might be obtained by computed tomography or magnetic resonance angiography, in a single 2D stylistic image, without overlaps among branches. The visualization method, termed uncluttered single-image visualization (USIV), involves optimization of geometry. This paper proposes a novel optimization technique that utilizes an interesting connection of the optimization problem regarding USIV to the protein structure prediction problem. Adopting the integer linear programming-based formulation for the protein structure prediction problem, we tested the proposed technique using 30 visualizations produced from five patient scans with representative anatomical variants in the abdominal aortic vessel tree. The novel technique can exploit commodity-level parallelism, enabling use of general-purpose graphics processing unit (GPGPU) technology that yields a significant speedup. Comparison of the results with the other optimization technique previously reported elsewhere suggests that, in most aspects, the quality of the visualization is comparable to that of the previous one, with a significant gain in the computation time of the algorithm.
Applied Intelligence | 2018
Baekjin Kim; Donghyeon Yu; Joong-Ho Won
Variable selection is important in high-dimensional data analysis. Lasso regression is useful since it offers sparsity, a soft-decision rule, and computational efficiency. However, since the Lasso penalized likelihood contains a nondifferentiable term, standard optimization tools cannot be applied. Many computational algorithms for optimizing the Lasso penalized likelihood function in high-dimensional settings have been proposed, including the coordinate descent (CD) algorithm, majorization-minimization using local quadratic approximation, the fast iterative shrinkage thresholding algorithm (FISTA), and the alternating direction method of multipliers (ADMM). In this paper, we undertake a comparative study that analyzes the relative merits of these algorithms. We are especially concerned with numerical sensitivity to the correlation between the covariates. We conduct a simulation study considering factors that affect the condition number of the covariance matrix of the covariates, as well as the level of penalization. We apply the algorithms to cancer biomarker discovery, and compare convergence speed and stability.
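As an illustration of one of the algorithms named in the abstract, the following is a minimal sketch of cyclic coordinate descent (CD) for the Lasso in the least-squares case. It is not the authors' implementation: the function names `soft_threshold` and `lasso_cd`, the 1/(2n) scaling of the loss, and the stopping rule are all assumptions chosen for this example.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200, tol=1e-8):
    """Cyclic coordinate descent for
    (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1  (assumed scaling)."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = np.asarray(y, dtype=float).copy()  # equals y - X @ beta at beta = 0
    col_sq = (X ** 2).sum(axis=0) / n             # x_j' x_j / n for each column
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot affect the fit
            old = beta[j]
            # Correlation of column j with the partial residual (column j excluded)
            rho = X[:, j] @ residual / n + col_sq[j] * old
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != old:
                residual -= X[:, j] * (new - old)  # keep the residual in sync
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break  # no coordinate moved appreciably in this sweep
    return beta
```

Each coordinate update has a closed form through soft-thresholding, which is why the nondifferentiable penalty poses no difficulty here. The per-coordinate step divides by the column's squared norm, so strongly correlated covariates tend to slow convergence, which is the kind of sensitivity the study examines.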
BMC Bioinformatics | 2018
Tae Hoon Lee; Sungmin Lee; Woo Young Sim; Yu Mi Jung; Sunmi Han; Joong-Ho Won; Hyeyoung Min; Sungroh Yoon
Background: DNA damage causes aging, cancer, and other serious diseases. The comet assay can detect multiple types of DNA lesions with high sensitivity, and it has been widely applied. Although comet assay platforms have improved the limited throughput and reproducibility of traditional assays in recent times, analyzing large quantities of comet data often requires a tremendous human effort. To overcome this challenge, we proposed HiComet, a computational tool that can rapidly recognize and characterize a large number of comets, using little user intervention. Results: We tested HiComet with real data from 35 high-throughput comet assay experiments, with over 700 comets in total. The proposed method provided unprecedented levels of performance as an automated comet recognition tool in terms of robustness (measured by precision and recall) and throughput. Conclusions: HiComet is an automated tool for high-throughput comet-assay analysis and could significantly facilitate characterization of individual comets by accelerating its most rate-limiting step. An online implementation of HiComet is freely available at https://github.com/taehoonlee/HiComet/.
Journal of Computational and Graphical Statistics | 2017
Tae Hoon Lee; Joong-Ho Won; Johan Lim; Sungroh Yoon
We present a massively parallel algorithm for the fused lasso, powered by multiple graphics processing units (GPUs). Our method is suitable for a class of large-scale sparse regression problems on which a two-dimensional lattice structure among the coefficients is imposed. This structure is important in many statistical applications, including image-based regression in which a set of images is used to locate image regions predictive of a response variable such as human behavior. Such large datasets are increasingly common. In our study, we employ the split Bregman method and the fast Fourier transform, which jointly have a high data-level parallelism that is distinct in a two-dimensional setting. Our multi-GPU parallelization achieves remarkably improved speed. Specifically, we obtained as much as 433 times improved speed over that of the reference CPU implementation. We demonstrate the speed and scalability of the algorithm using several datasets, including 8100 samples of 512 × 512 images. Compared to the single GPU counterpart, our method also showed improved computing speed as well as high scalability. We describe the various elements of our study as well as our experience with the subtleties in selecting an existing algorithm for parallelization. It is critical that memory bandwidth be carefully considered for multi-GPU algorithms. Supplementary material for this article is available online.
BMC Bioinformatics | 2018
Tae Hoon Lee; Sungmin Lee; Woo Young Sim; Yu Mi Jung; Sunmi Han; Joong-Ho Won; Hyeyoung Min; Sungroh Yoon
After publication of the original article [1], it has been found that the author affiliations have been accidentally left out in the PDF. The full affiliations can be found in this correction:
Korean Journal of Applied Statistics | 2016
Seyoon Ko; Goo Jun; Joong-Ho Won
Land cover classification is an important tool for preventing natural disasters, collecting environmental information, and monitoring natural resources. Hyperspectral imaging is widely used for this task thanks to sufficient spectral information. However, the curse of dimensionality, spatiotemporal variability, and lack of labeled data make it difficult to classify the land cover correctly. We propose a novel classification framework for land cover classification of hyperspectral data based on convolutional neural networks. The proposed framework naturally incorporates full spectral features with the information from neighboring pixels and has advantages over existing methods that require additional feature extraction or pre-processing steps. Empirical evaluation results show that the proposed framework provides good generalization power with classification accuracies better than (or comparable to) the most advanced existing classifiers.
Korean Journal of Applied Statistics | 2016
Seyoon Ko; Joong-Ho Won
Apache Spark is a fast and general-purpose cluster computing package. It provides a new abstraction named resilient distributed dataset, which supports fault tolerance while keeping data in memory. This type of abstraction results in a significant speedup compared to the legacy large-scale data framework MapReduce. In particular, the Spark framework is suitable for iterative machine learning applications such as logistic regression and K-means clustering, and for interactive data querying. Thanks to its versatility, Spark also supports high-level libraries for various applications such as machine learning, streaming data processing, database querying, and graph data mining. In this work, we introduce the concept and programming model of Spark and show some implementations of simple statistical computing applications. We also review the machine learning package MLlib and the R language interface SparkR.
IEEE Transactions on Biomedical Engineering | 2013
Yongkweon Jeon; Joong-Ho Won; Sungroh Yoon
Images captured using computed tomography and magnetic resonance angiography are used in the examination of the abdominal aorta and its branches. The examination of all clinically relevant branches simultaneously in a single 2-D image without any misleading overlaps facilitates the diagnosis of vascular abnormalities. This problem is called uncluttered single-image visualization (USIV). We can solve the USIV problem by assigning energy-based scores to visualization candidates and then finding the candidate that optimizes the score; this approach is similar to the manner in which the protein side-chain placement problem has been solved. To obtain near-optimum images, we need to explore the energy space extensively, which is often time consuming. This paper describes a method for exploring the energy space in a massively parallel fashion using graphics processing units. According to our experiments, in which we used 30 images obtained from five patients, the proposed method can reduce the total visualization time substantially. We believe that the proposed method can make a significant contribution to the effective visualization of abdominal vascular structures and precise diagnosis of related abnormalities.
Statistics & Probability Letters | 2014
Joong-Ho Won; Johan Lim; Donghyeon Yu; Byung Soo Kim; Kyunga Kim