Publication


Featured research published by Jaechun No.


High Performance Distributed Computing | 1999

Data management for large-scale scientific computations in high performance distributed systems

Alok N. Choudhary; Mahmut T. Kandemir; Harsha S. Nagesh; Jaechun No; Xiaohui Shen; Valerie E. Taylor; Sachin More; Rajeev Thakur

With the increasing number of scientific applications manipulating huge amounts of data, effective high-level data management is an increasingly important problem. Unfortunately, so far the solutions to the high-level data management problem either require deep understanding of specific storage architectures and file layouts (as in high-performance file storage systems) or produce unsatisfactory I/O performance in exchange for ease-of-use and portability (as in relational DBMSs). In this paper we present a novel application development environment which is built around an active meta-data management system (MDMS) to handle high-level data in an effective manner. The key components of our three-tiered architecture are user application, the MDMS, and a hierarchical storage system (HSS). Our environment overcomes the performance problems of pure database-oriented solutions, while maintaining their advantages in terms of ease-of-use and portability. The high levels of performance are achieved by the MDMS, with the aid of user-specified, performance-oriented directives. Our environment supports a simple, easy-to-use yet powerful user interface, leaving the task of choosing appropriate I/O techniques for the application at hand to the MDMS. We discuss the importance of an active MDMS and show how the three components of our environment, namely the application, the MDMS, and the HSS, fit together. We also report performance numbers from our ongoing implementation and illustrate that significant improvements are made possible without undue programming effort.
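As a rough, hypothetical illustration of the directive-driven design described above (none of these names or types come from the paper, and the MDMS is reduced to a stub), the following C sketch shows how an application might register a dataset with an access hint and let the metadata layer pick an I/O strategy:

```c
/* Hypothetical sketch, not the paper's API: an application hands a
 * performance directive to an MDMS-like layer, which chooses the I/O
 * technique on the application's behalf. */
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *name;        /* dataset name registered with the MDMS */
    const char *access_hint; /* e.g. "sequential", "strided", "random" */
} mdms_directive_t;

/* Stub standing in for the MDMS: maps a hint to an I/O strategy. */
static const char *mdms_choose_io(const mdms_directive_t *d)
{
    if (strcmp(d->access_hint, "strided") == 0)
        return "collective I/O";          /* merge small strided requests */
    if (strcmp(d->access_hint, "sequential") == 0)
        return "large contiguous transfers";
    return "independent I/O";
}

int main(void)
{
    mdms_directive_t d = { "astro3d_density", "strided" };
    printf("dataset %s -> %s\n", d.name, mdms_choose_io(&d));
    return 0;
}
```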


Conference on High Performance Computing (Supercomputing) | 2000

Integrating Parallel File I/O and Database Support for High-Performance Scientific Data Management

Jaechun No; Rajeev Thakur; Alok N. Choudhary

Many scientific applications have large I/O requirements, in terms of both the size of data and the number of files or data sets. Management, storage, efficient access, and analysis of this data present an extremely challenging task. Traditionally, two different solutions are used for this problem: file I/O or databases. File I/O can provide high performance but is tedious to use with large numbers of files and large and complex data sets. Databases can be convenient, flexible, and powerful but do not perform and scale well for parallel supercomputing applications. We have developed a software system, called Scientific Data Manager (SDM), that aims to combine the good features of both file I/O and databases. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. SDM takes advantage of various I/O optimizations available in MPI-IO, such as collective I/O and noncontiguous requests, in a manner that is transparent to the user. As a result, users can write and retrieve data with the performance of parallel file I/O, without having to bother with the details of actually performing file I/O. In this paper, we describe the design and implementation of SDM. With the help of two parallel application templates, ASTRO3D and an Euler solver, we illustrate how some of the design criteria affect performance.
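The collective-I/O optimization that SDM applies internally can be shown with a minimal stand-alone C/MPI sketch; this is our own example, not SDM code, and the file name and sizes are made up:

```c
/* Minimal collective MPI-IO write: each rank writes one contiguous block,
 * and the _all variant lets the MPI-IO layer merge requests across ranks
 * (two-phase collective I/O). Compile with an MPI compiler, e.g. mpicc. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;                    /* local element count */
    double *buf = malloc(n * sizeof(double));
    for (int i = 0; i < n; i++)
        buf[i] = rank + i * 1e-6;             /* fabricated data */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "dataset.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    MPI_Offset off = (MPI_Offset)rank * n * sizeof(double);
    MPI_File_write_at_all(fh, off, buf, n, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```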


IEEE International Conference on High Performance Computing, Data, and Analytics | 1998

COMPASSION: A Parallel I/O Runtime System Including Chunking and Compression for Irregular Applications

Jesús Carretero; Jaechun No; Sung-Soon Park; Alok N. Choudhary; Pang Chen

In this paper we present two designs, namely, “Collective I/O” and “Pipelined Collective I/O,” of a runtime library for irregular applications based on the two-phase collective I/O technique. We also present the optimization of both models by using chunking and compression mechanisms. In the first scheme, all processors participate in compression and I/O at the same time, making scheduling of I/O requests simpler but creating a possibility of contention at the I/O nodes. In the second approach, processors are grouped into several groups, overlapping communication, compression, and I/O to reduce I/O contention dynamically. Finally, we present evaluation results demonstrating significantly higher I/O performance than has been possible so far.
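A minimal sketch of the pipelined design, under our own assumptions (two groups, barrier-separated stages; the real library overlaps the stages rather than synchronizing them):

```c
/* Sketch: ranks are split into groups; at each stage one group does
 * (compression and) I/O while the others rearrange data. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int ngroups = 2;            /* illustrative group count */
    int group = rank % ngroups;
    MPI_Comm gcomm;                   /* would carry intra-group traffic */
    MPI_Comm_split(MPI_COMM_WORLD, group, rank, &gcomm);

    for (int stage = 0; stage < ngroups; stage++) {
        if (group == stage)
            printf("rank %d (group %d): compress + I/O stage\n", rank, group);
        else
            printf("rank %d (group %d): communication stage\n", rank, group);
        MPI_Barrier(MPI_COMM_WORLD);  /* stands in for dynamic pipelining */
    }

    MPI_Comm_free(&gcomm);
    MPI_Finalize();
    return 0;
}
```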


Journal of Parallel and Distributed Computing | 2003

High-performance scientific data management system

Jaechun No; Rajeev Thakur; Alok N. Choudhary

Many scientific applications have large I/O requirements, in terms of both the size of data and the number of files or data sets. Management, storage, efficient access, and analysis of this data present an extremely challenging task. Traditionally, two different solutions have been used for this task: file I/O or databases. File I/O can provide high performance but is tedious to use with large numbers of files and large and complex data sets. Databases can be convenient, flexible, and powerful but do not perform and scale well for parallel supercomputing applications. We have developed a software system, called Scientific Data Manager (SDM), that combines the good features of both file I/O and databases. SDM provides a high-level application programming interface to the user and, internally, uses a parallel file system to store real data (using various I/O optimizations available in MPI-IO) and a database to store application-related metadata. In order to support I/O in irregular applications, SDM makes extensive use of MPI-IO's noncontiguous collective I/O functions. Moreover, SDM uses the concept of a history file to optimize the cost of the index distribution using the metadata stored in the database. We describe the design and implementation of SDM and present performance results with two regular applications, ASTRO3D and an Euler solver, and with two irregular applications, a CFD code called FUN3D and a Rayleigh-Taylor instability code.
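The noncontiguous collective I/O that SDM relies on can be illustrated with a small stand-alone example (ours, not SDM code; the index values and file name are fabricated): each rank describes its scattered file elements with a derived datatype and reads them in a single collective call.

```c
/* Each rank reads four noncontiguous doubles from a shared file with one
 * collective call; MPI-IO can merge the scattered requests across ranks.
 * Assumes mesh.bin already exists and is large enough. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int count = 4;
    int displs[4] = { rank * 8, rank * 8 + 3, rank * 8 + 5, rank * 8 + 6 };
    double buf[4];

    MPI_Datatype ftype;   /* derived datatype describing the file region */
    MPI_Type_create_indexed_block(count, 1, displs, MPI_DOUBLE, &ftype);
    MPI_Type_commit(&ftype);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "mesh.bin", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_DOUBLE, ftype, "native", MPI_INFO_NULL);
    MPI_File_read_all(fh, buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&ftype);
    MPI_Finalize();
    return 0;
}
```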


Journal of Parallel and Distributed Computing | 2002

Design and Implementation of a Parallel I/O Runtime System for Irregular Applications

Jaechun No; Sung-Soon Park; Jesús Carretero Pérez; Alok N. Choudhary

We present the design, implementation, and evaluation of a runtime system based on collective I/O techniques for irregular applications. The design is motivated by the requirements of a large number of science and engineering applications, including teraflops applications, where the data must be reorganized into a canonical form for further processing or restarts. We present two designs: “collective I/O” and “pipelined collective I/O.” In the first design, all processors participate in I/O simultaneously, making scheduling of I/O requests simpler but creating possible contention at the I/O nodes. In the second design, processors are organized into several groups, so that only one group performs I/O while the next group performs the communication to rearrange data; this entire process is dynamically pipelined to reduce I/O node contention. In other words, the design provides support for dynamic contention management. We also present a software caching method using collective I/O to reduce I/O cost by reusing the data already present in the memory of other nodes. Chunking and on-line compression mechanisms are included in both models. We present performance results on the Intel Paragon at Caltech and on the ASCI/Red teraflops machine at Sandia National Laboratories.
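The on-line compression step can be sketched with zlib; this is our simplification under stated assumptions (in-memory compression of one fabricated chunk), not the library's actual code:

```c
/* Compress a data chunk in memory before the I/O phase writes it out.
 * Build: cc compress_demo.c -lz */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
    const uLong chunk_len = 64 * 1024;     /* fabricated 64 KB chunk */
    Bytef *chunk = malloc(chunk_len);
    memset(chunk, 7, chunk_len);           /* repetitive "simulation" data */

    uLongf comp_len = compressBound(chunk_len);
    Bytef *comp = malloc(comp_len);
    if (compress(comp, &comp_len, chunk, chunk_len) != Z_OK) {
        fprintf(stderr, "compression failed\n");
        return 1;
    }
    printf("chunk: %lu bytes -> %lu bytes compressed\n",
           (unsigned long)chunk_len, (unsigned long)comp_len);

    free(comp);
    free(chunk);
    return 0;
}
```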


International Parallel and Distributed Processing Symposium | 2001

A scientific data management system for irregular applications

Jaechun No; Rajeev Thakur; Dinesh K. Kaushik; Lori A. Freitag; Alok N. Choudhary

Many scientific applications are I/O intensive and generate large data sets, spanning hundreds or thousands of files. Management, storage, efficient access, and analysis of this data present an extremely challenging task. We have developed a software system, called Scientific Data Manager (SDM), that uses a combination of parallel file I/O and database support for high-performance scientific data management. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. In this paper, we describe how we designed and implemented SDM to support irregular applications. SDM can efficiently handle the reading and writing of data in an irregular mesh, as well as the distribution of index values. We describe the SDM user interface and how we have implemented it to achieve high performance. SDM makes extensive use of MPI-IO's noncontiguous collective I/O functions. SDM also uses the concept of a history file to optimize the cost of the index distribution using the metadata stored in the database. We present performance results with two irregular applications, a CFD code called FUN3D and a Rayleigh-Taylor instability code, on the SGI Origin2000 at Argonne National Laboratory.
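The history-file idea can be reduced to a toy C sketch (the file name and layout are ours, not SDM's): the first run saves the computed index distribution, and later runs read it back instead of recomputing and redistributing the indices.

```c
/* Reuse a previously computed index distribution via a history file. */
#include <stdio.h>

static void compute_distribution(int *displs, int n)
{
    for (int i = 0; i < n; i++)
        displs[i] = i * 3;          /* stand-in for the expensive step */
}

int main(void)
{
    enum { N = 8 };
    int displs[N];

    FILE *hist = fopen("index.hist", "rb");
    if (hist && fread(displs, sizeof(int), N, hist) == N) {
        fclose(hist);
        printf("reused index distribution from history file\n");
    } else {
        if (hist) fclose(hist);
        compute_distribution(displs, N);
        hist = fopen("index.hist", "wb");
        if (hist) {
            fwrite(displs, sizeof(int), N, hist);
            fclose(hist);
        }
        printf("computed index distribution, saved history file\n");
    }
    return 0;
}
```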


IEEE International Conference on High Performance Computing, Data, and Analytics | 1999

High Performance Parallel I/O Schemes for Irregular Applications on Clusters of Workstations

Jaechun No; Jesús Carretero; Alok N. Choudhary

With the convergence of fast microprocessors and low-latency, high-bandwidth communication networks, clusters of workstations are being used for high-performance computing. In this paper we present the design and implementation of a runtime system, called “Collective I/O Clustering,” to support irregular applications on clusters of workstations. The system provides a friendly programming model for performing I/O in irregular applications on clusters of workstations and is completely integrated with the underlying communication and I/O system. All the performance results were obtained on the IBM SP machine at Argonne National Laboratory.


Scientific Programming | 2016

MultiCache: Multilayered Cache Implementation for I/O Virtualization

Jaechun No; Sung-Soon Park

As virtual machine technology becomes an essential component of the cloud environment, VDI is receiving explosive attention from the IT market due to its advantages of easier software management, greater data protection, and lower expenses. However, I/O overhead is the critical obstacle to achieving high system performance in VDI. Reducing I/O overhead in the virtualization environment is not an easy task, because it requires scrutinizing multiple software layers, from guest to hypervisor and from hypervisor to host. In this paper, we propose a multilayered cache implementation, called MultiCache, which combines guest-level I/O optimization with hypervisor-level I/O optimization. The main objective of the guest-level optimization is to mitigate the I/O latency between the back-end shared storage and the guest VM by utilizing history logs of I/O activities in the VM. The hypervisor-level optimization, on the other hand, minimizes the latency caused by passing the I/O path to the host and by contention for the physical I/O device among VMs on the same host server. We measured the performance of MultiCache using the Postmark benchmark to verify its effectiveness.
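The history-log-driven admission described for the guest level can be caricatured in a few lines of C; the policy, names, and sizes below are our own invention, not MultiCache's implementation:

```c
/* Toy history-driven cache: block accesses are logged, and a block whose
 * access count crosses a threshold is admitted to the guest cache. */
#include <stdio.h>
#include <stdbool.h>

#define NBLOCKS       16   /* tiny backing store for the demo */
#define HOT_THRESHOLD  3   /* accesses before a block counts as hot */

static unsigned history[NBLOCKS];  /* per-block access log */
static bool     cached[NBLOCKS];   /* is the block in the guest cache? */

static const char *access_block(int blk)
{
    history[blk]++;
    if (cached[blk])
        return "cache hit";
    if (history[blk] >= HOT_THRESHOLD) {
        cached[blk] = true;              /* admit hot block */
        return "miss (now cached)";
    }
    return "miss (forwarded to shared storage)";
}

int main(void)
{
    int trace[] = { 2, 7, 2, 2, 7, 2, 7, 7 };  /* synthetic access trace */
    for (unsigned i = 0; i < sizeof trace / sizeof *trace; i++)
        printf("block %d: %s\n", trace[i], access_block(trace[i]));
    return 0;
}
```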


Journal of the Institute of Electronics Engineers of Korea | 2016

Performance Evaluation and Analysis for Block I/O Access Pattern between KVM-based Virtual Machine and Real Machine in the Virtualized Environment

Hyeunjee Kim; Youngwoo Kim; Young-Min Kim; Hoonha Choi; Jaechun No; Sung-Soon Park

Recently, virtualization has become a critical issue in cloud computing due to its advantages in resource utilization and consolidation. To use virtualization services efficiently, several issues should be taken into account, including data reliability, security, and performance. In particular, a high write bandwidth on the virtual machine must be guaranteed to provide fast responsiveness to users. In this study, we implemented a way of visualizing comparison results between the block write pattern of a KVM-based virtual machine and that of the real machine. Our final objective is to propose an optimized virtualization environment that accelerates the disk write bandwidth.
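A write-pattern trace of the kind compared in the study can be collected with a short userspace C sketch (our own, not the paper's tooling): each write's offset and size are appended to a CSV that a plotting tool can visualize later.

```c
/* Log the (offset, size) of every write so the pattern can be plotted. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("target.dat", O_WRONLY | O_CREAT, 0644);
    FILE *log = fopen("write_pattern.csv", "w");
    fprintf(log, "offset,size\n");

    const char buf[4096] = {0};                 /* fabricated payload */
    off_t offsets[] = { 0, 8192, 4096, 65536 }; /* fabricated workload */
    for (unsigned i = 0; i < sizeof offsets / sizeof *offsets; i++) {
        pwrite(fd, buf, sizeof buf, offsets[i]);
        fprintf(log, "%lld,%zu\n", (long long)offsets[i], sizeof buf);
    }

    fclose(log);
    close(fd);
    return 0;
}
```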


The Scientific World Journal | 2014

ReHypar: A Recursive Hybrid Chunk Partitioning Method Using NAND-Flash Memory SSD

Jaechun No; Sung-Soon Park; Cheol-Su Lim

Due to the rapid development of flash memory, SSD is considered to be the replacement for HDD in the storage market. Although SSD retains several promising characteristics, such as high random I/O performance and nonvolatility, its high cost per capacity is the main obstacle to replacing HDD in all storage solutions. An alternative is to provide a hybrid structure in which a small portion of SSD address space is combined with the much larger HDD address space. In such a structure, maximizing the space utilization of SSD in a cost-effective way is extremely important for generating high I/O performance. We developed ReHypar (recursive hybrid chunk partitioning), which improves the space utilization of SSD in the hybrid structure. The first objective of ReHypar is to mitigate the fragmentation overhead of the SSD address space by reusing the remaining free space of I/O units as much as possible. Furthermore, ReHypar allows defining several logical data sections in the SSD address space, with each section configured with a different I/O unit. We integrated ReHypar with ext2 and ext4 and evaluated it using two public benchmarks, IOzone and Postmark.
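To make the chunk-reuse idea concrete, here is a toy recursive sketch (our own simplification, not ReHypar's actual algorithm): after a write consumes part of an I/O unit, the leftover space is recursively re-partitioned into smaller units so it remains allocatable.

```c
/* Recursively carve the leftover space of an I/O unit into progressively
 * smaller units instead of losing it to fragmentation. */
#include <stdio.h>

static void reuse_leftover(long offset, long leftover, long unit)
{
    if (leftover <= 0 || unit <= 0)
        return;
    if (leftover >= unit) {
        printf("reusable %ld-byte unit at offset %ld\n", unit, offset);
        reuse_leftover(offset + unit, leftover - unit, unit);
    } else {
        reuse_leftover(offset, leftover, unit / 2);  /* halve the unit */
    }
}

int main(void)
{
    const long UNIT = 4096;  /* illustrative I/O unit of one data section */
    const long used = 1300;  /* bytes consumed by a write */
    printf("write uses %ld of a %ld-byte unit\n", used, UNIT);
    reuse_leftover(used, UNIT - used, UNIT / 2);
    return 0;
}
```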

Collaboration


Dive into Jaechun No's collaborations.

Top Co-Authors

Jesús Carretero (Universidad Carlos III de Madrid)
Rajeev Thakur (Argonne National Laboratory)
Dinesh K. Kaushik (Argonne National Laboratory)
Lori A. Freitag (Argonne National Laboratory)
Mahmut T. Kandemir (Pennsylvania State University)