Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Yoonho Park is active.

Publication


Featured research published by Yoonho Park.


International Conference on Management of Data | 2006

Design, implementation, and evaluation of the linear road benchmark on the stream processing core

Navendu Jain; Lisa Amini; Henrique Andrade; Richard P. King; Yoonho Park; Philippe Selo; Chitra Venkatramani

Stream processing applications have recently gained significant attention in the networking and database community. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically-distributed and rapidly-updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the performance bottlenecks of streaming data applications, in particular the Linear Road stream data management benchmark, in achieving good performance in large-scale distributed environments, using the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes, while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks in the Linear Road application in achieving scalability and low query response latency. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe it also provides useful insights into the overall system behavior for supporting other distributed and large-scale continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.
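As a rough illustration of the kind of continuous query Linear Road exercises (this is not the SPC implementation; the window size, toll formula, and all names below are invented for illustration), a minimal self-contained sketch in C++ might look like this:

    #include <cstdint>
    #include <deque>
    #include <iostream>
    #include <unordered_map>

    // Hypothetical position report, loosely modeled on Linear Road input tuples.
    struct PositionReport {
        int64_t time;     // seconds since start of simulation
        int32_t vehicle;  // vehicle id
        int32_t segment;  // highway segment the vehicle is in
    };

    // Count of position reports per segment over the last 60 seconds; the
    // window length and the quadratic toll formula are illustrative choices.
    class SegmentTollQuery {
    public:
        void onReport(const PositionReport& r) {
            window_.push_back(r);
            ++counts_[r.segment];
            // Expire reports older than 60 seconds.
            while (!window_.empty() && window_.front().time + 60 <= r.time) {
                if (--counts_[window_.front().segment] == 0)
                    counts_.erase(window_.front().segment);
                window_.pop_front();
            }
        }
        // Toll grows with congestion in the segment (illustrative formula).
        double tollFor(int32_t segment) const {
            auto it = counts_.find(segment);
            int64_t n = (it == counts_.end()) ? 0 : it->second;
            return n <= 10 ? 0.0 : 0.02 * (n - 10) * (n - 10);
        }
    private:
        std::deque<PositionReport> window_;
        std::unordered_map<int32_t, int64_t> counts_;
    };

    int main() {
        SegmentTollQuery q;
        for (int64_t t = 0; t < 120; ++t)
            q.onReport({t, static_cast<int32_t>(t % 200), static_cast<int32_t>(t % 3)});
        std::cout << "toll for segment 0: " << q.tollFor(0) << "\n";
    }

The point of the benchmark is that per-segment state like this must be maintained continuously over rapidly arriving position reports, which is exactly where the data-locality, buffering, and placement issues examined in the paper show up.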


Proceedings of the 4th International Workshop on Data Mining Standards, Services and Platforms | 2006

SPC: a distributed, scalable platform for data mining

Lisa Amini; Henrique Andrade; Ranjita Bhagwan; Frank Eskesen; Richard P. King; Philippe Selo; Yoonho Park; Chitra Venkatramani

The Stream Processing Core (SPC) is distributed stream processing middleware designed to support applications that extract information from a large number of digital data streams. In this paper, we describe the SPC programming model which, to the best of our knowledge, is the first to support stream-mining applications using a subscription-like model for specifying stream connections as well as to provide support for non-relational operators. This enables stream-mining applications to tap into, analyze and track an ever-changing array of data streams which may contain information relevant to the streaming-queries placed on it. We describe the design, implementation, and experimental evaluation of the SPC distributed middleware, which deploys applications on to the running system in an incremental fashion, making stream connections as required. Using micro-benchmarks and a representative large-scale synthetic stream-mining application, we evaluate the performance of the control and data paths of the SPC middleware.
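The abstract does not spell out SPC's descriptor format, so the following is only a hypothetical sketch of what a subscription-like stream connection could mean in practice: consumers declare a predicate over stream properties rather than naming a specific producer, and connections are made to whatever published streams match. All names and the property scheme are assumptions.

    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // A published stream is described by free-form properties (illustrative only).
    using StreamProperties = std::map<std::string, std::string>;

    // A subscription is a predicate over stream properties rather than a
    // hard-wired producer-to-consumer connection.
    using Subscription = std::function<bool(const StreamProperties&)>;

    int main() {
        std::vector<StreamProperties> published = {
            {{"type", "trades"}, {"exchange", "NYSE"}},
            {{"type", "quotes"}, {"exchange", "NASDAQ"}},
            {{"type", "trades"}, {"exchange", "NASDAQ"}},
        };
        // Hypothetical consumer: connect to every trade stream, whatever its source.
        Subscription wantsTrades = [](const StreamProperties& p) {
            auto it = p.find("type");
            return it != p.end() && it->second == "trades";
        };
        for (const auto& s : published)
            if (wantsTrades(s))
                std::cout << "connect to stream from " << s.at("exchange") << "\n";
    }

A model like this lets new streams be picked up as they appear, which matches the abstract's emphasis on tapping into an ever-changing array of data streams.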


ACM SIGOPS European Workshop | 2000

The SawMill multiserver approach

Alain Gefflaut; Trent Jaeger; Yoonho Park; Jochen Liedtke; Kevin Elphinstone; Volkmar Uhlig; Jonathon E. Tidswell; Luke Deller; Lars Reuther

Multiserver systems, operating systems composed from a set of hardware-protected servers, initially generated significant interest in the early 1990s. If a monolithic operating system could be decomposed into a set of servers with well-defined interfaces and well-understood protection mechanisms, then the robustness and configurability of operating systems could be improved significantly. However, initial multiserver systems [4, 14] were hampered by poor performance and software engineering complexity. The Mach microkernel [10] base suffered from a number of performance problems (e.g., IPC), and a number of difficult problems must be solved to enable the construction of a system from orthogonal servers (e.g., unified buffer management, coherent security, flexible server interface design, etc.). In the meantime, a number of important research results have been generated that lead us to believe that a re-evaluation of multiserver system architectures is warranted. First, microkernel technology has vastly improved since Mach. L4 [13] and Exokernel [6] are two recent microkernels upon which efficient servers have been constructed (i.e., L4Linux for L4 [12] and ExOS for Exokernel [9]). In these systems, the servers are independent OSes, but we are encouraged that the kernel and server overheads, in particular context-switch overheads, are minimized. Second, we have seen marked improvements in memory management approaches that enable zero-copy protocols (e.g., fbufs [5] and emulated copy [3]). Other advances include improved kernel modularity [7], component model services [8], multiserver security protocols, etc. Note that we are not the only researchers who believe it is time to re-examine multiservers, as a multiserver system is also being constructed on the Pebble kernel [11]. In addition, there is a greater need for multiserver architectures now. Consider the emergence of a variety of specialized, embedded systems. Traditionally, each embedded system includes a specialized operating system. Given the expected proliferation of such systems, the number of operating systems that must be built will increase significantly. Tools for configuring operating systems from existing servers will become increasingly more valuable, and adequate protection among servers will be necessary to guard valuable information that may be stored on such systems (e.g., private keys). This is exactly the motivation for multiserver systems. In this paper, we define the SawMill multiserver approach. This approach consists of: (1) an architecture upon which efficient and robust multiserver systems can be constructed and (2) a set of protocol design guidelines for solving key multiserver problems. First, the SawMill architecture consists of a set of user-level servers executing on the L4 microkernel and a set of services that enable these servers to obtain and manage resources locally. Second, the SawMill protocol design guidelines enable system designers to minimize the communication overheads introduced by protection boundaries between servers. We demonstrate the SawMill approach for two server systems derived from the Linux code base: (1) an Ext2 file system and (2) an IP network system. The remainder of the paper is structured as follows. In Section 2, we define the problems that must be solved in converting a monolithic operating system into a multiserver operating system. In Sections 3 and 4, we define the SawMill architecture and the protocol design approach, respectively. In Section 5, we demonstrate some of these guidelines in the file system and network system implementations. In Section 6, we examine the performance of the current SawMill Linux system.


IBM Journal of Research and Development | 2015

Active Memory Cube: A processing-in-memory architecture for exascale systems

Ravi Nair; Samuel F. Antao; Carlo Bertolli; Pradip Bose; José R. Brunheroto; Tong Chen; Chen-Yong Cher; Carlos H. Andrade Costa; J. Doi; Constantinos Evangelinos; Bruce M. Fleischer; Thomas W. Fox; Diego S. Gallo; Leopold Grinberg; John A. Gunnels; Arpith C. Jacob; P. Jacob; Hans M. Jacobson; Tejas Karkhanis; Choon Young Kim; Jaime H. Moreno; John Kevin Patrick O'Brien; Martin Ohmacht; Yoonho Park; Daniel A. Prener; Bryan S. Rosenburg; Kyung Dong Ryu; Olivier Sallenave; Mauricio J. Serrano; Patrick Siegl

Many studies point to the difficulty of scaling existing computer architectures to meet the needs of an exascale system (i.e., capable of executing 10^18 floating-point operations per second), consuming no more than 20 MW in power, by around the year 2020. This paper outlines a new architecture, the Active Memory Cube, which reduces the energy of computation significantly by performing computation in the memory module, rather than moving data through large memory hierarchies to the processor core. The architecture leverages a commercially demonstrated 3D memory stack called the Hybrid Memory Cube, placing sophisticated computational elements on the logic layer below its stack of dynamic random-access memory (DRAM) dies. The paper also describes an Active Memory Cube tuned to the requirements of a scientific exascale system. The computational elements have a vector architecture and are capable of performing a comprehensive set of floating-point and integer instructions, predicated operations, and gather-scatter accesses across memory in the Cube. The paper outlines the software infrastructure used to develop applications and to evaluate the architecture, and describes results of experiments on application kernels, along with performance and power projections.
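For scale, the stated exascale targets imply an energy-efficiency budget of roughly

    10^18 FLOP/s ÷ 20 MW = 10^18 FLOP/s ÷ (2 × 10^7 W) = 5 × 10^10 FLOP/s per watt = 50 GFLOPS/W

which is the gap the Active Memory Cube aims to help close by computing in the memory module instead of moving data through the cache hierarchy.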


Symposium on Computer Architecture and High Performance Computing | 2012

FusedOS: Fusing LWK Performance with FWK Functionality in a Heterogeneous Environment

Yoonho Park; Eric Van Hensbergen; Marius Hillenbrand; Todd Inglett; Bryan S. Rosenburg; Kyung Dong Ryu; Robert W. Wisniewski

Traditionally, there have been two approaches to providing an operating environment for high performance computing (HPC). A Full-Weight Kernel (FWK) approach starts with a general-purpose operating system and strips it down to better scale up across more cores and out across larger clusters. A Light-Weight Kernel (LWK) approach starts with a new thin kernel code base and extends its functionality by adding more system services needed by applications. In both cases, the goal is to provide end-users with a scalable HPC operating environment with the functionality and services needed to reliably run their applications. To achieve this goal, we propose a new approach, called FusedOS, that combines the FWK and LWK approaches. FusedOS provides an infrastructure capable of partitioning the resources of a multicore heterogeneous system and collaboratively running different operating environments on subsets of the cores and memory, without the use of a virtual machine monitor. With FusedOS, HPC applications can enjoy both the performance characteristics of an LWK and the rich functionality of an FWK through cross-core system service delegation. This paper presents the FusedOS architecture and a prototype implementation on Blue Gene/Q. The FusedOS prototype leverages Linux with small modifications as an FWK and implements a user-level LWK called Compute Library (CL) by leveraging CNK. We present CL performance results demonstrating low noise and show micro-benchmarks running with performance commensurate with that provided by CNK.
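As a conceptual sketch of cross-core system service delegation only (the request format, queue, and names below are invented for illustration and are not taken from the Blue Gene/Q prototype): cores running the light-weight environment ship work they cannot satisfy to a core running the full-weight kernel and get the result back.

    #include <functional>
    #include <iostream>
    #include <queue>
    #include <string>

    // Hypothetical delegated service request; the completion callback stands in
    // for the reply path back to the requesting LWK core.
    struct ServiceRequest {
        std::string name;                     // e.g. "open", "write" (illustrative)
        std::function<void(int)> completion;  // delivered back to the LWK side
    };

    int main() {
        std::queue<ServiceRequest> toFWK;  // stands in for a shared-memory channel

        // LWK side: enqueue a request instead of handling the syscall locally.
        toFWK.push({"open /etc/hosts", [](int fd) {
            std::cout << "LWK: got descriptor " << fd << "\n";
        }});

        // FWK (Linux) side: drain the queue, perform the real work, reply.
        while (!toFWK.empty()) {
            ServiceRequest req = std::move(toFWK.front());
            toFWK.pop();
            std::cout << "FWK: handling " << req.name << "\n";
            req.completion(3);  // pretend the file was opened as fd 3
        }
    }

The LWK cores stay free of kernel noise because the heavyweight functionality runs elsewhere, which is the property the CL noise measurements in the paper are meant to demonstrate.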


Workshop on Hot Topics in Operating Systems | 1999

Flexible access control using IPC redirection

Trent Jaeger; Kevin Elphinstone; Jochen Liedtke; Vsevolod Panteleenko; Yoonho Park

We present a mechanism for inter-process communication (IPC) redirection that enables efficient and flexible access control for micro-kernel systems. In such systems, services are implemented at user level, so IPC is the only means of communication between them. Thus, the system must be able to mediate IPCs to enforce its access control policy. Such mediation must enable enforcement of security policy with as little performance overhead as possible, but current mechanisms either: (1) place significant access control functionality in the kernel which increases IPC cost or (2) are static and require more IPCs than necessary to enforce access control. We define an IPC redirection mechanism that makes two improvements: (1) it removes the management of redirection policy from the kernel, so access control enforcement can be implemented outside the kernel; and (2) it separates the notion of who controls the redirection policy from the redirections themselves, so redirections can be configured arbitrarily and dynamically. We define our redirection mechanism, demonstrate its use, and examine possible efficient implementations.
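A minimal model of the redirection idea, assuming a per-sender table maintained by a user-level policy manager (thread names, the table layout, and the lookup are illustrative and are not L4's actual interface):

    #include <iostream>
    #include <map>
    #include <string>
    #include <utility>

    // Hypothetical thread identifiers; a real kernel uses thread IDs, not strings.
    using ThreadId = std::string;

    // Messages from a given sender to a given receiver are transparently rerouted
    // to a monitor chosen by policy; the kernel only consults the table, it does
    // not own the policy. This is a conceptual model only.
    struct RedirectionTable {
        std::map<std::pair<ThreadId, ThreadId>, ThreadId> redirect;

        ThreadId route(const ThreadId& sender, const ThreadId& receiver) const {
            auto it = redirect.find({sender, receiver});
            return it == redirect.end() ? receiver : it->second;
        }
    };

    int main() {
        RedirectionTable table;
        // Policy decision: IPC from "untrusted_app" to "file_server" must first
        // pass through "security_monitor".
        table.redirect[{"untrusted_app", "file_server"}] = "security_monitor";

        std::cout << table.route("untrusted_app", "file_server") << "\n";  // security_monitor
        std::cout << table.route("trusted_app", "file_server") << "\n";    // file_server
    }

Because the table is data rather than kernel policy, redirections can be changed dynamically without adding access-control logic to the IPC fast path, which is the trade-off the paper targets.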


Annual Computer Security Applications Conference | 2001

The SawMill framework for virtual memory diversity

Mohit Aron; Yoonho Park; Trent Jaeger; Jochen Liedtke; Kevin Elphinstone; Luke Deller

We present a framework that allows applications to build and customize VM services on the L4 microkernel. While the L4 microkernel's abstractions are quite powerful, using these abstractions effectively requires higher-level paradigms. We propose the dataspace paradigm, which provides a modular VM framework. The modularity introduced by the dataspace paradigm facilitates implementation and permits dynamic configurability. Initial performance results from a prototype are promising.
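A conceptual sketch of the dataspace idea, with invented interface names: a region of an address space is backed by an object that resolves its page faults, and swapping that backing object is what gives the "VM diversity" of the title.

    #include <cstddef>
    #include <iostream>
    #include <memory>
    #include <vector>

    // Illustrative dataspace interface (4 KiB pages); not SawMill's actual API.
    class Dataspace {
    public:
        virtual ~Dataspace() = default;
        // Return the backing page for a faulting offset.
        virtual const char* resolveFault(std::size_t offset) = 0;
    };

    // Zero-filled anonymous memory.
    class AnonymousDataspace : public Dataspace {
    public:
        const char* resolveFault(std::size_t) override { return zeroPage_; }
    private:
        char zeroPage_[4096] = {};
    };

    // File-backed memory (file contents held in a buffer for this sketch).
    class FileDataspace : public Dataspace {
    public:
        explicit FileDataspace(std::vector<char> contents) : contents_(std::move(contents)) {}
        const char* resolveFault(std::size_t offset) override {
            return contents_.data() + (offset / 4096) * 4096;
        }
    private:
        std::vector<char> contents_;
    };

    int main() {
        std::unique_ptr<Dataspace> anon = std::make_unique<AnonymousDataspace>();
        std::unique_ptr<Dataspace> file =
            std::make_unique<FileDataspace>(std::vector<char>(8192, 'x'));
        std::cout << static_cast<int>(anon->resolveFault(0)[0]) << "\n";  // 0
        std::cout << file->resolveFault(4096)[0] << "\n";                 // x
    }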


High Performance Computational Finance | 2009

Implementing a high-volume, low-latency market data processing system on commodity hardware using IBM middleware

Xiaolan Joy Zhang; Henrique Andrade; Bugra Gedik; Richard P. King; John F. Morar; Senthil Nathan; Yoonho Park; Raju Pavuluri; Edward John Pring; Randall Richard Schnier; Philippe Selo; Michael John Elvery Spicer; Volkmar Uhlig; Chitra Venkatramani

A stock market data processing system that can handle high data volumes at low latencies is critical to market makers. Such systems play a critical role in algorithmic trading, risk analysis, market surveillance, and many other related areas. We show that such a system can be built with general-purpose middleware and run on commodity hardware. The middleware we use is IBM System S, which has been augmented with transport technology from IBM WebSphere MQ Low Latency Messaging. Using eight commodity x86 blades connected with Ethernet and InfiniBand, this system can achieve 80 μsec average latency at 3 times the February 2008 options market data rate and 206 μsec average latency at 15 times the February 2008 rate.


International Conference on Computer Communications | 2008

Towards Optimal Resource Allocation in Partial-Fault Tolerant Applications

Nikhil Bansal; Ranjita Bhagwan; Navendu Jain; Yoonho Park; Deepak S. Turaga; Chitra Venkatramani



Computing Frontiers | 2015

Data access optimization in a processing-in-memory system

Zehra Sura; Arpith C. Jacob; Tong Chen; Bryan S. Rosenburg; Olivier Sallenave; Carlo Bertolli; Samuel F. Antao; José R. Brunheroto; Yoonho Park; Kevin O'Brien; Ravi Nair

