Hanjun Kim | Researchain

Publication

Featured research published by Hanjun Kim.

architectural support for programming languages and operating systems | 2010

Speculative parallelization using software multi-threaded transactions

Arun Raman; Hanjun Kim; Thomas R. Mason; Thomas B. Jablin; David I. August

With the right techniques, multicore architectures may be able to continue the exponential performance trend that elevated the performance of applications of all types for decades. While many scientific programs can be parallelized without speculative techniques, speculative parallelism appears to be the key to continuing this trend for general-purpose applications. Recently-proposed code parallelization techniques, such as those by Bridges et al. and by Thies et al., demonstrate scalable performance on multiple cores by using speculation to divide code into atomic units (transactions) that span multiple threads in order to expose data parallelism. Unfortunately, most software and hardware Thread-Level Speculation (TLS) memory systems and transactional memories are not sufficient because they only support single-threaded atomic units. Multi-threaded Transactions (MTXs) address this problem, but they require expensive hardware support as currently proposed in the literature. This paper proposes a Software MTX (SMTX) system that captures the applicability and performance of hardware MTX, but on existing multicore machines. The SMTX system yields a harmonic mean speedup of 13.36x on native hardware with four 6-core processors (24 cores in total) running speculatively parallelized applications.

programming language design and implementation | 2011

Parallelism orchestration using DoPE: the degree of parallelism executive

Arun Raman; Hanjun Kim; Taewook Oh; Jae W. Lee; David I. August

In writing parallel programs, programmers expose parallelism and optimize it to meet a particular performance goal on a single platform under an assumed set of workload characteristics. In the field, changing workload characteristics, new parallel platforms, and deployments with different performance goals make the programmer's development-time choices suboptimal. To address this problem, this paper presents the Degree of Parallelism Executive (DoPE), an API and run-time system that separates the concern of exposing parallelism from that of optimizing it. Using the DoPE API, the application developer expresses parallelism options. During program execution, DoPE's run-time system uses this information to dynamically optimize the parallelism options in response to the facts on the ground. We easily port several emerging parallel applications to DoPE's API and demonstrate the DoPE run-time system's effectiveness in dynamically optimizing the parallelism for a variety of performance goals.

ieee international conference on high performance computing data and analytics | 2011

A survey of the practice of computational science

Prakash Prabhu; Hanjun Kim; Taewook Oh; Thomas B. Jablin; Nick P. Johnson; Matthew Zoufaly; Arun Raman; Feng Liu; David Walker; Yun Zhang; Soumyadeep Ghosh; David I. August; Jialu Huang; Stephen R. Beard

Computing plays an indispensable role in scientific research. Presently, researchers in science have different problems, needs, and beliefs about computation than professional programmers. In order to accelerate the progress of science, computer scientists must understand these problems, needs, and beliefs. To this end, this paper presents a survey of scientists from diverse disciplines, practicing computational science at a doctoral-granting university with very high research activity. The survey covers many things, among them, prevalent programming practices within this scientific community, the importance of computational power in different fields, use of tools to enhance performance and software productivity, computational resources leveraged, and prevalence of parallel computation. The results reveal several patterns that suggest interesting avenues to bridge the gap between scientific researchers and programming tools developers.

symposium on code generation and optimization | 2012

Automatic speculative DOALL for clusters

Hanjun Kim; Nick P. Johnson; Jae W. Lee; Scott A. Mahlke; David I. August

Automatic parallelization for clusters is a promising alternative to time-consuming, error-prone manual parallelization. However, automatic parallelization is frequently limited by the imprecision of static analysis. Moreover, due to the inherent fragility of static analysis, small changes to the source code can significantly undermine performance. By replacing static analysis with speculation and profiling, automatic parallelization becomes more robust and applicable. A naïve automatic speculative parallelization does not scale for distributed memory clusters, due to the high bandwidth required to validate speculation. This work is the first automatic speculative DOALL (Spec-DOALL) parallelization system for clusters. We have implemented a prototype automatic parallelization system, called Cluster Spec-DOALL, which consists of a Spec-DOALL parallelizing compiler and a speculative runtime for clusters. Since the compiler optimizes communication patterns, and the runtime is optimized for the cases in which speculation succeeds, Cluster Spec-DOALL minimizes the communication and validation overheads of the speculative runtime. Across 8 benchmarks, Cluster Spec-DOALL achieves a geomean speedup of 43.8x on a 120-core cluster, whereas DOALL without speculation achieves only 4.5x speedup. This demonstrates that speculation makes scalable fully-automatic parallelization for clusters possible.
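As a rough illustration of the speculative DOALL idea described in this abstract, the sketch below runs every loop iteration in parallel against a snapshot, logs reads and writes, and then validates that no iteration read a location written by an earlier one. It is not the Cluster Spec-DOALL design: it uses threads in a single process rather than a distributed memory cluster, and the names `spec_doall`, `run_iteration`, `load`, and `store` are hypothetical, chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_iteration(i, memory, body):
    """Run one iteration against `memory`, logging its reads and buffering its writes."""
    reads, writes = set(), {}

    def load(key):
        if key in writes:          # reading our own write is not a cross-iteration read
            return writes[key]
        reads.add(key)
        return memory[key]

    def store(key, value):
        writes[key] = value        # buffered; nothing is visible until commit

    body(i, load, store)
    return reads, writes


def spec_doall(n, state, body, workers=4):
    """Speculatively run n iterations in parallel.

    Returns True if speculation succeeded, False if it was rolled back
    and the loop was re-executed sequentially.
    """
    snapshot = dict(state)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        logs = list(pool.map(lambda i: run_iteration(i, snapshot, body), range(n)))

    # Validate in iteration order: a read of a key written by an earlier
    # iteration is a cross-iteration flow dependence, i.e. misspeculation.
    written = set()
    for reads, writes in logs:
        if reads & written:
            # Recovery: discard all buffered writes and rerun sequentially.
            for i in range(n):
                _, seq_writes = run_iteration(i, state, body)
                state.update(seq_writes)
            return False
        written.update(writes)

    # Commit buffered writes in iteration order so the last writer wins.
    for _, writes in logs:
        state.update(writes)
    return True
```

An independent loop such as `store(("out", i), 2 * load(("in", i)))` commits without rollback, while a loop whose iteration `i` reads what iteration `i - 1` wrote fails validation and falls back to the sequential path.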

programming language design and implementation | 2012

Speculative separation for privatization and reductions

Nick P. Johnson; Hanjun Kim; Prakash Prabhu; Ayal Zaks; David I. August

Automatic parallelization is a promising strategy to improve application performance in the multicore era. However, common programming practices such as the reuse of data structures introduce artificial constraints that obstruct automatic parallelization. Privatization relieves these constraints by replicating data structures, thus enabling scalable parallelization. Prior privatization schemes are limited to arrays and scalar variables because they are sensitive to the layout of dynamic data structures. This work presents Privateer, the first fully automatic privatization system to handle dynamic and recursive data structures, even in languages with unrestricted pointers. To reduce sensitivity to memory layout, Privateer speculatively separates memory objects. Privateer's lightweight runtime system validates speculative separation and speculative privatization to ensure correct parallel execution. Privateer enables automatic parallelization of general-purpose C/C++ applications, yielding a geomean whole-program speedup of 11.4x over best sequential execution on 24 cores, while non-speculative parallelization yields only 0.93x.
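To make the notion of privatization in this abstract concrete, the sketch below contrasts a loop that reuses one scratch buffer across iterations with a version that gives each iteration its own copy. This is a hand-written illustration of the general idea, not Privateer's automatic or speculative mechanism; the function names and the histogram workload are assumptions made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def modes_shared(rows, bins):
    """Sequential loop that reuses one scratch buffer.

    The reuse creates a false dependence: every iteration overwrites the
    same buffer, so the iterations cannot safely run at the same time.
    """
    scratch = [0] * bins
    results = []
    for row in rows:
        for k in range(bins):
            scratch[k] = 0                     # reset the reused buffer
        for v in row:
            scratch[v] += 1
        results.append(max(range(bins), key=scratch.__getitem__))
    return results


def modes_privatized(rows, bins, workers=4):
    """Same computation with the scratch buffer privatized per iteration."""

    def one(row):
        scratch = [0] * bins                   # private copy: no sharing between iterations
        for v in row:
            scratch[v] += 1
        return max(range(bins), key=scratch.__getitem__)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one, rows))       # map preserves iteration order
```

Both functions return the most frequent bin for each row; only the privatized version has iterations that are independent of one another.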

architectural support for programming languages and operating systems | 2013

Practical automatic loop specialization

Taewook Oh; Hanjun Kim; Nick P. Johnson; Jae W. Lee; David I. August

Program specialization optimizes a program with respect to program invariants, including known, fixed inputs. These invariants can be used to enable optimizations that are otherwise unsound. In many applications, a program input induces predictable patterns of values across loop iterations, yet existing specializers cannot fully capitalize on this opportunity. To address this limitation, we present Invariant-induced Pattern based Loop Specialization (IPLS), the first fully-automatic specialization technique designed for everyday use on real applications. Using dynamic information-flow tracking, IPLS profiles the values of instructions that depend solely on invariants and recognizes repeating patterns across multiple iterations of hot loops. IPLS then specializes these loops, using those patterns to predict values across a large window of loop iterations. This enables aggressive optimization of the loop; conceptually, this optimization reconstructs recurring patterns induced by the input as concrete loops in the specialized binary. IPLS specializes real-world programs that prior techniques fail to specialize without requiring hints from the user. Experiments demonstrate a geomean speedup of 14.1% with a maximum speedup of 138% over the original codes when evaluated on three script interpreters and eleven scripts each.

Robotica | 2009

Rapid control prototyping for robot soccer

Junwon Jang; Soohee Han; Hanjun Kim; Choon Ki Ahn; Wook Hyun Kwon

In this paper, we propose rapid control prototyping (RCP) for robot soccer using the SIMTool, which has been developed at Seoul National University, Korea, for computer-aided control system design (CACSD). The proposed RCP enables the rapid design and verification of controls for two-wheeled mobile robots (TWMRs), the players in robot soccer, without writing C code directly or requiring special hardware. On the basis of the proposed RCP, a blockset for robot soccer is developed for easy design of a variety of mathematical and logical algorithms. All blocks in the blockset are made up of basic blocks offered by the SIMTool. Applied algorithms for specific purposes can be easily and efficiently constructed with just a combination of the blocks in the blockset. As one of the algorithms implemented with the developed blockset, a novel navigation algorithm, called a reactive navigation algorithm using the direction and the avoidance vectors based scheme (RNDAVS), is proposed. It is shown through simulations and experiments that the RNDAVS designed with the proposed RCP can avoid the local minima and goal non-reachable with obstacles nearby (GNRON) problems arising in existing methods. Furthermore, in order to validate the proposed RCP in a real game, we employ an official simulation game for robot soccer, the SimuroSot. Block diagrams are constructed for strategy, path calculation, and the interface to the SIMTool. We show that the algorithms implemented with the proposed RCP work well in the simulation game.

IFAC Proceedings Volumes | 2008

Rapid Control Prototyping for Robot Soccer

Junwon Jang; Soohee Han; Hanjun Kim; Choon Ki Ahn

In this paper, we propose rapid control prototyping (RCP) for robot soccer using the SIMTool, which has been developed at Seoul National University, Korea, for computer-aided control system design (CACSD). The proposed RCP enables the rapid design and verification of controls for two-wheeled mobile robots (TWMRs), the players in robot soccer, without writing C code directly or requiring special hardware. On the basis of the proposed RCP, a blockset for robot soccer is developed for easy design of a variety of mathematical and logical algorithms. All blocks in the blockset are made up of basic blocks offered by the SIMTool. User-defined algorithms can be easily and efficiently constructed with just a combination of the blocks in the blockset. In order to validate the proposed RCP in a real game, we employ an official simulation game for robot soccer, the SimuroSot. Block diagrams are constructed for strategy, path calculation, and the interface to the SIMTool. We show that the algorithms implemented with the proposed RCP work well in the simulation game.

Archive | 2013