Hyojin Sung
University of Illinois at Urbana–Champaign
Publication
Featured research published by Hyojin Sung.
conference on object-oriented programming systems, languages, and applications | 2009
Robert L. Bocchino; Vikram S. Adve; Danny Dig; Sarita V. Adve; Stephen T. Heumann; Rakesh Komuravelli; Jeffrey Overbey; Patrick Simmons; Hyojin Sung; Mohsen Vakilian
Today's shared-memory parallel programming models are complex and error-prone. While many parallel programs are intended to be deterministic, unanticipated thread interleavings can lead to subtle bugs and nondeterministic semantics. In this paper, we demonstrate that a practical type and effect system can simplify parallel programming by guaranteeing deterministic semantics with modular, compile-time type checking even in a rich, concurrent object-oriented language such as Java. We describe an object-oriented type and effect system that provides several new capabilities over previous systems for expressing deterministic parallel algorithms. We also describe a language called Deterministic Parallel Java (DPJ) that incorporates the new type system features, and we show that a core subset of DPJ is sound. We describe an experimental validation showing that DPJ can express a wide range of realistic parallel programs; that the new type system features are useful for such programs; and that the parallel programs exhibit good performance gains (coming close to or beating equivalent, nondeterministic multithreaded programs where those are available).
architectural support for programming languages and operating systems | 2013
Hyojin Sung; Rakesh Komuravelli; Sarita V. Adve
Recent work has shown that disciplined shared-memory programming models that provide deterministic-by-default semantics can simplify both parallel software and hardware. Specifically, the DeNovo hardware system has shown that the software guarantees of such models (e.g., data-race-freedom and explicit side-effects) can enable simpler, higher performance, and more energy-efficient hardware than the current state-of-the-art for deterministic programs. Many applications, however, contain non-deterministic parts; e.g., using lock synchronization. For commercial hardware to exploit the benefits of DeNovo, it is therefore necessary to extend DeNovo to support non-deterministic applications. This paper proposes DeNovoND, a system that supports lock-based, disciplined non-determinism, with the simplicity, performance, and energy benefits of DeNovo. We use a combination of distributed queue-based locks and access signatures to implement simple memory consistency semantics for safe non-determinism, with a coherence protocol that does not require transient states, invalidation traffic, or directories, and does not incur false sharing. The resulting system is simpler, shows comparable or better execution time, and has 33% less network traffic on average (translating directly into energy savings) relative to a state-of-the-art invalidation-based protocol for 8 applications designed for lock synchronization.
architectural support for programming languages and operating systems | 2015
Hyojin Sung; Sarita V. Adve
Current shared-memory hardware is complex and inefficient. Prior work on the DeNovo coherence protocol showed that disciplined shared-memory programming models can enable more complexity-, performance-, and energy-efficient hardware than the state-of-the-art MESI protocol. DeNovo, however, severely restricted the synchronization constructs an application can support. This paper proposes DeNovoSync, a technique to support arbitrary synchronization in DeNovo. The key challenge is that DeNovo exploits race-freedom to use reader-initiated local self-invalidations (instead of conventional writer-initiated remote cache invalidations) to ensure coherence. Synchronization accesses are inherently racy and not directly amenable to self-invalidations. DeNovoSync addresses this challenge using a novel combination of registration of all synchronization reads with a judicious hardware backoff to limit unnecessary registrations. For a wide variety of synchronization constructs and applications, compared to MESI, DeNovoSync shows comparable or up to 22% lower execution time and up to 58% lower network traffic, enabling DeNovo's advantages for a much broader class of software than previously possible.
IEEE Micro | 2014
Hyojin Sung; Rakesh Komuravelli; Sarita V. Adve
Recent research in disciplined shared-memory programming models presents a unique opportunity for rethinking the multicore memory hierarchy for better efficiency in terms of complexity, performance, and energy. The DeNovo hardware system showed that for deterministic programs written using such disciplined models, hardware can be much more efficient than the current state of the art. For DeNovo to be adopted by commercial systems, however, it is necessary to extend it to support nondeterministic applications as well; for example, applications using lock synchronization. This article proposes DeNovoND, a system that provides support for disciplined nondeterministic codes with locks while retaining the simplicity, performance, and energy benefits of DeNovo. The authors designed and implemented simple memory consistency semantics for safe nondeterminism using distributed queue-based locks and access signatures. The resulting protocol avoids transient states, invalidation traffic, directory sharer-lists, and false sharing, which are all significant sources of inefficiency in existing protocols. Their experiments showed that DeNovoND provides comparable or better execution time for applications designed for lock synchronization. In addition, it incurs 33 percent less network traffic on average relative to a state-of-the-art invalidation-based protocol, which directly translates into energy savings.
international symposium on performance analysis of systems and software | 2015
Robert Smolinski; Rakesh Komuravelli; Hyojin Sung; Sarita V. Adve
While many techniques have been shown to be successful at reducing the amount of on-chip network traffic, no studies have shown how close a combined approach would come to eliminating all unnecessary data traffic, nor have any studies provided insight into where the remaining challenges are. This paper systematically analyzes the traffic inefficiencies of a directory-based MESI protocol and a more efficient hardware-software co-designed protocol, DeNovo. We classify data waste into several categories and explore several simple optimizations extending DeNovo with the aim of eliminating all of the on-chip network traffic waste. With all the proposed optimizations, we are able to completely eliminate (100%) on-chip network traffic waste at L2 for some of the applications (93.5% on average) compared to the previous DeNovo protocol.
international conference on parallel architectures and compilation techniques | 2011
Byn Choi; Rakesh Komuravelli; Hyojin Sung; Robert Smolinski; Nima Honarmand; Sarita V. Adve; Vikram S. Adve; Nicholas P. Carter; Ching Tsun Chou
Archive | 2009
Byn Choi; Rakesh Komuravelli; Victor Lu; Hyojin Sung; Robert L. Bocchino
Archive | 2010
Byn Choi; Rakesh Komuravelli; Hyojin Sung; Robert L. Bocchino; Sarita V. Adve; Vikram S. Adve
Archive | 2009
Robert L. Bocchino; Vikram S. Adve; Danny Dig; Stephen T. Heumann; Rakesh Komuravelli; Jeffrey Overbey; Patrick Simmons; Hyojin Sung; Mohsen Vakilian
Advances in Computer Graphics Hardware | 2010
Byn Choi; Rakesh Komuravelli; Victor Lu; Hyojin Sung; Robert L. Bocchino; Sarita V. Adve; John Hart