
Byeongcheol Lee

Gwangju Institute of Science and Technology

Publication

Featured research published by Byeongcheol Lee.

programming language design and implementation | 2010

Jinn: synthesizing dynamic bug detectors for foreign language interfaces

Byeongcheol Lee; Ben Wiedermann; Martin Hirzel; Robert Grimm; Kathryn S. McKinley

Programming language specifications mandate static and dynamic analyses to preclude syntactic and semantic errors. Although individual languages are usually well-specified, composing languages is not, and this poor specification is a source of many errors in multilingual programs. For example, virtually all Java programs compose Java and C using the Java Native Interface (JNI). Since JNI is informally specified, developers have difficulty using it correctly, and current Java compilers and virtual machines (VMs) inconsistently check only a subset of JNI constraints. This paper's most significant contribution is to show how to synthesize dynamic analyses from state machines to detect foreign function interface (FFI) violations. We identify three classes of FFI constraints encoded by eleven state machines that capture thousands of JNI and Python/C FFI rules. We use a mapping function to specify which state machines, transitions, and program entities (threads, objects, references) to check at each FFI call and return. From this function, we synthesize a context-specific dynamic analysis to find FFI bugs. We build bug detection tools for JNI and Python/C using this approach. For JNI, we dynamically and transparently interpose the analysis on Java and C language transitions through the JVM tools interface. The resulting tool, called Jinn, is compiler and virtual machine independent. It detects and diagnoses a wide variety of FFI bugs that other tools miss. This approach greatly reduces the annotation burden by exploiting common FFI constraints: whereas the generated Jinn code is 22,000+ lines, we wrote only 1,400 lines of state machine and mapping code. Overall, this paper lays the foundation for a more principled approach to developing correct multilingual software and a more concise and automated approach to FFI specification.
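As a rough illustration of checking FFI rules with state machines, the sketch below tracks one hypothetical constraint: a reference must not be used after it has been released. The class name, the event names, and the transition table are assumptions made for this example. They are not Jinn's specification language or its generated code.

```python
class FFIViolation(Exception):
    """Raised when an observed event has no legal transition."""


class ReferenceStateMachine:
    # Hypothetical transition table: (state, event) -> next state.
    TRANSITIONS = {
        ("unallocated", "acquire"): "live",
        ("live", "use"): "live",
        ("live", "release"): "released",
    }

    def __init__(self):
        self.states = {}  # reference id -> current state

    def on_event(self, ref, event):
        """Advance the state of one reference, or raise on an illegal event."""
        state = self.states.get(ref, "unallocated")
        key = (state, event)
        if key not in self.TRANSITIONS:
            raise FFIViolation(
                f"{event!r} on reference {ref!r} in state {state!r}"
            )
        self.states[ref] = self.TRANSITIONS[key]


def check_trace(trace):
    """Return the first violation message in a list of (ref, event) pairs, or None."""
    machine = ReferenceStateMachine()
    for ref, event in trace:
        try:
            machine.on_event(ref, event)
        except FFIViolation as err:
            return str(err)
    return None
```

In a real tool the events would come from interposing on language transitions; here a trace is simply a list of pairs, which keeps the state-machine idea visible without any JVM machinery.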

conference on object-oriented programming systems, languages, and applications | 2009

Debug all your code: portable mixed-environment debugging

Byeongcheol Lee; Martin Hirzel; Robert Grimm; Kathryn S. McKinley

Programmers build large-scale systems with multiple languages to reuse legacy code and leverage languages best suited to their problems. For instance, the same program may use Java for ease-of-programming and C to interface with the operating system. These programs pose significant debugging challenges, because programmers need to understand and control code across languages, which may execute in different environments. Unfortunately, traditional multilingual debuggers require a single execution environment. This paper presents a novel composition approach to building portable mixed-environment debuggers, in which an intermediate agent interposes on language transitions, controlling and reusing single-environment debuggers. We implement debugger composition in Blink, a debugger for Java, C, and the Jeannie programming language. We show that Blink is (1) relatively simple: it requires modest amounts of new code; (2) portable: it supports multiple Java Virtual Machines, C compilers, operating systems, and component debuggers; and (3) powerful: composition eases debugging, while supporting new mixed-language expression evaluation and Java Native Interface (JNI) bug diagnostics. In real-world case studies, we show that language-interface errors require single-environment debuggers to restart execution multiple times, whereas Blink directly diagnoses them with one execution. We also describe extensions for other mixed-environments to show debugger composition will generalize.

compiler construction | 2007

Correcting the dynamic call graph using control-flow constraints

Byeongcheol Lee; Kevin Resnick; Michael D. Bond; Kathryn S. McKinley

To reason about programs, dynamic optimizers and analysis tools use sampling to collect a dynamic call graph (DCG). However, sampling has not achieved high accuracy with low runtime overhead. As object-oriented programmers compose increasingly complex programs, inaccurate call graphs will inhibit analysis and optimizations. This paper demonstrates how to use static and dynamic control flow graph (CFG) constraints to improve the accuracy of the DCG. We introduce the frequency dominator (FDOM), a novel CFG relation that extends the dominator relation to expose static relative execution frequencies of basic blocks. We combine conservation of flow and dynamic CFG basic block profiles to further improve the accuracy of the DCG. Together these approaches add minimal overhead (1%) and achieve 85% accuracy compared to a perfect call graph for SPEC JVM98 and DaCapo benchmarks. Compared to sampling alone, accuracy improves by 12 to 36%. These results demonstrate that static and dynamic control-flow information offer accurate information for efficiently improving the DCG.
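The abstract does not give the correction algorithm, so the following is only a minimal sketch of one ingredient it mentions: using known relative execution frequencies to repair sampled counts. It assumes a hypothetical list of constraints `(a, b)`, each meaning "block `a` executes at least as often as block `b`", and raises undersampled counts until all constraints hold. The function name and data shapes are illustrative and are not taken from the paper.

```python
def enforce_at_least(counts, constraints):
    """Raise sampled counts until count[a] >= count[b] for every (a, b).

    counts: dict mapping a block name to its sampled count.
    constraints: list of (a, b) pairs, assumed to come from static analysis.
    """
    fixed = dict(counts)
    changed = True
    while changed:  # values only grow and are bounded by the maximum count
        changed = False
        for a, b in constraints:
            if fixed.get(a, 0) < fixed.get(b, 0):
                fixed[a] = fixed[b]
                changed = True
    return fixed
```

For example, with sampled counts `{"A": 1, "B": 2, "C": 5}` and constraints `[("A", "B"), ("B", "C")]`, both `A` and `B` are raised to 5, since sampling evidently missed executions that the constraints guarantee.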

european conference on object oriented programming | 2012

Marco: safe, expressive macros for any language

Byeongcheol Lee; Robert Grimm; Martin Hirzel; Kathryn S. McKinley

Macros improve expressiveness, concision, abstraction, and language interoperability without changing the programming language itself. They are indispensable for building increasingly prevalent multilingual applications. Unfortunately, existing macro systems are well-encapsulated but unsafe (e.g., the C preprocessor) or are safe but tightly-integrated with the language implementation (e.g., Scheme macros). This paper introduces Marco, the first macro system that seeks both encapsulation and safety. Marco is based on the observation that the macro system need not know all the syntactic and semantic rules of the target language but must only directly enforce some rules, such as variable name binding. Using this observation, Marco off-loads most rule checking to unmodified target-language compilers and interpreters and thus becomes language-scalable. We describe the Marco language, its language-independent safety analysis, and how it uses two example target-language analysis plug-ins, one for C++ and one for SQL. This approach opens the door to safe and expressive macros for any language.

automated software engineering | 2015

Mutation-Based Fault Localization for Real-World Multilingual Programs (T)

Shin Hong; Byeongcheol Lee; Taehoon Kwak; Yiru Jeon; Bongsuk Ko; Yunho Kim; Moonzoo Kim

Programmers maintain and evolve their software in a variety of programming languages to take advantage of various control/data abstractions and legacy libraries. The programming language ecosystem has diversified over the last few decades, and non-trivial programs are likely to be written in more than a single language. Unfortunately, language interfaces such as Java Native Interface and Python/C are difficult to use correctly and the scope of fault localization goes beyond language boundaries, which makes debugging multilingual bugs challenging. To overcome the aforementioned limitations, we propose a mutation-based fault localization technique for real-world multilingual programs. To improve the accuracy of locating multilingual bugs, we have developed and applied new mutation operators as well as conventional mutation operators. The results of the empirical evaluation for six non-trivial real-world multilingual bugs are promising in that the proposed technique identifies the buggy statements as the most suspicious statements for all six bugs.

Information & Software Technology | 2017

MUSEUM: Debugging real-world multilingual programs using mutation analysis

Shin Hong; Taehoon Kwak; Byeongcheol Lee; Yiru Jeon; Bongseok Ko; Yunho Kim; Moonzoo Kim

Context: The programming language ecosystem has diversified over the last few decades. Non-trivial programs are likely to be written in more than a single language to take advantage of various control/data abstractions and legacy libraries. Objective: Debugging multilingual bugs is challenging because language interfaces are difficult to use correctly and the scope of fault localization goes beyond language boundaries. To locate the causes of real-world multilingual bugs, this article proposes a mutation-based fault localization technique (MUSEUM). Method: MUSEUM modifies a buggy program systematically with our new mutation operators as well as conventional mutation operators, observes the dynamic behavioral changes in a test suite, and reports suspicious statements. To reduce the analysis cost, MUSEUM selects a subset of mutated programs and test cases. Results: Our empirical evaluation shows that MUSEUM is (i) effective: it identifies the buggy statements as the most suspicious statements for both resolved and unresolved non-trivial bugs in real-world multilingual programming projects; and (ii) efficient: it locates the buggy statements in a modest amount of time using multiple machines in parallel. Also, by applying selective mutation analysis (i.e., selecting subsets of mutants and test cases to use), MUSEUM achieves significant speedup with marginal accuracy loss compared to the full mutation analysis. Conclusion: MUSEUM locates real-world multilingual bugs accurately. This result shows that mutation analysis can provide an effective, efficient, and language-semantics-agnostic analysis of multilingual code. Our lightweight analysis approach can play an important role as programmers write and debug large and complex programs in diverse programming languages.
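The general idea of mutation-based fault localization can be sketched as follows. This is a simplified scoring scheme written for illustration, not the exact suspiciousness metric used by MUSEUM: a statement scores higher when mutating it makes failing tests pass, and lower when mutating it breaks passing tests. The function name and the input shapes are assumptions.

```python
def suspiciousness(original, mutants):
    """Score statements by how mutating them changes test outcomes.

    original: dict mapping a test name to True if it passes on the buggy program.
    mutants: list of (statement, results) pairs, where results has the same
             shape as original but was measured on the mutated program.
    Returns a dict mapping each statement to a score; higher is more suspicious.
    """
    failing = [t for t, ok in original.items() if not ok]
    passing = [t for t, ok in original.items() if ok]
    per_statement = {}
    for statement, results in mutants:
        fixed = sum(1 for t in failing if results[t])       # fail -> pass
        broken = sum(1 for t in passing if not results[t])  # pass -> fail
        score = fixed / max(len(failing), 1) - broken / max(len(passing), 1)
        per_statement.setdefault(statement, []).append(score)
    # Average over all mutants generated at the same statement.
    return {s: sum(v) / len(v) for s, v in per_statement.items()}
```

Ranking statements by this score and inspecting the top entries first mirrors the workflow the abstract describes, in which the buggy statement is expected to appear among the most suspicious ones.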

Software - Practice and Experience | 2015

Debugging mixed-environment programs with Blink

Byeongcheol Lee; Martin Hirzel; Robert Grimm; Kathryn S. McKinley

Programmers build large‐scale systems with multiple languages to leverage legacy code and languages best suited to their problems. For instance, the same program may use Java for ease of programming and C to interface with the operating system. These programs pose significant debugging challenges, because programmers need to understand and control code across languages, which often execute in different environments. Unfortunately, traditional multilingual debuggers require a single execution environment. This paper presents a novel composition approach to building portable mixed‐environment debuggers, in which an intermediate agent interposes on language transitions, controlling and reusing single‐environment debuggers. We implement debugger composition in Blink, a debugger for Java, C, and the Jeannie programming language. We show that Blink is (i) simple: it requires modest amounts of new code; (ii) portable: it supports multiple Java virtual machines, C compilers, operating systems, and component debuggers; and (iii) powerful: composition eases debugging, while supporting new mixed‐language expression evaluation and Java native interface bug diagnostics. To demonstrate the generality of interposition, we build prototypes and demonstrate debugger language transitions with C for five of six other languages (Caml, Common Lisp, C#, Perl 5, Python, and Ruby) without modifications to their debuggers. Using real‐world case studies, we show that diagnosing language interface errors requires prior single‐environment debuggers to restart execution multiple times, whereas Blink directly diagnoses them with one execution.

KIISE Transactions on Computing Practices | 2017

An Empirical Study of Diversity and Interoperability of Programming Languages

Bongsuk Ko; Byeongcheol Lee

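Ensuring that a large, complex program written in a single programming language is free of defects is never easy. As the number of programming languages grows, this becomes even harder, because new kinds of program errors arise between code written in two or more languages, and many program analysis techniques and tools apply to only one language. To address this problem, program analyses and tools for ensuring the safety of programming language interoperation are being studied [1,2,3,4,5]. To demonstrate their effectiveness, existing studies of programming language interoperation select two or three programming languages without any particular criteria and use ten or fewer open-source software projects written in those languages as experimental subjects [1,2,3,4,5]. Unfortunately, it is questionable how representative these choices of programming languages and multilingual software are. In this paper, we empirically investigate the diversity and interoperability of programming languages within the Ubuntu software ecosystem, which has a considerable user base. We conducted an empirical study of the source programs that make up about 200,000 binary packages in four versions distributed over the seven years from 2010 to 2016. We confirmed that programming language use diversifies as the distribution evolves and that the number of software packages written in two or more programming languages increases. These results reaffirm that existing research on programming language interoperability is grounded in real-world practice. We also expect this study to lay the groundwork for building a benchmark of language interoperability errors in follow-up research. The rest of this paper is organized as follows. Section 2 describes how we collected the Ubuntu packages, and Section 3 analyzes changes in the diversity of the programming languages used, based on the collected packages. Section 4 analyzes programming language interoperability for the multilingual programs in the Ubuntu distributions, and Section 5 concludes.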

IEEE Access | 2017

A Comparative Study of Programming Environments Exploiting Heterogeneous Systems

Bongsuk Ko; Seunghun Han; Yongjun Park; Moongu Jeon; Byeongcheol Lee

This paper compares programming environments that exploit heterogeneous systems to process a large amount of data efficiently. Our motivation is to investigate the feasibility of the adaptive, transparent migration of intensive computation for a large amount of data across heterogeneous programming languages and processors for high performance and programmability. We compare a variety of programming environments composed of programming languages, such as Java and C, memory space models, such as distinct and shared memory, and parallel processors, such as general-purpose CPUs and graphics processing units (GPUs) to examine their performance-programmability tradeoffs. In addition, we introduce a software-based shared virtual memory that creates a view of the host memory inside GPU kernels to enable seamless computation offloading from the host to the device. This paper reveals a programmability-performance hierarchy in which programs increase their performance at the cost of decreasing programmability. The experimental results suggest the desirability of a well-balanced system.

ACM Transactions on Architecture and Code Optimization | 2016

Adaptive Correction of Sampling Bias in Dynamic Call Graphs

Byeongcheol Lee

This article introduces a practical low-overhead adaptive technique of correcting sampling bias in profiling dynamic call graphs. Timer-based sampling keeps the overhead low but sampling bias lowers the accuracy when either observable call events or sampling actions are not equally spaced in time. To mitigate sampling bias, our adaptive correction technique weights each sample by monitoring time-varying spacing of call events and sampling actions. We implemented and evaluated our adaptive correction technique in Jikes RVM, a high-performance virtual machine. In our empirical evaluation, our technique significantly improved the sampling accuracy without measurable overhead and resulted in effective feedback directed inlining.
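One way to picture weighting samples by spacing is sketched below. It assumes, purely for illustration, that each sample records the call edge that was observed together with the number of call events since the previous sample, and that this spacing is used directly as the sample's weight. The article's actual weighting scheme is not given in the abstract, so the function name and the input format are hypothetical.

```python
from collections import defaultdict


def weighted_edge_frequencies(samples):
    """Estimate relative call-edge frequencies from weighted samples.

    samples: list of (call_edge, events_since_previous_sample) pairs.
    Returns a dict mapping each call edge to its share of the total weight.
    """
    totals = defaultdict(float)
    for edge, spacing in samples:
        # A sample taken after many unobserved call events stands for more calls.
        totals[edge] += spacing
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {edge: weight / grand_total for edge, weight in totals.items()}
```

With samples `[("a->b", 10), ("a->c", 30), ("a->b", 20)]`, plain counting would attribute two thirds of the calls to `a->b`, whereas the weighted estimate splits them evenly because the single `a->c` sample covers as many call events as the two `a->b` samples combined.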
