Publication


Featured research published by Kihong Heo.


Programming Language Design and Implementation | 2012

Design and implementation of sparse global analyses for C-like languages

Hakjoo Oh; Kihong Heo; Wonchan Lee; Woosuk Lee; Kwangkeun Yi

In this article we present a general method for achieving global static analyzers that are precise, sound, yet also scalable. Our method generalizes the sparse analysis techniques on top of the abstract interpretation framework to support relational as well as non-relational semantic properties for C-like languages. We first use the abstract interpretation framework to have a global static analyzer whose scalability is unattended. Upon this underlying sound static analyzer, we add our generalized sparse analysis techniques to improve its scalability while preserving the precision of the underlying analysis. Our framework determines what to prove to guarantee that the resulting sparse version preserves the precision of the underlying analyzer. We formally present our framework; we show that existing sparse analyses are all restricted instances of our framework; we show more semantically elaborate design examples of sparse non-relational and relational static analyses; we present their implementation results that scale to analyze up to one million lines of C programs. We also show a set of implementation techniques that turn out to be critical to economically support the sparse analysis process.
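As a rough illustration of the sparse-analysis idea above, here is a minimal Python sketch (a hypothetical toy, not the paper's analyzer for full C): abstract values are propagated only along def-use chains discovered up front, rather than through every program point.

```python
# A minimal sketch of sparse propagation (hypothetical toy, not the paper's
# analyzer for full C): abstract values flow only along def-use chains
# instead of through every program point.

from collections import defaultdict

# A toy straight-line program: (lhs, operator, operands). String operands are
# variables; integer operands are constants.
program = [
    ("x", "const", [1]),         # x = 1
    ("y", "const", [2]),         # y = 2
    ("z", "add",   ["x", "y"]),  # z = x + y
    ("w", "add",   ["z", "z"]),  # w = z + z
]

def def_use_edges(prog):
    """For each statement, the later statements that use its defined variable."""
    edges = defaultdict(list)
    last_def = {}
    for i, (lhs, _, operands) in enumerate(prog):
        for op in operands:
            if isinstance(op, str) and op in last_def:
                edges[last_def[op]].append(i)
        last_def[lhs] = i
    return edges

def sparse_analyze(prog):
    """Constant propagation over def-use edges only (None means 'unknown')."""
    edges = def_use_edges(prog)
    value = {}                   # statement index -> abstract value of its lhs
    env_at = defaultdict(dict)   # statement index -> {incoming var: value}
    worklist = list(range(len(prog)))
    while worklist:
        i = worklist.pop()
        lhs, op, operands = prog[i]
        args = [env_at[i].get(a) if isinstance(a, str) else a for a in operands]
        new = args[0] if op == "const" else (
            args[0] + args[1] if None not in args else None)
        if value.get(i) != new:
            value[i] = new
            for j in edges[i]:   # propagate only to actual uses
                env_at[j][lhs] = new
                worklist.append(j)
    return {prog[i][0]: v for i, v in value.items()}

print(sparse_analyze(program))   # final values: x=1, y=2, z=3, w=6
```

In the paper, the def-use information for a full C program is itself computed conservatively by a pre-analysis; the toy above simply reads it off a straight-line program.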


Programming Language Design and Implementation | 2014

Selective context-sensitivity guided by impact pre-analysis

Hakjoo Oh; Wonchan Lee; Kihong Heo; Hongseok Yang; Kwangkeun Yi

We present a method for selectively applying context-sensitivity during interprocedural program analysis. Our method applies context-sensitivity only when and where doing so is likely to improve the precision that matters for resolving given queries. The idea is to use a pre-analysis to estimate the impact of context-sensitivity on the main analysis's precision, and to use this information to find out when and where the main analysis should turn on or off its context-sensitivity. We formalize this approach and prove that the analysis always benefits from the pre-analysis-guided context-sensitivity. We implemented this selective method for an existing industrial-strength interval analyzer for full C. The method reduced the number of (false) alarms by 24.4%, while increasing the analysis cost by 27.8% on average. The use of the selective method is not limited to context-sensitivity. We demonstrate this generality by following the same principle and developing a selective relational analysis.
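The selection mechanism can be pictured with a small Python sketch (a hypothetical toy, not the analyzer described above): a coarse pre-analysis marks the call sites whose arguments it can keep precise, and the main interval analysis clones the callee only at those sites while joining the rest.

```python
# Minimal sketch of selective context-sensitivity (hypothetical toy): a coarse
# "impact pre-analysis" picks the call sites where context-sensitivity keeps
# the argument precise; the main interval analysis then clones the callee only
# at those sites and analyzes all remaining sites under one joined context.

# Call sites of a callee computing "n + 1", with argument intervals.
call_sites = {
    "c1": (0, 0),          # query at c1: is the result <= 10 ?
    "c2": (100, 100),
    "c3": (0, 1000),       # already imprecise argument
}

def callee(lo, hi):
    return (lo + 1, hi + 1)                 # interval transfer for "n + 1"

def join(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))

# Impact pre-analysis (coarse domain: constant vs. unknown): a call site is
# selected iff distinguishing its context keeps the argument a single constant.
selected = {c for c, (lo, hi) in call_sites.items() if lo == hi}

# Main analysis: one context per selected site, one shared context otherwise.
unselected = [iv for c, iv in call_sites.items() if c not in selected]
shared = None
if unselected:
    merged = unselected[0]
    for iv in unselected[1:]:
        merged = join(merged, iv)
    shared = callee(*merged)

results = {c: callee(*iv) if c in selected else shared
           for c, iv in call_sites.items()}

print(selected)            # {'c1', 'c2'}
print(results["c1"])       # (1, 1): the query "result <= 10" is now provable
print(results["c3"])       # (1, 1001): unselected sites stay imprecise
```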


Static Analysis Symposium | 2016

Learning a Variable-Clustering Strategy for Octagon from Labeled Data Generated by a Static Analysis

Kihong Heo; Hakjoo Oh; Hongseok Yang

We present a method for automatically learning an effective strategy for clustering variables for the Octagon analysis from a given codebase. This learned strategy works as a preprocessor of Octagon. Given a program to be analyzed, the strategy is first applied to the program and clusters variables in it. We then run a partial variant of the Octagon analysis that tracks relationships among variables within the same cluster, but not across different clusters. The notable aspect of our learning method is that although the method is based on supervised learning, it does not require manually labeled data. The method does not ask a human to indicate which pairs of program variables in the given codebase should be tracked. Instead it uses the impact pre-analysis for Octagon from our previous work and automatically labels variable pairs in the codebase as positive or negative. We implemented our method on top of a static buffer-overflow detector for C programs and tested it against open source benchmarks. Our experiments show that the partial Octagon analysis with the learned strategy scales up to 100KLOC and is 33x faster than the one with the impact pre-analysis (which itself is significantly faster than the original Octagon analysis), while increasing false alarms by only 2%.
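A minimal Python sketch of the pipeline (toy features, and a stand-in in place of the impact pre-analysis; purely illustrative): labels for variable pairs are generated automatically, a simple linear classifier is trained on them, and the predicted-positive pairs are merged into clusters with union-find.

```python
# Minimal sketch of the learning pipeline (hypothetical toy): variable pairs
# are labeled automatically by a stand-in for the impact pre-analysis, a tiny
# perceptron is trained on boolean features of each pair, and predicted-positive
# pairs are merged into clusters with union-find.

STATEMENTS = [("i", "n"), ("j", "m"), ("p",), ("q",)]
PAIRS = [("i", "n"), ("j", "m"), ("i", "q"), ("p", "q")]

def pre_analysis_label(pair):
    # Stand-in for the paper's impact pre-analysis: a pair is "useful" here
    # when both variables occur in the same statement.
    x, y = pair
    return any(x in stmt and y in stmt for stmt in STATEMENTS)

def features(pair):
    x, y = pair
    return [
        1.0,                                                          # bias
        1.0 if any(x in s and y in s for s in STATEMENTS) else 0.0,   # co-occur
        1.0 if len(x) == len(y) else 0.0,                             # toy feature
    ]

# Train a perceptron on the automatically generated labels.
w = [0.0, 0.0, 0.0]
data = [(features(p), 1 if pre_analysis_label(p) else -1) for p in PAIRS]
for _ in range(10):
    for f, y in data:
        if y * sum(wi * fi for wi, fi in zip(w, f)) <= 0:
            w = [wi + y * fi for wi, fi in zip(w, f)]

# Cluster: union-find over the pairs the learned model predicts as positive.
parent = {}
def find(v):
    parent.setdefault(v, v)
    if parent[v] != v:
        parent[v] = find(parent[v])
    return parent[v]

for p in PAIRS:
    if sum(wi * fi for wi, fi in zip(w, features(p))) > 0:
        parent[find(p[0])] = find(p[1])

vars_ = {v for p in PAIRS for v in p}
clusters = {}
for v in vars_:
    clusters.setdefault(find(v), set()).add(v)
print(list(clusters.values()))   # e.g. [{'i', 'n'}, {'j', 'm'}, {'p'}, {'q'}] (order may vary)
```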


ACM Transactions on Programming Languages and Systems | 2014

Global Sparse Analysis Framework

Hakjoo Oh; Kihong Heo; Wonchan Lee; Woosuk Lee; Daejun Park; Jeehoon Kang; Kwangkeun Yi

In this article, we present a general method for achieving global static analyzers that are precise and sound, yet also scalable. Our method, on top of the abstract interpretation framework, is a general sparse analysis technique that supports relational as well as nonrelational semantic properties for various programming languages. Analysis designers first use the abstract interpretation framework to have a global and correct static analyzer whose scalability is unattended. Upon this underlying sound static analyzer, analysis designers add our generalized sparse analysis techniques to improve its scalability while preserving the precision of the underlying analysis. Our method prescribes what to prove to guarantee that the resulting sparse version preserves the precision of the underlying analyzer. We formally present our framework and show that existing sparse analyses are all restricted instances of our framework. In addition, we show more semantically elaborate design examples of sparse nonrelational and relational static analyses. We then present their implementation results that scale to globally analyze up to one million lines of C programs. We also show a set of implementation techniques that turn out to be critical to economically support the sparse analysis process.


International Conference on Software Engineering | 2017

Machine-learning-guided selectively unsound static analysis

Kihong Heo; Hakjoo Oh; Kwangkeun Yi

We present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence at risk of missing a large number of real bugs. Being sound would improve the bug-detection ability of the analyzer, but it typically comes with a large number of false alarms. Our approach aims to strike a balance between these two extremes by selectively allowing unsoundness only when it is likely to reduce false alarms, while retaining true alarms. We use an anomaly-detection technique to learn such harmless unsoundness. We implemented our technique in two static analyzers for full C: a taint analysis for detecting format-string vulnerabilities, and an interval analysis for buffer-overflow detection. The experimental results show that our approach significantly improves the recall of the original unsound analysis without sacrificing the precision.
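The selection step can be sketched as follows (a hypothetical toy, not the paper's implementation): each loop is summarized as a small feature vector, an anomaly detector is fit on cases known to be harmless to skip, and the analyzer is allowed to be unsound only on loops the model considers normal.

```python
# Minimal sketch of selecting harmless unsoundness (hypothetical toy): fit a
# simple anomaly detector on loops known to be harmless to skip, then allow
# the analyzer to skip (be unsound on) only loops that look "normal"; anomalous
# loops are analyzed soundly.

import math

# Toy boolean features per loop: (has constant bound, writes to a buffer,
# calls an unknown function). Training data: loops that were harmless to skip.
harmless_training = [
    (1, 0, 0),
    (1, 0, 0),
    (1, 1, 0),
]

def centroid(rows):
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

center = centroid(harmless_training)
radius = max(distance(r, center) for r in harmless_training)  # decision boundary

def may_skip(loop_features):
    """Unsoundly skip this loop only if it is not an anomaly."""
    return distance(loop_features, center) <= radius

# New loops seen by the analyzer:
print(may_skip((1, 0, 0)))   # True  -> skip (harmless unsoundness)
print(may_skip((0, 1, 1)))   # False -> analyze soundly
```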


Conference on Object-Oriented Programming, Systems, Languages, and Applications | 2017

Automatically generating features for learning program analysis heuristics for C-like languages

Kwonsoo Chae; Hakjoo Oh; Kihong Heo; Hongseok Yang

We present a technique for automatically generating features for data-driven program analyses. Recently, data-driven approaches for building a program analysis have been developed, which mine existing codebases and automatically learn heuristics for finding a cost-effective abstraction for a given analysis task. Such approaches reduce the burden of the analysis designers, but they do not remove it completely; they still leave the nontrivial task of designing so-called features in the hands of the designers. Our technique aims at automating this feature design process. The idea is to use programs as features after reducing and abstracting them. Our technique goes through selected program-query pairs in codebases, and it reduces and abstracts the program in each pair to a few lines of code, while ensuring that the analysis behaves similarly for the original and the new programs with respect to the query. Each reduced program serves as a boolean feature for program-query pairs: the feature evaluates to true for a given program-query pair when the reduced program is included in the program part of the pair. We have implemented our approach for three real-world static analyses. The experimental results show that these analyses with automatically generated features are cost-effective and consistently perform well on a wide range of programs.
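The reduction idea can be sketched in a few lines of Python (toy program and stub analysis, purely illustrative): statements are dropped greedily as long as the analysis still answers the query the same way, and the surviving statements become a boolean feature checked by inclusion.

```python
# Minimal sketch of feature generation by program reduction (hypothetical toy):
# greedily drop statements from a program as long as a stub analysis still
# answers the query the same way; the surviving statements act as a boolean
# feature that holds for any program containing them.

# A program is a list of statements; the query asks whether "buf[i]" is safe.
program = ["i = 0", "x = read()", "i = i + 1", "log(x)", "buf[i] = 0"]
QUERY = "buf[i] = 0"

def analysis_alarms(prog):
    # Stand-in for a real static analysis: report an alarm on the query
    # unless the program both initializes and increments i.
    safe = "i = 0" in prog and "i = i + 1" in prog
    return not safe

def reduce_program(prog):
    baseline = analysis_alarms(prog)
    reduced = list(prog)
    changed = True
    while changed:
        changed = False
        for stmt in list(reduced):
            if stmt == QUERY:
                continue                      # never drop the query itself
            candidate = [s for s in reduced if s != stmt]
            if analysis_alarms(candidate) == baseline:
                reduced = candidate           # removal preserved the behavior
                changed = True
    return reduced

feature = reduce_program(program)
print(feature)   # ['i = 0', 'i = i + 1', 'buf[i] = 0']

# The reduced program is then used as a boolean feature on other pairs.
def feature_holds(prog):
    return all(stmt in prog for stmt in feature)

print(feature_holds(["i = 0", "i = i + 1", "y = 2", "buf[i] = 0"]))  # True
```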


ACM Transactions on Programming Languages and Systems | 2016

Selective X-Sensitive Analysis Guided by Impact Pre-Analysis

Hakjoo Oh; Wonchan Lee; Kihong Heo; Hongseok Yang; Kwangkeun Yi

We present a method for selectively applying context-sensitivity during interprocedural program analysis. Our method applies context-sensitivity only when and where doing so is likely to improve the precision that matters for resolving given queries. The idea is to use a pre-analysis to estimate the impact of context-sensitivity on the main analysis's precision, and to use this information to find out when and where the main analysis should turn on or off its context-sensitivity. We formalize this approach and prove that the analysis always benefits from the pre-analysis-guided context-sensitivity. We implemented this selective method for an existing industrial-strength interval analyzer for full C. The method reduced the number of (false) alarms by 24.4% while increasing the analysis cost by 27.8% on average. The use of the selective method is not limited to context-sensitivity. We demonstrate this generality by following the same principle and developing a selective relational analysis and a selective flow-sensitive analysis. Our experiments show that the method cost-effectively improves the precision of these analyses as well.


Software: Practice and Experience | 2016

Widening with thresholds via binary search

Sol Kim; Kihong Heo; Hakjoo Oh; Kwangkeun Yi

In this paper, we present a useful technique for implementing practical static program analyzers that use widening. Our technique aims to improve the efficiency of the conventional widening-with-thresholds technique at a small precision compromise. In static analysis, widening is used to accelerate the convergence of fixed point iterations. Unfortunately, this acceleration often comes with a significant loss in analysis precision. A standard method to improve the precision is to apply the widening with a set of thresholds. However, this technique may significantly slow down the analysis, because in practice it is commonplace to use a large set of thresholds. In the worst case, the technique increases the analysis cost by a factor of N, the size of the threshold set. In this paper, we propose a technique that reduces this worst-case factor to log N by employing a binary search in the process of applying threshold values. We formalize the technique in the abstract interpretation framework and show, by experiments with a realistic static analyzer for C, that our technique considerably improves the efficiency (by 81.5%) of the existing method with a small compromise (20.9%) on the analysis precision.
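The binary-search refinement is easy to sketch (a toy interval widening in Python, assuming an ascending threshold list; not the analyzer used in the paper): an unstable bound jumps to the nearest enclosing threshold found with bisect, costing O(log N) per widening step instead of O(N).

```python
# Minimal sketch of widening with thresholds via binary search (hypothetical
# toy): an unstable interval bound jumps to the smallest threshold above it,
# located with bisect in O(log N) rather than by scanning all N thresholds.

import bisect

THRESHOLDS = [0, 1, 8, 16, 64, 256, 1024]      # assumed sorted ascending

def widen_upper(old_hi, new_hi):
    """Widen the upper bound of an interval using the threshold set."""
    if new_hi <= old_hi:
        return old_hi                            # bound is stable
    idx = bisect.bisect_left(THRESHOLDS, new_hi) # binary search, O(log N)
    return THRESHOLDS[idx] if idx < len(THRESHOLDS) else float("inf")

# Fixed point iteration for: i = 0; while (i < 100) i = i + 3;
hi = 0
while True:
    new_hi = min(hi, 99) + 3          # filter i < 100, then i = i + 3
    widened = widen_upper(hi, new_hi)
    if widened == hi:
        break
    hi = widened
print(hi)   # 256: the first threshold at which the bound stabilizes
```

With a large threshold set, replacing the linear scan by the bisect step is exactly what turns the per-widening cost from N into log N, as the abstract describes.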


Programming Language Design and Implementation | 2018

Accelerating search-based program synthesis using learned probabilistic models

Woosuk Lee; Kihong Heo; Rajeev Alur; Mayur Naik

A key challenge in program synthesis concerns how to efficiently search for the desired program in the space of possible programs. We propose a general approach to accelerate search-based program synthesis by biasing the search towards likely programs. Our approach targets a standard formulation, syntax-guided synthesis (SyGuS), by extending the grammar of possible programs with a probabilistic model dictating the likelihood of each program. We develop a weighted search algorithm to efficiently enumerate programs in order of their likelihood. We also propose a method based on transfer learning that enables effective learning of a powerful model, called a probabilistic higher-order grammar, from known solutions in a domain. We have implemented our approach in a tool called Euphony and evaluate it on SyGuS benchmark problems from a variety of domains. We show that Euphony can learn good models using easily obtainable solutions, and achieves significant performance gains over existing general-purpose as well as domain-specific synthesizers.
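The weighted enumeration can be sketched with a toy probabilistic grammar (purely illustrative; Euphony's probabilistic higher-order grammars condition production weights on context, which is omitted here): partial programs sit in a priority queue keyed by accumulated negative log probability, so complete programs are emitted from most to least likely.

```python
# Minimal sketch of likelihood-ordered enumeration (hypothetical toy grammar,
# not Euphony's PHOG): partial programs are expanded leftmost-first and kept
# in a priority queue ordered by accumulated cost (-log probability), so
# complete programs pop out from most to least likely.

import heapq
import itertools
import math

# A tiny probabilistic grammar: nonterminal E with weighted productions.
GRAMMAR = {
    "E": [
        (0.5, ["x"]),
        (0.3, ["1"]),
        (0.2, ["(", "E", "+", "E", ")"]),
    ],
}

def enumerate_programs(start="E", limit=5):
    counter = itertools.count()               # tie-breaker for the heap
    heap = [(0.0, next(counter), [start])]
    out = []
    while heap and len(out) < limit:
        cost, _, form = heapq.heappop(heap)
        nts = [i for i, s in enumerate(form) if s in GRAMMAR]
        if not nts:                            # no nonterminals: a full program
            out.append((math.exp(-cost), " ".join(form)))
            continue
        i = nts[0]                             # expand the leftmost nonterminal
        for prob, rhs in GRAMMAR[form[i]]:
            new_form = form[:i] + rhs + form[i + 1:]
            heapq.heappush(heap, (cost - math.log(prob), next(counter), new_form))
    return out

for p, prog in enumerate_programs():
    print(f"{p:.3f}  {prog}")
# 0.500  x
# 0.300  1
# 0.050  ( x + x )
# 0.030  ( x + 1 )
# 0.030  ( 1 + x )
```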


Software: Practice and Experience | 2017

Selective conjunction of context-sensitivity and octagon domain toward scalable and precise global static analysis

Kihong Heo; Hakjoo Oh; Kwangkeun Yi

We present a practical technique for achieving a scalable and precise global static analysis by selectively applying context-sensitivity and the octagon relational domain. For precise analysis, context-sensitivity and relational analysis are key properties, but it has been hard to combine both of them in practice. Our approach turns on these precision-improving features only when and where they are likely to improve the precision needed to resolve given queries. The guidance comes from an impact pre-analysis that estimates the impact of a fully context-sensitive and relational octagon analysis. We designed a cost-effective pre-analysis and implemented this method in a realistic octagon analysis for full C. The experimental results show that our approach proves eight times more queries, while reducing the analysis time by 73.1%, compared with a partially relational octagon analysis enabled by a syntactic heuristic.

Collaboration


Dive into Kihong Heo's collaboration.

Top Co-Authors

Kwangkeun Yi (Seoul National University)

Woosuk Lee (Seoul National University)

Mayur Naik (Georgia Institute of Technology)

Daejun Park (Seoul National University)

Dongok Kang (Seoul National University)

Jeehoon Kang (Seoul National University)