Publication


Featured research published by Sang Kil Cha.


IEEE Symposium on Security and Privacy | 2012

Unleashing Mayhem on Binary Code

Sang Kil Cha; Thanassis Avgerinos; Alexandre Rebert; David Brumley

In this paper we present Mayhem, a new system for automatically finding exploitable bugs in binary (i.e., executable) programs. Every bug reported by Mayhem is accompanied by a working shell-spawning exploit. The working exploits ensure soundness and that each bug report is security-critical and actionable. Mayhem works on raw binary code without debugging information. To make exploit generation possible at the binary-level, Mayhem addresses two major technical challenges: actively managing execution paths without exhausting memory, and reasoning about symbolic memory indices, where a load or a store address depends on user input. To this end, we propose two novel techniques: 1) hybrid symbolic execution for combining online and offline (concolic) execution to maximize the benefits of both techniques, and 2) index-based memory modeling, a technique that allows Mayhem to efficiently reason about symbolic memory at the binary level. We used Mayhem to find and demonstrate 29 exploitable vulnerabilities in both Linux and Windows programs, 2 of which were previously undocumented.


International Conference on Software Engineering | 2014

Enhancing symbolic execution with veritesting

Thanassis Avgerinos; Alexandre Rebert; Sang Kil Cha; David Brumley

We present MergePoint, a new binary-only symbolic execution system for large-scale and fully unassisted testing of commodity off-the-shelf (COTS) software. MergePoint introduces veritesting, a new technique that employs static symbolic execution to amplify the effect of dynamic symbolic execution. Veritesting allows MergePoint to find twice as many bugs, explore orders of magnitude more paths, and achieve higher code coverage than previous dynamic symbolic execution systems. MergePoint is currently running daily on a 100-node cluster analyzing 33,248 Linux binaries; it has generated more than 15 billion SMT queries, 200 million test cases, and 2,347,420 crashes, and has found 11,687 bugs in 4,379 distinct applications.


Communications of The ACM | 2014

Automatic exploit generation

Thanassis Avgerinos; Sang Kil Cha; Alexandre Rebert; Edward J. Schwartz; Maverick Woo; David Brumley

The idea is to identify security-critical software bugs so they can be fixed first.


Journal of Communications and Networks | 2011

SplitScreen: Enabling efficient, distributed malware detection

Sang Kil Cha; Iulian Moraru; Jiyong Jang; John Truelove; David Brumley; David G. Andersen

We present the design and implementation of a novel anti-malware system called SplitScreen. SplitScreen performs an additional screening step prior to the signature matching phase found in existing approaches. The screening step filters out most non-infected files (90%) and also identifies malware signatures that are not of interest (99%). The screening step significantly improves end-to-end performance because safe files are quickly identified and are not processed further, and malware files can subsequently be scanned using only the signatures that are necessary. Our approach naturally leads to a network-based anti-malware solution in which clients receive only the signatures they need, not every malware signature ever created as with current approaches. We have implemented SplitScreen as an extension to ClamAV, the most popular open source anti-malware software. For the current number of signatures, our implementation is 2x faster and requires 2x less memory than the original ClamAV. These gaps widen as the number of signatures grows.
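The two-phase idea in this abstract (a cheap screening pass, then exact matching against only the signatures that survived screening) can be illustrated with a minimal sketch. The abstract does not say which data structure SplitScreen uses for screening, so the dictionary keyed on a fixed-length signature prefix below is an assumption made for illustration only. `FRAG_LEN`, `build_screen`, `screen_file`, and `scan_file` are hypothetical names, and every signature is assumed to be at least `FRAG_LEN` bytes long.

```python
from typing import Dict, List, Set

FRAG_LEN = 4  # hypothetical fragment length used by the screening pass


def build_screen(signatures: Dict[str, bytes]) -> Dict[bytes, Set[str]]:
    """Index each signature by its first FRAG_LEN bytes (assumed stand-in)."""
    screen: Dict[bytes, Set[str]] = {}
    for name, pattern in signatures.items():
        screen.setdefault(pattern[:FRAG_LEN], set()).add(name)
    return screen


def screen_file(data: bytes, screen: Dict[bytes, Set[str]]) -> Set[str]:
    """Cheap pass: names of signatures whose fragment occurs somewhere in data."""
    candidates: Set[str] = set()
    for i in range(len(data) - FRAG_LEN + 1):
        hit = screen.get(data[i:i + FRAG_LEN])
        if hit:
            candidates |= hit
    return candidates


def scan_file(data: bytes, signatures: Dict[str, bytes],
              screen: Dict[bytes, Set[str]]) -> List[str]:
    """Exact pass: match only the candidate signatures left after screening."""
    candidates = screen_file(data, screen)
    return sorted(name for name in candidates if signatures[name] in data)
```

A file whose screening pass returns no candidates is never handed to the exact matcher, and a file with candidates is checked against those candidates alone, which mirrors the two savings the abstract describes.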


IEEE Symposium on Security and Privacy | 2015

Program-Adaptive Mutational Fuzzing

Sang Kil Cha; Maverick Woo; David Brumley

We present the design of an algorithm to maximize the number of bugs found for black-box mutational fuzzing given a program and a seed input. The major intuition is to leverage white-box symbolic analysis on an execution trace for a given program-seed pair to detect dependencies among the bit positions of an input, and then use this dependency relation to compute a probabilistically optimal mutation ratio for this program-seed pair. Our result is promising: we found an average of 38.6% more bugs than three previous fuzzers over 8 applications using the same amount of fuzzing time.
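For readers unfamiliar with the term, a mutation ratio in black-box mutational fuzzing is commonly understood as the fraction of the seed's bits that are flipped to produce each test input. The sketch below shows only how a given ratio might be applied to a seed; it does not reproduce the paper's symbolic analysis or its method for computing the ratio. The function names `mutate` and `hamming` are hypothetical, and the rounding rule is an assumption.

```python
import random


def mutate(seed: bytes, ratio: float, rng: random.Random) -> bytes:
    """Flip roughly ratio * (number of bits) distinct bits of the seed."""
    nbits = len(seed) * 8
    # Flip at least one bit for a non-empty seed, and never more than nbits.
    k = min(nbits, max(1, round(ratio * nbits)))
    out = bytearray(seed)
    for pos in rng.sample(range(nbits), k):
        out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

Passing an explicit `random.Random` instance keeps a fuzzing run reproducible, which helps when a crashing input has to be regenerated later.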


Computer and Communications Security | 2013

Scheduling black-box mutational fuzzing

Maverick Woo; Sang Kil Cha; Samantha Gottlieb; David Brumley

Black-box mutational fuzzing is a simple yet effective technique to find bugs in software. Given a set of program-seed pairs, we ask how to schedule the fuzzings of these pairs in order to maximize the number of unique bugs found at any point in time. We develop an analytic framework using a mathematical model of black-box mutational fuzzing and use it to evaluate 26 existing and new randomized online scheduling algorithms. Our experiments show that one of our new scheduling algorithms outperforms the multi-armed bandit algorithm in the current version of the CERT Basic Fuzzing Framework (BFF) by finding 1.5x more unique bugs in the same amount of time.
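The scheduling question posed here can be made concrete with a small sketch: at each epoch, pick one program-seed pair to fuzz, observe which bugs it yields, and count only bugs not seen before. The epsilon-greedy rule below is a generic multi-armed bandit heuristic chosen for illustration; it is not claimed to be one of the 26 algorithms evaluated in the paper, nor the one used in BFF. The names `schedule` and `fuzz_once` are hypothetical, and `fuzz_once` stands in for a real fuzzing run that returns a set of bug identifiers.

```python
import random
from typing import Callable, Dict, Hashable, Optional, Sequence, Set, Tuple


def schedule(pairs: Sequence[Hashable],
             fuzz_once: Callable[[Hashable], Set[str]],
             epochs: int,
             epsilon: float = 0.1,
             rng: Optional[random.Random] = None
             ) -> Tuple[Set[str], Dict[Hashable, int]]:
    """Epsilon-greedy scheduling of program-seed pairs (illustrative only)."""
    rng = rng or random.Random(0)
    runs = {p: 0 for p in pairs}       # epochs spent on each pair
    new_bugs = {p: 0 for p in pairs}   # unique bugs credited to each pair
    seen: Set[str] = set()
    for _ in range(epochs):
        untried = [p for p in pairs if runs[p] == 0]
        if untried:
            choice = untried[0]                      # try every pair once
        elif rng.random() < epsilon:
            choice = rng.choice(list(pairs))         # explore
        else:
            # Exploit: pair with the best unique-bugs-per-epoch rate so far.
            choice = max(pairs, key=lambda p: new_bugs[p] / runs[p])
        fresh = fuzz_once(choice) - seen
        seen |= fresh
        runs[choice] += 1
        new_bugs[choice] += len(fresh)
    return seen, runs
```

Rewarding only previously unseen bugs matches the stated objective of maximizing unique bugs, so a pair that keeps rediscovering the same crash gradually loses priority.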


Computer and Communications Security | 2010

Platform-independent programs

Sang Kil Cha; Brian Pak; David Brumley; Richard J. Lipton

Given a single program (i.e., bit string), one may assume that the program's behaviors can be determined by first identifying the native runtime architecture and then executing the program on that architecture. In this paper, we challenge the notion that programs run on a single architecture by developing techniques that automatically create a single program string that a) runs on different architectures, and b) potentially has different behaviors depending upon which architecture it runs on. At a high level, a primary security implication is that any program analysis done on a program must only be considered valid for the assumed architecture. Our techniques also introduce a new type of steganography that hides execution behaviors. In order to demonstrate our techniques, we implement a system for generating platform-independent programs for x86, ARM, and MIPS. We use our system to generate real platform-independent programs.


International Conference on Software Engineering | 2016

RETracer: triaging crashes by reverse execution from partial memory dumps

Weidong Cui; Marcus Peinado; Sang Kil Cha; Yanick Fratantonio; Vasileios P. Kemerlis

Many software providers operate crash reporting services to automatically collect crashes from millions of customers and file bug reports. Precisely triaging crashes is necessary and important for software providers because the millions of crashes that may be reported every day are critical in identifying high impact bugs. However, the triaging accuracy of existing systems is limited, as they rely only on the syntactic information of the stack trace at the moment of a crash without analyzing program semantics. In this paper, we present RETracer, the first system to triage software crashes based on program semantics reconstructed from memory dumps. RETracer was designed to meet the requirements of large-scale crash reporting services. RETracer performs binary-level backward taint analysis without a recorded execution trace to understand how functions on the stack contribute to the crash. The main challenge is that the machine state at an earlier time cannot be recovered completely from a memory dump, since most instructions are information destroying. We have implemented RETracer for x86 and x86-64 native code, and compared it with the existing crash triaging tool used by Microsoft. We found that RETracer eliminates two thirds of triage errors based on a manual analysis of 140 bugs fixed in Microsoft Windows and Office. RETracer has been deployed as the main crash triaging system on Microsoft’s crash reporting service.


Automated Software Engineering | 2017

Testing intermediate representations for binary analysis

Soomin Kim; Markus Faerevaag; Minkyu Jung; Seungil Jung; DongYeop Oh; JongHyup Lee; Sang Kil Cha

Binary lifting, which translates a binary executable to a high-level intermediate representation, is a primary step in binary analysis. Despite its importance, there are only a few existing approaches to testing the correctness of binary lifters. Furthermore, the existing approaches suffer from low test coverage, because they largely depend on random test case generation. In this paper, we present the design and implementation of the first systematic approach to testing binary lifters. We have evaluated the proposed system on 3 state-of-the-art binary lifters, and found 24 previously unknown semantic bugs. Our result demonstrates that writing a precise binary lifter is extremely difficult even for heavily tested projects.


Computer and Communications Security | 2017

IMF: Inferred Model-based Fuzzer

Hyungseok Han; Sang Kil Cha

Kernel vulnerabilities are critical in security because they naturally allow attackers to gain unprivileged root access. Although there has been much research on finding kernel vulnerabilities from source code, there has been relatively little research on kernel fuzzing, which is a practical bug finding technique that does not require any source code. Existing kernel fuzzing techniques involve feeding in random input values to kernel API functions. However, such a simple approach does not reveal latent bugs deep in the kernel code, because many API functions are dependent on each other, and they can quickly reject arbitrary parameter values based on their calling context. In this paper, we propose a novel fuzzing technique for commodity OS kernels that leverages an inferred dependence model between API function calls to discover deep kernel bugs. We implement our technique in a fuzzing system called IMF. IMF has already found 32 previously unknown kernel vulnerabilities on the latest macOS version 10.12.3 (16D32) at the time of this writing.

Collaboration


Dive into Sang Kil Cha's collaborations.

Top Co-Authors

David Brumley
Carnegie Mellon University

Alexandre Rebert
Carnegie Mellon University

Maverick Woo
Carnegie Mellon University

David G. Andersen
Carnegie Mellon University

Iulian Moraru
Carnegie Mellon University

Jiyong Jang
Carnegie Mellon University

John Truelove
Carnegie Mellon University

Brian Pak
Carnegie Mellon University

David Warren
Software Engineering Institute