Jinyong Lee | Researchain

Publication

Featured researches published by Jinyong Lee.

Design Automation Conference | 2015

Efficient dynamic information flow tracking on a processor with core debug interface

Jinyong Lee; Ingoo Heo; Yongje Lee; Yunheung Paek

Dynamic information flow tracking (DIFT) is a promising solution to prevent various attacks on software running on a processor. Previous hardware solutions usually mandate drastic changes to the internal processor architecture. More recent ones, aiming to minimize such changes, have proposed external devices for DIFT. However, these approaches intrinsically suffer from the high overhead of communicating with their external devices. Consequently, they either lose significant performance or inevitably make invasive modifications to the processor internals. Our solution also relies on external hardware for DIFT, but unlike theirs, ours exploits the core debug interface (CDI) to tackle the communication issue. CDI is provided in most commercial processors for debugging, so we were able to build our system simply by plugging our hardware into the processor via CDI, precluding the need to alter the processor itself. Experiments show that our hardware performs DIFT efficiently, mainly thanks to the support of CDI, which helps us substantially cut communication costs.
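To make the idea of DIFT concrete, the following is a minimal software sketch of taint propagation over an instruction trace. It is not the hardware design described in the abstract: the trace format, the operation names ("input", "alu", "jump", and so on), and the `TaintTracker` class are all hypothetical and chosen only for illustration. The policy shown, flagging a jump whose target register is tainted, is one common DIFT rule.

```python
from dataclasses import dataclass, field


class TaintViolation(Exception):
    """Raised when tainted data is about to be used as a jump target."""


@dataclass
class TaintTracker:
    # One taint bit per register name and per memory address (hypothetical model).
    reg_taint: dict = field(default_factory=dict)
    mem_taint: dict = field(default_factory=dict)

    def step(self, op, dst=None, srcs=(), addr=None):
        """Update taint state for one traced instruction."""
        if op == "input":
            # Data arriving from an untrusted source is marked tainted.
            self.reg_taint[dst] = True
        elif op == "const":
            # Loading an immediate clears the taint of the destination.
            self.reg_taint[dst] = False
        elif op == "alu":
            # The result is tainted if any source operand is tainted.
            self.reg_taint[dst] = any(self.reg_taint.get(s, False) for s in srcs)
        elif op == "store":
            self.mem_taint[addr] = self.reg_taint.get(srcs[0], False)
        elif op == "load":
            self.reg_taint[dst] = self.mem_taint.get(addr, False)
        elif op == "jump":
            # Policy check: an indirect jump must not use a tainted target.
            if self.reg_taint.get(srcs[0], False):
                raise TaintViolation(f"jump through tainted register {srcs[0]}")
        else:
            raise ValueError(f"unknown operation: {op}")


def run_trace(trace):
    """Feed a list of instruction dictionaries through a fresh tracker."""
    tracker = TaintTracker()
    for instruction in trace:
        tracker.step(**instruction)
    return tracker
```

In this sketch, a trace that reads untrusted input into `r1`, adds a constant to it, stores and reloads the result, and then jumps through the reloaded register would raise `TaintViolation`, because the taint bit follows the data through the ALU operation and through memory.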

Applied Reconfigurable Computing | 2012

Exploiting both pipelining and data parallelism with SIMD reconfigurable architecture

Yongjoo Kim; Jongeun Lee; Jinyong Lee; Toan X. Mai; Ingoo Heo; Yunheung Paek

Reconfigurable Architecture (RA), which provides extremely high energy efficiency for certain domains of applications, has one problem: current mapping algorithms for it do not scale well with the number of cores. One approach to this problem is to use the SIMD (Single Instruction Multiple Data) paradigm. However, SIMD can complicate the mapping problem by adding another dimension, i.e., iteration mapping, to the already interdependent problems of data mapping and operation mapping, and can significantly affect performance through memory bank conflicts. In this paper we introduce a SIMD reconfigurable architecture, which allows for SIMD mapping at multiple levels of granularity, and investigate ways to minimize bank conflicts in a SIMD reconfigurable architecture with the related sub-problems taken into consideration. We further present data tiling and evaluate a conflict-free scheduling algorithm as a way to eliminate bank conflicts for a certain class of iteration and data mapping.
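As a rough illustration of why bank conflicts matter for SIMD access, the sketch below counts the extra cycles needed when several lanes hit the same memory bank. It assumes a simple word-interleaved memory (bank index = address modulo the number of banks) that serves one access per bank per cycle. This model and the function names are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def lane_addresses(base, stride, lanes):
    """Word addresses touched by each SIMD lane for a strided access."""
    return [base + lane * stride for lane in range(lanes)]


def bank_conflict_cycles(addresses, num_banks):
    """Extra cycles needed to serve one SIMD access.

    Assumes word-interleaved banks, each serving one access per cycle,
    so the busiest bank determines the total latency.
    """
    per_bank = Counter(address % num_banks for address in addresses)
    if not per_bank:
        return 0
    return max(per_bank.values()) - 1
```

Under this model, four lanes with unit stride on four banks spread over all banks and incur no extra cycles, while a stride equal to the number of banks sends every lane to the same bank and serializes the access. Data layout transformations such as tiling aim to move accesses from the second situation toward the first.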

Design, Automation, and Test in Europe | 2016

Integration of ROP/JOP monitoring IPs in an ARM-based SoC

Yongje Lee; Jinyong Lee; Ingoo Heo; Dongil Hwang; Yunheung Paek

A code reuse attack (CRA) is a powerful technique that allows attackers to perform arbitrary computation by reusing existing code fragments. To defend against CRAs while complying with conventional ARM-based SoC design principles, the previous hardware solution suggests using the ARM debug interface to acquire the control flow information of an application running on the host. However, it requires tremendous storage space for the complementary data necessary to trace the execution flow. In this paper, we propose a new hardware CRA monitor that offers both low storage overhead and high performance. To this end, we use an instrumentation technique that transforms the original ARM binary code into a form that makes it easier for the CRA monitor to efficiently extract, through the debug interface, all crucial pieces of runtime information from the trace output. In addition, while the previous solution was built to detect only one type of CRA, called return-oriented programming (ROP), ours has been designed to unify the detection logic for ROP and another important type of CRA, called jump-oriented programming (JOP). Empirical results show that our solution dramatically reduces the storage overhead for CRA detection while successfully detecting both ROP and JOP attacks with negligibly low runtime overhead and moderate area overhead.
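For readers unfamiliar with how a control-flow trace can expose ROP, the sketch below applies a generic shadow-stack check to a list of call and return events. This is a widely used textbook technique and is shown only as an illustration; the abstract does not state that the proposed monitor works this way. The trace format and the function name are hypothetical.

```python
def find_return_violations(trace):
    """Check traced returns against a shadow stack of expected return addresses.

    `trace` is a list of (kind, address) pairs (hypothetical format):
      ("call", return_address) for a call that pushes return_address
      ("ret", target_address)  for a return that jumps to target_address
    Returns a list of (event_index, expected, actual) for mismatching returns.
    """
    shadow_stack = []
    violations = []
    for index, (kind, address) in enumerate(trace):
        if kind == "call":
            shadow_stack.append(address)
        elif kind == "ret":
            # A return with no matching call has no expected address.
            expected = shadow_stack.pop() if shadow_stack else None
            if address != expected:
                violations.append((index, expected, address))
    return violations
```

A well-behaved program returns to the address pushed by the matching call, so the list stays empty. A ROP chain overwrites return addresses on the stack, so its returns land on gadget addresses that differ from the shadow-stack entries.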

Design, Automation, and Test in Europe | 2015

Extrax: security extension to extract cache resident information for snoop-based external monitors

Jinyong Lee; Yongje Lee; Hyungon Moon; Ingoo Heo; Yunheung Paek

The advent of rootkits has prompted much research on defending the integrity of OS kernels. Even though recently proposed snoop-based monitors have been shown to provide higher performance and a higher level of security than conventional hypervisor-based monitors, we discovered that the use of write-back caches in a system seriously undermines the effectiveness of snoop-based monitors. To address the problem, we propose a special hardware unit called Extrax, which makes use of existing hardware logic, the core debug interface, to extract the information necessary for security monitoring. Implemented to refine the debug information for security purposes, Extrax helps snoop-based monitors detect attacks that exploit write-back caches. Experimental results show that our system can detect more advanced attacks, which state-of-the-art snoop-based hardware monitors cannot capture, with moderate area overhead and power consumption.

International SoC Design Conference | 2010

An ASIP approach for motion estimation reusing resources for H.264 intra prediction

Ingoo Heo; Sang-Hyun Park; Jinyong Lee; Yunheung Paek

H.264, the latest video compression standard, is widely used for its high video quality and high compression rate. Motion estimation is a well-known technique that reduces temporal redundancy, and it is the most computation-intensive part of the standard. To improve the performance of motion estimation, various approaches have been suggested, such as novel motion estimation algorithms, Application-Specific Integrated Circuits (ASICs), and Application-Specific Instruction-set Processors (ASIPs). Among them, the ASIP approach has become popular because it can narrow the gap between ASICs and General Purpose programmable Processors (GPPs) in terms of performance, power, cost, and flexibility. An ASIP gains flexibility because it is based on a programmable processor, and it achieves reasonable performance by adding application-specific instructions. In this paper, we introduce an ASIP for motion estimation derived from our previous ASIP for H.264 intra prediction [5]. The proposed ASIP design shows sufficient throughput for the QCIF format using the Three Step Search (TSS) algorithm, with a small area increase of about 11% compared to [5], while H.264 intra prediction remains enabled.
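The Three Step Search mentioned above can be sketched in software as follows. This is a generic, minimal version of the algorithm with a sum-of-absolute-differences (SAD) block cost; it says nothing about how the ASIP implements the search with custom instructions. The function names, the frame representation (lists of rows of integers), and the default starting step of 4 are assumptions for illustration.

```python
def make_sad_cost(cur, ref, bx, by, n):
    """Build a cost function for the n-by-n block of `cur` at (bx, by).

    The returned function gives the SAD between that block and the block of
    `ref` displaced by the candidate motion vector (mx, my).
    """
    def cost(mx, my):
        x0, y0 = bx + mx, by + my
        # Candidates that fall outside the reference frame are never chosen.
        if x0 < 0 or y0 < 0 or y0 + n > len(ref) or x0 + n > len(ref[0]):
            return float("inf")
        return sum(
            abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
            for j in range(n)
            for i in range(n)
        )
    return cost


def three_step_search(cost, step=4):
    """Coarse-to-fine search for the motion vector with the lowest cost.

    Each round evaluates the 3x3 grid of candidates spaced `step` apart
    around the current best vector, then halves the step.
    """
    best = (0, 0)
    best_cost = cost(0, 0)
    while step >= 1:
        center_x, center_y = best
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                candidate = (center_x + dx, center_y + dy)
                candidate_cost = cost(*candidate)
                if candidate_cost < best_cost:
                    best, best_cost = candidate, candidate_cost
        step //= 2
    return best, best_cost
```

With a starting step of 4, the search covers displacements of up to 7 pixels in each direction using three rounds of nine evaluations, instead of the 225 evaluations of an exhaustive search over the same window. The trade-off is that TSS is a heuristic: it can miss the true minimum when the cost surface has several local minima.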

ACM Transactions on Design Automation of Electronic Systems | 2016

Efficient Security Monitoring with the Core Debug Interface in an Embedded Processor

Jinyong Lee; Ingoo Heo; Yongje Lee; Yunheung Paek

For decades, various concepts in security monitoring have been proposed. In principle, they all have one thing in common: they monitor the execution behavior of a program (e.g., control flow or data flow) running on the machine to find symptoms of attacks. Among the proposed monitoring schemes, software-based ones are known for their adaptability to commercial products, but there have been concerns that they may suffer from non-negligible runtime overhead. On the other hand, hardware-based solutions are recognized for their high performance. However, most of them have an inherent problem in that they usually mandate drastic changes to the internal processor architecture. More recent ones have strived to minimize such modifications by employing external hardware security monitors in the system. However, these approaches intrinsically suffer from the overhead caused by communication between the host and the external monitor. Our solution also relies on external hardware for security monitoring, but unlike the others, ours tackles the communication overhead by using the core debug interface (CDI), which is readily available in most commercial processors for debugging. We build our system simply by plugging our monitoring hardware into the processor via CDI, precluding the need to alter the processor internals. To validate the effectiveness of our approach, we implement two well-known monitoring techniques on our proposed framework: dynamic information flow tracking and branch regulation. The experimental results on our FPGA prototype show that our external hardware monitors efficiently perform monitoring tasks with negligible performance overhead, mainly thanks to the support of CDI, which helps us reduce communication costs substantially.

ACM Transactions on Design Automation of Electronic Systems | 2015

Implementing an Application-Specific Instruction-Set Processor for System-Level Dynamic Program Analysis Engines

Ingoo Heo; Minsu Kim; Yongje Lee; Changho Choi; Jinyong Lee; Brent ByungHoon Kang; Yunheung Paek

In recent years, dynamic program analysis (DPA) has been widely used in various fields such as profiling, bug finding, and security. However, existing solutions have their own weaknesses. Software solutions provide flexibility in DPA, but they suffer from tremendous performance overhead. In contrast, core-level hardware engines rely on specialized integrated logic and attain extremely fast computation, but they have limited functional extensibility because the logic is tightly coupled with the host processor. To remedy this, a prior system-level approach used an existing channel to integrate its hardware without requiring modification of the host architecture, and it showed great potential in performance. Nevertheless, the prior work does not address the detailed design and implementation of the engine, which is essential for deployment on real systems. To address this, in this article, we propose an implementation of a programmable DPA hardware engine, called the program analysis unit (PAU). PAU is an application-specific instruction-set processor (ASIP) whose instruction set is customized to reflect common features of various DPA methods. With its specialized architecture and the programmability of software, our PAU aims at fast computation and sufficient flexibility. In our case studies on several DPA techniques, we show that our ASIP approach can be successfully applied to complex DPA schemes while providing hardware-backed performance and software-based flexibility in analysis. Recent experiments on our FPGA prototype revealed that PAU is 4.7 to 13.6 times faster than pure software DPA, and that its power and area consumption is acceptably small compared to today's mobile processors.

Hardware and Architectural Support for Security and Privacy | 2016

Architectural Supports to Protect OS Kernels from Code-Injection Attacks

Hyungon Moon; Jinyong Lee; Dongil Hwang; Seonhwa Jung; Jiwon Seo; Yunheung Paek

Kernel code injection is a common behavior of kernel-compromising attacks, in which attackers aim to achieve their goals by manipulating an OS kernel. Several security mechanisms have been proposed to mitigate such threats, but they all suffer from non-negligible performance overhead. This paper introduces a hardware reference monitor, called Kargos, which can detect kernel code injection attacks with nearly zero performance cost. Kargos monitors the behavior of an OS kernel from outside the CPU through the standard bus interconnect and the debug interface available in most major microprocessors. By watching the execution traces and memory access events in the monitored target system, Kargos uncovers attempts to execute malicious code with kernel privilege. According to our experiments, Kargos detected all the kernel code injection attacks that we tested while increasing the computational load on the target CPU by less than 1% on average.

ACM Transactions on Design Automation of Electronic Systems | 2017

Using CoreSight PTM to Integrate CRA Monitoring IPs in an ARM-Based SoC

Yongje Lee; Jinyong Lee; Ingoo Heo; Dongil Hwang; Yunheung Paek

The ARM CoreSight Program Trace Macrocell (PTM) has been widely deployed in recent ARM processors for real-time debugging and tracing of software. Using PTM, an external debugger can extract the execution behavior of applications running on an ARM processor. Recently, some researchers have been using this feature for other purposes, such as fault-tolerant computation and security monitoring. This motivated us to develop an external security monitor that can detect control hijacking attacks, whose goal is to maliciously manipulate the control flow of victim applications at the attacker’s will. This article focuses on detecting a special type of attack called code reuse attacks (CRAs), which use a recently introduced technique that allows attackers to perform arbitrary computation without injecting their own code, by reusing only existing code fragments. Our external monitor is attached to the outside of the host system via the system bus and ARM CoreSight PTM, and is fed with execution traces of a victim application running on the host. As a majority of CRAs violate the normal execution behavior of a program, our monitor constantly watches and analyzes the execution traces of the victim application and detects a symptom of attack when the execution behavior violates certain rules that normal applications are known to adhere to. We present two different implementations for this purpose: a hardware-based solution in which all CRA detection components are implemented in hardware, and a hardware/software mixed solution that can be employed in a more resource-constrained environment where the deployment of full hardware-level CRA detection is burdensome.

ACM Transactions on Design Automation of Electronic Systems | 2013

Architecture customization of on-chip reconfigurable accelerators

Jonghee W. Yoon; Jongeun Lee; Sang-Hyun Park; Yongjoo Kim; Jinyong Lee; Yunheung Paek; Doosan Cho

Integrating coarse-grained reconfigurable architectures (CGRAs) into a System-on-a-Chip (SoC) presents many benefits as well as important challenges. One of the challenges is how to customize the architecture for the target applications efficiently and effectively without performing explicit design space exploration. In this article we present a novel methodology for incremental interconnect customization of CGRAs that can suggest a new interconnection architecture that maximizes performance for a given set of application kernels while minimizing hardware cost. In our methodology, we translate the problem of interconnect customization into that of inexact graph matching, and we devise a heuristic for the A* search algorithm to solve the inexact graph matching problem efficiently. Our experimental results demonstrate that our customization method can quickly find application-optimized interconnections that exhibit 80% higher performance on average compared to the base architecture with mesh interconnections, with little increase in energy and hardware for interconnections and muxes.