Yongjin Ahn

University of California, Irvine

Publication

Featured research published by Yongjin Ahn.

ACM Transactions on Design Automation of Electronic Systems | 2008

SoCDAL: System-on-chip design AcceLerator

Yongjin Ahn; Keesung Han; Ganghee Lee; Hyunjik Song; Junhee Yoo; Kiyoung Choi; Xingguang Feng

Time-to-market pressure and the ever-growing design complexity of multiprocessor system-on-chips have demanded an efficient design environment that enables fast exploration of a large design space. In this article, we introduce a new design environment, called SoCDAL, for accelerating multiprocessor system-on-chip design through fast design-space exploration targeting real-time multimedia systems. SoCDAL is a set of mostly automated tools covering system specification, hardware/software estimation, application-to-architecture mapping, simulation model generation, and system verification through simulation. For system specification, the process network model has been widely used because of its modeling capability. However, it is hard to use for real-time systems design, since its behavior cannot be estimated statically. We introduce a new approach which enables analyzing a process network model statically with some restrictions. For the hardware/software estimation, we analyze code statically. The application-to-architecture mapping process implements a novel algorithm to support an arbitrary number of processors, with performance evaluation by static scheduling considering communication behavior. Mapping results are used to generate simulation models automatically at several transaction levels to be pipelined to a commercial tool. We show the effectiveness of our approaches through experimental results with multimedia applications such as JPEG, H.263, and H.264 encoders, as well as an H.264 decoder.

embedded systems for real-time multimedia | 2009

Inter-kernel data reuse and pipelining on chip-multiprocessors for multimedia applications

Luis Angel D. Bathen; Yongjin Ahn; Nikil D. Dutt; Sudeep Pasricha

The increasing demand for low power and high performance multimedia embedded systems has motivated the need for effective solutions to satisfy application bandwidth and latency requirements under a tight power budget. As technology scales, it is imperative that applications are optimized to take full advantage of the underlying resources and meet both power and performance requirements. We propose a methodology capable of discovering and enabling parallelism opportunities via code transformations, efficiently distributing the computational load across resources, and minimizing unnecessary data transfers. Our approach decomposes the application's tasks into smaller units of computation called kernels, which are distributed and pipelined across the different processing resources. We exploit the ideas of inter-kernel data reuse to minimize unnecessary data transfers between kernels and early execution edges to drive performance. Our experimental results on a JPEG2000 case study show up to 80% performance improvement and 60% dynamic power reduction over standard application mapping approaches.

microprocessor test and verification | 2009

A Methodology for Power-aware Pipelining via High-Level Performance Model Evaluations

Luis Angel D. Bathen; Yongjin Ahn; Sudeep Pasricha; Nikil Dutt

Power is one of the major constraints considered during the design of embedded software. To reduce power consumption without sacrificing performance, software needs to be optimized to run as efficiently as possible on a given platform. When attempting to optimize the mapping of a piece of software on a multiprocessor system, designers often face the chicken-and-egg problem of whether to schedule tasks first or do memory allocation first, as either step will affect the optimization opportunities the other may provide. Because each optimization will affect the system’s power consumption, it is critically important to be able to monitor the effects these transformations have. In this paper we present a methodology that allows designers to quickly evaluate the impact each code optimization will have on the system’s power. Our exploration engine relies on SystemC-based power/performance models to quickly and accurately evaluate the dynamic power due to memory accesses as well as the expected CPU power consumption.

international symposium on vlsi design, automation and test | 2010

Inter and intra kernel reuse analysis driven pipelining on Chip-Multiprocessors

Luis Angel D. Bathen; Yongjin Ahn; Nikil D. Dutt

As the demand for low power multimedia systems continues to grow, so will the need for low cost and efficient solutions. Driven by such need, as well as improvements in IC design technology, Chip-Multiprocessors (CMPs) have emerged as a potential solution. CMPs offer flexibility, low cost, low power and the ability to handle highly parallel workloads. As CMPs scale, it is up to the designer to take full advantage of their computational resources and manage their constrained memory resources efficiently. In this paper we propose a methodology that enables designers to fully exploit the target platform's computational resources without sacrificing power consumption by maximizing the application's reuse. Our approach uses code transformations to split the application's tasks into smaller units of computation or subtasks called kernels. Each kernel is analyzed for inter and intra reuse opportunities in order to minimize unnecessary data transfers between kernels. Our approach also couples the scheduling/pipelining of tasks with their memory allocations. This allows us to obtain memory aware pipelined schedules that increase throughput and reduce power consumption. Our methodology has shown up to 15% performance improvement as well as 33% power reduction when compared to state of the art techniques.

international conference on embedded computer systems: architectures, modeling, and simulation | 2007

Automatic Bus Matrix Synthesis based on Hardware Interface Selection for Fast Communication Design Space Exploration

Ganghee Lee; Seokhyun Lee; Yongjin Ahn; Kiyoung Choi

In this paper, we present an automated bus matrix synthesis flow for efficient system-on-chip communication design space exploration at the transaction level. In particular, we consider hardware interface design, since it affects overall system cost and performance. Depending on the bus interface, a hardware block can be a master or a slave. We propose a method to solve this hardware interface selection problem by analyzing communication behavior statically. In addition, in order to explore the communication design space quickly, we automatically generate transaction level models for the hardware blocks according to the hardware interface selection. The synthesis result is verified by transaction level simulation with a commercial tool. We give experimental results with a JPEG encoder and an H.264 encoder to demonstrate the efficiency of the proposed method. The results show that with our automated synthesis flow, the designer can easily and quickly obtain better communication designs through fast design space exploration. More specifically, our hardware interface selection technique reduces the area of the bus matrix by 41.43% with 0.58% performance overhead on average compared to the case of maximum performance.

Design Automation for Embedded Systems | 2010

Communication architecture design for reconfigurable multimedia SoC platform

Ganghee Lee; Yongjin Ahn; Seokhyun Lee; Jeongki Son; Kiwook Yoon; Kiyoung Choi

Memory and communication architecture have a significant impact on the performance, cost, and power of complex multiprocessor system-on-chip designs. In this paper, we present an automated bus matrix synthesis flow for efficient transaction-level design space exploration of communication architecture in a reconfigurable multimedia system-on-chip platform. Specifically, we consider the hardware interface selection problem, which has a significant effect on the overall cost of area and power. We propose a method to solve this hardware interface selection problem through static analysis of communication behavior. We experiment with JPEG encoder and H.264 encoder examples, and the results show a reduction of bus matrix area by 56.91% and power by 48.61% with 0.58% performance overhead on average compared to the case of maximum performance. We also apply our hardware interface selection algorithm to an MPEG4 video decoder example and evaluate the result on an FPGA prototyping board.

ACM Transactions on Embedded Computing Systems | 2013

MultiMaKe: Chip-multiprocessor driven memory-aware kernel pipelining

Luis Angel D. Bathen; Yongjin Ahn; Sudeep Pasricha; Nikil D. Dutt

The increasing demand for low-power and high-performance multimedia embedded systems has motivated the need for effective solutions to satisfy application bandwidth and latency requirements under a tight power budget. As technology scales, it is imperative that applications are optimized to take full advantage of the underlying resources and meet both power and performance requirements. We propose MultiMaKe, an application mapping design flow capable of discovering and enabling parallelism opportunities via code transformations, efficiently distributing the computational load across resources, and minimizing unnecessary data transfers. Our approach decomposes the application's tasks into smaller units of computation called kernels, which are distributed and pipelined across the different processing resources. We exploit the ideas of inter-kernel data reuse to minimize unnecessary data transfers between kernels, early execution edges to drive performance, and kernel pipelining to increase system throughput. Our experimental results on JPEG and JPEG2000 show up to 97% off-chip memory access reduction, and up to 80% execution time reduction over standard mapping and task-level pipelining approaches.

field-programmable technology | 2009

QoS-aware dynamic power management for coarse-grained reconfigurable architecture

Ganghee Lee; Manhwee Jo; Yongjin Ahn; Kiyoung Choi; Nikil D. Dutt

In this paper, we propose system level dynamic power management of coarse-grained reconfigurable architectures through dynamic reconfiguration. The novelty of our approach lies in two system level power management strategies for coarse-grained reconfigurable architectures that weigh the tradeoff between cost and quality-of-service (QoS) according to battery status. Experimental results on an MPEG4 video decoder show that our approach saves 17 percent energy when the system is in normal-battery mode and 36 percent energy when the system is in low-battery mode at the expense of some QoS degradation.

Design Automation for Embedded Systems | 2003