Ganghee Lee | Researchain

Publication

Featured research published by Ganghee Lee.

ACM Transactions on Design Automation of Electronic Systems | 2008

SoCDAL: System-on-chip design AcceLerator

Yongjin Ahn; Keesung Han; Ganghee Lee; Hyunjik Song; Junhee Yoo; Kiyoung Choi; Xingguang Feng

Time-to-market pressure and the ever-growing design complexity of multiprocessor system-on-chips demand an efficient design environment that enables fast exploration of a large design space. In this article, we introduce a new design environment, called SoCDAL, for accelerating multiprocessor system-on-chip design through fast design-space exploration targeting real-time multimedia systems. SoCDAL is a set of mostly automated tools covering system specification, hardware/software estimation, application-to-architecture mapping, simulation model generation, and system verification through simulation. For system specification, the process network model has been widely used because of its modeling capability. However, it is hard to use for real-time systems design, since its behavior cannot be estimated statically. We introduce a new approach that enables static analysis of a process network model under some restrictions. For hardware/software estimation, we analyze code statically. The application-to-architecture mapping process implements a novel algorithm that supports an arbitrary number of processors, with performance evaluated by static scheduling that considers communication behavior. Mapping results are used to generate simulation models automatically at several transaction levels to be pipelined to a commercial tool. We show the effectiveness of our approaches through experimental results with multimedia applications such as JPEG, H.263, and H.264 encoders, as well as an H.264 decoder.

Design, Automation, and Test in Europe | 2003

Scheduling and Timing Analysis of HW/SW On-Chip Communication in MP SoC Design

Young-Chul Cho; Ganghee Lee; Sungjoo Yoo; Kiyoung Choi; Nacer-Eddine Zergainoh

On-chip communication design includes designing software (SW) parts (operating system, device drivers, interrupt service routines, etc.) as well as hardware (HW) parts (on-chip communication network, communication interfaces of processor/IP/memory, on-chip memory, etc.). For an efficient exploration of its design space, we need fast scheduling and timing analysis. In this work, we tackle two problems (one for SW and the other for HW) in on-chip communication design. One is to incorporate the dynamic behavior of SW (interrupt processing and context switching) into on-chip communication scheduling. The other is to reduce the on-chip data storage required for on-chip communication by sharing physical communication buffers among different communication transactions. To solve these problems, we present both an ILP (integer linear programming) formulation and a heuristic algorithm, which enable the designer to perform efficient on-chip communication scheduling and obtain accurate timing information. Experimental results show the effectiveness of our work.

International SoC Design Conference | 2008

Automatic mapping of application to coarse-grained reconfigurable architecture based on high-level synthesis techniques

Ganghee Lee; Seokhyun Lee; Kiyoung Choi

Coarse-grained reconfigurable architecture offers both performance and flexibility. However, it is not easy to map applications to such an architecture, since mapping requires compiling the application and configuring the architecture at the same time while trying to maximally exploit the parallelism in both the application and the architecture. In this paper, we introduce an approach to mapping applications to coarse-grained reconfigurable architecture based on high-level synthesis techniques. We adopt performance-enhancing techniques, including loop unrolling and loop pipelining, for temporal mapping on a reconfigurable array architecture. Experimental results with DSPstone benchmark examples show the effectiveness of the proposed approach.

Adaptive Hardware and Systems | 2010

Thermal-aware fault-tolerant system design with coarse-grained reconfigurable array architecture

Ganghee Lee; Kiyoung Choi

Coarse-grained reconfigurable array architectures have drawn increasing attention due to their performance and flexibility. A typical coarse-grained reconfigurable array architecture has many PEs in the array, which makes it suitable for implementing the spatial redundancy used in fault-tolerant systems design. In this paper, we propose to implement replications and a voting function on the PE array of a coarse-grained reconfigurable array architecture to design a fault-tolerant system. We also introduce thermal-aware application mapping onto the coarse-grained reconfigurable array architecture for reliability. An experiment with a Viterbi decoder shows that our approach enables implementing fault tolerance with a 12% area overhead, which comes from implementing conditional execution.
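
The replication-and-voting idea in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' implementation: the function names (`majority_vote`, `run_replicated`, `add_compare_select`), the choice of three replicas, and the bit-flip fault model are all hypothetical and chosen for illustration.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def run_replicated(kernel, x, replicas=3, faulty=None):
    """Run `kernel` once per replica (standing in for one PE each) and vote.

    `faulty` is a hypothetical fault model: it maps a replica index to a
    function that corrupts that replica's output.
    """
    faulty = faulty or {}
    outputs = []
    for i in range(replicas):
        y = kernel(x)
        if i in faulty:
            y = faulty[i](y)  # inject the modeled fault
        outputs.append(y)
    return majority_vote(outputs)


def add_compare_select(metrics):
    """Toy stand-in for a Viterbi-style step: keep the smallest path metric."""
    return min(metrics)


# Replica 1 suffers a single bit flip; the other two replicas outvote it.
result = run_replicated(add_compare_select, (7, 3, 9), faulty={1: lambda y: y ^ 0x4})
```

With three replicas the voter masks any single faulty output; if all replicas disagree, the sketch returns `None` to signal that no majority exists.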

Applied Reconfigurable Computing | 2010

Routing-aware application mapping considering Steiner points for coarse-grained reconfigurable architecture

Ganghee Lee; Seokhyun Lee; Kiyoung Choi; Nikil D. Dutt

Coarse-grained reconfigurable architectures have drawn increasing attention due to their performance and flexibility. While many coarse-grained reconfigurable architectures have demonstrated impressive performance improvements, their effectiveness heavily depends on the quality of the compilers and/or mappers. However, this mapping process is difficult since it requires the solution of multiple problems simultaneously: compilation of the application and configuration of the architecture while maximally exploiting the parallelism in both the application and the architecture. Utilization of routing resources also adds to the complexity of the mapping process. In this paper, we introduce routing-aware mapping algorithms for coarse-grained reconfigurable architecture. In particular, we consider Steiner point routing, since it gives better results than spanning tree based routing. After presenting an optimal formulation using integer linear programming (that doesn’t scale), we present a fast heuristic mapping algorithm. Our experimental results on randomly generated examples show that our algorithm considering Steiner point routing gives 10% better performance than the one using spanning tree routing. Our heuristic algorithm also finds optimal solutions for 96% of the cases on average within a few seconds. We report similar results on a suite of benchmarks collected from Livermore loops, Mediabench, and DSPStone.
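
To see why routing through a Steiner point can beat spanning tree routing, consider three terminals on a grid. The sketch below is an illustration only and does not reproduce the paper's ILP formulation or heuristic: it assumes routing cost is Manhattan wire length between grid positions, and the helper names (`manhattan`, `mst_length`, `steiner_length_3`) and the terminal coordinates are made up for the example. For exactly three terminals, the coordinate-wise median is an optimal rectilinear Steiner point.

```python
from statistics import median


def manhattan(a, b):
    """Manhattan distance between two grid positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mst_length(points):
    """Total length of a minimum spanning tree over `points` (Prim's algorithm)."""
    in_tree = {points[0]}
    total = 0
    while len(in_tree) < len(points):
        # Pick the cheapest edge leaving the current tree.
        dist, nxt = min(
            (manhattan(u, v), v)
            for u in in_tree
            for v in points
            if v not in in_tree
        )
        total += dist
        in_tree.add(nxt)
    return total


def steiner_length_3(points):
    """Steiner point and tree length for exactly three terminals."""
    assert len(points) == 3
    s = (median(p[0] for p in points), median(p[1] for p in points))
    return s, sum(manhattan(s, p) for p in points)


terminals = [(0, 0), (2, 0), (1, 2)]  # hypothetical PE positions
spanning = mst_length(terminals)
steiner_point, steiner = steiner_length_3(terminals)
```

In this example the spanning tree needs 5 units of wire, while routing all three terminals through the extra point (1, 0) needs only 4.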

IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum | 2010

Automatic mapping of control-intensive kernels onto coarse-grained reconfigurable array architecture with speculative execution

Ganghee Lee; Kyungwook Chang; Kiyoung Choi

Coarse-grained reconfigurable array architectures have drawn increasing attention due to their good performance and flexibility. In general, they show high performance for compute-intensive kernel code, but cannot handle control-intensive parts efficiently, thereby degrading the overall performance. In this paper, we present automatic mapping of control-intensive kernels onto coarse-grained reconfigurable array architecture by using kernel-level speculative execution. Experimental results show that our automatic mapping tool successfully handles control-intensive kernels for coarse-grained reconfigurable array architecture. In particular, it improves the performance of the H.264 deblocking filters for luma and chroma by more than 26 and 16 times, respectively, compared to a conventional software implementation. Compared to the approach using predicated execution, the proposed approach achieves a 2.27 times performance enhancement.

International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2007

Automatic Bus Matrix Synthesis based on Hardware Interface Selection for Fast Communication Design Space Exploration

Ganghee Lee; Seokhyun Lee; Yongjin Ahn; Kiyoung Choi

In this paper, we present an automated bus matrix synthesis flow for efficient system-on-chip communication design space exploration at the transaction level. In particular, we consider hardware interface design, since it affects overall system cost and performance. Depending on the bus interface, a hardware block can be a master or a slave. We propose a method to solve this hardware interface selection problem by analyzing communication behavior statically. In addition, in order to explore the communication design space quickly, we automatically generate transaction level models for the hardware blocks according to the hardware interface selection. The synthesis result is verified by transaction level simulation with a commercial tool. We give experimental results with a JPEG encoder and an H.264 encoder to demonstrate the efficiency of the proposed method. The results show that with our automated synthesis flow, the designer can easily and quickly obtain better communication designs through fast design space exploration. More specifically, our hardware interface selection technique reduces the area of the bus matrix by 41.43% with 0.58% performance overhead on average compared to the case of maximum performance.

Design Automation for Embedded Systems | 2010

Communication architecture design for reconfigurable multimedia SoC platform

Ganghee Lee; Yongjin Ahn; Seokhyun Lee; Jeongki Son; Kiwook Yoon; Kiyoung Choi

Memory and communication architecture have a significant impact on the performance, cost, and power of complex multiprocessor system-on-chip designs. In this paper, we present an automated bus matrix synthesis flow for efficient transaction-level design space exploration of communication architecture in a reconfigurable multimedia system-on-chip platform. Specifically, we consider the hardware interface selection problem, which has a significant effect on the overall cost in area and power. We propose a method to solve this hardware interface selection problem through static analysis of communication behavior. We experiment with JPEG encoder and H.264 encoder examples, and the results show a reduction of bus matrix area by 56.91% and power by 48.61% with 0.58% performance overhead on average compared to the case of maximum performance. We also apply our HW interface selection algorithm to an MPEG4 video decoder example and evaluate the result on an FPGA prototyping board.

International Conference on Hybrid Information Technology | 2009

Coarse-grained reconfigurable architecture for multiple application domains: a case study

Manhwee Jo; Ganghee Lee; Kyungwook Chang; Kyuseung Han; Kiyoung Choi; Hoonmo Yang; Kiwook Yoon

In this paper, we propose a coarse-grained reconfigurable architecture that supports both integer and floating-point application domains. Our coarse-grained reconfigurable architecture has an 8x8 array of integer processing elements to execute 64 integer operations or 32 floating-point operations in parallel. In order to show the feasibility of the proposed architecture, we use an MPEG4 decoder as an integer application and a physics engine for 3D graphics as a floating-point application. We first analyze these applications and optimize their implementation on the proposed architecture at the system level. Then we implement the proposed architecture on an FPGA board, which decodes 12 frames per second of MPEG4 CIF images and executes up to 160 million floating-point operations per second for the physics engine at a 20MHz clock frequency.

Field-Programmable Technology | 2009

QoS-aware dynamic power management for coarse-grained reconfigurable architecture

Ganghee Lee; Manhwee Jo; Yongjin Ahn; Kiyoung Choi; Nikil D. Dutt

In this paper, we propose system-level dynamic power management of coarse-grained reconfigurable architectures through dynamic reconfiguration. The novelty of our approach lies in two system-level power management strategies for coarse-grained reconfigurable architectures that trade off cost against quality of service (QoS) according to battery status. Experimental results on an MPEG4 video decoder show that our approach saves 17 percent energy when the system is in normal-battery mode and 36 percent energy when the system is in low-battery mode at the expense of some QoS degradation.
