Chunho Lee
University of California, Los Angeles
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Chunho Lee.
international symposium on microarchitecture | 1997
Chunho Lee; Miodrag Potkonjak; William H. Mangione-Smith
Significant advances have been made in compilation technology for capitalizing on instruction-level parallelism (ILP). The vast majority of ILP compilation research has been conducted in the context of general-purpose computing, and more specifically the SPEC benchmark suite. At the same time, a number of microprocessor architectures have emerged which have VLIW and SIMD structures that are well matched to the needs of the ILP compilers. Most of these processors are targeted at embedded applications such as multimedia and communications, rather than general-purpose systems. Conventional wisdom, and a history of hand optimization of inner-loops, suggests that ILP compilation techniques are well suited to these applications. Unfortunately, there currently exists a gap between the compiler community and embedded applications developers. This paper presents MediaBench, a benchmark suite that has been designed to fill this gap. This suite has been constructed through a three-step process: intuition and market driven initial selection, experimental measurement to establish uniqueness, and integration with system synthesis algorithms to establish usefulness.
international conference on computer aided design | 1997
Darko Kirovski; Chunho Lee; Miodrag Potkonjak; William H. Mangione-Smith
We developed a new hierarchical modular approach for synthesis of area-minimal core-based data-intensive systems. The optimization approach employs a novel global least-constraining most-constrained heuristic to minimize the instruction cache misses for a given application, instruction cache size and organization. Based on this performance optimization technique, we constructed a strategy to search for a minimal area processor core, and an instruction and data cache which satisfy the performance characteristics of a set of target applications. The synthesis platform integrates the existing modeling, profiling, and simulation tools with the developed system-level synthesis tools. The effectiveness of the approach is demonstrated on a variety of modern real-life multimedia and communication applications.
design automation conference | 1999
Johnson Kin; Chunho Lee; William H. Mangione-Smith; Miodrag Potkonjak
We present a framework for rapidly exploring the design space of low power application-specific programmable processors (ASPP), in particular mediaprocessors. We focus on a category of processors that are programmable yet optimized to reduce power consumption for a specific set of applications. The key components of the framework presented in this paper are a retargetable instruction level parallelism (ILP) compiler, processor simulators, a set of complete media applications written in a high level language and an architectural component selection algorithm. The fundamental idea behind the framework is that with the aid of a retargetable ILP compiler and simulators it is possible to arrange architectural parameters (e.g., the issue width, the size of cache memory units, the number of execution units, etc.) to meet low power design goals under area constraints.
Design Automation for Embedded Systems | 1999
Chunho Lee; Miodrag Potkonjak; Wayne H. Wolf
This paper presents a system level approach for the synthesis of hard real-time multitask application specific systems. The algorithm takes into account task precedence constraints among multiple hard real-time tasks and targets a multiprocessor system consisting of a set of heterogeneous off-the-shelf processors. The optimization goal is to select a minimal cost multi-subset of processors while satisfying all the required timing and precedence constraints. There are three design phases: resource allocation, assignment, and scheduling. Since the resource allocation is a search for a minimal cost multi-subset of processors, we adopted an A* search based technique for the first synthesis phase. A variation of the force-directed optimization technique is used to assign a task to an allocated processor. The final scheduling of a hard-real time task is done by the task level scheduler which is based on Earliest Deadline First (EDF) scheduling policy. Our task level scheduler incorporates force-directed scheduling methodology to address the situations where EDF is not optimal. The experimental results on a variety of examples show that the approach is highly effective and efficient.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1999
Darko Kirovski; Chunho Lee; Miodrag Potkonjak; William H. Mangione-Smith
Due to the increasing popularity of multimedia and communications applications, requirements for application-specific systems typically include design flexibility and data management ability. Since the development of such systems is a market-driven task, reducing the time to market and manufacturing cost, while still satisfying application performance requirements, is an important system synthesis requirement. We have developed a new approach for area optimization of core-based systems. The approach uses basic block relocation in order to reduce the number of cache misses and, thus, enable hardware savings during system synthesis. Given a processor model, a cache model, and a set of nonpreemptive tasks with timing constraints, the goal of the synthesis framework is to select a system configuration (processor, I-cache, and D-cache) of minimal area that satisfies the performance constraints. The system synthesis framework has two key components. The first component is a code optimization engine that relocates basic blocks within a given assembly program in order to reduce the number of cache misses. The second component is a search mechanism that leverages the improvements in code performance obtained by the first component to select the most area-efficient system configuration. In order to bridge the gap between the profiling and modeling tools, we have constructed a new performance evaluation platform. It integrates the existing modeling, profiling, and simulation tools with the developed system-level synthesis tools. The effectiveness of the synthesis approach is demonstrated on a variety of modern real-life multimedia and communication applications.
asia and south pacific design automation conference | 1998
Darko Kirovski; Chunho Lee; Miodrag Potkonjak; William H. Mangione-Smith
We developed a new modular synthesis approach for design of low-power core-based data-intensive application-specific systems on silicon. The power optimization is conducted in three steps: minimization of instruction cache misses, placement of frequently executed sequential basic blocks of code in consecutive Gray code addressed memory locations, and processor and cache application-driven selection for low power. In order to bridge the gap between the profiling and modeling tools from the two traditionally disjoint synthesis domains (architecture and CAD), we developed a new synthesis and evaluation platform. The platform integrates the existing modeling, profiling, and simulation tools with the developed system-level synthesis tools. The effectiveness of the approach is demonstrated on a variety of modern industrial-strength multimedia and communication applications.
design automation conference | 1998
Chunho Lee; Johnson Kin; Miodrag Potkonjak; William H. Mangione-Smith
In this paper we report a framework that makes it possible for a designer to rapidly explore the application-specific programmable processor design space under area constraints. The framework uses a production-quality compiler and simulation tools to synthesize a high performance machine for an application. Using the framework we evaluate the validity of the fundamental assumption behind the development of application-specific programmable processors. Application-specific processors are based on the idea that applications differ from each other in key architectural parameters, such as the available instruction-level parallelism, demand on various hardware components (e.g. cache memory units, register files) and the need for different number of functional units. We found that the framework introduced in this paper can be valuable in making early design decisions such as area and architectural trade-off, cache and instruction issue width trade-off under area constraint, and the number of branch units and issue width.
asia and south pacific design automation conference | 2000
Johnson Kin; Chunho Lee; William H. Mangione-Smith; Miodrag Potkonjak
Quality of service (QoS) has been an important topic of many research communities. Combined with an advanced and retargetable compiler, variability of applications-specific very large instruction word (VLIW) processors has been shown to provide an excellent platform for optimizing performance under area constraints. Although media servers and others that are natural candidates for QoS management are embedded systems and implementing QoS management using the applications-specific VLIW platform can provide a new horizon for efficient and effective system design, until now no effort has been reported. In this paper, we introduce a QoS-based system partitioning scheme that addresses the need for maximizing net benefit of a system under resource constraints by exploiting applications-specific VLIW processors. The approach utilizes the modern advances in compiler technology and architectural enhancements that are well matched to the compiler technology. Experimental results indicate that the approach presented in this paper is very effective in system partitioning for QoS given a set of heterogeneous processors.
international symposium on low power electronics and design | 1999
Chunho Lee; Johnson Kin; Miodrag Potkonjak; William H. Mangione-Smith
Distributed hypermedia systems that support collaboration are an emerging platform for creation, discovery, management and delivery of information. We present an approach to low power system design space exploration for distributed hypermedia applications. Traditionally, low power design and synthesis of application specific programmable processors has been done in the context of a given number of operations required to complete a task. Our approach utilizes the modern advances in compiler technology and architectural enhancements that are well matched to the compiler technology. This work is, to the best of our knowledge, the first attempt to address the need for synthesis of low power hypermedia processors. Also, this is the first work to address the power efficiency through exploiting instruction level parallelism (ILP) found in hypermedia tasks by a production quality ILP compiler. Using the developed framework we conduct an extensive exploration of low power system design space for a hypermedia application under area and throughput constraints. The framework introduced in this paper is very valuable in making early low power design decisions such as architectural configuration trade-offs including the cache and issue width trade-off under area and throughput constraint, and the number of branch units and issue width.
international conference on computer aided design | 1998
Chunho Lee; Miodrag Potkonjak
We present a quantitative approach to development and validation of synthetic benchmarks for behavioral synthesis systems. The approach is built on the idea of quantitative benchmark selection. We briefly explain the idea and present experimental results on the quantitative selection and validation of benchmarks. We develop a synthetic design example generator which composes the behavioral level specification of a design having the properties given by a set of numerical parameters. Experimental generation and application of synthetic design examples demonstrate the effectiveness of the proposed approach and the developed algorithms.