Publication


Featured research published by Huesung Kim.


IEEE Transactions on Very Large Scale Integration Systems | 2001

A reconfigurable multifunction computing cache architecture

Huesung Kim; Arun K. Somani; Akhilesh Tyagi

A considerable portion of a microprocessor chip is dedicated to cache memory. However, not all applications need all the cache storage all the time, especially computing-bandwidth-limited applications. In addition, some applications have large embedded computations with a regular structure. Such applications may be able to use additional computing resources. If the unused portion of the cache could serve these computation needs, the on-chip resources would be utilized more efficiently. This presents an opportunity to explore the reconfiguration of a part of the cache memory for computing. Thus, we propose adaptive balanced computing (ABC): dynamic resource configuration, on demand from the application, between memory and computing resources. In this paper, we present a cache architecture that converts a cache into a computing unit for either of two structured computations: finite impulse response (FIR) and discrete/inverse discrete cosine transform (DCT/IDCT). To convert a cache memory into a function unit, we include additional logic to embed multibit output lookup tables into the cache structure. The experimental results show that the reconfigurable module improves the execution time of applications with a large number of data elements by factors as high as 50 and 60.


Field Programmable Gate Arrays | 2000

A reconfigurable multi-function computing cache architecture

Huesung Kim; Arun K. Somani; Akhilesh Tyagi

In a modern microprocessor, a considerable portion of the chip is dedicated to cache memory. However, some applications may not actively need all the cache storage, especially computing-bandwidth-limited applications. Instead, such applications may be able to use some additional computing resources. If the unused portion of the cache could serve these computation needs, the on-chip resources would be utilized more efficiently. This presents an opportunity to explore the reconfiguration of a part of the cache memory for computing. In this paper, we present a cache architecture that converts a cache into a computing unit for either of two structured computations: FIR and DCT/IDCT. To convert a cache memory into a function unit, we include additional logic to embed multi-bit output LUTs into the cache structure. The cache can therefore perform computations when it is reconfigured as a function unit. The experimental results show that the reconfigurable module improves the execution time of applications with a large number of data elements by a large factor (as high as 50 and 60). In addition, the area overhead of the reconfigurable cache module for FIR and DCT/IDCT is less than the core area of those functions. Our simulations indicate that a reconfigurable cache does not incur a significant delay penalty compared with a dedicated cache memory. The concept of reconfigurable cache modules can also be applied at Level-2 caches instead of Level-1 caches to provide an active Level-2 cache similar to active memories.
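
The abstract above does not spell out how the lookup tables are organized, so the following is only a minimal sketch of the general idea of computing with multi-bit output LUTs. It assumes a hypothetical 4-bit by 4-bit product table (`MUL4`) and builds an 8-bit multiply and a direct-form FIR filter from table reads, shifts, and adds. The names `MUL4`, `lut_mul8`, and `fir`, and the table size, are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical sketch: table size and function names are illustrative only.

# A multi-bit output LUT: two 4-bit inputs select an 8-bit product.
MUL4 = [[a * b for b in range(16)] for a in range(16)]


def lut_mul8(x: int, y: int) -> int:
    """Multiply two unsigned 8-bit values using only LUT reads, shifts, and adds."""
    if not (0 <= x < 256 and 0 <= y < 256):
        raise ValueError("operands must be unsigned 8-bit values")
    xl, xh = x & 0xF, x >> 4  # low and high nibbles of x
    yl, yh = y & 0xF, y >> 4  # low and high nibbles of y
    # Combine the four partial products with their positional weights.
    return (
        MUL4[xl][yl]
        + ((MUL4[xl][yh] + MUL4[xh][yl]) << 4)
        + (MUL4[xh][yh] << 8)
    )


def fir(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k].

    Samples before the start of the input are treated as zero.
    """
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += lut_mul8(h, samples[n - k])
        out.append(acc)
    return out
```

In this sketch every multiplication is replaced by table reads, which is the property that makes a memory array a plausible host for the computation. A real design would also have to handle signed coefficients and wider operands.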


IEEE Transactions on Computers | 2004

Low-power high-performance reconfigurable computing cache architectures

Rama Sangireddy; Huesung Kim; Arun K. Somani

The demand for higher computing power and, thus, more on-chip computing resources is ever increasing. The size of on-chip cache memory has also been consistently increasing to keep up with developments in implementation technology. However, some applications may not utilize the full cache capacity and instead require more computing resources. To efficiently utilize silicon real estate on the chip, we exploit the possibility of using a part of cache memory for computational purposes to strike a balance in the usage of memory and computing resources for various applications. In an earlier part of our work, the idea of the adaptive balanced computing (ABC) architecture was developed, where a module of an L1 data cache is used as a coprocessor controlled by the main processor. A part of an L1 data cache is designed as a reconfigurable functional cache (RFC) that can be configured to perform a selected core function of the media application whenever such computing capability is required. The ABC architecture provides speedups ranging from 1.04x to 5.0x for various media applications. We show that a reduced number of cache accesses and lower utilization of other on-chip resources, due to a significant reduction in the execution time of the application, result in power savings. For this purpose, we first develop a model to compute the power consumed by the RFC while accelerating the computation of multimedia applications. The results show that up to a 60 percent reduction in power consumption is achieved for MPEG decoding and a reduction in the range of 10 to 20 percent for various other multimedia applications. Beyond the discussions in earlier work on the ABC architecture, we present a detailed circuit-level implementation of the core functions in the RFC modules. We further study the impact of converting the conventional cache into an RFC on both access time and energy consumption. The analysis is performed on a wide spectrum of cache organizations, with sizes varying from 8KB to 256KB and varying set associativity.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2002

Low-Power High-Performance Adaptive Computing Architectures for Multimedia Processing

Rama Sangireddy; Huesung Kim; Arun K. Somani

The demand for higher computing power, and thus more on-chip computing resources, is ever increasing. The size of on-chip cache memory has also been consistently increasing. To efficiently utilize silicon real estate on the chip, a part of the L1 data cache is designed as a Reconfigurable Functional Cache (RFC) that can be configured to perform a selected core function of the media application whenever higher computing capability is required. The idea of the Adaptive Balanced Computing (ABC) architecture was developed, where the RFC module is used as a coprocessor controlled by the main processor. Initial results show that the ABC architecture provides speedups ranging from 1.04x to 5.0x for various media applications. In this paper, we address the impact of the RFC on cache access time and energy dissipation. We show that a reduced number of cache accesses and lower utilization of other on-chip resources result in energy savings of up to 60% for MPEG decoding, and in the range of 10% to 20% for various other multimedia applications.


Field-Programmable Custom Computing Machines | 1999

On reconfiguring cache for computing

Huesung Kim; Arun K. Somani; Akhilesh Tyagi

The number of transistors on a chip has increased dramatically within the last decade. In a modern microprocessor, a considerable portion of the chip is dedicated to cache memory. However, some applications may not need all of the cache for storage. In addition, some applications have embedded computations with a regular structure. The behavior of these applications is static, which implies that a specialized function unit could be beneficial for the application. This presents an opportunity to explore the use of a part of a cache for performing these regular computations. In this paper, we show one such design that converts a cache into a function unit to improve the performance of an application. A reconfigurable cache takes less area than a cache and a function unit together and imposes no time overhead. To convert a cache memory into a function unit, we map multi-bit output look-up tables (LUTs) into the cache structure. The cache can therefore perform computations when it is reconfigured as a function unit.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2003

Timing Issues of Operating Mode Switch in High Performance Reconfigurable Architectures

Rama Sangireddy; Huesung Kim; Arun K. Somani

The concept of a reconfigurable coprocessor controlled by the general-purpose processor, with the coprocessor acting as a specialized functional unit, has evolved to accelerate applications requiring higher computing power. In the Adaptive Balanced Computing (ABC) architecture, a Reconfigurable Functional Cache (RFC) module is configured with a selected core function of the application whenever more computing resources are required. Initial results show that the ABC architecture provides speedups ranging from 1.04x to 5.0x depending on the application, and speedups in the range of 2.61x to 27.4x are observed for the core functions. This paper further explores the management of the RFC, studying the impact of various schemes for configuring a core function into the RFC module. This paper also gives a detailed analysis of the performance of the ABC architecture for various configuration schemes, including a study of how the percentage of the core function in an entire application affects the management of RFC modules.


International Conference on Computer Design | 2002

Adaptive balanced computing (ABC) microprocessor using reconfigurable functional caches (RFCs)

Huesung Kim; Arun K. Somani; Akhilesh Tyagi

A general-purpose computing processor performs a wide range of functions. Although the performance of general-purpose processors has been steadily increasing, certain classes of software, such as multimedia and digital signal processing (DSP) applications, demand ever more computing power. If the computing resources can be adapted to the needs of an application, better performance can be achieved. Adaptive Balanced Computing (ABC) performs a dynamic resource configuration of on-chip cache memory by converting the cache into a specialized computing unit. With a small amount of additional logic and a slightly modified microarchitecture, a part of the cache memory can be configured to perform specialized computations in a conventional processor. In this paper, we evaluate ABC using RFCs in various cache organizations to see the impact of resource reconfiguration. The simulations with multimedia and DSP applications show that the resource reconfiguration yields speedups ranging from 1.04X to 3.94X for overall applications and from 2.61X to 27.4X for the core computations.
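
The gap between core-computation speedups and whole-application speedups is what Amdahl's law predicts when only part of the run time is accelerated. The sketch below is not taken from the paper; it applies the standard formula under the assumptions that reconfiguration overhead is ignored and that the core function accounts for a fixed fraction of the original execution time. The function name `overall_speedup` and any fractions passed to it are hypothetical.

```python
def overall_speedup(core_fraction: float, core_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only the core is accelerated.

    core_fraction: share of the original run time spent in the core function (0..1).
    core_speedup:  factor by which the core function alone is accelerated (> 0).
    Reconfiguration overhead is ignored here, which is a simplifying assumption.
    """
    if not 0.0 <= core_fraction <= 1.0:
        raise ValueError("core_fraction must lie in [0, 1]")
    if core_speedup <= 0.0:
        raise ValueError("core_speedup must be positive")
    return 1.0 / ((1.0 - core_fraction) + core_fraction / core_speedup)


if __name__ == "__main__":
    # Hypothetical core fractions, paired with the largest core speedup quoted above.
    for fraction in (0.1, 0.5, 0.9):
        print(fraction, round(overall_speedup(fraction, 27.4), 2))
```

Under this model, an application that spends little time in the core function gains little overall even when the core itself is accelerated heavily, which is consistent with the low end of the reported range.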


Pacific Rim International Symposium on Dependable Computing | 1999

The effect of interconnect schemes on the dependability of a modular multi-processor system with shared resources

Frank M. G. Dörenberg; Huesung Kim; Arun K. Somani

AlliedSignal's Avionics & Lighting business unit is expanding the performance of its flight safety avionics by means of functional integration (added functionality enabled by exchanging information between traditionally stand-alone subsystems), as well as physical integration (sharing of system resources) and full dual redundancy. Major performance goals of this integrated modular architecture are a significant increase in system dispatchability and a reduction of the loss-of-function probability of individual functions. Success of this architectural migration depends on the scheme that is used to fully interconnect the various processing and input/output modules. Two of the considered interconnect schemes are discussed: a dual LAN and a dual-dual LAN. In both schemes, all modules can receive data from all LANs. In the former scheme, all system modules have time-multiplexed transmit privileges on both LANs. In the latter scheme (patent pending), the modules are grouped into two identical sets. The modules in a set can only transmit on two of the four LANs. Dependability of the system has been modeled and analyzed with the HIMAP tool for both schemes, and the results are presented.


Archive | 2001

Towards adaptive balanced computing (ABC) using reconfigurable functional caches (RFCs)

Huesung Kim; Arun K. Somani; Akhilesh Tyagi


Lecture Notes in Computer Science | 2003

Timing issues of operating mode switch in high performance reconfigurable architectures

Rama Sangireddy; Huesung Kim; Arun K. Somani
