Kyungtae Han | Researchain

Publication

Featured research published by Kyungtae Han.

EURASIP Journal on Advances in Signal Processing | 2006

Optimum wordlength search using sensitivity information

Kyungtae Han; Brian L. Evans

Many digital signal processing algorithms are first developed in floating point and later converted into fixed point for digital hardware implementation. For complex designs, this conversion may take more than 50% of the design time, and optimum wordlengths are found by trading off hardware complexity against arithmetic precision at system outputs. We propose a fast algorithm for searching for an optimum wordlength. This algorithm uses sensitivity information of hardware complexity and system output error with respect to the signal wordlengths, while other approaches use only one of the two sensitivities. This paper presents various optimization methods and compares sensitivity search methods. Wordlength design case studies for a wireless demodulator show that the proposed method can find an optimum solution in one-fourth of the time that the local search method takes. In addition, the optimum wordlength found by the proposed method yields 30% lower hardware implementation costs than the sequential search method in wireless demodulators. Case studies demonstrate that the proposed method is robust when searching for the optimum wordlength in a nonconvex space.

International Conference on Acoustics, Speech, and Signal Processing | 2004

Wordlength optimization with complexity-and-distortion measure and its application to broadband wireless demodulator design

Kyungtae Han; Brian L. Evans

Many digital signal processing algorithms are first developed in floating point and later mapped into fixed point for digital hardware implementation. During this mapping, wordlengths are searched to minimize total hardware cost and maximize system performance. Complexity and distortion measures have been separately researched for optimum wordlength selection. This paper proposes a complexity-and-distortion measure (CDM) method that combines these two measures. The CDM method trades off these two measures using a weighting factor. The proposed method is applied to wordlength design of a fixed broadband wireless demodulator. For this case study, the proposed method finds the optimal solution in one-third the time that exhaustive search takes. The contributions of this paper are: (1) a generalization of search methods based on complexity or distortion measures; (2) a framework of automatic wordlength optimization; and (3) a wireless demodulator case study.
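The weighted trade-off described above can be sketched as follows. This is a minimal illustration, not the paper's formulation: the convex combination in `cdm`, the toy models `toy_complexity` and `toy_distortion`, and the candidate range are all assumptions introduced here for demonstration.

```python
def cdm(complexity: float, distortion: float, weight: float) -> float:
    """Combine normalized complexity and distortion with a weighting factor.

    Assumed form: a convex combination, with weight in [0, 1].
    weight = 1 considers complexity only; weight = 0 considers distortion only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * complexity + (1.0 - weight) * distortion


def toy_complexity(wordlength: int, w_max: int = 16) -> float:
    """Hypothetical cost model: hardware cost grows linearly with wordlength."""
    return wordlength / w_max


def toy_distortion(wordlength: int, w_min: int = 4) -> float:
    """Hypothetical error model: noise power drops 4x per extra bit,
    normalized to 1.0 at the smallest wordlength."""
    return 2.0 ** (-2 * (wordlength - w_min))


def best_wordlength(weight: float, candidates=range(4, 17)) -> int:
    """Pick the candidate wordlength that minimizes the combined measure."""
    return min(
        candidates,
        key=lambda w: cdm(toy_complexity(w), toy_distortion(w), weight),
    )


if __name__ == "__main__":
    for weight in (0.0, 0.5, 1.0):
        print(weight, best_wordlength(weight))
```

Sweeping the weight from 0 to 1 moves the selected wordlength from the most precise candidate toward the cheapest one, which is the trade-off the weighting factor controls.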

International Symposium on Circuits and Systems | 2001

Numerical word-length optimization for CDMA demodulator

Kyungtae Han; Iksu Eo; Kyungsu Kim; Hanjin Cho

This paper presents search methods to optimize word-length for digital systems. Finding the word-length is tedious when there are many variables to optimize. We propose sequential and preplanned searches to find the optimum word-length and compare them in terms of the number of trials. A comparison at a given optimized point is also evaluated. We apply the searches to word-length optimization for a CDMA demodulator whose requirement is a frame error rate (FER) of 0.03. Our results show that the sequential and preplanned searches reduce the number of trials by 64% and 89%, respectively, compared to a full search when optimizing the word-length for the CDMA demodulator design.

International Conference on Computer Design | 2009

Using checksum to reduce power consumption of display systems for low-motion content

Kyungtae Han; Zhen Fang; Paul S. Diefenbaugh; Richard A. Forand; Ravi R. Iyer; Donald Newell

Power consumption of the display subsystem has been a relatively less explored area compared to other components of a mobile device, including computing, storage, and networking units, although the display often constitutes one of the most power-hungry portions of the system. Typical applications on a mobile device, such as web browsing and text editing, tend to have rather static image content; each frame hardly changes from the previous one. Efficiently detecting and handling no-motion scenarios is thus critical to extending battery life. This paper focuses on image change detection. We propose to use a checksum to detect image changes. Specifically, CRC hardware is used to optimize the power consumption of 1) refresh of a local display and 2) data compression for wireless remote display. Compared with a traditional pixel-by-pixel comparison approach, using a checksum for image change detection is not only fast but also reduces accesses to the frame buffer, resulting in significant power savings. We have built an FPGA prototype to verify that CRC can capture image changes well enough to ensure a “visually lossless” quality.
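The checksum idea above can be illustrated in software. The paper uses CRC hardware; the sketch below only mimics the logic with `zlib.crc32` from the Python standard library. The class name `FrameChangeDetector` and the frame representation as raw bytes are assumptions made for this example.

```python
import zlib


class FrameChangeDetector:
    """Detect frame changes by comparing CRC-32 checksums.

    Only the previous checksum is stored, so the previous frame never has
    to be read back for a pixel-by-pixel comparison.
    """

    def __init__(self) -> None:
        self._last_crc = None  # no frame seen yet

    def changed(self, frame: bytes) -> bool:
        """Return True if this frame's checksum differs from the last one."""
        crc = zlib.crc32(frame)
        is_changed = crc != self._last_crc
        self._last_crc = crc
        return is_changed


if __name__ == "__main__":
    detector = FrameChangeDetector()
    static_frame = bytes(64)               # hypothetical all-black frame
    edited_frame = bytes([1]) + bytes(63)  # one bit differs
    print(detector.changed(static_frame))  # first frame counts as a change
    print(detector.changed(static_frame))  # identical frame
    print(detector.changed(edited_frame))  # modified frame
```

A checksum can in principle collide for two different frames, so a detector like this may occasionally miss a change; that is consistent with the abstract's framing of the result as "visually lossless" rather than exact.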

Signal Processing Systems | 2004

Data wordlength reduction for low-power signal processing software

Kyungtae Han; Brian L. Evans; Earl E. Swartzlander

Reducing power consumption prolongs battery life and increases integration. In digital CMOS designs, switching activity is closely connected with the total power consumption. Switching activity on programmable processors implementing linear filters, fast Fourier transforms, and other signal processing operations is dominated by the hardware multiplier. In this paper, we employ wordlength reduction of multiplicands to reduce switching activity in hardware multipliers using truncation and signed right shift methods. For 32-bit × 32-bit Wallace and radix-4 modified Booth multipliers, truncation by 16 bits achieves a 4:1 and 2:1 reduction, respectively, in switching activity, whereas signed right shift gives little or no reduction. The key contribution of this paper is the reduction of power consumption by altering multiplicands in software without any hardware modifications.
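The two operand transformations named above can be sketched in software as follows. This is an illustration under stated assumptions, not the paper's measurement setup: the helper names are hypothetical, and `toggles` counts bit flips between consecutive operand values only as a rough proxy for switching activity at the multiplier inputs. It does not model the internals of a Wallace or Booth multiplier.

```python
MASK32 = 0xFFFFFFFF  # operands are treated as 32-bit two's-complement patterns


def truncate(x: int, bits: int) -> int:
    """Zero the `bits` least significant bits of a 32-bit operand."""
    return x & ~((1 << bits) - 1) & MASK32


def signed_right_shift(x: int, bits: int) -> int:
    """Arithmetic right shift of a 32-bit pattern (sign bits fill the top)."""
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> bits) & MASK32


def toggles(values) -> int:
    """Count bit flips between consecutive values (input-toggle proxy)."""
    values = list(values)
    return sum(bin(a ^ b).count("1") for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    samples = [0x12345678, 0xFEDCBA98, 0x0F0F0F0F, 0x80000001]
    print("original :", toggles(samples))
    print("truncated:", toggles(truncate(s, 16) for s in samples))
    print("shifted  :", toggles(signed_right_shift(s, 16) for s in samples))
```

Truncation holds the low 16 bits at zero, so those bit positions never toggle. An arithmetic right shift instead fills the upper bits with copies of the sign bit, so those positions can still flip whenever the sign changes between operands.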

Asilomar Conference on Signals, Systems and Computers | 2006

Automatic Floating-Point to Fixed-Point Transformations

Kyungtae Han; Alex G. Olson; Brian L. Evans

Many digital signal processing and communication algorithms are first simulated using floating-point arithmetic and later transformed into fixed-point arithmetic to reduce implementation complexity. For the floating-point to fixed-point transformation, this paper describes two methods within an automated transformation environment. The first method, a gradient-based search for single-objective optimization with sensitivity information, provides a single solution, and can become trapped in local optima. The second method, a genetic algorithm for multi-objective optimization, provides a family of solutions that form a tradeoff curve for signal quality vs. implementation complexity. We provide case studies for an infinite impulse response filter. In the case study, implementation complexity is lookup table area for a field programmable gate array (FPGA) realization. We have made the transformation methods available in a software release on the Web.

International Conference on Computer Aided Design | 2015

A Polyhedral-based SystemC Modeling and Generation Framework for Effective Low-power Design Space Exploration

Wei Zuo; Warren Kemmerer; Jong Bin Lim; Louis-Noël Pouchet; Andrey Ayupov; Taemin Kim; Kyungtae Han; Deming Chen

With the prevalence of System-on-Chips there is a growing need for automation and acceleration of the design process. A classical approach is to take a C/C++ specification of the application, convert it to a SystemC (or equivalent) description of hardware implementing this application, and perform successive refinement of the description to improve various design metrics. In this work, we present an automated SystemC generation and design space exploration flow alleviating several productivity and design time issues encountered in the current design process. We first automatically convert a subset of C/C++, namely affine program regions, into a full SystemC description through polyhedral model-based techniques while performing powerful data locality and parallelism transformations. We then leverage key properties of affine computations to design a fast and accurate latency and power characterization flow. Using this flow, we build analytical models of power and performance that can effectively prune away a large amount of inferior design points very fast and generate Pareto-optimal solution points. Experimental results show that (1) our SystemC models can evaluate system performance and power that is only 0.57% and 5.04% away from gate-level evaluation results, respectively; (2) our latency and power analytical models are 3.24% and 5.31% away from the actual Pareto points generated by SystemC simulation, with 2091x faster design-space exploration time on average. The generated Pareto-optimal points provide effective low-power design solutions given different latency constraints.

International Conference on Computer Aided Design | 2015

Learning-Based Power Modeling of System-Level Black-Box IPs

Dongwook Lee; Taemin Kim; Kyungtae Han; Yatin Hoskote; Lizy Kurian John; Andreas Gerstlauer

Virtual platform prototypes are widely utilized to enable early system-level design space exploration. Accurate power models for hardware components at high levels of abstraction are needed to enable system-level power analysis and optimization. However, the limited observability of third-party IPs renders traditional power modeling methods challenging and inaccurate. In this paper, we present a novel approach for extending behavioral models of black-box hardware IPs with an accurate power estimate. We leverage state-of-the-art machine learning techniques to synthesize an abstract power model. Our model uses input and output history to track data-dependent pipeline behavior. Furthermore, we introduce a specialized ensemble learning approach that is composed of individually selected cycle-by-cycle models to reduce overall complexity and further increase estimation accuracy. Results of applying our approach to various industrial-strength design examples show that our models predict average power consumption to within 3% of a commercial gate-level power estimation tool, all while running several orders of magnitude faster.

Asilomar Conference on Signals, Systems and Computers | 2005

Low-Power Multipliers with Data Wordlength Reduction

Kyungtae Han; Brian L. Evans; Earl E. Swartzlander

Multiprecision multipliers reduce power consumption by selecting smaller multipliers (i.e., submultipliers) according to the wordsize of the input operands. However, arbitrary levels of bit precision are not achieved by multiprecision multipliers. Two proposed wordlength reduction techniques that reduce power consumption with arbitrary levels of bit precision are considered. Expectation values of bit switching activity for reduction in the signed right shift method and the truncation method are derived. The signed right shift method and the truncation method are applied to a 16-bit radix-4 modified Booth multiplier and a 16-bit Wallace multiplier. The truncation method with 8-bit operands reduces the power consumption by 56% in the Wallace multiplier and 31% in the Booth multiplier. The signed right shift method shows no power reduction in the Wallace multiplier and 25% power reduction in the Booth multiplier. Unequal levels of precision in operands show different power reduction values for the Booth multiplier. The non-recoded operand in the Booth multiplier with 8-bit reduction has 13% more sensitivity in power consumption than the recoded multiplicand.

International Symposium on Quality Electronic Design | 2016

Statistical quality modeling of approximate hardware

Seogoo Lee; Dongwook Lee; Kyungtae Han; Emily Shriver; Lizy Kurian John; Andreas Gerstlauer

Beyond traditional bit truncation, recently proposed arithmetic and logic approximations have enriched the quality versus energy design space for custom hardware kernels in signal processing and other error-tolerant applications. Systematic exploration of such trade-offs requires fast, accurate, and generic quality-energy models that can drive datapath optimizations. Existing quality estimation approaches, however, are either based on slow simulation or limited in supported approximation types and quality metrics. In this paper, we propose a novel semi-analytical quality model that can predict a wide range of statistical metrics for arbitrary hardware approximations with deterministic error behavior. Input and error dependencies are captured using one-time error-free simulation only. Combining our quality estimation with a faithful energy model considering both switching activity and voltage scaling, we provide a complete quality-energy optimization flow. Optimization results for FFT and IDCT benchmarks show that our approach is 28x faster than purely simulation-based exploration, and 2x faster than existing hybrid approaches, all while achieving comparable estimation accuracy.
