Steven W. K. Tjiang | Researchain

Publication

Featured research published by Steven W. K. Tjiang.

programming language design and implementation | 1995

Storage assignment to decrease code size

Stan Y. Liao; Srinivas Devadas; Kurt Keutzer; Steven W. K. Tjiang; Albert R. Wang

DSP architectures typically provide indirect addressing modes with auto-increment and decrement. In addition, indexing mode is not available, and there are usually few, if any, general-purpose registers. Hence, it is necessary to use address registers and perform address arithmetic to access automatic variables. Subsuming the address arithmetic into auto-increment and auto-decrement modes improves the size of the generated code. In this paper we present a formulation of the problem of optimal storage assignment such that explicit instructions for address arithmetic are minimized. We prove that for the case of a single address register the decision problem is NP-complete. We then generalize the problem to multiple address registers. For both cases heuristic algorithms are given. Our experimental results indicate an improvement of 3.
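
The effect of storage layout on address arithmetic can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes a single address register, treats a move between offsets that differ by at most one as free (auto-increment or auto-decrement), charges one explicit instruction for any larger move, and ignores the initial load of the register. The names `address_arithmetic_cost` and `best_layout`, and the access sequence, are hypothetical.

```python
from itertools import permutations


def address_arithmetic_cost(layout, accesses):
    """Count transitions that need an explicit address-register update.

    layout:   tuple of variable names in memory order (index = offset).
    accesses: sequence of variable names in the order they are accessed.
    Assumption: a step of at most one offset is free (auto-inc/dec);
    any longer step costs one explicit address-arithmetic instruction.
    """
    offset = {var: i for i, var in enumerate(layout)}
    cost = 0
    for prev, cur in zip(accesses, accesses[1:]):
        if abs(offset[prev] - offset[cur]) > 1:
            cost += 1
    return cost


def best_layout(variables, accesses):
    """Exhaustive search over layouts; only feasible for a few variables."""
    return min(permutations(variables),
               key=lambda layout: address_arithmetic_cost(layout, accesses))


# Hypothetical access sequence for four automatic variables.
accesses = ["a", "c", "a", "c", "b", "d", "b"]
naive = ("a", "b", "c", "d")
print(address_arithmetic_cost(naive, accesses))
print(best_layout(naive, accesses))
```

The exhaustive search is only there to make the objective concrete; since the abstract states that the single-register decision problem is NP-complete, realistic inputs call for heuristics.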

design automation conference | 1997

An efficient implementation of reactivity for modeling hardware in the scenic design environment

Stan Y. Liao; Steven W. K. Tjiang; Rajesh K. Gupta

Reactivity is one of the key features of hardware description languages. We present an efficient implementation of reactivity in the Scenic framework that allows the system designer to model hardware blocks. Scenic allows the designer to use C++ to model mixed hardware-software systems with a C++ compiler and a small library and without the need of a complex event-driven run-time kernel often found embedded in hardware description languages (HDL) such as VHDL and Verilog. Moreover, Scenic hardware descriptions can be easily mapped to HDL and synthesized into hardware implementations using commercially available tools. In this paper we present Scenic's implementation of concurrency (signals and processes) and reactivity (waiting and watching). When C++ is used as an HDL, context-switching overhead can become a significant performance issue during simulation. We introduce the notion of delayed expression objects, or lambdas, to reduce context-switching. Examples and experimental results are presented to show the utility and simulation efficiency using the Scenic framework.
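
One way to picture how a delayed expression can save context switches is the Python sketch below. It is an illustration under stated assumptions, not Scenic's C++ implementation: processes are modeled as generators, each wait hands the scheduler a zero-argument callable, and the scheduler evaluates that callable itself and resumes the process only when it holds. The names `simulate`, `consumer`, and the `ready` signal are hypothetical.

```python
def consumer(signals, log):
    """Hypothetical process: record each cycle in which 'ready' rises."""
    while True:
        # Hand the scheduler a delayed expression instead of polling.
        yield lambda: signals["ready"]
        log.append(signals["cycle"])
        # Wait for 'ready' to drop before looking for the next rise.
        yield lambda: not signals["ready"]


def simulate(processes, signals, stimulus, cycles):
    """Run the processes for a number of cycles; return the resume count.

    stimulus maps a cycle number to the signal updates applied in it.
    The scheduler evaluates each pending condition without entering the
    process, so a process is resumed only when its condition is true.
    """
    waiting = [(proc, next(proc)) for proc in processes]
    resumes = 0
    for cycle in range(cycles):
        signals["cycle"] = cycle
        signals.update(stimulus.get(cycle, {}))
        next_waiting = []
        for proc, condition in waiting:
            if condition():
                condition = next(proc)  # the only switch into the process
                resumes += 1
            next_waiting.append((proc, condition))
        waiting = next_waiting
    return resumes


signals = {"cycle": 0, "ready": False}
log = []
stimulus = {3: {"ready": True}, 4: {"ready": False}, 7: {"ready": True}}
resumes = simulate([consumer(signals, log)], signals, stimulus, cycles=10)
print(log, resumes)
```

In this run the process is entered three times over ten cycles, whereas a process that re-checks its condition on every cycle would be entered ten times.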

international conference on computer aided design | 1995

Instruction selection using binate covering for code size optimization

Stan Y. Liao; Srinivas Devadas; Kurt Keutzer; Steven W. K. Tjiang

We address the problem of instruction selection in code generation for embedded DSP microprocessors. Such processors have highly irregular data-paths, and conventional code generation methods typically result in inefficient code. Instruction selection can be formulated as directed acyclic graph (DAG) covering. Conventional methods for instruction selection use heuristics that break up the DAG into a forest of trees and then cover them independently. This breakup can result in suboptimal solutions for the original DAG. Alternatively, the DAG covering problem can be formulated as a binate covering problem, and solved exactly or heuristically using branch-and-bound methods. We show that optimal instruction selection on a DAG in the case of accumulator-based architectures requires a partial scheduling of nodes in the DAG, and we augment the binate covering formulation to minimize spills and reloads. We show how the irregular data transfer costs of typical DSP data-paths can be modeled in the binate covering formulation.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1998

Circuit optimization using carry-save-adder cells

Taewhan Kim; William Jao; Steven W. K. Tjiang

Carry-save-adder (CSA) is the most often used type of operation in implementing a fast computation of arithmetics of register-transfer-level design in industry. This paper establishes a relationship between the properties of arithmetic computations and several optimizing transformations using CSAs to derive consistently better qualities of results than those of manual implementations. In particular, we introduce two important concepts, operation duplication and operation split, which are the main driving techniques of our algorithm for achieving an extensive utilization of CSAs. Experimental results from a set of typical arithmetic computations found in industry designs indicate that automating CSA optimization with our algorithm produces designs with up to 53% faster timing and up to 42% smaller area.
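
For readers unfamiliar with the building block, the sketch below shows the standard 3:2 carry-save step on non-negative Python integers: three operands are reduced to a sum word and a carry word without propagating carries, so only the final addition needs a carry-propagate adder. It illustrates the CSA cell itself, not the paper's operation-duplication or operation-split transformations; the function names are hypothetical.

```python
def carry_save_add(a, b, c):
    """Reduce three non-negative integers to (sum_word, carry_word).

    Each bit position is handled independently: the sum bit is the XOR
    of the three inputs and the carry bit is their majority, shifted
    left by one. The invariant is a + b + c == sum_word + carry_word.
    """
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def sum_with_csa(operands):
    """Add many operands using CSA steps and one final ordinary addition."""
    pending = list(operands)
    while len(pending) > 2:
        a, b, c = pending.pop(), pending.pop(), pending.pop()
        pending.extend(carry_save_add(a, b, c))
    # Only this last step corresponds to a carry-propagate adder.
    return sum(pending)


print(sum_with_csa([13, 7, 22, 5, 9]))  # 56
```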

design automation conference | 1995

Code Optimization Techniques for Embedded DSP Microprocessors

Stan Y. Liao; Srinivas Devadas; Kurt Keutzer; Steven W. K. Tjiang; Albert R. Wang

We address the problem of code optimization for embedded DSP microprocessors. Such processors (e.g., those in the TMS320 series) have highly irregular datapaths, and conventional code generation methods typically result in inefficient code. In this paper we formulate and solve some optimization problems that arise in code generation for processors with irregular datapaths. In addition to instruction scheduling and register allocation, we also formulate the accumulator spilling and mode selection problems that arise in DSP microprocessors. We present optimal and heuristic algorithms that determine an instruction schedule simultaneously optimizing accumulator spilling and mode selection. Experimental results are presented.

design automation conference | 1998

Arithmetic optimization using carry-save-adders

Taewhan Kim; William Jao; Steven W. K. Tjiang

Carry-save-adder (CSA) is the most often used type of operation in implementing a fast computation of arithmetics of register-transfer level design in industry. This paper establishes a relationship between the properties of arithmetic computations and several optimizing transformations using CSAs to derive consistently better qualities of results than those of manual implementations. In particular, we introduce two important concepts, operation-duplication and operation-split, which are the main driving techniques of our algorithm for achieving an extensive utilization of CSAs. Experimental results from a set of typical arithmetic computations found in industry designs indicate that automating CSA optimization with our algorithm produces designs with significantly faster timing and less area.

Code Generation for Embedded Processors | 2002

Challenges in Code Generation for Embedded Processors

Guido Araujo; Srinivas Devadas; Kurt Keutzer; Stan Y. Liao; Sharad Malik; Ashok Sudarsanam; Steven W. K. Tjiang; Albert R. Wang

The emergence of integrated circuits in which both the program-ROM and the processor are integrated on a single die initiates a new era of problems for programming language compilers. In such a micro-architecture, code performance, and particularly code density, gain an unprecedented level of importance and new code-optimization algorithms will be required to supply the required code quality. This paper presents the first wave of a variety of new code-optimization approaches aimed at supplying the highest code quality possible.

ACM Transactions on Design Automation of Electronic Systems | 1998

A new viewpoint on code generation for directed acyclic graphs

Stan Y. Liao; Kurt Keutzer; Steven W. K. Tjiang; Srinivas Devadas

We present a new viewpoint on code generation for directed acyclic graphs (DAGs). Our formulation is based on binate covering, the problem of satisfying, with minimum cost, a set of disjunctive clauses, and can take into account commutativity of operators and of the machine model. An important contribution of this work is a set of necessary and sufficient conditions for a valid schedule to be derived, based on the notion of worms and worm-partitions. This set of conditions can be compactly expressed with clauses that relate scheduling to code selection. For the case of one-register machines, we can derive clauses that lead to generation of optimal code for the DAG. Recent advances in exact binate covering algorithms allow us to use this strategy to generate optimal code for large basic blocks. The optimal code generated by our algorithm results in significant reductions in overall code size.
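
The binate covering problem described above can be made concrete with a small brute-force Python sketch. It is not the exact or branch-and-bound solver the authors refer to, and the variables `m1` to `m4`, their costs, and the clauses are hypothetical: each variable stands for selecting one candidate instruction match, and a clause may contain negated literals, which is what makes the problem binate rather than unate.

```python
from itertools import product


def solve_binate_covering(costs, clauses):
    """Return (total_cost, assignment) of minimum cost, or None.

    costs:   dict mapping a variable to the cost of setting it True.
    clauses: list of clauses; a clause is a list of (variable, polarity)
             literals and is satisfied when some variable's value
             equals its polarity.
    Brute force over all assignments, so only usable for tiny instances.
    """
    variables = sorted(costs)
    best = None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        satisfied = all(
            any(assignment[var] == polarity for var, polarity in clause)
            for clause in clauses
        )
        if not satisfied:
            continue
        total = sum(costs[var] for var in variables if assignment[var])
        if best is None or total < best[0]:
            best = (total, assignment)
    return best


# Hypothetical instance: two nodes to cover, four candidate matches.
costs = {"m1": 1, "m2": 1, "m3": 3, "m4": 2}
clauses = [
    [("m1", True), ("m3", True)],   # node 1 is covered by m1 or m3
    [("m2", True), ("m3", True)],   # node 2 is covered by m2 or m3
    [("m2", False), ("m4", True)],  # choosing m2 requires m4 as well
]
print(solve_binate_covering(costs, clauses))
```

Without the third clause the cheapest cover would be `m1` plus `m2` at cost 2; the implication clause makes the single match `m3` at cost 3 the optimum, which shows how constraints between selections change the answer.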

Design Automation for Embedded Systems | 1998