Newton Cheung
University of New South Wales
Publication
Featured research published by Newton Cheung.
Design, Automation, and Test in Europe | 2003
Newton Cheung; Jörg Henkel; Sri Parameswaran
We present a methodology that maximizes the performance of a Tensilica-based Application Specific Instruction-set Processor (ASIP) through instruction selection when an area constraint is given. Our approach rapidly selects from a set of pre-fabricated coprocessors/functional units and from our library of pre-designed specific instructions (to evaluate our technology we use the Tensilica platform). As a result, we significantly increase application performance while area constraints are satisfied. Our methodology uses a combination of simulation, estimation and a pre-characterised library of instructions to select the appropriate coprocessors and instructions. We report that by selecting the appropriate coprocessors/functional units and specific TIE instructions, the total execution time of complex applications (we study a voice encoder/decoder) can be reduced by up to 85% compared to the base implementation. The estimator used in our system typically takes less than a second per estimate, with an average error rate of 4% (as compared to full simulation, which takes 45 minutes). The total selection process using our methodology takes 3-4 hours, while a full design space exploration using simulation would take several days.
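Selecting instructions under an area constraint can be read as a budgeted selection problem. The sketch below is only an illustration of that idea, not the authors' algorithm: it treats the choice as a 0/1 knapsack and solves it by dynamic programming. The `Candidate` type, the instruction names, and all area and cycle figures are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str          # hypothetical instruction or coprocessor name
    area: int          # area cost in arbitrary integer units (assumed)
    cycles_saved: int  # estimated cycles saved if selected (assumed)


def select_under_area(candidates, area_budget):
    """Return (total cycles saved, chosen names) for a 0/1 knapsack."""
    # best[a] holds the best (saving, names) achievable with area <= a.
    best = [(0, [])] * (area_budget + 1)
    for cand in candidates:
        # Walk the budget downwards so each candidate is used at most once.
        for a in range(area_budget, cand.area - 1, -1):
            saved, chosen = best[a - cand.area]
            if saved + cand.cycles_saved > best[a][0]:
                best[a] = (saved + cand.cycles_saved, chosen + [cand.name])
    return best[area_budget]


# Made-up library entries, purely for illustration.
library = [
    Candidate("mac", 4, 50),
    Candidate("fir", 6, 70),
    Candidate("sat_add", 3, 30),
]
print(select_under_area(library, 9))
```

With an area budget of 9 units, the pair `fir` and `sat_add` fits exactly and saves more cycles than any other feasible combination in this toy library.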
International Conference on Computer Aided Design | 2003
Newton Cheung; Sri Parameswaran; Jörg Henkel
This paper presents the INSIDE system that rapidly searches the design space for extensible processors, given area and performance constraints of an embedded application, while minimizing the design turn-around-time. Our system consists of a) a methodology to determine which code segments are most suited for implementation as a set of extensible instructions, b) a heuristic algorithm to select pre-configured extensible processors as well as extensible instructions (library), and c) an estimation tool which rapidly estimates the performance of an application on a generated extensible processor. By selecting the right combination of a processor core plus extensible instructions, we achieve a performance increase on average of 2.03x (up to 7x) compared to the base processor core at a minimum hardware overhead of 25% on average.
Design, Automation, and Test in Europe | 2004
Newton Cheung; Sri Parameswaran; Jörg Henkel; Jeremy Chan
Designing custom-extensible instructions for extensible processors is a computationally complex task because of the large design space. The task of automatically matching candidate instructions in an application (e.g. written in a high-level language) to a pre-designed library of extensible instructions is especially challenging. Previous approaches have focused on identifying extensible instructions (e.g. through profiling), synthesizing extensible instructions, estimating expected performance gains, etc. In this paper we introduce our approach of automatically matching extensible instructions, as this key step is missing in automating the entire design flow of an ASIP with extensible instruction capabilities. Since matching using simulation is practically infeasible (simulation time), and traditional pattern matching approaches would not yield reliable results (functionally equivalent code can be represented in many different ways, which creates ambiguity), we adopt combinational equivalence checking. Our MINCE tool, part of our ASIP design flow, consists of a translator, a filtering algorithm and a combinational equivalence checking tool. We report matching times of extensible instructions that are 7.3x faster on average (using Mediabench applications) compared to the best known approaches to the problem (partial simulations). In all our experiments MINCE matched correctly, and the outcome of the matching step yielded an average application speedup of 2.47x. In summary, our work represents a key step towards automating the whole design flow of an ASIP with extensible instruction capabilities.
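The point about functionally equivalent code written in different ways can be made concrete with a toy equivalence check. The sketch below is not MINCE and does not use a real equivalence checker (production tools typically rely on BDD or SAT techniques); it simply enumerates every input combination at a small, assumed bit width and compares the outputs. The function names and the 4-bit width are hypothetical.

```python
from itertools import product


def equivalent(f, g, n_inputs, width=4):
    """Exhaustively compare f and g on all inputs of the given bit width."""
    mask = (1 << width) - 1  # keep only the low `width` bits of each result
    for args in product(range(1 << width), repeat=n_inputs):
        if (f(*args) & mask) != (g(*args) & mask):
            return False  # found a distinguishing input
    return True


# Two syntactically different forms of the same computation.
def code_segment(a, b):
    return (a + b) * 2


def library_instruction(a, b):
    return (a << 1) + (b << 1)


print(equivalent(code_segment, library_instruction, n_inputs=2))
```

A purely syntactic pattern matcher would treat these two forms as different, while the functional comparison reports them as equivalent. Exhaustive enumeration only scales to very narrow operands, which is why it serves here as an illustration only.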
International Conference on Computer Aided Design | 2004
Newton Cheung; Sri Parameswaran; Jörg Henkel
Designing extensible instructions is a computationally complex task, due to the large design space each instruction is exposed to. One method of speeding up the design cycle is to characterize instructions and estimate their peculiarities during a design exploration. In this paper, we study and derive three estimation models for extensible instructions: area overhead, latency, and power consumption under a wide range of customization parameters. System decomposition and regression analysis are used as the underlying methods to characterize and analyze extensible instructions. We verify our estimation models using automatically and manually generated extensible instructions, plus extensible instructions used in large real-world applications. The mean absolute errors of our estimation models are as small as 3.4% (6.7% max.) for area overhead, 5.9% (9.4% max.) for latency, and 4.2% (7.2% max.) for power consumption, compared to estimation through the time-consuming synthesis and simulation steps using commercial tools. Our estimation models achieve an average speedup of three orders of magnitude over the commercial tools and thus enable us to conduct a fast and extensive design space exploration that would otherwise not be possible. The estimation models are integrated into our extensible processor tool suite.
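As a rough illustration of how regression analysis can yield a fast estimator, the sketch below fits a one-variable least-squares line and reports its mean absolute percentage error. It is not the authors' model: the single predictor (a count of operators per instruction), the sample values, and the function names are all assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_abs_pct_error(xs, ys, slope, intercept):
    """Mean absolute error of the fitted line, as a percentage of y."""
    errors = [abs((slope * x + intercept) - y) / y for x, y in zip(xs, ys)]
    return 100 * sum(errors) / len(errors)


# Hypothetical characterisation data: operator count vs. measured area.
operator_counts = [1, 2, 3, 4]
measured_area = [5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(operator_counts, measured_area)
print(slope, intercept)
print(mean_abs_pct_error(operator_counts, measured_area, slope, intercept))
```

Once fitted, evaluating `slope * x + intercept` costs a single multiply and add, which is the reason a characterised model can be far quicker than a synthesis run. A realistic model would likely need several predictors per instruction component.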
Asia and South Pacific Design Automation Conference | 2005
Newton Cheung; Sri Parameswaran; Jörg Henkel
Automatic instruction generation is an efficient method to satisfy growing performance demands and meet design constraints for application specific instruction-set processors. A typical approach for instruction generation is to combine a large group of primitive instructions into a single extensible instruction for maximizing speedups. However, this approach often leads to large power dissipation and discharge current, posing a challenge to battery-powered products. In this paper, we propose a battery-aware automatic tool to design extensible instructions which minimizes power dissipation distribution by separating an instruction into multiple instructions. We verify our automatic tool using 50 different code segments and five large real-world applications. Our tool reduces energy consumption by a further 5.8% on average (up to 17.7%) compared to extensible instructions generated by previous approaches. For real-world applications, energy consumption is reduced by 6.6% on average (up to 16.53%), together with an increase in performance in most cases. The automatic instruction generation tool is integrated into our application specific instruction-set processor tool suite.
Customizable Embedded Processors: Design Technologies and Applications | 2007
Sri Parameswaran; Jörg Henkel; Newton Cheung
Creating a custom processor that is application-specific is an onerous task for a designer, who constantly has to ask whether the resulting design is optimal. Obtaining such an optimal design is an NP-hard problem, made more time consuming by the numerous combinations of available parts that make up the processor. This chapter presents two automatic methods to accelerate the process of designing ASIPs. The first is a formal method to match instructions that is both fast and accurate. The second is a way to model instructions so that alternate implementations of instructions can be evaluated rapidly before being synthesized. Both methods form part of a single design flow, which is described in the chapter. Numerous challenges remain to the rapid creation of ASIPs. These include taking power into consideration when selecting processor configurations and instructions, further reducing the time taken to match instructions by parallelizing matching algorithms, and modeling instructions in two separate steps, so that technology mapping is independently modeled, allowing models to be retargeted quickly as new standard cell libraries become available.
Archive | 2007
Jörg Henkel; Sri Parameswaran; Newton Cheung
Archive | 2007
Sri Parameswaran; Newton Cheung
Archive | 2005
Seng Lin Shee; Sri Parameswaran; Newton Cheung
CODES | 2005
Seng Lin Shee; Sri Parameswaran; Newton Cheung