Richard L. Walke | Researchain

Publication

Featured researches published by Richard L. Walke.

asilomar conference on signals, systems and computers | 1999

Architectures for adaptive weight calculation on ASIC and FPGA

Richard L. Walke; R.W.M. Smith; Gaye Lightbody

We compare two parallel array architectures for adaptive weight calculation based on QR-decomposition by Givens rotations. We present FPGA implementations of both architectures and compare them with an ASIC-based solution. The throughput of the FPGA implementations is of the order of 5-20 GigaFLOPS, making FPGA a viable alternative to ASIC implementation in applications where power consumption and volume cost are not critical.
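
As a rough illustration of the technique named in this abstract, the sketch below rotates data rows one at a time into an upper-triangular matrix using Givens rotations. It is a plain software sketch, not the authors' array architecture: the function names (`givens`, `qr_update`, `triangularize`) are hypothetical, and the "boundary" and "internal" labels in the comments only indicate which step would correspond to which cell type in a systolic array.

```python
import math

def givens(a, b):
    """Return (c, s) so that applying [[c, s], [-s, c]] to (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

def qr_update(R, x):
    """Rotate one new data row x into the upper-triangular matrix R (in place)."""
    n = len(R)
    x = list(x)
    for i in range(n):
        # Boundary-cell step: derive the rotation that zeroes x[i].
        c, s = givens(R[i][i], x[i])
        # Internal-cell steps: apply that rotation along the rest of the row.
        for j in range(i, n):
            rij, xj = R[i][j], x[j]
            R[i][j] = c * rij + s * xj
            x[j] = -s * rij + c * xj
    return R

def triangularize(rows, n):
    """Build R from scratch by absorbing every row of the data matrix."""
    R = [[0.0] * n for _ in range(n)]
    for row in rows:
        qr_update(R, row)
    return R

R = triangularize([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], 2)
```

Because every step is an orthogonal rotation, the resulting `R` satisfies R^T R = A^T A for the data matrix A, which is the property that adaptive weight calculation relies on.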

Design Automation for Embedded Systems | 2002

Compilation From Matlab to Process Networks Realized in FPGA

Tim Harriss; Richard L. Walke; Bart Kienhuis; Ed F. Deprettere

Compaan is a software tool that is capable of automatically translating nested loop programs, written in Matlab, into parallel process network descriptions suitable for implementation in hardware. In this article, we show a methodology and tool to convert these process networks into FPGA implementations. We will show that we can in principle obtain high performing realizations in a fraction of the design time currently employed to realize a parameterized implementation. This allows us to rapidly explore a range of transformations, such as loop unrolling and skewing, to generate a circuit that meets the requirements of a particular application. The QR decomposition algorithm is used to demonstrate the capability of the tool. We present results showing how the number of clock cycles and calculations-per-second vary with these transformations using a simple implementation of the function units. We also provide an indication of what we expect to achieve in the near future once the tools are completed and the transformations are applied to parallel, highly pipelined implementations of the function units.

IEEE Transactions on Very Large Scale Integration Systems | 2003

Design of a parameterizable silicon intellectual property core for QR-based RLS filtering

Gaye Lightbody; Roger F. Woods; Richard L. Walke

The availability of an intellectual property core for recursive least squares (RLS) filtering could enable the RLS algorithm to replace the least mean squares algorithm in a wide range of applications. The goal of this study is to develop a parameterizable generic architecture for RLS filtering in the form of a hardware description language (HDL) description, which can be used to generate highly efficient silicon layout. The key issue is to develop a family of circuit architectures that are 100% efficient and locally connected. This paper presents a generic mapping for RLS filtering and circuit architectures that can be mapped to a range of application requirements. It outlines the transition from array to architecture covering detailed design issues such as timing and control generation. The result is a family of QR designs, which are parameterized in terms of architecture size, wordlength, performance, and arithmetic processor timing.

signal processing systems | 2000

Linear QR Architecture for a Single Chip Adaptive Beamformer

Gaye Lightbody; Richard L. Walke; Roger F. Woods; John V. McCanny

This paper presents the design of a novel single chip adaptive beamformer capable of performing 50 Gflops (giga floating-point operations per second). The core processor is a QR array implemented on a fully efficient linear systolic architecture, derived using a mapping that allows individual processors for boundary and internal cell operations. In addition, the paper highlights a number of rapid design techniques that have been used to realise this system. These include an architecture synthesis tool for quickly developing the circuit architecture and the utilisation of a library of parameterisable silicon intellectual property (IP) cores to rapidly develop detailed silicon designs.

conference on advanced signal processing algorithms architectures and implementations | 2000

20 GFLOPS QR processor on a Xilinx Virtex-E FPGA

Richard L. Walke; Robert W. M. Smith; Gaye Lightbody

Adaptive beamforming can play an important role in sensor array systems in countering directional interference. In high-sample rate systems, such as radar and comms, the calculation of adaptive weights is a computationally intensive task that requires highly parallel solutions. For systems where low power consumption and volume are important the only viable implementation is as an Application Specific Integrated Circuit (ASIC). However, the rapid advancement of Field Programmable Gate Array (FPGA) technology is enabling highly credible re-programmable solutions. In this paper we present the implementation of a scalable linear array processor for weight calculation using QR decomposition. We employ floating-point arithmetic with mantissa size optimized to the target application to minimize component size, and implement the components as relationally placed macros (RPMs) on Xilinx Virtex FPGAs to achieve predictable dense layout and high-speed operation. We present results that show that 20 GFLOPS of sustained computation on a single XCV3200E-8 Virtex-E FPGA is possible. We also describe the parameterized implementation of the floating-point operators and QR-processor, and the design methodology that enables us to rapidly generate complex FPGA implementations using the industry standard hardware description language VHDL.
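
The idea of sizing the mantissa to the application can be explored in software before committing to hardware. The sketch below is only an assumption about how one might model it: the helper `round_mantissa` is hypothetical, it rounds a Python float to a chosen number of mantissa bits, and it says nothing about the operator formats actually used in the paper.

```python
import math

def round_mantissa(value, bits):
    """Round value to `bits` mantissa bits, leaving the exponent unrestricted."""
    if value == 0.0:
        return 0.0
    m, e = math.frexp(value)          # value == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** bits
    return math.ldexp(round(m * scale) / scale, e)

# Relative error shrinks as mantissa bits are added.
for bits in (8, 12, 16):
    approx = round_mantissa(math.pi, bits)
    print(bits, approx, abs(approx - math.pi) / math.pi)
```

With this rounding the relative error is bounded by 2^-bits, which gives a simple way to check how few bits a given algorithm tolerates.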

international conference on acoustics speech and signal processing | 1999

Novel mapping of a linear QR architecture

Gaye Lightbody; Richard L. Walke; Roger F. Woods; John V. McCanny

This paper presents a novel architecture mapping technique which was essential in the design of a QR array which forms the core processor of a single chip adaptive beamforming system. The mapping technique assigns a QR triangular array of $2m^2+3m+1$ cells down onto a linear architecture of $m+1$ processors. The mapping results in a linear systolic architecture with one hundred percent hardware utilisation, local interconnects and individual processors for boundary and internal cell operations. In addition, this paper highlights the effect latency has on the validity of the linear architecture.
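
The cell and processor counts quoted above are consistent with an even division of work: 2m^2 + 3m + 1 factorises as (2m + 1)(m + 1), so m + 1 processors could each take 2m + 1 cells. The short check below only verifies that arithmetic; the function names are hypothetical and the actual schedule used by the mapping is not described in the abstract.

```python
def triangular_cells(m):
    """Cell count quoted in the abstract for the QR triangular array."""
    return 2 * m * m + 3 * m + 1

def cells_per_processor(m):
    """Cells each of the m + 1 linear processors would handle if split evenly."""
    total = triangular_cells(m)
    processors = m + 1
    assert total % processors == 0   # the count divides with no remainder
    return total // processors

for m in (1, 2, 5, 10):
    print(m, triangular_cells(m), cells_per_processor(m))
```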

signal processing systems | 2006

Multidimensional DSP Core Synthesis for FPGA

John McAllister; Roger F. Woods; Richard L. Walke; Darren Gerard Reilly

Current rapid synthesis approaches for reusable dedicated hardware components (cores) for digital signal processing systems are ineffective since they fail to capture and exploit the manner in which the resulting components are used as part of a heterogeneous system. This leads to counter-productive core redesign for each use of the core. This paper presents a solution to this issue which combines a novel but intuitive system modeling technique with an associated core generation and integration methodology that generates reusable core architectures which may be optimised via algorithm level transformations. For an example design problem, these provide an effective rapid core synthesis and implementation exploration flow which allows a 3.9-fold throughput increase with no extra hardware expense.

IEEE Transactions on Circuits and Systems Ii: Analog and Digital Signal Processing | 2003

Generic SoC QR array processor for adaptive beamforming

Zhaohui Liu; John V. McCanny; Gaye Lightbody; Richard L. Walke

A generic architecture for implementing a QR array processor in silicon is presented. This improves on previous research by considerably simplifying the derivation of timing schedules for a QR system implemented as a folded linear array, where account has to be taken of processor cell latency and timing at the detailed circuit level. The architecture and scheduling derived have been used to create a generator for the rapid design of System-on-a-Chip (SoC) cores for QR decomposition. This is demonstrated through the design of a single-chip architecture for implementing an adaptive beamformer for radar applications.

asilomar conference on signals, systems and computers | 2001

Compilation from Matlab to process networks realised in FPGA

Tim Harriss; Richard L. Walke; Bart Kienhuis; Ed F. Deprettere

Compaan is a software tool capable of automatically translating nested loop programs, written in Matlab, into parallel Kahn process network descriptions suitable for implementation in hardware. In this paper we present a tool for converting these process networks into FPGA implementations. The QR decomposition algorithm is used to demonstrate the capability of the tool to quickly generate high performance parallel implementations. This allows us to rapidly explore a range of transformations, such as loop unrolling and skewing, to generate a circuit that meets the requirements of a particular application. We present results showing how the control logic complexity and number of clock cycles vary with these transformations.

IEEE Transactions on Signal Processing | 2000

Online CORDIC algorithm and VLSI architecture for implementing QR-array processors

Robert Hamill; John V. McCanny; Richard L. Walke

A novel most-significant-digit-first CORDIC architecture is presented that is suitable for the VLSI design of systolic array processor cells for performing QR decomposition. This is based on an online CORDIC algorithm with a constant scale factor and a latency independent of the wordlength. This has been derived through the extension of previously published CORDIC algorithms. It is shown that simplifying the calculation of convergence bounds also greatly simplifies the derivation of suitable VLSI architectures. Design studies, based on a 0.35-μ CMOS standard cell process, indicate that 20 such QR processor cells operating at rates suitable for radar beamforming can be readily accommodated on a single chip.
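
For readers unfamiliar with CORDIC, the sketch below shows the conventional vectoring mode, which drives the y component of a vector to zero using only shift-and-add style steps. It is not the online, most-significant-digit-first algorithm of the paper; the function name `cordic_vectoring` and the use of floating-point values in place of fixed-point shifts are assumptions made for illustration.

```python
import math

def cordic_vectoring(x, y, iterations=32):
    """Rotate (x, y) onto the positive x-axis; return (magnitude, angle).

    Conventional CORDIC, assuming x > 0. Every iteration rotates, so the
    accumulated gain is a constant that depends only on the iteration count.
    """
    angle = 0.0
    gain = 1.0
    for i in range(iterations):
        d = -1.0 if y > 0 else 1.0        # choose the direction that shrinks |y|
        t = 2.0 ** -i                      # a right shift by i bits in hardware
        x, y = x - d * y * t, y + d * x * t
        angle -= d * math.atan(t)          # table lookup in hardware
        gain *= math.sqrt(1.0 + t * t)
    return x / gain, angle

magnitude, angle = cordic_vectoring(3.0, 4.0)
print(magnitude, angle)
```

In a QR array, a vectoring step of this kind plays the role of the boundary cell, which finds the rotation angle that annihilates an incoming element.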
