Tim Todman | Researchain

Publication

Featured researches published by Tim Todman.

southern conference programmable logic | 2009

Cube: A 512-FPGA cluster

Oskar Mencer; Kuen Hung Tsoi; Stephen Craimer; Tim Todman; Wayne Luk; Ming Yee Wong; Philip Heng Wai Leong

Cube, a massively parallel FPGA-based platform, is presented. The machine is made from boards each containing 64 FPGA devices, and eight boards can be connected in a cube structure for a total of 512 FPGA devices. With high-bandwidth systolic inter-FPGA communication and a flexible programming scheme, the result is a low-power, high-density and scalable supercomputing machine suitable for various large-scale parallel applications. An RC4 key search engine was built as a demonstration application. In a fully implemented Cube, the engine can perform a full search of the 40-bit key space within 3 minutes, which is 359 times faster than a multi-threaded software implementation running on a 2.5GHz Intel Quad-Core Xeon processor.
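
To put the Cube abstract's figures in perspective, the following is a rough back-of-the-envelope calculation, not a result from the paper. It treats "within 3 minutes" as an upper bound on search time and assumes the key space is divided evenly across all 512 FPGAs, which the abstract does not state. All variable names are illustrative.

```python
# Figures quoted in the abstract.
KEY_BITS = 40            # RC4 key length searched
SEARCH_SECONDS = 3 * 60  # "within 3 minutes", treated here as an upper bound
NUM_FPGAS = 512          # fully implemented Cube
SPEEDUP = 359            # versus the multi-threaded software implementation

# Size of the key space and the aggregate rate needed to cover it in time.
keyspace = 2 ** KEY_BITS
cube_keys_per_second = keyspace / SEARCH_SECONDS

# Assumption: work is split evenly across devices (not stated in the abstract).
per_fpga_keys_per_second = cube_keys_per_second / NUM_FPGAS

# Implied software search time if the speedup is applied to the 3-minute bound.
software_hours = SPEEDUP * SEARCH_SECONDS / 3600

print(f"key space:           {keyspace:,} keys")
print(f"aggregate rate:      {cube_keys_per_second:.3e} keys/s (lower bound)")
print(f"per-FPGA rate:       {per_fpga_keys_per_second:.3e} keys/s (lower bound)")
print(f"implied software run: {software_hours:.2f} hours")
```

Under these assumptions the machine must test at least about 6.1 billion keys per second in aggregate, or roughly 12 million keys per second per FPGA, and the same search would take on the order of 18 hours in software.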

symposium on cloud computing | 2009

A high-level compilation toolchain for heterogeneous systems

Wayne Luk; José Gabriel F. Coutinho; Tim Todman; Yuet Ming Lam; William George Osborne; Kong Woei Susanto; Qiang Liu; W. S. Wong

This paper describes Harmonic, a toolchain that targets multiprocessor heterogeneous systems comprising different types of processing elements such as general-purpose processors (GPPs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs) from a high-level C program. The main goal of Harmonic is to improve an application by partitioning and optimising each part of the program, and selecting the most appropriate processing element in the system to execute each part. The core tools include a task transformation engine, a mapping selector, a data representation optimiser, and a hardware synthesiser. We also use the C language with source-annotations as intermediate representation for the toolchain, making it easier for users to understand and to control the compilation process.

design, automation, and test in europe | 2010

Combining optimizations in automated low power design

Qiang Liu; Tim Todman; Wayne Luk

Starting from sequential programs, we present an approach combining data reuse, multi-level MapReduce, and pipelining to automatically find the most power-efficient designs that meet speed and area constraints in the design space on Field-Programmable Gate Arrays (FPGAs). This combined approach enables trade-offs in power, speed and area: we show 63% reduction in power can be achieved with 27% increase in execution time. Compared to the sequential designs, our approach yields designs with up to 158 times reduction in execution time. Moreover, for a given execution time, our combined approach generates designs using up to 1.4 times less power than those produced by the same optimizations applied separately and can also find solutions missed by separating the optimizations.
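
The power and time trade-off quoted above can be turned into an energy estimate. This is a small illustrative calculation rather than a figure from the paper: it assumes the 63% figure refers to average power over the whole run and that both percentages describe the same pair of designs, so that energy is simply power multiplied by execution time.

```python
# Ratios relative to the baseline design, taken from the abstract.
power_ratio = 1 - 0.63   # 63% reduction in power
time_ratio = 1 + 0.27    # 27% increase in execution time

# Assumption: energy = average power x execution time.
energy_ratio = power_ratio * time_ratio
energy_saving_percent = (1 - energy_ratio) * 100

print(f"energy relative to baseline: {energy_ratio:.4f}")
print(f"energy saving: {energy_saving_percent:.1f}%")
```

Under these assumptions the slower design would still use a little under half the energy of the baseline, since the power saving outweighs the longer run time.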

field-programmable logic and applications | 2009

Optimising designs by combining model-based and pattern-based transformations

Qiang Liu; Tim Todman; José Gabriel F. Coutinho; Wayne Luk; George A. Constantinides

We present a methodology for optimising designs written in high-level descriptions, combining mathematical model-based transformations with syntax-driven pattern-matching transformations, showing how the two kinds of transformation can benefit each other. We evaluate this methodology by implementing an instance, combining a model-based transformation for data reuse with pattern-based transformations to improve its output. Results for three benchmarks show the implemented framework can improve system performance by up to 57 times.

field programmable logic and applications | 2014

Transparent insertion of latency-oblivious logic onto FPGAs

Eddie Hung; Tim Todman; Wayne Luk

We present an approach for inserting latency-oblivious functionality into pre-existing FPGA circuits transparently. To ensure transparency - that such modifications do not affect the design's maximum clock frequency - we insert any additional logic post place-and-route, using only the spare resources that were not consumed by the pre-existing circuit. The typical challenge with adding new functionality into existing circuits incrementally is that spare FPGA resources to host this functionality must be located close to the input signals that it requires, in order to minimise the impact of routing delays. In congested designs, however, such co-location is often not possible. We overcome this challenge by using flow techniques to pipeline and route signals from where they originate, potentially in a region of high resource congestion, into a region of low congestion capable of hosting new circuitry, at the expense of latency. We demonstrate and evaluate our approach by augmenting realistic designs with self-monitoring circuitry, which is not sensitive to latency. We report results on circuits operating over 200MHz and show that our insertions have no impact on timing, are 2-4 times faster than compile-time insertion, and incur only a small power overhead.

field-programmable technology | 2009

Automatic optimisation of MapReduce designs by geometric programming

Qiang Liu; Tim Todman; Wayne Luk; George A. Constantinides

Many important applications can be expressed using the MapReduce pattern, where a computation is decomposed into a Map phase on which each element of source data is independently operated, followed by a Reduce phase in which the mapped elements are combined with an associative operator. We develop an approach for compiling applications with the MapReduce pattern into parallel hardware. Using optimisation techniques based on geometric programming, we map the computation onto a resource-constrained architecture. Furthermore, we explore important variations of MapReduce, such as making the Reduce a linear structure rather than a tree structure. Results for four benchmarks show that our approach can improve system performance by up to 170 times compared to the initial designs.
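
The MapReduce pattern described in this abstract, including the choice between a tree-structured and a linear Reduce, can be sketched in software as follows. This is a minimal illustration of the pattern only, not the paper's hardware compilation flow or its geometric programming formulation. The function names `linear_reduce`, `tree_reduce`, and `map_reduce` are hypothetical, and the input is assumed to be non-empty.

```python
from functools import reduce
from operator import add


def linear_reduce(op, values):
    """Combine values left to right: a chain of len(values) - 1 operators."""
    return reduce(op, values)


def tree_reduce(op, values):
    """Combine neighbouring pairs level by level: about log2(n) levels deep."""
    level = list(values)
    while len(level) > 1:
        # Pair up neighbours; order is preserved, so op only needs to be
        # associative, not commutative.
        paired = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # odd element carries over to next level
        level = paired
    return level[0]


def map_reduce(map_fn, op, data, reducer=tree_reduce):
    """Apply map_fn to each element independently, then combine with op."""
    mapped = [map_fn(x) for x in data]
    return reducer(op, mapped)


def square(x):
    return x * x


data = [1, 2, 3, 4, 5]
sum_sq_tree = map_reduce(square, add, data, tree_reduce)
sum_sq_linear = map_reduce(square, add, data, linear_reduce)
print(sum_sq_tree, sum_sq_linear)
```

Both reducers give the same result for any associative operator. They differ in structure: the linear form is a single chain of operators, while the tree form has shorter depth and exposes more parallelism at each level, which is the kind of variation the abstract refers to.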

conference on current trends in theory and practice of informatics | 2009

Design Validation by Symbolic Simulation and Equivalence Checking: A Case Study in Memory Optimization for Image Manipulation

Kong Woei Susanto; Tim Todman; José Gabriel F. Coutinho; Wayne Luk

Design optimization exploration is a key element in finding an optimal resource utilization. The exploration process applies optimizations iteratively; after applying each optimization, the result has to be validated. The research challenge for formal verification is to develop an efficient design validation flow and increase the quality of the validation. In this paper, we propose an automated validation flow to check the functional equivalence of the source design and its optimized version. This approach is based on a symbolic simulation technique to obtain the design properties and automatically check them using an equivalence checker. The novelty of this approach includes the use of model simplification techniques, such as if-conversion and loop-conversion, and state encoding to ease validation analysis.

The Journal of Supercomputing | 2005

Customisable Hardware Compilation

Tim Todman; José Gabriel F. Coutinho; Wayne Luk

Hardware compilers for high-level languages are increasingly recognised to be the key to reducing the productivity gap for advanced circuit development in general, and for reconfigurable designs in particular. This paper explains how customisable frameworks for hardware compilation can enable rapid design exploration, and reusable and extensible hardware optimisation. It describes such a framework, based on a parallel imperative language, which supports multiple levels of design abstraction, transformational development, optimisation by compiler passes, and metalanguage facilities. Our approach has been used in producing designs for applications such as signal and image processing, with different trade-offs in performance and resource usage.

field-programmable custom computing machines | 2001

Reconfigurable Designs for Ray Tracing

Tim Todman; Wayne Luk

We describe a feasibility study into using reconfigurable hardware for real-time ray tracing. The study includes mapping time-consuming parts of the algorithm into hardware, and transforming the algorithm following a breadth-first approach to improve system performance when the host bus is slow. We also examine the application of runtime reconfiguration, and estimate the reconfigurable resources required for animating complex scenes.

International Journal of Reconfigurable Computing | 2014

Using statistical assertions to guide self-adaptive systems

Tim Todman; Stephan C. Stilkerich; Wayne Luk

Self-adaptive systems need to monitor themselves, to check their internal behaviour and design assumptions about runtime inputs and conditions. This kind of monitoring can include collecting statistics about the systems themselves, which can be computationally intensive (for detailed statistics) and hence time-consuming, with a possible negative impact on self-adaptive response time. To mitigate this limitation, we extend the technique of in-circuit runtime assertions to cover statistical assertions in hardware. The presented designs implement several statistical operators that can be exploited by self-adaptive systems; a novel optimization is developed for reducing the number of pairwise operators from O(N) to O(log (N)). To illustrate the practicability and industrial relevance of our proposed approach, we evaluate our designs, chosen from a class of possible application scenarios, for their resource usage and the tradeoffs between hardware and software implementations.
