Jack Whitham
University of York
Publication
Featured research published by Jack Whitham.
Embedded Software | 2009
Jack Whitham; Neil C. Audsley
Scratchpads have been widely proposed as an alternative to caches for embedded systems. Advantages of scratchpads include reduced energy consumption in comparison to a cache and access latencies that are independent of the preceding memory access pattern. The latter property makes memory accesses time-predictable, which is useful for hard real-time tasks as the worst-case execution time (WCET) must be safely estimated in order to check that the system will meet timing requirements. However, data must be explicitly moved between scratchpad and external memory as a task executes in order to make best use of the limited scratchpad space. When dynamic data is moved, issues such as pointer aliasing and pointer invalidation become problematic. Previous work has proposed solutions that are not suitable for hard real-time tasks because memory accesses are not time-predictable. This paper proposes the scratchpad memory management unit (SMMU) as an enhancement to scratchpad technology. The SMMU implements an alternative solution to the pointer aliasing and pointer invalidation problems which (1) does not require whole-program pointer analysis and (2) makes every memory access operation time-predictable. This allows WCET analysis to be applied to hard real-time tasks that use a scratchpad and dynamic data, but results are also applicable in the wider context of minimizing energy consumption or average execution time. Experiments using C software show that the combination of an SMMU and scratchpad compares favorably with both the best-case and worst-case performance of a conventional data cache.
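As a rough illustration of the open/close discipline that an SMMU enables, the sketch below pins each dynamic object into scratchpad before use and releases it afterwards, so every access has a bounded latency and no whole-program pointer analysis is required. This is a minimal sketch only: the names smmu_open and smmu_close, the node type, and the software stand-in definitions are hypothetical, not the paper's interface.

    /* Minimal sketch (hypothetical API, not the paper's).  On real SMMU
     * hardware, smmu_open would copy the object into scratchpad and set up
     * an address translation so existing pointers stay valid; smmu_close
     * would write it back.  Both are assumed to take bounded time. */
    #include <stddef.h>
    #include <stdint.h>

    typedef struct node { int32_t key; struct node *next; } node_t;

    /* Software stand-ins so the sketch compiles and runs. */
    static void *smmu_open(void *external_addr, size_t size) { (void)size; return external_addr; }
    static void  smmu_close(void *external_addr)             { (void)external_addr; }

    /* Sum a linked list of dynamic data, opening each node before use. */
    int32_t sum_list(node_t *head)
    {
        int32_t total = 0;
        node_t *p = head;
        while (p != NULL) {
            node_t *local = smmu_open(p, sizeof *local); /* bounded-latency copy-in  */
            total += local->key;
            node_t *next = local->next;
            smmu_close(p);                               /* bounded-latency write-back */
            p = next;
        }
        return total;
    }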
Real-Time Technology and Applications Symposium | 2010
Jack Whitham; Neil C. Audsley
A combination of a scratchpad and scratchpad memory management unit (SMMU) has been proposed as a way to implement fast and time-predictable memory access operations in programs that use dynamic data structures. A memory access operation is time-predictable if its execution time is known or bounded -- this is important within a hard real-time task so that the worst-case execution time (WCET) can be determined. However, the requirement for time-predictability does not remove the conventional requirement for efficiency: operations must be serviced as quickly as possible under worst-case conditions. This paper studies the capabilities of the SMMU when applied to a number of benchmark programs. A new allocation algorithm is proposed to dynamically manage the scratchpad space. In many cases, the SMMU vastly reduces the number of accesses to dynamic data structures stored in external memory along the worst-case execution path (WCEP). Across all the benchmarks, an average of 47% of accesses are rerouted to scratchpad, with nearly 100% for some programs. In previous scratchpad-based work, time-predictability could only be assured for these operations using external memory. The paper also examines situations in which the SMMU does not perform so well, and discusses how these could be addressed.
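To picture the allocation problem, a deliberately simple sketch is shown below: a bump-pointer scratchpad pool with a constant-time fits/does-not-fit decision. The names and sizes are hypothetical, and the paper's allocation algorithm is more sophisticated, targeting accesses on the worst-case execution path rather than allocating first-come first-served.

    /* Minimal sketch, assuming a fixed scratchpad pool managed with a
     * bump pointer and whole-pool reset (hypothetical, for illustration). */
    #include <stddef.h>
    #include <stdint.h>

    #define SPM_SIZE 4096u            /* assumed scratchpad capacity in bytes */

    static uint8_t spm_pool[SPM_SIZE];
    static size_t  spm_used = 0;

    /* Reserve 'size' bytes of scratchpad, or return NULL if it will not fit.
     * Both outcomes take constant time, so the decision is time-predictable;
     * a NULL result means the caller keeps using external memory. */
    void *spm_alloc(size_t size)
    {
        size = (size + 7u) & ~(size_t)7;   /* 8-byte alignment */
        if (SPM_SIZE - spm_used < size)
            return NULL;
        void *p = &spm_pool[spm_used];
        spm_used += size;
        return p;
    }

    /* Release everything, e.g. at the end of a task instance. */
    void spm_reset(void) { spm_used = 0; }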
ACM Transactions on Embedded Computing Systems | 2014
Jack Whitham; Neil C. Audsley; Robert I. Davis
This paper proposes Carousel, a mechanism to manage local memory space, i.e. cache or scratch pad memory (SPM), such that inter-task interference is completely eliminated. The cost of saving and restoring the local memory state across context switches is explicitly handled by the preempting task, rather than being imposed implicitly on preempted tasks. Unlike earlier attempts to eliminate inter-task interference, Carousel allows each task to use as much local memory space as it requires, permitting the approach to scale to large numbers of tasks. Carousel is experimentally evaluated using a simulator. We demonstrate that preemption has no effect on task execution times, and that the Carousel technique compares well to the conventional approach to handling interference, where worst-case interference costs are simply added to the worst-case execution times (WCETs) of lower-priority tasks.
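The central accounting idea, that the preempting task pays the cost of saving and restoring the preempted task's local-memory contents, can be sketched as two context-switch hooks executed in the preempting task's context. The types and function names below are hypothetical and serve only to make the accounting explicit; they are not the paper's mechanism.

    /* Minimal sketch (hypothetical names): both hooks run in the context
     * of the higher-priority task, so their copy costs are charged to the
     * preempting task's execution time, not to the preempted task. */
    #include <string.h>
    #include <stdint.h>
    #include <stddef.h>

    typedef struct task {
        uint8_t *local_base;     /* this task's region of local memory (SPM)        */
        size_t   local_size;     /* bytes of local memory the task is using          */
        uint8_t *backing_store;  /* where that region is parked in external memory   */
    } task_t;

    /* Called by the higher-priority task when it preempts. */
    void on_preempt(task_t *preempted)
    {
        memcpy(preempted->backing_store, preempted->local_base, preempted->local_size);
    }

    /* Called by the higher-priority task just before it completes and
     * hands the CPU back to the preempted task. */
    void on_resume(task_t *preempted)
    {
        memcpy(preempted->local_base, preempted->backing_store, preempted->local_size);
    }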
Real-Time Technology and Applications Symposium | 2012
Jack Whitham; Neil C. Audsley
Real-Time Systems Symposium | 2006
Jack Whitham; Neil C. Audsley
Real-time systems design involves many important choices, including that of the processor. The fastest processors achieve performance by utilizing architectural features that make them unpredictable, leading to difficulties in proving offline that application process deadlines will be met in the worst case. Utilizing slower, more predictable processors may not provide sufficient instruction throughput to execute all required application processes. This exposes a key trade-off in processor selection for real-time systems: predictability versus instruction throughput. This paper proposes MCGREP, a novel CPU architecture that combines predictability, high instruction throughput and flexibility. MCGREP is entirely microprogrammed, with multiple execution units. Basic operation involves implementation of a conventional set of CPU instructions in microcode; MCGREP then executes suitably compiled object code. Advanced operation allows the application to dynamically load new microcode, enabling new application-specific instructions to increase overall performance. MCGREP is implemented upon reconfigurable logic (FPGA), an increasingly important platform for embedded real-time systems. Custom microcode configurations for new instructions are generated from C source. MCGREP is shown to have performance comparable to two popular FPGA softcore CPUs (OpenRISC and Microblaze, the latter a commercial product). Flexibility is demonstrated by implementing an existing instruction set (OpenRISC) in microcode, with application-specific instructions to improve overall performance. As a further demonstration, predictable two-level interrupt and synchronization mechanisms are programmed in microcode.
Real-Time Technology and Applications Symposium | 2008
Jack Whitham; Neil C. Audsley
Instruction scratchpads have been previously suggested as a way to reduce the worst case execution time (WCET) of hard real-time programs without introducing the analysis issues posed by caches. Trace scratchpads extend this paradigm with support for instruction level parallelism (ILP) while preserving simplicity of WCET analysis. In this paper, we demonstrate trace scratchpads using the MCGREP-2 CPU architecture. We provide a sample algorithm to automatically reduce the WCET of a program using a trace scratchpad, and compare the results with the use of an instruction scratchpad. We find that the two types of scratchpad are best used together. Instruction scratchpads provide excellent WCET improvements at low cost, but trace scratchpads reduce WCET further by optimizing worst case (WC) paths and exploiting ILP across basic block boundaries. Using our experimental implementation, we have observed WCET improvements over an instruction scratchpad of up to 149% with some Mälardalen WCET benchmarks.
Real-Time Systems Symposium | 2012
Jack Whitham; Robert I. Davis; Neil C. Audsley; Sebastian Altmeyer; Claire Maiza
We present a multitasking scratchpad memory reuse scheme (MSRS) for the dynamic partitioning of scratchpad memory between tasks in a preemptive multitasking system. We specify a means to compute the worst-case response time (WCRT) and schedulability of task sets executed using MSRS. Our scratchpad-related preemption delay (SRPD) is an analog of cache-related preemption delay (CRPD), proposed in previous work as a way to compute the worst-case cost imposed upon a preempted task by preemption in a multitasking system. Unlike CRPD, however, SRPD is independent of the number of tasks and the local memory size. We compare SRPD with CRPD by experiment and determine that neither dominates the other, i.e. either may be better for certain task sets. However, MSRS leads to improved schedulability versus cache when contention for local memory space is high, either because the local memory size is small, or because the task set is large, provided that the cost of loading blocks from external memory to scratchpad is similar to the cost of loading blocks into cache.
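For context, a standard fixed-priority response-time recurrence with a per-preemption delay term has the form below. This generic form is shown only to make the role of SRPD/CRPD concrete; the paper's actual SRPD analysis differs in detail.

    R_i = C_i + \sum_{j \in hp(i)} \lceil R_i / T_j \rceil \, (C_j + \gamma_{i,j})

Here C_i is the WCET of task i, T_j the period of a higher-priority task j, hp(i) the set of tasks with priority higher than i, and \gamma_{i,j} the worst-case cost of one preemption, i.e. the SRPD (scratchpad) or CRPD (cache) term; the recurrence is iterated to a fixed point to obtain the worst-case response time R_i.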
IEEE Transactions on Computers | 2010
Jack Whitham; Neil C. Audsley
Superscalar out-of-order CPU designs can achieve higher performance than simpler in-order designs through exploitation of instruction-level parallelism in software. However, these CPU designs are often considered to be unsuitable for hard real-time systems because of the difficulty of guaranteeing the worst-case execution time (WCET) of software. This paper proposes and evaluates modifications for a superscalar out-of-order CPU core to allow instruction-level parallelism to be exploited without sacrificing time predictability and support for WCET analysis. Experiments using the M5 O3 CPU simulator show that WCETs can be two to four times smaller than those obtained using an idealized in-order CPU design, as instruction-level parallelism is exploited without compromising timing safety.
Java Technologies for Real-Time and Embedded Systems | 2009
Jack Whitham; Neil C. Audsley; Martin Schoeberl
This paper describes hardware methods, a lightweight and platform-independent scheme for linking real-time Java code to co-processors implemented using a hardware description language (HDL). Intended for use in embedded systems, hardware methods have similar semantics to the native methods used to interface Java code to legacy C/C++ software, but are also time-predictable, facilitating accurate worst-case execution time (WCET) analysis. By reference to several examples, the paper demonstrates the applicability of hardware methods and shows that they can (1) reduce the WCET of embedded real-time Java, and (2) improve the quality of WCET estimates in the presence of infeasible paths.
Euromicro Conference on Real-Time Systems | 2010
Jack Whitham; Neil C. Audsley
This paper shows that a program using a time-predictable memory system for data storage can achieve a similar worst-case execution time (WCET) to the average-case execution time (ACET) using a conventional heuristic-based memory system including a data cache. This result is useful within any embedded system where time-predictability and performance are both important, particularly hard real-time systems carrying out intensive data processing activities. It is a counter-example to the conventional wisdom that time-predictable means “slow” in comparison to ACET-focused heuristics. To carry out the investigation, 36 “memory access models” are derived from benchmark programs and assumed to be representative of typical code. The models generate LOAD/STORE instructions to exercise a data cache or scratchpad memory management unit (SMMU). The ACET is determined for the data cache and the WCET is determined for the SMMU. After improvements are applied, results show that the SMMU WCET is within 5% of the data cache ACET for 34 models. In 16 of 36 cases, the SMMU WCET is better than the data cache ACET.
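A memory access model of this kind can be thought of as a recorded sequence of LOAD/STORE references that is replayed against different memory-system models. The sketch below uses hypothetical record and callback types purely for illustration; it is not the paper's tooling.

    /* Minimal sketch of replaying a recorded access trace against a
     * pluggable memory-system model (hypothetical interface). */
    #include <stdint.h>
    #include <stddef.h>

    typedef enum { ACC_LOAD, ACC_STORE } access_kind_t;

    typedef struct {
        access_kind_t kind;
        uint32_t      address;   /* byte address referenced by the instruction */
    } access_t;

    /* One memory-system model: given an access, return its latency in cycles.
     * The same trace can be run against a data-cache model (to measure ACET)
     * or an SMMU/scratchpad model (to bound WCET). */
    typedef uint32_t (*mem_model_fn)(const access_t *acc, void *state);

    /* Replay a trace and accumulate total memory latency under a given model. */
    uint64_t replay(const access_t *trace, size_t n, mem_model_fn model, void *state)
    {
        uint64_t cycles = 0;
        for (size_t i = 0; i < n; i++)
            cycles += model(&trace[i], state);
        return cycles;
    }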