Lars Wehmeyer

Technical University of Dortmund

Publication

Featured research published by Lars Wehmeyer.

design, automation, and test in europe | 2002

Assigning Program and Data Objects to Scratchpad for Energy Reduction

Stefan Steinke; Lars Wehmeyer; Bo-Sik Lee; Peter Marwedel

The number of embedded systems is increasing and a remarkable percentage is designed as mobile applications. For the latter, energy consumption is a limiting factor because of today's battery capacities. Besides the processor, memory accesses consume a high amount of energy. The use of additional, less power-hungry memories like caches or scratchpads is thus common. Caches incorporate the hardware control logic for moving data in and out automatically. On the other hand, this logic requires chip area and energy. A scratchpad memory is much more energy efficient, but there is a need for software control of its content. In this paper, an algorithm integrated into a compiler is presented which analyses the application and selects program and data parts which are placed into the scratchpad. Comparisons against a cache solution show remarkable advantages between 12% and 43% in energy consumption for designs of the same memory size.
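
As an illustration only, the selection step described in this abstract can be viewed as a 0/1 knapsack problem: choose the memory objects whose combined size fits the scratchpad and whose combined energy saving is largest. The sketch below is not the authors' algorithm; the function name `select_for_scratchpad`, the object names, and all sizes and savings are hypothetical.

```python
from typing import List, Tuple


def select_for_scratchpad(
    objects: List[Tuple[str, int, int]], capacity: int
) -> Tuple[int, List[str]]:
    """Pick objects to place in the scratchpad (0/1 knapsack, dynamic programming).

    objects:  (name, size_in_bytes, estimated_energy_saving) -- hypothetical inputs
    capacity: scratchpad size in bytes
    Returns (total_saving, names_of_selected_objects).
    """
    # best[c] holds the best (saving, names) achievable with at most c bytes.
    best: List[Tuple[int, List[str]]] = [(0, [])] * (capacity + 1)
    for name, size, saving in objects:
        # Walk capacities downwards so each object is chosen at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + saving
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + [name])
    return best[capacity]
```

Calling it with a handful of functions and arrays and a 256-byte scratchpad returns the subset with the highest estimated saving that still fits. A real compiler pass would derive sizes and savings from profiling and an energy model.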

design, automation, and test in europe | 2004

Cache-aware scratchpad allocation algorithm

Manish Verma; Lars Wehmeyer; Peter Marwedel

In the context of portable embedded systems, reducing energy is one of the prime objectives. Most high-end embedded microprocessors include onchip instruction and data caches, along with a small energy efficient scratchpad. Previous approaches for utilizing scratchpad did not consider caches and hence fail for the au courant architecture. In the presented work, we use the scratchpad for storing instructions and propose a generic cache aware scratchpad allocation (CASA) algorithm. We report an average reduction of 8-29% in instruction memory energy consumption compared to a previously published technique for benchmarks from the mediabench suite. The scratchpad in the presented architecture is similar to a preloaded loop cache. Comparing the energy consumption of our approach against preloaded loop caches, we report average energy savings of 20-44%.

international symposium on systems synthesis | 2002

Reducing energy consumption by dynamic copying of instructions onto onchip memory

Stefan Steinke; Nils Grunwald; Lars Wehmeyer; Rajeshwari Banakar; M. Balakrishnan; Peter Marwedel

The number of mobile embedded systems is increasing and all of them are limited in their uptime by their battery capacity. Several hardware changes have been introduced in recent years, but the steadily growing functionality still requires further energy reductions, e.g. through software optimizations. A significant amount of energy can be saved in the memory hierarchy, where most of the energy is consumed. In this paper, a new software technique is presented which supports the use of an onchip scratchpad memory by dynamically copying program parts into it. The set of selected program parts is determined with an optimal algorithm using integer linear programming. Experimental results show a reduction of the energy consumption by nearly 30%, a performance increase by 25% against a common cache system and energy improvements against a static approach of up to 38%.

design, automation, and test in europe | 2005

Influence of Memory Hierarchies on Predictability for Time Constrained Embedded Software

Lars Wehmeyer; Peter Marwedel

Safety-critical embedded systems having to meet real-time constraints are expected to be highly predictable in order to guarantee at design time that certain timing deadlines will always be met. This requirement usually prevents designers from utilizing caches due to their highly dynamic, thus hardly predictable, behavior. The integration of scratchpad memories represents an alternative approach which allows the system to benefit from a performance gain comparable to that of caches, while at the same time maintaining predictability. We compare the impact of scratchpad memories and caches on worst case execution time (WCET) analysis results. We show that caches, despite requiring complex techniques, can have a negative impact on the predicted WCET while the estimated WCET for scratchpad memories scales with the achieved performance gain at no extra analysis cost.

international symposium on computer architecture | 2004

Compiler-optimized usage of partitioned memories

Lars Wehmeyer; Urs Helmig; Peter Marwedel

In order to meet the requirements concerning both performance and energy consumption in embedded systems, new memory architectures are being introduced. Besides the well-known use of caches in the memory hierarchy, processor cores today also include small onchip memories called scratchpad memories whose usage is not controlled by hardware, but rather by the programmer or the compiler. Techniques for utilization of these scratchpads have been known for some time. Some new processors provide more than one scratchpad, making it necessary to enhance the workflow such that this complex memory architecture can be efficiently utilized. In this work, we present an energy model and an ILP formulation to optimally assign memory objects to different partitions of scratchpad memories at compile time, achieving energy savings of up to 22% compared to previous approaches.
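
To make the assignment problem in this abstract concrete, the following sketch places each memory object either in one of several scratchpad partitions or in main memory, minimising total access energy under per-partition capacity limits. It uses exhaustive search instead of an ILP solver and a deliberately simple energy model (accesses times energy per access); both are assumptions made for illustration, not the paper's formulation. The function name `assign_partitions` and all numbers are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple


def assign_partitions(
    objects: List[Tuple[str, int, int]],
    partitions: List[Tuple[int, float]],
    main_energy: float,
) -> Tuple[Optional[float], Dict[str, int]]:
    """Exhaustively assign objects to scratchpad partitions or main memory.

    objects:     (name, size_in_bytes, access_count) -- hypothetical inputs
    partitions:  (capacity_in_bytes, energy_per_access) for each partition
    main_energy: energy per access to main memory
    Returns (minimal_energy, {name: partition_index}); the index
    len(partitions) stands for main memory.
    """
    main = len(partitions)
    best_energy: Optional[float] = None
    best_choice: Tuple[int, ...] = ()
    # Try every mapping of objects to {partition 0..n-1, main memory}.
    for choice in product(range(main + 1), repeat=len(objects)):
        used = [0] * main
        energy = 0.0
        for (_, size, accesses), target in zip(objects, choice):
            if target < main:
                used[target] += size
                energy += accesses * partitions[target][1]
            else:
                energy += accesses * main_energy
        # Keep only mappings that respect every partition's capacity.
        fits = all(u <= cap for u, (cap, _) in zip(used, partitions))
        if fits and (best_energy is None or energy < best_energy):
            best_energy, best_choice = energy, choice
    return best_energy, dict(zip((o[0] for o in objects), best_choice))
```

The search space grows exponentially with the number of objects, so this is only practical for toy inputs. An ILP formulation expresses the same decision variables and capacity constraints for a solver.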

asia and south pacific design automation conference | 2004

Fast, predictable and low energy memory references through architecture-aware compilation

Peter Marwedel; Lars Wehmeyer; Manish Verma; Stefan Steinke; Urs Helmig

The design of future high-performance embedded systems is hampered by two problems: First, the required hardware needs more energy than is available from batteries. Second, current cache-based approaches for bridging the increasing speed gap between processors and memories cannot guarantee predictable real-time behavior. A contribution to solving both problems is made in this paper, which describes a comprehensive set of algorithms that can be applied at design time in order to maximally exploit scratch pad memories (SPMs). We show that both the energy consumption and the computed worst case execution time (WCET) can be reduced by up to 80% and 48%, respectively, by establishing a strong link between the memory architecture and the compiler.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2001

Analysis of the influence of register file size on energy consumption, code size, and execution time

Lars Wehmeyer; Manoj Kumar Jain; Stefan Steinke; Peter Marwedel; M. Balakrishnan

Interest in low-power embedded systems has increased considerably in the past few years. To produce low-power code and to allow an estimation of power consumption of software running on embedded systems, a power model was developed based on physical measurement using an evaluation board and integrated into a compiler and profiler. The compiler uses the power information to choose instruction sequences consuming less power, whereas the profiler gives information about the total power consumed during execution of the generated program. The used compiler is parameterized such that, e.g., the register file size may be changed. The resulting code is evaluated with respect to code size, performance, and power consumption for different register file sizes. The extracted information is especially useful during application analysis and architecture space exploration in application-specific integrated processor (ASIP) design. Our analysis gives the designer the ability to estimate the desirable register file size for an ASIP design. The size of the register file should be considered as a design parameter since it has a strong impact on the energy consumption of embedded systems.

Ninth International Symposium on Hardware/Software Codesign. CODES 2001 (IEEE Cat. No.01TH8571) | 2001

Evaluating register file size in ASIP design

Manoj Kumar Jain; Lars Wehmeyer; Stefan Steinke; Peter Marwedel; M. Balakrishnan

Interest in synthesis of Application Specific Instruction Set Processors or ASIPs has increased considerably and a number of methodologies have been proposed for ASIP design. A key step in ASIP synthesis involves deciding architectural features based on application requirements and constraints. In this paper we observe the effect of changing register file size on the performance as well as power and energy consumption. Detailed data is generated and analyzed for a number of application programs. Results indicate that choice of an appropriate number of registers has a significant impact on performance.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2006

Cache-Aware Scratchpad-Allocation Algorithms for Energy-Constrained Embedded Systems

Manish Verma; Lars Wehmeyer; Peter Marwedel

In the context of mobile embedded devices, reducing energy is one of the prime objectives. Memories are responsible for a significant percentage of a system's aggregate energy consumption. Consequently, novel memories as well as novel memory architectures are being designed to reduce the energy consumption. Caches and scratchpads are two contrasting memory architectures. The former relies on hardware logic while the latter relies on software for its utilization. To meet different requirements, most contemporary high-end embedded microprocessors include on-chip instruction and data caches along with a scratchpad. Previous approaches for utilizing scratchpad did not consider caches and hence fail for the contemporary high-end systems. Instructions are allocated onto the scratchpad, while taking into account the behavior of the instruction cache present in the system. The problem of scratchpad allocation is solved using a heuristic and also optimally using an integer linear programming formulation. An average reduction of 7% and 23% in processor cycles and instruction-memory energy, respectively, is reported when compared against a previously published technique. The average deviation between optimal and nonoptimal solutions was found to be less than 6% both in terms of processor cycles and energy. The scratchpad in the presented architecture is similar to a preloaded loop cache. Comparing the energy consumption of the presented approach against that of a preloaded loop cache, an average reduction of 9% and 29% in processor cycles and instruction-memory energy, respectively, is reported.

international conference on embedded computer systems architectures modeling and simulation | 2006

Compilation and simulation tool chain for memory aware energy optimizations

Manish Verma; Lars Wehmeyer; Robert Pyka; Peter Marwedel; Luca Benini

Memories are known to be the energy bottleneck of portable embedded devices. Numerous memory aware energy optimizations have been proposed. However, both the optimization and the validation are performed in an ad-hoc manner as a coherent optimizing compilation and simulation framework does not exist as yet. In this paper, we present such a framework for performing memory hierarchy aware energy optimization. Both the compiler and the simulator are configured from a single memory hierarchy description. Significant savings of up to 50% in the total energy dissipation are reported.