Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jörg Henkel is active.

Publications


Featured research published by Jörg Henkel.


IEEE Design & Test of Computers | 1993

Hardware-software cosynthesis for microcontrollers

Rolf Ernst; Jörg Henkel; Thomas Benner

The authors present a software-oriented approach to hardware-software partitioning that avoids restrictions on the software semantics, as well as an iterative partitioning process based on hardware extraction controlled by a cost function. This process is used in Cosyma, an experimental cosynthesis system for embedded controllers. As an example, the extraction of coprocessors for loops is demonstrated. Results are presented for several benchmark designs.
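
As a rough illustration of the iterative, cost-driven extraction loop described above, here is a minimal Python sketch; the block names, per-block estimates, and the gain-per-area cost function are illustrative assumptions, not Cosyma's actual cost model.

def extract_hardware(blocks, timing_gain, hw_area, area_budget):
    """Greedily move software blocks (e.g., loops) to hardware while the
    cost function improves and the area budget is respected."""
    in_hw, used_area = set(), 0
    while True:
        best, best_gain = None, 0.0
        for b in blocks:
            if b in in_hw or used_area + hw_area[b] > area_budget:
                continue
            gain = timing_gain[b] / hw_area[b]  # cycles saved per area unit
            if gain > best_gain:
                best, best_gain = b, gain
        if best is None:
            return in_hw  # no further profitable extraction possible
        in_hw.add(best)
        used_area += hw_area[best]

blocks = ["loop1", "loop2", "init"]
partition = extract_hardware(
    blocks,
    timing_gain={"loop1": 900, "loop2": 400, "init": 10},  # est. cycles saved
    hw_area={"loop1": 300, "loop2": 250, "init": 50},      # est. area units
    area_budget=400)
print(partition)  # extracts loop1, then init; loop2 would exceed the budget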


Design Automation Conference | 2013

Mapping on multi/many-core systems: survey of current and emerging trends

Amit Kumar Singh; Muhammad Shafique; Akash Kumar; Jörg Henkel

The reliance on multi/many-core systems to satisfy the high performance requirements of complex embedded software applications is increasing. This necessitates efficient mapping methodologies for such complex computing platforms. This paper provides an extensive survey and categorization of state-of-the-art mapping methodologies and highlights emerging trends for multi/many-core systems. The methodologies aim at optimizing system resource usage, performance, power consumption, temperature distribution, and reliability for varying application models. They perform design-time and run-time optimization for static and dynamic workload scenarios, respectively; these optimizations are necessary to fulfill end-user demands. A comparison of the methodologies based on their optimization aims is provided, and the trends they follow and open research challenges are discussed.
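
To make the design-time flavor of the surveyed methodologies concrete, here is a minimal Python sketch of one greedy mapping heuristic of the kind the survey categorizes: tasks are placed on a mesh so that heavily communicating tasks land on nearby cores. The task graph and weights are invented for illustration and do not come from the paper.

from itertools import product

# Toy task graph: (source task, destination task, communication volume).
comm = [("src", "filter", 80), ("filter", "sink", 60), ("src", "sink", 5)]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy_map(comm, mesh_w, mesh_h):
    free = list(product(range(mesh_w), range(mesh_h)))
    placement = {}
    # Place the most communication-heavy tasks first.
    tasks = sorted({t for s, d, _ in comm for t in (s, d)},
                   key=lambda t: -sum(w for s, d, w in comm if t in (s, d)))
    for t in tasks:
        def cost(core):
            # Weighted hop distance to already-placed communication partners.
            total = 0
            for s, d, w in comm:
                if s == t and d in placement:
                    total += w * manhattan(core, placement[d])
                if d == t and s in placement:
                    total += w * manhattan(core, placement[s])
            return total
        core = min(free, key=cost)
        placement[t] = core
        free.remove(core)
    return placement

print(greedy_map(comm, mesh_w=2, mesh_h=2))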


International Conference on VLSI Design | 2004

On-chip networks: a scalable, communication-centric embedded system design paradigm

Jörg Henkel; Wayne H. Wolf; Srimat T. Chakradhar

As chip complexity grows, a design productivity boost is expected from reusing large parts and blocks of previous designs, with the design effort largely invested in the new parts. More and more processor cores and large, reusable components are being integrated on a single silicon die, but reuse of the communication infrastructure has been difficult. Buses and point-to-point connections, today the main means of connecting components on a chip, will not result in a scalable platform architecture for the billion-transistor chip era. Buses can cost-efficiently connect a few tens of components; point-to-point connections between communication partners are practical for even fewer. As more and more components are integrated on a single silicon die, the performance bottlenecks of long, global wires preclude the reuse of buses. Therefore, scalable on-chip communication infrastructure is playing an increasingly dominant role in system-on-chip designs. With the super-abundance of cheap, function-specific IP cores, design effort will focus on the weakest link: efficient on-chip communication. Future on-chip communication infrastructure will overcome the limits of bus-based systems by providing higher bandwidth and higher flexibility and by solving the clock-skew problem on large chips. It may, however, present new problems: higher power consumption of the communication infrastructure and harder-to-predict performance patterns. Solutions to these problems may result in a complete overhaul of SOC design methodologies into a communication-centric design style. The anticipation of upcoming problems and possible benefits has led to intensified research in the field of what are called NoCs: networks on chips. The term NoC is used in a broad sense, encompassing the hardware communication infrastructure, the middleware and operating-system communication services, and a design methodology and tools to map applications onto a network on chip. This paper discusses trends in system-on-chip design, critiques problems and opportunities of the NoC paradigm, summarizes research activities, and outlines several directions for future research.
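
As generic background on how packets traverse such a network, the following Python sketch implements dimension-ordered (XY) routing, a standard deadlock-free routing scheme on 2D-mesh NoCs; it is textbook material, not a scheme proposed in this paper.

def xy_route(src, dst):
    """Return the hop-by-hop path from src to dst on a 2D mesh:
    route fully along X first, then along Y (deadlock-free on a mesh)."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 1)))
# [(0, 0), (1, 0), (2, 0), (2, 1)]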


Design Automation Conference | 2008

ADAM: run-time agent-based distributed application mapping for on-chip communication

M.A. Al Faruque; R. Krist; Jörg Henkel

Design-time decisions can often only cover certain scenarios and fail in efficiency when hard-to-predict system scenarios occur. This drives the development of run-time adaptive systems. To the best of our knowledge, we present the first scheme for run-time application mapping in a distributed manner using agents, targeting adaptive NoC-based heterogeneous multi-processor systems. Our approach reduces the overall traffic produced to collect the current state of the system (monitoring traffic), needed for run-time mapping, compared to a centralized mapping scheme. In our experiment, we obtain 10.7 times lower monitoring traffic compared to the centralized mapping scheme proposed in [8] for a 64×64 NoC. Our proposed scheme also requires fewer execution cycles than a non-clustered centralized approach. We achieve on average 7.1 times lower computational effort for the mapping algorithm compared to the simple nearest-neighbor (NN) heuristic proposed in (E. Carvalho et al., 2007) in a 64×32 NoC. We demonstrate the advantage of our scheme by means of a robot application and a set of multimedia applications and compare it to the state-of-the-art run-time mapping schemes proposed in (E. Carvalho et al., 2007), (C.-L. Chou and R. Marculescu, 2007), and (L. Smit et al., 2004).
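
The clustering idea behind the monitoring-traffic reduction can be sketched as follows: a cluster agent aggregates the state of its own cores, so a mapping request consults one agent per cluster rather than every core. The Python sketch below is a toy model; the cluster size and the load metric are illustrative assumptions, not ADAM's actual agent protocol.

class ClusterAgent:
    def __init__(self, cores):
        self.load = {c: 0 for c in cores}  # per-core utilization (toy metric)

    def best_free_core(self):
        core = min(self.load, key=self.load.get)
        return core, self.load[core]

def map_task(cluster_agents):
    """Ask each cluster agent for its least-loaded core and pick the best.
    Monitoring traffic scales with the number of clusters, not of cores."""
    agent = min(cluster_agents, key=lambda a: a.best_free_core()[1])
    core, _ = agent.best_free_core()
    agent.load[core] += 1  # account for the newly mapped task
    return core

# 16 cores grouped into 4 clusters of 4.
agents = [ClusterAgent(range(i, i + 4)) for i in range(0, 16, 4)]
print([map_task(agents) for _ in range(5)])  # tasks spread over free cores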


IEEE Transactions on Very Large Scale Integration (VLSI) Systems | 2002

System-level exploration for Pareto-optimal configurations in parameterized system-on-a-chip

Tony Givargis; Frank Vahid; Jörg Henkel

This paper provides a technique for efficiently exploring the configuration space of a parameterized system-on-a-chip (SOC) architecture to find all Pareto-optimal configurations. These configurations represent the range of meaningful power and performance tradeoffs obtainable by adjusting parameter values for a fixed application mapped onto the SOC architecture. The approach extensively prunes the potentially large configuration space by taking advantage of parameter dependencies. The authors have successfully incorporated the technique into the parameterized SOC tuning environment (Platune) and applied it to a number of applications.
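
The core of any such exploration is the Pareto filter itself, sketched below in Python: a (power, execution time) point survives only if no other configuration is at least as good in both objectives. Platune's dependency-based pruning of the configuration space itself is not reproduced here, and the sample points are invented.

def pareto_front(points):
    """points: list of (power, time) tuples; lower is better in both."""
    front = []
    for p in points:
        dominated = any(q != p and q[0] <= p[0] and q[1] <= p[1]
                        for q in points)
        if not dominated:
            front.append(p)
    return front

configs = [(1.0, 9.0), (2.0, 5.0), (3.0, 4.0), (2.5, 6.0), (4.0, 4.5)]
print(pareto_front(configs))  # [(1.0, 9.0), (2.0, 5.0), (3.0, 4.0)]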


Design Automation Conference | 2000

Code compression for low power embedded system design

Haris Lekatsas; Jörg Henkel; Wayne H. Wolf

We propose instruction code compression as an efficient method for reducing power in an embedded system. Our approach is the first to measure and optimize the power consumption of a complete SOC (system-on-a-chip) comprising a CPU, instruction cache, data cache, main memory, data buses, and address bus through code compression. We compare the pre-cache architecture (decompressor between main memory and cache) to a novel post-cache architecture (decompressor between cache and CPU). Our simulation and synthesis results show that our methodology yields large energy savings of between 22% and 82% compared to the same system without code compression. Furthermore, we demonstrate that the power savings come with reduced chip area and the same or even improved performance.
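
As an illustration of the kind of codec such a decompressor implements, here is a minimal Python sketch of dictionary-based instruction compression, one common scheme; the paper's actual encoding is not reproduced here, and the instruction words are invented.

from collections import Counter

def compress(instructions, dict_size=2):
    """Replace the most frequent instruction words with short dictionary
    indices; escape the rest as raw words."""
    table = [w for w, _ in Counter(instructions).most_common(dict_size)]
    encoded = [("idx", table.index(w)) if w in table else ("raw", w)
               for w in instructions]
    return table, encoded

def decompress(table, encoded):
    return [table[v] if kind == "idx" else v for kind, v in encoded]

code = [0x8FBF0014, 0x00000000, 0x8FBF0014, 0x03E00008, 0x8FBF0014]
table, enc = compress(code)
assert decompress(table, enc) == code
# A dictionary hit costs a short index plus a tag bit instead of a full
# 32-bit word, which shrinks memory traffic and hence bus and memory energy.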


IEEE Transactions on Very Large Scale Integration (VLSI) Systems | 2001

An approach to automated hardware/software partitioning using a flexible granularity that is driven by high-level estimation techniques

Jörg Henkel; Rolf Ernst

Hardware/software partitioning is a key issue in the design of embedded systems when performance constraints have to be met and chip area and/or power dissipation are critical. For that reason, diverse approaches to automatic hardware/software partitioning have been proposed since the early 1990s. In all approaches so far, the granularity during partitioning is fixed, i.e., either small system parts (e.g., basic blocks) or large system parts (e.g., whole functions/processes) can be swapped at once during partitioning in order to find the best hardware/software tradeoff. Since a fixed granularity is likely to result in suboptimal solutions, we present the first approach that features a flexible granularity during hardware/software partitioning. Our approach is comprehensive in that the estimation techniques that control partitioning, in particular the multigranularity performance estimation technique described here in detail, are adapted to the flexible partitioning granularity. In addition, we describe our multilevel objective function, which allows us to trade off various design constraints/goals (e.g., performance and hardware area) against each other. As a result, our approach is applicable to a wider range of applications than approaches with a fixed granularity. We also show that our approach is fast and that the obtained hardware/software partitions are much more efficient (in terms of hardware effort, for example) than in cases where a fixed granularity is deployed.
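
An objective function of this general shape can be sketched in a few lines of Python; the weights, penalty form, and candidate partitions below are illustrative assumptions, not the paper's exact multilevel formulation.

def objective(exec_time, hw_area, deadline, area_weight=1.0, penalty=1e6):
    """Rank feasible partitions by hardware area; push infeasible ones
    away with a large constraint-violation penalty."""
    violation = max(0.0, exec_time - deadline)
    return area_weight * hw_area + penalty * violation

# Candidate hardware/software partitions at different granularities:
# name -> (execution time in ms, hardware area in units)
candidates = {
    "whole function in HW": (0.80, 500),
    "inner loop in HW":     (0.95, 180),
    "all software":         (1.40, 0),
}
deadline = 1.0
best = min(candidates, key=lambda k: objective(*candidates[k], deadline))
print(best)  # "inner loop in HW": meets the deadline with the least area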


Design Automation Conference | 2013

Reliable on-chip systems in the nano-era: lessons learnt and future trends

Jörg Henkel; Lars Bauer; Nikil D. Dutt; Puneet Gupta; Sani R. Nassif; Muhammad Shafique; Mehdi Baradaran Tahoori; Norbert Wehn

Reliability concerns due to technology scaling have been a major focus of researchers and designers for several technology nodes. Many new techniques for enhancing and optimizing reliability have therefore emerged, particularly within the last five to ten years. This perspective paper introduces the most prominent reliability concerns from today's point of view and briefly recapitulates the progress in the community so far. The focus of this paper is on perspective trends, from the industrial as well as academic points of view, that suggest a way of coping with reliability challenges in upcoming technology nodes.


Design Automation Conference | 2014

The EDA Challenges in the Dark Silicon Era: Temperature, Reliability, and Variability Perspectives

Muhammad Shafique; Siddharth Garg; Jörg Henkel; Diana Marculescu

Technology scaling has resulted in smaller and faster transistors in successive technology generations. However, transistor power consumption no longer scales commensurately with integration density and, consequently, it is projected that in future technology nodes it will only be possible to simultaneously power on a fraction of the cores on a multi-core chip in order to stay within the power budget. The part of the chip that is powered off is referred to as dark silicon and brings new challenges as well as opportunities for the design community, particularly in the context of the interaction of dark silicon with thermal, reliability, and variability concerns. In this perspective paper, we describe these new challenges and opportunities and provide preliminary experimental evidence in their support.
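
The projection can be made concrete with a toy calculation: if per-core power shrinks more slowly than per-core area, the fraction of simultaneously powered cores must fall at each node. All numbers in the Python sketch below are invented for illustration and are not from the paper.

chip_power_budget = 100.0  # W, fixed by cooling and packaging

# (technology node, cores that fit on the die, power per active core in W)
nodes = [("32 nm", 16, 6.0), ("22 nm", 32, 4.5), ("14 nm", 64, 3.5)]

for name, cores, p_core in nodes:
    active = min(cores, int(chip_power_budget / p_core))
    dark = 1 - active / cores
    print(f"{name}: {active}/{cores} cores active, {dark:.0%} dark")
# 32 nm: 16/16 cores active, 0% dark
# 22 nm: 22/32 cores active, 31% dark
# 14 nm: 28/64 cores active, 56% dark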


International Conference on Computer-Aided Design | 2009

TAPE: thermal-aware agent-based power economy for multi/many-core architectures

Thomas Ebi; Mohammad Abdullah Al Faruque; Jörg Henkel

A growing challenge in embedded system design is coping with the increasing power densities that result from packing more and more transistors onto a small die area, which in turn transform into thermal hotspots. In the current late silicon era, silicon structures have become more susceptible to transient faults and aging effects resulting from these thermal hotspots. In this paper we present an agent-based power distribution approach (TAPE) that aims to balance the power consumption of a multi/many-core architecture in a proactive manner. By further taking the system's thermal state into consideration when distributing power throughout the chip, TAPE is able to noticeably reduce the peak temperature. In our simulations we provide a fair comparison with the state-of-the-art approaches HRTM [19] and PDTM [9] using the MiBench benchmark suite [18]. When running multiple applications simultaneously on a multi/many-core architecture, we achieve an 11.23% decrease in peak temperature compared to the approach that uses no thermal management [14]. At the same time, we reduce the execution time (i.e., we increase the performance of the applications) by 44.2% and reduce the energy consumption by 44.4% compared to PDTM [9]. We also show that our approach exhibits higher scalability, requiring 11.9 times less communication overhead in an architecture with 96 cores compared to the state-of-the-art approaches.
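
The "power economy" idea can be sketched as a simple trading round: each core's agent holds a share of the chip's power budget, and agents on hot cores hand part of their share to the coolest core. The threshold and trade size below are illustrative assumptions, not TAPE's actual negotiation protocol.

def trade_power(budgets, temps, hot=80.0, step=0.5):
    """One proactive trading round: every core above the hot threshold
    transfers `step` watts of its budget to the coolest core."""
    for core, temp in temps.items():
        if temp > hot and budgets[core] >= step:
            coolest = min(temps, key=temps.get)
            if coolest != core:
                budgets[core] -= step     # hot core throttles itself
                budgets[coolest] += step  # cool core may speed up
    return budgets

budgets = {0: 2.5, 1: 2.5, 2: 2.5, 3: 2.5}    # W per core
temps = {0: 85.0, 1: 62.0, 2: 90.0, 3: 70.0}  # degrees C
print(trade_power(budgets, temps))
# {0: 2.0, 1: 3.5, 2: 2.0, 3: 2.5}: power flows away from the hotspots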

Collaboration


Dive into Jörg Henkel's collaborations.

Top Co-Authors

Muhammad Shafique (Vienna University of Technology)
Lars Bauer (Karlsruhe Institute of Technology)
Semeen Rehman (Dresden University of Technology)
Hussam Amrouch (Karlsruhe Institute of Technology)
Santiago Pagani (Karlsruhe Institute of Technology)
Sergio Bampi (Universidade Federal do Rio Grande do Sul)
Sri Parameswaran (University of New South Wales)
Jian-Jia Chen (Technical University of Dortmund)
Muhammad Usman Karim Khan (Karlsruhe Institute of Technology)
Florian Kriebel (Karlsruhe Institute of Technology)