Dennis Walter
Dresden University of Technology
Publication
Featured research published by Dennis Walter.
IEEE Transactions on Circuits and Systems II: Express Briefs | 2013
Sebastian Höppner; Stefan Haenzsche; Georg Ellguth; Dennis Walter; Holger Eisenreich; René Schüffny
This brief presents a bang-bang all-digital phase-locked loop (ADPLL) clock generator for multiprocessor system-on-chip applications in Globalfoundries 28-nm super-low-power CMOS technology. The circuit features a single-shot phase synchronization scheme for instantaneous phase lock after power-up. This feature is used for fast frequency search during lock-in, resulting in less than 1-μs initial lock time and the capability of instantaneous restart. The ADPLL provides a wide range of output clocks from 83 MHz to 2 GHz and exhibits 31-ps accumulated jitter with 3-ps period jitter at 2 GHz. It occupies an area of only 0.00234 mm² and consumes 0.64 mW from a 1.0-V supply.
International Solid-State Circuits Conference | 2014
Benedikt Noethen; Oliver Arnold; Esther P. Adeva; Tobias Seifert; Erik Fischer; Steffen Kunze; Emil Matus; Gerhard P. Fettweis; Holger Eisenreich; Georg Ellguth; Stephan Hartmann; Sebastian Höppner; Stefan Schiefer; Jens-Uwe Schlüßler; Stefan Scholze; Dennis Walter; René Schüffny
Modern mobile communication systems face conflicting design constraints. On the one hand, the expanding variety of transmission modes calls for highly flexible solutions supporting the ever-growing number and diversity of application requirements. On the other hand, stringent power restrictions (e.g., at femto base stations and terminals) must be considered, while satisfying the demanding performance requirements. In order to cope with these issues, existing SDR platforms, e.g., [1-2], propose an MPSoC with a heterogeneous array of processing elements (PEs). MPSoC solutions provide programmability and parallelism yielding flexibility, processing performance and power efficiency. To schedule the resources and to apply power gating, a static approach is employed. In contrast, we present a heterogeneous MPSoC platform (Tomahawk2) with runtime scheduling and fine-grained hierarchical power management. This solution can fully adapt to the dynamically varying workload and semi-deterministic behavior in modern concurrent wireless applications. The proposed dynamic scheduler (CoreManager, CM) can be implemented either in software on a general-purpose processor or on a dedicated application-specific hardware unit. It is evident that the software approach offers the highest degree of flexibility; however, it may become a performance bottleneck for complex applications. A high-throughput ASIC was presented in [3], but this solution does not permit scheduling algorithms to be adjusted. In this work, these limitations are overcome by implementing the CM on an ASIP.
IEEE Transactions on Very Large Scale Integration Systems | 2013
Sebastian Höppner; Holger Eisenreich; Stephan Henker; Dennis Walter; Georg Ellguth; René Schüffny
This paper presents an all-digital phase-locked loop (ADPLL) clock generator for globally asynchronous locally synchronous (GALS) multiprocessor systems-on-chip (MPSoCs). With its low power consumption of 2.7 mW and ultra-small chip area of 0.0078 mm², it can be instantiated per core for fine-grained power management such as DVFS. It is based on an ADPLL providing a multiphase clock signal from which core frequencies from 83 to 666 MHz with 50% duty cycle are generated by phase rotation and frequency division. The clock meets the specification for DDR2/DDR3 memory interfaces. Additionally, it provides a dedicated high-speed clock up to 4 GHz for serial network-on-chip data links. Core frequencies can be changed arbitrarily within one clock cycle for fast dynamic frequency scaling applications. The performance, including statistical analysis of mismatch, has been verified by a prototype in 65-nm CMOS technology.
International Solid-State Circuits Conference | 2012
Dennis Walter; Sebastian Höppner; Holger Eisenreich; Georg Ellguth; Stephan Henker; Stefan Hänzsche; René Schüffny; Markus Winter; Gerhard P. Fettweis
While continued scaling of feature sizes allows for an ever-increasing number of cores in modern MPSoCs, power reduction and meeting on-chip bandwidth requirements are pressing concerns. Energy efficiency can be increased by per-core dynamic voltage and frequency scaling (DVFS) and by employing a globally-asynchronous, locally-synchronous (GALS) system architecture in which distribution of a synchronous high-speed clock is not required. For global on-chip communication this presents major challenges due to the need for reliable data synchronization, high bandwidth requirements and speed-limiting RC effects on long wires. It has been shown recently that low-swing differential on-chip links provide highest bandwidth, low energy-per-bit and uninterrupted transfers over lengths up to 10 mm [1-3, 6]. Capacitively-driven links are promising because their built-in pre-emphasis counteracts the low-pass behavior of long on-chip wires [1, 4-5]. However, all of these existing implementations focus mainly on the transmission line itself. The capacitively-driven links are not able to forward a stoppable clock signal, as there is no well-defined differential DC level on the wires when there is no data or clock activity. In addition, clocking is either not reported [5] or fully synchronous, which means a high-speed clock must be distributed globally on-chip. This work provides a solution for capacitively-driven links with a parallel DC resistive divider to allow forwarded clocking with complete gating capability.
IEEE Journal of Solid-State Circuits | 2015
Sebastian Höppner; Dennis Walter; Thomas Hocker; Stephan Henker; Stefan Hänzsche; Daniel Sausner; Georg Ellguth; Jens-Uwe Schlüßler; Holger Eisenreich; René Schüffny
This paper presents a network-on-chip (NoC) SerDes transceiver architecture for long distance interconnects in the mm range within MPSoCs. Its source synchronous clocking scheme enables application in GALS systems and allows completely stoppable transceiver clocking for low idle power consumption. A capacitive line driver with combined resistive driver for well-defined DC swing is employed and analyzed in detail by simulation studies. It is shown that proper DC swing definition is mandatory for robust operation of long links at high data rates. Prototypes of the transceiver over 6 mm bufferless on-chip interconnect are implemented in both 65 nm and 28 nm CMOS technologies. The 65 nm realization achieves an efficiency of 173 fJ/bit/mm at 90 Gbit/s at 1.25 V and 93 fJ/bit/mm at 45 Gbit/s low speed mode at 0.9 V. The 28 nm realization achieves 81 fJ/bit/mm at 72 Gbit/s at 1.05 V and 64 fJ/bit/mm at 36 Gbit/s low speed mode at 0.95 V. The transceiver can be seamlessly integrated as a black-box point-to-point connection into heterogeneous MPSoC NoCs to enable ultra-compact top-level floorplan realization and increased energy efficiency. An example of a 20-core MPSoC in 65 nm CMOS technology with 10 serial NoC transceivers is presented.
International Symposium on System-on-Chip | 2010
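To put the quoted efficiency figures in perspective, the following is a rough back-of-the-envelope sketch in Python that converts energy per bit per millimetre into an estimated link power. It assumes that the quoted fJ/bit/mm figures apply uniformly over the full 6 mm interconnect and that the link runs continuously at the stated data rate; neither assumption is confirmed by the abstract. The function name link_power_mw and the table of operating points are illustrative only.

```python
def link_power_mw(energy_fj_per_bit_mm: float, rate_gbit_s: float, length_mm: float) -> float:
    """Estimate link power in mW.

    power = (energy per bit per mm) * (link length) * (bits per second)
    Assumption: the efficiency figure scales linearly with link length.
    """
    energy_j_per_bit = energy_fj_per_bit_mm * 1e-15 * length_mm
    power_w = energy_j_per_bit * rate_gbit_s * 1e9
    return power_w * 1e3


# Operating points quoted in the abstract: (label, fJ/bit/mm, Gbit/s).
OPERATING_POINTS = [
    ("65 nm, 1.25 V", 173.0, 90.0),
    ("65 nm, 0.9 V (low speed)", 93.0, 45.0),
    ("28 nm, 1.05 V", 81.0, 72.0),
    ("28 nm, 0.95 V (low speed)", 64.0, 36.0),
]

# Interconnect length of the prototypes, as stated in the abstract.
LINK_LENGTH_MM = 6.0

if __name__ == "__main__":
    for label, efficiency, rate in OPERATING_POINTS:
        power = link_power_mw(efficiency, rate, LINK_LENGTH_MM)
        print(f"{label}: about {power:.2f} mW")
```

Under these assumptions the estimates are on the order of tens of milliwatts per link; they are derived values, not results reported by the authors.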
Sebastian Höppner; Dennis Walter; Holger Eisenreich; René Schüffny
This paper analyzes high-speed source-synchronous network-on-chip data links in terms of yield loss due to delay variations. We show that statistical process variations can significantly reduce yield at high data rates and high bus widths. An on-chip delay calibration architecture for individual calibration of rise and fall delay times is proposed and analyzed on system level using Monte Carlo simulations. A sizing strategy for compensation delay elements is derived for yield maximization with low effort in terms of chip area and energy consumption.
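The effect described here, yield falling as data rate and bus width grow, can be illustrated with a minimal Monte Carlo sketch in Python. This is not the authors' model: it assumes independent Gaussian delay mismatch per lane relative to the forwarded clock, a pass criterion that every lane's skew stays within a fixed fraction of the unit interval, and no calibration. The function name estimate_yield and all parameter values (10 ps sigma, 25% margin) are hypothetical choices for illustration.

```python
import random


def estimate_yield(num_lanes: int, data_rate_bps: float,
                   sigma_s: float = 10e-12, margin_fraction: float = 0.25,
                   trials: int = 5000, seed: int = 1) -> float:
    """Estimate the fraction of links in which all lanes meet timing.

    Each trial draws one delay mismatch per lane from N(0, sigma_s).
    A link passes if every lane's absolute skew is within
    margin_fraction of the unit interval (1 / data rate).
    """
    rng = random.Random(seed)
    margin_s = margin_fraction / data_rate_bps
    passing = 0
    for _ in range(trials):
        if all(abs(rng.gauss(0.0, sigma_s)) <= margin_s for _ in range(num_lanes)):
            passing += 1
    return passing / trials


if __name__ == "__main__":
    for rate in (2e9, 8e9):
        for lanes in (8, 64):
            y = estimate_yield(lanes, rate)
            print(f"{rate / 1e9:.0f} Gbit/s per lane, {lanes} lanes: yield ~ {y:.3f}")
```

In this simplified model a per-lane calibration step would correspond to shrinking the effective sigma, which is one way to see why delay compensation can recover yield at high data rates and wide buses.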
Design Automation Conference | 2016
Sebastian Haas; Oliver Arnold; Benedikt Nöthen; Stefan Scholze; Georg Ellguth; Andreas Dixius; Sebastian Höppner; Stefan Schiefer; Stephan Hartmann; Stephan Henker; Thomas Hocker; Jörg Schreiter; Holger Eisenreich; Jens-Uwe Schlüßler; Dennis Walter; Tobias Seifert; Friedrich Pauls; Mattis Hasler; Yong Chen; Hermann Hensel; Sadia Moriam; Emil Matus; Christian Mayr; René Schüffny; Gerhard P. Fettweis
This paper presents a heterogeneous database hardware accelerator MPSoC manufactured in 28 nm SLP CMOS. The 18 mm² chip integrates a runtime task scheduling unit for energy-efficient query processing and hierarchical power management supported by ultra-fast dynamic voltage and frequency scaling. Four processing elements, connected by a star-mesh network-on-chip, are accelerated by an instruction set extension tailored to fundamental data-intensive applications. We evaluate the MPSoC with typical database benchmarks focusing on scans and bitmap operations. When the processing elements operate on data stored in local memories, the chip consumes 250 mW and shows a 96x energy efficiency improvement compared to state-of-the-art platforms.
International Symposium on System-on-Chip | 2011
Sebastian Höppner; Dennis Walter; Georg Ellguth; René Schüffny
This paper presents asynchronous sub-sampling techniques to measure delay mismatch of clock and data lanes in high-speed serial network-on-chip (NoC) links. The techniques allow the use of low-quality sampling clocks to reduce test hardware overhead for integration into complex MPSoCs with multiple NoC links. They enable compensation of delay variations to realize high-speed NoC links with sufficient yield. The proposed techniques are demonstrated on NoC links as part of an MPSoC in 65 nm CMOS technology, where the calibration leads to a significant reduction of the bit error rate of a 72 Gbit/s (8 Gbit/s per lane) link over 4 mm on-chip interconnect.
Design, Automation, and Test in Europe | 2017
Michael Raitza; Akash Kumar; Marcus Völp; Dennis Walter; Jens Trommer; Thomas Mikolajick; Walter M. Weber
Silicon nanowire reconfigurable field effect transistors (SiNW RFETs) abolish the physical separation of n-type and p-type transistors by taking up both roles in a configurable way within a doping-free technology. However, the potential of transistor-level reconfigurability has not been demonstrated in larger circuits so far. In this paper, we present first steps to a new compact and efficient design of combinational circuits by employing transistor-level reconfiguration. We contribute new basic gates realized with silicon nanowires, such as 2/3-XOR and MUX gates. Exemplifying our approach with 4-bit, 8-bit and 16-bit conditional carry adders, we were able to reduce the number of transistors to almost one half. With our current case study we show that SiNW technology can reduce the required chip area by 16% despite the larger size of the individual transistor, and improve circuit speed by 26%.
Design Automation Conference | 2017
Sebastian Haas; Tobias Seifert; Benedikt Nöthen; Stefan Scholze; Sebastian Höppner; Andreas Dixius; Esther P. Adeva; Thomas R. Augustin; Friedrich Pauls; Sadia Moriam; Mattis Hasler; Erik Fischer; Yong Chen; Emil Matus; Georg Ellguth; Stephan Hartmann; Stefan Schiefer; Love Cederström; Dennis Walter; Stephan Henker; Stefan Hänzsche; Johannes Uhlig; Holger Eisenreich; Stefan Weithoffer; Norbert Wehn; René Schüffny; Christian Mayr; Gerhard P. Fettweis
Current and future applications impose high demands on software-defined radio (SDR) platforms in terms of latency, reliability, and flexibility. This paper presents a heterogeneous SDR MPSoC with a hexagonal network-on-chip to address these issues. It features four data processing modules and a baseband processing engine for iterative multiple-input multiple-output (MIMO) receiving. Integrated memory controllers enable dynamic data flow mapping and application isolation. In a 4 × 4 MIMO application scenario, the MPSoC achieves a throughput of 232 Mbit/s with a latency of 20 µs while consuming 414 mW. It outperforms state-of-the-art platforms in terms of throughput by a factor of 4.