Stefan Scholze
Dresden University of Technology
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Stefan Scholze.
Biological Cybernetics | 2011
Daniel Brüderle; Mihai A. Petrovici; Bernhard Vogginger; Matthias Ehrlich; Thomas Pfeil; Sebastian Millner; Andreas Grübl; Karsten Wendt; Eric Müller; Marc-Olivier Schwartz; Dan Husmann de Oliveira; Sebastian Jeltsch; Johannes Fieres; Moritz Schilling; Paul Müller; Oliver Breitwieser; Venelin Petkov; Lyle Muller; Andrew P. Davison; Pradeep Krishnamurthy; Jens Kremkow; Mikael Lundqvist; Eilif Muller; Johannes Partzsch; Stefan Scholze; Lukas Zühl; Christian Mayr; Alain Destexhe; Markus Diesmann; Tobias C. Potjans
In this article, we present a methodological framework that meets novel requirements emerging from upcoming types of accelerated and highly configurable neuromorphic hardware systems. We describe in detail a device with 45 million programmable and dynamic synapses that is currently under development, and we sketch the conceptual challenges that arise from taking this platform into operation. More specifically, we aim at the establishment of this neuromorphic system as a flexible and neuroscientifically valuable modeling tool that can be used by non-hardware experts. We consider various functional aspects to be crucial for this purpose, and we introduce a consistent workflow with detailed descriptions of all involved modules that implement the suggested steps: The integration of the hardware interface into the simulator-independent model description language PyNN; a fully automated translation between the PyNN domain and appropriate hardware configurations; an executable specification of the future neuromorphic system that can be seamlessly integrated into this biology-to-hardware mapping process as a test bench for all software layers and possible hardware design modifications; an evaluation scheme that deploys models from a dedicated benchmark library, compares the results generated by virtual or prototype hardware devices with reference software simulations and analyzes the differences. The integration of these components into one hardware–software workflow provides an ecosystem for ongoing preparative studies that support the hardware design process and represents the basis for the maturity of the model-to-hardware mapping software. The functionality and flexibility of the latter is proven with a variety of experimental results.
Frontiers in Neuroscience | 2011
Stefan Scholze; Stefan Schiefer; Johannes Partzsch; Stephan Hartmann; Christian Mayr; Sebastian Höppner; Holger Eisenreich; Stephan Henker; Bernhard Vogginger; René Schüffny
State-of-the-art large-scale neuromorphic systems require sophisticated spike event communication between units of the neural network. We present a high-speed communication infrastructure for a waferscale neuromorphic system, based on application-specific neuromorphic communication ICs in an field programmable gate arrays (FPGA)-maintained environment. The ICs implement configurable axonal delays, as required for certain types of dynamic processing or for emulating spike-based learning among distant cortical areas. Measurements are presented which show the efficacy of these delays in influencing behavior of neuromorphic benchmarks. The specialized, dedicated address-event-representation communication in most current systems requires separate, low-bandwidth configuration channels. In contrast, the configuration of the waferscale neuromorphic system is also handled by the digital packet-based pulse channel, which transmits configuration data at the full bandwidth otherwise used for pulse transmission. The overall so-called pulse communication subgroup (ICs and FPGA) delivers a factor 25–50 more event transmission rate than other current neuromorphic communication infrastructures.
international symposium on circuits and systems | 2012
Johannes Schemmel; Andreas Grübl; Stephan Hartmann; Alexander Kononov; Christian Mayr; K. Meier; Sebastian Millner; Johannes Partzsch; Stefan Schiefer; Stefan Scholze; René Schüffny; Marc-Olivier Schwartz
This demonstration is based on the wafer-scale neuromophic system presented in the previous papers by Schemmel et. al. (20120), Scholze et. al. (2011) and Millner et. al. (2010). The demonstration setup will allow the visitors to monitor and partially manipulate the neural events at every level. They will get an insight into the complex interplay between packet-based and realtime communication necessary to combine continuous-time mixed-signal neural networks with a packet-based transport network. Several network experiments implemented on the setup will be accessible for user interaction.
international solid-state circuits conference | 2014
Benedikt Noethen; Oliver Arnold; Esther P. Adeva; Tobias Seifert; Erik Fischer; Steffen Kunze; Emil Matus; Gerhard P. Fettweis; Holger Eisenreich; Georg Ellguth; Stephan Hartmann; Sebastian Höppner; Stefan Schiefer; Jens-Uwe Schlüßler; Stefan Scholze; Dennis Walter; René Schüffny
Modern mobile communication systems face conflicting design constraints. On the one hand, the expanding variety of transmission modes calls for highly flexible solutions supporting the ever-growing number and diversity of application requirements. On the other hand, stringent power restrictions (e.g., at femto base stations and terminals) must be considered, while satisfying the demanding performance requirements. In order to cope with these issues, existing SDR platforms, e.g. [1-2], propose an MPSoC with a heterogeneous array of processing elements (PEs). MPSoC solutions provide programmability and parallelism yielding flexibility, processing performance and power efficiency. To schedule the resources and to apply power gating, a static approach is employed. In contrast, we present a heterogeneous MPSoC platform (Tomahawk2) with runtime scheduling and fine-grained hierarchical power management. This solution can fully adapt to the dynamically varying workload and semi-deterministic behavior in modern concurrent wireless applications. The proposed dynamic scheduler (CoreManager, CM) can be implemented either in software on a general-purpose processor or on a dedicated application-specific hardware unit. It is evident that the software approach offers the highest degree of flexibility; however, it may become a performance-bottleneck for complex applications. A high-throughput ASIC was presented in [3], but this solution does not permit scheduling algorithms to be adjusted. In this work, these limitations are overcome by implementing the CM on an ASIP.
international conference on electronics, circuits, and systems | 2010
Stephan Hartmann; Stefan Schiefer; Stefan Scholze; Johannes Partzsch; Christian Mayr; Stephan Henker; René Schüffny
One of the main challenges in large scale neuromorphic VLSI systems is the design of the communication infrastructure. Traditionally, the neural communication has been done via parallel asynchronous transmission of Address-Event-Representations (AER) of pulses, while the configuration was achieved via off-the-shelf chip connect protocols. Recently, there has been a move towards greater event transmission speed via a serialization of the AER protocols, as well as an integration of both communication and configuration in the same interface. We present the PCB and FPGA design of such an interface for a newly developed waferscale neuromorphic system. The serial event communication of other current approaches has been refined into a packet based synchronous (rather than asynchronous) protocol, which offers better flexibility and bandwidth utilization. A factor 30–100 greater event transmission rate has been achieved. Compared to other approaches, the full communication bandwidth can also be employed for configuration. The system offers additional functionality, such as event storage and replay. Also, a very high degree of mechanical integration has been achieved.
IEEE Transactions on Biomedical Circuits and Systems | 2016
Christian Mayr; Johannes Partzsch; Marko Noack; Stephan Hänzsche; Stefan Scholze; Sebastian Höppner; Georg Ellguth; René Schüffny
A switched-capacitor (SC) neuromorphic system for closed-loop neural coupling in 28 nm CMOS is presented, occupying 600 um by 600 um. It offers 128 input channels (i.e., presynaptic terminals), 8192 synapses and 64 output channels (i.e., neurons). Biologically realistic neuron and synapse dynamics are achieved via a faithful translation of the behavioural equations to SC circuits. As leakage currents significantly affect circuit behaviour at this technology node, dedicated compensation techniques are employed to achieve biological-realtime operation, with faithful reproduction of time constants of several 100 ms at room temperature. Power draw of the overall system is 1.9 mW.
Frontiers in Neuroscience | 2015
Marko Noack; Johannes Partzsch; Christian Mayr; Stefan Hänzsche; Stefan Scholze; Sebastian Höppner; Georg Ellguth; René Schüffny
Synaptic dynamics, such as long- and short-term plasticity, play an important role in the complexity and biological realism achievable when running neural networks on a neuromorphic IC. For example, they endow the IC with an ability to adapt and learn from its environment. In order to achieve the millisecond to second time constants required for these synaptic dynamics, analog subthreshold circuits are usually employed. However, due to process variation and leakage problems, it is almost impossible to port these types of circuits to modern sub-100nm technologies. In contrast, we present a neuromorphic system in a 28 nm CMOS process that employs switched capacitor (SC) circuits to implement 128 short term plasticity presynapses as well as 8192 stop-learning synapses. The neuromorphic system consumes an area of 0.36 mm2 and runs at a power consumption of 1.9 mW. The circuit makes use of a technique for minimizing leakage effects allowing for real-time operation with time constants up to several seconds. Since we rely on SC techniques for all calculations, the system is composed of only generic mixed-signal building blocks. These generic building blocks make the system easy to port between technologies and the large digital circuit part inherent in an SC system benefits fully from technology scaling.
international solid-state circuits conference | 2012
Markus Winter; Steffen Kunze; Esther P. Adeva; Björn Mennenga; Emil Matus; Gerhard P. Fettweis; Holger Eisenreich; Georg Ellguth; Sebastian Höppner; Stefan Scholze; René Schüffny; Tomoyoshi Kobori
In current and future wireless standards, such as WiMAX, 3GPP-LTE or LTE-Advanced, receiver terminals have to support numerous operating modes for each protocol [1], as well as sophisticated transmission techniques, especially enhanced MIMO detection and iterative forward error correction (FEC). MIMO detection and FEC belong to the most computationally complex parts of the receiver-side baseband signal processing chain. Implementations thereof must have low power consumption, but also be able to interact in a flexible and efficient way in the detection-decoding engine, while at the same time not compromising on the challenging throughput and flexibility requirements associated with 4G standards. In this paper, we present a chip implementation of a MIMO sphere detector combined with a flexible FEC engine, realizing a detection-decoding engine in silicon capable of satisfying 4G requirements with a data rate of 335Mb/s.
international symposium on neural networks | 2017
Sebastian Schmitt; Johann Klähn; Guillaume Bellec; Andreas Grübl; Maurice Güttler; Andreas Hartel; Stephan Hartmann; Dan Husmann; Kai Husmann; Sebastian Jeltsch; Vitali Karasenko; Mitja Kleider; Christoph Koke; Alexander Kononov; Christian Mauch; Eric Müller; Paul Müller; Johannes Partzsch; Mihai A. Petrovici; Stefan Schiefer; Stefan Scholze; Vasilis Thanasoulis; Bernhard Vogginger; Robert A. Legenstein; Wolfgang Maass; Christian Mayr; René Schüffny; Johannes Schemmel; K. Meier
Emulating spiking neural networks on analog neuromorphic hardware offers several advantages over simulating them on conventional computers, particularly in terms of speed and energy consumption. However, this usually comes at the cost of reduced control over the dynamics of the emulated networks. In this paper, we demonstrate how iterative training of a hardware-emulated network can compensate for anomalies induced by the analog substrate. We first convert a deep neural network trained in software to a spiking network on the BrainScaleS wafer-scale neuromorphic system, thereby enabling an acceleration factor of 10000 compared to the biological time domain. This mapping is followed by the in-the-loop training, where in each training step, the network activity is first recorded in hardware and then used to compute the parameter updates in software via backpropagation. An essential finding is that the parameter updates do not have to be precise, but only need to approximately follow the correct gradient, which simplifies the computation of updates. Using this approach, after only several tens of iterations, the spiking network shows an accuracy close to the ideal software-emulated prototype. The presented techniques show that deep spiking networks emulated on analog neuromorphic devices can attain good computational performance despite the inherent variations of the analog substrate.
design automation conference | 2016
Sebastian Haas; Oliver Arnold; Benedikt Nöthen; Stefan Scholze; Georg Ellguth; Andreas Dixius; Sebastian Höppner; Stefan Schiefer; Stephan Hartmann; Stephan Henker; Thomas Hocker; Jörg Schreiter; Holger Eisenreich; Jens-Uwe Schlüßler; Dennis Walter; Tobias Seifert; Friedrich Pauls; Mattis Hasler; Yong Chen; Hermann Hensel; Sadia Moriam; Emil Matus; Christian Mayr; René Schüffny; Gerhard P. Fettweis
This paper presents a heterogeneous database hardware accelerator MPSoC manufactured in 28 nm SLP CMOS. The 18 mm2 chip integrates a runtime task scheduling unit for energy-efficient query processing and hierarchical power management supported by an ultra-fast dynamic voltage and frequency scaling. Four processing elements, connected by a star-mesh network-on-chip, are accelerated by an instruction set extension tailored to fundamental dataintensive applications. We evaluate the MPSoC with typical database benchmarks focusing on scans and bitmap operations. When the processing elements operate on data stored in local memories, the chip consumes 250 mW and shows a 96x energy efficiency improvement compared to state-of-the-art platforms.