Closed-loop experiments on the BrainScaleS-2 architecture

Abstract

The evolution of biological brains has always been contingent on their embodiment within their respective environments, in which survival required appropriate navigation and manipulation skills. Studying such interactions thus represents an important aspect of computational neuroscience and, by extension, a topic of interest for neuromorphic engineering. Here, we present three examples of embodiment on the BrainScaleS-2 architecture, in which dynamical timescales of both agents and environment are accelerated by several orders of magnitude with respect to their biological archetypes.

K. Schreiber, T. C. Wunderlich, C. Pehle

Kirchhoff-Institute for Physics, Heidelberg University, Heidelberg, Germany

M. A. Petrovici

Department of Physiology, University of Bern, Bern, Switzerland

J. Schemmel, K. Meier

Kirchhoff-Institute for Physics, Heidelberg University, Heidelberg, Germany

CCS CONCEPTS

• Hardware → Analog and mixed-signal circuits.

KEYWORDS

Closed-loop, neuromorphic hardware, path integration, reinforcement learning, neurorobotics

ACM Reference Format:

K. Schreiber, T. C. Wunderlich, C. Pehle, M. A. Petrovici, J. Schemmel, and K. Meier. 2020. Closed-loop experiments on the BrainScaleS-2 architecture. In Neuro-inspired Computational Elements Workshop (NICE ’20), March 17–20, 2020, Heidelberg, Germany. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3381755.3381776

Neuromorphic engineering aims at overcoming certain limitations of traditional computer architectures by reproducing particular aspects of structure and function of biological neural networks in VLSI. Within this context, it is important to note that all biological brains are part of a body that interacts with its environment in various ways: information continuously flows from a diverse range of sensory organs or cells into the nervous system. The nervous system processes this information and in turn provides signals to organs or cells that are involved in motion or communication, resulting in actions within the environment. This interplay of information exchange and processing appears to be inseparably linked to the working principles of biological brains and forms a closed loop of environment, body, and brain. These important aspects of so-called embodied cognition have received increasing attention in cognitive science and related fields [12], evidently pertaining to both biological and artificial brains. For versions of the latter running on analog neuromorphic hardware, embodiment carries particular constraints, as the continuous-time dynamics of physical analog neurons and synapses cannot be accelerated, slowed down, or even halted as easily as for their digital counterparts.

Here, we discuss three different implementations of closed-loop experiments on the BrainScaleS-2 system [1–3]: a neural agent playing the game of Pong, a robotic application in which the neural network is connected to a pantographic robotic system, and an insectoid agent that performs a path integration task in a 2D environment. By allowing continuous access to the spike I/O of the analog neural core, as well as digital online updates of network parameters, the hybrid architecture of BrainScaleS-2 lends itself particularly well to the study of neural agents that behave and learn within an interactive environment.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

NICE ’20, March 17–20, 2020, Heidelberg, Germany. © 2020 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-7718-8/20/03. https://doi.org/10.1145/3381755.3381776

Figure 1: Reinforcement learning with reward-modulated STDP. A) The plasticity processing unit (PPU) simulates a simplified version of Pong. The horizontal position of the ball serves as input for a 2-layer neural network, with the resulting output dictating the target paddle position. The network receives reward based on its aiming accuracy. B) Playing performance during learning. C) Synaptic depression automatically adapts to the excitability of neurons. D, E) Wall-clock duration and power consumption of a single iteration on BrainScaleS-2 (blue) and an equivalent software simulation using NEST (orange).

Reinforcement learning represents a natural fit for closed-loop experiments in which an agent tries to maximize a reward signal based on its actions within an environment. It was recently shown that BrainScaleS-2 can be used to implement closed-loop reinforcement learning, where all components of the loop, including the virtual environment simulation, are computed on-chip, thus creating a fully autonomous setup [13].

Our referenced work uses a three-factor learning rule called R-STDP [5, 6]. This rule multiplies a scalar reward signal, akin to a global neuromodulator such as dopamine, with an STDP-like synaptic eligibility trace. In our case, the latter is computed locally at each synapse and in an analog fashion [7].

The virtual environment is a simplified version of the Pong video game, as shown in Fig. 1A. The spiking neural network dynamics are emulated by the neuromorphic substrate while the embedded plasticity processor takes on a dual role: it both simulates the game dynamics and computes synaptic weight updates using R-STDP, creating a fully autonomous setup.

Figure 2: PlayPen: a pantographic robotic system. A) Arbitrary optical content of a screen is captured by a sensor S and translated into a continuous afferent stream of spikes. This input is, in turn, used by spiking networks emulated on the BrainScaleS-2 system to control the position of the sensor by emitting spike trains that drive the motors MX and MY. B) Example screen contents for four possible experimental tasks: stay within a potential well, follow a gradient, follow a trace, follow a dot.

During training, the network learns to keep the paddle approximately centered under the ball (Fig. 1B). Implicitly, the experiment also demonstrates how learning can compensate for fixed-pattern noise in the analog neuro-synaptic circuits (Fig. 1C): we found that while the excitability of uncalibrated neurons varied significantly due to mismatch effects, synapses that would negatively impact correct tracking of the ball were systematically depressed below the individual firing threshold of their postsynaptic neuron. Furthermore, this setup demonstrates the speed and power advantages of the BrainScaleS-2 architecture compared to software simulations (Fig. 1D/E).
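The three-factor structure of R-STDP described in this section can be illustrated with a minimal software sketch. It is not the on-chip implementation: on BrainScaleS-2 the eligibility trace is computed by analog circuits, whereas here it is approximated from spike times. The exponential pair kernel, the reward baseline, the time constant `tau`, and the names `eligibility_trace` and `rstdp_update` are all assumptions made for illustration.

```python
import math


def eligibility_trace(pre_spikes, post_spikes, tau=10.0):
    """STDP-like eligibility: sum of exp(-dt / tau) over causal spike pairs.

    Only pre-before-post pairs contribute (an assumption of this sketch).
    Spike times and tau share the same arbitrary time unit.
    """
    return sum(
        math.exp(-(t_post - t_pre) / tau)
        for t_pre in pre_spikes
        for t_post in post_spikes
        if t_post > t_pre
    )


def rstdp_update(weight, eligibility, reward, baseline, learning_rate=0.1):
    """Three-factor update: a scalar reward term multiplies the eligibility trace.

    Subtracting a baseline (e.g. a running mean of past rewards) is an
    assumption; the text only states that reward and trace are multiplied.
    """
    return weight + learning_rate * (reward - baseline) * eligibility


# One synapse with a pre spike at t=0 and a post spike at t=5.
trace = eligibility_trace([0.0], [5.0], tau=10.0)
w_rewarded = rstdp_update(0.5, trace, reward=1.0, baseline=0.5)
w_punished = rstdp_update(0.5, trace, reward=0.0, baseline=0.5)
print(w_rewarded > 0.5 > w_punished)
```

With an above-baseline reward, a synapse with a positive trace is potentiated; with a below-baseline reward, the same synapse is depressed, which matches the qualitative behaviour described for synapses that hinder ball tracking.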

By implementing a fast low-latency link for peripheral spike input and output, we realized an experimental platform that provides BrainScaleS-2 with access to real-world robotic actuators and sensors. Unlike common neurorobotics, the inherent thousandfold speed-up of BrainScaleS-2 demands exceptionally quick actuation and sensory feedback. We therefore needed to reduce the weight, and thus also the complexity, of the mechanical components as much as possible. The setup translates efferent spike trains into impulses that control two motors of a pantographic actuator that moves an optical sensor over a two-dimensional surface. The data gathered by this sensor is in turn translated into an afferent spike train that is transmitted back to the neuromorphic core. This simple construction, which we call the "PlayPen", provides an intuitive and easily configurable setup for the study of spike-based learning in the context of real-world robotic control. The setup is depicted in Fig. 2 together with four example experiments.
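The two translation steps of the PlayPen loop, and the timing pressure created by the thousandfold speed-up, can be sketched as follows. This is a sketch under stated assumptions, not the actual interface: the linear intensity-to-rate mapping, the use of two opposing spike counts per motor axis, the step size, and all function names are hypothetical.

```python
SPEEDUP = 1000  # "thousandfold" acceleration stated in the text


def hardware_time(biological_seconds, speedup=SPEEDUP):
    """Wall-clock duration on the accelerated substrate for a biological duration."""
    return biological_seconds / speedup


def sensor_to_rate(intensity, max_rate_hz=100.0):
    """Afferent path: map a normalized sensor reading in [0, 1] to a spike rate.

    The linear mapping and max_rate_hz are assumptions of this sketch.
    """
    clipped = min(max(intensity, 0.0), 1.0)
    return clipped * max_rate_hz


def spikes_to_displacement(spikes_plus, spikes_minus, step_per_spike=0.01):
    """Efferent path: each spike acts as one motor impulse along a single axis.

    Two opposing spike counts per axis are an assumption; the text only says
    that efferent spike trains are translated into motor impulses.
    """
    return (spikes_plus - spikes_minus) * step_per_spike


# A 10 ms biological time constant shrinks to about 10 microseconds of wall-clock time.
print(hardware_time(0.010))
print(sensor_to_rate(0.5), spikes_to_displacement(5, 3))
```

The first function makes the mechanical constraint concrete: any actuator and sensor delay has to be small compared with wall-clock durations that are a thousand times shorter than their biological counterparts.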

Recent developments in biological imaging have facilitated unprecedented insight into numerous functional aspects of insect brains [4, 10, 11], such as their navigational capabilities [8]. Based on physiological data from the bee’s brain and following [9], we emulated a network for path integration (Fig. 3A) that reproduces bees’ ability to return to their nest’s location after exploring the environment for sources of food.

Figure 3: Virtual insectoid agent performing path integration on BrainScaleS-2. A) Network schematic and activity histogram. The information flows from the sensory layer at the top through an integration and a steering layer to the motor neurons at the bottom. R and L indicate the right and left side, respectively. B) A typical trajectory of the virtual insect which turns to random looping around the home position upon reaching it. C) Overlay of 100 trajectories like in B), each with a different random outbound journey.

At the beginning of each experiment, a virtual insect performed a random walk starting from a certain starting position. During this phase, the modeled network had no effect on the insect’s motion, but was provided with sensory data of the absolute head orientation and the optical flow field from the insect’s eyes. The insect’s head orientation was encoded by four spike sources that each represented a cardinal direction similar to a compass. The optical flow field was similarly represented by two spike generators that fired with a rate proportional to the optical flow as derived from the left and right eye (FL and FR). In the second part of the experiment, the return phase, the insect’s motion was determined by the two motor neurons in the network (ML and MR), which steered the insect by providing propulsion on the left or right hand side, similar to a tank drive.

Across multiple experiments, the emulated navigation network was able to reliably guide the insect back to its starting position (Fig. 3C). As with Pong (Sec. 2), the plasticity processor was used to handle multiple tasks: the processing of synaptic modulations for the integrator neurons, the simulation of the environment, an emulation of all sensors including the corresponding spike stimuli, the translation of neuronal data into actions of motion, and the entire experiment control. The experiment can thus run entirely self-contained on the BrainScaleS-2 system.
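The sensory encoding and the tank-drive actuation described in this section can be sketched in a few lines. The sketch is illustrative only: the rectified-cosine tuning of the four compass sources, the gains, the sign convention for turning, and the names `compass_rates` and `tank_drive_step` are assumptions, and the integrator and steering layers of the network are not modeled.

```python
import math

# Preferred headings of the four compass spike sources (radians).
# Which angle corresponds to which cardinal direction is an assumption.
CARDINALS = (0.0, 0.5 * math.pi, math.pi, 1.5 * math.pi)


def compass_rates(heading, max_rate=100.0):
    """Firing rate of each compass source for a given head orientation.

    Rectified cosine tuning is an assumption; the text only states that each
    source represents one cardinal direction.
    """
    return [max_rate * max(0.0, math.cos(heading - c)) for c in CARDINALS]


def tank_drive_step(x, y, heading, left, right, dt=1.0, gain=0.1, turn_gain=0.1):
    """Advance the agent by one step given left and right propulsion.

    Forward speed follows the mean propulsion; a surplus on the right side
    turns the agent counter-clockwise (towards larger heading).
    """
    speed = gain * (left + right) / 2.0
    heading = heading + turn_gain * (right - left) * dt
    x = x + speed * math.cos(heading) * dt
    y = y + speed * math.sin(heading) * dt
    return x, y, heading


rates = compass_rates(0.0)
straight = tank_drive_step(0.0, 0.0, 0.0, left=1.0, right=1.0)
turning = tank_drive_step(0.0, 0.0, 0.0, left=0.0, right=1.0)
print(rates, straight, turning)
```

In the experiment, the propulsion values would be derived from the activity of the motor neurons ML and MR during the return phase; here they are plain numbers passed in by hand.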

We outlined three agent-environment interaction scenarios using spiking neural networks emulated on the BrainScaleS-2 architecture. In two of these, the sensors and actuators of an embodied neural agent, along with its environment, were also emulated on-chip using the embedded vector processing unit. On the other hand, the high-speed PlayPen setup represents a physical embodiment and environment that facilitates a large variety of experimental tasks on accelerated time scales. These experiments demonstrate the capabilities of the BrainScaleS-2 architecture for studying embodied cognition and provide many of the necessary tools required for this endeavor.

ACKNOWLEDGMENTS

The authors wish to thank all present and former members of the Electronic Vision(s) research group contributing to the BSS-1 and BSS-2 hardware as well as software. We gratefully acknowledge funding from the European Union under grant agreements 604102, 720270, 785907 (HBP) and the Manfred Stärk Foundation.

REFERENCES

[1] S. A. Aamir, P. Müller, G. Kiene, L. Kriener, Y. Stradmann, A. Grübl, J. Schemmel, and K. Meier. 2018. A Mixed-Signal Structured AdEx Neuron for Accelerated Neuromorphic Cores. IEEE Transactions on Biomedical Circuits and Systems 12, 5 (Oct 2018), 1027–1037. https://doi.org/10.1109/TBCAS.2018.2848203

[2] S. A. Aamir, Y. Stradmann, P. Müller, C. Pehle, A. Hartel, A. Grübl, J. Schemmel, and K. Meier. 2018. An Accelerated LIF Neuronal Network Array for a Large-Scale Mixed-Signal Neuromorphic Architecture. IEEE Transactions on Circuits and Systems I: Regular Papers 65, 12 (Dec 2018), 4299–4312. https://doi.org/10.1109/TCSI.2018.2840718

[3] Sebastian Billaudelle, Yannik Stradmann, Korbinian Schreiber, Benjamin Cramer, Andreas Baumbach, Dominik Dold, Julian Göltz, Akos F Kungl, Timo C Wunderlich, Andreas Hartel, Eric Müller, Oliver J Breitwieser, Christian Mauch, Mitja Kleider, Andreas Grübl, David Stöckel, Christian Pehle, Arthur Heimbrecht, Philipp Spilger, Gerd Kiene, Vitali Karasenko, Walter Senn, Karlheinz Meier, Johannes Schemmel, and Mihai A Petrovici. 2020. Versatile emulation of spiking neural networks on an accelerated neuromorphic substrate. In IEEE International Symposium on Circuits and Systems, ISCAS 2020, Sevilla, Spain, May 17-20, 2020. IEEE. In preparation.

[4] Ann-Shyn Chiang, Chih-Yung Lin, Chao-Chun Chuang, Hsiu-Ming Chang, Chang-Huain Hsieh, Chang-Wei Yeh, Chi-Tin Shih, Jian-Jheng Wu, Guo-Tzau Wang, Yung-Chang Chen, et al. 2011. Three-dimensional reconstruction of brain-wide wiring networks in Drosophila at single-cell resolution. Current Biology 21, 1 (2011), 1–11.

[5] Nicolas Frémaux and Wulfram Gerstner. 2016. Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules. Frontiers in neural circuits …

[6] … Journal of Neuroscience 30, 40 (2010), 13326–13337.

[7] S. Friedmann, J. Schemmel, A. Grübl, A. Hartel, M. Hock, and K. Meier. 2017. Demonstrating Hybrid Learning in a Flexible Neuromorphic Hardware System. IEEE Transactions on Biomedical Circuits and Systems 11, 1 (2017), 128–142. https://doi.org/10.1109/TBCAS.2016.2579164

[8] Kirsa Neuser, Tilman Triphan, Markus Mronz, Burkhard Poeck, and Roland Strauss. 2008. Analysis of a spatial orientation memory in Drosophila. Nature …

[9] … Current Biology 27, 20 (2017), 3069–3085.

[10] Shinya Takemura, Arjun Bharioke, Zhiyuan Lu, Aljoscha Nern, Shiv Vitaladevuni, Patricia K Rivlin, William T Katz, Donald J Olbris, Stephen M Plaza, Philip Winston, et al. 2013. A visual motion detection circuit suggested by Drosophila connectomics. Nature …

[11] … Elife …

[12] … Psychonomic bulletin & review 9, 4 (2002), 625–636.

[13] Timo Wunderlich, Akos Ferenc Kungl, Eric Müller, Andreas Hartel, Yannik Stradmann, Syed Ahmed Aamir, Andreas Grübl, Arthur Heimbrecht, Korbinian Schreiber, David Stöckel, et al. 2019. Demonstrating advantages of neuromorphic computation: a pilot study.