A Design Methodology for Post-Moore's Law Accelerators: The Case of a Photonic Neuromorphic Processor
Armin Mehrabian, Volker J. Sorger, and Tarek El-Ghazawi

Abstract—Over the past decade, alternative technologies have gained momentum as conventional digital electronics continue to approach their limitations due to the end of Moore's Law and Dennard Scaling. At the same time, we are facing new application challenges such as those due to the enormous increase in data. Attention has therefore shifted from homogeneous computing to specialized heterogeneous solutions. As an example, brain-inspired computing has re-emerged as a viable solution for many applications. Such new processors, however, have widened the abstraction gamut from the device level to applications. Therefore, efficient abstractions that can provide vertical design-flow tools for such technologies have become critical. Photonics in general, and neuromorphic photonics in particular, is among the promising alternatives to electronics. While the arsenal of device-level toolboxes for photonics and high-level neural network platforms is rapidly expanding, there has not been much work to bridge this gap. Here, we present a design methodology that mitigates this problem by extending high-level hardware-agnostic neural network design tools with functional and performance models of photonic components. In this paper we detail this tool and methodology using design examples and associated results. We show that adopting this approach enables designers to efficiently navigate the design space and devise hardware-aware systems with alternative technologies.
I. INTRODUCTION
With the rise of artificial intelligence (AI) applications, data-intensive workloads have surged. These, in part, have resulted in plateaued speed and energy efficiency of digital von Neumann computers. Many alternative technologies and computing paradigms have been proposed. Photonics is one of these technologies, and it has been a major driver of data communication over the past decades. One of the main challenges facing a new technology is the limited and inconsistent availability of design and simulation tools. The field of photonic computing suffers from a wide abstraction gap in design and simulation tools. Most such tools are currently focused on the device [1] and low circuit level [2]. To compete with conventional electronics, there needs to be a long-term effort to devise tools that complete the design-flow stack from high-level specification and synthesis to device and technology attachment. Even further, for neuromorphic applications, the stack needs to incorporate top-level functionalities such as those in training and inference of neural networks. Some recent works in photonics have taken this route to bridge the vertical gap by developing application-specific photonic software stacks [3] [4].

The authors are with the Department of Electrical and Computer Engineering of The George Washington University (email: [email protected]; [email protected]; [email protected]).
Here, we propose a design methodology applicable to neuromorphic systems. Our methodology is based on extending existing, commonly used neural network packages, such as Google Tensorflow. We propose to extend the hardware-agnostic arithmetic units with functional and measurement models of the technology, here photonics. Our approach is distinguished from other similar works in three major ways. First, it allows users to rely on low-level and mid-level features of Tensorflow, such as high-speed back-end processing on a variety of hardware choices including CPUs and GPUs. Second, our work particularly emphasizes noise as a significant component of any analog circuit, including photonics. Lastly, familiarity with a widely used platform such as Tensorflow shortens the learning time and the time to import existing work into our tool.

II. DESIGN METHODOLOGY
As discussed in the previous section, we propose to extend Tensorflow with models of actual photonic components commonly used in photonic neuromorphics. Our goal is twofold: first, to investigate the effect of non-ideal analog photonic components on the functional performance of a neural network; second, to estimate the power consumption of these analog photonic components in such networks, giving us a better understanding of the trade-offs of adopting neuromorphic photonics. In the rest of this section we first introduce a few of the most commonly used photonic components in neuromorphic photonics. Then, we briefly discuss the overall hierarchy of the Tensorflow tool and where and how it was extended. Lastly, we provide example mathematical descriptions of the modeled components.
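To make the two model classes concrete, the following minimal sketch illustrates the distinction (the names photonic_multiply and PowerModel, and the noise and energy numbers, are hypothetical illustrations, not the paper's actual implementation): a functional model perturbs the ideal arithmetic itself, while a power model only accumulates an estimate on the side.

```python
import random

def ideal_multiply(x, w):
    """Hardware-agnostic multiply, as Tensorflow would compute it."""
    return x * w

def photonic_multiply(x, w, sigma=0.01):
    """Functional model: the same multiply with additive Gaussian noise
    standing in for analog photonic imperfections (sigma is illustrative)."""
    return x * w + random.gauss(0.0, sigma)

class PowerModel:
    """Power model: tallies an energy estimate without changing any result."""
    def __init__(self, energy_per_op_pj):
        self.energy_per_op_pj = energy_per_op_pj  # assumed per-MAC energy [pJ]
        self.total_pj = 0.0

    def count(self, n_ops):
        self.total_pj += n_ops * self.energy_per_op_pj

pm = PowerModel(energy_per_op_pj=0.1)
pm.count(1000)  # e.g. one layer performing 1000 multiply-accumulates
```

Because only the functional model touches the computed values, swapping it in changes prediction accuracy, whereas the power model can be attached to any network without altering its outputs.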
A. Photonic Components
The recent increased popularity of photonics is mainly due to its low operating power and high bandwidth [5]. Recently, a multitude of neuromorphic photonic processors have been proposed and even realized [6] [4] [7] [8] [9]. In these architectures, basic arithmetic operations are realized by photonic devices that mimic those functionalities. Table I lists some of these arithmetic operations and their corresponding photonic realizations.

We extend Tensorflow with two classes of models. First, functional models, which replace ideal noise-free arithmetic operations with their realistic analog photonic representations. Second, power models, which compute power estimates. While power models do not affect the functional performance of a neural network, such as its prediction accuracy, functional models do.

©2020 IEEE

Here we start by introducing a set of commonly adopted photonic devices. We emphasize two example devices used to realize photonic multiplication, namely the micro-ring resonator (MRR) and the Mach-Zehnder interferometer (MZI). The two devices realize the same functionality, so we use them as an example case for design space exploration.

MRRs play a significant role in photonics. A generic MRR is a circular optical waveguide, as shown in Figure 1 (a). The MRR in Figure 1 is coupled to one Through and one Drop waveguide. The portion of the light coupled to the ring loops through the ring and then couples back to the Through waveguide, creating anywhere from destructive to constructive interference. The level of interference depends on the wavelength of the incoming beam and the resonant frequency of the ring. By applying a bias voltage V_bias, the resonant frequency of the ring is changed, thus affecting the level of interference. The two outputs of the MRR together can therefore be used to create a differential weighting between an incoming light beam and a bias voltage. It should be noted that this weighting is spectrally sensitive and can even be engineered to realize selective parallel multiplications on different wavelengths.

Fig. 1: Schematic diagram of (a) an MRR device and (b) an MZI device.

Another device that can be used for weighting in photonics is the Mach-Zehnder interferometer (MZI) [9]. Figure 1 (b) shows an MZI device. The input light beam is split into two beams by a beamsplitter. Each beam incurs a different phase change through a phase shifter. At the output, a combiner recombines the two phase-shifted beams. The output beam has an amplitude dictated by the relative phase of the two beams, which, similar to an MRR, can produce a range of interferences. Hence, by controlling the amount of phase shift, a weighting mechanism between the input beam and the phase shift is realized.

In photonics, the summation operation can be achieved optically in two main ways: incoherently via a photodiode, or coherently by combining two phase-stabilized photonic beams. By feeding a set of input light beams to a photodiode, we can add the powers of the beams and generate an electrical current proportional to the sum of the incident beams.

Another important class of components in neural networks is the nonlinear activation function. Without nonlinear activation functions the whole neural network collapses into a linear transformation, incapable of learning complex nonlinear tasks. There have been many recent works in photonics to build nonlinear activation functions for neural networks [10] [11]. One way to build a nonlinear activation function in photonics is to map it onto the transfer function of an electro-optic modulator (EOM). The advantage of this method is that, when paired with a photodiode, the output of the photodiode is an electrical current that can directly drive an electro-optic modulator without the need for any direct electrical-to-optical conversion. Furthermore, we can use a new laser source to be modulated by our signal, which keeps signal cascadability high.

TABLE I: Mapping of primitive math operations to their hardware realization.

    Math Operation           Photonic Representation
    Multiplication           MRR, MZI
    Addition                 Photodiode
    Connection               Waveguide
    Non-linear Activation    Electro-Optic Modulator
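The devices in Table I admit simple behavioral models. The sketch below uses the standard textbook expressions for an add-drop ring, an ideal MZI, and photodiode responsivity and noise; all parameter values are illustrative defaults, not numbers from the paper's tool.

```python
import math

Q  = 1.602e-19   # electron charge [C]
H  = 6.626e-34   # Planck constant [J s]
C  = 2.998e8     # speed of light [m/s]
KB = 1.381e-23   # Boltzmann constant [J/K]

def mrr_through(phi, a=0.99, r1=0.9, r2=0.9):
    """Add-drop MRR Through-port power transmission.
    phi: single-pass phase shift, a: single-pass amplitude transmission,
    r1/r2: self-coupling coefficients of the Through/Drop couplers."""
    num = (r2 * a) ** 2 - 2 * r1 * r2 * a * math.cos(phi) + r1 ** 2
    den = 1 - 2 * r1 * r2 * a * math.cos(phi) + (r1 * r2 * a) ** 2
    return num / den

def mzi_transmission(dphi):
    """Ideal MZI output power versus the relative phase of its two arms."""
    return math.cos(dphi / 2) ** 2

def responsivity(wavelength_m, eta):
    """Photodiode responsivity R = eta * q * lambda / (h * c)  [A/W]."""
    return eta * Q * wavelength_m / (H * C)

def shot_noise(i_ph, i_dark, bw_hz):
    """RMS shot-noise current of a photodiode."""
    return math.sqrt(2 * Q * (i_ph + i_dark) * bw_hz)

def thermal_noise(temp_k, bw_hz, r_shunt):
    """RMS thermal (Johnson) noise current."""
    return math.sqrt(4 * KB * temp_k * bw_hz / r_shunt)
```

On resonance (phi = 0) the Through port transmits very little power, while far off resonance it transmits nearly all of it; this tuning range is what the weighting mechanism exploits. For a quantum efficiency around 0.8 at 1550 nm, the responsivity comes out near 1 A/W, consistent with typical silicon-photonic foundry photodiodes.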
B. Google Tensorflow
Tensorflow at heart is a dataflow graph processor that can map a computational graph across machines in a cluster and across different computational devices, such as CPUs, GPUs, and TPUs. While our design methodology is for the most part focused on inference, the availability of training algorithms allows the designer to benefit from a wide variety of state-of-the-art train-time tools on top of a familiar user interface. Figure 2 depicts the hierarchical architecture of Tensorflow and our extended photonic models.

Core Tensorflow is coded in C++ to take advantage of its performance and portability. Given an input graph, it partitions the graph into sub-graphs to be used by the supported underlying computing hardware. From Figure 2 it can be seen that many of the standard kernels are fused in the low-level kernel implementations to gain better performance for standard neural network architectures. Within the low-level kernel layer, the kernels form a gamut of operations, from simple tensor definition to more complex convolutional and recurrent layers. Since these fused kernels are accessible through the high-level Python and C++ clients, we can extend these base kernels inside the training and inference libraries.
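The kernel-swapping idea can be illustrated without Tensorflow itself. In the sketch below (DenseLayer and noisy_mac are hypothetical stand-ins, not Tensorflow API), the multiply-accumulate kernel of a dense layer is a pluggable function, so the ideal kernel can be exchanged for a photonic model without touching the rest of the layer:

```python
import random

def ideal_mac(inputs, weights):
    """Ideal multiply-accumulate, as a fused kernel would compute it."""
    return sum(x * w for x, w in zip(inputs, weights))

def noisy_mac(inputs, weights, sigma=0.005):
    """Photonic stand-in: every product is perturbed by additive noise
    (the noise level is an illustrative assumption)."""
    return sum(x * w + random.gauss(0.0, sigma)
               for x, w in zip(inputs, weights))

class DenseLayer:
    """Minimal dense layer whose arithmetic kernel is pluggable."""
    def __init__(self, weights, mac=ideal_mac):
        self.weights = weights  # one weight row per output unit
        self.mac = mac          # ideal_mac or a photonic model

    def __call__(self, inputs):
        return [self.mac(inputs, row) for row in self.weights]

layer = DenseLayer([[1.0, 2.0], [0.5, -0.5]])                 # ideal kernel
photonic = DenseLayer([[1.0, 2.0], [0.5, -0.5]], mac=noisy_mac)
```

In the actual tool the same substitution happens one level down, inside Tensorflow's training and inference libraries, so existing network definitions run unchanged on the photonic models.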
C. Extended Models
In the rest of this section, we provide example mathematical models used in this work. The first example is the power model for the photodiode. Power in a photodiode is calculated using the Responsivity as follows,

    R = I_ph / P_in = η λ q / (h c)   [A/W]                        (1)

where P_in is the power of the input incident light, I_ph is the photocurrent, q is the electron charge, λ is the wavelength, h is Planck's constant, and c is the speed of light. For a photodiode of a given technology, the Responsivity is known. In this work we use values from foundry processes [12].

Fig. 2: Overview of the Tensorflow architecture and our extended photonic model library implementation.

Another example of a model implemented here is the noise model. Noise models fall under the functional-model class, as they perturb the operation of an otherwise ideal photonic neural network. For the same photodiode, there are two noise sources, namely shot noise and thermal noise, derived from

    I_sn = sqrt(2 q (I_ph + I_D) Δf)   and   I_tn = sqrt(4 k_B T Δf / R_SH)   (2)

where I_D is the dark current, Δf is the bandwidth, k_B is the Boltzmann constant, T is the temperature in Kelvin, and R_SH is the total equivalent shunt resistance. Noise models are particularly interesting because they let us explore the design space of photonic neural networks with different noise characteristics.

The last class of models are functional models that aim to create a more realistic implementation of photonic devices or adjust for functional imperfections of photonic hardware. For example, in MRRs, which are used to realize the weighting operation, the actual transfer function of the Through port is defined by

    T_Through = I_pass / I_input
              = (r2² a² − 2 r1 r2 a cos φ + r1²) / (1 − 2 r1 r2 a cos φ + (r1 r2 a)²)   (3)

where a is the attenuation, r1 and r2 are the coupling coefficients with the Through and Drop waveguides, and φ is the single-pass phase shift. As a result, when the attenuation becomes non-negligible, the weighting of the incident beam and the bias voltage incurs some level of precision loss.

III. RESULTS
In this section we present two classes of results, namely functional performance and power estimation. Figure 3 compares the accuracy of various common neural network architectures on the MNIST classification task. CNN3, CNN5, and CNN9 denote three convolutional neural networks with 3, 5, and 9 convolutional layers and 16 kernels per layer. Similarly, MLP3, MLP5, and MLP9 are fully-connected multi-layer perceptron networks, and VGG16, AlexNet, InceptionV3, and ResNet are commonly used deep neural network models [13]. As expected, the introduction of photonic device noise adversely impacts the accuracy. However, MRR-based implementations appear to suffer less than their MZI counterparts. In the second experiment we estimated the photonic power for the same class of neural network applications with both MRR and MZI implementations. Figure 4 summarizes the results. While for most of the architectures the power estimates of MRR-based and MZI-based systems closely follow each other, as the number of network parameters increases, for instance for VGG16 and AlexNet, the gap between the power consumption of the two device implementations widens.

IV. CONCLUSION
In this paper we proposed a structured methodology and a tool that can be adopted in the design of post-Moore's law accelerators using novel technologies. We considered the case of photonic neuromorphic accelerator design, where there is a lack of simulation tools that bridge the design abstraction gap. Rather than building our tool from the ground up, we extended an existing and familiar open-source tool, namely Google Tensorflow. This allowed us to take advantage of many optimized low-level and mid-level functionalities and kernels, while extending the Tensorflow libraries with functional and measurement modules, as well as models to account for photonic device-specific noise sources. We showed that our tool can be used for design space exploration by selecting candidate devices based on their power and functional performance metrics.
Fig. 3: Comparison of the effect of photonic device noise on accuracy using (a) MRR and (b) MZI implementations.

Fig. 4: Power estimation of commonly used neural network architectures using photonic components.

REFERENCES

[1] F. Lumerical, "Solutions. Lumerical Solutions, Inc.," 2014.
[2] L. Chrostowski, Z. Lu, J. Flückiger, J. Pond, J. Klein, X. Wang, S. Li, W. Tai, E. Y. Hsu, C. Kim et al., "Schematic driven silicon photonics design," in Smart Photonic and Optoelectronic Integrated Circuits XVIII, vol. 9751. International Society for Optics and Photonics, 2016, p. 975103.
[3] J. Anderson, E. Kayraklioglu, S. Sun, J. Crandall, Y. Alkabani, V. Narayana, V. Sorger, and T. El-Ghazawi, "ROC: A reconfigurable optical computer for simulating physical processes," ACM Transactions on Parallel Computing (TOPC), vol. 7, no. 1, pp. 1–29, 2020.
[4] V. Bangari, B. A. Marquez, H. Miller, A. N. Tait, M. A. Nahmias, T. F. de Lima, H.-T. Peng, P. R. Prucnal, and B. J. Shastri, "Digital electronics and analog photonics for convolutional neural networks (DEAP-CNNs)," IEEE Journal of Selected Topics in Quantum Electronics, vol. 26, no. 1, pp. 1–13, 2019.
[5] D. A. Miller, "Attojoule optoelectronics for low-energy information processing and communications," Journal of Lightwave Technology, vol. 35, no. 3, pp. 346–396, 2017.
[6] M. A. Nahmias, B. J. Shastri, A. N. Tait, T. F. De Lima, and P. R. Prucnal, "Neuromorphic photonics," Optics and Photonics News, vol. 29, no. 1, pp. 34–41, 2018.
[7] A. Mehrabian, Y. Al-Kabani, V. J. Sorger, and T. El-Ghazawi, "PCNNA: A photonic convolutional neural network accelerator," IEEE, 2018, pp. 169–173.
[8] A. Mehrabian, M. Miscuglio, Y. Alkabani, V. J. Sorger, and T. El-Ghazawi, "A Winograd-based integrated photonics accelerator for convolutional neural networks," IEEE Journal of Selected Topics in Quantum Electronics, vol. 26, no. 1, pp. 1–12, 2019.
[9] Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund et al., "Deep learning with coherent nanophotonic circuits," Nature Photonics, vol. 11, no. 7, p. 441, 2017.
[10] J. George, A. Mehrabian, R. Amin, P. R. Prucnal, T. El-Ghazawi, and V. J. Sorger, "Neural network activation functions with electro-optic absorption modulators," IEEE, 2018, pp. 1–5.
[11] M. Miscuglio, A. Mehrabian, Z. Hu, S. I. Azzam, J. George, A. V. Kildishev, M. Pelton, and V. J. Sorger, "All-optical nonlinear activation function for photonic neural networks," Optical Materials Express, vol. 8, no. 12, pp. 3851–3863, 2018.
[12] E. Timurdogan, Z. Su, C. V. Poulton, M. J. Byrd, S. Xin, R.-J. Shiue, B. R. Moss, E. S. Hosseini, and M. R. Watts, "AIM process design kit (AIMPDKv2.0): Silicon photonics passive and active component libraries on a 300mm wafer," in Optical Fiber Communication Conference. Optical Society of America, 2018, pp. M3F–1.
[13] A. Canziani, A. Paszke, and E. Culurciello, "An analysis of deep neural network models for practical applications," arXiv preprint arXiv:1605.07678.