A Photonic In-Memory Computing primitive for Spiking Neural Networks using Phase-Change Materials
Indranil Chakraborty, ∗ Gobinda Saha, and Kaushik Roy
School of Electrical & Computer Engineering, Purdue University, West Lafayette, IN 47907, USA
(Dated: October 25, 2018)

Spiking Neural Networks (SNNs) offer an event-driven and more biologically realistic alternative to standard Artificial Neural Networks based on analog information processing. This can potentially enable energy-efficient hardware implementations of neuromorphic systems which emulate the functional units of the brain, namely, neurons and synapses. Recent demonstrations of ultra-fast photonic computing devices based on phase-change materials (PCMs) show promise for addressing the limitations of electrically driven neuromorphic systems. However, scaling these standalone computing devices to a parallel in-memory computing primitive is a challenge. In this work, we utilize the optical properties of the PCM Ge2Sb2Te5 (GST) to propose a Photonic Spiking Neural Network computing primitive, comprising a non-volatile synaptic array integrated seamlessly with previously explored 'integrate-and-fire' neurons. The proposed design realizes an 'in-memory' computing platform that leverages the inherent parallelism of wavelength-division multiplexing (WDM). We show that the proposed computing platform can be used to emulate an SNN inferencing engine for image classification tasks. The proposed design not only bridges the gap between isolated computing devices and parallel large-scale implementations, but also paves the way for ultra-fast computing and localized on-chip learning.

I. INTRODUCTION
The phenomenal success in the field of Deep Learning using Artificial Neural Networks (ANNs) based on analog information processing has had far-reaching consequences in the past decade [1]. Machines driven by such networks have surpassed humans in various tasks ranging from pattern recognition to playing complex games such as Go [2] and Chess [3]. However, the growing complexity of the computational models involved in such multi-layered neural networks has rendered training and inferencing extremely expensive in terms of memory and energy. The gulf between the energy efficiency of the brain and that of standard neural network architectures has led researchers to explore a bio-plausible alternative, namely, Spiking Neural Networks (SNNs). The event-driven nature and sparse information encoding of SNNs make them more feasible for energy-efficient neuromorphic computing, thus paving the way towards unraveling the elusiveness of the brain. The fundamental operations performed by SNNs involve a parallelized dot-product through the synaptic network followed by subsequent integration and thresholding by the neurons. Neuromorphic systems attempting to leverage the sparse and event-driven nature of SNNs thus aim toward efficient emulation of these functionalities.

∗ [email protected]

The initial efforts [4-6] in hardware implementations of SNNs were based on the standard von Neumann architecture [7] in Complementary Metal Oxide Semiconductor (CMOS) technology, where the synaptic units of the neural network are stored in digital memory and repeatedly fetched by the processor for computing operations. However, the overhead of frequent data transport between the memory and the processor has led to a shift in the computing paradigm, as 'in-memory' computing platforms [8, 9] attempt to emulate the 'massively parallel' operations of the brain.
Although the term 'neuromorphic' was primarily coined [10] with CMOS technology in mind, this computing domain has branched out in recent years to non-volatile memory (NVM) technologies such as oxide-based memristors [11], spintronics [12], phase-change materials (PCM) [13, 14], etc. The natural ability of these resistive technologies to compute parallelized dot-products using crossbar structures makes them promising candidates for neuromorphic systems. Despite the extensive efforts in NVM-based in-memory computing in the electrical domain, these technologies suffer from drawbacks manifesting in the form of energy efficiency, speed, and sneak paths. Moreover, the write latency of memristors [15, 16] is a major reason why memristive devices are not suitable for temporally scalable architectures. Thus, there is a need to explore a different memory technology which can enable computing as well as the possibility of lower write times.

Integrated photonics offers an alternative approach to standard microelectronic 'in-memory' computing platforms and promises ultra-fast neural computing and information processing. Recent advances in photonics-based neuromorphic computing have seen implementations of various kinds [17, 18] of neural processing units on the photonic platform, leveraging the inherent capability of integrated optical circuits for matrix operations. Spike-based processing systems have also been extensively explored using excitable lasers [19, 20]. However, most of the photonic systems investigated in the context of neuromorphic computing are based on volatile information processing, which requires thermal tuners to maintain the modulation states and might turn out to be energy-expensive for large-scale systems. Non-volatility offers the ability to write and erase information dynamically, which is desirable for large-scale implementations of neuromorphic systems.
To that effect, recent demonstrations of sub-ns writing speeds in GST-based PCM technology through optical pulses have opened up a host of opportunities for in-memory computing in the photonic domain [21]. Ultra-fast switching using light overcomes the longstanding obstacle of high 'write' latencies [15] for PCMs in the electrical domain. The highly contrasting optical properties of GST in its crystalline and amorphous phases have led to implementations of all-photonic memories [22], switches [23], and reconfigurable non-volatile computing platforms [24]. More recently, photonic GST devices have also been explored to emulate biologically plausible synapses [25], capable of undergoing Spike Timing Dependent Plasticity (STDP), and 'integrate-and-fire' spiking neurons [26]. Despite these promising investigations towards fast neural computing on a non-volatile platform, the challenge of scaling standalone devices to large-scale neuromorphic systems is enormous. Thus, there is a need to explore a non-volatile memory primitive in the photonic domain which can perform parallel computing. In this work, we propose an all-photonic SNN computing primitive, based on GST-based photonic neural elements, which attempts to bridge the gap between device- and system-level implementations of photonic neural networks. We leverage the inherent wavelength-division multiplexing (WDM) [27] property of optical networks to propose a non-volatile synaptic array, while exploring and mitigating the challenges arising from designs based on ring resonators of radii comparable to the wavelength of operation. Such a synaptic array can achieve higher densities compared to current state-of-the-art photonic computing systems. We show how the proposed synaptic computing platform can be seamlessly integrated with previously explored 'integrate-and-fire' spiking neurons to realize an ultra-fast and truly integrable Spiking Neural Network.
Finally, we evaluate the performance of the proposed Photonic SNN on the classification task of handwritten digits.

II. PHOTONIC SYNAPSES
The core computational units of any neural network are neurons and synapses. In SNNs, information is encoded in the form of spikes, and the neurons and synapses are capable of processing information through these spike trains. As shown in Fig. 1(a), the input trains of spikes get multiplied by the synaptic weights w_1, w_2, ..., w_n, and the weighted sum is received by an 'Integrate-and-Fire' neuron. The internal state of the neuron, known as the 'membrane potential' (V_mem), integrates based on the incoming weighted spikes and is compared with a threshold (V_th) at every time-step. The neuron outputs a spike once V_mem reaches V_th. The synaptic functionality essentially corresponds to a multiplication operation of the inputs and the corresponding weights of the synapses. The basic operation performed by a single synapse can thus be represented as I_i w_i.

FIG. 1. (a) The basic functional elements of an SNN are spiking neurons and weighted synaptic connections. At each time instant, the inputs are weighted by the synaptic weights to produce a resultant output represented as Σ_i P_i w_i. The 'integrate-and-fire' neuron's membrane potential (V_mem) is updated according to the weighted sum and compared with a threshold value (V_th). (b) GST-embedded single-bus microring resonator structure with Si waveguides on a SiO2 substrate. (c) Top view of the device illustrating the different parameters pertaining to the ring resonator structure. The synaptic device performs an analog multiplication of input P_in and transmission T.

We show how a single-bus microring resonator with a GST element embedded on top of it can operate as such a synapse. The device under consideration is a Si-on-insulator structure consisting of a rectangular waveguide and a ring waveguide, as shown in Fig. 1(b). A GST element is deposited on one arm of the ring waveguide; it takes the shape of an arc whose length is denoted as the length of the GST element (L_GST). The fabrication technique for building such a structure has been well explored [23, 24]. The wave in the rectangular waveguide gets partially coupled to the ring and constructively interferes when the round-trip phase shift equals an integer multiple of 2π, leading to the resonant condition:

2π R_ring n_eff,wg = m λ_m    (1)

where R_ring is the radius of the ring waveguide, n_eff,wg is the effective refractive index of the ring waveguide, and λ_m is the resonant wavelength.
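The resonance condition above, together with the standard all-pass ring transfer function for the 'PASS'-port transmission, can be sketched numerically. All parameter values below (effective index, radius, coupling) are illustrative assumptions, not the paper's fitted device parameters.

```python
import numpy as np

# Numerical sketch of the resonance condition (Eqn. 1) and the 'PASS'-port
# transmission of an all-pass microring. Values are illustrative assumptions.
n_eff = 2.4                      # effective index of the ring waveguide (assumed)
R_ring = 1.5e-6                  # ring radius, m
L = 2 * np.pi * R_ring           # ring circumference

# Eqn. (1): 2*pi*R_ring*n_eff = m * lambda_m
m = np.arange(10, 16)
lam_res = 2 * np.pi * R_ring * n_eff / m      # resonant wavelengths, m

def transmission(lam, a=0.95, r=0.95):
    """Textbook all-pass transmission; a: attenuation, r: self-coupling."""
    theta = 2 * np.pi * n_eff * L / lam       # single-pass phase shift
    num = a**2 - 2 * a * r * np.cos(theta) + r**2
    den = 1 - 2 * a * r * np.cos(theta) + (a * r)**2
    return num / den

lam = np.linspace(1.50e-6, 1.60e-6, 2001)
T = transmission(lam)            # spectrum with a notch at each resonance
```

With a = r (critical coupling), the on-resonance transmission vanishes, which is the regime the paper targets for the low-loss amorphous state.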
The transmission through the 'PASS' port depends on the device dimensions and materials such that:

T_p = (a^2 − 2ar cosθ + r^2) / (1 − 2ar cosθ + (ar)^2)    (2)

where a is the attenuation factor, r is the self-coupling coefficient, as shown in Fig. 1(c), and θ is the single-pass phase shift. Under resonance, θ equals an integer multiple of 2π and the transmission is given by T_min = ((a − r) / (1 − ar))^2.

We leverage the contrasting optical properties of GST in its amorphous (a-GST) and crystalline (c-GST) states to manipulate the attenuation in the ring waveguide and thus vary the transmission T_min at the resonance wavelength. The differing imaginary refractive indices of a-GST and c-GST lead to differential absorption of evanescently coupled light. The difference in optical absorption can be visibly observed in the cross-section view of the fundamental mode profiles of a GST-embedded Si waveguide excited by a TE-mode electromagnetic (EM) wave, as shown in Fig. 2. c-GST introduces a significant change in the waveguide mode in contrast to a-GST, owing to the higher absorption in the GST element.

FIG. 2. Cross-section view of fundamental mode profiles of a GST-embedded Si-SiO2 waveguide section for (a) a-GST and (b) c-GST, showing visible contrast in optical absorption for the two boundary states of GST. (c) The variation of the real (n_eff,GST) and imaginary (κ_eff,GST) refractive indices of GST with degree of crystallization.

The attenuation factor (a) in Eqn. 2 can be related to the imaginary refractive index as:

a = exp(−(4π κ_eff,GST L_GST)/λ + Loss)    (3)

where κ_eff,GST is the effective imaginary refractive index of the GST on the Si-SiO2 stack, L_GST is the length of the GST element, and the term 'Loss' refers to other propagation losses, such as bending losses. The GST element can be programmed to partially crystallized levels such that multi-level states can be achieved [22, 24]. Note that, from the perspective of neural networks, significant progress has been made towards training algorithms [28, 29] which preserve performance even with binarized synapses. Thus, although multi-level states would be desirable from a device point of view, modified training techniques can enable reasonable performance with low-precision synapses.

The refractive indices of partially crystallized GST can be calculated from effective permittivities approximated by an effective-medium theory [30, 31]:

(ε_eff(p) − 1)/(ε_eff(p) + 2) = p × (ε_c − 1)/(ε_c + 2) + (1 − p) × (ε_a − 1)/(ε_a + 2)    (4)

where ε_c and ε_a are the complex permittivities of c-GST and a-GST respectively, calculated from the refractive indices of GST [32] via √ε(λ) = n + iκ, and p is the degree of crystallization. The different levels of crystallization of GST thus lead to different values of κ_eff,GST and hence to different levels of transmission. We leverage this multi-level transmission to implement an all-photonic synapse. Considering an incident optical pulse of power P_in, the synaptic functionality is realized such that the output power P_out is given by:

P_out = T_λm P_in    (5)

where T_λm is the transmission at the resonant wavelength λ_m.
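The effective-medium mixing rule of Eqn. (4) can be inverted in closed form for ε_eff(p). A minimal sketch, using the boundary refractive indices of GST listed in Table I (the wavelength dependence of the indices is omitted here for brevity):

```python
import numpy as np

# Effective-medium interpolation of partially crystallized GST (Eqn. 4).
# Boundary indices are the Table I values at the operating wavelength.
n_c = 7.2 + 1.9j      # crystalline GST, n + i*kappa
n_a = 4.6 + 0.18j     # amorphous GST
eps_c, eps_a = n_c**2, n_a**2

def eps_eff(p):
    """Solve (eps-1)/(eps+2) = p*(eps_c-1)/(eps_c+2) + (1-p)*(eps_a-1)/(eps_a+2)."""
    rhs = p * (eps_c - 1) / (eps_c + 2) + (1 - p) * (eps_a - 1) / (eps_a + 2)
    return (1 + 2 * rhs) / (1 - rhs)

def refractive_index(p):
    root = np.sqrt(eps_eff(p))   # sqrt(eps) = n + i*kappa
    return root.real, root.imag
```

At p = 0 and p = 1 this reproduces the a-GST and c-GST indices exactly; intermediate p gives the intermediate κ_eff values that underlie the multi-level synaptic weights.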
T_λm represents the weight of the synapse, and the various levels of transmission obtained from varying degrees of crystallization of GST can be leveraged to represent an entire range of synaptic weights with appropriate discretization. We critically couple the resonator to the amorphous state such that the transmission is minimum in the amorphous state and increases with the degree of crystallization. While an individual synapse represents a simple multiplication, the weighted inputs from multiple synapses are received by a neuron, as shown in Fig. 1(a). To emulate such behavior, it is important to connect these synapses in an integrated fashion. Such a synaptic network would perform the most ubiquitous functionality of any neural network: a dot-product.

III. PHOTONIC DOT PRODUCT ENGINE
We leverage the characteristics of the proposed non-volatile photonic synaptic device to map the synaptic weights of a neural network onto a photonic synaptic network capable of performing the dot-product of the inputs and the weights.
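Functionally, each row of the engine computes a weighted sum over wavelength channels, I_out = R Σ_i T_λi P_i, as derived below. A behavioral sketch; the responsivity and the input/weight vectors are assumed, illustrative values:

```python
import numpy as np

# Behavioral sketch of one dot-product-engine row. The responsivity and the
# per-wavelength powers/transmissions below are assumed, illustrative values.
R = 0.8                                    # photodetector responsivity, A/W
T_lambda = np.array([0.1, 0.4, 0.7, 0.9])  # per-ring transmissions (weights)
P_in = np.array([1.0, 0.0, 1.0, 1.0])      # spike powers per wavelength channel

# Each ring modulates only its own wavelength; the photodetector sums channels.
I_out = R * np.sum(T_lambda * P_in)        # photocurrent ~ dot product
```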
A. Network Design
We leverage the Wavelength Division Multiplexing (WDM) technique to compute dot-product operations between incoming spikes and synaptic weights. We represent the synaptic weights in terms of the transmission T_λ of the microring resonator, as discussed in the previous section. To represent multiple wavelengths, we use multiple ring resonators of increasing ring radii to represent different synapses in a row, as shown in Fig. 3.

FIG. 3. Synaptic dot-product engine showing the arrangement of ring resonators with increasing radii representing the transmission vector T_λ = {T_λ1, ..., T_λN}. WDM signals get modulated by the weights corresponding to their respective wavelengths, and the photodetector array collects the signals to generate a current I_out representing the dot product of the transmission vector T_λ and the inputs P = {P_1, ..., P_N}.

The number of synapses (N) in each row depends on the Free Spectral Range (FSR) of the ring resonators, and this governs the dimension of the input vector of the dot-product engine. A WDM spike enters the straight waveguide through the 'INPUT' port, and the GST element on each ring resonator modulates the amplitude of the corresponding wavelength by the representative synaptic weight according to Eqn. (5). Thus, at the 'OUTPUT' port we obtain a multi-wavelength spike comprising the different T_λi P_i products corresponding to the different wavelengths. This spike is then fed to a photodiode (PD) array which produces a current given by the sum of all the amplitudes:

I_out = R Σ_i T_λi P_i    (6)

where R is the responsivity of the PD expressed in A/W. This current is equal to the dot product of the input vector P and the weight vector T_λ. The operation is illustrated in Fig. 3.

B. Synapse Design Constraints

Using the WDM technique for the proposed photonic synaptic array imposes certain constraints on the design of the synaptic devices. For an accurate dot-product operation, it is necessary to achieve significant isolation between the channels in order to minimize channel-to-channel interaction. The important parameters which constrain the design space of the synaptic device are finesse (F) and channel spacing (λ_diff). Finesse is the ratio of the free spectral range (FSR) to the full-width at half maximum (FWHM).
For a single-bus ring resonator, the FWHM and FSR are expressed as [33]:

FWHM = ((1 − ra) λ_m^2) / (π n_g L √(ra))    (7)

FSR = λ_m^2 / (n_g L)    (8)

Finesse = FSR / FWHM    (9)

where L = 2π R_ring is the circumference of the ring, n_g is the group index, and the rest of the parameters bear the same meanings as defined earlier. The interference due to adjacent channels can be modeled as:

T'_λi|λ=λi = T_λi|λ=λi × T_λi|λ=λi+1 × T_λi|λ=λi−1 = α_λi T_λi|λ=λi    (10)

Here, T'_λi|λ=λi is the modified transmission due to interference from the adjacent resonant wavelengths, and T_λi|λ=λi, T_λi|λ=λi+1, and T_λi|λ=λi−1 are the transmissions of the i-th ring at the i-th, (i+1)-th, and (i−1)-th resonant wavelengths respectively. α_λi represents the non-ideality factor, which should ideally be close to 1. α_λi decreases with decreasing channel spacing (λ_diff) and increasing FWHM. For our design, we chose the minimum radius of the rings to be 1.5 µm in order to achieve a high-density synaptic array for better scalability. Rings of similar size have been demonstrated previously [34], with certain modifications that we will discuss next. The rest of the parameters concerning the synapses were chosen to maximize the number of rings in a single row (N) while maintaining α_λi close to 1, under the condition that N λ_diff < FSR.

A number of challenges arise for rings of radius comparable to the wavelength of operation. Firstly, to achieve critical coupling in the low-loss amorphous state, the power coupling gap between the bus and the ring waveguide needs to be extremely small. This is because the interaction length between the ring and the straight waveguide is quite short; hence, to achieve reasonable coupling, even to match the small intrinsic loss of the ring in the low-loss amorphous state of GST, we require a small power coupling gap. Such gaps become extremely difficult to fabricate.
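The design-space relations of Eqns. (7)-(9) and the channel-count constraint N λ_diff < FSR can be checked numerically. The group index, coupling, and loss values here are assumptions for illustration, not the paper's design values:

```python
import numpy as np

# Quick check of Eqns. (7)-(9) and the WDM constraint N*lam_diff < FSR.
lam_m  = 1.55e-6                 # resonant wavelength, m
n_g    = 4.2                     # group index (assumed)
R_ring = 1.5e-6                  # minimum ring radius used in the design
L = 2 * np.pi * R_ring
r, a = 0.95, 0.95                # self-coupling and attenuation (assumed)

FWHM    = (1 - r * a) * lam_m**2 / (np.pi * n_g * L * np.sqrt(r * a))  # Eqn. 7
FSR     = lam_m**2 / (n_g * L)                                          # Eqn. 8
finesse = FSR / FWHM                                                    # Eqn. 9

N = 16
lam_diff = FSR / (N + 1)         # one possible channel spacing
assert N * lam_diff < FSR        # all channels fit within one FSR
```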
An alternative to using smaller gaps has been demonstrated [34] for rings of small radii. Reducing the width of the bus waveguide increases the spatial period of the propagating mode due to the lower effective refractive index. This results in a better phase match with the mode in the tightly curved ring waveguide. For the rest of our analysis, we have used a bus waveguide of width 0.35 µm and a coupling gap of 135 nm.

IV. PHOTONIC INTEGRATE-AND-FIRE NEURONS

The proposed photonic dot-product engine needs to be interfaced with spiking neurons to realize a Photonic SNN inferencing platform. In this work, we revisit the Photonic 'Integrate-and-Fire' neuron explored in our previous work [26]. The neuron consists of an 'Integration Unit' and a 'Firing Unit'. The 'Integration Unit' consists of two add-drop ring resonators with GST deposited on top of each, as shown in Fig. 4(a). The purpose of the two ring resonators is to perform bipolar integration, i.e., the respective devices are fed by positive and negative weighted sums from the synapses to perform integration in the appropriate direction. The significance of positive and negative weighted sums will become clearer in the next section. The neuron operates in alternate 'write' and 'read' cycles.

FIG. 4. (a) Schematic of a bipolar integrate-and-fire neuron based on GST-embedded ring resonator devices, showing the integration and firing units. (b) Timing diagram showing the integration of the membrane potential for various incident pulses, demonstrating the operation of the proposed neuron.
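The bipolar integrate-and-fire behavior of the neuron in Fig. 4 can be captured by a simple behavioral model that abstracts the GST write/read dynamics into an accumulator with a threshold and reset; the threshold and resting values here are illustrative assumptions:

```python
# Behavioral model of the bipolar integrate-and-fire neuron of Fig. 4,
# abstracting the GST amorphization dynamics into a thresholded accumulator.
def if_neuron(pos_inputs, neg_inputs, threshold=1.0, v_rest=0.0):
    """Integrate positive/negative weighted sums and emit a binary spike train."""
    v_mem = v_rest
    spikes = []
    for o_pos, o_neg in zip(pos_inputs, neg_inputs):
        v_mem += o_pos - o_neg        # bipolar integration of the two inputs
        if v_mem >= threshold:        # firing unit: membrane crosses threshold
            spikes.append(1)
            v_mem = v_rest            # 'RESET' pulse restores initial states
        else:
            spikes.append(0)
    return spikes

out = if_neuron([0.6, 0.5, 0.2], [0.0, 0.0, 0.0])  # -> [0, 1, 0]
```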
The GST elements on the ring resonators are initially in the crystalline state. With incident 'write' pulses, the GST elements begin to get partially amorphized. During the 'read' phase, with partial amorphization, the transmission at the 'THROUGH' port of each ring resonator decreases and that at the 'DROP' port increases. Essentially, with incoming pulses, the transmissions through the 'DROP' and 'THROUGH' ports get positively and negatively integrated respectively. These properties of the device can be combined to mimic the behavior of a bipolar integrate-and-fire neuron. The 'DROP' and 'THROUGH' ports of the positive and negative integrating ring resonators respectively are connected to an interferometer. The output of the interferometer represents the membrane potential of the spiking neuron. To perform the thresholding action, the membrane potential is fed to the 'Firing Unit' of the neuron. This unit consists of an amplifier, a circulator, and a rectangular waveguide with GST deposited on top. During the 'read' phase of the neuron, the resulting membrane potential, after being amplified and directed by the circulator towards the rectangular waveguide, attempts to amorphize the initially crystalline GST element on the rectangular waveguide. Initially, the output of the amplifier A (P_amp) is insufficient to amorphize the GST on the rectangular waveguide, rendering it unable to transmit an output spike. However, when the membrane potential integrates enough to cross the threshold, upon incidence of several write pulses, P_amp is ensured to be high enough to amorphize the GST on the rectangular waveguide, thus enabling it to transmit a spike. Once the neuron fires, a 'RESET' pulse resets the states of the devices to their initial states, and the membrane potential drops to the resting potential (P_rest), as shown in Fig. 4(b). Further details of the writing and reading schemes have been presented in [26].

FIG. 5. Synaptic dot-product engine showing the arrangement of ring resonators with increasing radii representing the transmission vector T_λ = {T_λ1, ..., T_λN}. WDM signals get modulated by the weights corresponding to their respective wavelengths, and the photodetector arrays collect the signals to generate currents representing the dot products of the transmission vectors and the inputs P = {P_1, ..., P_N}. k is an amplification factor.

V. OPERATION OF ALL-PHOTONIC SPIKING NEURAL NETWORK

Implementation of an SNN based on the Photonic Dot-Product Engine (PDPE) and the 'integrate-and-fire' neurons described above involves integration of the proposed structures. As elucidated above, the basic computational function of a neural network is a dot-product. To realize parallel instances of such functionality using the aforementioned PDPE, we use a splitter (SPL) to feed the WDM input spikes to multiple PDPE rows and obtain the dot-products of each row from the respective PD arrays, as shown in Fig. 5. Essentially, the output vector thus obtained from the PD arrays gives us the multiplication of the vector of input spikes P_i with an N × M synaptic network T_ij. The M outputs I_j obtained from the PD arrays are fed to laser diodes (LDs), which convert the electrical currents to optical spikes, thus completing the parallel dot-product operations, which can be represented as:

[O_1 O_2 ... O_M] ∝ [P_1 P_2 ... P_N] × [T_11 T_12 ... T_1M ; T_21 T_22 ... T_2M ; ... ; T_N1 T_N2 ... T_NM]    (11)

TABLE I.
Simulation Parameters

Parameter                                          Value
Si ring waveguide X-section                        0.45 × µm²
Si bus waveguide X-section                         0.35 × µm²
Coupling gap (L_gap)                               0.135 µm
GST length (L_GST)                                 170 nm - 220 nm
GST thickness (t_GST)                              10 nm
GST width (W_GST)                                  0.44 µm
Si refractive index (n_Si) [35]                    3.5
SiO2 refractive index (n_SiO2) [36]                1.4
c-GST refractive index (n_c-GST + iκ_c-GST) [37]   7.2 + 1.9i
a-GST refractive index (n_a-GST + iκ_a-GST) [37]   4.6 + 0.18i

We now present how such a photonic synaptic network can be integrated with the proposed bipolar IF neurons to realize a photonic SNN. The schematic of such a photonic SNN is illustrated in Fig. 6. To account for negative weights in a neural network, we represent each element of the weight matrix T as comprising a positive and a negative component:

T_ij = T+_ij − T−_ij
T+_ij = T_ij,  T−_ij = T_low,  when T_ij > 0
T+_ij = T_low, T−_ij = |T_ij|, when T_ij < 0    (12)

where T_low is the transmission corresponding to the lowest programmable state considered. Two PDPE arrays are deployed for mapping the positive and negative components respectively, as depicted in Fig. 6. The dot-product outputs from the LD arrays of the two PDPE arrays can be represented as:

O+_j = Σ_i P_i T+_ij,   O−_j = Σ_i P_i T−_ij    (13)

These outputs from the j-th rows are received by the j-th IF neuron discussed earlier. The outputs from the positive and negative PDPE arrays are received by the positive and negative integrating ring resonators in the neuron respectively. The two ring resonators integrate in opposite directions based on the two inputs, and the resulting integration mimics the desired integration that a biological 'integrate-and-fire' neuron performs, given by:

V_mem,j[t] = V_mem,j[t−1] + Σ_i P_i T_ij    (14)

Here, Σ_i P_i T_ij = Σ_i (P_i T+_ij − P_i T−_ij), and V_mem,j[t] is the internal state or membrane potential of the j-th neuron at time t.
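The weight decomposition onto the positive and negative PDPE arrays, and the effective weighted sum received by each neuron, can be sketched as follows; T_low and the example matrix are assumed, illustrative values:

```python
import numpy as np

# Weight decomposition onto positive and negative PDPE arrays (Eqn. 12).
T_low = 0.05                      # lowest programmable transmission (assumed)

def split_weights(T):
    T_pos = np.where(T > 0, T, T_low)
    T_neg = np.where(T < 0, np.abs(T), T_low)
    return T_pos, T_neg

T = np.array([[0.4, -0.2],
              [-0.7, 0.1]])
T_pos, T_neg = split_weights(T)

# Effective weighted sums received by each neuron (Eqns. 13-14)
P = np.array([1.0, 1.0])
O = P @ (T_pos - T_neg)           # = O+_j - O-_j for each neuron j
```

Note that each programmed component stays non-negative, as a transmission must be, while the bipolar integration in the neuron recovers the signed sum.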
The resulting membrane potential is passed to a Firing Unit, as described in Fig. 4, such that the neuron produces an output spike once V_mem,j[t] reaches a threshold. The output spikes from all the neurons of the current layer are then fed to the next synaptic array layer. Fig. 6 delineates the operation of the basic building blocks of a neural network. We perform large-scale system-level simulations by emulating a behavioral model of the proposed spike-processing system to assess the performance of neuromorphic systems based on this fabric.

It is important to consider the architecture-level facets of any computing primitive. The proposed design is analogous to memristive crossbars, where the high fan-in into the neurons is resolved by the inherent parallelism of the computing framework. In our design, each neuron receives two inputs, from the positive and negative synaptic arrays, and the output of that neuron is fed to one of the 16 inputs of the synaptic array of the next layer. In reality, neural networks are of far bigger sizes than what the proposed design can accommodate. As a result, multiple instances of the proposed primitive can be used with time-multiplexing to perform the entire vector-matrix multiplication operation. The partial sums from these instances are collected and added before being fed to the neuron. The output from a neuron in turn serves as input to the synaptic arrays storing the weights of the next layer of the neural network. Similar architectures have been explored using memristive technologies [16, 38]. This work is concerned with the device and circuit primitive of a spike-based photonic non-volatile inferencing engine, which will act as the computing core of a large-scale system, similar to technologies in the electrical domain.

VI. RESULTS

A. Simulation Framework

1. Device Simulations

We evaluated the performance of the proposed all-photonic SNN fabric by designing a device-circuit-algorithm co-simulation framework.
First, the devicecharacteristics of each ring resonator in a DPE row issimulated for 4 different degrees of crystallization of theGST element using commercial-grade simulator Lumer-ical FDTD Solutions[39] based on the finite-differencetime-domain (FDTD) method. The fixed parametersused for these simulations are listed in Table I. The mode-profiles were obtained through Electromagnetic simula-tions using the Finite Element method in COMSOL Mul-tiphysics [40]. 2. Device to System Framework The device characteristics, obtained from the FDTDsimulations are analyzed and a Gaussian fit is applied onthe data for interpolation. We develop a device to system λ λ λ λ λ λ λ λ λ N-1 λ N-1 λ N-1 λ N-1 λ N λ N λ N λ N P H O T O D E T E C T O R A RR AYS L ASE R D I O D E A RR AY Neuron M [P , P , …, P N ] [O + ] [O + ][O ] [O M+ ] N × M U L T I P L EXE R × M SP L I TT E R P P P N P N-1 λ λ λ λ λ λ λ λ λ N-1 λ N-1 λ N-1 λ N-1 λ N λ N λ N λ N P H O T O D E T E C T O R A RR AYS L ASE R D I O D E A RR AY Neuron M [P , P , …, P N ] [O - ] [O - ][O ] [O M - ] N × M U L T I P L EXE R × M SP L I TT E R P P P N P N-1 Neuron Neuron Firing Unit ∑ DropDropInput Through Input Through Membrane Potential Neuron Soma Output Spikes NEGATIVE DPE ARRAY POSITIVE DPE ARRAY T + T + T N-1,1+ T N1+ T + T + T N-1,2 + T N2 + T + T + T N-1,3 + T N3 + T + T + T N-1,M + T NM + T - T - T N-1,1 - T N1 - T - T - T N-1,2- T N2- T - T - T N-1,3 - T N3 - T T T N-1,M- T NM- FIG. 6. Schematic of an All-Photonic Spiking Neural Network. Two DPE arrays are deployed to represent the positive andnegative components of the weights. The outputs of the DPE arrays are converted to optical spikes and passed to integrate-and-fire neurons. The structure of an integrate-and-fire neuron is illustrated in a circle. Each neuron has two inputs correspondingoutputs from the positive and negative DPE arrays. The neuron outputs a spike when the membrane potential crosses itsthreshold. 
co-design framework by building behavorial models of theproposed synapses and neurons based on the fitted de-vice characteristics. The models are used to evaluate theinferencing performance of the standard neural networktopology on standard digit recognition task based on theMNIST dataset using the Deep Learning Toolbox[41] inMATLAB. The MNIST dataset consists of 60000 imagesin the training set and 10000 images in the testing set. B. Device Simulations We considered 16 ring resonators of radii linearly in-creasing from 1.5 µm to 1.59 µm in any particular DPErow. The choice of number of devices, N , in a single rowis discussed earlier. The length of the GST element is in-creased accordingly and chosen iteratively to ensure uni-form transmission characteristics across the wavelengthrange of operation. We performed FDTD simulations foreach device with 4 different degrees of crystallization ofGST (30%, 50%, 80%, 100%) and the observed transmis-sion characteristics for the rings are shown in Fig. 7 (a). Expectedly, the transmission for each device decreaseswith decreasing degree of crystallization. The observedFSR was 53.1 nm and difference between the highestand lowest resonant wavelength was 47nm, which is wellwithin the FSR, thus ensuring no interference from reso-nant wavelengths beyond the region of operation. Fig. 7(b) and (c) show the contrast in electric field absorptionby the GST element in the ring resonator for 30% and100% crystallized GST. We observe certain variationsacross different wavelengths which can be minimized byfurther adjustments of lengths of the GST element. How-ever, from the perspective of neuromorphic applications,these variations prove to be insignificant. We will explorethe impact of such variations in our evaluation of the pro-posed neuromorphic processing engine. We exploit thedependence of transmission on degree of crystallizationto realize the synaptic behavior of the rings. Fig. 
8(a) shows the Gaussian fit of the simulated data across degrees of crystallization varying from 0% to 100%. Note that the Gaussian fit provides a fairly accurate representation of the observed data and is a powerful tool to speed up our analysis in light of the computationally expensive

FIG. 7. (a) Normalized transmission for 16 different rings for 4 degrees of crystallization (30%, 50%, 80%, 100%), showing a decreasing trend with decreasing degree of crystallization. The range of wavelengths for the 16 rings is less than the FSR for the design. (b) and (c) show the electric field profile in the ring resonator system, showing visible contrast in optical absorption and field transmission at the 'PASS' port in the GST element for c-GST and 30% c-GST, respectively.

FDTD simulations. It can be observed that transmission has a non-linear relationship with p; hence, operation of the rings as synapses requires the GST element to be programmed to states with non-linearly increasing p. This can be achieved with an appropriate amplitude of the programming stimulus. Fig. 8(b) shows the transmission levels for each ring corresponding to 16 discretized programmable states, or Levels. The degree of crystallization, p, for each state is shown in the inset of Fig. 8(b). The linear relationship between transmission and Levels is a necessity for the target application, i.e., a dot-product operation for neuromorphic computing, which led us to the choice of programmable states with the non-linear distribution of p.

C. Interference Errors

The transmission characteristics of the different rings for varying states of the GST element are used to evaluate the accuracy of the dot-product operation performed using the proposed synaptic network. The error in the computation stems from the overlap in frequency response between adjacent channels.
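The level-selection procedure described above — fitting transmission versus degree of crystallization p and then inverting the fit to find 16 non-linearly spaced states that yield linearly spaced transmission levels — can be sketched as follows. The sample values and the Gaussian form here are illustrative stand-ins, not the FDTD data:

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative transmission samples vs. degree of crystallization p.
# These are stand-ins for the FDTD sweep data described in the text.
p_samples = np.array([0.0, 0.3, 0.5, 0.8, 1.0])
T_samples = np.array([0.95, 0.85, 0.70, 0.43, 0.28])

def gaussian(p, a, b, c):
    """Gaussian model T(p) = a * exp(-((p - b) / c)**2)."""
    return a * np.exp(-((p - b) / c) ** 2)

popt, _ = curve_fit(gaussian, p_samples, T_samples, p0=[1.0, 0.0, 1.0])

# Invert the fit numerically: for 16 linearly spaced transmission levels,
# find the (non-linearly spaced) crystallization state p of each Level.
p_grid = np.linspace(0.0, 1.0, 10001)
T_grid = gaussian(p_grid, *popt)
T_levels = np.linspace(T_grid[0], T_grid[-1], 16)   # linear in transmission
p_levels = np.array([p_grid[np.abs(T_grid - T).argmin()] for T in T_levels])
```

Inverting the fitted curve numerically on a dense grid avoids having to assume the fit is analytically invertible over the operating range; the resulting p_levels are non-uniformly spaced, as the text requires.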
The advantage of the proposed implementation over electrical counterparts is that in the electrical domain, the losses due to line resistance are a function of both the inputs and the weights, rendering them difficult to model. The impact of the error in this setup depends only on the weight level and hence can be easily modeled, analyzed, and even corrected for the proposed application. In Eqn. 9, we formulated a behavioral model of the

FIG. 8. (a) Gaussian fit of simulated data points across degrees of crystallization ranging from 0% to 100%. (b) Linearly varying transmission across 16 different programmable states (Levels) of the GST. Inset shows the degrees of crystallization corresponding to the Levels.

error arising from interference due to adjacent channels. Fig. 9 shows the map of the non-ideality factor α_λi for all 16 rings for 16 different Levels. This was calculated through fitting of the extracted α_λi from Fig. 7(a) based on Eqn. 9. We observe that errors are highest for rings of larger radius and for the highest Levels. This can be attributed to the higher FWHM for rings of larger radius, due to the longer GST elements used to achieve uniform transmission levels across the operating range of wavelengths. We include these error characteristics corresponding to each ring in our system-level evaluation of the proposed photonic SNN inferencing framework.

D. System Level SNN performance

We develop a device-to-algorithm level framework to perform system-level analysis of the photonic SNN implementation. A SNN, like any other neural network, consists of multiple layers of neurons connected through synapses. The unique property of SNNs is that the inputs to the network are discretized spike events instead of analog values. The synapses act as weights which multiply the amplitude of the incoming stimulus, and the resulting weighted sum, i.e., the dot-product of all impulses coming from different synapses, is received by the neuron.

FIG. 9. Map of the non-ideality factor (α_λi) arising due to interference from adjacent rings, for each ring in the DPE row.

We map the device characteristics of each individual synapse and the 'integrate-and-fire' spiking neurons discussed previously to explore the validity of the proposed devices as synapses and neurons in such a SNN. Let us now explain how we evaluate a SNN on the proposed PCM-based photonic inferencing framework. We consider a fully connected neural network consisting of 3 layers, namely, the input layer, the hidden layer, and the output layer, as shown in Fig. 10(a). This type of topology is well explored [42]. For our analysis, we consider a network with M = 784, N = 500, P = 10. We analyze the accuracy of such a network on a standard handwritten digit recognition task based on the MNIST dataset [43]. A popular way of implementing spike-based inferencing systems is to train a network as an Artificial Neural Network (ANN) and then convert it to a SNN by well-explored conversion algorithms [42, 44]. The weights of the network are trained using the backpropagation algorithm [45], as in the case of ANNs. The neurons in ANNs are usually non-linear mathematical functions, such as Rectified Linear Units (ReLU) [46], sigmoid, or tanh, with ReLU being the most popular choice. During conversion, an artificial neuron with ReLU functionality can be directly converted to an IF neuron, mathematically [42]. The details of the operation of the IF neuron are elucidated in our earlier work [26]. The trained weights of the network, after the ANN is converted to a SNN, are mapped to the observed characteristics of each synaptic device in the proposed synaptic network. The synaptic network can operate 16 synapses simultaneously.
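The ReLU-to-IF correspondence used in the conversion can be illustrated with a minimal integrate-and-fire model (reset-by-subtraction with a unit threshold; the parameter values are illustrative, not from the paper):

```python
class IFNeuron:
    """Integrate-and-fire neuron: accumulates weighted input, spikes on
    threshold crossing, then resets by subtracting the threshold."""
    def __init__(self, threshold=1.0):
        self.threshold = threshold
        self.v = 0.0  # membrane potential

    def step(self, weighted_input):
        self.v += weighted_input
        if self.v >= self.threshold:
            self.v -= self.threshold
            return 1  # spike
        return 0

# For a constant non-negative input a, the IF spike rate over many
# time-steps approximates ReLU(a) / threshold (here ReLU(0.3) = 0.3).
a = 0.3
neuron = IFNeuron(threshold=1.0)
spikes = sum(neuron.step(a) for _ in range(1000))
rate = spikes / 1000
```

Reset-by-subtraction (rather than reset-to-zero) is what preserves the residual membrane potential between spikes and makes the long-run rate track the ReLU activation.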
To perform dot-products of larger dimensions, the synaptic network needs to be time-multiplexed, as discussed earlier. To simulate large-dimension operations with the proposed synaptic network, we repeat the device characteristics every 16 synapses.

FIG. 10. (a) Fully connected neural network topology consisting of an input layer (M), a hidden layer (N), and an output layer (P) of neurons. The resulting synaptic networks are of sizes N × M and P × N. (b) Evolution of the classification accuracy on the handwritten digit recognition task based on the MNIST dataset, comparing our proposed photonic SNN to ideal SNN performance. Here, the ideal SNN corresponds to software-level functionality without considering device characteristics.

The weights of the network can be negative. To account for negative weights, two dot-product engines are deployed, as shown in Fig. 6 and described earlier. The pixels of input images of size 28 × 28 are converted into streams of spikes whose frequency is proportional to the pixel intensity. At every time-step, the input can either be '0' when there is no spike or '1' in the event of a spike. The behavioral model of the SNN inferencing framework described above was implemented using the MATLAB Deep Learning Toolbox [41] with the network topology shown in Fig. 10(a). The network is evaluated at every time-step by passing the inputs through the forward path from the input layer to the output layer through the synaptic network, and the activity of the network is recorded. Finally, the output neuron with the highest spiking activity is compared with the label of the input image to determine the accuracy of the recognition system. The classification performance of the proposed photonic SNN is compared with an ideal SNN in Fig. 10(b).
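The per-time-step evaluation described above can be sketched as follows. The Bernoulli draw is a standard stand-in for Poisson rate coding of pixel intensities, and the weight matrix is a random placeholder rather than the trained MNIST weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rate-code normalized pixel intensities into binary spikes: at each
# time-step a pixel emits '1' with probability equal to its intensity.
pixels = rng.random(784)                        # stand-in for a 28x28 image
spikes = (rng.random(784) < pixels).astype(float)

# Signed weights split across a positive and a negative DPE array
# (as in Fig. 6); each array holds only non-negative transmissions.
W = rng.standard_normal((500, 784)) * 0.1       # placeholder weights
W_pos = np.clip(W, 0.0, None)
W_neg = np.clip(-W, 0.0, None)

# Weighted sum delivered to the hidden-layer neurons at this time-step:
# positive-array output minus negative-array output.
weighted_sum = W_pos @ spikes - W_neg @ spikes
```

Since W = W_pos − W_neg by construction, the difference of the two array outputs recovers the signed dot-product exactly.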
Here, the ideal SNN essentially means a software-level evaluation without taking device characteristics into consideration. We observe a degradation in accuracy of 0.52% after 35 time-steps relative to the ideal case, arising from the different variations in device characteristics discussed earlier. Note that the concept of time-steps here corresponds to how many times we evaluate the network over the Poisson-distributed input spikes generated from the image. The duration of a time-step is not relevant in this context, as we do not include any temporal dynamics in the system. We further attempted to isolate the contribution of synaptic device variations to the observed degradation in accuracy by considering a comparison test case: ideal synapses with the proposed neurons. That accuracy degradation amounted to 0.1% after 35 time-steps, implying 0.42% degradation due to synaptic variations.

We evaluated the energy consumption of the basic building blocks of our system: the synaptic array and the neurons. The energy consumed by each synapse can be estimated from the transmission (i.e., the weight) of the synaptic device. As the information being processed is based on spike events, the input can either be '1' or '0'. Experimental demonstrations [22] have shown that read-out for GST-based Si photonic devices can be achieved with pulse energies of 0.48 pJ. For our case, due to the smaller GST footprint, we consider input '1' to correspond to a pulse of amplitude 0.25 mW. The power consumed by the synapse is thus given by (1 − T) × 0.25 mW, where T is the transmission of the synapse. As these read pulses eventually write into the neurons, we choose a pulsewidth of 200 ps, which is the minimum pulsewidth required to write into the GST, as we observed previously [26]. Considering these metrics for the read pulses and the power calculation for each synapse, we estimated the energy consumption of the entire classification operation described above.
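Under these figures, the per-spike synaptic energy follows directly from the pulse amplitude and width. A quick sanity check, assuming (as above) that the power absorbed by a synapse of transmission T is (1 − T) times the 0.25 mW read-pulse power:

```python
# Energy dissipated in one synapse per '1' input, from the read-pulse
# figures in the text: 0.25 mW amplitude, 200 ps pulsewidth.
P_READ = 0.25e-3    # read-pulse power, W
T_PULSE = 200e-12   # pulsewidth, s

def synapse_energy(T):
    """Energy (J) absorbed by a synapse of transmission T per input spike."""
    return (1.0 - T) * P_READ * T_PULSE

# e.g. a half-transmitting synapse dissipates
#   0.5 * 0.25e-3 W * 200e-12 s = 25 fJ per spike
e_half = synapse_energy(0.5)
```

The average over all synapses, weighted by spiking activity in each layer, is what produces the per-layer energy figures quoted below; sparser activity in the second layer directly lowers its average.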
The resulting average energy consumption for the first layer of the neural network in the synaptic array was calculated to be ∼ fJ per synapse per time-step of evaluation. For the second layer, the energy consumption was ∼ fJ per synapse per time-step. The difference in energy consumption between the two layers is due to the sparser spiking activity in the second layer. The energy consumed by each neuron was calculated in our previous work to be 5 pJ per time-step. The writing energies for PCM devices of similar feature sizes [47, 48] in the electrical domain can amount to 14–19 pJ while operating at speeds of 40–100 ns. The total energy consumption for an image classification was calculated to be ∼261 nJ (178 nJ consumed by the synaptic operations and 83 nJ consumed by the neurons). Although this energy consumption is comparable to CMOS technology [49], photonics potentially offers faster operation at sub-ns speeds. Note that in this work we have considered a significantly high read pulse (0.25 mW) through the synapses, which is reflected in the high energy per inference operation. The proposed synapses can potentially be read with a pulse of lower amplitude, depending on the sensitivity of the photodetectors, which would significantly improve the energy requirements of the system. Moreover, the speed of operation in the pho-

FIG. 11. (a) Structure and arrangement of the input write waveguide at a distance t_gap from the synaptic device. The width of the write waveguide (W_write) is smaller than that of the ring waveguide (W_wg) for asymmetric coupling. (b) Transmission characteristics of the 1.59 µm ring for different values of t_gap, compared with the case without a write waveguide. Inset 1 (blue) shows a zoomed-in view of the transmission characteristics to distinguish the different cases clearly. Inset 2 (red) shows the variation of the percentage error in transmission at the read wavelength of 1562.85 nm with t_gap.

tonic domain is significantly higher, since read latencies of memristor-based neuromorphic systems usually occur on the order of ns. These benefits encouraged us to further explore the possibility of neuromorphic hardware design based on this technology.

VII. DISCUSSION

The proposed photonic SNN inferencing framework fills a major void in scaling from devices to systems in current state-of-the-art photonic neuromorphic works based on PCMs. However, a few challenges stand in the way of a physical demonstration of the proposal. Firstly, reconfigurability of the proposed non-volatile synaptic array is a necessity. Various reconfigurability schemes have been explored on phase-change-based photonic platforms [24, 32]. We explored the possibility of adding an input bend waveguide (WG_write) as a writing port for each synapse, at a distance such that the inferencing framework is unaffected. The width of WG_write (W_write) is intentionally chosen to be much smaller than that of the ring waveguide of the synaptic device. This is done to achieve asymmetric coupling, such that during writing the wave leaks out of WG_write appropriately for efficient writing, while during standard inferencing operation the wave remains mostly confined within the ring. Fig. 11(a) shows the structure and arrangement of WG_write adjacent to the proposed synaptic device; t_gap denotes the distance between the ring waveguide and WG_write. We observe that the error in transmission during normal inferencing operation due to the presence of WG_write is around 0.5% for t_gap ∼ nm. For the same distance, we calculated the transient field coupling from WG_write to the ring to be 70%. Thus, this writing scheme is a viable option for achieving reconfigurability in the proposed network.

The dimensions chosen for our analysis are catered towards achieving desirable functionality for ring resonators of small radii, around ∼1.5 µm.
The main motivation behind using small ring resonators was to achieve high area density for scalability. We have explored a number of challenges arising from such small rings, such as non-uniform bending and coupling losses across the wavelength range and fabrication difficulties in achieving critical coupling, and have attempted to mitigate them by appropriate design. Further, we delineated the design constraints for scaling individual synapses to a network of synapses, which is necessary for large-scale neuromorphic systems. GST-based photonic platforms also experience a small resonance shift between the different programmable states of the PCM. The resonance shift between any two states can be quantified by [23]:

Δλ_m / λ_m,in = (Δn_eff,GST / n_g,eff) · L_GST / (2πR_ring)    (15)

Here, λ_m,in is the resonant wavelength in the initial state, Δn_eff,GST is the difference in effective refractive index between the states, and n_g,eff is the group index. For our case, it amounts to approximately 0.012 nm. In addition to the variations arising from device characteristics, we also explored errors arising due to interference from adjacent channels and their impact on the performance of the proposed photonic SNN. From our analysis, it can be observed that the network size N considered in our synaptic fabric is a rather conservative design. N can be further increased, which would result in higher errors. However, the effect of such variations has been modeled in Eqn. (9), and the resulting accuracy degradation can be recovered by modifying the training algorithm, as explored for memristive technologies [50].

The challenge of errors arising from interference between adjacent rings essentially stems from the use of WDM-based computation. To that effect, the limitation on array size due to WDM merits discussion. WDM, while introducing parallelism into the system, is constrained by the finesse of the rings.
In this work, we have shown that we can use 16 rings in a single dot-product engine row, which implies that the array can process 16 inputs in parallel. The size of the array is thus limited to 16 × N, where N would be limited by area rather than design constraints. However, analogous computing units in the electrical domain using memristive crossbars are also limited in size, due to electromigration limits, sneak paths, and line resistances. The photonic array, on the other hand, although limited in one direction due to finesse, can possibly be extended to larger sizes in the direction of N. Moreover, time multiplexing is a popular practice when implementing large-scale neural networks on memristive networks, as alluded to earlier. The possibility of fast writing into PCMs can potentially make these photonic arrays more suitable for temporally scalable architectures.

An alternative way to implement photonic neural networks is through the use of interferometers [18], where the weights of the network are controlled through phase-shifters. Such phase-shifters can consume a significant amount of power per synapse to maintain the weight. On the other hand, non-volatile elements based on PCMs can potentially encode the weights without requiring any power to maintain their states. However, we do not use the concept of phase shift in our design; we encode the weights in terms of levels of partial crystallization. Non-volatility is necessary for large-scale neuromorphic systems for primarily two reasons: i) it eliminates the need for phase-shifters, as constant tuning is not required, and ii) it provides a platform for in-memory computing rather than storing the synaptic weights in a separate memory. In this work, the intention of using a non-volatile-material-based memory primitive is to eliminate the need for thermal tuners. To the best of our knowledge, this is the first proposal of a photonic neuromorphic platform from a scalable-system point of view based on a non-volatile memory primitive.
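The time-multiplexing practice alluded to above — folding a long dot product into sequential passes over a 16-input WDM row — reduces to simple chunked accumulation. A minimal sketch, with placeholder vector lengths and values:

```python
import numpy as np

N_WDM = 16  # inputs processed in parallel by one DPE row (finesse-limited)

def time_multiplexed_dot(w, x, chunk=N_WDM):
    """Accumulate a long dot product in chunk-wide WDM passes, one pass
    per group of `chunk` rings/wavelengths."""
    acc = 0.0
    for start in range(0, len(x), chunk):
        acc += np.dot(w[start:start + chunk], x[start:start + chunk])
    return acc

rng = np.random.default_rng(1)
w, x = rng.random(784), rng.random(784)
n_passes = -(-784 // N_WDM)  # ceil(784 / 16) = 49 sequential passes
result = time_multiplexed_dot(w, x)
```

With the fast (sub-ns) PCM write times discussed in the text, the latency cost of these sequential passes is far smaller than for equivalent time multiplexing on electrical memristive arrays.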
Recent proposals [51, 52] have looked at scalable systems to realize complex neural dynamics for dynamic learning. However, the flux-based memory in such systems is dependent on temperature and also on the run-time of operation. Such detailed neuro-biological functionalities make them more suitable for brain-like simulations, similar to NeuroGrid [53] in the electrical domain. In this work, we do not incorporate complex biological dynamics of SNNs in our system; rather, we focus on leveraging the inherent sparsity of spike-based processing while performing image classification for energy efficiency. The primary motivation behind exploring this primitive stems from building a potentially reconfigurable neuromorphic system which performs energy-efficient inferencing. For building such neuromorphic platforms to perform spike-based processing in standard architectures, in-memory computing offers significant promise. To that effect, non-volatile memory primitives are quintessential and more suitable, as they potentially eliminate the need for off-chip DRAM accesses, thus alleviating memory bottlenecks.

A popular way of implementing such spike-based inferencing systems is to train a network as an Artificial Neural Network (ANN) and then convert it to a Spiking Neural Network (SNN) by well-explored conversion algorithms [42]. This method has seen considerable success [44] in image classification, far beyond the scope of spike-based training algorithms. The neurons in ANNs are usually non-linear mathematical functions, such as Rectified Linear Units (ReLU), sigmoid, or tanh, with ReLU being the most popular choice. During conversion, an artificial neuron with ReLU functionality can be directly converted to an IF neuron, mathematically [44]. This explains why we have chosen the IF neuron as the spiking neuron in our proposal.
IF neurons are not associated with time-constants, as they do not include leak factors, and their operation is fairly simple compared to other spiking neurons. This proposal concerns building a spike-based photonic neuromorphic inferencing platform for image classification tasks. Note that the neuron does not bear exact resemblance to a biological neuron; however, the design leverages the event-driven behavior of biological neurons. The aim of this work is to build a fast neuromorphic inferencing platform in the spiking domain to perform machine learning tasks such as image classification. Several works [53] have previously explored brain-like neuron and synaptic functionalities with closer biological resemblance for complex neural simulations, albeit in the electrical domain.

The major advantage of building neuromorphic systems based on photonics rests in its speed of operation. The primary bottleneck in 'write' latencies arises from the programming time of the IF neuron, which can also be performed at 200 ps. Although the current technology is power-expensive during writing, the speed of writing still enables us to achieve a reasonable energy efficiency. With further optimization of switching techniques, or by the use of alternative PCMs with lower switching power, further energy benefits can be aimed for, to achieve energy consumption comparable to other technologies in the electrical domain. In turn, the proposed photonic computing platform eliminates various drawbacks usually faced by its electrical counterparts, such as metal wire resistance, electromigration, and sneak paths. Despite the inherent challenges in design and implementation, our proposed SNN framework based on a GST-on-silicon photonic neuromorphic fabric enables parallelism through integration of a synaptic network with IF neurons. Such a design paves the way for scalable photonic architectures suitable for large-scale neuromorphic systems catered to perform fast computations.

VIII.
CONCLUSION

We have proposed a photonic Spiking Neural Network computing primitive through seamless integration of non-volatile synapses and 'Integrate-and-Fire' neurons based on phase-change materials. The microring resonator devices explored for such synapses and neurons leverage the differential optical absorption of GST for non-volatility. We use the WDM technique to scale individual synapses into a large-scale synaptic array capable of performing parallelized dot-products. Our design is based on ring resonators of radius comparable to the wavelength of operation, in order to achieve high area density while maintaining performance. We explored several challenges involved in such small ring resonators and proposed design modifications to achieve uniform and desirable characteristics across the entire operating range of wavelengths. Finally, we developed a device-to-system-level framework to evaluate the performance of the proposed photonic in-memory computing primitive and IF neurons as an SNN inferencing engine, by building behavioral models of the photonic neuromorphic fabric, and achieved performance comparable to an ideal network. Neuromorphic systems based on integrated photonics offer an alternative dimension to the current wave of exploration beyond von-Neumann computing frameworks, and our proposed photonic SNN inferencing engine takes a significant step from individual non-volatile devices capable of performing in-memory computing toward a network of such devices realizing a truly integrated Spiking Neural Network.

ACKNOWLEDGMENT

The work was supported in part by the ONR-MURI program, the National Science Foundation, Intel Corporation, and the DoD Vannevar Bush Fellowship.

[1] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, "Deep learning," Nature, 436–444 (2015).
[2] David Silver, Aja Huang, Chris J.
Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, and Demis Hassabis, "Mastering the game of Go with deep neural networks and tree search," Nature, 484–489 (2016).
[3] Murray Campbell, A. Joseph Hoane, and Feng-hsiung Hsu, "Deep Blue," Artificial Intelligence, 57–83 (2002).
[4] R. Serrano-Gotarredona, M. Oster, P. Lichtsteiner, A. Linares-Barranco, R. Paz-Vicente, F. Gomez-Rodriguez, L. Camunas-Mesa, R. Berner, M. Rivas-Perez, T. Delbruck, Shih-Chii Liu, R. Douglas, P. Hafliger, G. Jimenez-Moreno, A. C. Ballcels, T. Serrano-Gotarredona, A. J. Acosta-Jimenez, and B. Linares-Barranco, "CAVIAR: A 45k neuron, 5M synapse, 12G connects/s AER hardware sensory–processing–learning–actuating system for high-speed visual object recognition and tracking," IEEE Transactions on Neural Networks, 1417–1438 (2009).
[5] P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura, B. Brezzo, I. Vo, S. K. Esser, R. Appuswamy, B. Taba, A. Amir, M. D. Flickner, W. P. Risk, R. Manohar, and D. S. Modha, "A million spiking-neuron integrated circuit with a scalable communication network and interface," Science, 668–673 (2014).
[6] Steve B. Furber, Francesco Galluppi, Steve Temple, and Luis A. Plana, "The SpiNNaker project," Proceedings of the IEEE, 652–665 (2014).
[7] John von Neumann, The Computer and the Brain (Yale University Press, 2012).
[8] Avishek Biswas and Anantha P.
Chandrakasan, "Conv-RAM: An energy-efficient SRAM with embedded convolution computation for low-power CNN-based machine learning applications," in (IEEE, 2018).
[9] Akhilesh Jaiswal, Indranil Chakraborty, Amogh Agrawal, and Kaushik Roy, "8T SRAM cell as a multi-bit dot product engine for beyond von-Neumann computing," arXiv preprint arXiv:1802.08601 (2018).
[10] C. Mead, "Neuromorphic electronic systems," Proceedings of the IEEE, 1629–1636 (1990).
[11] Can Li, Miao Hu, Yunning Li, Hao Jiang, Ning Ge, Eric Montgomery, Jiaming Zhang, Wenhao Song, Noraica Dávila, Catherine E. Graves, Zhiyong Li, John Paul Strachan, Peng Lin, Zhongrui Wang, Mark Barnell, Qing Wu, R. Stanley Williams, J. Joshua Yang, and Qiangfei Xia, "Analogue signal and image processing with large memristor crossbars," Nature Electronics, 52–59 (2017).
[12] Abhronil Sengupta and Kaushik Roy, "Encoding neural and synaptic functionalities in electron spin: A pathway to efficient neuromorphic computing," Applied Physics Reviews, 041105 (2017).
[13] Sukru B. Eryilmaz, Duygu Kuzum, Rakesh Jeyasingh, SangBum Kim, Matthew BrightSky, Chung Lam, and H.-S.
Philip Wong, "Brain-like associative learning using a nanoscale non-volatile phase change synaptic device array," Frontiers in Neuroscience (2014), 10.3389/fnins.2014.00205.
[14] Tomas Tuma, Angeliki Pantazi, Manuel Le Gallo, Abu Sebastian, and Evangelos Eleftheriou, "Stochastic phase-change neurons," Nature Nanotechnology, 693–699 (2016).
[15] Bipin Rajendran, Yong Liu, Jae-sun Seo, Kailash Gopalakrishnan, Leland Chang, Daniel J. Friedman, and Mark B. Ritter, "Specifications of nanoscale devices and circuits for neuromorphic computational systems," IEEE Transactions on Electron Devices, 246–253 (2013).
[16] Ali Shafiee, Anirban Nag, Naveen Muralimanohar, Rajeev Balasubramonian, John Paul Strachan, Miao Hu, R. Stanley Williams, and Vivek Srikumar, "ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars," ACM SIGARCH Computer Architecture News, 14–26 (2016).
[17] Kristof Vandoorne, Pauline Mechet, Thomas Van Vaerenbergh, Martin Fiers, Geert Morthier, David Verstraeten, Benjamin Schrauwen, Joni Dambre, and Peter Bienstman, "Experimental demonstration of reservoir computing on a silicon photonics chip," Nature Communications (2014), 10.1038/ncomms4541.
[18] Yichen Shen, Nicholas C. Harris, Scott Skirlo, Mihika Prabhu, Tom Baehr-Jones, Michael Hochberg, Xin Sun, Shijie Zhao, Hugo Larochelle, Dirk Englund, et al., "Deep learning with coherent nanophotonic circuits," Nature Photonics, 441 (2017).
[19] Alexander N. Tait, Mitchell A. Nahmias, Bhavin J. Shastri, and Paul R. Prucnal, "Broadcast and weight: an integrated network for scalable photonic spike processing," Journal of Lightwave Technology, 3427–3439 (2014).
[20] Alexander N. Tait, Thomas Ferreira de Lima, Ellen Zhou, Allie X. Wu, Mitchell A. Nahmias, Bhavin J. Shastri, and Paul R.
Prucnal, "Neuromorphic photonic networks using silicon photonic weight banks," Scientific Reports (2017), 10.1038/s41598-017-07754-z.
[21] Carlos Ríos, Nathan Youngblood, Zengguang Cheng, Manuel Le Gallo, Wolfram H. P. Pernice, C. David Wright, Abu Sebastian, and Harish Bhaskaran, "In-memory computing on a photonic platform," arXiv preprint arXiv:1801.06228 (2018).
[22] Carlos Ríos, Matthias Stegmaier, Peiman Hosseini, Di Wang, Torsten Scherer, C. David Wright, Harish Bhaskaran, and Wolfram H. P. Pernice, "Integrated all-photonic non-volatile multi-level memory," Nature Photonics, 725–732 (2015).
[23] Matthias Stegmaier, Carlos Ríos, Harish Bhaskaran, C. David Wright, and Wolfram H. P. Pernice, "Non-volatile all-optical 1 × ," 1600346 (2016).
[24] Jiajiu Zheng, Amey Khanolkar, Peipeng Xu, Shane Colburn, Sanchit Deshmukh, Jason Myers, Jesse Frantz, Eric Pop, Joshua Hendrickson, Jonathan Doylend, Nicholas Boechler, and Arka Majumdar, "GST-on-silicon hybrid nanophotonic integrated circuits: a non-volatile quasi-continuously reprogrammable platform," Optical Materials Express, 1551 (2018).
[25] Zengguang Cheng, Carlos Ríos, Wolfram H. P. Pernice, C. David Wright, and Harish Bhaskaran, "On-chip photonic synapse," Science Advances, e1700160 (2017).
[26] Indranil Chakraborty, Gobinda Saha, Abhronil Sengupta, and Kaushik Roy, "Toward fast neural computing using all-photonic phase change spiking neurons," Scientific Reports, 12980 (2018).
[27] Lin Yang, Ruiqiang Ji, Lei Zhang, Jianfeng Ding, and Qianfan Xu, "On-chip CMOS-compatible optical signal processor," Optics Express, 13560–13565 (2012).
[28] Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio, "Binarized neural networks," in Advances in Neural Information Processing Systems (2016) pp.
4107–4115.
[29] Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi, "XNOR-Net: ImageNet classification using binary convolutional neural networks," in Computer Vision – ECCV 2016 (Springer International Publishing, 2016) pp. 525–542.
[30] Yiguo Chen, Xiong Li, Yannick Sonnefraud, Antonio I. Fernández-Domínguez, Xiangang Luo, Minghui Hong, and Stefan A. Maier, "Engineering the phase front of light with phase-change material based planar lenses," Scientific Reports (2015), 10.1038/srep08660.
[31] Nikolai V. Voshchinnikov, Gorden Videen, and Thomas Henning, "Effective medium theories for irregular fluffy structures: aggregation of small particles," Applied Optics, 4065 (2007).
[32] Wolfram H. P. Pernice and Harish Bhaskaran, "Photonic non-volatile memories using phase change materials," Applied Physics Letters, 171101 (2012).
[33] Wim Bogaerts, Peter De Heyn, Thomas Van Vaerenbergh, Katrien De Vos, Shankar Kumar Selvaraja, Tom Claes, Pieter Dumon, Peter Bienstman, Dries Van Thourhout, and Roel Baets, "Silicon microring resonators," Laser & Photonics Reviews, 47–73 (2012).
[34] Qianfan Xu, David Fattal, and Raymond G. Beausoleil, "Silicon microring resonators with 1.5-µm radius," Optics Express, 4309 (2008).
[35] David E. Aspnes and A. A. Studna, "Dielectric functions and optical parameters of Si, Ge, GaP, GaAs, GaSb, InP, InAs, and InSb from 1.5 to 6.0 eV," Physical Review B, 985 (1983).
[36] I. H. Malitson, "Interspecimen comparison of the refractive index of fused silica," JOSA, 1205–1209 (1965).
[37] Sang-Youl Kim, Sang J. Kim, Hun Seo, and Myong R. Kim, "Variation of the complex refractive indices with Sb-addition in Ge-Sb-Te alloy and their wavelength dependence," in Optical Data Storage '98, Vol. 3401 (International Society for Optics and Photonics, 1998) pp.
112–116.
[38] Aayush Ankit, Abhronil Sengupta, Priyadarshini Panda, and Kaushik Roy, "RESPARC: A reconfigurable and energy-efficient architecture with memristive crossbars for deep spiking neural networks," in Proceedings of the 54th Annual Design Automation Conference 2017 (ACM, 2017) p. 27.
[39] Lumerical, Lumerical Inc. (2017).
[40] COMSOL, Multiphysics Reference Guide for COMSOL 4.2 (2011).
[41] Rasmus Berg Palm, "Prediction as a candidate for learning deep hierarchical models of data," Technical University of Denmark (2012).
[42] Peter U. Diehl, Daniel Neil, Jonathan Binas, Matthew Cook, Shih-Chii Liu, and Michael Pfeiffer, "Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing," in Neural Networks (IJCNN), 2015 International Joint Conference on (IEEE, 2015) pp. 1–8.
[43] "MNIST handwritten digit database," http://yann.lecun.com/exdb/mnist/.
[44] Abhronil Sengupta, Yuting Ye, Robert Wang, Chiao Liu, and Kaushik Roy, "Going deeper in spiking neural networks: VGG and residual architectures," arXiv preprint arXiv:1802.02627 (2018).
[45] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams, "Learning representations by back-propagating errors," Nature, 533 (1986).
[46] Vinod Nair and Geoffrey E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10) (2010) pp. 807–814.
[47] Benjamin C. Lee, Engin Ipek, Onur Mutlu, and Doug Burger, "Architecting phase change memory as a scalable DRAM alternative," in ACM SIGARCH Computer Architecture News, Vol. 37 (ACM, 2009) pp.
2–13.
[48] H.-S. Philip Wong, Simone Raoux, SangBum Kim, Jiale Liang, John P. Reifenberg, Bipin Rajendran, Mehdi Asheghi, and Kenneth E. Goodson, "Phase change memory," Proceedings of the IEEE, 2201–2227 (2010).
[49] Abhronil Sengupta, Maryam Parsa, Bing Han, and Kaushik Roy, "Probabilistic deep spiking neural systems enabled by magnetic tunnel junction," IEEE Transactions on Electron Devices, 2963–2970 (2016).
[50] Indranil Chakraborty, Deboleena Roy, and Kaushik Roy, "Technology aware training in memristive neuromorphic systems for nonideal synaptic crossbars," IEEE Transactions on Emerging Topics in Computational Intelligence, 335–344 (2018).
[51] Jeffrey M. Shainline, Adam N. McCaughan, Sonia M. Buckley, Christine A. Donnelly, Manuel Castellanos-Beltran, Michael L. Schneider, Richard P. Mirin, and Sae Woo Nam, "Superconducting optoelectronic neurons III: Synaptic plasticity," arXiv preprint arXiv:1805.01937 (2018).
[52] Jeffrey M. Shainline, Jeff Chiles, Sonia M. Buckley, Adam N. McCaughan, Richard P. Mirin, and Sae Woo Nam, "Superconducting optoelectronic neurons V: Networks and scaling," arXiv preprint arXiv:1805.01942 (2018).
[53] Ben Varkey Benjamin, Peiran Gao, Emmett McQuinn, Swadesh Choudhary, Anand R. Chandrasekaran, Jean-Marie Bussat, Rodrigo Alvarez-Icaza, John V. Arthur, Paul A. Merolla, and Kwabena Boahen, "Neurogrid: A mixed-analog-digital multichip system for large-scale neural simulations," Proceedings of the IEEE 102