Optical Neural Networks: The 3D connection

Abstract

We motivate a new canonical strategy for integrating photonic neural networks (NNs) by leveraging 3D printing. Our belief is that a NN's parallel and dense connectivity is not scalable without 3D integration. 3D additive fabrication complemented with photonic signal transduction can dramatically augment the current capabilities of 2D CMOS and integrated photonics. Here we review some of our recent advances made towards such a breakthrough architecture.

Niyazi Ulas Dinc, Demetri Psaltis, and Daniel Brunner a)
Optics Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.
Département d'Optique P. M. Duffieux, Institut FEMTO-ST, Université Bourgogne-Franche-Comté CNRS UMR 6174, Besançon, France.
(Dated: 31 August 2020)

a) Electronic mail: daniel.brunner@femto-st.fr
arXiv [cs.ET]

FIG. 1. (A) In a neural network (NN), typically millions of connections link simple nonlinear neurons, which are arranged in layers. (B) In the brain, short-, medium- and long-range (a, b, c, respectively) neural connections are established in the volume of white and grey matter. Adapted from Schüz et al., Encyclopedia of Neuroscience 2009.

I. INTRODUCTION

Several decades passed between the introduction and the large-scale exploration of neural networks (NNs). Since the proposal of simple NNs in 1943, the field has gone through multiple cycles of euphoria and challenges until reaching today's large-scale interest and exploitation. Readily available high-performance computing systems now allow emulating powerful (deep) NN architectures whose connections are optimized based on computationally expensive learning concepts such as gradient back-propagation. As a consequence, NNs currently excel on previously unseen scales, but at the same time the constraints of today's CMOS-based computing threaten to limit the reach of this revolution.

As illustrated by their name, the initial objective of NNs, cf. Fig. 1(A), was providing "a logical calculus of the ideas immanent in nervous activity", and as such their composition mirrors a most rudimentary aspect of the mammalian neocortex: nodes are densely linked into a network with connections, much like synapses, dendrites and axons connecting biological neurons. However, this is only possible in the context of a global structural property of the neocortex in which neurons, and even more so connections, are distributed across a 3D volume, cf. Fig. 1(B). The majority of cortical neurons are arranged in planes located inside the grey matter that wraps around the brain, and stacks of neurons form short-range connections (labelled a in Fig. 1(B)) which traverse the grey matter's volume. Crucially, grey matter encloses white matter, and inside this volume the brain's long-range connections (labelled b and c in Fig. 1(B)) are located. 3D connections are therefore a canonical feature of brain architecture. The scale and connectivity of the human brain's network would otherwise simply not fit inside the human skull. The brain therefore provides a very good primer for exploiting 3D circuit topology.
Even though the 3D topology of brains emerged from evolutionary development, science and engineering can deliberately combine advantageous strategies and concepts. Combining the 3D network topology of biological brains with photonic signal transduction is a highly appealing strategy for next-generation NN computing. In this paper, we elaborate on the potential of 3D printing technology for integrated photonic NN chips. Such additive fabrication enables true 3D integration and naturally complements the mostly 2D lithography that struggles to implement parallel NN connections with a scalable strategy. Photonics offers fundamental energy, speed and latency advantages when establishing the communication between NN neurons along the staggering number of network connections. 3D printing is a potential path for 3D integration of optically interconnected Si or other electro-optic chips.

II. CANONICAL 3D PHOTONIC NEURAL NETWORK ARCHITECTURE

Physically realizing dense connections for the large number of neurons (typically >1000 units) contained in each NN layer is a formidable challenge. A parallel NN processor needs to provide a dedicated physical link for each connection, which is difficult since the number of possible connections scales quadratically with the number of neurons. A connection's defining property is its strength, and its physical implementation, for example by memristors, micro-rings or holographic memory, always occupies some basic unit of area or volume. Integration in 2D results in a quadratic scaling of the circuit's area with a network's size, cf. Fig. 2(A). In a 3D implementation weights can be stacked, for example in planes, and for the simplest organization both the number of required planes and the number of memory elements per plane scale linearly with the number of neurons. This mitigates the size-scalability roadblock, and 3D routing may well be a fundamental prerequisite for scalable and parallel NN chips. Realizing such 3D circuits electronically is challenging due to the capacitive coupling and the associated energy dissipation when sending information along signalling wires.

FIG. 2. (A) Realizing the connections in 2D interconnects is not scalable, and 3D integration is essential for parallel NN integration. (B) In our canonical 2D/3D photonic NN, neurons are arranged in 2D while connections are established in the 3D volume between layers of neurons; in the illustrated example, the NN correctly identifies an apple.

In order to overcome these challenges we investigated a new, canonical photonic NN architecture where neurons in the form of nonlinear components are arranged in 2D sheets, while connections are integrated in 3D printed photonic circuits, cf. Fig. 2(B). We do not constrain the nature of photonic neurons or the 3D routing strategy.
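
The scaling argument made earlier in this section can be illustrated with a small numerical sketch. It assumes, purely for illustration, that every weight occupies one square cell of side `pitch_um` and that a layer of N neurons is fully connected to the next layer of N neurons. The parameter `pitch_um` and both function names are hypothetical and do not come from the experiments reviewed here.

```python
# Illustrative footprint scaling for one fully connected layer of N neurons.
# Assumption (not from the paper): each weight occupies a square cell of
# side `pitch_um` micrometres; the default of 1 um is an arbitrary placeholder.
from typing import Tuple


def footprint_2d(n_neurons: int, pitch_um: float = 1.0) -> float:
    """Chip area in um^2 when all N*N weights share a single plane."""
    return (n_neurons ** 2) * pitch_um ** 2


def footprint_3d(n_neurons: int, pitch_um: float = 1.0) -> Tuple[float, int]:
    """Simplest stacked layout: N planes holding N weights each.

    Returns (area of one plane in um^2, number of planes).
    """
    return n_neurons * pitch_um ** 2, n_neurons


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        area_2d = footprint_2d(n)
        area_3d, planes = footprint_3d(n)
        print(f"N={n:>6}: 2D area {area_2d:.1e} um^2 | "
              f"3D footprint {area_3d:.1e} um^2 across {planes} planes")
```

Doubling N quadruples the 2D area, whereas in the stacked layout the per-plane footprint and the number of planes each only double. The total number of weight cells is N² in both cases; only their arrangement differs.
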
All-optical as well as electro-optical components acting as neurons are possible, and the 3D photonic interconnect can be realized by refractive index modifications in a 3D medium, multiple stacks of diffractive-optics planes, or complex 3D circuitry of photonic waveguides.

III. 3D NANO-PRINTING TECHNOLOGY

Additive manufacturing (AM) has been a popular method for prototyping ever since it was developed in the 1980s, as it does not require special tooling or molds. However, its true advantage over most conventional manufacturing methods is AM's ability to produce 3D parts of great complexity, which is unfeasible or even impossible with subtractive or 2D lithographic methods. Among various AM techniques, two-photon polymerization (TPP) is of special interest since it provides sub-micron feature sizes in materials that are transparent in the optical domain, with refractive index values close to those of glass. TPP utilizes femtosecond lasers to expose and polymerize photoresists. The two-photon process is of significance as it enables feature sizes below the Abbe diffraction limit thanks to the polymerization's quadratic dependence on exposure intensity. One-photon processes in turn yield larger polymerized voxels due to a linear dependence of polymerization on exposure intensity. Control of the light intensity threshold for polymerization and quenching effects further contribute to sub-diffraction resolution. The TPP exposure dose can be controlled through scanning speed and laser intensity, which provides control over the degree of polymerization of the photoresist and hence over the local refractive index. This enables the printing of graded-index (GRIN) elements. 3D direct-laser writing systems offer robust, commercial TPP setups where complex optical elements can be printed (cf. Fig. 3) at different resolutions by selecting among different resin-objective pairs. In subsequent sections, we present different optical elements that were fabricated by a Nanoscribe 3D printer.

FIG. 3. (A) 3D printing scheme with the objective focusing the femtosecond laser pulse into the photoresist. (B) Layer-by-layer printing.

FIG. 4. SEM micrographs of 3D printed waveguides realizing parallel interconnects with high connectivity (A) and according to Haar filters (B).

For the concepts presented in this paper, the most important feature of AM/TPP is the ability to access independently each voxel in the fabrication volume, which enables holographic as well as waveguide-based photonic connections. From the holography point of view it is key to go beyond 1/M², which is the efficiency relation where M is the number of multiplexed holograms.
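
To give a feeling for what is at stake in the 1/M² relation, the following sketch compares the per-hologram efficiency under a 1/M² law with the 1/M law that this section associates with voxel-by-voxel or multilayer construction. Prefactors are normalised to one, which is an assumption made only for illustration, since real efficiencies depend on the material's dynamic range. The function names are hypothetical.

```python
# Per-hologram diffraction efficiency versus the number M of multiplexed
# holograms. Prefactors are normalised to 1 (an illustrative assumption).


def efficiency_multi_exposure(m: int) -> float:
    """1/M^2 law for holograms recorded by multiple optical exposures."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 / m ** 2


def efficiency_voxelwise(m: int) -> float:
    """1/M law for voxel-by-voxel or multilayer construction."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 / m


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        multi = efficiency_multi_exposure(m)
        voxel = efficiency_voxelwise(m)
        print(f"M={m:>5}: 1/M^2 = {multi:.1e} | 1/M = {voxel:.1e} | "
              f"ratio = {voxel / multi:.0f}")
```

The ratio of the two laws is M itself, so the relative advantage of the 1/M scaling grows linearly with the number of multiplexed holograms.
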
This fundamental limitation holds for any optical holographic material where recording is accomplished by means of multiple optical exposures; it arises from the superposition of multiple holograms following a recording sequence that is designed to use the dynamic range of the index modulation equally. Crucially, efficiency could be improved to 1/M if the hologram were constructed voxel-by-voxel or in a multilayered fashion. TPP makes it practical to adopt both options. In addition, the ability to access each point in the volume enables the fabrication of complex 3D-routed waveguides that define the optical signal's path in 3D, reminiscent of the dendrites and axons in the brain.

IV. 3D DISCRETE-WAVEGUIDE INTERCONNECTS

As previously introduced, connections between biological neurons are made by dedicated wires formed by axons connected to dendrites via synapses, and the photonic equivalent of such spatially discrete links is the optical waveguide. An optical waveguide utilizes the principle of total internal reflection, where a medium with a higher refractive index is surrounded by a medium with a lower refractive index. Recently, Moughames et al. 3D printed such optical waveguides using a Nanoscribe 3D printer, and connections in the form of optical splitters realized the dense connectivity between neurons.

Different connection topologies were demonstrated. Arranging 1-to-81 splitters in a 15x15 input waveguide array, cf. Fig. 4(A), demonstrated a 3D printed dense interconnect for 225 neurons in an area of only 300x300 µm². Inspired by convolutional NNs, the same authors realized Boolean Haar filters arranged in a 7x7 array, see Fig. 4(B). Such arrays can filter images containing 21x21 pixels in parallel, which in principle is sufficient for realizing a convolutional layer applied to the MNIST handwritten digit dataset. Most importantly, the area of both 3D interconnects scales linearly with the number of inputs.

FIG. 5. (A) 3D rendering of the OVE with the ideal input and output pairs; (B) SEM image of the printed structure and the corresponding experimental results. (C) XY, YZ and XZ cut planes of a GRIN optimized for Haar filtering. The colorbar shows RI variation. (D) Corresponding output fields obtained by simulating the propagation of inputs through the optimized GRIN volume. All field plots have a window size of 32x32 µm², and the color code shows the normalized amplitude for each.

V. GRADIENT INDEX CONTINUOUS INTERCONNECTS

Multilayered diffractive optical elements, cf. Fig. 5, can also perform interconnection tasks, exploiting the third dimension via optical volume elements (OVEs). OVEs can be designed using a nonlinear optimization scheme, learning tomography (LT), which calculates the topography of either multilayered or GRIN volume elements to approximate desired mappings. Figure 5(A,B) shows a demonstration by Dinc et al., which acts as an angular multiplexer (lantern) that maps plane waves with different incidence angles to linearly polarized multimode fiber modes. It provides an interconnect between single-mode fibers stacked at different angles and a multimode fiber, mapping each single-mode fiber input/output to a specific mode of the multimode fiber, and hence performs mode-division multiplexing. Another example of LT-computed OVEs, realizing Haar filters such as those demonstrated previously, is shown in Fig. 5(C,D).

VI. POSSIBILITIES FOR PHOTONIC NEURONS

The function of a NN neuron is the summation of its inputs followed by a nonlinear transformation. Summation of the individual fields impinging on a neuron can be realized in photonics by the superposition of optical fields. Unfortunately, nonlinearity has for many years been the Achilles' heel of photonics compared to electronics. However, modern photonic devices have significantly lowered the energy consumption, which can now be below 100 fJ per nonlinear transformation. Many standard nonlinear photonic components have potentially high modulation bandwidths and fast response times, and can be interfaced directly with fully parallel as well as dense 3D photonic interconnects. Photonic neurons combined with our 2D/3D canonical NN architecture therefore offer new concepts for addressing the long-standing challenges of parallelism and connection density for high-speed NN computers.

In order to make the most efficient use of the footprint and circuit volume, photonic neurons need to be arranged in a 2D array. Furthermore, neurons that accept multimode fields as their input could potentially be beneficial, as this relaxes design constraints and allows for high-density integration of 3D photonic waveguides without a cladding. Finally, any optical transformation is associated with losses and the 3D photonic interconnect is no exception; neurons including optical amplification would mitigate such losses. At this stage, we can imagine all-optical, electro-optical as well as plasmonic neurons, and the most promising concept will certainly have to strike a balance between speed, efficiency, flexibility and potentially amplification.

VII. OUTLOOK

The viability of integrating photonic circuits suited for NN interconnects in 3D has recently been demonstrated in principle. Ultimately, scalability is key for computing hardware, which implies that stacking 2D neurons and 3D interconnects into deep photonic NNs requires optical losses to be counterbalanced by amplification without resulting in an unsustainable thermal energy deposition inside the integrated photonic circuit. However, the computational power of a NN relies on more than simply establishing specific connections in parallel. The nonlinearity of its neurons is a fundamental requirement for solving complex tasks, and here significant room for improvement exists. Another defining feature of NNs is the optimization of their connections during training. New, ideally in-situ optimization strategies are in urgent demand. In combination with plasticity such as non-volatile memristive effects, these concepts would significantly reduce the complexity of potential auxiliary support circuits as well as of the 3D interconnect itself.

REFERENCES

W. S. McCulloch and W. Pitts, The Bulletin of Mathematical Biophysics (1943), 10.1007/BF02478259, arXiv:1011.1669v3.
Y. LeCun, Y. Bengio, and G. Hinton, Nature, 436 (2015).
J. Moughames, X. Porte, M. Thiel, G. Ulliac, M. Jacquot, L. Larger, M. Kadic, and D. Brunner, Optica, 640 (2020).
N. U. Dinc, J. Lim, E. Kakkava, C. Moser, and D. Psaltis, Nanophotonics, ahead of print, 1 (2020).
A. Žukauskas, I. Matulaitiene, D. Paipulas, G. Niaura, M. Malinauskas, and R. Gadonas, Laser and Photonics Reviews, 706 (2015).
D. Psaltis, D. Brady, X.-G. Gu, and S. Lin, Nature, 325 (1990).
G. Barbastathis and D. Psaltis, in Holographic Data Storage (2000) pp. 21–42.
T. Heuser, M. Pflüger, I. Fischer, J. A. Lott, D. Brunner, and S. Reitzenstein, Journal of Physics: Photonics, accepted.