All-optical neuromorphic binary convolution with a spiking VCSEL neuron for image gradient magnitudes

Abstract

All-optical binary convolution with a photonic spiking vertical-cavity surface-emitting laser (VCSEL) neuron is proposed and demonstrated experimentally for the first time. Optical inputs, extracted from digital images and temporally encoded using rectangular pulses, are injected into the VCSEL neuron, which delivers the convolution result as the number of fast (<100 ps long) spikes fired. Experimental and numerical results show that binary convolution is achieved successfully with a single spiking VCSEL neuron, and that all-optical binary convolution can be used to calculate image gradient magnitudes to detect edge features and separate vertical and horizontal components in source images. We also show that this all-optical spiking binary convolution system is robust to noise and can operate with high-resolution images. Additionally, the proposed system offers important advantages such as ultrafast speed, high energy efficiency and simple hardware implementation, highlighting the potential of spiking photonic VCSEL neurons for high-speed neuromorphic image processing systems and future photonic spiking convolutional neural networks.


Yahui Zhang, Joshua Robertson, Shuiying Xiang, Matěj Hejda, Julián Bueno, and Antonio Hurtado

Institute of Photonics, SUPA Dept. of Physics, University of Strathclyde, TIC Centre, 99 George Street, Glasgow G1 1RD, United Kingdom

State Key Laboratory of Integrated Service Networks, Xidian University, Xi'an 710071, China

* Corresponding author: [email protected]; [email protected]

Received 09 October 2020; revised XX Month, XXXX; accepted XX Month XXXX; posted XX Month XXXX (Doc. ID XXXXX); published XX Month XXXX

Convolutional neural networks (CNNs) have seen tremendous success in many applications, such as speech and image recognition [1,2], computer vision [3] and document analysis [4]. However, CNN-based systems are computationally expensive due to their complicated architectures and the large number of parameters they rely on. CNNs therefore typically require multicore central processing units and graphics processing units to cope with this high computational expense [5,6]. This often makes CNN architectures unsuitable for smaller devices such as phones and smart cameras, where power and speed are strictly limited. To address these drawbacks, both the optimization of CNNs and the discovery of new high-speed, low-power platforms for them are urgently required. For the optimization of CNNs, binary CNNs, which are simple, efficient, and accurate approximations of complete CNNs, can be introduced [7-9]. In binary CNNs, the weights given to the inputs of each convolutional layer are approximated with binary values [7]. As a result, binary CNNs boast 58× faster convolutional operations and 32× lower memory requirements than traditional CNNs [7]. Several optimized binary versions of CNNs have been proposed for training processes and image classification tasks [7,10,11]. However, beyond the optimization of CNNs, a new platform offering high speed and low power consumption remains highly desirable. Photonics is considered a highly promising candidate for future neural network implementations given the unique advantages it provides such as high speed, wide bandwidth and low power consumption [12 –

EXPERIMENTAL SETUP AND THEORETICAL MODEL

We present here the experimental arrangement and the theoretical model of the all-optical binary convolution system based on a photonic spiking VCSEL neuron. In this work we set a source digital image and a kernel as the two inputs of the convolution system. The value of any one pixel in the source image or kernel is limited to 0 or 1.

A. EXPERIMENTAL SETUP

Figure 1 shows the schematic diagram of the fiber-optic experimental setup. Two separate electrical signals are generated with a high-bandwidth arbitrary waveform generator (AWG) representing the source image and the kernel used for the convolution process, respectively. These electrical signals (from Channels 1 and 2 of the AWG) are individually amplified by RF Amplifiers 1 and 2, before they are fed into two 10 GHz Mach-Zehnder intensity modulators (Mod1 and Mod2) to encode the source image and kernel into an external optical signal. The latter is generated by a 1300 nm tunable laser (TL). An optical isolator (OI) is included after the TL to avoid unwanted light reflections that might lead to spurious results. A variable optical attenuator (VOA) is used after the OI to adjust the strength of the light signal from the TL. The polarization of the optical signal from the TL is adjusted using three polarization controllers (PC1, PC2 and PC3), where PC1 and PC2 are specifically used to match the polarization of the optical signal to that which maximizes the performance of the two modulators, encoding respectively the image (Mod1) and the kernel (Mod2) information into the optical path. PC3 is used to adjust the final polarization of the encoded optical signal such that it matches the polarization of the targeted VCSEL mode. A 50:50 optical coupler (OC1) is used to split the light signal into two paths. The first one is connected to a power meter (PM) to monitor the input strength, whilst the second one is directly injected into a commercially-available 1300nm-VCSEL through an optical circulator (CIRC). The output of the VCSEL, acting as a spiking optical neuron, is sent to an 8 GHz real-time oscilloscope (SCOPE) and an optical spectrum analyzer (OSA) for analysis. 
The VCSEL was kept at a constant temperature of 293 K with an applied bias current of 6.5 mA (the lasing threshold current of the VCSEL was $I_{th}$ = [...] $\lambda_y$) and to the subsidiary mode as the orthogonally polarized mode (or X-polarized mode, XP mode, $\lambda_x$). Figure 2(b) shows in turn the optical spectrum of the 1300 nm VCSEL device in the spiking regime as it is subject to optical injection into the orthogonally polarized mode of the device. Upon injection of the external optical signal into the XP mode of the device, the XP mode becomes the dominant mode whilst the YP mode becomes attenuated. The frequency detuning between the external optically injected signal and the XP mode of the VCSEL was equal to -5.64 GHz. The power of the optically injected signal was 127 μW.

B. THEORETICAL MODEL

We use an extension of the well-known spin-flip model (SFM) to model the operation of the VCSEL acting as a spiking optical neuron. In our formulation we add extra terms to the model’s equations to account for the source image and kernel inputs. The rate equations can be described as follows [26, 27]:

Fig. 1.

Experimental setup of the binary convolution system based on a single VCSEL. TL: tunable laser; OI: optical isolator; VOA: variable optical attenuator; PC1, PC2, and PC3: polarization controllers; AWG: arbitrary waveform generator; Mod1, Mod2: Mach-Zehnder modulators; OC1, OC2: optical couplers; CIRC: circulator; Bias & T Controller: bias and temperature controller; PD: photodetector; PM: power meter; SCOPE: oscilloscope; OSA: optical spectrum analyzer.

Fig. 2.

Optical spectra of (a) the free-running VCSEL used in the experiment and (b) the VCSEL subject to constant optical injection. The two polarization modes of the VCSEL are referred to as $\lambda_y$ (parallel) and $\lambda_x$ (orthogonal).

$$\frac{dN}{dt} = -\gamma_N \left[ N\left(1 + |E_x|^2 + |E_y|^2\right) - \mu + i n \left(E_y E_x^* - E_x E_y^*\right) \right] \tag{2}$$

$$\frac{dn}{dt} = -\gamma_s n - \gamma_N \left[ n\left(|E_x|^2 + |E_y|^2\right) + i N \left(E_y E_x^* - E_x E_y^*\right) \right] \tag{3}$$

where the subscripts $x$, $y$ represent the XP and YP modes of the VCSEL, respectively. $E_{x,y}$ is the slowly varying complex amplitude of the field in the XP and YP modes. $N$ is the total carrier inversion between the conduction and valence bands. $n$ is the difference between carrier inversions with opposite spins. $k$ denotes the field decay rate. $\gamma_a$ and $\gamma_p$ are the linear dichroism and the birefringence rate, respectively. $\alpha$ is the linewidth enhancement factor. $\gamma_N$ is the decay rate of $N$. $\gamma_s$ is the spin-flip rate. $\mu$ represents the normalized pump current. $k_{inj}$ is the injection strength, and $E_{injx1}$ and $E_{injx2}$ indicate the source image and kernel inputs, respectively. $\Delta\omega_x$ is defined as $\Delta\omega_x = \omega_{injx} - \omega$, where $\omega_{injx}$ is the angular frequency of the externally injected light in the XP mode, and $\omega = (\omega_x + \omega_y)/2$ is the center frequency between the XP and YP modes, with $\omega_x = \omega + \alpha\gamma_a - \gamma_p$ and $\omega_y = \omega - \alpha\gamma_a + \gamma_p$. The frequency detuning between the externally injected signal and the XP mode is set as $\Delta f_x = f_{injx} - f_x$. Hence, in Eq. (1), $\Delta\omega_x = 2\pi\Delta f_x + \alpha\gamma_a - \gamma_p$. $F_{x,y}$ are the spontaneous emission noise terms, which can be written as:

$$F_x = \sqrt{\frac{\beta_{sp}\gamma_N}{2}} \left( \sqrt{N+n}\,\xi_1 + \sqrt{N-n}\,\xi_2 \right) \tag{4}$$

$$F_y = -i\sqrt{\frac{\beta_{sp}\gamma_N}{2}} \left( \sqrt{N+n}\,\xi_1 - \sqrt{N-n}\,\xi_2 \right) \tag{5}$$

where $\beta_{sp}$ is the strength of the spontaneous emission, and $\xi_1$ and $\xi_2$ are independent complex Gaussian white noise terms of zero mean and unit variance. We numerically solve Eqs. (1) - (4) using the fourth-order Runge-Kutta method. The parameter values configured for the 1300 nm VCSEL are as follows [26]: $k = 185$ ns$^{-1}$, $\gamma_a = 2$ ns$^{-1}$, $\gamma_p = 128$ ns$^{-1}$, $\alpha = 2$, $\gamma_N = 0.5$ ns$^{-1}$, $\gamma_s = 110$ ns$^{-1}$, $\beta_{sp} = 10^{-6}$ and $k_{inj} = 125$ ns$^{-1}$. With these parameters, the YP mode is the main lasing mode and the XP mode is the subsidiary mode, as in Fig. 2(a).

EXPERIMENTAL AND NUMERICAL RESULTS

In this section, we first provide an experimental proof-of-concept demonstration of all-optical binary convolution with a spiking VCSEL neuron. We then calculate the image gradient magnitudes from a basic “Square” source image and a complex “Horse head” source image by means of all-optical binary convolution. Simulation results on the binary convolution and the calculation of image gradient magnitudes are also presented using a “Horse” source image from the latest version of the Berkeley Segmentation Data Set [32]. Finally, the robustness of our binary convolution system is also tested numerically by adding noise to the source image and kernel inputs.

A. EXPERIMENTAL RESULTS

Figure 3 shows an example of a binary 2D convolution calculation, where a submatrix (9 pixels) from a source image and a kernel are multiplied element-wise and the resulting values are summed. In our experiment, we temporally encoded each pixel of the source image and the kernel inputs using rectangular pulses. Pixels of value “1” were optically encoded using intensity-modulated power drops in the TL’s light (via the MZ modulators, Mod1 and Mod2), whereas pixels of value “0” produced no intensity modulation in the TL’s light. The duration of each rectangular pulse encoding a pixel was set to 1.5 ns to match the refractory period of the experimentally-measured spiking dynamics from the VCSEL neuron [16]. The experimental optical realization of the binary convolution example provided in Fig. 3 is depicted graphically in Fig. 4. Figures 4(a) and 4(b) plot respectively the temporally-encoded 9-pixel (3 × 3) image submatrix and kernel inputs generated for the example given in Fig. 3. Given that the optically-encoded source image and kernel inputs were injected into the VCSEL synchronously, we delayed the kernel input such that its modulation (in Mod2) occurred on top of the corresponding modulated image input (from Mod1). We introduced a delay time in the kernel input (directly using the AWG) equal to the time required for a light pulse to travel from Mod1 to Mod2. Figure 4(c) shows the optical signal measured after Mod2 in the setup, combining in a single input line the temporal image and kernel information given in Figs. 4(a) and 4(b). This signal, which was injected into the VCSEL neuron to perform the binary convolution, had three different levels (low, medium and high) depending on the specific pixel values in the

$$\frac{dE_{x,y}}{dt} = -(k \pm \gamma_a)E_{x,y} - i(k\alpha \pm \gamma_p)E_{x,y} + k(1+i\alpha)\left(N E_{x,y} \pm i n E_{y,x}\right) + k_{inj}\left[E_{injx1}(t) + E_{injx2}(t)\right]e^{i\Delta\omega_x t} + F_{x,y} \tag{1}$$

Fig. 3.

Example of a single step during a 2D binary convolution operation. During this step, a Hadamard (element-wise) product is calculated for a submatrix of the image and the kernel, and all the values in the multiplication result are summed up to obtain a single value.
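The single step described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `binary_convolution_step` and the two 3 × 3 matrices are hypothetical (they are not the values shown in Fig. 3), and the code computes the result digitally rather than modelling the optical system.

```python
# Illustrative values only: these matrices are hypothetical and are not
# the ones shown in Fig. 3.

def binary_convolution_step(submatrix, kernel):
    """Sum of the element-wise (Hadamard) product of two 0/1 matrices."""
    if len(submatrix) != len(kernel):
        raise ValueError("submatrix and kernel must have the same shape")
    total = 0
    for image_row, kernel_row in zip(submatrix, kernel):
        if len(image_row) != len(kernel_row):
            raise ValueError("submatrix and kernel must have the same shape")
        for pixel, weight in zip(image_row, kernel_row):
            if pixel not in (0, 1) or weight not in (0, 1):
                raise ValueError("all entries must be 0 or 1")
            # For binary values the product is 1 only when both entries are 1.
            total += pixel * weight
    return total


submatrix = [[1, 0, 1],
             [0, 1, 1],
             [1, 1, 0]]
kernel = [[1, 1, 0],
          [0, 1, 0],
          [1, 1, 1]]

result = binary_convolution_step(submatrix, kernel)
print(result)  # 4 coincident "1" entries for these hypothetical inputs
```

Because both inputs are binary, the result is simply the number of positions where the image pixel and the kernel pixel are both “1”.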

Fig. 4.

Experimental convolution operation. (a) Inputs of Channel 1 (image in Fig. 3). (b) Inputs of Channel 2 (kernel in Fig. 3). (c) Inputs of the VCSEL. (d) Outputs of the VCSEL (the results of the convolution).

image and kernel at a given instance. We control the conditions of the injected signal (in Fig. 4(c)) in such a way that the medium and high input levels injection-lock the VCSEL to the external signal, delivering a constant stable temporal output. The lowest input level brings the VCSEL out of injection locking and into a dynamical region where the device produces fast spiking responses [24]. Figure 4(d) shows the experimentally measured time series at the VCSEL neuron’s output, yielding stable or spiking outputs depending on the input intensity levels (from Fig. 4(c)). Importantly, Fig. 4(d) shows that the number of spikes fired by the VCSEL neuron directly provides the result of the binary convolution. It can be seen in Fig. 4(d) that four fast (<100 ps long) spikes are fired by the VCSEL neuron, the same result as that of the binary convolution example in Fig. 3. Figure 5 shows a temporal map [17] merging in a single plot 100 superimposed consecutive convolutional outputs from the photonic spiking VCSEL neuron. The image and kernel inputs and the experimental conditions are the same as those shown in Fig. 4. Spike events are depicted in yellow in the colour map of Fig. 5 and steady-state responses appear in light blue. Figure 5 clearly shows that the binary convolution result for 100 consecutive inputs remains the same, producing in all 100 cases 4 separate spiking responses at the VCSEL’s output. The optical binary convolution results obtained with the spiking VCSEL neuron are therefore consistent and reproducible. This proof-of-concept result obtained with a spiking VCSEL highlights a new, controllable way to perform convolution operations for information and image processing tasks.

B. CALCULATION OF IMAGE GRADIENT MAGNITUDES

In this section, the image gradient magnitude, which is critical to image edge detection, is calculated using our approach based on a single spiking VCSEL neuron and optical binary convolution. The image gradient magnitude $G(x)$ of a given pixel $x$ is calculated using the following equations [33]:

$$G(x) = |G_X(x)| + |G_Y(x)| \tag{6}$$

$$G_X(x) = \left(B(x) \otimes B_{X+}\right) - \left(B(x) \otimes B_{X-}\right) \tag{7}$$

$$G_Y(x) = \left(B(x) \otimes B_{Y+}\right) - \left(B(x) \otimes B_{Y-}\right) \tag{8}$$

Four binary convolutions, i.e. $B(x) \otimes B_{X\pm}$ and $B(x) \otimes B_{Y\pm}$, are used in $G_X(x)$ and $G_Y(x)$. $B(x) = \sum_{p=0}^{N-1} s(i_p, i_x) \cdot 2^p$ is the N-bit local binary pattern descriptor of a pixel $x$, where $i_x$ is the central pixel intensity and $i_p$ is the intensity of the $p$-th neighbour of $x$ in the source pattern. The comparison operator is defined as:

$$s(i_p, i_x) = \begin{cases} 1 & \text{if } |i_p - i_x| > T_x \\ 0 & \text{otherwise} \end{cases} \tag{9}$$

where $T_x = i_x + 20$ and $N = 5 \times 5 - 1$. The range of the local binary pattern descriptor of a pixel is presented in gray color in Fig. 6(a). In Fig. 6(b), a “Square” source image is made up of a solid black 10 × 10 pixel square on a 24 × 24 pixel white background. In the grayscale image, the intensities of white and black pixels are 255 and 0, respectively. For example, the intensity of the red-highlighted pixel $x$ in Fig. 6(b) is $i_x = 255$. We arrange and serialize the pixels in the range of the local binary pattern descriptor by columns. The intensity of the 1st neighbour pixel is 0, hence, according to Eq. (9), its comparison value is $s = 1$; the intensity of the 3rd neighbour is 255, hence its comparison value is $s = 0$. $B(x)$ can therefore be calculated for the red-highlighted pixel in Fig. 6(b) as follows:

$$B(x) = \begin{bmatrix} s(i_p, i_x) & \cdots & s(i_p, i_x) \\ s(i_p, i_x) & \cdots & s(i_p, i_x) \\ s(i_p, i_x) \;\cdots & x & \cdots\; s(i_p, i_x) \\ s(i_p, i_x) & \cdots & s(i_p, i_x) \\ s(i_p, i_x) & \cdots & s(i_p, i_x) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & x & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \tag{10}$$

For the red-highlighted pixel $x$ in Fig. 6(b), “1” in $B(x)$ corresponds to a white pixel and “0” corresponds to a black pixel in the source image.
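The construction of the descriptor can be sketched as follows. This is a minimal illustration under several assumptions that go beyond the text: the comparison is taken to return 1 when the neighbour differs from the centre by more than $T_x = i_x + 20$ (the direction of the inequality is assumed), the bit index $p$ is taken to be the position in the column-serialized string, and the 5 × 5 patch is hypothetical, with intensities chosen only so that the resulting pattern has the same layout as Eq. (10). The function names are likewise hypothetical.

```python
def lbp_bits(patch):
    """Comparison map s(i_p, i_x) for a square patch; the centre is None."""
    size = len(patch)
    centre = size // 2
    i_x = patch[centre][centre]
    t_x = i_x + 20  # threshold T_x as defined in the text
    bits = []
    for r in range(size):
        row = []
        for c in range(size):
            if r == centre and c == centre:
                row.append(None)  # the centre pixel is not its own neighbour
            else:
                # ASSUMPTION: a bit is set when the neighbour differs from the
                # centre by more than the threshold.
                row.append(1 if abs(patch[r][c] - i_x) > t_x else 0)
        bits.append(row)
    return bits


def serialize_by_columns(bits):
    """Read the map column by column, skipping the centre entry."""
    size = len(bits)
    return [bits[r][c] for c in range(size) for r in range(size)
            if bits[r][c] is not None]


def descriptor_value(string):
    """B(x) = sum_p s_p * 2^p, with p taken as the position in the string."""
    return sum(bit << p for p, bit in enumerate(string))


# Hypothetical 5 x 5 grayscale patch: two bright rows above a dark region.
patch = [[255] * 5, [255] * 5, [0] * 5, [0] * 5, [0] * 5]

bits = lbp_bits(patch)
string = serialize_by_columns(bits)
print(string[:10])
```

For this patch the first ten serialized bits follow the pattern 1, 1, 0, 0, 0 repeated once, which is the layout of the first two columns of Eq. (10).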

Fig. 5.

Temporal map of 100 superimposed consecutive convolutional results measured experimentally at the output of the spiking VCSEL neuron.

Fig. 6.

(a) Gray color: range of the local binary pattern descriptor of pixel $x$. (b) A 24 × 24 pixel source “Square” image. The red highlight indicates a given pixel in the image. (c) The four convolution kernels ($B_{X+}$, $B_{X-}$, $B_{Y+}$ and $B_{Y-}$) of the binary pattern. Bits which fall outside the highlighted areas for a given string are set to zero.

In Eqs. (7) and (8), $B_{X+}$, $B_{X-}$, $B_{Y+}$ and $B_{Y-}$ are the four kernels that are adopted as in Ref. [33]. Figure 6(c) shows the areas of the four different kernels. Pixels which fall outside the highlighted areas in Fig. 6(c) for a given string are set to zero. For example:

$$B_{X+} = [\,\cdots\,] \tag{11}$$

We arrange and serialize the pixels of $B(x)$ and the four kernels by columns. For example, the string of $B(x)$ is [1, 1, 0, 0, 0, 1, 1, 0, 0, 0 …] and the string of $B_{X+}$ is [1, 0, 1, 0, 1, 0, 1, 1, 1, 0 …]. We studied experimentally the response of the VCSEL neuron under the injection of the “Square” source image and the kernel operators included in Figs. 6(b) and (c). Specifically, Fig. 7 showcases the experimentally-recorded results at the VCSEL output for each kernel when operating on the red-highlighted pixel in Fig. 6(b). It can be seen in Fig. 7(a) that fast (sub-ns) spikes are only triggered by the 1st and 7th pixels. Therefore, the convolutional result for $B(x) \otimes B_{X+}$ is 2, as expected. Similarly, from Figs. 7(b) - (d) we can see that 2, 6 and 0 sub-ns spikes are elicited at the VCSEL’s output for kernels $B_{X-}$, $B_{Y+}$ and $B_{Y-}$, respectively. Using the experimental results measured from the spiking VCSEL neuron, we calculate off-line $G_X(x)$, $G_Y(x)$ and $G(x)$ to determine the image gradient magnitude. Based on the experimentally-measured results in Figs. 7(a) - (d), $G_X(x)$, $G_Y(x)$ and $G(x)$ are 0, 6 and 6, respectively, using Eqs. (6) - (8). The experimental process in Fig. 7 is repeated consecutively for every single pixel in the “Square” source image (Fig. 6(b)) to calculate their image gradient magnitudes. The latter are used to build the reconstructed image in Fig. 8(a), providing a gradient map for the “Square” source image. Figure 8(a) clearly reveals a “hollow” square shape in the experimentally produced gradient map, hence detecting all edge features of the source image. In Fig. 8(a), the pixels with a gradient magnitude $G(x) > 3$ can be selected to thin the response and reveal the true edges of the “Square” [33, 34]. Additionally, Figs. 8(b) and 8(c) plot separately the reconstructed images using the values obtained for $G_X(x)$ and $G_Y(x)$ from the experimentally measured time series at the VCSEL neuron’s output.
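The arithmetic of this worked example can be checked with a short sketch. The ten-element prefixes below are the ones quoted in the text for $B(x)$ and $B_{X+}$, and the four spike counts (2, 2, 6, 0) are the measured values reported for Fig. 7. The function names are hypothetical, and combining the two components as $|G_X| + |G_Y|$ is an assumption about how the magnitude is formed.

```python
def spike_positions(descriptor_bits, kernel_bits):
    """1-based pixel positions where descriptor and kernel are both 1."""
    return [index + 1
            for index, (b, k) in enumerate(zip(descriptor_bits, kernel_bits))
            if b == 1 and k == 1]


def gradient_from_counts(x_plus, x_minus, y_plus, y_minus):
    """Gradient components and magnitude from the four spike counts."""
    g_x = x_plus - x_minus
    g_y = y_plus - y_minus
    # ASSUMPTION: the magnitude is taken as |G_X| + |G_Y|.
    return g_x, g_y, abs(g_x) + abs(g_y)


# First ten serialized entries quoted in the text.
b_prefix = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]
b_x_plus_prefix = [1, 0, 1, 0, 1, 0, 1, 1, 1, 0]

positions = spike_positions(b_prefix, b_x_plus_prefix)
g_x, g_y, g = gradient_from_counts(2, 2, 6, 0)
is_edge = g > 3  # thresholding used to thin the edge response

print(positions, g_x, g_y, g, is_edge)
```

Within the quoted prefixes, the only coincidences fall on the 1st and 7th pixels, in line with the two spikes reported for $B(x) \otimes B_{X+}$.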
Figures 8(b) and 8(c) reveal that both vertical and horizontal lines can be individually detected from the source image in Fig. 6(b) using respectively the magnitudes $G_X(x)$ and $G_Y(x)$. To further investigate our experimental system, we demonstrated the calculation of gradient maps from a complex source image using the all-optical binary convolution of this work, as seen in Fig. 9. For this purpose, we selected a complex “Horse head” image (Fig. 9(b)) as the source image for our VCSEL-based binary convolution system. This is a 100 × 105 pixel portion of the “Horse” image from the Berkeley Segmentation Data Set [32] (also included in Fig. 9(a)). The colour image was converted to grayscale before we applied the same experimental methods used previously to obtain the results included in Fig. 8 above. The values of $G(x)$, $G_X(x)$ and $G_Y(x)$ experimentally achieved for the complex “Horse head” image (Fig. 9(b)) are shown in Figs. 9(c) - (e), respectively. These gradient maps reveal the successful detection of the edge features in this complex image, hence permitting

Fig. 7.

Convolutional results obtained with the four kernels (highlighted areas in Fig. 6(c)) for the red-highlighted pixel in Fig. 6(b).

Fig. 8.

Gradient maps of the “Square” source image. Visualizations of the (a) $G$, (b) $G_X$ and (c) $G_Y$ maps of the “Square” source image based on the optical binary convolution performed by the VCSEL neuron.

Fig. 9.

The “Horse head” image and the gradient maps of the “Horse head” image. (a) Source “Horse” image. The blue box indicates the “Horse head” image used for analysis (b). Visualizations of the (c) $G$, (d) $G_X$ and (e) $G_Y$ maps of the “Horse head” image obtained from the optical binary convolution performed with the VCSEL neuron.

the successful recreation of the outline and shape of the horse head. This effectively demonstrates that the reported all-optical binary convolution technique with a VCSEL neuron is also suitable for complex high-resolution source images.

C. NUMERICAL RESULTS

In this section, binary convolution based on a single VCSEL neuron is performed numerically. The robustness of the system to perform all-optical binary convolution under noisy inputs and for larger kernels is investigated. Finally, the calculation of image gradient magnitudes with our photonic approach using a single VCSEL neuron is presented numerically using the ‘Horse’ image from the latest version of the Berkeley Segmentation Data Set [32]. The binary convolution example given in Fig. 3 and experimentally performed with the VCSEL neuron (see Fig. 4) is numerically simulated using the SFM model in Figs. 10(a1) - (c1).
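As a rough guide to how such a simulation can be set up, the sketch below codes the right-hand side of the rate equations and a fourth-order Runge-Kutta step in plain Python. It is a simplified illustration, not the authors' code: the noise terms $F_{x,y}$ are omitted, the injection term is applied to the XP mode only, the value of the normalized pump current `MU` is an assumption (it is not quoted in the text), and the 1 ps time step is a hypothetical choice. Rates are in ns$^{-1}$ and time in ns.

```python
import cmath
import math

# Parameter values quoted in the text (rates in ns^-1).
K, GAMMA_A, GAMMA_P, ALPHA = 185.0, 2.0, 128.0, 2.0
GAMMA_N, GAMMA_S, K_INJ = 0.5, 110.0, 125.0
MU = 2.0  # ASSUMPTION: normalized pump current, not given in the text


def delta_omega_x(detuning_ghz):
    """Angular detuning 2*pi*df_x + alpha*gamma_a - gamma_p (df_x in GHz)."""
    return 2 * math.pi * detuning_ghz + ALPHA * GAMMA_A - GAMMA_P


def sfm_rhs(t, state, e_inj, d_omega):
    """Noise-free spin-flip model with injection into the XP mode only."""
    ex, ey, n_tot, n_diff = state
    inj = K_INJ * e_inj(t) * cmath.exp(1j * d_omega * t)
    dex = (-(K + GAMMA_A) * ex - 1j * (K * ALPHA + GAMMA_P) * ex
           + K * (1 + 1j * ALPHA) * (n_tot * ex + 1j * n_diff * ey) + inj)
    dey = (-(K - GAMMA_A) * ey - 1j * (K * ALPHA - GAMMA_P) * ey
           + K * (1 + 1j * ALPHA) * (n_tot * ey - 1j * n_diff * ex))
    intensity = abs(ex) ** 2 + abs(ey) ** 2
    cross = ey * ex.conjugate() - ex * ey.conjugate()  # purely imaginary
    dn_tot = -GAMMA_N * (n_tot * (1 + intensity) - MU + 1j * n_diff * cross)
    dn_diff = (-GAMMA_S * n_diff
               - GAMMA_N * (n_diff * intensity + 1j * n_tot * cross))
    # Both carrier derivatives are real; .real drops the zero imaginary part.
    return (dex, dey, dn_tot.real, dn_diff.real)


def rk4_step(rhs, t, state, dt, *args):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    def shifted(scale, slope):
        return tuple(s + scale * d for s, d in zip(state, slope))
    k1 = rhs(t, state, *args)
    k2 = rhs(t + dt / 2, shifted(dt / 2, k1), *args)
    k3 = rhs(t + dt / 2, shifted(dt / 2, k2), *args)
    k4 = rhs(t + dt, shifted(dt, k3), *args)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, t_end, dt, e_inj, d_omega):
    """Advance the state from t = 0 to t_end with a fixed step dt."""
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(sfm_rhs, t, state, dt, e_inj, d_omega)
        t += dt
    return state


def no_injection(t):
    return 0.0


# Sanity check with zero optical field: N relaxes as MU * (1 - exp(-GAMMA_N t)).
final_state = integrate((0j, 0j, 0.0, 0.0), 1.0, 1e-3, no_injection, 0.0)
print(final_state[2])
```

To drive the model with encoded inputs, `e_inj` would be replaced by a function returning the combined image and kernel injection amplitude at time `t`, and `delta_omega_x` evaluated at the chosen frequency detuning.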

Pixels of value “1” are numerically implemented using power drop pulses with strength $K_p = 0.852$ ($K_p$ = pulse power / constant power) and a duration of 1.5 ns (as in the experimental demonstration). The frequency detuning between the externally injected signal and the XP mode in the VCSEL model is set to -3.66 GHz. Figures 10(a1) - (c1) plot the numerically obtained results for the all-optical binary convolution with a VCSEL neuron. Specifically, Figs. 10(a1) and (b1) plot respectively the time series for the temporally encoded image (Fig. 10(a1)) and kernel (Fig. 10(b1)) inputs, whilst Fig. 10(c1) plots the numerically calculated output from the VCSEL neuron. The latter clearly shows that the simulation successfully reproduces the outcome of the experimental all-optical binary convolution (see Fig. 4(d)), where 4 spikes are elicited by the VCSEL. This excellent agreement between the modelled results and the experimental findings gives us confidence to test the robustness of the photonic binary convolution system under the injection of inputs with added noise. To study this aspect, we model the response of the VCSEL binary convolutional system under the injection of noisy inputs with a configured SNR = 20 dB (see Figs. 10(a2) and 10(b2)). Specifically, Fig. 10(c2) shows that the exact same response is obtained from the VCSEL neuron as in the case with no added noise in Fig. 10(c1). This demonstrates the robustness to noise of the proposed all-optical VCSEL convolutional system. Additionally, the convolution with a larger 5 × 5 pixel kernel is tested numerically in Figs. 10(a3) - 10(c3) using Eq. (10) and Eq. (11) as inputs. Figure 10(c3) shows that the modelled convolutional result obtained from the VCSEL neuron also produces two fast spike events, hence yielding the exact same outcome as obtained experimentally in Fig. 7(a).
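The input encoding and the added noise described above can be sketched as follows. Several choices here are assumptions rather than details from the text: a pixel of value “1” is taken to lower the power to $K_p$ times the constant level, the noise is taken to be additive Gaussian with the SNR defined against the mean signal power, and the pixel string, sampling density and random seed are hypothetical.

```python
import math
import random


def encode_pixels(pixels, k_p=0.852, samples_per_pixel=100):
    """Rectangular encoding: 1 -> power drop to k_p, 0 -> constant level 1.0."""
    levels = []
    for pixel in pixels:
        level = k_p if pixel == 1 else 1.0
        levels.extend([level] * samples_per_pixel)
    return levels


def add_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise for a target SNR (in dB)."""
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR actually realized by a noisy trace."""
    signal_power = sum(v * v for v in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)                 # fixed seed for repeatability
pixels = [1, 0, 1, 1, 0, 0, 1, 0, 1]   # hypothetical 9-pixel input string
clean = encode_pixels(pixels)          # each pixel slot stands for 1.5 ns
noisy = add_noise(clean, 20.0, rng)

print(round(measured_snr_db(clean, noisy), 1))
```

The same routine can be applied separately to the image and kernel traces before they are passed to a model of the VCSEL neuron.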
We can therefore deduce that the convolution results that can be obtained with our VCSEL-neuron-based approach are not limited by the dimension of the kernel operators or the resolution of the image. Figure 11 shows the numerically calculated gradient maps obtained with a spiking VCSEL neuron for the “Horse” source image [32] with a resolution of 481 × 321 pixels (Fig. 11(a) and Fig. 9(a)). Figures 11(b) - (d) show the calculated gradient maps for $G(x)$, $G_X(x)$ and $G_Y(x)$, respectively. These were obtained using the kernel introduced in the experimental study of the “Square” source image (see Figs. 6-8). It can be seen that the numerical simulation successfully reveals the image edge information through the gradient magnitude $G(x)$, as seen in Fig. 11(b), as well as the individual horizontal and vertical edge features of the source image through $G_X(x)$ and $G_Y(x)$, as seen in Figs. 11(c) and 11(d), respectively. These results, showing good overall agreement with the experimental findings of Fig. 9, therefore numerically validate that the gradient magnitude can be successfully calculated with a photonic spiking VCSEL neuron, irrespective of the image dimensionality.

CONCLUSION

In this work, we proposed and investigated experimentally and numerically an all-optical binary convolution system using a VCSEL operating as a photonic spiking neuron. The inputs (image and kernel) are encoded temporally using fast rectangular pulses (1.5 ns long) and optically injected into the VCSEL neuron. The latter’s optical output directly provides the results of the convolution in the number of (sub-ns long) spikes fired. In addition to performing all-optical binary convolution, we demonstrated experimentally and numerically the ability of the proposed system to calculate the image gradient magnitudes from digital source images. This feature was successfully used to identify key edge features from a source image as well as its separate horizontal and vertical components. Furthermore, we investigated numerically the robustness of the proposed VCSEL-based convolutional system to input noise. This simple system, using a single commercially-available VCSEL operating at the key telecom wavelength of 1300 nm, offers a novel photonic solution to binary convolution with the advantage of being highly energy efficient and hardware friendly.

This opens exciting prospects for a new photonic spiking platform for future optical binary spiking CNNs. Furthermore, the high-speed, low cost and neuronal functionalities of these photonic spiking systems hold promise for numerous processing tasks expanding into fields such as computer vision and artificial intelligence.

Fig. 10.

(a1) - (a3) Inputs of channel 1 (image in Fig. 3). (b1) - (b3) Inputs of channel 2 (kernel in Fig. 3). (c1) - (c3) VCSEL neuron’s output. (a1) - (c1) Convolutional operation in the VCSEL neuron without noise. (a2) - (c2) Convolutional operation in the VCSEL neuron with added input noise of SNR = 20 dB. (a3) - (c3) Convolution operation with a 5 × 5 pixel kernel.

Fig. 11.

“Horse” image and gradient maps of the “Horse” image. (a) “Horse” image. Visualizations of the (b) $G$, (c) $G_X$ and (d) $G_Y$ maps of the “Horse” image based on the numerical optical binary convolution in the VCSEL.

Funding Information.

Office of Naval Research Global (ONRGNICOP-N62909-18-1-2027); the European Commission (828841-ChipAI-H2020-FETOPEN2018-); EPSRC Doctoral Training Partnership (EP/N509760); National Natural Science Foundation of China (61674119, 61974177); China Scholarship Council.

Acknowledgments.

We thank Prof. T. Ackemann and Prof. A. Kemp (University of Strathclyde) for lending some of the equipment used in this work.

Disclosures.

The authors declare no conflicts of interest.

References

1. O. Abdel-Hamid, A. R. Mohamed, H. Jiang and G. Penn, “Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition,” in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Kyoto, 2012), pp. 4277-4280.
2. J. Fu, H. Zheng and T. Mei, “Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Honolulu, 2017), pp. 4438-4446.
3. K. Gopalakrishnan, S. K. Khaitan, A. Choudhary and A. Agrawal, “Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection,” Constr. Build. Mater., 322-330 (2017).
4. P. Y. Simard, D. Steinkraus and J. C. Platt, “Best practices for convolutional neural networks applied to visual document analysis,” in ICDAR (Aug. 2003), vol. 3, no. 2003.
5. C. Farabet, C. Couprie, L. Najman, and Y. LeCun, “Learning hierarchical features for scene labeling,” IEEE Trans. Pattern Anal. Mach. Intell., 13582979 (2013).
6. L. Cavigelli, M. Magno, and L. Benini, “Accelerating real-time embedded scene labeling with convolutional networks,” in Proceedings of the 52nd Annual Design Automation Conference (New York, 2015), pp. 108:1 –
7. M. Rastegari, V. Ordonez, J. Redmon and A. Farhadi, “XNOR-Net: ImageNet classification using binary convolutional neural networks,” in European Conference on Computer Vision (ECCV) (Springer, Cham, 2016), pp. 525-542.
8. F. Juefei-Xu, V. Naresh Boddeti and M. Savvides, “Local binary convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Honolulu, 2017), pp. 19-28.
9. X. Lin, C. Zhao and W. Pan, “Towards accurate binary convolutional neural network,” in Advances in Neural Information Processing Systems (NIPS) (2017), pp. 345-353.
10. M. Courbariaux, Y. Bengio, and J.-P. David, “BinaryConnect: training deep neural networks with binary weights during propagations,” in Advances in Neural Information Processing Systems (NIPS) (2015), pp. 3105 –
11. M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv and Y. Bengio, “Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or - (2016).
12. M. Turconi, B. Garbin, M. Feyereisen, M. Giudici and S. Barland, “Control of excitable pulses in an injection-locked semiconductor laser,” Phys. Rev. E, 022923 (2013).
13. P. R. Prucnal, B. J. Shastri, T. F. de Lima, M. A. Nahmias and A. N. Tait, “Recent progress in semiconductor excitable lasers for photonic spike processing,” Adv. Opt. Photonics, 228-299 (2016).
14. S. Xiang, Y. Zhang, J. Gong, X. Guo, L. Lin and Y. Hao, “STDP-based unsupervised spike pattern learning in a photonic spiking neural network with VCSELs and VCSOAs,” IEEE J. Sel. Top. Quantum Electron., 18686194 (2019).
15. Y. Zhang, S. Xiang, X. Guo, A. Wen and Y. Hao, “All-optical inhibitory dynamics in photonic neuron based on polarization mode competition in a VCSEL with an embedded saturable absorber,” Opt. Lett., 1548-1551 (2019).
16. J. Feldmann, N. Youngblood, C. D. Wright, H. Bhaskaran and W. H. P. Pernice, “All-optical spiking neurosynaptic networks with self-learning capabilities,” Nature, 208-214 (2019).
17. J. Robertson, E. Wade, Y. Kopp, J. Bueno and A. Hurtado, “Toward neuromorphic photonic networks of ultrafast spiking laser neurons,” IEEE J. Sel. Top. Quantum Electron., 18900045 (2019).
18. A. Mehrabian, Y. Al-Kabani, V. J. Sorger and T. El-Ghazawi, “PCNNA: a photonic convolutional neural network accelerator,” in 2018 IEEE International System-on-Chip Conference (SOCC) (Arlington, September 2018), pp. 169-173.
19. H. Bagherian, S. Skirlo, Y. Shen, H. Meng, V. Ceperic and M. Soljacic, “On-chip optical convolutional neural networks,” arXiv preprint arXiv:1808.03303 (2018).
20. S. Xu, J. Wang, R. Wang, J. Chen and W. Zou, “High-accuracy optical convolution unit architecture for convolutional neural networks by cascaded acousto-optical modulator arrays,” Opt. Express, 19778-19787 (2019).
21. S. Xu, J. Wang and W. Zou, “High-energy-efficiency integrated photonic convolutional neural networks,” arXiv preprint arXiv:1910.12635 (2019).
22. K. Iga and H. E. Li, “Vertical-cavity surface-emitting laser devices,” Berlin: Springer (2003).
23. R. Michalzik, “VCSELs: Fundamentals, Technology and Applications of Vertical-Cavity Surface-Emitting Lasers,” (Springer Series in Optical Sciences), vol. 166. Berlin, Germany: Springer-Verlag, 2013.
24. A. Hurtado and J. Javaloyes, “Controllable spiking patterns in long-wavelength vertical cavity surface emitting lasers for neuromorphic photonics systems,” Appl. Phys. Lett., 241103 (2015).
25. B. Garbin, J. Javaloyes, G. Tissoni and S. Barland, “Topological solitons as addressable phase bits in a driven laser,” Nat. Commun., 5915 (2015).
26. S. Y. Xiang, A. J. Wen and W. Pan, “Emulation of spiking response and spiking frequency property in VCSEL-based photonic neuron,” IEEE Photonics J., 1504109 (2016).
27. T. Deng, J. Robertson, and A. Hurtado, “Controlled propagation of spiking dynamics in vertical-cavity surface-emitting lasers: towards neuromorphic photonic networks,” IEEE J. Sel. Top. Quantum Electron., 1800408 (2017).
28. J. Robertson, T. Deng, J. Javaloyes and A. Hurtado, “Controlled inhibition of spiking dynamics in VCSELs for neuromorphic photonics: theory and experiments,” Opt. Lett., 1560-1563 (2017).
29. A. Dolcemascolo, B. Garbin, B. Peyce, R. Veltz and S. Barland, “Resonator neuron and triggering multipulse excitability in laser with injected signal,” Phys. Rev. E, 062211 (2018).
30. J. Robertson, M. Hejda, J. Bueno and A. Hurtado, “Ultrafast optical integration and pattern classification for neuromorphic photonics based on spiking VCSEL neurons,” Sci. Rep., 1-8 (2020).
31. M. Hejda, J. Robertson, J. Bueno and A. Hurtado, “Spike-based information encoding in VCSELs for neuromorphic photonic systems,” J. Phys. Photonics, 044001 (2020).
32. P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour detection and hierarchical image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., 898-916 (2011).
33. P. L. St-Charles, G. A. Bilodeau and R. Bergevin, “Fast image gradients using binary feature convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2016), pp. 1-9.
34. J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Mach. Intell., 679-698 (1986).