Neuromorphic quantum computing
Christian Pehle∗
Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, D-69120 Heidelberg
Christof Wetterich†
Institute for Theoretical Physics, Heidelberg University, Philosophenweg 16, D-69120 Heidelberg
We propose that neuromorphic computing can perform quantum operations. Spiking neurons in the active or silent states are connected to the two states of Ising spins. A quantum density matrix is constructed from the expectation values and correlations of the Ising spins. As a step towards quantum computation we show for a two-qubit system that quantum gates can be learned as a change of parameters for neural network dynamics.
It has been suggested that artificial neural networks or neuromorphic computers can perform the operations necessary for quantum computing [1, 2]. It has already been argued that several important tasks within quantum computation can be performed by neural networks [3–5]. Beyond this, the proposal of [1, 2] suggests that a full quantum computation may be possible by use of stochastic information in a system of neurons. The present work proposes for the first time a concrete implementation of quantum operations by spiking neurons. We demonstrate three important steps in that direction:

1. We show that the correlation map [1] from expectation values and correlations of classical Ising spins to a quantum density matrix is complete in the case of two qubits. This means that for every possible quantum density matrix ρ there exists a probability distribution p for the configurations of classical Ising spins, such that ρ is realised by the correlation map. This completeness was not proven before.

2. We establish how unitary transformations of ρ corresponding to arbitrary quantum gates can be performed by a change of probability distributions p.

3. We describe spiking neurons by a system of differential equations with parameters W. Expectation values and correlations of appropriate Ising spins can be extracted from the dynamics of the interacting neurons. They depend on the parameters W. We demonstrate that for every given two-qubit density matrix ρ the system of neurons can adapt or learn suitable parameters W such that the correlation map of expectation values and correlations of Ising spins expresses the density matrix.

Our approach centers on the use of expectation values and particular correlations of Ising spins for the construction of quantum density matrices. Correlated Ising systems are efficient models for many biological systems [6]. We concentrate here on two qubits and discuss the scaling to a higher number of qubits at the end.

The correlation map for a two-qubit quantum system involves six classical Ising spins [1] that are grouped in two pairs of three Ising spins s^{(i)}_k. Here i = 1, 2 labels the pair and k = 1, 2, 3 its members. The correlation map collects the 4 × 4 matrix χ of expectation values and correlations as

  χ_{00} = 1 ,  χ_{k0} = ⟨s^{(1)}_k⟩ ,  χ_{0l} = ⟨s^{(2)}_l⟩ ,  χ_{kl} = ⟨s^{(1)}_k s^{(2)}_l⟩ .  (1)

If we denote by τ_k, k = 1, 2, 3, the Pauli matrices and by τ_0 the identity matrix, we can define the U(4)-generators

  L_{µν} = τ_µ ⊗ τ_ν ,  µ = 0 . . . 3 ,  ν = 0 . . . 3 .  (2)

The bit quantum map organises the expectation values and correlations (1) into a density matrix

  ρ = (1/4) χ_{µν} L_{µν} .  (3)

(Summation over double indices is implied.) We denote this map by

  f : ℝ^{4×4} → ℂ^{4×4} ,  χ ↦ ρ = (1/4) χ_{µν} L_{µν} .  (4)

The density matrix ρ is a complex hermitian 4 × 4 matrix with tr(ρ) = 1. A quantum density matrix has to be positive: all eigenvalues λ of ρ have to obey λ ≥ 0. This imposes restrictions on the probability distributions p or the expectation values (1) that can realize a quantum density matrix. These restrictions are called the "quantum constraints".
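As an illustration, the map f of equations (2)-(4) can be written in a few lines of Python/NumPy. This is a minimal sketch for illustration only, not the code used for the results reported here; all names are ours.

```python
import numpy as np

# Pauli matrices tau_0..tau_3, with tau_0 the identity.
tau = np.array([[[1, 0], [0, 1]],
                [[0, 1], [1, 0]],
                [[0, -1j], [1j, 0]],
                [[1, 0], [0, -1]]], dtype=complex)

# U(4) generators L_{mu nu} = tau_mu (x) tau_nu of Eq. (2), shape (4, 4, 4, 4).
L = np.array([[np.kron(tau[mu], tau[nu]) for nu in range(4)] for mu in range(4)])

def correlation_map(chi):
    """Map f of Eqs. (3), (4): 4x4 matrix chi -> density matrix rho = 1/4 chi_{mu nu} L_{mu nu}."""
    return 0.25 * np.einsum('mn,mnij->ij', chi, L)

# Example: chi_00 = 1 and all other entries zero gives the maximally mixed state rho = I/4.
chi = np.zeros((4, 4))
chi[0, 0] = 1.0
assert np.allclose(correlation_map(chi), np.eye(4) / 4)
```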
The states τ = 0 . . . 63 of the classical spin system correspond to the 2^6 = 64 configurations of Ising spins {s^{(1)}_1, s^{(1)}_2, s^{(1)}_3, s^{(2)}_1, s^{(2)}_2, s^{(2)}_3} that can take the values s = ±1.

FIG. 1. Probabilities corresponding to the density matrices of the entangled pure state ψ_+ and a randomly generated density matrix ρ (A), and their transformation under a CNOT gate (B). The labels 0 . . . 63 are best thought of as bit vectors and label the state of the 6 classical spins s^{(i)}_k, i = 1, 2, k = 1 . . . 3; for example 3 is the label for the spin state [−1, −1, −1, −1, 1, 1]. The 4 × 4 matrices χ of spin expectation values ⟨s^{(i)}_k⟩ and correlations ⟨s^{(1)}_k s^{(2)}_l⟩ corresponding to ψ_+ (C) and ρ (D): the first row contains the three expectation values ⟨s^{(1)}_k⟩, the first column the three expectation values ⟨s^{(2)}_k⟩, and the remaining 3 × 3 block the correlations ⟨s^{(1)}_k s^{(2)}_l⟩. The transformed spin expectation values and correlations related to the density matrices CNOT(ψ_+) and CNOT(ρ) are shown in (E) and (F) respectively.
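The labelling of states by bit vectors can be made explicit with a small helper function. The sketch below is illustrative only; the bit convention (bit 1 maps to spin +1, most significant bit first) is our reading of the example in the caption of figure 1.

```python
def label_to_spins(label):
    """Decode a state label 0..63 into the six Ising spins
    (s^(1)_1, s^(1)_2, s^(1)_3, s^(2)_1, s^(2)_2, s^(2)_3); bit 1 -> +1, bit 0 -> -1."""
    bits = [(label >> (5 - k)) & 1 for k in range(6)]   # most significant bit first
    return [2 * b - 1 for b in bits]

assert label_to_spins(3) == [-1, -1, -1, -1, 1, 1]      # the example from the caption of Fig. 1
```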
The probabilistic system is defined by associating to each τ a probability p_τ ≥ 0, Σ_τ p_τ = 1. The expectation values (1) are formed in the standard way, multiplying the values of (s_γ)_τ or (s_γ)_τ (s_δ)_τ in a given state τ with p_τ and summing over τ. This results in the weighted sums

  χ_{k0} = Σ_τ σ^{(k0)}_τ p_τ ,  χ_{0k} = Σ_τ σ^{(0k)}_τ p_τ ,  χ_{kl} = Σ_τ σ^{(kl)}_τ p_τ ,  (5)

with signs σ^{(a)}_τ = ±1 given by

  σ^{(k0)}_τ = (−1)^{1+bin(τ)[k]} ,  (6)
  σ^{(0k)}_τ = (−1)^{1+bin(τ)[3+k]} ,  (7)
  σ^{(kl)}_τ = (−1)^{1+bin(τ)[k]} (−1)^{1+bin(τ)[3+l]} .  (8)

Here bin(τ) denotes the 6-bit binary representation of a number 0 . . . 63 and bin(τ)[k] the entry at index k of that vector. For the example τ = (1, −1, −1, 1, 1, 1) one has bin(τ) = (1, 0, 0, 1, 1, 1); bin(τ)[2] = 0 reads out the value at the second place, and σ^{(20)}_τ = −1 is the value of s^{(1)}_2 in the state τ. We denote this map by

  g : ℝ^64 → ℝ^{4×4} ,  p ↦ χ .  (9)
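A direct transcription of equations (5)-(9) reads as follows. This is again an illustrative sketch with our own variable names; the sign tensor is organised so that χ is obtained by a matrix-vector product with p, and the bit convention matches the example above.

```python
import numpy as np

# Sign tensor sigma^{(mu nu)}_tau of Eqs. (6)-(8), organised so that chi = SIGNS @ p (Eq. (5)).
SIGNS = np.zeros((4, 4, 64))
for t in range(64):
    bits = [(t >> (5 - k)) & 1 for k in range(6)]       # bin(tau), most significant bit first
    s = np.array([2 * b - 1 for b in bits])             # spins: bit 1 -> +1, bit 0 -> -1
    s1 = np.concatenate(([1], s[:3]))                   # (1, s^(1)_1, s^(1)_2, s^(1)_3)
    s2 = np.concatenate(([1], s[3:]))                   # (1, s^(2)_1, s^(2)_2, s^(2)_3)
    SIGNS[:, :, t] = np.outer(s1, s2)

def g(p):
    """Map g of Eq. (9): probability vector p (length 64) -> 4x4 matrix chi of Eq. (1)."""
    return SIGNS @ p
```

Composed with the map f sketched above, this gives a probability distribution directly as a density matrix, which is the bit quantum map b introduced next.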
1) onehas bin[ τ ] = (1 , , , , , τ )[2] = 0 reads out thevalue at the second place and σ (20) τ = − s (1)2 in the state τ . We denote this map by g : R → R × , p (cid:55)→ χ. (9) The bit quantum map b can be seen as a map fromthe classical probability distribution p to the quantumdensity matrix, b = g ◦ f : R → C × , p (cid:55)→ ρ. (10)It has the property that the particular classical expecta-tion values and correlations (5) coincide with the expec-tation values of quantum spins in the cartesian directions,and the corresponding quantum correlations. The den-sity matrices can be constructed (or reconstructed) byuse of the particular classical or quantum correlations.This is analogous to the reconstruction of the densitymatrix for photons by appropriate correlations [7–9]. Ex-amples for probability distributions p realizing particularquantum density matrices ρ are shown in figure 1.Our first task is a demonstration that the bit quantummap b is complete, in the sense that for every positivehermitian normalized quantum density matrix there ex-ists at least one classical probability distribution p whichrealizes ρ by use of equation (10). For this task it will beconvenient to work with the ”classical wave function” q which is a type of probability amplitude. Its components q τ are related to p τ by p τ = q τ (cid:80) τ q τ . (11)This defines one further map h : R → R , q (cid:55)→ p = q | q | . (12)Changes of the probability distribution correspond torotations of the vector q , with p τ > (cid:80) τ p τ = 1guaranteed by the construction (11). We need to findfor each given density 4 × ρ ∈ C × a vector q ∈ R , such that it maps to the density matrix ρ underthe composition b ◦ h . Clearly any such q is not unique.In order to find one such q we minimize l ρ ( q ) = | ( b ◦ h )( q ) − ρ | (13)by gradient descent on q . In order to verify numericallythat this approach works we performed the following test:Starting with a randomly generated density matrix ρ ,we iteratively apply the three unitary transformationsCNOT, Hadamard H ⊗ I and rotation R / ⊗ I to ob-tain density matrices ρ i . The quantum gates preservethe positivity of ρ . Applying them many times leads to adense covering to the space of quantum density matrices,which becomes a full covering if the number of steps goesto infinity. We further explore the space of ρ by investi-gating different ρ . For each ρ i we solve the optimizationproblem with loss l ρ i ( q ) to obtain corresponding vectors q i . The results of this optimization are shown in figure2. For all ρ i we find vectors q realizing the bit quantummap b ◦ h with high precision already after rather few FIG. 2. Optimization results for finding a vectors q i ∈ R ,such the loss l ρ i ( q i ) (13) is minimized. We generate 10 den-sity matrices ρ i by starting from a randomly generated densitymatrix ρ and iteratively apply three unitary transformationsCNOT, Hadamard H ⊗ I and rotation R / ⊗ I . The opti-mization is stopped once the loss falls below 10 − . Figure( A ) is a histogram of the resulting final losses, figure ( B ) is ahistogram of the required iterations. iterations. We conclude that for for two qubits the bitquantum map is complete.Considering one of the vectors q found in this way asa representation of ρ , our second step asks how a givenunitary transformation of ρ can be represented as a trans-formation of q . Considering q as a classical probabilis-tic object, we ask how the classical system can learn aquantum operation. 
Considering one of the vectors q found in this way as a representation of ρ, our second step asks how a given unitary transformation of ρ can be represented as a transformation of q. Considering q as a classical probabilistic object, we ask how the classical system can learn a quantum operation. The goal is to find for any vector q ∈ ℝ^64 a matrix M ∈ ℝ^{64×64} such that the transformed vector Mq yields the density matrix UρU† related to ρ by a given unitary transformation U. A simple "learning" or optimization goal consists in minimizing the Frobenius norm between the two density matrices UρU† = U(b ∘ h)(q)U† ≡ U((b ∘ h)(q)) and (b ∘ h)(Mq),

  l_U(M) = |U((b ∘ h)(q)) − (b ∘ h)(Mq)|² .  (14)

Here we use the notation U(ρ) = UρU†. This minimization can again be implemented using gradient descent. In figure 1 we show the resulting probabilities

  p = h(q) ,  p′ = h(Mq) ,  (15)

as well as the corresponding matrices of expectation values and correlations

  χ = (g ∘ h)(q) ,  χ′ = (g ∘ h)(Mq) ,  (16)

when this minimization procedure is applied to two example initial density matrices and their associated q vectors. The unitary transformation U is taken to be CNOT. We have investigated many different ρ and always found a satisfactory M with this procedure. Thus the classical system "learns" the matrix M necessary for a unitary quantum gate. We emphasize that M depends on q, such that Mq = M(q)q is a non-linear map [1].
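A corresponding sketch of the gate-learning step (14) is given below. It is illustrative only; rho_of_q is the helper from the previous sketch, the CNOT matrix is the standard one in the computational basis, and the hyperparameters are placeholders.

```python
import jax
import jax.numpy as jnp

# CNOT in the basis |00>, |01>, |10>, |11>.
CNOT = jnp.array([[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=jnp.complex64)

def gate_loss(M, q, U):
    """Loss l_U(M) of Eq. (14): compare U rho(q) U^dagger with rho(M q)."""
    rho = rho_of_q(q)
    return jnp.sum(jnp.abs(U @ rho @ U.conj().T - rho_of_q(M @ q))**2)

def learn_gate(q, U=CNOT, steps=5000, lr=0.5):
    M = jnp.eye(64)
    grad = jax.jit(jax.grad(gate_loss))
    for _ in range(steps):
        M = M - lr * grad(M, q, U)
    return M
```

Since the optimization is carried out for a fixed q, the learned M depends on q, in line with the remark above that Mq = M(q)q is a non-linear map.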
We finally turn to our third task of an implementation by neuromorphic computing. So far the Ising spins s entered only indirectly through their state probabilities p and associated correlations and expectation values χ. We next want to study classical systems for which correlations and expectation values can be determined from temporal averages. In the context of neuroscience there is a long history of applying spin-glass models to the study of biological neural networks [6]. The general idea is to consider each biological neuron to have two states, active and silent. A neuron is considered active (or refractory) if it has produced a spike within a certain sampling time window and silent otherwise. Within this framework experimental work has been carried out to study pairwise and higher correlations of biological neurons (e.g. [10, 11]). One immediate way of obtaining the state probabilities p is from neural sampling [12, 13]. Since we need only the expectation values and correlations (1) we directly focus on these quantities and do not aim to resolve the probability distributions completely. The six spin variables s^{(i)}_k(t) are associated to six particular neurons in a larger network, with s = 1 if the neuron is active (refractory) and s = −1 if it is silent. Expectation values and correlations are obtained as the temporal averages

  ⟨s^{(i)}_k⟩ = (1/T) ∫_0^T s^{(i)}_k(t) dt ,  (17)
  ⟨s^{(1)}_k s^{(2)}_l⟩ = (1/T) ∫_0^T s^{(1)}_k(t) s^{(2)}_l(t) dt .  (18)

Instead of spin variables s, which take values {1, −1}, we want to consider variables z with values in {0, 1}, s = 2z − 1.
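In discrete time the averages (17), (18) reduce to simple means over time steps. A possible sketch, assuming binary traces z(t) ∈ {0, 1} for the six selected neurons (defined below via the refractory state) have been recorded:

```python
import numpy as np

def chi_from_traces(z):
    """z: binary array (T_steps, 6) for the six selected neurons (z = 1 while refractory).
    Returns the 4x4 matrix chi of Eqs. (1), (17), (18) with s = 2 z - 1."""
    s = 2.0 * z - 1.0                                                # Ising spins
    s1 = np.concatenate([np.ones((len(s), 1)), s[:, :3]], axis=1)    # (1, s^(1)_k)
    s2 = np.concatenate([np.ones((len(s), 1)), s[:, 3:]], axis=1)    # (1, s^(2)_l)
    return np.einsum('ta,tb->ab', s1, s2) / len(s)                   # time-averaged outer products
```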
A given selected neuron has the value z = 1 during the refractory period (active state), and z = 0 otherwise (silent state). We choose a simple so-called leaky integrate-and-fire (LIF) neuron model. The model is specified by 3n state variables v_i, I_i, r_i, i = 1 . . . n, called the membrane voltages, synaptic currents and refractory states. The dynamics has phases of continuous evolution and jumps or spikes. The continuous evolution obeys the differential equations

  v̇ = (1 − Θ(r))(v_l − v + I) ,  (19)
  İ = −I + I_in ,  (20)
  ṙ = −Θ(r)/t_refrac .  (21)

Here Θ denotes the Heaviside function and the multiplication in (19) is pointwise for every i separately. Characteristic for a spiking neuron model are the jumps or discrete changes of the state variables v, I, r at particular times. Whenever one of the membrane voltages v_i reaches a threshold (v_th)_i during the continuous evolution,

  v_i − (v_th)_i = 0 ,  (22)

the state variables undergo discontinuous jumps. The transition equations

  v_+ = v_− + ξ (v_r − v_th) ,  (23)
  I_+ = I_− + W_rec · ξ ,  (24)
  r_+ = r_− + ξ ,  (25)

specify the state before and after the transition, v_± = v(t_±), I_± = I(t_±) and r_± = r(t_±). Here ξ ∈ ℝ^n is a binary vector, with ξ_i = 1 for the value of i for which the threshold voltage is reached and (22) obeyed, and ξ_j = 0 for v_j ≠ (v_th)_j. In equation (23) the multiplication is pointwise, such that only the voltage of the spiking neuron changes. In (22)-(25) v_th, v_r, v_l ∈ ℝ^n are the threshold, reset and leak potentials. Finally, I_in is an input current and W_rec is a matrix in ℝ^{n×n} which parameterises the response of the currents I_j of all neurons to the "firing" of the spiking neuron i.

Equation (21) specifies a linear decrease of r_i until r_i = 0 is reached. As long as r_i > 0 the neuron i remains refractory, for a period t_refrac after its firing at t_k. For this period its voltage does not change,

  v̇_i([t_k, t_k + t_refrac]) = 0 ,  (26)

as implied by equation (19).

Ising spins s(t) or the associated occupation numbers z(t) are defined by

  z(t) = Θ(r(t)) ,  (27)

which is one during the refractory period and zero otherwise. Six selected neurons define the Ising spins s^{(i)}_k = 2z^{(i)}_k − 1, which in turn define the density matrix ρ by equations (1), (3).

The last quantity to be specified is the input current to the n LIF neurons. We model spike input from m additional input spike sources. At times t_l the source neuron q_l fires. The response of the n LIF neurons is given by

  (I_in)_j = Σ_l (W_in)_{j,q_l} δ(t − t_l) ,  q_l ∈ {1, . . . , m} .  (28)

Here δ(t − t_l) is the Dirac delta distribution, such that we model an immediate response of all n LIF neurons to every input spike at t_l. The n × m matrix W_in ∈ ℝ^{n×m} parametrises the height of the jump in the currents I_j upon arrival of a spike at input source q_l. In our experiments the arrival times t_l of the input spikes are the result of m independent Poisson point processes with rates λ_q, q = 1 . . . m.

By solving the differential equations with jumps for given input spikes we can compute z_j(t) and therefore

  s^{(i)}_k(t) = 2 z^{(i)}_k(t) − 1 .  (29)

In the experiments we carried out we took n > 6.
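A simple Euler discretization of equations (19)-(28) can be sketched as follows. This is illustrative only: all parameter values are placeholders, the input deltas of Eq. (28) are replaced by a binary spike array, and for gradient-based training the hard comparisons would be replaced by the smoothed Θ_ε discussed below.

```python
import numpy as np

def simulate_lif(W_rec, W_in, in_spikes, dt=1e-3, t_refrac=1e-3,
                 v_th=1.0, v_r=0.0, v_l=0.0):
    """Euler discretization of Eqs. (19)-(25).
    in_spikes: binary array (T_steps, m) of input spikes, W_rec: (n, n), W_in: (n, m).
    Returns z(t) = Theta(r(t)) of Eq. (27), shape (T_steps, n)."""
    n = W_rec.shape[0]
    v, I, r = np.zeros(n), np.zeros(n), np.zeros(n)
    z = np.zeros((len(in_spikes), n))
    for t, spikes_in in enumerate(in_spikes):
        gate = 1.0 - (r > 0)                       # (1 - Theta(r)) in Eq. (19)
        v = v + dt * gate * (v_l - v + I)          # Eq. (19)
        I = I + dt * (-I) + W_in @ spikes_in       # Eqs. (20), (28)
        r = np.maximum(r - dt / t_refrac, 0.0)     # Eq. (21)
        xi = (v >= v_th).astype(float)             # threshold condition, Eq. (22)
        v = v + xi * (v_r - v_th)                  # Eq. (23)
        I = I + W_rec @ xi                         # Eq. (24)
        r = r + xi                                 # Eq. (25)
        z[t] = (r > 0).astype(float)               # Eq. (27)
    return z
```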
The six spin variables s^{(i)}_k are recovered by projecting to the first six components,

  π : ℝ^n → ℝ^6 ,  (s_1, . . . , s_n) ↦ (s_1, . . . , s_6) .  (30)

Given a density matrix ρ, we can now formulate an optimization problem for W_in, W_rec. One obstacle in doing so is that the jumps introduce discontinuities, which need to be taken into account. We choose here to solve this issue by introducing a smooth approximation Θ_ε to the Heaviside function. More specifically we use a fast sigmoid as in [14] with parameter 1/ε = 100. This is a common way in which spiking neuron models can be made amenable to gradient descent optimization [14–17]. The loss function is then

  l_ρ(W_rec, W_in) = |ρ − f(χ(π(s)))|² ,  (31)

where χ is defined as before by the spin expectation values and correlations in equations (1), (17), (18).

In figure 3 we show the result of the optimization process for two density matrices, together with a view of the resulting recurrent weight matrix W_rec restricted to the spins s^{(i)}_k, as well as the spin expectation and correlation matrices χ. The membrane threshold is set to one, v_th = 1, and the leak and reset potentials to zero, v_l = v_r = 0. We integrate for T = 10 timesteps with an integration step of dt = 1 ms. We choose an input dimension of m = 128 and consider n = 64 recurrently connected LIF neurons. We set t_refrac = dt (this eliminates the need to take the equations for the refractory state into account) and draw the input spikes with Poisson frequency λ = 700 Hz. The learning rate of the gradient descent starts at η = 10 and is exponentially decreased during training.

As a measure of the quality of the learned representation we use the fidelity F(ρ, σ) = (tr √(√ρ σ √ρ))². Instead of optimizing for the fidelity directly we minimize the square of the Frobenius norm |ρ − σ|². By the Fuchs–van de Graaf inequalities we know that

  1 − √F(ρ, σ) ≤ (1/2) ‖ρ − σ‖_1 ≤ √(1 − F(ρ, σ)) .

Since the trace norm of a 4 × 4 matrix is in turn bounded by a fixed multiple of the Frobenius norm, the fidelity approaches one as the square of the Frobenius norm goes to zero. Computing the fidelity directly is computationally more expensive and also has the added disadvantage that it is not real valued for arbitrary complex matrices ρ, σ.
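For reference, the fidelity and the trace distance entering the Fuchs–van de Graaf bounds can be evaluated numerically, for instance as in the following illustrative sketch using SciPy's matrix square root.

```python
import numpy as np
from scipy.linalg import sqrtm

def fidelity(rho, sigma):
    """Uhlmann fidelity F(rho, sigma) = (tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    sq = sqrtm(rho)
    return np.real(np.trace(sqrtm(sq @ sigma @ sq))) ** 2

def trace_distance(rho, sigma):
    """T(rho, sigma) = (1/2) ||rho - sigma||_1 for hermitian rho, sigma."""
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

# Fuchs-van de Graaf: 1 - sqrt(F) <= T <= sqrt(1 - F), and T is bounded by a
# multiple of the Frobenius norm, so minimizing |rho - sigma|^2 drives F -> 1.
```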
FIG. 3. Example training progress (A) for the target density matrices of the entangled pure state ψ_+ and a randomly generated density matrix ρ. The left side (A) shows the fidelity as a function of the number of training epochs. The right displays the corresponding final recurrent weight matrices W_rec (B, C) and spin expectation and correlation matrices χ (D, E).

While the learning for the entangled pure state takes somewhat longer than for the randomly chosen state, it is clear that after a reasonable learning time the neuronal dynamics has adapted in order to represent the quantum density matrix with acceptable fidelity. Combined with the learning of the unitary quantum gates above, one concludes that this type of neural network can learn unitary transformations for two-qubit quantum systems. Details of an optimal learning algorithm for spiking neural networks remain to be worked out.

In this work we focussed on the case of two qubits. The correlation map can be extended to n qubits. Using the generators

  L_{µ_1 . . . µ_n} = ⊗_{i=1}^n τ_{µ_i} ,  µ_i = 0 . . . 3 ,  (32)

we write the density matrix as

  ρ = (1/2^n) χ_{µ_1 . . . µ_n} L_{µ_1 . . . µ_n} .  (33)
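The generators (32) are nested Kronecker products, so the map (33) can be sketched directly. The code below is illustrative and only feasible for small n, since the number of components grows as 4^n.

```python
import numpy as np
from functools import reduce
from itertools import product

# Pauli matrices tau_0..tau_3 (tau_0 = identity), as before.
tau = np.array([[[1, 0], [0, 1]],
                [[0, 1], [1, 0]],
                [[0, -1j], [1j, 0]],
                [[1, 0], [0, -1]]], dtype=complex)

def rho_from_chi_n(chi, n):
    """Eq. (33): rho = 2^{-n} sum chi_{mu_1..mu_n} tau_{mu_1} (x) ... (x) tau_{mu_n}.
    chi: real array of shape (4,) * n."""
    rho = np.zeros((2**n, 2**n), dtype=complex)
    for mus in product(range(4), repeat=n):
        L_mu = reduce(np.kron, [tau[m] for m in mus])   # generator of Eq. (32)
        rho += chi[mus] * L_mu
    return rho / 2**n
```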
The coefficients χ_{µ_1 . . . µ_n} are determined by correlating 3n spins as follows. Write s^{(i)}_µ(t) = (1, s^{(i)}_k(t)) with i = 1 . . . n, k = 1 . . . 3, µ = 0 . . . 3, and define

  σ_{µ_1 . . . µ_n}(t) = s^{(1)}_{µ_1}(t) . . . s^{(n)}_{µ_n}(t) ,  (34)

and

  χ_{µ_1 . . . µ_n} = (1/T) ∫_0^T σ_{µ_1 . . . µ_n}(t) dt .  (35)

This procedure involves up to n-point correlations. So far it has not been proven that for n > 2 every n-qubit quantum system can be represented by correlations of a moderate number of 3n classical Ising spins.

As n grows it becomes increasingly difficult to evaluate all necessary correlations, which include up to n-point functions.
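In discrete time the averages (34), (35) can be accumulated from the recorded traces, for instance as in the following sketch (illustrative; again only feasible for moderate n, since all 4^n components are evaluated).

```python
import numpy as np
from itertools import product

def chi_n_from_traces(z, n):
    """z: binary array (T_steps, 3n) for the 3n selected neurons.
    Returns chi_{mu_1..mu_n} of Eq. (35) as an array of shape (4,) * n."""
    T_steps = len(z)
    s = 2.0 * z.reshape(T_steps, n, 3) - 1.0                       # spins s^(i)_k(t)
    ones = np.ones((T_steps, n, 1))
    s_mu = np.concatenate([ones, s], axis=2)                       # s^(i)_mu(t) = (1, s^(i)_k(t))
    chi = np.empty((4,) * n)
    for mus in product(range(4), repeat=n):
        sigma_t = np.prod([s_mu[:, i, mu] for i, mu in enumerate(mus)], axis=0)  # Eq. (34)
        chi[mus] = sigma_t.mean()                                  # time average, Eq. (35)
    return chi
```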
Time averaging of Ising spins representing some quantity above a threshold (s_j = 1) or below a threshold (s_j = −1) is comparatively economical in this respect. It is sufficient to measure for all Ising spins s_j the times above or below threshold. With t^−_j = T − t^+_j one can use identities of the type

  ⟨s_j⟩ = (t^+_j − t^−_j)/T = 2 t^+_j/T − 1 ,  (36)
  ⟨s_j s_i⟩ = 2 (t^{++}_{ji} + t^{−−}_{ji})/T − 1 ,  (37)
  ⟨s_k s_j s_i⟩ = 2 (t^{+++}_{jik} + t^{+−−}_{jik} + t^{−+−}_{jik} + t^{−−+}_{jik})/T − 1 ,  (38)

where t^{++}_{ji} is the time during which both s_j and s_i are above threshold, and so on.

Nevertheless, the number 2^{2n} of elements of the density matrix grows very rapidly with an increasing number n of qubits. This issue is common to all approaches which use at every step of the computation the full information about the quantum state. We find it unlikely that computations by real quantum systems or artificial neurons need the full information contained in the density matrix ρ. It then becomes an important task to find out which part of the probabilistic information is involved for practical questions.

As an example we consider the search for the ground state energy of the one-dimensional quantum Ising model with Hamiltonian

  H = −J Σ_{i=1,...,n} τ^{(i)}_3 τ^{(i+1)}_3 − h Σ_i τ^{(i)}_1 ,  (39)

one of the benchmark tasks in [3]. The goal is to find a density matrix ρ such that tr(Hρ) is minimized. This requires only a small subset of the χ_{µ_1 . . . µ_n}, namely those n two-point functions for which µ_i = µ_{i+1} = 3 and all other µ_k = 0, as well as the n expectation values for which µ_i = 1 and all other µ_k = 0. For each optimization step one only needs to extract for n pairs of neighboring spins the times t^{(3)++}_{i,i+1} and t^{(3)−−}_{i,i+1}, as well as t^{(1)+}_i. This seems realistic for a rather large number of qubits n. It remains to be seen whether an implementation in terms of recurrently connected spiking neurons yields good results in this case.

The computation of the expectation values χ_{µ_1 . . . µ_n} involves purely classical probabilistic settings, such as neurons firing in biological or artificial systems. No low temperature or high degree of isolation of small subsystems is needed for such a realisation of quantum features. Our findings are an example of how quantum evolution can be realised by probabilistic classical systems [19, 20].

This work was supported by the EU Horizon 2020 framework program under grant agreement 720270 (Human Brain Project), the Heidelberg Graduate School for Fundamental Physics (HGSFP) and the DFG Collaborative Research Centre SFB 1225 (ISOQUANT).

∗ [email protected]
† [email protected]

[1] C. Wetterich, Nucl. Phys. B 948, 114776 (2019), arXiv:1806.05960 [quant-ph].
[2] C. Pehle, K. Meier, M. Oberthaler, and C. Wetterich, arXiv preprint arXiv:1810.10335 (2018).
[3] G. Carleo and M. Troyer, Science 355, 602 (2017), arXiv:1606.02318 [cond-mat.dis-nn].
[4] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, Nature Physics 14, 447 (2018).
[5] Y. Levine, O. Sharir, N. Cohen, and A. Shashua, Physical Review Letters 122, 065301 (2019).
[6] J. J. Hopfield, Proceedings of the National Academy of Sciences 79, 2554 (1982).
[7] D. F. V. James, P. G. Kwiat, W. J. Munro, and A. G. White, Phys. Rev. A 64, 052312 (2001).
[8] M. Paris and J. Rehacek, Quantum State Estimation, Vol. 649 (Springer Science & Business Media, 2004).
[9] N. Kiesel, Experiments on multiphoton entanglement, Ph.D. thesis, LMU Munich (2007).
[10] E. Schneidman, M. J. Berry, R. Segev, and W. Bialek, Nature 440, 1007 (2006).
[11] I. E. Ohiorhenuan, F. Mechler, K. P. Purpura, A. M. Schmid, Q. Hu, and J. D. Victor, Nature 466, 617 (2010).
[12] L. Buesing, J. Bill, B. Nessler, and W. Maass, PLOS Computational Biology 7, 1 (2011).
[13] M. A. Petrovici, J. Bill, I. Bytschok, J. Schemmel, and K. Meier, Phys. Rev. E 94, 042312 (2016).
[14] F. Zenke and S. Ganguli, Neural Computation 30, 1514 (2018).
[15] S. K. Esser, P. A. Merolla, J. V. Arthur, A. S. Cassidy, R. Appuswamy, A. Andreopoulos, D. J. Berg, J. L. McKinstry, T. Melano, D. R. Barch, C. di Nolfo, P. Datta, A. Amir, B. Taba, M. D. Flickner, and D. S. Modha, Proceedings of the National Academy of Sciences.
[16] Advances in Neural Information Processing Systems (2018) pp. 787–797.
[17] E. O. Neftci, H. Mostafa, and F. Zenke, IEEE Signal Processing Magazine 36, 51 (2019).
[18] J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, and S. Wanderman-Milne, "JAX: composable transformations of Python+NumPy programs" (2018).
[19] C. Wetterich, Nuclear Physics B 927, 35 (2018), arXiv:1611.04820.
[20] C. Wetterich, Annals of Physics 393 (2018).